Course overview

Learning outcomes and objectives

The course presents advanced data science methods for predictive modelling and their use in R. It examines methods for regression, including non-linear regression and generalized additive models, and methods for classification, including trees, boosting and support vector machines. The focus is on practical use in R, without going into the details of the mathematical theory behind the methods.

On completion of the course, the student should have achieved the following learning outcomes:

Knowledge

  • Knows the basic ideas underpinning various methods in data science/predictive modelling

Skills

  • Can implement various models within data science/predictive modelling in R
  • Can use data science methods on real data sets and make predictions

General competence

  • Has an overview of how data science methods can be used to analyze larger data sets

Lecture overview

| Lecture | Subject | Exercises | DataCamp |
|---|---|---|---|
| 1 | Introduction, short recap of R, and data preprocessing | Recap of R | Introduction to Regression in R (ch 1-2) |
| 2 | Over-fitting; model tuning, selection and evaluation; multiple regression | Multiple regression | Supervised Learning in R: Regression (ch 1-2) |
| 3 | Non-linear regression | GAMs | Nonlinear modeling in R with GAMs (ch 1-2) |
| 4 | Classification methods | | Supervised Learning in R: Classification (ch 1-3) |
| 5 | Decision trees and bagged trees | | Machine Learning with Tree-Based Models in R (ch 1-3) |
| 6 | Random forest and boosting | xgboost | Machine Learning with Tree-Based Models in R (ch 3-4) |
| 7 | Support vector machines | | Support Vector Machines in R (ch 1-2) |
| 8 | Neural networks | | Introduction to TensorFlow in R (ch 1-3) |
| 9 | Feature selection | | Supervised Learning in R: Classification (ch 3, video on automatic feature selection) |

Literature

We will draw on many different sources to teach applied predictive modelling. Kuhn and Johnson (2016) and James et al. (2013) are the main references.

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. 1st ed. Springer.
Kuhn, Max, and Kjell Johnson. 2016. Applied Predictive Modeling. 5th ed. Springer.