Course overview

Learning outcomes and objectives

The course presents various advanced data science methods for predictive modelling and their use in R. Methods for regression, including non-linear regression and generalized additive models, and methods for classification, including trees, boosting and support vector machines, will be examined. The course focuses on practical use in R, without going into the details of the mathematical theory behind the methods.

On completion of the course, the student should have achieved the following learning outcomes:

Knowledge

Knows the basic ideas underpinning various methods in data science/predictive modelling

Skills

Can implement various models within data science/predictive modelling in R
Can use data science methods on real data sets and make predictions

General competence

Has an overview of how data science methods can be used to analyze larger data sets

Lecture overview

Lecture	Subject	Exercises	Datacamp
1	Introduction and short recap of R and Data preprocessing	Recap of R	Introduction to Regression in R (ch 1-2)
2	Over-fitting and model tuning, selection and evaluation and multiple regression	Multiple regression	Supervised Learning in R: Regression (ch 1-2)
3	Non-linear regression	GAMs	Nonlinear modeling in R with GAMs (ch 1-2)
4	Classification methods		Supervised Learning in R: Classification (ch 1-3)
5	Decision Trees and Bagged trees		Machine Learning with Tree-Based Models in R (ch 1-3)
6	Random forest and boosting	xgboost	Machine Learning with Tree-Based Models in R (ch 3-4)
7	Support vector machines		Support Vector Machines in R (ch 1-2)
8	Neural Networks		Introduction to TensorFlow in R (ch 1-3)
9	Feature selection		Supervised Learning in R: Classification (ch 3, video on automatic feature selection)

Literature

We will use many different sources for teaching you applied predictive modelling. Kuhn and Johnson (2016) and James et al. (2013) are the main references.

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. 1st ed. Springer.

Kuhn, Max, and Kjell Johnson. 2016. Applied Predictive Modeling. 5th ed. Springer.

Data science with R: Applied Predictive Modelling