Statistical Machine Learning (MAT0043)

Program overview

The course introduces methods and models for extracting important patterns and trends from data, and presents the basic concepts of machine learning and data mining from a statistical perspective. It emphasizes selecting appropriate methods and justifying that choice, implementing the methods in code, and evaluating and communicating results effectively in data analysis reports.

What you will learn

  • Regression
  • Classification via logistic regression and $k$-nearest neighbors
  • Model selection (shrinkage and dimension reduction methods)
  • Polynomial regression, splines (regression, smoothing, thin-plate), generalized additive models, kernels
  • Regression/classification trees and ensemble methods
  • Support vector machines
  • Introduction to neural networks and deep learning

We also cover important practical considerations such as assessing model performance (e.g., by cross-validation) and the bias-variance trade-off. Labs are carried out in R, but students are free to use any other programming language.

Meet your instructor

Silvia Montagna


Are there prerequisites?

A good knowledge of R (or another programming language) is required. There are no other prerequisites, but students are encouraged to take Statistical Inference and Multivariate Statistical Analysis beforehand.

How often does this course run?

Every Fall semester.

Refer to the course webpage for more information.