An Introduction to Machine Learning for Public Policy

Preamble

Machine learning has become an increasingly integral part of public policy. It is applied to policy problems that call for predictive rather than causal inference. Solving these prediction policy problems requires tools tuned to minimizing prediction error, as well as frameworks to ensure that models are efficient and fair. This course, Machine Learning for Public Policy (ML4PP), introduces the theory and applications of machine learning algorithms with a focus on policy applications and issues. The goals of this course include:

  • Developing a basic understanding of the statistical theory underlying common supervised machine learning algorithms
  • Developing skills necessary to train and assess the performance of selected popular machine learning algorithms for solving public policy problems
  • Gaining an understanding of the benefits and risks of applying machine learning algorithms to public policy problems

The course consists of six sessions, each comprising a technical introductory lecture and a hands-on application of the topics to a real-world policy problem. Students will work with the programming language R, but coding is not the primary focus of the course.

To end the course, we will meet online for a Collaborative Policy Challenge, which will be delivered by a colleague from an international organisation. Working in interdisciplinary teams, we will propose a possible solution to the challenge and get feedback from our peers and policy experts.

Course textbook (e-book available for free):

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: with Applications in R (2nd ed.). Springer.

Course contents

1. Gentle Introduction to R, RStudio, and Python

  • Introduction to the course
  • Introduction to the R statistical programming language with the RStudio IDE
  • Introduction to the Python programming language with Visual Studio Code

Instructors: Stephan, Alex and Michelle (who will give you a warm welcome!)

2. Introduction to Machine Learning for Public Policy

  • Prediction Policy problems
  • Inference vs. prediction for policy analysis
  • Assessing accuracy: bias-variance tradeoff
  • Training error vs. test error
  • Feature selection: brief introduction to Lasso
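The gap between training error and test error listed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not course material: the synthetic data (a noisy sine curve), the choice of k-nearest-neighbour regression as the model, and the helper names knn_predict and mse. With k = 1 each training point is its own nearest neighbour, so the training error is zero even though the error on held-out data is not.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the outcomes of the k training points closest to x."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of k-NN predictions (fitted on train) over data."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in data) / len(data)


# Synthetic data (an assumption for this sketch): y = sin(x) + Gaussian noise.
random.seed(0)
points = []
for _ in range(200):
    x = random.uniform(0.0, 6.0)
    points.append((x, math.sin(x) + random.gauss(0.0, 0.3)))

# Hold out half of the observations as a test set.
train, test = points[:100], points[100:]

# Compare a very flexible model (k = 1) with a smoother one (k = 10).
errors = {k: (mse(train, train, k), mse(train, test, k)) for k in (1, 10)}
for k, (train_mse, test_mse) in errors.items():
    print(f"k={k:2d}  training MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

A low training error on its own says little about predictive performance. The more flexible model fits the noise in the training sample (low bias, high variance), which is why models are assessed on data they have not seen.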

Instructor: Michelle González Amador

Readings:

Mandatory

Optional readings

3. Classification

  • Logistic regression
  • Confusion matrix
  • Performance metrics: Accuracy, Recall, Precision (…)
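As a rough illustration of how the confusion matrix relates to accuracy, precision and recall, the sketch below counts the four cells for a binary outcome and derives the three metrics from them. The labels are made up for this example, and the function names confusion_matrix and metrics are hypothetical helpers rather than part of any library. In practice the predicted labels might come from a logistic regression whose predicted probabilities are cut at a threshold such as 0.5.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels for illustration only: 1 = positive case, 0 = negative case.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
accuracy, precision, recall = metrics(tp, fp, fn, tn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Precision asks how many of the predicted positives were correct, while recall asks how many of the actual positives were found. Which of the two matters more depends on the relative cost of false positives and false negatives in the policy setting.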

Instructor: Dr. Stephan Dietrich

Readings:

Optional Readings

4. Tree-based methods

  • Decision Trees: a classification approach
  • Ensemble learning: bagging and boosting
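To give a sense of how a classification tree chooses a split, here is a minimal sketch that scores candidate thresholds on a single feature by weighted Gini impurity and keeps the best one. Gini impurity is one common splitting criterion and is an assumption here, as the outline does not say which criterion the session uses. The toy data and the function names gini and best_split are likewise hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_split(xs, ys):
    """Return (threshold, weighted impurity) of the best split on one feature."""
    best_threshold, best_impurity = None, float("inf")
    values = sorted(set(xs))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Toy data, made up for illustration: one feature and a binary outcome.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(xs, ys)
print(f"split at x <= {threshold} with weighted Gini impurity {impurity}")
```

A full tree applies this search recursively within each resulting group. Bagging, roughly speaking, fits many such trees to bootstrap resamples of the data and combines their predictions.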

Instructor: Dr. Francisco Rosales

Readings:

  • An Introduction to Statistical Learning, Chapter 8.

Optional Readings

5. Fair Machine Learning / Ethics

  • Common Machine Learning algorithms in (public policy) action
  • Black box algorithms
  • Biases
  • Ethical challenges

Readings:

Instructor: Dr. Juba Ziani

6. Neural Networks

  • Neural Network Architecture: neurons and layers
  • Inputs and outputs: the activation function (sigmoid, tanh…)
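The role of the activation function can be sketched with a single neuron: it takes a weighted sum of its inputs plus a bias and passes the result through a non-linear function such as the sigmoid or tanh. The weights, bias and inputs below are arbitrary values chosen for illustration, and the function name neuron is a hypothetical helper.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Output of one neuron: activation(weighted sum of inputs + bias)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Arbitrary illustrative values (assumptions, not taken from the course).
inputs = [0.5, -1.0]
weights = [2.0, 1.0]
bias = 0.0

out_sigmoid = neuron(inputs, weights, bias, sigmoid)
out_tanh = neuron(inputs, weights, bias, math.tanh)
print(f"sigmoid output: {out_sigmoid}")
print(f"tanh output:    {out_tanh}")
```

Here the weighted sum is zero, which is the midpoint of both functions: the sigmoid returns 0.5 and tanh returns 0. A layer is a collection of such neurons sharing the same inputs, and a network stacks layers so that the outputs of one become the inputs of the next.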

Instructor: Prof. Dr. Robin Cowan

Optional Readings