Welcome to DSCI 571, an introductory supervised machine learning course! In this course we will focus on basic machine learning concepts such as data splitting, cross-validation, generalization error, overfitting, the fundamental trade-off, the golden rule, and data preprocessing. You will also be exposed to common machine learning algorithms such as decision trees, K-nearest neighbours, SVMs, naive Bayes, and logistic regression using the scikit-learn framework.
Course Learning Outcomes
By the end of the course, students are expected to be able to:
- describe supervised learning and identify what kind of tasks it is suitable for;
- explain common machine learning concepts such as classification and regression, data splitting, overfitting, parameters and hyperparameters, and the golden rule;
- identify when and why to apply data pre-processing techniques such as imputation, scaling, and one-hot encoding;
- describe at a high level how common machine learning algorithms work, including decision trees, K-nearest neighbours, and naive Bayes;
- use Python and the scikit-learn package to responsibly develop end-to-end supervised machine learning pipelines on real-world datasets and to interpret their results carefully.
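
To make the last outcome concrete, here is a minimal sketch of what an end-to-end scikit-learn pipeline might look like. It is not taken from the course materials. The synthetic dataset, the injected missing values, the choice of median imputation, and the variable names are all assumptions made for illustration. It touches on data splitting, cross-validation, imputation, scaling, and logistic regression, and keeps the test set out of model development in the spirit of the golden rule.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real-world dataset (assumption for illustration).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Blank out roughly 5% of the values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Data splitting: hold out a test set that plays no part in model development.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Preprocessing and model live in one pipeline, so the imputer and scaler
# are fit only on the training portion of each cross-validation fold.
pipe = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)

# Cross-validation on the training set estimates how well the model generalizes.
cv_scores = cross_val_score(pipe, X_train, y_train, cv=5)
print("Mean cross-validation accuracy:", cv_scores.mean())

# Only at the very end: fit on all training data and score once on the test set.
pipe.fit(X_train, y_train)
test_score = pipe.score(X_test, y_test)
print("Test accuracy:", test_score)
```

Wrapping the preprocessing steps and the classifier in a single pipeline is one common way to avoid leaking information from validation or test data into training. A real project would also need to handle categorical features, for example with one-hot encoding.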