In the first part of this course, we’ll focus on supervised machine learning using the scikit-learn framework.

Imagine you’re in the fortunate situation where, after graduating, you have a few job offers and need to decide which one to choose. You want to pick the job that will likely make you the happiest. To help with your decision, you collect data from like-minded people. Here are the first few rows of this toy dataset.
| | supportive_colleagues | salary | free_coffee | boss_vegan | happy? |
---|---|---|---|---|---|
0 | 0 | 70000 | 0 | 1 | Unhappy |
1 | 1 | 60000 | 0 | 0 | Unhappy |
2 | 1 | 80000 | 1 | 0 | Happy |
3 | 1 | 110000 | 0 | 1 | Happy |
4 | 1 | 120000 | 1 | 0 | Happy |
5 | 1 | 150000 | 1 | 1 | Happy |
6 | 0 | 150000 | 1 | 0 | Unhappy |
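To follow along, the rows shown above can be typed into a pandas dataframe by hand. This is a minimal sketch, not the course's own loading code: it assumes the dataframe is called `toy_happiness_df` (the name used in the code in this section), that `X` is every column except `happy?`, and that `y` is the `happy?` column. Only the seven rows displayed here are included.

```python
import pandas as pd

# Only the seven rows shown in the table above; the full toy dataset may contain more.
toy_happiness_df = pd.DataFrame(
    {
        "supportive_colleagues": [0, 1, 1, 1, 1, 1, 0],
        "salary": [70000, 60000, 80000, 110000, 120000, 150000, 150000],
        "free_coffee": [0, 0, 1, 0, 1, 1, 1],
        "boss_vegan": [1, 0, 0, 1, 0, 1, 0],
        "happy?": ["Unhappy", "Unhappy", "Happy", "Happy", "Happy", "Happy", "Unhappy"],
    }
)

# Assumption: X is every column except the target, and y is the target column.
X = toy_happiness_df.drop(columns=["happy?"])
y = toy_happiness_df["happy?"]

print(X.shape)           # number of rows and feature columns
print(y.value_counts())  # how many examples fall into each class
```

Keeping the features and the target in separate objects matches what scikit-learn's `fit(X, y)` expects.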
from sklearn.dummy import DummyClassifier

# X holds the feature columns of toy_happiness_df and y holds the "happy?" column
# Initialize a baseline model that always predicts the most frequent class
model = DummyClassifier(strategy="most_frequent")
# Train the model on the feature set X and target variable y
model.fit(X, y)
# Add the predicted values as a new column in the dataframe
toy_happiness_df["dummy_predictions"] = model.predict(X)
toy_happiness_df
| | supportive_colleagues | salary | free_coffee | boss_vegan | happy? | dummy_predictions |
---|---|---|---|---|---|---|
0 | 0 | 70000 | 0 | 1 | Unhappy | Happy |
1 | 1 | 60000 | 0 | 0 | Unhappy | Happy |
2 | 1 | 80000 | 1 | 0 | Happy | Happy |
3 | 1 | 110000 | 0 | 1 | Happy | Happy |
4 | 1 | 120000 | 1 | 0 | Happy | Happy |
5 | 1 | 150000 | 1 | 1 | Happy | Happy |
6 | 0 | 150000 | 1 | 0 | Unhappy | Happy |
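One way to put a number on this baseline is to compare its predictions with the true labels. The sketch below assumes the dataset consists of just the seven rows shown above and rebuilds them inline so that it runs on its own; it uses the classifier's `score` method, which returns accuracy for scikit-learn classifiers.

```python
import pandas as pd
from sklearn.dummy import DummyClassifier

# Assumption: the toy dataset is exactly the seven rows shown in the table above.
X = pd.DataFrame(
    {
        "supportive_colleagues": [0, 1, 1, 1, 1, 1, 0],
        "salary": [70000, 60000, 80000, 110000, 120000, 150000, 150000],
        "free_coffee": [0, 0, 1, 0, 1, 1, 1],
        "boss_vegan": [1, 0, 0, 1, 0, 1, 0],
    }
)
y = pd.Series(
    ["Unhappy", "Unhappy", "Happy", "Happy", "Happy", "Happy", "Unhappy"],
    name="happy?",
)

dummy = DummyClassifier(strategy="most_frequent")
dummy.fit(X, y)

# Fraction of rows where the prediction matches the true label.
# 4 of the 7 rows are "Happy", so always predicting "Happy" is right 4/7 of the time.
accuracy = dummy.score(X, y)
print(f"Baseline accuracy: {accuracy:.3f}")
```

Any model that actually uses the features should be judged against this number rather than against zero.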
Let’s train a simple decision tree on our toy dataset.
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Create the model object; max_depth=2 allows at most two levels of splits
model = DecisionTreeClassifier(max_depth=2, random_state=1)
model.fit(X, y)
# class_names must be listed in the sorted order of the classes in y
plot_tree(model, filled=True, feature_names=X.columns.tolist(),
          class_names=["Happy", "Unhappy"], impurity=False, fontsize=12);
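The fitted tree can also be inspected as text and used to predict for an example it has not seen. The following is a sketch under stated assumptions: it rebuilds the seven rows shown earlier so that it runs on its own, and the job offer in `new_offer` is a hypothetical example invented for illustration, not part of the dataset.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: the toy dataset is exactly the seven rows shown earlier.
X = pd.DataFrame(
    {
        "supportive_colleagues": [0, 1, 1, 1, 1, 1, 0],
        "salary": [70000, 60000, 80000, 110000, 120000, 150000, 150000],
        "free_coffee": [0, 0, 1, 0, 1, 1, 1],
        "boss_vegan": [1, 0, 0, 1, 0, 1, 0],
    }
)
y = pd.Series(
    ["Unhappy", "Unhappy", "Happy", "Happy", "Happy", "Happy", "Unhappy"],
    name="happy?",
)

tree = DecisionTreeClassifier(max_depth=2, random_state=1)
tree.fit(X, y)

# Plain-text view of the learned if/else rules, one line per node.
print(export_text(tree, feature_names=list(X.columns)))

# Hypothetical job offer; the column order must match the training columns.
new_offer = pd.DataFrame(
    [{"supportive_colleagues": 1, "salary": 90000, "free_coffee": 1, "boss_vegan": 0}]
)
prediction = tree.predict(new_offer)
print(prediction)
```

The text view shows the same splits as the plotted tree, which is handy when a figure is not available.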
| | supportive_colleagues | salary | free_coffee | boss_vegan |
---|---|---|---|---|
0 | 0 | 70000 | 0 | 1 |
1 | 1 | 60000 | 0 | 0 |
2 | 1 | 80000 | 1 | 0 |
3 | 1 | 110000 | 0 | 1 |
4 | 1 | 120000 | 1 | 0 |
5 | 1 | 150000 | 1 | 1 |
6 | 0 | 150000 | 1 | 0 |
One of the decision tree’s hyperparameters is max_depth, which limits how deep the tree can go.
[Figure panels: max_depth=1 and max_depth=2]
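The effect of `max_depth` can be checked directly by fitting one tree per setting. This is a minimal sketch that assumes the dataset is the seven rows shown earlier; the `scores` dictionary is a name introduced here for illustration. It reports accuracy on the training rows only, which says nothing about how either tree would do on unseen data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Assumption: the toy dataset is exactly the seven rows shown earlier.
X = pd.DataFrame(
    {
        "supportive_colleagues": [0, 1, 1, 1, 1, 1, 0],
        "salary": [70000, 60000, 80000, 110000, 120000, 150000, 150000],
        "free_coffee": [0, 0, 1, 0, 1, 1, 1],
        "boss_vegan": [1, 0, 0, 1, 0, 1, 0],
    }
)
y = pd.Series(
    ["Unhappy", "Unhappy", "Happy", "Happy", "Happy", "Happy", "Unhappy"],
    name="happy?",
)

scores = {}  # hypothetical helper: max_depth -> training accuracy
for depth in (1, 2):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=1)
    tree.fit(X, y)
    scores[depth] = tree.score(X, y)
    print(f"max_depth={depth}: actual depth={tree.get_depth()}, "
          f"training accuracy={scores[depth]:.3f}")
```

A single split cannot separate all seven rows, while two levels of splits can, so the deeper tree fits the training rows more closely.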
iClicker cloud join link: https://join.iclicker.com/VYFJ
Select all of the following statements which are examples of supervised machine learning.
Select all of the following statements which are examples of regression problems.
Select all of the following statements which are TRUE.