




Imagine you’re taking a course with four homework assignments and two quizzes. You’re feeling nervous about Quiz 2, so you want to predict your Quiz 2 grade based on your past performance. You collect data from your friends who took the course in the past.

Here are a few rows from the data.

| | ml_experience | class_attendance | lab1 | lab2 | lab3 | lab4 | quiz1 | quiz2 |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 92 | 93 | 84 | 91 | 92 | 90 | 
| 1 | 1 | 0 | 94 | 90 | 80 | 83 | 91 | 84 | 
| 2 | 0 | 0 | 78 | 85 | 83 | 80 | 80 | 82 | 
| 3 | 0 | 1 | 91 | 94 | 92 | 91 | 89 | 92 | 
| 4 | 0 | 1 | 77 | 83 | 90 | 92 | 85 | 90 | 
| 5 | 1 | 0 | 70 | 73 | 68 | 74 | 71 | 75 | 
| 6 | 1 | 0 | 80 | 88 | 89 | 88 | 91 | 91 | 

A linear model assumes that the relationship between X and y is linear.
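
As a rough illustration of this setup, the sketch below fits a linear model to the seven rows shown above and predicts `quiz2` for one made-up student. The choice of scikit-learn's `Ridge`, the `alpha=1.0` setting, and the `new_student` values are assumptions made for this example; the source does not say which estimator was used, and seven rows are far too few to trust the fitted numbers.

```python
import pandas as pd
from sklearn.linear_model import Ridge

# The seven rows from the table above (a tiny sample, for illustration only).
df = pd.DataFrame(
    {
        "ml_experience": [1, 1, 0, 0, 0, 1, 1],
        "class_attendance": [1, 0, 0, 1, 1, 0, 0],
        "lab1": [92, 94, 78, 91, 77, 70, 80],
        "lab2": [93, 90, 85, 94, 83, 73, 88],
        "lab3": [84, 80, 83, 92, 90, 68, 89],
        "lab4": [91, 83, 80, 91, 92, 74, 88],
        "quiz1": [92, 91, 80, 89, 85, 71, 91],
        "quiz2": [90, 84, 82, 92, 90, 75, 91],
    }
)

X = df.drop(columns=["quiz2"])  # features
y = df["quiz2"]                 # target

# Assumption: Ridge (regularized linear regression) stands in for
# whatever linear model the course actually uses.
model = Ridge(alpha=1.0)
model.fit(X, y)

# One learned weight per feature, plus an intercept.
coefs = pd.Series(model.coef_, index=X.columns)
print(coefs)
print("intercept:", model.intercept_)

# Hypothetical student (values invented for this example).
new_student = pd.DataFrame(
    [{"ml_experience": 1, "class_attendance": 1, "lab1": 85, "lab2": 88,
      "lab3": 82, "lab4": 90, "quiz1": 87}],
    columns=X.columns,
)
pred = model.predict(new_student)

# A linear prediction is just intercept + sum(weight * feature value).
manual = model.intercept_ + (new_student.iloc[0] * coefs).sum()
print("predicted quiz2:", pred[0], "manual:", manual)
```

The last two lines compute the same number in two ways, which is the sense in which the model is "linear": each feature contributes its value times a fixed weight.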

| | review | label | review_pp |
|---|---|---|---|
| 47278 | First of all,there is a detective story:"légi... | positive | First of all,there is a detective story:"légi... | 
| 19664 | this attempt at a "thriller" would have no sub... | negative | this attempt at a "thriller" would have no sub... | 
| 22648 | What's the matter with you people? John Dahl? ... | positive | What's the matter with you people? John Dahl? ... | 
| 33662 | This is another one of those films that I reme... | positive | This is another one of those films that I reme... | 
| 31230 | I love Ben Kingsley and Tea Leoni. However, th... | negative | I love Ben Kingsley and Tea Leoni. However, th... | 
| | 00 | 000 | 007 | 0079 | 0080 | 0083 | 00pm | 00s | 01 | 0126 | ... | zurer | zuzz | zwart | zwick | zyada | zzzzip | zzzzz | â½ | â¾ | ã¼ber |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 

5 rows × 38867 columns

array(['00', 'affection', 'apprehensive', 'barbara', 'blore',
       'businessman', 'chatterjee', 'commanding', 'cramped', 'defining',
       'displaced', 'edie', 'evolving', 'fingertips', 'gaffers',
       'gravitas', 'heist', 'iliad', 'investment', 'kidnappee',
       'licentious', 'malã', 'mice', 'museum', 'obsessiveness',
       'parapsychologist', 'plasters', 'property', 'reclined',
       'ridiculous', 'sayid', 'shivers', 'sohail', 'stomaches', 'syrupy',
       'tolerance', 'unbidden', 'verneuil', 'wilcox'], dtype=object)

| | Coefficient |
|---|---|
| excellent | 0.637051 | 
| great | 0.501922 | 
| amazing | 0.499925 | 
| perfect | 0.470204 | 
| wonderful | 0.450895 | 
| ... | ... | 
| waste | -0.545904 | 
| terrible | -0.569702 | 
| boring | -0.595568 | 
| awful | -0.687145 | 
| worst | -0.922031 | 

32230 rows × 1 columns

Let’s visualize the 20 most important features.
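
A minimal sketch of how the features to plot might be chosen, assuming "most important" means largest absolute coefficient (the source does not define it). It uses only the ten coefficients printed in the table above and a hypothetical helper, `top_k_by_magnitude`, with `k=3` for brevity.

```python
import pandas as pd

# The ten coefficients visible in the table above.
coefs = pd.Series(
    {
        "excellent": 0.637051, "great": 0.501922, "amazing": 0.499925,
        "perfect": 0.470204, "wonderful": 0.450895,
        "waste": -0.545904, "terrible": -0.569702, "boring": -0.595568,
        "awful": -0.687145, "worst": -0.922031,
    }
)

def top_k_by_magnitude(coefs: pd.Series, k: int) -> pd.Series:
    """Return the k coefficients with the largest absolute value (hypothetical helper)."""
    order = coefs.abs().sort_values(ascending=False).index[:k]
    return coefs.loc[order]  # signs are kept so the direction stays visible

top = top_k_by_magnitude(coefs, k=3)
print(top)
```

Setting `k=20` on the full coefficient table would give the twenty features to draw, for example as a horizontal bar chart.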

Finally, let’s try predicting on some new examples.
fake_reviews = [
    "It got a bit boring at times but the direction was excellent and the acting was flawless. Overall I enjoyed the movie and I highly recommend it!",
    "The plot was shallower than a kiddie pool in a drought, but hey, at least we now know emojis should stick to texting and avoid the big screen.",
]
fake_reviews

['It got a bit boring at times but the direction was excellent and the acting was flawless. Overall I enjoyed the movie and I highly recommend it!',
 'The plot was shallower than a kiddie pool in a drought, but hey, at least we now know emojis should stick to texting and avoid the big screen.']
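
The prediction step could look roughly like the sketch below. It trains a small stand-in pipeline on four invented reviews, because the source's fitted model is not shown, and then calls `predict` and `predict_proba` on the two reviews above. The training data and the `CountVectorizer` plus `LogisticRegression` pipeline are assumptions, so the labels this toy model prints say nothing about what the real model predicts.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training set standing in for the real review data.
train_reviews = [
    "excellent direction and flawless acting, highly recommend",
    "I enjoyed this wonderful movie",
    "boring plot and awful acting",
    "the worst movie, a waste of time",
]
train_labels = ["positive", "positive", "negative", "negative"]

pipe = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
pipe.fit(train_reviews, train_labels)

fake_reviews = [
    "It got a bit boring at times but the direction was excellent and the acting was flawless. Overall I enjoyed the movie and I highly recommend it!",
    "The plot was shallower than a kiddie pool in a drought, but hey, at least we now know emojis should stick to texting and avoid the big screen.",
]

# The pipeline applies the same vectorizer to new text before classifying it.
preds = pipe.predict(fake_reviews)        # hard labels
probs = pipe.predict_proba(fake_reviews)  # one column per class in pipe.classes_

for review, label, p in zip(fake_reviews, preds, probs):
    print(label, dict(zip(pipe.classes_, p.round(3))), "|", review[:40])
```

Words in a new review that the vectorizer never saw during training are ignored, so a bag-of-words model cannot react to phrases such as "shallower than a kiddie pool" unless their words carry learned weights.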