Supervised Learning in a nutshell

I recently created a small notebook that describes popular supervised learning algorithms. It can serve as a cheat sheet for remembering what these algorithms do. I embedded the notebook here. If you want the fullscreen version, head over to GitHub and open the Gist.


How I created a SNL dataset with Scrapy

Not long ago, Kaggle introduced its new dataset feature. Every member of the community can now upload their own datasets for others to play with. This is a great addition, and there are already lots of interesting datasets out there. You can also use Kaggle to promote your dataset. I was thinking about a dataset I could provide, and while reading through the LiveFromNewYork subreddit I got the idea: what about a Saturday Night Live dataset? I searched the web and found a website with a very comprehensive database. I contacted its creator but got no answer. I didn’t want to stop my project before it really began, so I decided to scrape the data from the website. This blog post shows how I did that and what we can learn from over 40 seasons of hilarious data.
