In this lesson, we will learn how to train a Naive Bayes classifier and a Logistic Regression classifier - basic machine learning algorithms - on JSON text data, and classify it into categories.
While this dataset is still considered a small dataset -- only a couple hundred points of data -- we'll start to get better results.
The general rule is that Logistic Regression will work better than Naive Bayes, but only if there is enough data. Since this is still a pretty small dataset, Naive Bayes works better here. Generally, Logistic Regression takes longer to train as well.
This uses data from Ana Cachopo: http://ana.cachopo.org/datasets-for-single-label-text-categorization