    Classify JSON text data with machine learning in Natural

    natural ^0.4.0

    In this lesson, we will learn how to train two basic machine learning algorithms, a Naive Bayes classifier and a Logistic Regression classifier, on JSON text data, and use them to classify text into categories.

    Although this dataset is still considered small (only a couple hundred data points), it is large enough that we'll start to get better results.

    As a general rule, Logistic Regression works better than Naive Bayes, but only when there is enough training data. Since this is still a fairly small dataset, Naive Bayes works better here. Logistic Regression also generally takes longer to train.
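
    To make the Naive Bayes side of that comparison concrete, here is a minimal, dependency-free sketch of a multinomial Naive Bayes text classifier trained on JSON records. It is hand-rolled for illustration and is not Natural's implementation; the sample records, the `text`/`label` field names, and the `tokenize`, `train`, and `classify` helpers are all assumptions made for this example, not part of the lesson's dataset or code.

```javascript
// Hypothetical sample records standing in for the lesson's JSON data.
const trainingJson = JSON.stringify([
  { text: 'stocks fell as the bank cut interest rates', label: 'finance' },
  { text: 'the bank reported higher quarterly profit', label: 'finance' },
  { text: 'investors sold shares after the profit warning', label: 'finance' },
  { text: 'the team won the match with a late goal', label: 'sport' },
  { text: 'the striker scored twice in the final match', label: 'sport' },
  { text: 'the coach praised the team after the game', label: 'sport' }
]);

// Lowercase and split on non-letters. A real pipeline would usually
// also stem words and drop stop words.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z]+/).filter(Boolean);
}

// Count documents per label and word occurrences per label.
function train(records) {
  const model = {
    docCounts: new Map(),   // label -> number of documents
    wordCounts: new Map(),  // label -> Map(word -> count)
    totalWords: new Map(),  // label -> total tokens seen
    vocab: new Set(),       // every distinct word seen in training
    totalDocs: 0
  };
  for (const { text, label } of records) {
    model.docCounts.set(label, (model.docCounts.get(label) || 0) + 1);
    if (!model.wordCounts.has(label)) model.wordCounts.set(label, new Map());
    model.totalDocs += 1;
    const counts = model.wordCounts.get(label);
    for (const word of tokenize(text)) {
      counts.set(word, (counts.get(word) || 0) + 1);
      model.totalWords.set(label, (model.totalWords.get(label) || 0) + 1);
      model.vocab.add(word);
    }
  }
  return model;
}

// Pick the label with the highest log posterior, using add-one
// (Laplace) smoothing so unseen words do not zero out a label.
function classify(model, text) {
  const words = tokenize(text);
  let best = null;
  let bestScore = -Infinity;
  for (const [label, docCount] of model.docCounts) {
    let score = Math.log(docCount / model.totalDocs); // log prior
    const counts = model.wordCounts.get(label);
    const denom = (model.totalWords.get(label) || 0) + model.vocab.size;
    for (const word of words) {
      score += Math.log(((counts.get(word) || 0) + 1) / denom);
    }
    if (score > bestScore) {
      bestScore = score;
      best = label;
    }
  }
  return best;
}

const model = train(JSON.parse(trainingJson));
console.log(classify(model, 'the bank raised interest rates'));
console.log(classify(model, 'a late goal won the game'));
```

    Training here is a single counting pass over the records, which is one reason Naive Bayes is quick to train; Logistic Regression instead fits its weights iteratively.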

    This lesson uses data from Ana Cachopo: http://ana.cachopo.org/datasets-for-single-label-text-categorization