Join egghead, unlock knowledge.

Want more egghead?

This lesson is for members. Join us? Get access to all 3,000+ tutorials + a community with expert developers around the world.

Unlock This Lesson
1×
Become a member
to unlock all features

Level Up!

Access all courses & lessons on egghead today and lock-in your price for life.

Autoplay

    Break up language strings into parts using Natural

    naturalNatural
    ^0.4.0

    A part of Natural Language Processing (NLP) is processing text by “tokenizing” language strings. This means we can break up a string of text into parts by word, sentence, etc. In this lesson, we will use the natural library to tokenize a string. First, we will break the string into words using WordTokenizer, WordPunctTokenizer, and TreebankWordTokenizer. Then we will break the string into sentences using RegexpTokenizer.