This Lesson is for PRO Members.

Subscribe today and get access to all lessons! Plus direct HD download
for offline use, member comment forums, and iTunes "podcast" RSS feed.
Level up your skills now!

UNLOCK THIS LESSON

Already subscribed? Sign In

HIDE PLAYLIST

Break up language strings into parts using Natural


A part of Natural Language Processing (NLP) is processing text by “tokenizing” language strings. This means we can break up a string of text into parts by word, sentence, etc. In this lesson, we will use the natural library to tokenize a string. First, we will break the string into words using WordTokenizer, WordPunctTokenizer, and TreebankWordTokenizer. Then we will break the string into sentences using RegexpTokenizer.