Break up language strings into parts using Natural

One part of Natural Language Processing (NLP) is “tokenizing” language strings: breaking a string of text into parts such as words or sentences. In this lesson, we use the `natural` library to tokenize a string. First, we break the string into words using `WordTokenizer`, `WordPunctTokenizer`, and `TreebankWordTokenizer`. Then we break the string into sentences using `RegexpTokenizer`.
egghead.io
