Classify text into categories with machine learning in Natural

Instructor: Hannah Davis
Published 7 years ago
Updated 5 years ago

In this lesson, we will learn how to train a Naive Bayes classifier or a Logistic Regression classifier - basic machine learning algorithms - in order to classify text into categories.
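
To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain JavaScript with add-one smoothing. It is not Natural's implementation: the `TinyNaiveBayes` class, the `tokenize` helper, and the sample sentences are all hypothetical and exist only for illustration. Natural's own `BayesClassifier` follows a similar `addDocument` then `classify` flow, with an explicit `train()` step in between.

```javascript
// Hypothetical tokenizer: lowercase the text and keep runs of letters, digits, apostrophes.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Minimal multinomial Naive Bayes with add-one (Laplace) smoothing.
// Illustrative only; this is not how Natural implements its classifier.
class TinyNaiveBayes {
  constructor() {
    this.docCounts = new Map();  // label -> number of training documents
    this.wordCounts = new Map(); // label -> Map(word -> count)
    this.totalWords = new Map(); // label -> total number of tokens seen
    this.vocab = new Set();      // every distinct word seen in training
    this.totalDocs = 0;
  }

  addDocument(text, label) {
    if (!this.wordCounts.has(label)) {
      this.docCounts.set(label, 0);
      this.wordCounts.set(label, new Map());
      this.totalWords.set(label, 0);
    }
    this.docCounts.set(label, this.docCounts.get(label) + 1);
    this.totalDocs += 1;
    const counts = this.wordCounts.get(label);
    for (const word of tokenize(text)) {
      counts.set(word, (counts.get(word) || 0) + 1);
      this.totalWords.set(label, this.totalWords.get(label) + 1);
      this.vocab.add(word);
    }
  }

  classify(text) {
    const words = tokenize(text);
    let bestLabel = null;
    let bestScore = -Infinity;
    for (const [label, docCount] of this.docCounts) {
      // Score = log P(label) + sum over words of log P(word | label).
      let score = Math.log(docCount / this.totalDocs);
      const counts = this.wordCounts.get(label);
      const denominator = this.totalWords.get(label) + this.vocab.size;
      for (const word of words) {
        score += Math.log(((counts.get(word) || 0) + 1) / denominator);
      }
      if (score > bestScore) {
        bestScore = score;
        bestLabel = label;
      }
    }
    return bestLabel;
  }
}

// Made-up training data, two documents per category.
const classifier = new TinyNaiveBayes();
classifier.addDocument("the team won the match", "sports");
classifier.addDocument("a late goal decided the game", "sports");
classifier.addDocument("the new phone has a faster chip", "tech");
classifier.addDocument("the laptop ships with more memory", "tech");

const predicted = classifier.classify("who won the game");
console.log(predicted);
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the add-one smoothing keeps an unseen word from zeroing out a category.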

Yonatan Shalev
~ 4 years ago

Should I split documents into single sentences, or use them as is, to train a text classification model? I was wondering what the best way is to feed the model with training data.

Can I just use each document as is? Like this:

{"phrase": "First long document with up to 30 sentences", "result": {"label 1": 1}},
{"phrase": "First long document with up to 30 sentences", "result": {"label 2": 1}},
{"phrase": "Second long document with up to 30 sentences", "result": {"label 2": 1}}, etc.

Or should I split all documents into sentences, so that the data looks something like this:

{"phrase": "Sentence 1 out of document 1", "result": {"label 1": 1}},
{"phrase": "Sentence 2 out of document 1", "result": {"label 2": 1}}, etc.
{"phrase": "Sentence 1 out of document 2", "result": {"label 5": 1}}, etc.
{"phrase": "Sentence X out of document X", "result": {"No labels at all": 1}}, etc.

I have the same question about using the model: should I apply it to the complete document, or split the document into separate sentences and then apply the model to each sentence?

What's the best practice?
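
The two training-data layouts described in the question can be sketched as follows. This is only an illustration of the record shapes, not a recommendation for either one. The `splitSentences` helper and the sample documents are hypothetical, and the sketch assumes each sentence simply inherits its document's label, whereas in the question's example sentences may carry their own labels.

```javascript
// Hypothetical helper: naive split on '.', '!' and '?'.
// It will mis-split abbreviations and decimals; a real tokenizer would do better.
function splitSentences(text) {
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Made-up input documents, each with a single label.
const documents = [
  { text: "First sentence of document 1. Second sentence of document 1.", label: "label 1" },
  { text: "Only sentence of document 2.", label: "label 2" },
];

// Layout A: one training record per whole document.
const documentRecords = documents.map((doc) => ({
  phrase: doc.text,
  result: { [doc.label]: 1 },
}));

// Layout B: one training record per sentence.
// ASSUMPTION: every sentence inherits the label of its parent document.
const sentenceRecords = documents.flatMap((doc) =>
  splitSentences(doc.text).map((sentence) => ({
    phrase: sentence,
    result: { [doc.label]: 1 },
  }))
);

console.log(JSON.stringify(documentRecords, null, 2));
console.log(JSON.stringify(sentenceRecords, null, 2));
```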

Yonatan Shalev
~ 4 years ago

Also, how do I approach classification into multiple categories?
