Text Classification
- the process of assigning tags or categories to text according to its content
Text Classification - Subtypes
- sentiment analysis
- topic labeling
- spam detection
- intent detection
- etc
Text Classification - Baseline Algorithm
- tokenization & lemmatization
- feature extraction
- negation handling (e.g. not, didn’t)
- classification using different classifiers:
  - knowledge-based
  - statistical-based
    - Naive Bayes Models
      - Naive Bayes Model
      - Boolean Multinomial Naive Bayes Model - a modification of Naive Bayes Model
    - Maximum Entropy (MaxEnt) Models
    - Support Vector Machines (SVM)
  - hybrid
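
The baseline steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation, and it rests on several assumptions that are not in the notes: lemmatization is skipped, the negation word list is a small hypothetical sample, negated words are marked with a NOT_ prefix until the next punctuation mark (one common convention), and add-one smoothing is used. The names tokenize, mark_negation, features and BooleanMultinomialNB are made up for this example. The "Boolean" part is that each word counts at most once per document.

```python
import math
import re
from collections import Counter, defaultdict

# Assumption: a tiny illustrative negation list; real systems use larger ones.
NEGATIONS = {"not", "no", "never"}
PUNCTUATION = set(".,!?;")


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    text = text.lower().replace("’", "'")  # normalize curly apostrophes
    return re.findall(r"[a-z']+|[.,!?;]", text)


def mark_negation(tokens):
    """Prefix every word after a negation with NOT_ until the next punctuation."""
    out = []
    negated = False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False  # punctuation ends the negation scope
            continue
        out.append("NOT_" + tok if negated else tok)
        if tok in NEGATIONS or tok.endswith("n't"):
            negated = True
    return out


def features(text):
    """Boolean features: the set of distinct (negation-marked) tokens."""
    return set(mark_negation(tokenize(text)))


class BooleanMultinomialNB:
    """Multinomial Naive Bayes where each word counts once per document."""

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)  # per-class word counts
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = features(text)
            self.word_counts[label].update(feats)
            self.vocab |= feats
        return self

    def predict(self, text):
        feats = features(text) & self.vocab  # ignore unseen words
        n_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_docs.items():
            total = sum(self.word_counts[label].values())
            score = math.log(doc_count / n_docs)  # log prior
            for word in feats:
                # Add-one smoothed log likelihood (smoothing is an assumption).
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example.
texts = [
    "great movie, loved it",
    "did not like the plot",
    "not good at all",
    "really good and fun",
]
labels = ["pos", "neg", "neg", "pos"]

model = BooleanMultinomialNB().fit(texts, labels)
print(model.predict("not good"))  # neg
print(model.predict("good fun"))  # pos
```

Marking negation before feature extraction makes "good" and "NOT_good" separate features, so the classifier can learn opposite weights for them.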