Natural Language Processing (NLP) - Computational Linguistics

other names include Speech & Language Processing, Human Language Technology, and Speech Recognition & Synthesis
is a subfield of linguistics and artificial intelligence that is concerned with the computer’s ability to read, understand and derive meaning from natural languages
is an interdisciplinary field concerned with the computational modeling of natural language

NLP - Tutorials

NLP Introduction - Lecture Slides - 2012 Stanford Course ~ by Dan Jurafsky & Christopher Manning
NLP with Deep Learning - 2017 Stanford Course ~ by Christopher Manning & Richard Socher
A Comprehensive Learning Path to Understand and Master NLP in 2020
Ultimate Guide to Understand and Implement Natural Language Processing (with codes in Python)

NLP - Subpages

NLP - System Types

rule-based vs statistical -
manual vs automatic -

NLP - Tasks

Audio Related
- Speech Recognition - speech to text
  - Speech Segmentation - the task of separating speech into smaller units
- Speech Synthesis - text to speech
Visual Related - Computer Vision (CV)
- Optical Character Recognition (OCR) - image to text
- Text-to-Image - text to image

Grammar Induction - generate a formal grammar that describes a language’s syntax
Segmentation/Tokenizer:
- Sentence Segmentation (Sentence Boundary Disambiguation) - task of separating a body of text into sentences
- Tokenization (Word Segmentation) - process of breaking a body of text into tokens (e.g. words and/or phrases)
- Morphological Segmentation - the task of separating words into individual morphemes and identifying the classes of morphemes
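
A minimal sketch of sentence segmentation and tokenization using only Python's standard `re` module. The function names `split_sentences` and `tokenize` and the regular expressions are illustrative assumptions, not a standard API. The splitter is deliberately naive: it breaks on any `.`, `!` or `?` followed by whitespace, so it mis-splits after abbreviations such as "Dr. Smith", which is the kind of ambiguity sentence boundary disambiguation has to resolve.

```python
import re

def split_sentences(text):
    """Naive sentence segmentation: split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Naive tokenization: a word (optionally with internal apostrophes)
    or a single punctuation character becomes one token."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)

text = "The door is open. Is the door open? Open the door!"
sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]

for sentence, sentence_tokens in zip(sentences, tokens):
    print(sentence, "->", sentence_tokens)
```
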
Normalization - process of converting a token to a canonical form so that equivalent variants match (e.g. U.S.A → USA)
- Lower/Upper Casing -
- Stemming - the task of reducing inflected/derived words to their root form (removing affixes) (e.g. automates automatic automation → automat)
  - Porter’s algorithm - the most common English stemmer
- Lemmatization - the task of removing inflectional endings and returning the lemma (base dictionary form of a word), grouping together different forms of the same word (e.g. am are is → be | car cars car’s cars’ → car)
  - also takes into consideration the context of the word in order to solve other problems like disambiguation
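
The difference between the normalization steps above can be sketched as follows. This is a toy illustration under stated assumptions: `SUFFIXES` is a hypothetical, hand-picked suffix list (it is not Porter's algorithm), and `LEMMAS` is a hypothetical lookup table standing in for a real lemma dictionary. It ignores context, so it does not attempt the disambiguation a real lemmatizer would.

```python
# Hypothetical, deliberately tiny suffix list; this is NOT Porter's algorithm.
SUFFIXES = ("ion", "ic", "es", "s")

# Hypothetical lookup table standing in for a real lemma dictionary.
LEMMAS = {
    "am": "be", "are": "be", "is": "be",
    "cars": "car", "car's": "car", "cars'": "car",
}

def strip_periods(token):
    """Normalize abbreviations by dropping periods."""
    return token.replace(".", "")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def lemmatize(word):
    """Dictionary lookup; unknown words are returned unchanged (lowercased)."""
    word = word.lower()
    return LEMMAS.get(word, word)

stems = {crude_stem(w) for w in ["automates", "automatic", "automation"]}
lemmas = [lemmatize(w) for w in ["am", "are", "is"]]
print(strip_periods("U.S.A"), stems, lemmas)
```

The stemmer only chops characters, so its output ("automat") need not be a real word, while the lemmatizer returns a dictionary form ("be") that shares no letters with some of its inputs.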
Part-of-Speech (PoS) Tagging - the task of determining the Part of Speech (PoS) for each word in a sentence
Syntactic Parsing - is a method of syntactic analysis of a sentence (e.g. the task of determining the parse-tree of a given sentence)
- Constituency Parsing - focuses on building a parse tree of the constituents (phrases) of a sentence
- Dependency Parsing - focuses on the relationships between words in a sentence (e.g. marking words such as primary objects and predicates)

Word/Phrase Semantics:
- Morphology - components of words that carry meanings aside from actual definition of word (e.g. singular vs plural)
- Lexical Semantics - meaning of individual words (in context)
- Compositional Semantics - meaning of phrases/groups of words (e.g. distinction between Western Europe and Eastern Europe)
Distributional Semantics - theories and methods for quantifying and categorizing semantic similarities between linguistic items based on their distributional properties in large samples of language data
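
One common way to make "distributional properties" concrete is to represent each word by the counts of the words that appear near it, then compare words by the cosine of the angle between their count vectors. The sketch below assumes a three-sentence toy corpus, a context window of 2 tokens, and whitespace tokenization; the helper names `cooccurrence_vectors` and `cosine` are illustrative, and real systems use far larger corpora and usually weighted or learned vectors.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy corpus (an assumption for illustration only).
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
]
vectors = cooccurrence_vectors(corpus)

sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_milk = cosine(vectors["cat"], vectors["milk"])
print(f"cat~dog: {sim_cat_dog:.3f}  cat~milk: {sim_cat_milk:.3f}")
```

In this corpus "cat" and "dog" occur in similar contexts ("the", "drinks", "chases"), so they score as more similar to each other than "cat" and "milk" do.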
Machine Translation - task of translating a document from one language to another
Information Extraction - the task of extracting information (e.g. entities, relations, events, temporal, etc) from a body of text
- Named Entity Recognition - the task of identifying proper names in a body of text and classifying their type (e.g. person, organization, location)
- Relationship Extraction - the task of identifying the relationships among entities in a body of text (e.g. who is married to whom)
- etc
- Textual Entailment Recognition - given two text fragments, determine whether one being true:
  - entails the other
  - entails the other’s negation
  - allows the other to be true or false
Text Classification -
- Sentiment Analysis - the task of determining the sentiment of a body of text or a word
  - positive, neutral, or negative
  - emotion (happy, sad, angry, etc)
  - etc
- Topic Segmentation - the task of separating a body of text into segments, each of which is devoted to a topic
- Topic Recognition/Labeling - the task of identifying the topic of text
- Language Detection - determining the language of the text
- Intent Detection - determining the underlying goal/intent of a given text
- Sentence Type Identification -
  - request/command - e.g. open the door
  - statement - e.g. the door is open
  - question - e.g. is the door open?
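
As a rough illustration of a rule-based approach to sentence type identification, the sketch below classifies the three example sentences using hand-written rules. The word lists `QUESTION_STARTERS` and `COMMAND_VERBS` are hypothetical and far from complete; a statistical classifier would instead learn such cues from labeled examples.

```python
# Hypothetical, incomplete cue lists chosen only to cover the examples above.
QUESTION_STARTERS = {
    "is", "are", "do", "does", "did", "can", "could", "will", "would",
    "who", "what", "when", "where", "why", "how",
}
COMMAND_VERBS = {"open", "close", "stop", "go", "please"}

def sentence_type(sentence):
    """Classify a sentence as question, request/command, or statement."""
    stripped = sentence.strip()
    words = stripped.lower().rstrip(".!?").split()
    if not words:
        return "unknown"
    # A trailing question mark or an auxiliary/wh-word up front suggests a question.
    if stripped.endswith("?") or words[0] in QUESTION_STARTERS:
        return "question"
    # An imperative usually starts with a bare verb.
    if words[0] in COMMAND_VERBS:
        return "request/command"
    return "statement"

for example in ["open the door", "the door is open", "is the door open?"]:
    print(example, "->", sentence_type(example))
```
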
Disambiguation - the task of resolving the ambiguity inherent in human language (e.g. choosing the intended sense of a word from its context)

Automatic Summarization - the task of producing a summary of a body of text
Referring Expressions Detection - a more general task than coreference resolution that also includes identifying “bridging relationships” (e.g. in “he entered John’s house through the front door”, the front door is a referring expression, and the bridging relationship to be identified is the fact that the door is the front door of John’s house)
- Co-Reference Resolution - the task of determining which words (“mentions”) refer to the same objects (“entities”). makes use of knowledge about how words like that or pronouns like it or she refer to previous parts of the discourse
  - Anaphora Resolution - a specific type of coreference resolution concerned with matching up pronouns with the nouns or named entities to which they refer
Question-Answering - given a question, determine the meaning of its words, then determine the answer (see Search Engines - Types)
Conversational Agents or Dialogue Systems - superset of question-answering. computer programs that are able to converse with humans in natural language
Discourse Analysis - a number of tasks:
- identifying the discourse structure of connected text
- recognizing and classifying speech-acts in text (e.g. yes-no question, content question, statement, assertion, etc)

／var／log marcus chiu
