|
Causal Language Modeling (CLM)
|
- is an autoregressive method where the model is trained to predict the next token in a sequence given the previous tokens
- is used in models like GPT-2 and GPT-3
- is well-suited for tasks such as text generation and summarization
- has only unidirectional context: each prediction conditions on the preceding tokens and never on the tokens that follow
|
|---|
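As a rough illustration of the next-token objective and the unidirectional context described for CLM, the sketch below builds (context, target) training pairs and a lower-triangular attention mask in plain Python. The helper names `next_token_pairs` and `causal_mask` are made up for this example and do not come from any library; real models operate on token IDs and tensors rather than string lists.

```python
def next_token_pairs(tokens):
    """Return (context, target) pairs: each target is predicted
    from the tokens that precede it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def causal_mask(n):
    """Build an n x n lower-triangular mask.
    mask[i][j] == 1 means position i may attend to position j (j <= i)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


tokens = ["the", "cat", "sat", "down"]
pairs = next_token_pairs(tokens)
mask = causal_mask(len(tokens))

for context, target in pairs:
    print(context, "->", target)
for row in mask:
    print(row)
```

Each row of the mask shows which positions a token can see: the first token sees only itself, and the last token sees the whole prefix.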
|
Masked Language Modeling (MLM)
|
- is a training method used in models like BERT, where some tokens in the input sequence are masked, and the model learns to predict the masked tokens based on the surrounding context
- has the advantage of bidirectional context, allowing the model to consider both past and future tokens when making predictions
- is useful for tasks like text classification, sentiment analysis, and named entity recognition
|
|---|
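The masking step described for MLM can be sketched as follows. This is a simplified illustration, not BERT's exact procedure: it replaces each selected token with a `[MASK]` placeholder and omits the random-replacement and keep-as-is variants that BERT also applies. The function name `mask_tokens` and its parameters are assumptions for this example; the default rate of 0.15 matches the masking rate reported for BERT.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (inputs, labels).
    inputs: the tokens with some positions replaced by MASK.
    labels: the original token at masked positions, None elsewhere,
            so the loss is computed only on masked positions."""
    rng = random.Random(seed)  # seeded for reproducibility
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels


sentence = "the quick brown fox jumps over the lazy dog".split()
inputs, labels = mask_tokens(sentence, mask_prob=0.3, seed=1)
print(inputs)
print(labels)
```

Because the model sees the unmasked tokens on both sides of each `[MASK]`, it can use left and right context when predicting the hidden token.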
|
Sequence-to-Sequence (Seq2Seq)
|
- uses an encoder-decoder architecture, where the encoder processes the input sequence and the decoder generates the output sequence
- is commonly used in tasks like machine translation, summarization, and question-answering
- handles tasks that transform one sequence into another, which makes it versatile across a wide range of NLP tasks
|
|---|
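To make the encoder-decoder control flow concrete, here is a toy sketch with no learned parameters. The `LEXICON` dictionary is a hypothetical stand-in for a trained model, and `encode`, `decode_step`, and `generate` are names invented for this example. What carries over to real Seq2Seq models is the shape of the loop: encode the input once, then generate output tokens one at a time until an end-of-sequence token appears.

```python
EOS = "<eos>"

# Hypothetical toy lexicon standing in for learned parameters.
LEXICON = {"hello": "bonjour", "world": "monde"}


def encode(source_tokens):
    """Encoder: read the whole input and return a representation of it.
    Here the representation is simply a tuple of the source tokens."""
    return tuple(source_tokens)


def decode_step(memory, generated):
    """Decoder step: choose the next output token from the encoder
    memory and the tokens generated so far."""
    position = len(generated)
    if position >= len(memory):
        return EOS
    # Unknown words are copied through unchanged.
    return LEXICON.get(memory[position], memory[position])


def generate(source_tokens, max_len=20):
    """Run the encoder once, then decode token by token until EOS
    or until max_len tokens have been produced."""
    memory = encode(source_tokens)
    generated = []
    while len(generated) < max_len:
        token = decode_step(memory, generated)
        if token == EOS:
            break
        generated.append(token)
    return generated


output = generate(["hello", "world"])
print(output)
```

In a trained model the decoder step would score every vocabulary item using the encoder states and the generated prefix, rather than looking up a fixed table.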