BERT (Bidirectional Encoder Representations from Transformers)
  • is a large language model
  • introduced in 2018
  • developed by Google
  • transformer-based language representation model trained on a large cross-domain corpus
  • applies a masked language model to predict words that are randomly masked in a sequence, and this is followed by a next-sentence-prediction task for learning the associations between sentences
  • because BERT doesn’t have a decoder component, it can’t generate text, which paved the way for GPT models to pick up where BERT left off

BERT - Pyramid of Fundamental Concepts

Resources