◾Contextual LM

🔻Pretrained LM

🔸ELMo (Embeddings from Language Models)

🔸GPT-1 (Generative Pre-trained Transformer)

🔸BERT (Bidirectional Encoder Representations from Transformers)

🔻Encoder-only models (auto-encoding models)

🔸XLNet (generalized autoregressive pretraining, built on Transformer-XL)

🔸RoBERTa (Robustly Optimized BERT Pretraining Approach)