2023, Gerhard Paaß, Sven Giesselbach, Foundation Models for Natural Language Processing: Pre-trained Language Models Integrating Media, Springer Nature, page 130:
GLaM [51] is an autoregressive mixture-of-experts (MoE) model with up to 1200B parameters.