[B! BLOOM] arrowKatoのブックマーク

arrowKato id:arrowKato

BLOOMに関するarrowKatoのブックマーク (1)

The Technology Behind BLOOM Training
Please note that both Megatron-LM and DeepSpeed have Pipeline Parallelism and BF16 Optimizer implementations, but we used the ones from DeepSpeed as they are integrated with ZeRO. Megatron-DeepSpeed implements 3D Parallelism to allow huge models to train in a very efficient way. Let’s briefly discuss the 3D components. DataParallel (DP) - the same setup is replicated multiple times, and each being
arrowKato 2023/06/13
学習

LLM

BLOOM
リンク
1

お知らせ

もっと読む

公式Twitter

@HatenaBookmark
リリース、障害情報などのサービスのお知らせ
@hatebu
最新の人気エントリーの配信

キーボードショートカット一覧

j次のブックマーク

k前のブックマーク

lあとで読む

eコメント一覧を開く

oページを開く

設定を変更しましたx