Pull requests: allenai/OLMo

Pull requests list

MoE
#639 opened Jun 30, 2024 by Muennighoff Draft
WIP: Scaling laws pipeline
#638 opened Jun 28, 2024 by AkshitaB
muP implementation
#637 opened Jun 28, 2024 by AkshitaB
Add option to skip optim steps for 0 grad params
#636 opened Jun 28, 2024 by epwalsh
Unit tests
#635 opened Jun 26, 2024 by AkshitaB
Amberish 7B hero run
#629 opened Jun 21, 2024 by epwalsh Draft
Config for Amberish experiments at 1B
#621 opened Jun 12, 2024 by drschwenk
Running Amber experiments at 7B
#620 opened Jun 12, 2024 by epwalsh Draft
Normal baselines
#618 opened Jun 12, 2024 by AkshitaB
added git ref to the config keys
#617 opened Jun 11, 2024 by drschwenk
Chameleon stability experiments
#616 opened Jun 11, 2024 by AkshitaB Draft
Add option to record step size metrics from AdamW
#605 opened Jun 3, 2024 by epwalsh
Optionally load trainer state
#573 opened May 13, 2024 by Muennighoff
Reverse weight decay
#567 opened May 3, 2024 by AkshitaB
1 task done
Add reorder cache for beam search
#526 opened Mar 26, 2024 by cshaib
Add scripts for Dave
#516 opened Mar 21, 2024 by epwalsh Draft
Scripts for QKV experiments
#510 opened Mar 20, 2024 by AkshitaB
hf_olmo: support flash attn 2
#471 opened Feb 29, 2024 by wade3han
integrate mock vision backbone into model
#441 opened Feb 8, 2024 by epwalsh
DeepSpeed
#384 opened Nov 27, 2023 by Muennighoff Draft
Kebab7
#360 opened Nov 3, 2023 by dirkgr Draft