Adds optuna-hpo skill for hyperparameter optimisation #19
Open
chrisvoncsefalvay wants to merge 3 commits into huggingface:main
Introduces a new skill for hyperparameter optimisation of LLM fine-tuning using Optuna with cloud GPU backends. Train the future without getting bankrupted in the process!

Features:
- Distributed HPO with local Optuna orchestrator and cloud trial execution
- Support for Huggingface Jobs, Modal and Runpod
- TPE sampler with MedianPruner for efficient search
- Configurable search spaces (standard, LoRA, comprehensive)
- Budget management with cost tracking and warnings
- SQLite persistence with optional Hub sync
- Gradio dashboard for visualisation
- User clarification flow before launching expensive GPU jobs

Files added:
- hf-optuna-hpo/plugin.json - Skill metadata
- hf-optuna-hpo/skills/optuna-hpo/SKILL.md - Main documentation
- hf-optuna-hpo/skills/optuna-hpo/scripts/ - Python implementations
- hf-optuna-hpo/skills/optuna-hpo/references/ - Reference documentation

TODO:
- Extend to Runpod, maybe Modal if there's demand
- Nicer Gradio dash design
Collaborator
This is very cool. Do you have any examples of using this?
Author
Absolutely, I use it for research projects all the time to do HPO sweeps on a subset straight from Claude. I could use a tool like WandB Sweeps, but what this gives me is the ability to just push the jobs off to HF Jobs. Go for lunch, have a (op)tuna sandwich, come back to hyperparameters on budget. I was hoping to introduce it properly in a blog post, but won't get to it until later this week. That'll hopefully have some screenshots and everything.
Adds some minor improvements to increase reliability.