
kerashub integration#4786

Closed
divyashreepathihalli wants to merge 2 commits into google:main from divyashreepathihalli:kh-adk

Conversation


@divyashreepathihalli divyashreepathihalli commented Mar 10, 2026

This will bring Gemma models instantly to ADK.

  1. Description of Change:

Problem:
The ADK framework currently lacks native support for running locally-hosted large language models (LLMs) via KerasHub. This limits users who want to leverage open-source models (Gemma, Llama, Mistral, etc.) that can be run locally without cloud API dependencies.

Solution:
Added a new KerasHubLlm class that integrates KerasHub CausalLM models with the ADK framework. This integration:

Provides a seamless interface to use any KerasHub-compatible model (Gemma, Llama, Mistral, GPT-2, OPT, Falcon, etc.)
Handles model-specific prompt formatting using chat templates tailored for each model family
Supports tool/function calling by converting tool declarations into text instructions
Enables streaming responses via async generators
Works locally without cloud API calls
Key components added:

KerasHubLlm class - Main LLM integration inheriting from BaseLlm
Prompt formatting helpers - Model-aware prompt formatters for Gemma, Llama, Mistral, and generic fallback
Tool injection - Converts tool declarations to text instructions for local models
Function call parsing - Extracts structured function calls from model text output
Example script - Demonstrates basic generation and tool-calling capabilities
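The tool-injection and function-call-parsing components described above can be sketched roughly as follows. `build_tool_instructions` and `try_extract_function_call` are illustrative stand-ins, not the PR's actual helpers; the real implementation in kerashub_llm.py may be named and structured differently:

```python
import json
import re


def build_tool_instructions(tools: list[dict]) -> str:
  """Render tool declarations as plain-text instructions for a local model."""
  lines = [
      'You may call one of these tools by replying with a JSON object',
      'of the form {"name": ..., "parameters": {...}}:',
  ]
  for tool in tools:
    lines.append(f"- {tool['name']}: {tool.get('description', '')}")
  return '\n'.join(lines)


def try_extract_function_call(text: str):
  """Return (name, params) if the model output contains a JSON tool call."""
  match = re.search(r'\{.*\}', text, re.DOTALL)
  if not match:
    return None
  try:
    parsed = json.loads(match.group(0))
  except json.JSONDecodeError:
    return None
  if not isinstance(parsed, dict):  # guard against JSON lists/strings
    return None
  name = parsed.get('name') or parsed.get('function')
  params = parsed.get('parameters') or parsed.get('args')
  if not name:
    return None
  return name, params or {}
```

The key idea is that the model never sees a native tool API: the declarations go in as prose, and structured calls come back out via text parsing, which is why the review below flags parsing robustness.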
Testing Plan
Unit Tests:
I have added or updated unit tests for my change.
All unit tests pass locally.
Pytest Results:

Manual End-to-End (E2E) Tests:
Setup:

Instructions:

Run the example script to verify basic functionality:
Verify tool-calling works with function declarations
Test with different model variants (gemma3, llama, mistral)
Confirm streaming responses work with async generator
Results:
The example successfully demonstrates:

✅ Basic text generation with Gemma 3 model
✅ Async streaming responses
✅ Function/tool calling with JSON extraction from model output
Checklist
I have read the CONTRIBUTING.md document.
I have performed a self-review of my own code.
I have commented my code, particularly in hard-to-understand areas.
I have added tests that prove my fix is effective or that my feature works.
New and existing unit tests pass locally with my changes.
I have manually tested my changes end-to-end.
Any dependent changes have been merged and published in downstream modules. (N/A for this feature)
Additional Context
Files Modified:

kerashub_llm.py (505 lines) - Main implementation
__init__.py - Exported KerasHubLlm
kerashub_example.py (231 lines) - Usage examples
Key Features:

✅ Model-agnostic prompt formatting with fallback
✅ Automatic function call extraction from text
✅ Support for multiple Keras backends (JAX, TensorFlow, PyTorch)
✅ Async/await support for streaming
✅ Kaggle integration for model downloads
Known Limitations:

Local models have higher latency than API-based solutions
Memory requirements vary by model size and backend
Function calling relies on text parsing rather than native tool APIs

@adk-bot adk-bot added the models [Component] Issues related to model support label Mar 10, 2026

adk-bot commented Mar 10, 2026

Response from ADK Triaging Agent

Hello @divyashreepathihalli, thank you for your contribution!

This looks like a great new feature. According to our contribution guidelines, all PRs, other than small documentation or typo fixes, should have an Issue associated with them. Could you please create a new issue for this feature or link to an existing one if it already exists?

This helps us track new features and bugs more effectively. Thanks!


Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a KerasHubLlm class that integrates KerasHub CausalLM models with the ADK framework. This enhancement allows users to run locally-hosted large language models (LLMs) via KerasHub, supporting models like Gemma, Llama, and Mistral without cloud API dependencies. The integration includes model-specific prompt formatting, tool/function calling support, and an example script to demonstrate its capabilities.

Highlights

  • KerasHub Integration: Introduces a new KerasHubLlm class to integrate KerasHub CausalLM models with the ADK framework, enabling local execution of LLMs like Gemma, Llama, and Mistral.
  • Model-Agnostic Prompt Formatting: Implements model-aware prompt formatters for Gemma, Llama, and Mistral, with a generic fallback, ensuring compatibility with various KerasHub-compatible models.
  • Tool/Function Calling Support: Enables tool/function calling by converting tool declarations into text instructions, allowing local models to leverage external tools.
  • Example Script: Includes an example script demonstrating basic generation, agent integration, and tool-calling capabilities with the KerasHub LLM.


Activity
  • Added a new KerasHubLlm class for integrating KerasHub models.
  • Implemented model-specific prompt formatting for Gemma, Llama, and Mistral.
  • Added support for tool/function calling by converting tool declarations into text instructions.
  • Included an example script demonstrating basic generation and tool-calling capabilities.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces KerasHub integration, allowing the use of local LLMs within the ADK framework. A critical security vulnerability has been identified due to hardcoded credentials in an example script, alongside potential prompt injection risks and a possible crash in LLM output parsing. Additionally, improvements are suggested for exception handling and the use of the asyncio library to enhance robustness and adherence to modern Python practices.

Comment on lines +98 to +176
def _format_gemma_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  if system_instruction:
    parts.append(f'<start_of_turn>user\n{system_instruction}<end_of_turn>')

  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    role = 'model' if content.role == 'model' else 'user'
    parts.append(f'<start_of_turn>{role}\n{text}<end_of_turn>')

  parts.append('<start_of_turn>model')
  return '\n'.join(parts)


def _format_llama_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  if system_instruction:
    parts.append(f'[INST] <<SYS>>\n{system_instruction}\n<</SYS>>\n')

  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    if content.role == 'model':
      parts.append(f'{text} </s>')
    else:
      parts.append(f'[INST] {text} [/INST]')

  return '\n'.join(parts)


def _format_mistral_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  first_user = True
  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    if content.role == 'model':
      parts.append(f'{text}</s>')
    else:
      if first_user and system_instruction:
        text = f'{system_instruction}\n\n{text}'
      first_user = False
      parts.append(f'[INST] {text} [/INST]')

  return ' '.join(parts)


def _format_generic_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  if system_instruction:
    parts.append(f'System: {system_instruction}\n')

  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    if content.role == 'model':
      parts.append(f'Assistant: {text}')
    else:
      parts.append(f'User: {text}')

  parts.append('Assistant:')
  return '\n'.join(parts)

Severity: security-high

The prompt formatting functions (_format_gemma_prompt, _format_llama_prompt, _format_mistral_prompt, and _format_generic_prompt) directly concatenate user-supplied content into the final prompt string without sanitizing model-specific control tokens (e.g., <start_of_turn>, [INST], Assistant:). An attacker can craft input containing these tokens to break out of the user message block and inject arbitrary instructions or model responses, potentially leading to unauthorized tool execution or manipulation of the agent's behavior.

Remediation: Implement a sanitization step that escapes or removes model-specific control tokens from user-supplied text before it is incorporated into the prompt.
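A minimal sketch of such a sanitization step, assuming a fixed list of known control tokens (the helper name and token list are illustrative, not part of the PR):

```python
# Hypothetical sanitizer: strips model control tokens from user-supplied
# text before it is interpolated into a chat-template prompt.
_CONTROL_TOKENS = (
    '<start_of_turn>', '<end_of_turn>',                   # Gemma
    '[INST]', '[/INST]', '<<SYS>>', '<</SYS>>', '</s>',   # Llama / Mistral
)


def sanitize_user_text(text: str) -> str:
  """Remove known chat-template control tokens from untrusted input."""
  for token in _CONTROL_TOKENS:
    text = text.replace(token, '')
  return text
```

Stripping (rather than escaping) is the simplest option; escaping would preserve the user's literal text at the cost of a model-specific encoding scheme.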

Comment on lines +293 to +295
parsed = json.loads(json_candidate)
name = parsed.get('name') or parsed.get('function')
params = parsed.get('parameters') or parsed.get('args')

Severity: security-medium

In _try_extract_function_call, the code calls .get() on the result of json.loads(). If the LLM generates a valid JSON string that is not an object (e.g., a list [1, 2, 3] or a string "not an object"), json.loads() will return a type that does not have a .get() method. This will raise an AttributeError, which is not caught by the existing try...except block (which only catches json.JSONDecodeError, KeyError, and TypeError). This can cause the entire generation process to crash when processing unexpected model output.

Remediation: Update the try...except block to also catch AttributeError, or explicitly verify that the parsed variable is a dictionary before calling .get().

    parsed = json.loads(json_candidate)
    if not isinstance(parsed, dict):
      return
    name = parsed.get('name') or parsed.get('function')
    params = parsed.get('parameters') or parsed.get('args')

Comment on lines +73 to +75
except Exception:
  # KerasHub support requires: pip install keras keras-hub
  pass

Severity: medium

Catching a generic Exception is too broad and can hide unrelated errors during module import. It's better to catch the more specific ImportError to handle cases where keras-hub or its dependencies are not installed.

Suggested change
except Exception:
  # KerasHub support requires: pip install keras keras-hub
  pass
except ImportError:
  # KerasHub support requires: pip install keras keras-hub
  pass

)

try:
  loop = asyncio.get_event_loop()

Severity: medium

asyncio.get_event_loop() is deprecated when called without a running event loop (a DeprecationWarning since Python 3.10). Inside a coroutine, use asyncio.get_running_loop() instead; it is safer and explicit about requiring a running loop on the current OS thread.

      loop = asyncio.get_running_loop()
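Assuming the loop is obtained in order to offload the blocking model call, a common pattern for wrapping synchronous inference in async code looks like this. `generate` here is a stand-in for the actual KerasHub call, not the PR's implementation:

```python
import asyncio


def generate(prompt: str) -> str:
  """Stand-in for a blocking KerasHub model call."""
  return prompt.upper()


async def generate_async(prompt: str) -> str:
  # get_running_loop() must be called from a coroutine; outside one it
  # raises RuntimeError, surfacing misuse early instead of silently
  # creating a new loop the way get_event_loop() could.
  loop = asyncio.get_running_loop()
  # Offload the blocking call to a thread pool so the event loop
  # stays responsive while the model generates.
  return await loop.run_in_executor(None, generate, prompt)
```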

@rohityan rohityan self-assigned this Mar 11, 2026
@rohityan rohityan closed this Mar 11, 2026
