Response from ADK Triaging Agent
Hello @divyashreepathihalli, thank you for your contribution! This looks like a great new feature. According to our contribution guidelines, all PRs, other than small documentation or typo fixes, should have an Issue associated with them. Could you please create a new issue for this feature or link to an existing one if it already exists? This helps us track new features and bugs more effectively. Thanks!
Code Review
This pull request introduces KerasHub integration, allowing the use of local LLMs within the ADK framework. A critical security vulnerability has been identified due to hardcoded credentials in an example script, alongside potential prompt injection risks and a possible crash in LLM output parsing. Additionally, improvements are suggested for exception handling and the use of the asyncio library to enhance robustness and adherence to modern Python practices.
def _format_gemma_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  if system_instruction:
    parts.append(f'<start_of_turn>user\n{system_instruction}<end_of_turn>')

  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    role = 'model' if content.role == 'model' else 'user'
    parts.append(f'<start_of_turn>{role}\n{text}<end_of_turn>')

  parts.append('<start_of_turn>model')
  return '\n'.join(parts)


def _format_llama_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  if system_instruction:
    parts.append(f'[INST] <<SYS>>\n{system_instruction}\n<</SYS>>\n')

  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    if content.role == 'model':
      parts.append(f'{text} </s>')
    else:
      parts.append(f'[INST] {text} [/INST]')

  return '\n'.join(parts)


def _format_mistral_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  first_user = True
  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    if content.role == 'model':
      parts.append(f'{text}</s>')
    else:
      if first_user and system_instruction:
        text = f'{system_instruction}\n\n{text}'
        first_user = False
      parts.append(f'[INST] {text} [/INST]')

  return ' '.join(parts)


def _format_generic_prompt(
    contents: list[types.Content],
    system_instruction: Optional[str] = None,
) -> str:
  parts = []
  if system_instruction:
    parts.append(f'System: {system_instruction}\n')

  for content in contents:
    text = _extract_text_from_content(content)
    if not text:
      continue
    if content.role == 'model':
      parts.append(f'Assistant: {text}')
    else:
      parts.append(f'User: {text}')

  parts.append('Assistant:')
  return '\n'.join(parts)
The prompt formatting functions (_format_gemma_prompt, _format_llama_prompt, _format_mistral_prompt, and _format_generic_prompt) directly concatenate user-supplied content into the final prompt string without sanitizing model-specific control tokens (e.g., <start_of_turn>, [INST], Assistant:). An attacker can craft input containing these tokens to break out of the user message block and inject arbitrary instructions or model responses, potentially leading to unauthorized tool execution or manipulation of the agent's behavior.
Remediation: Implement a sanitization step that escapes or removes model-specific control tokens from user-supplied text before it is incorporated into the prompt.
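One way the sanitization step could look is sketched below. The helper name `sanitize_prompt_text` and the token list are assumptions made for illustration (the tokens are the ones visible in the formatters under review), and the sketch removes tokens rather than escaping them. It only addresses template break-out, not prompt injection expressed in ordinary language.

```python
import re

# Control tokens copied from the formatters in this PR; the list is an
# assumption and would need to track every supported chat template.
_CONTROL_TOKENS = (
    '<start_of_turn>',
    '<end_of_turn>',
    '[INST]',
    '[/INST]',
    '<<SYS>>',
    '<</SYS>>',
    '</s>',
)
_CONTROL_TOKEN_RE = re.compile(
    '|'.join(re.escape(token) for token in _CONTROL_TOKENS)
)
# Role labels used by the generic formatter; only matched at line start.
_ROLE_LABEL_RE = re.compile(r'^(\s*)(System|User|Assistant):', re.MULTILINE)


def sanitize_prompt_text(text: str) -> str:
  """Strips chat-template control tokens from untrusted text (hypothetical helper)."""
  # Repeat until stable so that removing one token cannot splice the
  # surrounding fragments into a new token, e.g. '[IN[INST]ST]'.
  previous = None
  while previous != text:
    previous = text
    text = _CONTROL_TOKEN_RE.sub('', text)
  # Break the 'Label:' pattern at line starts so it no longer reads as a turn.
  return _ROLE_LABEL_RE.sub(r'\1\2 -', text)
```

A formatter would call this on each user-supplied `text` (and arguably on tool output) before wrapping it in turn markers. System instructions written by the developer would not need it.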
parsed = json.loads(json_candidate)
name = parsed.get('name') or parsed.get('function')
params = parsed.get('parameters') or parsed.get('args')
In _try_extract_function_call, the code calls .get() on the result of json.loads(). If the LLM generates a valid JSON string that is not an object (e.g., a list [1, 2, 3] or a string "not an object"), json.loads() will return a type that does not have a .get() method. This will raise an AttributeError, which is not caught by the existing try...except block (which only catches json.JSONDecodeError, KeyError, and TypeError). This can cause the entire generation process to crash when processing unexpected model output.
Remediation: Update the try...except block to also catch AttributeError, or explicitly verify that the parsed variable is a dictionary before calling .get().
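The failure mode is easy to reproduce in isolation. The following is a minimal, self-contained sketch of a parser with the type check in place. `parse_function_call` is a hypothetical stand-in, not the PR's `_try_extract_function_call`, and returning `None` for anything that is not a call object is an assumption about the desired behavior.

```python
import json
from typing import Any, Optional


def parse_function_call(
    json_candidate: str,
) -> Optional[tuple[str, dict[str, Any]]]:
  """Returns (name, params), or None if the candidate is not a call object."""
  try:
    parsed = json.loads(json_candidate)
  except json.JSONDecodeError:
    return None
  # Valid JSON can be a list, string, or number; only objects have .get().
  if not isinstance(parsed, dict):
    return None
  name = parsed.get('name') or parsed.get('function')
  params = parsed.get('parameters') or parsed.get('args') or {}
  # Reject objects that parse but do not describe a usable call.
  if not isinstance(name, str) or not isinstance(params, dict):
    return None
  return name, params
```

Checking the type explicitly, instead of adding `AttributeError` to the `except` clause, keeps the error handling narrow and makes the accepted shape obvious to a reader.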
parsed = json.loads(json_candidate)
if not isinstance(parsed, dict):
  return
name = parsed.get('name') or parsed.get('function')
params = parsed.get('parameters') or parsed.get('args')

except Exception:
  # KerasHub support requires: pip install keras keras-hub
  pass
Catching a generic Exception is too broad and can hide unrelated errors during module import. It's better to catch the more specific ImportError to handle cases where keras-hub or its dependencies are not installed.
- except Exception:
-   # KerasHub support requires: pip install keras keras-hub
-   pass
+ except ImportError:
+   # KerasHub support requires: pip install keras keras-hub
+   pass
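A small sketch of the optional-dependency pattern, under stated assumptions: the module name `keras_hub_not_installed` is deliberately fake so the fallback path runs everywhere, and `require_backend` is a hypothetical helper that defers the error until the feature is actually used.

```python
import importlib

try:
  # Fake module name (assumption) so the example always takes the fallback.
  optional_backend = importlib.import_module('keras_hub_not_installed')
except ImportError:
  # ModuleNotFoundError is a subclass of ImportError, so it is caught here.
  # Any other exception raised while importing would still propagate.
  optional_backend = None


def require_backend():
  """Returns the backend module or raises with an install hint."""
  if optional_backend is None:
    raise ImportError(
        'KerasHub support requires: pip install keras keras-hub'
    )
  return optional_backend
```

With `except Exception`, a genuine bug raised during import (for example a `NameError` in module-level code) would be swallowed in the same way as a missing package, which is the situation the review comment warns about.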
)

try:
  loop = asyncio.get_event_loop()
This will bring Gemma models instantly to ADK.

Problem:
The ADK framework currently lacks native support for running locally-hosted large language models (LLMs) via KerasHub. This limits users who want to leverage open-source models (Gemma, Llama, Mistral, etc.) that can be run locally without cloud API dependencies.
Solution:
Added a new KerasHubLlm class that integrates KerasHub CausalLM models with the ADK framework. This integration:
Provides a seamless interface to use any KerasHub-compatible model (Gemma, Llama, Mistral, GPT-2, OPT, Falcon, etc.)
Handles model-specific prompt formatting using chat templates tailored for each model family
Supports tool/function calling by converting tool declarations into text instructions
Enables streaming responses via async generators
Works locally without cloud API calls
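As a rough illustration of streaming from a blocking local model through an async generator, consider the sketch below. Everything in it is an assumption for demonstration: `fake_generate` stands in for a blocking model call, and the text is chunked only after generation finishes, so this is not true token-by-token streaming and is not the PR's implementation. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
from typing import AsyncGenerator, Callable


def fake_generate(prompt: str) -> str:
  """Stand-in for a blocking local model call (hypothetical)."""
  return f'echo: {prompt}'


async def stream_response(
    prompt: str,
    generate: Callable[[str], str] = fake_generate,
    chunk_size: int = 4,
) -> AsyncGenerator[str, None]:
  """Yields the generated text in fixed-size chunks."""
  # Run the blocking call in a worker thread so the event loop stays free.
  text = await asyncio.to_thread(generate, prompt)
  for start in range(0, len(text), chunk_size):
    yield text[start:start + chunk_size]


async def collect(prompt: str) -> list[str]:
  """Drains the async generator into a list, for demonstration."""
  return [chunk async for chunk in stream_response(prompt)]
```

Inside a coroutine, `asyncio.to_thread` or `asyncio.get_running_loop()` is generally preferred over `asyncio.get_event_loop()`, whose use without a running loop is deprecated in recent Python versions.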
Key components added:
KerasHubLlm class - Main LLM integration inheriting from BaseLlm
Prompt formatting helpers - Model-aware prompt formatters for Gemma, Llama, Mistral, and generic fallback
Tool injection - Converts tool declarations to text instructions for local models
Function call parsing - Extracts structured function calls from model text output
Example script - Demonstrates basic generation and tool-calling capabilities
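To make the tool-injection idea concrete, here is a minimal sketch that renders tool declarations as text instructions. The plain-dict tool schema, the function name `build_tool_instructions`, and the wording of the instructions are all assumptions; the PR presumably works from ADK's own declaration types and may phrase the instructions differently.

```python
import json
from typing import Any


def build_tool_instructions(tools: list[dict[str, Any]]) -> str:
  """Renders tool declarations as plain-text instructions (hypothetical)."""
  if not tools:
    return ''
  lines = ['You have access to the following tools:']
  for tool in tools:
    lines.append(f"- {tool['name']}: {tool.get('description', '')}")
    # sort_keys keeps the rendered prompt stable across runs.
    schema = json.dumps(tool.get('parameters', {}), sort_keys=True)
    lines.append(f'  parameters: {schema}')
  lines.append('To call a tool, reply with only a JSON object of the form:')
  lines.append('{"name": "<tool name>", "parameters": {...}}')
  return '\n'.join(lines)
```

The returned string would be appended to the system instruction before prompt formatting, and the model's reply would then go through function call parsing to recover a structured call.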
Testing Plan
Unit Tests:
I have added or updated unit tests for my change.
All unit tests pass locally.
Pytest Results:
Manual End-to-End (E2E) Tests:
Setup:
Instructions:
Run the example script to verify basic functionality:
Verify tool-calling works with function declarations
Test with different model variants (gemma3, llama, mistral)
Confirm streaming responses work with async generator
Results:
The example successfully demonstrates:
✅ Basic text generation with Gemma 3 model
✅ Async streaming responses
✅ Function/tool calling with JSON extraction from model output
Checklist
I have read the CONTRIBUTING.md document.
I have performed a self-review of my own code.
I have commented my code, particularly in hard-to-understand areas.
I have added tests that prove my fix is effective or that my feature works.
New and existing unit tests pass locally with my changes.
I have manually tested my changes end-to-end.
Any dependent changes have been merged and published in downstream modules. (N/A for this feature)
Additional Context
Files Modified:
kerashub_llm.py (505 lines) - Main implementation
__init__.py - Exported KerasHubLlm
kerashub_example.py (231 lines) - Usage examples
Key Features:
✅ Model-agnostic prompt formatting with fallback
✅ Automatic function call extraction from text
✅ Support for multiple Keras backends (JAX, TensorFlow, PyTorch)
✅ Async/await support for streaming
✅ Kaggle integration for model downloads
Known Limitations:
Local models have longer latency compared to API-based solutions
Memory requirements vary by model size and backend
Function calling relies on text parsing rather than native tool APIs