Improve Code Tool, Sandbox and Eval #1120

debanjum · 2025-02-17T10:28:49Z

Improve Code Tool, Sandbox

Improve code gen chat actor to output code in inline md code blocks
Stop code sandbox on request timeout to allow sandbox process restarts
Use tenacity retry decorator to retry executing code in sandbox
Add retry logic to code execution and add health check to sandbox container
Add E2B as an optional code sandbox provider
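
The retry and timeout items above can be sketched roughly as follows. This is a stdlib-only illustration, not the code in this PR: the PR uses the tenacity retry decorator, whereas this sketch swaps in a plain loop with exponential backoff so it runs without extra dependencies. `SandboxError`, `run_with_retry`, `execute` and `stop_sandbox` are hypothetical names.

```python
import asyncio


class SandboxError(Exception):
    """Hypothetical error raised when the sandbox fails to run code."""


async def run_with_retry(execute, stop_sandbox, attempts=3, timeout=30.0, base_delay=0.5):
    """Await `execute()` with a timeout, retrying on failure.

    On timeout, `stop_sandbox()` is awaited so that whatever supervises the
    sandbox process can restart it before the next attempt.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(execute(), timeout=timeout)
        except asyncio.TimeoutError as error:
            last_error = error
            await stop_sandbox()
        except SandboxError as error:
            last_error = error
        if attempt < attempts - 1:
            # Exponential backoff between attempts: base_delay, 2x, 4x, ...
            await asyncio.sleep(base_delay * 2**attempt)
    raise SandboxError(f"code execution failed after {attempts} attempts") from last_error
```

Stopping the sandbox only on timeout keeps ordinary execution errors cheap to retry, while a hung sandbox gets a chance to restart.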

Improve Gemini Chat Models

Default to non-zero temperature for all queries to Gemini models
Default to Gemini 2.0 flash instead of 1.5 flash on setup
Set default chat model to KHOJ_CHAT_MODEL env var if set
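
As a minimal sketch of the last two items, the default chat model could be resolved like this. Only the `KHOJ_CHAT_MODEL` variable name comes from this PR; the function name and the exact fallback identifier `gemini-2.0-flash` are assumptions made for illustration.

```python
import os

# Assumed fallback identifier; the PR only says "Gemini 2.0 flash".
FALLBACK_CHAT_MODEL = "gemini-2.0-flash"


def resolve_default_chat_model(env=None, fallback=FALLBACK_CHAT_MODEL):
    """Return KHOJ_CHAT_MODEL if it is set and non-empty, else the fallback."""
    env = os.environ if env is None else env
    model = (env.get("KHOJ_CHAT_MODEL") or "").strip()
    return model or fallback
```

Passing `env` explicitly makes the function easy to exercise without touching the real process environment.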

sabaimran

🎁 excited to have better reliability for the code execution feature

src/khoj/processor/tools/run_code.py

src/khoj/utils/helpers.py

src/khoj/processor/tools/run_code.py

src/khoj/processor/conversation/prompts.py

src/khoj/utils/initialization.py


gitguardian · 2025-03-09T11:08:41Z

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request

| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
| --- | --- | --- | --- | --- |
| 15894534 | Triggered | Generic High Entropy Secret | `45fb85f` | docker-compose.yml |

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secret safely. Learn here the best practices.
Revoke and rotate this secret.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future, consider:

following these best practices for managing and storing secrets, including API keys and other credentials
installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.

🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

- Specify the E2B API key and template to use via env variables
- Try to load and use the e2b library when the E2B API key is set
- Fall back to the Terrarium sandbox otherwise
- Enable more Python packages in the E2B sandbox, like rdkit, via a custom E2B template
- Use the async E2B sandbox
- Parallelize file IO with the sandbox
- Add documentation on how to enable E2B as the code sandbox instead of Terrarium
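
The provider fallback described in this commit could look roughly like the sketch below. The env variable names `E2B_API_KEY` and `E2B_TEMPLATE`, the module name `e2b_code_interpreter`, and the function itself are assumptions for illustration; the commit message does not spell them out.

```python
import importlib.util
import os


def pick_sandbox_provider(env=None, e2b_available=None):
    """Return (provider, template): E2B when usable, else Terrarium.

    E2B is chosen only if an API key is set and the e2b library can be found.
    """
    env = os.environ if env is None else env
    if e2b_available is None:
        # Assumed module name for the e2b library.
        e2b_available = importlib.util.find_spec("e2b_code_interpreter") is not None
    if env.get("E2B_API_KEY") and e2b_available:
        return "e2b", env.get("E2B_TEMPLATE")
    return "terrarium", None
```

Checking for the library lazily keeps E2B an optional dependency: installs without it simply keep using Terrarium.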


- Simplify the code gen chat actor to improve the code gen success rate, especially for smaller models and models with limited JSON mode support
- Allow specifying code blocks inline with reasoning to try to improve code quality
- Infer input files based on user file paths referenced in the code
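
One way to picture this change: rather than requiring a JSON object, the model's free-form response is scanned for fenced code blocks, and input files are inferred from the paths the code mentions. The sketch below is an illustration under those assumptions, with hypothetical function names, and is not the parser used in this PR.

```python
import re

FENCE = "`" * 3  # built programmatically so this example nests cleanly in markdown
CODE_BLOCK_RE = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response: str) -> str:
    """Join the bodies of all fenced code blocks found in a model response."""
    blocks = [match.group(1).strip() for match in CODE_BLOCK_RE.finditer(response)]
    return "\n\n".join(blocks)


def infer_input_files(code: str, user_files: list[str]) -> list[str]:
    """Keep only the user file paths that the code references verbatim."""
    return [path for path in user_files if path in code]
```

A plain substring match is crude, but it avoids asking smaller models to emit a separate, well-formed list of input files.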


Previously, the text output of E2B code execution was being encoded as b64. This was breaking:

- The AI model's ability to see the content of the file
- Downloading the output text file with appropriately encoded content

The issue was introduced when adding the E2B code sandbox in #1120.
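
A minimal sketch of the distinction this fix relies on: text outputs are passed through as readable strings, and only binary outputs are base64 encoded. The function name, the returned dictionary shape and the extension list are hypothetical and not taken from the PR.

```python
import base64
import os

# Hypothetical set of extensions treated as text; not taken from the PR.
TEXT_EXTENSIONS = {".txt", ".csv", ".json", ".md"}


def package_output_file(filename: str, content: bytes) -> dict:
    """Return text files as plain strings and everything else as base64."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in TEXT_EXTENSIONS:
        return {"filename": filename, "encoding": "text", "data": content.decode("utf-8")}
    encoded = base64.b64encode(content).decode("ascii")
    return {"filename": filename, "encoding": "b64", "data": encoded}
```

Keeping text unencoded lets the model read the output directly, and a download handler can use the `encoding` field to decide whether to decode before writing the file.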

debanjum force-pushed the try-improve-code-tool-usage branch from a7662b1 to d7bf13d on February 18, 2025 03:01

sabaimran approved these changes Feb 18, 2025


debanjum force-pushed the try-improve-code-tool-usage branch from 1020dcb to 9cb38ce on February 28, 2025 11:41

debanjum and others added 5 commits March 7, 2025 13:48

Log eval run progress percentage for orientation

f13bdc5

Add retry logic to code execution and add health check to sandbox container

4a28714

Use tenacity retry decorator to retry executing code in sandbox

ecc2f79

Stop code sandbox on request timeout to allow sandbox process restarts

701a7be

Default to gemini 2.0 flash instead of 1.5 flash on Gemini setup

b4183c7

Add price of gemini 2.0 flash for cost calculations

debanjum force-pushed the try-improve-code-tool-usage branch 2 times, most recently from 3c133f9 to eb74fba on March 9, 2025 11:08

debanjum added 5 commits March 9, 2025 18:23

Default to non-zero temperature for all queries to Gemini models.

8305fdd

This may mitigate the intermittent invalid JSON output issues. The model may be going into repetition loops; a non-zero temperature may avoid that.

Set default chat model to KHOJ_CHAT_MODEL env var if set

94ca458

Simplify code logic to set default_use_model during init for readability

Improvements based on code feedback

c133d11

debanjum force-pushed the try-improve-code-tool-usage branch from eb74fba to c133d11 on March 9, 2025 12:53

debanjum merged commit 9751adb into master Mar 9, 2025
10 checks passed

debanjum deleted the try-improve-code-tool-usage branch March 9, 2025 13:20
