
Improve Code Tool, Sandbox and Eval #1120


Merged
merged 10 commits on Mar 9, 2025


debanjum
Member

Improve Code Tool, Sandbox

  • Improve code gen chat actor to output code in inline md code blocks
  • Stop code sandbox on request timeout to allow sandbox process restarts
  • Use tenacity retry decorator to retry executing code in sandbox
  • Add retry logic to code execution and add health check to sandbox container
  • Add E2B as an optional code sandbox provider
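To illustrate the retry behaviour described above: the PR uses the tenacity retry decorator, but the sketch below swaps in a small standard-library decorator so it runs without extra dependencies. `SandboxUnavailable` and `run_in_sandbox` are hypothetical names, not taken from the Khoj codebase.

```python
import functools
import time


def retry(max_attempts=3, delay_seconds=0.0, retry_on=(Exception,)):
    """Retry the wrapped function until it succeeds or attempts run out."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    # Re-raise once the final attempt has failed.
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay_seconds)

        return wrapper

    return decorator


class SandboxUnavailable(Exception):
    """Hypothetical error raised when the sandbox does not respond."""


calls = {"count": 0}


@retry(max_attempts=3, retry_on=(SandboxUnavailable,))
def run_in_sandbox(code):
    # Hypothetical stand-in for a sandbox call: fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise SandboxUnavailable("sandbox is restarting")
    return {"success": True, "code": code}
```

Retrying only on a specific exception type keeps genuine code errors (for example, a syntax error in the generated code) from being retried needlessly.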

Improve Gemini Chat Models

  • Default to non-zero temperature for all queries to Gemini models
  • Default to Gemini 2.0 flash instead of 1.5 flash on setup
  • Set default chat model to KHOJ_CHAT_MODEL env var if set
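A minimal sketch of the `KHOJ_CHAT_MODEL` override, assuming the environment variable simply takes precedence over a built-in default. The function name and the fallback model string `gemini-2.0-flash` are assumptions for illustration, not the actual Khoj setup code.

```python
import os

# Assumed fallback model name, for illustration only.
FALLBACK_CHAT_MODEL = "gemini-2.0-flash"


def pick_default_chat_model(env=None):
    """Return the model named by KHOJ_CHAT_MODEL if set, else the fallback."""
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value the same as an unset variable.
    configured = env.get("KHOJ_CHAT_MODEL", "").strip()
    return configured or FALLBACK_CHAT_MODEL
```

Passing the environment as a parameter makes the choice easy to test without mutating `os.environ`.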

@debanjum debanjum force-pushed the try-improve-code-tool-usage branch from a7662b1 to d7bf13d Compare February 18, 2025 03:01
Member

@sabaimran sabaimran left a comment


🎁 excited to have better reliability for the code execution feature

@debanjum debanjum force-pushed the try-improve-code-tool-usage branch from 1020dcb to 9cb38ce Compare February 28, 2025 11:41
@debanjum debanjum force-pushed the try-improve-code-tool-usage branch 2 times, most recently from 3c133f9 to eb74fba Compare March 9, 2025 11:08

gitguardian bot commented Mar 9, 2025

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id | GitGuardian status | Secret | Commit | Filename
15894534 | Triggered | Generic High Entropy Secret | 45fb85f | docker-compose.yml
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

debanjum added 5 commits March 9, 2025 18:23
- Specify the E2B API key and template to use via env variables
- Try to load and use the e2b library when the E2B API key is set
- Fall back to trying the terrarium sandbox otherwise
- Enable more python packages in the e2b sandbox, like rdkit, via a custom e2b template
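The selection logic in this commit can be sketched roughly as below. The environment variable names `E2B_API_KEY` and `E2B_TEMPLATE`, the module name `e2b_code_interpreter`, and the function itself are assumptions for illustration; the PR only states that the key and template come from env variables.

```python
import importlib.util
import os


def choose_sandbox(env=None, e2b_installed=None):
    """Pick E2B when its API key is set and its library loads, else Terrarium."""
    env = os.environ if env is None else env
    if e2b_installed is None:
        # Check whether the (assumed) E2B client module can be imported.
        e2b_installed = importlib.util.find_spec("e2b_code_interpreter") is not None

    if env.get("E2B_API_KEY") and e2b_installed:
        # The template is optional; None means the provider's default template.
        return {"provider": "e2b", "template": env.get("E2B_TEMPLATE")}
    return {"provider": "terrarium", "template": None}
```

Checking both the key and the library means a missing optional dependency degrades to Terrarium instead of failing at import time.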

- Use Async E2B Sandbox
- Parallelize file IO with sandbox
- Add documentation on how to enable E2B as code sandbox instead of Terrarium

A non-zero temperature may mitigate the intermittent invalid JSON output
issues. The model may be going into repetition loops, which a non-zero
temperature may avoid.

Simplify the code gen chat actor to improve the success rate of correct code
generation, especially for smaller models and models with limited JSON mode support.

Allow specifying code blocks inline with reasoning to try to improve
code quality.

Infer input files based on user file paths referenced in code.
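One simple way to infer input files, sketched under the assumption that a file counts as referenced when its path appears verbatim in the generated code. The function name and example paths are hypothetical, and the actual matching in the PR may differ.

```python
def infer_input_files(code, user_files):
    """Return the user file paths that appear verbatim in the code."""
    return [path for path in user_files if path in code]


# Hypothetical generated code and user files, for illustration.
generated_code = 'import pandas as pd\ndf = pd.read_csv("/home/user/sales.csv")'
available_files = ["/home/user/sales.csv", "/home/user/notes.md"]
referenced_files = infer_input_files(generated_code, available_files)
```

Only the referenced files would then need to be uploaded to the sandbox, so the model no longer has to list its input files separately.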
Simplify code log to set default_use_model during init for readability
@debanjum debanjum force-pushed the try-improve-code-tool-usage branch from eb74fba to c133d11 Compare March 9, 2025 12:53
@debanjum debanjum merged commit 9751adb into master Mar 9, 2025
10 checks passed
@debanjum debanjum deleted the try-improve-code-tool-usage branch March 9, 2025 13:20
debanjum added a commit that referenced this pull request Mar 19, 2025
Previously, E2B code execution text output content was being encoded as b64.

This was breaking
- The AI model's ability to see the content of the file
- Downloading the output text file with appropriately encoded content

The issue was introduced when adding the E2B code sandbox in #1120
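A rough sketch of the distinction behind this fix, assuming text output is passed through unchanged and only binary output is base64-encoded. The function name and the dictionary keys are hypothetical, not the Khoj data model.

```python
import base64


def package_output_file(filename, content):
    """Keep text output readable; base64-encode only binary output."""
    if isinstance(content, bytes):
        encoded = base64.b64encode(content).decode("ascii")
        return {"filename": filename, "encoding": "base64", "data": encoded}
    # Text stays as-is so the model can read it and downloads are not garbled.
    return {"filename": filename, "encoding": "text", "data": content}
```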

2 participants