server: streaming of tool calls and thoughts when --jinja is on #12379

Merged
merged 102 commits on May 25, 2025
Changes from 74 commits
Commits
16c9c63
add common_regex w/ support for partial final matches
Mar 12, 2025
6dcff43
add common_json w/ support for truncated json healing
Mar 12, 2025
a95fe78
renaming: string_find_partial_stop (moved to common.cpp)
Mar 12, 2025
ce2f593
add common_chat_msg_diff
Mar 12, 2025
cd3681d
partial common_chat_parse
Mar 12, 2025
9462365
refactor parser w/ optionals
Mar 12, 2025
6ed8a8f
server: wire chat diffs in stream mode
Mar 12, 2025
eaeed7d
fix trigger of thinking models (must happen after thoughts are closed)
Mar 13, 2025
d6e680a
nits + docs
Mar 14, 2025
64ea080
fix functionary v3.2 raw python!
Mar 14, 2025
c46d4da
rename: common_chat_syntax (now contains format)
Mar 14, 2025
4358d5d
rm common_regex.at_start
Mar 14, 2025
f477288
Merge remote-tracking branch 'origin/master' into tool-diffs
Mar 14, 2025
e0202b3
fix gcc compilation
Mar 14, 2025
f840e3a
fix unreachable code warning after [[noreturn]] annotation
Mar 14, 2025
af7391e
fix / refactor test-regex-partial
Mar 14, 2025
449917b
fix test-chat
Mar 14, 2025
b428b5c
rm spaces
Mar 14, 2025
668fc90
fix command r7b partial parsing (lacked args path)
Mar 14, 2025
b48ab23
Update test_chat_completion.py
Mar 14, 2025
aefc8a4
refactor + test chat parser (try_consume_json_with_dumped_args, liter…
Mar 15, 2025
22428a4
return partial msg from server
Mar 15, 2025
5b9c5a4
refactor partial json
Mar 15, 2025
3fbe84f
don't return empty <think></think>
Mar 15, 2025
d4cb7fe
test_tool_call: allow comment lines in now-multiline code strings (fo…
Mar 15, 2025
31f5eb2
accommodate yet another deepseek r1 distill fantasy syntax (<|tool▁ca…
Mar 15, 2025
bddc65a
rm space
Mar 15, 2025
ea3bf03
nit: fix python type
Mar 15, 2025
f3bfbc6
refactor test-chat-parser
Mar 15, 2025
bb7b9fe
fix QwQ 32B tool call parsing after thoughts (hermes2)
Mar 15, 2025
f0ea330
fix thinking models + tool calls (</think> not part of trigger's capt…
Mar 15, 2025
7856949
reinstate tool call id logic, keep track of previously generated ids
Mar 15, 2025
2412b5d
better logs for triggers
Mar 15, 2025
02913b0
fix msg diff test
Mar 15, 2025
c5c3482
try_consume_regex: basic tests + fix non-partial case
Mar 15, 2025
af79da0
chat-parser: test+fix finish, incomplete methods
Mar 15, 2025
562800f
normalize args in test-chat
Mar 15, 2025
ddeb318
consume spaces after parse_json_tool_calls
Mar 15, 2025
6c3f87e
Revert "fix thinking models + tool calls (</think> not part of trigge…
Mar 15, 2025
e2cef66
fix required tool calls w/ thinking models that have pre-opened think…
Mar 15, 2025
7a61eca
fix thinking model's initial trigger (take 2) + test qwq's template
Mar 15, 2025
2f55571
refactor chat parser (rm incomplete)
Mar 15, 2025
303f640
test groups of common_chat_msg_parser.try_consume_regex
Mar 15, 2025
e9540ad
run most test_tool_call tests in stream + non-stream modes
Mar 15, 2025
a818114
make functionary v3.2 parsing more strict (differentiate first match …
Mar 16, 2025
5031366
send final diff from server, to close off raw python arguments
Mar 16, 2025
dae6a28
nit: spaces
Mar 16, 2025
f026cb0
fix diff aggregation logic in make_any_request
Mar 16, 2025
e7f9d3e
fix test_chat_completion_with_timings_per_token & test_logprobs_stream
Mar 16, 2025
165b525
add missing functional import for gcc compilation
Mar 16, 2025
9d4a6f1
fix typo in test_calc_result
Mar 16, 2025
64b4039
fix thoughts parsing logic
Mar 16, 2025
fbba5da
support partial content streaming in Generic mode
Mar 16, 2025
4dcd653
strip reasoning (now that tags are strings and not regexes)
Mar 16, 2025
56156b7
run test_thoughts in stream mode too
Mar 16, 2025
5dfa2f7
r1: avoid partial call triggers from spaces
Mar 16, 2025
91a5084
fix test_thoughts / refactor expectations
Mar 16, 2025
4f78d44
fix partial json crashes
Mar 16, 2025
ea57e47
fix test-chat's unparsed thought expectation
Mar 16, 2025
1d25178
Merge remote-tracking branch 'origin/master' into tool-diffs
Mar 23, 2025
42cb16f
fix partial json crash after comma
Mar 23, 2025
37b4a3a
fix test-chat.cpp
Mar 23, 2025
13d725d
fix gcc build of test
Mar 23, 2025
a40aead
Merge remote-tracking branch 'origin/master' into tool-diffs
Mar 26, 2025
329d943
Merge remote-tracking branch 'origin/master' into tool-diffs
Apr 1, 2025
e63e542
Merge remote-tracking branch 'origin/master' into tool-diffs
Apr 3, 2025
21cd34c
fix regex-partial (drop reluctant repetitions conversions)
Apr 4, 2025
5f0450d
partial regex: allow newlines in prefixes
Apr 4, 2025
36ecb01
tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5)
Apr 4, 2025
68eeff1
Update function-calling.md
Apr 4, 2025
12deff6
nit: spaces
Apr 4, 2025
d0a686b
Update tool_bench.py
Apr 4, 2025
a604b2d
Merge remote-tracking branch 'origin/master' into tool-diffs
Apr 4, 2025
90789cd
Inject date_string in llama 3.x + test it & functionary v2
Apr 5, 2025
71435cf
Inject date_string in llama 3.x + fix for functionary v2
Apr 5, 2025
543b73e
add missing chrono include
Apr 7, 2025
e3c372c
move/fix detection of functionary v3.1 before llama 3.x, fix & test t…
Apr 7, 2025
387611a
Merge branch 'date' into tool-diffs
Apr 7, 2025
01a3e31
Merge remote-tracking branch 'origin/master' into tool-diffs
Apr 7, 2025
59b87c5
move string_find_partial_stop & string_ends_with to common
Apr 7, 2025
ff35374
add common_regex (supports partial matches)
Apr 7, 2025
869e1a9
Update test-regex-partial.cpp
Apr 7, 2025
6f109fa
Update common/common.cpp
ochafik Apr 18, 2025
908e12f
Update common/regex-partial.cpp
ochafik Apr 18, 2025
868b442
Update common/regex-partial.cpp
ochafik Apr 18, 2025
2ea5f5c
Update common/regex-partial.h
ochafik Apr 18, 2025
b275da3
partial regex: add missing iterator end checks
Apr 18, 2025
9b620e5
string utils: use string_views
Apr 18, 2025
5c99bdc
direct throw to avoid ggml.h include
Apr 18, 2025
e051be6
regex-partial: replace missed ggml_asserts
Apr 18, 2025
afce553
Merge remote-tracking branch 'origin/master' into partial-regex
May 14, 2025
c879a57
Merge branch 'partial-regex' into tool-diffs
May 14, 2025
ad07a3b
Merge remote-tracking branch 'origin/master' into tool-diffs
May 15, 2025
573e8c3
fix merge
May 15, 2025
d6e1d5b
Merge remote-tracking branch 'origin/master' into tool-diffs
May 15, 2025
6946a83
Merge remote-tracking branch 'origin/master' into tool-diffs
May 15, 2025
224101b
chat-parser: remove input from exception (llm output may contain PII)
May 16, 2025
6ddda10
Merge remote-tracking branch 'origin/master' into tool-diffs
May 16, 2025
8886c24
disable failing tests from test_tool_call.py
May 16, 2025
810c4c3
json-partial: add comments
May 17, 2025
f0d5df2
Merge remote-tracking branch 'origin/master' into tool-diffs
May 23, 2025
40951c8
Merge remote-tracking branch 'origin/master' into tool-diffs
May 24, 2025
6 changes: 6 additions & 0 deletions common/CMakeLists.txt
@@ -58,19 +58,25 @@ add_library(${TARGET} STATIC
     base64.hpp
     chat.cpp
     chat.h
+    chat-parser.cpp
+    chat-parser.h
     common.cpp
     common.h
     console.cpp
     console.h
     json-schema-to-grammar.cpp
     json.hpp
+    json-partial.h
+    json-partial.cpp
     llguidance.cpp
     log.cpp
     log.h
     minja/chat-template.hpp
     minja/minja.hpp
     ngram-cache.cpp
     ngram-cache.h
+    regex-partial.cpp
+    regex-partial.h
     sampling.cpp
     sampling.h
     speculative.cpp