Adding EOS Tokens to Qwen Models #2512
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2512
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 3d979b7 with merge base fb6d4cd.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
A gentle ping on this.
Thanks for the ping, sorry this fell through the cracks! The None is coming from the misuse of bos_id here: bos_id is not actually defined for Qwen2, so it shows up as None.
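A minimal sketch of the failure mode being described, with hypothetical names (ToyTokenizer is not torchtune's actual API): when a tokenizer family like Qwen2 defines no BOS token, an unconditional prepend of bos_id inserts None into the token stream.

```python
# Hypothetical sketch, not torchtune's real tokenizer class.
class ToyTokenizer:
    def __init__(self, bos_id=None, eos_id=2):
        self.bos_id = bos_id  # Qwen2-style models define no BOS, so this stays None
        self.eos_id = eos_id

    def encode(self, ids, add_bos=True, add_eos=True):
        tokens = list(ids)
        if add_bos:
            tokens = [self.bos_id] + tokens  # None sneaks in here
        if add_eos:
            tokens = tokens + [self.eos_id]
        return tokens

qwen_like = ToyTokenizer()
print(qwen_like.encode([10, 11]))  # [None, 10, 11, 2]
```

Guarding the prepend with something like `if add_bos and self.bos_id is not None` avoids emitting the None token.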
Interesting. Can I get a quick review on the changes I have made so far, just to be sure about my implementation?
Code looks nice and clean and I don't see any obvious errors (with the exception of the bos_id thing I already pointed out to you). Of course, the real test will be the unit tests :)
Hey @ariG23498, @joecummings, I am new to torchtune and OSS, and I was thinking of taking on Qwen2_5. Will this PR solve the issue for both Qwen2 and Qwen2_5? I think Qwen2_5's tokenizer inherits from Qwen2's, so there's no need to implement it separately for Qwen2_5?
@joecummings I think the PR is in a good state to review. The tests pass on my end. @ysurs I checked the Qwen 2.5 tokenizer, and it does inherit from Qwen 2's tokenizer, but I believe the edits still need to be made, since the current tokenizer API would have changed.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2512      +/-   ##
==========================================
- Coverage   64.26%   60.70%    -3.56%
==========================================
  Files         427      427
  Lines       25988    25997       +9
==========================================
- Hits        16700    15781     -919
- Misses       9288    10216     +928

☔ View full report in Codecov by Sentry.
Hi @joecummings, the PR had some incorrect docstrings, which have now been fixed. Can you trigger the tests, so that I can be certain about the changes?
If you look at the contributing guide, you should be able to run these tests very quickly on your local machine to ensure that you're passing the linter. |
Thanks for the PR @ariG23498! Two small comments, but other than that this looks good to me
mask.append(mask[-1])
if add_end_tokens:
    tokens = tokens + [self.eos_id]
    mask = mask + [mask[-1] if mask else True]
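The snippet above can be isolated into a small sketch (append_eos is a hypothetical helper, not part of torchtune): the appended EOS token inherits the mask value of the final message rather than a hard-coded True.

```python
def append_eos(tokens, mask, eos_id, add_end_tokens=True):
    # Mirrors the logic in the diff: extend the mask with its last
    # value so the EOS token matches the final message's mask.
    if add_end_tokens:
        tokens = tokens + [eos_id]
        mask = mask + [mask[-1] if mask else True]
    return tokens, mask

# The last message is unmasked (False), so EOS gets False too.
tokens, mask = append_eos([10, 11, 12], [True, True, False], eos_id=2)
print(tokens)  # [10, 11, 12, 2]
print(mask)    # [True, True, False, False]
```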
Noob q: why is this different than what we're doing for Llama3? ref
If we appended True to the mask whenever there was an end token, this part of the test would fail:

expected_mask = [True] * 67 + [False] * 121

because the expected mask ends in a run of False values.
Thanks for catching this. I wanted to ask about the test itself but forgot. Do you think I should change this test?
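To make the failure concrete, here is a small sketch of the shapes involved (the sizes are scaled down from the test's [True] * 67 + [False] * 121 for brevity): a hard-coded True after a trailing False breaks the contiguous True-then-False shape the test asserts.

```python
# Stand-in for the test's expected_mask = [True] * 67 + [False] * 121,
# scaled down to 3 True and 4 False values.
mask = [True] * 3 + [False] * 4

bad = mask + [True]        # hard-coded True for the EOS position
good = mask + [mask[-1]]   # EOS inherits the trailing mask value

# Only the inherited value preserves the True-prefix / False-suffix shape.
print(good == [True] * 3 + [False] * 5)  # True
print(bad == [True] * 3 + [False] * 5)   # False
```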
Hi folks! Let me know if I need to work on something to get this to the finishing line 🤗
Fix #2481

Current status 👇

Not very sure why None comes up as a token. Am I missing something obvious here? Any help would be great. After the initial review, I will add tests.