Skip to content

bindings.ruby : fix test failures in test_whisper #2955

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 28, 2025

Conversation

danbev
Copy link
Collaborator

@danbev danbev commented Mar 27, 2025

This commit updates the parallel tests to use 2 processors instead of the number of processors on the system. It also comments out the setting of the log callback to an empty lambda as this causes a segfault when enabled.

The motivation for the change to the number of processors is that if one has a large number of processors, for example I have 16 on the machine I used to test this, this would cause the following warning to be printed:

whisper_full_with_state: input is too short - 680 ms < 1000 ms. consider padding the input audio with silence

This is logged from:

int whisper_full_with_state(
        struct whisper_context * ctx,
          struct whisper_state * state,
    struct whisper_full_params   params,
                   const float * samples,
                           int   n_samples) {
   ...
    if (seek_end < seek_start + 100) {
        WHISPER_LOG_WARN("%s: input is too short - %d ms < 1000 ms. consider padding the input audio with silence\n", __func__, (seek_end - seek_start)*10);
        return 0;
    }

This will return early and there will be no segment callbacks to be invoked which in turn will cause the tests to fail.

danbev added 2 commits March 27, 2025 11:01
This commit updates the parallel tests to use 2 processors instead of
the number of processors on the system. It also comments out the setting
of the log callback to an empty lambda as this causes a segfault when
enabled.

The motivation for the change to the number of processors is that if one
has a large number of processors, for example I have 16 on the machine I
used to test this, this would cause the following warning to be printed:
```console
whisper_full_with_state: input is too short - 680 ms < 1000 ms. consider padding the input audio with silence
```

This is logged from:
```c++
int whisper_full_with_state(
        struct whisper_context * ctx,
          struct whisper_state * state,
    struct whisper_full_params   params,
                   const float * samples,
                           int   n_samples) {
   ...
    if (seek_end < seek_start + 100) {
        WHISPER_LOG_WARN("%s: input is too short - %d ms < 1000 ms. consider padding the input audio with silence\n", __func__, (seek_end - seek_start)*10);
        return 0;
    }
```
This will return early and there will be segment callbacks to be invoked
which in turn will cause the tests to fail.
This commit fixes the following warnings in the Ruby tests:
```console
/whisper/bindings/ruby/tests/test_segment.rb:52:
warning: ambiguity between regexp and two divisions:
wrap regexp in parentheses or add a space after `/' operator
```
And also adds a '_' prefix to some unused variables to avoid warnings.
@ggerganov ggerganov requested a review from KitaitiMakoto March 27, 2025 10:34
@KitaitiMakoto
Copy link
Collaborator

@danbev
Thank you for the pull request.
I realized your motivation and setting the number of processors to 2 is okay.
But could you restore the comment-out of empty logging setting? The fact that you have encountered a segfault means that someone else might one day. This setting exists to prevent the case.
I want you to investigate and fix the problem around the segfault if you can. But if you cannot do it because, for example, you're not familiar with C++ or so, I will do it.

@danbev
Copy link
Collaborator Author

danbev commented Mar 27, 2025

I want you to investigate and fix the problem around the segfault if you can

No problem, I'd be happy to take a look at this 👍

@KitaitiMakoto I've reverted the change now but I'm no longer getting segfaults when running the tests like I did earlier. today.

One thing that changed is that I rebased my branch to include the latest ggml sync to master, to make sure things still worked. But with those latest changes in ggml I can't reproduce the segfaults and more.

The commit reverts the commenting out of the Whisper.log_set call in
the test_whisper.rb tests.

I'm no longer getting segfaults when running the tests with this
which was the case earlier. One theory could be that I rebased this to
include the latest ggml sync to master to make sure things still worked.
With the latest changes in ggml, I can't reproduce the segfaults.
@KitaitiMakoto
Copy link
Collaborator

KitaitiMakoto commented Mar 27, 2025

@danbev
Thank for the change. Your branch works fine on my machine, too.

@ggerganov
I've approved. Will you push the merge button, or shall I do it as a merge policy of this repo? (I'm asking for the future case.)

@ggerganov
Copy link
Member

@KitaitiMakoto Feel free to merge now or in the future. Just make sure to use "Squash and merge" mode.

@KitaitiMakoto
Copy link
Collaborator

Okay, I will.

@KitaitiMakoto KitaitiMakoto merged commit 46d6e0a into ggml-org:master Mar 28, 2025
48 checks passed
@danbev danbev deleted the bindings-ruby branch April 1, 2025 03:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants