whisper_full_get_token_text crashes when run in a loop in Swift #1652

Open
@bitsmakerde

Hi,

My problem is that, in the SwiftUI example, the transcription quality for files longer than 30 seconds is terrible.

To solve this, I want to split my files into 30-second parts and loop over them, also passing an initial prompt, to get the best result.
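A minimal sketch of the splitting step described above, assuming the audio has already been decoded to 16 kHz mono `[Float]` samples (the sample rate whisper.cpp expects). `chunkSamples` is a hypothetical helper name, not part of whisper.cpp or the SwiftUI example:

```swift
/// Splits `samples` into consecutive chunks of at most `seconds` seconds.
/// Hypothetical helper; assumes 16 kHz mono input unless told otherwise.
func chunkSamples(_ samples: [Float], seconds: Int = 30, sampleRate: Int = 16_000) -> [[Float]] {
    let chunkSize = seconds * sampleRate
    guard chunkSize > 0 else { return [] }
    // Walk the buffer in steps of `chunkSize`; the last chunk may be shorter.
    return stride(from: 0, to: samples.count, by: chunkSize).map { start in
        Array(samples[start..<min(start + chunkSize, samples.count)])
    }
}

// Usage: 70 seconds of silence is split into 30 s, 30 s and 10 s.
let chunks = chunkSamples([Float](repeating: 0, count: 70 * 16_000))
print(chunks.map { $0.count / 16_000 })  // prints [30, 30, 10]
```

Cutting at fixed 30-second boundaries can split a word in two, so this is only a starting point; cutting at pauses would probably give cleaner segments.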

But if I process more than one file, my app crashes, and I have no idea how to fix it.

I get this error:

Selecting 8 threads
About to run whisper_full
WHISPER_ASSERT: /Users/andre/Documents/GitHub/whisper.cpp/whisper.cpp:4531: n_logits == ctx.vocab.n_vocab

Here is how I loop over my files:

func transcribeSample() {
    if let sampleUrl {
        let segments = [sampleUrl, sampleUrl]

        for segment in segments {
            print("segment: \(segment)")
            transcribeAudio(segment)
        }
    } else {
        messageLog += "Could not locate sample\n"
    }
}
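One thing that may be worth ruling out is whether the runs overlap. If `transcribeAudio` starts its work asynchronously (for example inside a `Task`) and returns immediately, the loop above could queue the second run before the first has finished. That is an assumption about code not shown here, and it is not established that overlapping runs cause this assertion. Under that assumption, a sketch of strictly sequential processing could look like this, where `transcribeSequentially` and the `transcribe` closure are hypothetical stand-ins for the app's own code:

```swift
import Foundation

/// Runs `transcribe` on each segment in order and awaits each result
/// before starting the next one. Hypothetical helper, not whisper.cpp API.
func transcribeSequentially(
    _ segments: [URL],
    using transcribe: (URL) async throws -> String
) async rethrows -> [String] {
    var results: [String] = []
    for segment in segments {
        // The next iteration does not start until this call has returned.
        results.append(try await transcribe(segment))
    }
    return results
}
```

The caller would pass its own async transcription function as the closure, so that the shared whisper context is only ever used by one run at a time.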

And here is my config:

func fullTranscribe(samples: [Float]) {
    // Leave 2 processors free (i.e. the high-efficiency cores).
    let maxThreads = max(1, min(8, cpuCount() - 2))
    print("Selecting \(maxThreads) threads")
    let initialPrompt = "ähm, ah, äh, ahm, äähm, ääh, ähh, uhm, eh, ehm, hmm, mm, mhm, mmm, uh, um. I mean, mean."

    var params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
    // A pointer from withCString is only valid inside its closure, so both
    // closures stay open until whisper_full has returned.
    initialPrompt.withCString { prompt in
        "de".withCString { language in
            // Adapted from whisper.objc
            params.print_realtime = true
            params.print_progress = false
            params.print_timestamps = true
            params.print_special = true
            params.translate = false
            params.language = language
            params.n_threads = Int32(maxThreads)
            params.offset_ms = 0
            params.no_context = true
            params.single_segment = false
            params.suppress_blank = false
            params.initial_prompt = prompt

            params.max_len = 1
            params.split_on_word = true
            params.token_timestamps = true

            whisper_reset_timings(context)
            print("About to run whisper_full")
            samples.withUnsafeBufferPointer { samples in
                if whisper_full(context, params, samples.baseAddress, Int32(samples.count)) != 0 {
                    print("Failed to run the model")
                } else {
                    whisper_print_timings(context)
                }
            }
        }
    }
}
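A note on C-string lifetime in code like this: `params.language` and `params.initial_prompt` hold raw C-string pointers, and the pointer handed to a `withCString` closure is only valid while that closure is running. Where nesting closures is inconvenient, one alternative is to copy the string to the heap with `strdup` and free it once the C call is done. The sketch below uses a hypothetical `DemoParams` struct as a stand-in for `whisper_full_params`, and `describe` stands in for the call that reads the pointers:

```swift
import Foundation

// Hypothetical stand-in for the two C-string fields of whisper_full_params.
struct DemoParams {
    var language: UnsafePointer<CChar>?
    var initial_prompt: UnsafePointer<CChar>?
}

func describe(prompt promptText: String, language languageText: String) -> String {
    // strdup copies the bytes to the heap; each copy stays valid until free().
    guard let prompt = strdup(promptText) else { return "" }
    defer { free(prompt) }
    guard let language = strdup(languageText) else { return "" }
    defer { free(language) }

    var params = DemoParams()
    params.language = UnsafePointer(language)
    params.initial_prompt = UnsafePointer(prompt)

    // Stand-in for the C call: both pointers are still valid here.
    return String(cString: params.language!) + ": " + String(cString: params.initial_prompt!)
}

print(describe(prompt: "uhm, eh, hmm", language: "de"))  // prints de: uhm, eh, hmm
```

The `defer` blocks run when the function returns, so the copies outlive every use of `params` inside the function but are not leaked.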
