Closed
Description
Describe the bug
Hello
I am trying to integrate the whisper API into my Flask app. However I get the following error when I input the received file from the flask endpoint, I get the following error:
openai.error.InvalidRequestError: Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']
However, loading the file in the interactive console works fine.
In [16]: r = openai.Audio.transcribe('whisper-1',open('../Downloads/sample.mp3','rb'))
In [17]: r
Out[17]:
<OpenAIObject at 0x192993c6750> JSON: {
"text": "This episode is actually a co-production with another podcast called Digital Folklore, which is hosted by Mason Amadeus and Perry Carpenter. We've been doing a lot of our research together and our brainstorming sessions have been so thought-provoking, I wanted to bring them on so we could discuss the genre of analog horror together. So, why don't you guys introduce yourselves so we know who's who? Yeah, this is Perry Carpenter and I'm one of the hosts of Digital Folklore. And I'm Mason Amadeus and I'm the other host of Digital Folklore. And tell me, what is Digital Folklore? Yeah, so Digital Folklore is the evolution of folklore, you know, the way that we typically think about it. And folklore really is the product of basically anything that humans create that doesn't have a centralized canon. But when we talk about digital folklore, we're talking about..."
}
To Reproduce
- Create a Flask App.
- Add an end point that receives an valid audio file.
- pass the bytes data of the file to
openai.Audio.transcribe
method through 'request.files[fileName].stream.read()`.
Code snippets
The end point code:
with tempfile.TemporaryFile() as temp_file:
temp_file.write(audio_file)
transcript_read = openai.Audio.transcribe("whisper-1", temp_file)
return transcript_read
the FFprobe info of the file:
ffprobe version 4.4.1-full_build-www.gyan.dev Copyright (c) 2007-2021 the FFmpeg developers
built with gcc 11.2.0 (Rev1, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libglslang --enable-vulkan --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100
Input #0, mp3, from '.\Downloads\sample.mp3':
Metadata:
title : Monsters in the Static
comment : We look at the subgenre of analog horror, where something sinister might be lurking in the horizontal lines and vertical holds of those old VHS tapes.
lyrics-ENG : <p>In the subgenre of analog horror, there’s something sinister or supernatural lurking in the horizontal lines and vertical holds in those old VHS tapes. Filmmaker <a href="https://wnuf.bigcartel.com/">Chris LaMartina</a> explains why he wanted his mov
album : Imaginary Worlds
genre : Podcast
date : 2020
encoder : Lavf58.76.100
Duration: 00:00:50.05, start: 0.025057, bitrate: 128 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 128 kb/s
Metadata:
encoder : Lavc58.13
OS
Windows 11
Python version
Python v10.5
Library version
0.27.2