How to record audio & convert to text - OpenAI Whisper API

Introduction to Audio Recording and Transcription with OpenAI Whisper API

Thank you to everyone who left a comment on our last OpenAI Whisper API video. The recurring theme in the comment section was: can you show how to record audio in Bubble, send it over to the OpenAI Whisper API, get an AI-generated transcript back, and save that into your Bubble app? Well, that is exactly what the app I'm going to show you does. So let's dive right into the page.

Using Bubble's Audio Recorder

We're using Bubble's own audio recorder and visualizer. There are other plugins available in the plugin store to record audio, but this one is free. I would point out that it saves the audio in WAV format, which means you may end up with larger audio files than you would get from a recorder that saves as MP3. Anyway, it works fine for this demonstration.
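
To give a rough feel for the WAV versus MP3 size difference, here is a small Python sketch. The recorder's actual settings aren't stated in the video, so the sample rate, bit depth, channel count, and MP3 bitrate below are all assumptions chosen only for illustration, and the function names are hypothetical.

```python
# Rough size estimate for uncompressed PCM WAV versus constant-bitrate MP3.
# All default values are assumptions; the plugin's real settings may differ.

WAV_HEADER_BYTES = 44  # size of a canonical PCM WAV header


def wav_size_bytes(seconds, sample_rate=44_100, bit_depth=16, channels=1):
    """Uncompressed size: samples per second x bytes per sample x channels."""
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate * bytes_per_sample * channels) + WAV_HEADER_BYTES


def mp3_size_bytes(seconds, bitrate_kbps=128):
    """Constant-bitrate MP3: kilobits per second converted to bytes."""
    return int(seconds * bitrate_kbps * 1000 / 8)


if __name__ == "__main__":
    one_minute = 60
    print("WAV:", wav_size_bytes(one_minute), "bytes")
    print("MP3:", mp3_size_bytes(one_minute), "bytes")
```

Under these assumed settings, one minute of audio comes to roughly 5.3 MB as WAV and just under 1 MB as MP3, which is fine for a short test clip but worth keeping in mind for longer recordings.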

Setting Up the Audio Recording Workflow

I've got the audio recorder element on the page and I've got two buttons. If we go into my start/stop workflow, we'll see that there is an action of start/stop audio recorder A. I then need a second action. I've got a save button, and its workflow uses an action that the plugin gives you: upload content of audio recorder. This saves what has been recorded to your Bubble storage, which runs on AWS S3. Basically, it means the file becomes part of your app's storage.

Saving the Audio Recording

I then want to be able to retrieve that at some point, so I have a data type called audio recording. It has a field called file, of type file, and I insert into that the result of step one. So that is my way of saving the file: the upload action saves the file into my Bubble app's storage, and the database entry gives me a way to retrieve that file later.

Displaying Recorded Audio Files

Below that, I have a repeating group that shows all of the entries for the audio recording data type. I print each audio recording's file URL. If I hop back to the page, you can see it. This is where the file is actually hosted, but do notice that it doesn't start with "https:". We're going to need to add that in.

Generating Transcripts with OpenAI Whisper API

Lastly, I've got a button here that runs a workflow I've labeled "get transcript". It saves the text from the OpenAI Whisper API response into the audio recording listed in that row of the repeating group. If you want help getting to this point, do check out our previous video. I've put a link in the comment section, and that will show you everything you need to do in order to get this into the Bubble API connector.
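
For anyone curious what a transcription request looks like outside Bubble, here is a rough Python sketch. It is not taken from the video: the endpoint URL and the "whisper-1" model name come from OpenAI's public API documentation, while the function name, file name, and API key are hypothetical placeholders. Called directly, the API takes the audio itself as a multipart form upload; the Bubble setup described here works from a link to the file instead. The sketch only builds the request and does not send it.

```python
import urllib.request
import uuid

# Endpoint and model name as given in OpenAI's public documentation.
WHISPER_ENDPOINT = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(audio_bytes, filename, api_key, model="whisper-1"):
    """Build (but do not send) a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Form field: which model to use.
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="model"\r\n\r\n',
        model.encode() + b"\r\n",
        # Form field: the audio file itself.
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes + b"\r\n",
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        WHISPER_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    # Placeholder bytes and key; a real call needs real audio and a real key.
    request = build_transcription_request(b"RIFF....", "recording.wav", "sk-placeholder")
    print(request.get_method(), request.full_url)
```

Passing the returned request to urllib.request.urlopen would send it; that step is left out so the sketch can be read and run without an API key.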

Formatting the Audio File URL for OpenAI Whisper

The end result of that is a workflow action where you put in a dynamic link for the file. Notice that I've got "https:" followed by the audio recording's file URL. Believe me, I've just spent 15-20 minutes making sure that I get all the right formats in place. Effectively, for OpenAI Whisper you need to provide a publicly accessible audio or video file in one of these formats here, and that's what we're doing with this app.
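
The URL fix described above can be sketched in a few lines of Python. The function names and the example URL are hypothetical. The list of formats appears on screen in the video and isn't reproduced in this transcript, so the set of extensions below is an assumption based on OpenAI's documentation and may be out of date.

```python
import posixpath
from urllib.parse import urlparse

# Assumption: extensions listed in OpenAI's documentation; check the current docs.
ASSUMED_SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def to_public_url(file_url):
    """Add the missing scheme to a URL that starts with '//'."""
    if file_url.startswith("//"):
        return "https:" + file_url
    return file_url


def has_supported_extension(file_url):
    """Check the file extension in the URL path against the assumed list."""
    path = urlparse(to_public_url(file_url)).path
    extension = posixpath.splitext(path)[1].lower().lstrip(".")
    return extension in ASSUMED_SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    # Hypothetical file URL in the scheme-less form Bubble displays.
    stored = "//example-bucket.s3.amazonaws.com/f1234/recording.wav"
    print(to_public_url(stored))
    print(has_supported_extension(stored))
```

In Bubble itself, the equivalent is simply typing "https:" in front of the dynamic file URL expression, as shown in the workflow action.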

Testing the Audio Recording and Transcription

Let's give it a test. If I click start, I can say, "I'm testing the OpenAI Whisper API," and I can click stop. I can click save, and Bubble is now saving it. I did notice that this takes a little bit of time. I think it's this last one here. In fact, let me be absolutely sure: I'm going to sort by date created, descending, so that I know it will be the top one. Now I click generate transcript, and this is the call to the Whisper API. There we go: "I'm testing the OpenAI Whisper API." I think that's basically a perfect match for what I said.
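
The sort used here, newest recording first, looks like this as a Python sketch. The AudioRecording class is a hypothetical stand-in for the Bubble data type, and created_date stands in for the creation date that Bubble records on each entry.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AudioRecording:
    """Hypothetical stand-in for the Bubble 'audio recording' data type."""
    file_url: str
    created_date: datetime
    transcript: str = ""


def newest_first(recordings):
    """Sort by creation date, descending, so the latest recording is on top."""
    return sorted(recordings, key=lambda r: r.created_date, reverse=True)


if __name__ == "__main__":
    older = AudioRecording("//example.com/older.wav", datetime(2023, 3, 1, 10, 0))
    newer = AudioRecording("//example.com/newer.wav", datetime(2023, 3, 1, 10, 5))
    for recording in newest_first([older, newer]):
        print(recording.created_date, recording.file_url)
```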

Recap and Troubleshooting

Quick recap: the key part of this setup is getting the file link formatted correctly in here. Because it's a file type field in Bubble, I can say file's URL. In fact, let me show you the data structure. I just have my audio recording data type, with file of type file and transcript of type text. Then I save the response. From the OpenAI Whisper response I get two choices, and I choose text.
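
As a minimal sketch of the "choose text" step: a successful JSON response from the transcription endpoint carries the transcript under a "text" key. The function name, the sample response, and the dictionary standing in for the Bubble database entry are all hypothetical.

```python
import json


def extract_transcript(response_body):
    """Pull the transcript out of a JSON response body with a 'text' key."""
    payload = json.loads(response_body)
    return payload["text"].strip()


if __name__ == "__main__":
    # Hypothetical response body shaped like a successful transcription.
    sample_response = json.dumps({"text": "I'm testing the OpenAI Whisper API."})

    # Hypothetical record mirroring the data type: a file field and a transcript field.
    recording = {"file": "//example.com/recording.wav", "transcript": ""}
    recording["transcript"] = extract_transcript(sample_response)
    print(recording["transcript"])
```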

Initially, I tried to set this up all in one workflow. Basically, I would click start, then stop, then save it to my database, then send it straight off to the Whisper API. But I kept getting an error back saying that the file I was providing to Whisper wasn't acceptable because it wasn't in the right format. So that's why I broke it out into this repeating group table.

If I were to go on a hunch, I would say that sometimes my workflow was submitting the file to Whisper before the file was actually properly accessible. We're talking about fractions of a second here, but I think that Bubble might have been passing on the file URL just a little bit before it was actually accessible, and that was what was causing the error in Whisper. So by breaking it up into a save command and then a separate generate transcript command, I was able to work around that and get my transcripts back.
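
If that hunch is right, another way to handle the timing outside Bubble would be to wait until the file URL actually responds before sending it on. The sketch below is an illustration under that assumption, not what the video does. All names are hypothetical, and it assumes the storage host answers HEAD requests, which may not hold for every setup.

```python
import time
import urllib.error
import urllib.request


def url_is_reachable(url, timeout=5.0):
    """Return True if a HEAD request to the URL gets a 2xx response."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, TimeoutError):
        return False


def wait_until_available(url, check=url_is_reachable, attempts=5,
                         delay_seconds=0.5, sleep=time.sleep):
    """Poll the URL a few times, pausing between tries, before giving up."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give the upload a moment to finish
    return False


if __name__ == "__main__":
    # Hypothetical file URL; replace with a real, publicly accessible one.
    file_url = "https://example.com/recording.wav"
    if wait_until_available(file_url):
        print("File is reachable; safe to request the transcript.")
    else:
        print("File never became reachable; skip the transcript request.")
```

The check and sleep parameters are injectable so the polling logic can be exercised without network access. Splitting save and transcript into two button clicks, as in the video, achieves a similar delay by hand.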

Conclusion

If you have any questions about this process or anything to do with OpenAI or Bubble, please do leave a comment below. If you're really stuck, we provide Bubble coaching. Do go to our website to find out more details about that.
