OpenAI's Whisper API Release
Alongside the exciting news about the ChatGPT API earlier in the week, OpenAI has also released the Whisper API. Whisper is OpenAI's speech-to-text model, and you can now access it via an API. That means you can take a recording of someone's speech through Bubble and get a transcript back from the OpenAI Whisper API, and I'm going to show you how to do that in this video.
Setting Up the API Connector in Bubble
So I'm going to go into my Bubble app, where I have the API Connector installed. These are some of the previous tutorials I've recorded, but for now I'm going to create a new API, which I'll call OpenAI. I know that OpenAI authenticates with a private key in the header; we'll come back to that. Then we need to set up an API call, so let's call this "get transcription", and I'm going to make it an action so that it will appear in a workflow.
Configuring the API Call
I then start translating this into Bubble: I copy the endpoint, paste it in, and make it a POST command. And then... this part is slightly different. If you've seen any of my other third-party API tutorials, you'll notice it's not a data field or a parameter; it's a form. So in Bubble, we have to set the body type to form-data. There's also a header that says Content-Type: multipart/form-data, so I'm going to copy that and split it up, making sure I remove the colon and that there are no stray spaces in there. Good.
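To make the header step concrete, here is a small Python sketch of what "split it up, remove the colon, no stray spaces" means: Bubble's API Connector wants the header key and value in separate boxes, whereas the docs show them as one colon-separated line. (The variable names here are just for illustration.)

```python
# The raw header line as it appears in the API docs:
raw = "Content-Type: multipart/form-data"

# Bubble wants the key and value entered separately, with the colon
# removed and any leading/trailing spaces stripped:
key, value = (part.strip() for part in raw.split(":", 1))
headers = {key: value}
print(headers)  # {'Content-Type': 'multipart/form-data'}
```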
Adding Parameters to the API Call
Then I can start adding in parameters. One parameter is model, with the value whisper-1. I also need file, and I give it a file location, so I add another parameter called file. It's not private, because I want to be able to create a workflow where I can edit it; for example, I can insert a file that's been freshly uploaded. And then I need to tick "send file".
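If you prefer to see the same call outside Bubble, here is a minimal Python sketch of the request the API Connector is building: the Whisper transcriptions endpoint, a Bearer auth header, and the two form-data parameters (model and file). The function name and the commented usage are illustrative, not part of the Bubble setup.

```python
API_URL = "https://api.openai.com/v1/audio/transcriptions"

def build_transcription_request(api_key, audio_file):
    """Mirror the Bubble configuration: endpoint, auth header,
    and the two form-data parameters ('model' and 'file')."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = {"model": "whisper-1"}   # parameter: model
    files = {"file": audio_file}    # parameter: file (the 'send file' tick)
    return API_URL, headers, data, files

# With the third-party 'requests' library, sending it would look like:
# with open("voice-memo.m4a", "rb") as f:
#     url, headers, data, files = build_transcription_request(key, f)
#     resp = requests.post(url, headers=headers, data=data, files=files)
```

One design note: a library like requests sets the multipart Content-Type header (including the boundary) for you when you pass `files=`, so you would not set it by hand there; Bubble, by contrast, needs that header entered explicitly as shown above.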
Testing the API Call
And I'm just going to drag in a very short voice memo I took on my phone and AirDropped to my Mac, of me saying a sentence. You can see the upload is in place there; that's so I can initialize the call and test that the API is working. There's one thing missing, and that's an API key. So I go over to my OpenAI account, create a new secret key, and copy it. (I will be deleting this key before this video gets uploaded.) I paste it in, and let's just check I'm not missing something... I am: I need to add "Bearer" before the key value. Okay, I think I have everything in place now to test this out.
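Since the missing "Bearer" prefix is an easy mistake to make, here is the exact shape the Authorization header needs, sketched as a small helper (the function name is mine, and "sk-abc123" is a placeholder key, not a real one):

```python
def auth_header(secret_key):
    # OpenAI expects the secret key prefixed with "Bearer " --
    # note the single space between "Bearer" and the key.
    return {"Authorization": f"Bearer {secret_key}"}

print(auth_header("sk-abc123"))  # {'Authorization': 'Bearer sk-abc123'}
```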
Reviewing the API Response
So it's returning JSON, and I can check that's correct, because I'm expecting a response like this. And yeah, let's give this a test. So I initialize the call, Bubble uploads the file to OpenAI, and I get this back. And that is my voice note: "recording a test voice note to test OpenAI transcription".
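For reference, the Whisper transcriptions endpoint returns a JSON body with a single "text" field by default, so pulling out the transcript is one lookup. The sample body below is hypothetical, using the sentence from my test voice note:

```python
import json

# A hypothetical response body in the shape the Whisper API returns:
raw_response = '{"text": "Recording a test voice note to test OpenAI transcription."}'

transcript = json.loads(raw_response)["text"]
print(transcript)  # Recording a test voice note to test OpenAI transcription.
```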
Conclusion and Next Steps
So that is it, in under four minutes: the technical side of setting up the Bubble API Connector to send an audio file to the Whisper API by OpenAI and get a transcript back. And to be honest, it seems very, very quick, and it's a very impressive service. Do leave a comment if you would like a follow-up video where we build this into a front-end form, so you could create a service that lets a user upload an audio file, send it off, and get the transcript back. But I just wanted to tackle the technical side now to help people get going with the Whisper API.