Generate an AI transcript in Bubble with Speaker Labels - Part 2
In this Bubble tutorial we demonstrate how to use the AssemblyAI transcription and speaker labelling API with backend workflows to loop through every 'utterance' returned by AssemblyAI.
Unlock the power of AI transcripts: Learn how to separate speakers and save utterances in Bubble!
Master backend workflows: Discover how to schedule API workflows on lists and iterate through JSON responses effortlessly.
Transform raw JSON into organized repeating groups - see how in this step-by-step tutorial!
Generating a Transcript with Speaker Labels in Bubble.io
Welcome back to part 2 of our mini-series on using AssemblyAI to generate a transcript from an audio file. The key thing we're doing here is separating out the different speakers. In this part, I'm going to show how to take the JSON response that we generated in part 1, which splits the speech into different utterances, and run it through a back-end workflow that iterates through all of the utterances and saves them to our database, so that in the end we get a repeating group of our transcript broken up by speaker.
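For orientation, here is a rough Python sketch of the kind of data this part works with. The sample values are invented for illustration. The field names (`utterances`, `speaker`, `text`, `start`, `end`) follow the ones named in this tutorial, and a real response carries more fields than shown here.

```python
import json

# Invented sample for illustration; a real response contains many more fields.
sample_response = """
{
  "id": "example-transcript-id",
  "status": "completed",
  "utterances": [
    {"speaker": "A", "text": "Hi, thanks for joining.", "start": 250, "end": 1900},
    {"speaker": "B", "text": "Happy to be here.", "start": 2100, "end": 3400}
  ]
}
"""

data = json.loads(sample_response)
utterances = data["utterances"]

for u in utterances:
    # Each utterance is one uninterrupted stretch of speech by a single speaker.
    # start/end are assumed here to be offsets into the audio.
    print(f'{u["speaker"]} [{u["start"]}-{u["end"]}]: {u["text"]}')
```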
Setting Up the Back-End Workflow
I'm going to click save. This is one of the transcripts that I've generated, and I'm going to go into back-end workflows. I'm going to create a workflow and call it save single utterance. It doesn't need to be public. The key parts of the utterance that I want to save are start and end (I believe these are timestamps), text and speaker. So I'm going to add start, end, text and speaker as parameters, and start and end are going to be numbers.
Creating Data Types for Utterances and Transcripts
And then I'm going to create a data type for handling all this. I'm going to call it utterance. I also need a data type for grouping utterances together, so I'm going to call that transcript. Okay, and then utterance is going to have those fields: start, end, speaker and text.
Building the Workflow
So when this back-end workflow runs, I'm going to create a new thing: create an utterance, add in all the fields and connect them up to the data that goes into the workflow. And I need to pass one more bit of data in, which is going to be my transcript, because I need a way of grouping it all together, assuming my app is going to be generating more than one transcript. So transcript. Okay. And then I'm going to group them by adding a transcript field to utterance.
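If it helps to see the two data types and the workflow as code, here is a loose Python analogy. Bubble has no such classes; `Transcript`, `Utterance`, `database` and `save_single_utterance` are stand-ins chosen to mirror the names used in this tutorial, and the `assembly_id` field is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transcript:
    # Stand-in for the Bubble "transcript" data type; this field is hypothetical.
    assembly_id: str


@dataclass
class Utterance:
    start: int          # number
    end: int            # number
    text: str
    speaker: str
    transcript: Transcript  # the field that groups utterances together


database: List[Utterance] = []  # stand-in for Bubble's database


def save_single_utterance(start, end, text, speaker, transcript):
    """Mirror of the back-end workflow: create one utterance from its parameters."""
    utterance = Utterance(start, end, text, speaker, transcript)
    database.append(utterance)
    return utterance


transcript = Transcript(assembly_id="example-transcript-id")
saved = save_single_utterance(250, 1900, "Hi, thanks for joining.", "A", transcript)
```

The transcript parameter is what lets an app hold many transcripts at once without their utterances getting mixed together.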
Creating a Simple UI
Let's build some really simple UI just to get us going with this. So Speaker, Labels. And to keep this video short, I'm not going to go through the upload process and the initial call to AssemblyAI. That's covered in previous videos. I'm simply going to put in an input and this is going to be my transcript ID. I'm going to add a button.
Designing the Layout
Now, I'm designing using fixed layouts. I always say that's a big no-no and to use columns and rows instead. I'm just designing it quickly though, and that's why I'm taking shortcuts there. And then we'll have a repeating group. In fact, I'll put this into a group whose type of content is transcript. The repeating group's type is going to be utterance, and its data source is a search for utterances where transcript equals parent group's transcript. That's my way of displaying only the relevant utterances.
Customizing the Repeating Group
Let's get rid of the number of rows. And then in utterance I'm going to say speaker. Let's change this into a row. We'll make this bold. Copy and paste it. We'll make this just and then this is not bold and this is going to be our text, taking up the remainder of the space. And oh, one of the reasons for saving the additional data, like start and end, is that I think those are going to be useful for ordering. So we'll order by the start timestamp, and then I'll get rid of the min height and add in just so that it's really clear for us.
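The repeating group's data source (a search constrained to one transcript, ordered by start) can be pictured as a filter followed by a sort. This is only an analogy in Python with invented sample records; `rows_for` is a hypothetical helper, not a Bubble feature.

```python
# Invented sample records; "transcript" holds the id of the parent transcript.
utterances = [
    {"transcript": "t1", "speaker": "B", "text": "Happy to be here.", "start": 2100},
    {"transcript": "t2", "speaker": "A", "text": "Unrelated call.", "start": 0},
    {"transcript": "t1", "speaker": "A", "text": "Hi, thanks for joining.", "start": 250},
]


def rows_for(transcript_id, all_utterances):
    """Keep only one transcript's utterances, ordered by their start value."""
    matching = [u for u in all_utterances if u["transcript"] == transcript_id]
    return sorted(matching, key=lambda u: u["start"])


rows = rows_for("t1", utterances)
for row in rows:
    # Speaker in one cell, text in the rest of the row.
    print(f'{row["speaker"]}: {row["text"]}')
```

Because utterances are saved by separate workflow runs, sorting on start keeps the rows in spoken order regardless of the order in which they were saved.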
Setting Up the Generate Text by Speaker Workflow
So, let's call this generate text by speaker. I'm going to add in the workflow, and what this is going to do is run the retrieve action for AssemblyAI. What did I call it? Get processed transcript. And the ID is going to come from my input. For the purpose of this demo, I'm just going to take the same ID and make it the initial content for the input.
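In the tutorial this call is configured in Bubble's API connector, so no code is written. For readers who want to see the equivalent request, the sketch below builds it in Python. The URL pattern and the `Authorization` header reflect my understanding of AssemblyAI's REST API and should be checked against the current documentation; `build_transcript_request` and `fetch_transcript` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed base URL for retrieving a transcript by ID.
API_BASE = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(transcript_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for an already processed transcript."""
    return urllib.request.Request(
        f"{API_BASE}/{transcript_id}",
        headers={"Authorization": api_key},
        method="GET",
    )


def fetch_transcript(transcript_id: str, api_key: str) -> dict:
    """Send the request and decode the JSON body. Requires a real ID and key."""
    request = build_transcript_request(transcript_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_transcript("your-transcript-id", "your-api-key")` with real values would return the transcript as a dictionary.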
Implementing the Back-End Workflow
And then this is where the back-end workflow comes into play. I'm going to use schedule API workflow on a list, which is one way in Bubble, and in fact I think the most robust way, to iterate through an uncertain amount of data. We don't know how many utterances the call is going to return, so we can't just say do this to utterance 1, do this to utterance 2 and so on. We need to be able to iterate through x number of utterances.
Configuring the Workflow Steps
So the type of thing we're going to go through is text. Or is it? It's not. It is going to be utterance. Then the list to go through is the result of the call, and the workflow is save single utterance. I just had to take a moment there to work out what wasn't working, and I found that in my initialised call to fetch the transcript, my utterance list was empty for some reason. By making sure that the type is set to get processed transcript utterance, I can successfully fill it in. So the type of thing is the get processed transcript utterance, and I'm getting the list from step 1.
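The empty list in the video was an issue with how the call had been initialised in Bubble. In code, a comparable safeguard is to treat a missing or empty utterance list as "nothing to loop over" rather than letting later steps fail. A minimal sketch, assuming the response is a dictionary with an optional `utterances` key; to my understanding that key is only populated when speaker labels were requested for the transcript, which is worth verifying in AssemblyAI's documentation. `extract_utterances` is a hypothetical helper.

```python
def extract_utterances(response: dict) -> list:
    """Return the utterance list, or an empty list if it is missing, null or empty."""
    utterances = response.get("utterances")
    if not utterances:
        return []
    return utterances


# A response without speaker data yields nothing to iterate over.
print(extract_utterances({"id": "example-transcript-id", "utterances": None}))
```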
Finalizing the Workflow
So this means that it's all blue, it's not red, so it's an accepted data format. I'm going to say run it right away, and then I can begin to fill in the data for each utterance that we're saving. When we get "this" here, it's referring to the single utterance, and the workflow is basically going to loop through them. Then we can set text, speaker and then transcript. Ah, I need to create a transcript, so create a new thing. And I'm going to display this transcript in the group transcript, just so that I can get the right data into my repeating group. That would be the result of step 2, but in fact I'm going to move this up to the top. So it's step 1 there, and then transcript is the result of step 1. Ah, okay, I think that I've now got every step in place.
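Put together, the front-end workflow does two things in order: create the parent transcript, then hand every returned utterance to the back-end workflow along with that transcript. The Python below is a sequential analogy under stated assumptions: the dictionary-based `database`, the generated id format and the function name are all invented for illustration, and Bubble schedules the per-item runs itself rather than looping inline like this.

```python
def generate_text_by_speaker(response: dict, database: dict) -> str:
    # Step 1: create the transcript first so every utterance can reference it.
    transcript_id = f"transcript-{len(database['transcripts']) + 1}"
    database["transcripts"].append({"id": transcript_id})

    # Step 2: loop over however many utterances came back; the count is not known ahead of time.
    for item in response.get("utterances") or []:
        database["utterances"].append({
            "start": item["start"],
            "end": item["end"],
            "text": item["text"],
            "speaker": item["speaker"],
            "transcript": transcript_id,
        })
    return transcript_id


database = {"transcripts": [], "utterances": []}
response = {
    "utterances": [
        {"speaker": "A", "text": "Hi, thanks for joining.", "start": 250, "end": 1900},
        {"speaker": "B", "text": "Happy to be here.", "start": 2100, "end": 3400},
    ]
}
new_id = generate_text_by_speaker(response, database)
```

Creating the transcript before the loop is the same reason the create step is moved to the top in the video: later steps can only reference the result of an earlier one.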
Testing the Workflow
Let's test it. If this works, this process is going to retrieve an already processed transcript from AssemblyAI, and then it's going to use the backend workflow to loop through all of the utterances and save them to our database. Let's give it a go. I think I forgot to update the labels: speaker, text. Ah, okay, and because I was displaying data in a group, I'm just going to run it again. Okay, and there you have it.
Conclusion
So I've taken a lot of shortcuts there, but I'm trying to demonstrate the technical side of how you would use a backend workflow to loop through all of the utterances that we get back. And now we have something that looks closer to a script, because we've got each utterance split by the speaker who says it.