AssemblyAI Webhook Duplication in Bubble Database
Here's something interesting I learned while using AssemblyAI to generate transcripts of our videos for an internal app I've been building for Planet No Code. Effectively, we submit an MP3 file of each video and use their service to convert it into a transcript. We get timestamps and chapters, and we use those timestamps to generate the chapter headings that we paste into our YouTube descriptions. The issue was this: AssemblyAI was generating the transcript successfully, which I confirmed in the logs, but every action after that was running multiple times in my Bubble app, and that included some rather expensive Claude AI prompts.
Multiple Workflow Executions
Sometimes the workflow would run five times for a single MP3 file that I submitted at the start of the process. And then I worked out why. If we dig into the AssemblyAI documentation, right the way down in the section on webhooks, we see that they will try to resend the webhook up to ten times, waiting 10 seconds between attempts. Now, here's something you might not know about Bubble. If I go into my backend workflows, this is the workflow that runs when AssemblyAI has finished the transcript. I get back a transcript ID, and then I would call one of AssemblyAI's APIs to get back my transcript content.
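As a rough sketch of that last step outside of Bubble, here is how the request for the transcript content could be put together in Python. The endpoint path and the authorization header reflect my understanding of AssemblyAI's REST API and should be checked against their current docs; the function name, the example ID, and the placeholder API key are made up for illustration. The code only builds the request and never sends it.

```python
import urllib.request

# Assumed base URL for AssemblyAI's REST API; verify against their docs.
API_BASE = "https://api.assemblyai.com/v2"


def build_transcript_request(transcript_id, api_key, resource=""):
    """Build (but do not send) a GET request for a finished transcript.

    `resource` optionally selects a sub-resource such as "paragraphs".
    This helper is hypothetical and not part of any AssemblyAI SDK.
    """
    url = f"{API_BASE}/transcript/{transcript_id}"
    if resource:
        url += f"/{resource}"
    # The API key goes in the authorization header (assumption, see above).
    return urllib.request.Request(url, headers={"Authorization": api_key})


# The webhook body is assumed to carry the transcript ID and a status.
webhook_payload = {"transcript_id": "example-id", "status": "completed"}
request = build_transcript_request(webhook_payload["transcript_id"], "YOUR_API_KEY")
```

Passing the request to `urllib.request.urlopen` would perform the actual call.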
AssemblyAI Mini Series
We do cover all of that in a mini series on AssemblyAI. I'll put a link down in the description to access that. But to get into the detail here, what was happening was that I had all of these generate-content workflow actions, some of them taking 30 seconds to run, because I would send the whole transcript over and ask for a lot of data back. I had all of that in this workflow here. Here's the issue.
Bubble's Webhook Response Behavior
Bubble will only respond to an inbound webhook with a success status once the workflow has completed. That means that if you put too much into the endpoint that gets notified when AssemblyAI has finished and the transcript is ready, the delivery will fail: AssemblyAI waits 10 seconds, doesn't get a response, and retries. And it will retry, and retry, and retry. This hasn't cost me a large amount of money, but it has cost me more than I was expecting in Anthropic top-ups for my account, because the workflow was running five or six times for each MP3 file that I put through this process.
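The arithmetic behind that can be sketched in a few lines of Python. This is a toy model, not how Bubble or AssemblyAI are implemented: it assumes the response is only sent once every action in the endpoint has finished, and that the sender gives up on an attempt after the 10 seconds mentioned above. The action names and most of the durations are hypothetical; only the roughly 30 seconds per AI prompt comes from the description above.

```python
# Assumed duration of each action in seconds (illustrative values).
ACTION_SECONDS = {
    "schedule_next_workflow": 1,  # assumption
    "fetch_transcript": 2,        # assumption
    "generate_chapters": 30,      # "some of these taking 30 seconds"
    "generate_summary": 30,       # hypothetical second prompt
}

# How long the sender is assumed to wait for a success response.
SENDER_WAIT_SECONDS = 10


def time_to_respond(actions):
    """The endpoint responds only after all of its actions have run."""
    return sum(ACTION_SECONDS[name] for name in actions)


def will_trigger_retry(actions, wait=SENDER_WAIT_SECONDS):
    """True if the response would arrive after the sender stops waiting."""
    return time_to_respond(actions) > wait


heavy_endpoint = ["fetch_transcript", "generate_chapters", "generate_summary"]
light_endpoint = ["schedule_next_workflow"]
```

In this model the heavy endpoint needs 62 seconds before it can respond, well past the 10 second wait, while the light one responds in about a second.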
The Solution: Minimizing Endpoint Content
So the fix is to really cut down what you've got in this endpoint. I'm doing the bare minimum here, and I'm putting the rest in another backend workflow that I send the data through to. That's where I actually fetch my paragraph data. And then all of this, which can take about two to three minutes, is in a completely separate workflow. So this bit here can execute really fast, and I no longer get duplicates in my database after using AssemblyAI to generate a transcript.
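For anyone doing this in code rather than in Bubble, the same idea looks roughly like the sketch below: the webhook handler only records the job and acknowledges, and the slow work runs separately. The in-memory queue stands in for "schedule another backend workflow", and every name here (`handle_webhook`, `process_transcript`, `run_pending_jobs`) is hypothetical. The payload shape, a transcript ID plus a status, is an assumption about what the webhook sends.

```python
import queue

# Stand-in for scheduling a separate backend workflow (assumption).
jobs = queue.Queue()
# Records which transcripts the slow workflow has handled.
processed = []


def handle_webhook(payload):
    """Endpoint: do the bare minimum, then acknowledge straight away."""
    if payload.get("status") == "completed":
        jobs.put(payload["transcript_id"])
    # Returning right away means the sender sees a success status quickly.
    return 200


def process_transcript(transcript_id):
    """Slow part: fetching paragraphs and running AI prompts would go here."""
    processed.append(transcript_id)


def run_pending_jobs():
    """Separate workflow: drain the queue after the webhook has responded."""
    count = 0
    while not jobs.empty():
        process_transcript(jobs.get())
        count += 1
    return count


status = handle_webhook({"transcript_id": "example-id", "status": "completed"})
```

Because the handler returns before any slow work starts, the sender gets its success response quickly and has no reason to retry.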