Introduction to Web Scraping with Page2API
In this video, I'm going to demonstrate how to scrape web data into your Bubble application using the Page2API web scraper. I tried several web scraping APIs for a recent client project and found that Page2API integrated best with Bubble's API Connector plugin, so that's the one I'll use here.
Setting Up the API Call
Let's install Bubble's API Connector plugin, if you haven't already, and open it. We'll add a new API for Page2API and create a call. For this demonstration, we'll be scraping the h1 tag, which is a common target for web scraping because it's usually the most important heading on a web page. We'll name our API call "scrape h1".
Understanding API Documentation
To fill out our API call, we need to dig into the Page2API documentation. When looking at API documentation, I find the easiest section to translate into Bubble is the cURL section. We need to make a POST call, and in the header, we'll declare the Content-Type as application/json.
Configuring the API Call in Bubble
In Bubble, we'll set up the call as follows:
1. Set the call type to POST
2. Add the content type header
3. Copy the API endpoint URL
4. Add the necessary JSON to the body of the call
It's important to change the "Use as" setting from "Data" to "Action" so that the call can be run from a workflow.
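For anyone who wants to see the same configuration outside Bubble, here's a minimal sketch of the request in Python. It only builds the request and doesn't send it. The endpoint URL and the body fields are assumptions based on my reading of the Page2API docs, and build_scrape_request is a made-up helper name, so check everything against the documentation before relying on it.

```python
import json
import urllib.request

# Assumed endpoint; confirm it against the Page2API documentation.
PAGE2API_ENDPOINT = "https://www.page2api.com/api/v1/scrape"


def build_scrape_request(body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        PAGE2API_ENDPOINT,                              # step 3: the endpoint URL
        data=json.dumps(body).encode("utf-8"),          # step 4: JSON in the body
        headers={"Content-Type": "application/json"},   # step 2: content type header
        method="POST",                                  # step 1: POST call
    )


# Placeholder values only; the field names are assumptions.
request = build_scrape_request(
    {"api_key": "YOUR_API_KEY", "url": "https://example.com"}
)
print(request.get_method(), request.full_url)
```

Each argument lines up with one of the four fields you fill in inside the API Connector.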
Customizing the API Call
We'll edit the JSON in the body to target only the h1 tag and make the URL dynamic, so it can be supplied each time the call runs. We'll also specify that we want the tag's text back rather than its HTML. After making these changes, let's test the API call.
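To make the shape of that body concrete, here's a rough sketch in Python. The field names (api_key, url, parse) and the "h1 >> text" selector syntax are assumptions based on my recollection of the Page2API docs, and build_h1_body is a hypothetical helper, so treat this as an illustration rather than the exact JSON from the video.

```python
import json


def build_h1_body(api_key: str, target_url: str) -> dict:
    """Return a request body asking for the text of the page's h1 tag."""
    return {
        "api_key": api_key,
        "url": target_url,  # dynamic: a different page can be passed on every call
        # Assumed selector syntax: "h1" picks the tag, ">> text" asks for text, not HTML.
        "parse": {"h1": "h1 >> text"},
    }


body = build_h1_body("YOUR_API_KEY", "https://example.com")
print(json.dumps(body, indent=2))
```

In Bubble's API Connector, the counterpart of the target_url argument is a dynamic value written in angle brackets inside the JSON body, for example <url>.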
Testing the API Call
The web scraping process takes a few seconds to complete. One reason for this is that the scraper uses a real browser, which makes the scraping more reliable and less likely to be blocked by websites.
Implementing Web Scraping in Your Bubble App
Now, let's add this functionality to the design of your Bubble app. We'll add an input for the URL, a "Scrape" button, and a repeating group that shows the list of scraped websites.
Creating the Workflow
When the button is clicked, we'll:
1. Call our API
2. Reset the input
3. Add the scraped data to our database
We'll create a new data type called "Website" to store our scraped H1 tags and URLs.
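The workflow is built visually in Bubble, but its logic can be sketched in a few lines of Python. Everything here is a stand-in: the Website dataclass mirrors the data type, a plain list plays the database, a dictionary plays the input element, and fake_scrape_h1 replaces the real API call so the example runs without a network connection.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Website:
    """Stand-in for the Bubble "Website" data type."""
    url: str
    h1: str


def on_scrape_clicked(
    url_input: dict,
    scrape_h1: Callable[[str], str],
    database: List[Website],
) -> None:
    """Run the three workflow steps for one button click."""
    url = url_input["value"]                  # read the URL before the input is cleared
    h1 = scrape_h1(url)                       # 1. call our API
    url_input["value"] = ""                   # 2. reset the input
    database.append(Website(url=url, h1=h1))  # 3. add the scraped data to the database


def fake_scrape_h1(url: str) -> str:
    """Hypothetical replacement for the API call, returning a fixed heading."""
    return "Example Domain"


database: List[Website] = []
url_input = {"value": "https://example.com"}
on_scrape_clicked(url_input, fake_scrape_h1, database)
print(database)
```

Note that the sketch reads the URL into a variable before clearing the input, so the saved record still has it after the reset.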
Testing the Web Scraping Functionality
Let's refresh our app and try it out. Remember that web scrapers aren't very intelligent, so you need to provide them with correct URLs. In a real app, you'd want to add error handling and input validation to ensure users enter valid URLs.
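As one possible take on that validation, here's a small Python sketch. The rules it applies (an http or https scheme, and a host containing a dot and no spaces) are my own simplification rather than anything Page2API or Bubble requires, and is_valid_url is a hypothetical helper name.

```python
from urllib.parse import urlparse


def is_valid_url(value: str) -> bool:
    """Return True if value looks like a complete http(s) URL."""
    try:
        parsed = urlparse(value.strip())
    except ValueError:
        # urlparse raises ValueError for some malformed hosts, e.g. "http://["
        return False
    host = parsed.netloc
    return (
        parsed.scheme in ("http", "https")  # scheme must be present and web-facing
        and "." in host                     # crude check that the host has a domain
        and " " not in host                 # hosts never contain spaces
    )


for candidate in ["https://example.com", "example.com", "htp://example.com"]:
    print(candidate, is_valid_url(candidate))
```

A check like this rejects "example.com" because the scheme is missing, which is a mistake users make often. In Bubble you could get a similar effect with a condition on the button, or by prepending "https://" yourself when it's missing.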
Handling Errors and Edge Cases
We tested a few different scenarios, including deliberately incorrect URLs, to see how our app handles errors. It's important to build in error handling and user guidance to make the web scraping process as reliable as possible.
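Here's a hedged sketch of what defensive handling of the scraper's response might look like in Python. It assumes the response is JSON with a "result" object keyed by the field name used in the request ("h1" here). That shape is my assumption, not something confirmed in the video, and extract_h1 is a hypothetical helper.

```python
import json
from typing import Optional


def extract_h1(response_text: str) -> Optional[str]:
    """Return the scraped h1 text, or None if the response is unusable."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return None  # not JSON at all, e.g. an HTML error page
    if not isinstance(payload, dict):
        return None
    result = payload.get("result")  # assumed key holding the parsed fields
    if not isinstance(result, dict):
        return None  # error responses are assumed to lack a result object
    h1 = result.get("h1")
    if isinstance(h1, str) and h1.strip():
        return h1.strip()
    return None  # page had no h1, or it was empty


print(extract_h1('{"result": {"h1": "Example Domain"}}'))
print(extract_h1('{"error": "Invalid URL"}'))
```

Returning None for every failure keeps the calling code simple: if nothing comes back, show the user a message and skip the database step instead of saving an empty record.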
Conclusion
This tutorial demonstrates how to implement web scraping in your Bubble app using Page2API. Remember to consider error handling, input validation, and user experience when implementing this feature in your own applications.