By Victor, August 20, 2021
Most of us now buy products online, and finding the website with the lowest price takes a lot of manual effort. So most of us simply go to a popular website such as Amazon or eBay and buy there. What if we could build a price comparison tool that collects prices from different websites and shows the best price, along with the associated product information, in a single place? That is what we are going to do in today’s project.
In this tutorial we will focus on the following:
Fetching price data from three different websites
Processing and cleaning the data for our purpose
Writing a program to send notifications about price changes
Using webautomation.io to speed up scraping
Web scraping is the process of collecting relevant information from a web page and exporting it in a format that suits our needs.
Python package for web scraping: Beautiful Soup is a Python library that helps extract data from markup languages such as HTML and XML.
Other Python packages involved: requests
Note: We recommend using Google Colab or Jupyter Notebook as the editor for this project, although it is not mandatory.
Step 1: Install prerequisites:
Install Python (https://www.python.org/downloads/)
Install requests ()
Step 2: Import packages:
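A minimal set of imports covering the two packages named above might look like this. Both can be installed with `pip install requests beautifulsoup4`; the small parsing check at the end is only an illustration.

```python
# Third-party packages: pip install requests beautifulsoup4
import requests                 # HTTP client used to download each product page
from bs4 import BeautifulSoup   # HTML parser used to locate the title and price elements

# Quick sanity check that the parser is available: parse a tiny document.
soup = BeautifulSoup("<p>hello</p>", "html.parser")
print(soup.p.get_text())
```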
Step 3: Go to the product page on each website and copy its URL:
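One simple way to keep the three URLs together is a dictionary keyed by website name. The URLs below are placeholders, not real product pages, and the name PRODUCT_URLS is our own choice for illustration.

```python
# Placeholder URLs (assumptions for illustration only): swap in the real
# product-page URL copied from each website.
PRODUCT_URLS = {
    "Amazon": "https://example.com/amazon-product-page",
    "OnBuy": "https://example.com/onbuy-product-page",
    "Wexphotovideo": "https://example.com/wexphotovideo-product-page",
}

for company, url in PRODUCT_URLS.items():
    print(f"{company}: {url}")
```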
Step 4: Populate headers:
To find your user agent, search Google for "my user agent".
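A minimal sketch of the header and download step, assuming the requests package. The User-Agent value is a placeholder to be replaced with your own, and fetch_html is a hypothetical helper name rather than part of the original code.

```python
import requests

# Placeholder value (assumption): paste the string your browser reports
# when you search for "my user agent".
HEADERS = {"User-Agent": "Mozilla/5.0 (replace with your own user agent)"}


def fetch_html(url, headers=HEADERS, timeout=10):
    """Download a page and return its HTML as text; raises on HTTP errors."""
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.text
```

Calling fetch_html with one of the product URLs returns the page source as a string, ready to be parsed with Beautiful Soup.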
Now go to the Amazon page, right-click the product title, and select Inspect.
Clicking Inspect opens the browser's developer tools with the corresponding HTML highlighted.
As you can see in the HTML source code, the element with id productTitle contains the title of the product.
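Looking up that element with Beautiful Soup could be sketched as follows. The sample_html string is a simplified stand-in for the downloaded page, not Amazon's actual markup.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the downloaded Amazon page (assumption).
sample_html = '<html><body><span id="productTitle">\n  Example Camera 24MP  \n</span></body></html>'

soup = BeautifulSoup(sample_html, "html.parser")
title_element = soup.find(id="productTitle")  # returns the whole tag, markup included
print(title_element)
```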
This gets us the product title, but the result still contains HTML tags, so it needs to be cleaned before further processing.
To remove the tags:
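One way to drop the markup is the tag's get_text method, sketched here on a simplified sample element (the sample markup is an assumption).

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the title element found on the page (assumption).
sample_html = '<span id="productTitle">\n   Example Camera 24MP   \n</span>'
title_element = BeautifulSoup(sample_html, "html.parser").find(id="productTitle")

# get_text() returns only the text inside the tag; strip=True trims the
# surrounding whitespace and newlines.
product_title = title_element.get_text(strip=True)
print(product_title)
```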
We now have the product title, stored in the variable product_title.
Similarly, when we right-click the price and inspect it, we get the following HTML source code:
Here, the element with id priceblock_ourprice contains the price. To fetch the price we need the following code:
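A hedged sketch of the price lookup, using the id named above. The helper extract_text_by_id and the sample markup are our own assumptions. Page layouts change over time, so the helper returns None when the id is not found instead of raising an error.

```python
from bs4 import BeautifulSoup


def extract_text_by_id(html, element_id):
    """Return the stripped text of the element with the given id, or None if absent."""
    element = BeautifulSoup(html, "html.parser").find(id=element_id)
    return element.get_text(strip=True) if element is not None else None


# Simplified stand-in for the price element on the Amazon page (assumption).
sample_html = '<span id="priceblock_ourprice"> $499.00 </span>'
amazon_product_price = extract_text_by_id(sample_html, "priceblock_ourprice")
print(amazon_product_price)
```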
Now we have the product price from Amazon in the variable amazon_product_price.
In the same manner, we will get the prices from the other two e-commerce websites.
Visit the OnBuy page, right-click the product price, and select Inspect.
We get the following HTML elements from Inspect:
As you can see, this layout is a little different. Here we have to locate the price by the element's class rather than by an id, as we did for Amazon.
So, to fetch data from an element by its class:
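A class-based lookup could be sketched like this. The class name "price" and the sample markup are hypothetical, so use whatever class the Inspect panel shows for the real page.

```python
from bs4 import BeautifulSoup


def extract_text_by_class(html, class_name):
    """Return the stripped text of the first element with the given class, or None."""
    # class_ (with the underscore) is used because "class" is a Python keyword.
    element = BeautifulSoup(html, "html.parser").find(class_=class_name)
    return element.get_text(strip=True) if element is not None else None


# Hypothetical markup standing in for the OnBuy price element.
sample_html = '<div class="product-info"><div class="price"> $489.99 </div></div>'
onbuy_product_price = extract_text_by_class(sample_html, "price")
print(onbuy_product_price)
```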
For Wexphotovideo:
Wexphotovideo has the same layout as OnBuy, so we can repeat the same process here.
Get the HTML data:
Clean and extract the price from the HTML tags:
Next, we remove the currency symbols and convert the prices from strings to floats so that they can be compared.
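A small sketch of that conversion. The helper name parse_price is hypothetical, and it assumes prices use a dot as the decimal separator and, optionally, commas as thousands separators.

```python
import re


def parse_price(price_text):
    """Convert a price string such as '$1,299.00' to a float."""
    # Keep only digits and the decimal point; this drops currency symbols,
    # thousands separators, and whitespace.
    cleaned = re.sub(r"[^\d.]", "", price_text)
    return float(cleaned)


print(parse_price("$1,299.00"))
print(parse_price(" $489.99 "))
```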
Company and URL contain the website name and the product URL for the lowest price.
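Finding the cheapest offer could look like the sketch below. The prices and URLs are sample values for illustration, and the lowercase names company and url stand in for the variables mentioned above.

```python
# Sample prices and placeholder URLs (assumptions for illustration only).
offers = {
    "Amazon": (499.00, "https://example.com/amazon-product-page"),
    "OnBuy": (489.99, "https://example.com/onbuy-product-page"),
    "Wexphotovideo": (505.00, "https://example.com/wexphotovideo-product-page"),
}

# Pick the website whose price (first item of each tuple) is the smallest.
company = min(offers, key=lambda name: offers[name][0])
lowest_price, url = offers[company]
print(company, lowest_price, url)
```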
We can write a function that sends the notification to our email addresses using SMTP.
Now that we have the price data, a bar chart makes the prices easier to compare than raw numbers. Visualization becomes more useful as the number of data points increases.
The Python library matplotlib makes it easy to visualize price data from the three websites. Here we use a matplotlib bar chart to compare the prices.
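A minimal bar chart sketch, assuming matplotlib is installed. The prices are sample values and plot_prices is a hypothetical helper name.

```python
import matplotlib.pyplot as plt


def plot_prices(prices):
    """Draw one bar per website and return the matplotlib figure."""
    fig, ax = plt.subplots()
    ax.bar(list(prices.keys()), list(prices.values()))
    ax.set_xlabel("Website")
    ax.set_ylabel("Price")
    ax.set_title("Price comparison")
    return fig


# Sample prices (assumptions for illustration only).
sample_prices = {"Amazon": 499.00, "OnBuy": 489.99, "Wexphotovideo": 505.00}
fig = plot_prices(sample_prices)
```

In a notebook the figure is displayed automatically; in a plain script, call plt.show() or fig.savefig("prices.png") afterwards.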
How useful would it be to get a notification about any price change that interests you? The following code shows how a simple Python script can send such notifications by email.
The script sends a notification about the company with the lowest price, with a link that can be used to buy the product. The variable body in the code can be changed to suit our needs.
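A sketch of such a script using Python's standard smtplib and email modules. The function names, addresses, and message wording are assumptions, and the SMTP host, port, and credentials depend on your mail provider. This version assumes a server that accepts SSL connections.

```python
import smtplib
from email.message import EmailMessage


def build_notification(company, price, url, sender, recipient):
    """Create the email announcing the lowest price; edit body to taste."""
    body = (
        f"The lowest price is {price:.2f} at {company}.\n"
        f"Buy it here: {url}\n"
    )
    message = EmailMessage()
    message["Subject"] = f"Lowest price found at {company}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(body)
    return message


def send_notification(message, smtp_host, smtp_port, username, password):
    """Send the message over an SSL-encrypted SMTP connection."""
    with smtplib.SMTP_SSL(smtp_host, smtp_port) as server:
        server.login(username, password)
        server.send_message(message)


# Building the message needs no network access; the values are placeholders.
message = build_notification(
    "OnBuy", 489.99, "https://example.com/onbuy-product-page",
    "me@example.com", "you@example.com",
)
print(message["Subject"])
```

To actually send the email, pass the message to send_notification along with your provider's SMTP host, port, and login details.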
We can schedule the code above to run periodically and send us a notification whenever the price falls.
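One simple way to do this without extra tools is a polling loop, sketched below. An operating-system scheduler such as cron would work as well. The function watch_price and its parameters are hypothetical: get_lowest_price stands for the scraping code and notify for the email code.

```python
import time


def watch_price(get_lowest_price, notify, interval_seconds=3600, max_checks=None):
    """Poll get_lowest_price() and call notify(price) whenever the price drops.

    max_checks=None keeps polling forever; a number stops after that many checks.
    """
    last_price = None
    checks = 0
    while max_checks is None or checks < max_checks:
        price = get_lowest_price()
        if last_price is not None and price < last_price:
            notify(price)  # price fell since the previous check
        last_price = price
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval_seconds)  # wait before the next check
```

For example, watch_price(get_price, send_email, interval_seconds=3600) would check once an hour, where get_price and send_email are your own functions.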
Alternatively, if you want a plug-and-play solution where you enter a URL and get the data without writing a line of code, WebAutomation is the tool for you.
Try an easy-to-use, pre-built scraper from https://webautomation.io. All you have to do is enter the starting URL of the web pages you want to scrape, and it will give you the data in a clean, downloadable format.
Steps to follow:
1. Sign up for a free trial here: https://webautomation.io/account/sgn/
2. You can use a ready-made scraper for popular websites like Amazon for free at https://webautomation.io/pde/amazon-department-product-scraper/80/
3. You can scrape any link with the help of the raw data extractor. This extractor returns the HTML source of every visited link.
4. You can also use the API to get structured data: https://webautomation.io/api/redoc/
We aim to make the process of extracting web data quick and efficient so you can focus your resources on what's truly important: using the data to achieve your business goals. In our marketplace, you can choose from hundreds of pre-defined extractors (PDEs) for the world's biggest websites. These pre-built data extractors turn almost any website into a spreadsheet or API with just a few clicks. The best part? We build and maintain them for you so the data is always in a structured form.