Understanding Starter URLs

Starter URLs are the URLs from which the crawler begins crawling a website. Each starter URL should point to a page that contains the elements or data you need from the site. The crawler then looks for similar elements across the website, starting from that URL. If you would like your scraping to start from multiple places, you can specify multiple starter URLs.

Search URLs and category listing pages make the best starter URLs. If the listing page or search URL is paginated, the crawler follows the pagination to find further pages that contain the same kind of information.

For example, a search URL or category page for best-selling books on an online book retailer could look like www.bookwebsite.com/fiction/best-selling-100
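
The behaviour described above (several starter URLs, each followed through its pagination) can be sketched roughly as follows. This is only an illustration, not the crawler's actual implementation: the `crawl` function, the `FAKE_SITE` dictionary, the `?page=2` URL and the non-fiction URL are all hypothetical, and an in-memory dictionary stands in for real HTTP requests.

```python
from collections import deque

# Hypothetical in-memory "site": each URL maps to the items found on that
# page and the URL of the next page (None when pagination ends).
FAKE_SITE = {
    "www.bookwebsite.com/fiction/best-selling-100": (
        ["Book A", "Book B"],
        "www.bookwebsite.com/fiction/best-selling-100?page=2",
    ),
    "www.bookwebsite.com/fiction/best-selling-100?page=2": (["Book C"], None),
    "www.bookwebsite.com/non-fiction/best-selling-100": (["Book D"], None),
}


def crawl(starter_urls, site):
    """Visit every starter URL, then follow 'next page' links from each."""
    queue = deque(starter_urls)  # pages still to visit, seeded with the starter URLs
    seen = set()                 # avoid visiting the same page twice
    items = []
    while queue:
        url = queue.popleft()
        if url in seen or url not in site:
            continue
        seen.add(url)
        page_items, next_url = site[url]
        items.extend(page_items)
        if next_url is not None:
            queue.append(next_url)  # follow pagination
    return items


starter_urls = [
    "www.bookwebsite.com/fiction/best-selling-100",
    "www.bookwebsite.com/non-fiction/best-selling-100",
]
books = crawl(starter_urls, FAKE_SITE)
```

Here both listing pages act as starter URLs, and the second fiction page is reached only because the crawler follows the pagination link from the first.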

