Extractor Quality Score

We have created a free tool that gives indicators of the quality of your extractor, so you can tell whether you have built an efficient one. The tool is always available while you test your extractor as you build it in the wizard.

 

What makes a good data extractor?

A good extractor is one that extracts as much relevant data as possible from a webpage while using the fewest resources. Building a good extractor ensures that you get exactly the data you want and spend as little of your request allowance as possible.

 

Here are the factors that make a good extractor:

 

XPath Definitions

Importance: High

 

Start URL Definitions

Importance: High

 

Rows per request

Importance: Very High

 

Dropped rows per request

Importance: High

This is the number of rows removed from your results relative to the pages found by your extractor. For example, if you specify that your extractor must return rows with a price, the extractor will drop any page it finds that has no price. The best way to keep this metric low is to add start URLs and link extraction rules that limit the extractor to pages that contain “price”.
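To make the idea concrete, here is a minimal Python sketch of how such a ratio could be computed. It is an illustration only, not the tool's actual formula: the function name dropped_row_ratio, the sample pages, and the choice to treat an empty value the same as a missing one are all assumptions.

```python
from typing import Dict, List, Optional

# One extracted row per page found: column name -> value (None if not found).
Row = Dict[str, Optional[str]]


def dropped_row_ratio(rows: List[Row], required: List[str]) -> float:
    """Return the fraction of rows that would be dropped because at least
    one required column is missing or empty (hypothetical helper)."""
    if not rows:
        return 0.0
    dropped = sum(
        1 for row in rows
        if any(not row.get(column) for column in required)
    )
    return dropped / len(rows)


# Hypothetical crawl result: two product pages and two pages with no price.
pages = [
    {"title": "Blue kettle", "price": "19.99"},
    {"title": "About us", "price": None},
    {"title": "Red toaster", "price": "34.50"},
    {"title": "Contact", "price": ""},
]

ratio = dropped_row_ratio(pages, required=["price"])
print(f"Dropped rows: {ratio:.0%}")
```

In this sample, half of the pages found carry no price, so half of the rows are dropped. Restricting the start URLs and link extraction rules to product pages would keep the "About us" and "Contact" pages out of the crawl and bring the ratio down.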

 

Row count

Importance: High


 

Columns per row

Importance: Very High

 

Empty columns

Importance: High

 

Successful request rate

Importance: High

 

Redirect request rate

Importance: High

 

Enqueued/queued rate

Importance: High

