Usually you only need some subset of the data per page load. If you invest some time looking at dev tools, you can probably find the API call you need and save yourself a few MB.
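A minimal sketch of what this looks like in practice. The endpoint path and JSON shape below are made up for illustration; the real ones are whatever shows up in your browser's Network tab (filter to XHR/fetch) when the page loads:

```python
# Instead of downloading the full product page (HTML plus JS bundles,
# often several MB), call the JSON endpoint the page itself uses and
# pull out only the field you care about.
import json

def extract_price(api_response: str) -> float:
    """Parse the (hypothetical) API payload and return just the price."""
    data = json.loads(api_response)
    return data["product"]["price"]["current"]

# A few hundred bytes of JSON instead of a multi-MB page load.
# In real use this string would come from e.g.
# requests.get("https://example.com/api/products/123").text
sample = '{"product": {"id": 123, "price": {"current": 19.99, "currency": "USD"}}}'
print(extract_price(sample))  # 19.99
```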
>Providers like Oxylabs can be quite restrictive, preventing access to many of the common sites that scrapers choose to target.
Most of them seem pretty reasonable?
"Entertainment & streaming" - who's trying to scrape Netflix's library?
"Banking and other financial institutions" / "Government websites" / "Mailing" - seems far more likely it'll be used for credential stuffing than for "scraping".
"Ticketing" - seems far more likely that it'll get used by scalpers than for scraping.
The main targets of scraping, e-commerce sites (for price comparisons) and social media networks (for user-generated content), are fine to scrape. Is there some use case I'm missing here? Is there a huge contingent of people wanting to scrape Ticketmaster or Bank of America?
I used the term "scrapers" pretty loosely, but yes, in many cases they are more bad actors than actual scrapers. However, since they say the list may include other sites, I suspect Oxylabs adds sites to the list at site owners' requests (Amazon, Target, etc. are likely to be on those lists).
Damnit, I didn't even know this existed, with such insane pricing.
"Residential proxies is based on traffic and purchase model. Pay as you go model starts at $7.35 per GB, and can be discounted as low as $1.84 per GB when purchased in bulk."
Yeah, but if you're just scraping, it's only a KB or two per request. Once or twice a day to check the price of an item would let you track thousands of items for years for just $8.
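A quick sanity check on that arithmetic. Every number here is an assumption: ~1 KB per response, one check per day per item, 1,000 items, two years, and the $7.35/GB pay-as-you-go rate quoted above (the bulk rate of $1.84/GB would be cheaper still):

```python
# Back-of-envelope cost of daily price tracking through a
# per-GB residential proxy. All inputs are assumptions.
per_request_kb = 1      # assumed response size per price check
checks_per_day = 1      # one check per item per day
items = 1_000           # number of tracked items
days = 365 * 2          # two years
price_per_gb = 7.35     # pay-as-you-go rate quoted above

total_gb = per_request_kb * checks_per_day * items * days / 1_000_000
cost = total_gb * price_per_gb
print(f"{total_gb:.2f} GB over two years, about ${cost:.2f}")
```

Under these assumptions it comes out to well under a GB total, i.e. single-digit dollars, so the claim holds for small responses; as the reply below notes, it breaks down fast if each "request" actually pulls a multi-MB page.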
Depends on the site. Some sites send down several megabytes of JavaScript or images per request. Some sites even send down massive JSON payloads to page through client-side instead of paginating iteratively.