Providers like Oxylabs can be quite restrictive, preventing access to many of th...

gruez · 2025-04-13T21:47:23 1744580843

>Providers like Oxylabs can be quite restrictive, preventing access to many of the common sites that scrapers choose to target.

Most of them seem pretty reasonable?

"Entertainment & streaming" - who's trying to scrape netflix's library?

"Banking and other financial institutions" / "Government websites" / "Mailing" - seems far more likely it'll be used for credential stuffing than for "scraping".

"Ticketing" - seems far more likely that it'll get used by scalpers than for scraping

The main targets of scraping - e-commerce sites (for price comparisons) and social media networks (for user-generated content) - are fine to scrape. Is there some use case I'm missing here? Is there a huge contingent of people wanting to scrape Ticketmaster or Bank of America?

bdcravens · 2025-04-14T03:04:52 1744599892

I used the term "scrapers" pretty loosely, but yes, in many cases they are more bad actors than actual scrapers. However, since they say the list may include other sites, I suspect Oxylabs adds sites to the list at the site owners' request (Amazon, Target, etc. are likely to be on those lists).

faizshah · 2025-04-13T20:32:38 1744576358

Hmm that’s unfortunate. I’m actually scaling up a data journalism project this month which is why I’ve been looking at these.

I’m curious if you can suggest a happy medium between curl-impersonate on a VPS (dirt cheap) and a residential proxy ($8/GB)?

Personally I’m not trying to hit any of those common sites.