IIUC that was for illegally downloading ebooks and other media -- it had nothing to do with training per se. Scraping publicly accessible data is generally legal, although Microsoft/LinkedIn clearly think they have enough of a leg to stand on to at least litigate this.
Not an expert, but there was a court ruling in the US, I think last year, where circumventing login protection through bot-operated accounts, when the login is intended for human use, was ruled a violation of the CFAA. The current state of litigation in the US seems to be that scraping public-facing data/websites has been considered permissible by the courts, but scraping data behind a login intended for humans has not. I think there's still a split between the circuits, so this will go through some years of appeal yet.
A product called "LinkedIn Intro" that was killed within 6 months due to backlash and significant security flaws. It was somehow creating a reverse IMAP proxy to intercept your email traffic and "decorate" emails with someone's LinkedIn profile.
I don't recall all of the specific details, but I just remember reading about it at the time and how they bypassed some of iOS's security protections to do it. And that they didn't get perma-banned from the various app stores back then is beyond me. It's a huge part of why I avoid installing apps on my phone in general.
If they really want to put a dent into this, go after the biggest players scraping LinkedIn: PeopleDataLabs and Apollo.io (and no, taking down their company page does not count)
The dispute was settled because Pear agreed to slightly alter its logo, instead of continuing full litigation (maybe because of resources / dollars it would consume)
If that’s going to happen with a small fish then it was certainly going to happen against a big fish. Cheaper, faster, and easier to attack a smaller business first. There is literally no reason to go after a big dog unless they did something particularly egregious and/or distinct that you can anchor your argument with. Unless your goal is just to waste their time and that of their lawyers I guess, though I think we would all assume the goal is to win ultimately.
Even the legal filing and motions can help shape a case since they get rulings and such back. If a judge rejects a motion, maybe they need to approach it a different way when they go after big fish.
The only way this is not beneficial is if the software company settles or the case gets dismissed right away.
Or, go after the small fish who can’t afford to have a biglaw team on retainer, bulldoze them to get a legal precedent set, and then use the example to extract concessions from the bigger players.
Because they either have side deals with the big names, or they want to set precedent for going after them.
Not trying to be a conspiracy theorist here, but my bet is on a deal with the big players: we allow you to scrape us (or we give you a pipe you can consume from), and you pay us in monetary or non-monetary terms, like how many business exchanges work.
I've heard a lot of people cite this case as proof that scraping is legal, but it seems like the decision kept going back and forth in appeals, and I never understood what precedent it set, if any, around the legality of scraping.
This one seems different from the (correct) ruling in favor of hiQ Labs, where the courts were quite clear that scraping the public Internet was completely legal.
This is a case of a company creating millions of fake user accounts, so they’re behind the login wall and not on the public side of the Internet anymore. At least, that’s how I’m reading this.
Not sure your point, because of course you can. But when you make that account you agree to terms. Those terms do not permit you to take the data presented to be stored in your own database to monetize on your end. Make your own website to collect data. You’re being obtuse about this. Is it deliberate?
Oh dear, my office has been scraping LinkedIn forever. We use it to make visual networks of contacts in our industry, and relate that to whom we have working for the company. oops.
The Chrome extension approach may shift some (most?) of the risk to the end user, since technically they are now the one scraping. Theoretically getdex would be relatively better off in this arrangement, while putting their customers into a legal gray area.
They also make it difficult to destroy. Try deleting your post or comment history, and you can only do it slowly one by one, with only a few sketchy tools for making it faster that go against their terms of service.
I think most users don't want their data to be used by anyone and everyone. I sure don't. If one user needs access to their own data, they can always export it and take it where they please.
For most people the dangers of openness (see Cambridge Analytica), the lack of upside and the lack of security in small players mean that walled gardens are the best solution for the majority of people.
This lawsuit is exactly why people trust walled gardens to keep their data walled off. Because I trusted LinkedIn, not ProAPI and whatever malicious actors they sell to.
> This lawsuit is exactly why people trust walled gardens to keep their data walled off. Because I trusted LinkedIn, not [...]
Obviously LinkedIn is also in the business of selling the data about you, and also access to you.
LinkedIn just doesn't like this other company leeching off that data LinkedIn got about you, and then competing with LinkedIn in making money off that data (including access).
Selling data inside their walled garden in a way I am OK with in exchange for a free service.
Not a 3rd party selling my information to a scam farm in a foreign land that has no laws that will use all of that information to extract money from my parents.
But LinkedIn is doing so in accordance with the legal agreement I have with them, which I am able to exit at any time and instruct them to remove my data. I can't do this for every company that illegally (in many jurisdictions) hoards information about me.
You're currently on one of the very few sites with no delete/edit button for your own content (after a short initial period.) It's the only site I can think of that hoards my data like that. Which is why I only post anonymous throwaway content here.
I think trusting data you post publicly to only remain exactly where you publish it is naive at best. I think it's much more sensible to think that as soon as you put something public, it will exist somewhere forever, and it's foolish to believe otherwise.
I don't even trust LinkedIn, but it's not like I can sue them for offering antisocial terms, let alone force them to a negotiation table. It's just a shitty situation all around. At the very least they should pay me to use the site if they're making money off of it.
If everyone has access to your data it becomes even more worthless and you will definitely not get paid for it. At least now I can keep it somewhere and they can use it to fund engineers to keep the service up, lawyers to make sure your data stays safe, etc.
You are free to leave and delete your data, unlike if everyone has access to it then it is out there in perpetuity.
You definitely can't sue a data broker to pay you/stop using your data.
I sure do! If LinkedIn can't market my resume to open roles then letting recruiters roll their own scrapers against it is the next best thing. I understand that LI owns my data, I just wish they were effective in using it!
I don't see how I (or LinkedIn) am making you do anything. LinkedIn is a place I can post data. I choose to do so in an attempt to market my resume. I fully expect that the data I post on LinkedIn's servers becomes the property of LinkedIn, and I wish it was more effective at extracting value from it.
Because LinkedIn is less effective than I'd like, I support 3rd parties scraping the data I posted there, again on the hope that they'd be more successful at marketing that data, which I would benefit from as the data is my resume.
Well maybe I can get that company to backup my LinkedIn posts because it is utterly broken to download anything about my profile to make a backup.
There is an API option, but the endpoints from the documentation just return 404. There is the Data Privacy "download my data" option, but I really wanted data like my posts and photos, not a crappy CSV with basic properties. In the end there is "View the rich media", but I have to click through it one by one, and there is no post text on the images. I can only get that by going through my posts one by one and copy-pasting. It sucks, despite the "your data belongs to you" text on the labels.
These are my posts; I have a personal attachment to what I wrote.
Most of what I wrote I have in my notes anyway, but still, if they say it is my data and I can always download it, I really want to be able to download it. I don't like that someone just puts up lies on their website like "data is yours, you can always download it".
Basically, LinkedIn is just pissed off they weren't getting a cut of the profits this small company made on LinkedIn's (already public?) data.
The winners here are the law firms on both the plaintiff and defendant sides. Drag this through the court system for as long as possible. PR. PR. PR. Then settle out of court for an "undisclosed amount."
This is the mafia equivalent of "sending a message" in corporate land. Yawn.
They're owned by Microsoft and poorly managed. Hundreds of people get locked out daily and can no longer access or change their OWN data. I say, let the scrapers take them down. We need to stop the walled-in gardens of data these companies DON'T own. It's the user's data.
I somehow want both parties to lose.