I've worked in the education sector. At least in the US there are well-known data protection laws (FERPA, most notably) that schools very much do know about and attempt to comply with. It's not quite HIPAA levels of serious, but they do take it seriously, and as another commenter notes, Google actually does comply.
I remember a teacher telling us that parents should not check their kids' Google Classroom accounts because it would be a violation of the other students' privacy. I understand what they were saying, but there's no way I'm not checking my kid's Google Classroom account. Ridiculous.
I was offered a job by a large ed-tech company that has all sorts of data, including parent-teacher communications, grades, and hourly attendance, for millions of students enrolled in K-12 programs in the USA. I initially accepted, but then they wanted me to build an early warning system to predict whether a student would have a behavioural problem the next day. I quit because I do not find it moral to build panopticons for children. But I'm sure they found someone to replace me.
Good job. I've similarly quit a job over ethical issues. But this is the problem with software engineering lacking an ethical standard: there's always Bob, three desks down, who is willing to build the Torment Nexus. One person's ethical stand is meaningless.
Thanks. When I quit that job I didn't really have any other option lined up. But I had just finished my PhD, and someone at the university offered me a six-month postdoc. Then I won a little grant that let me stay about another year, then another one for three years, and after that I started working at an AI company in my city. But at the time I was like, fuck, now what am I going to do? Haha.
lmao! I apologize, I really do. I know Dang says NO to snark, but "data protection laws" for students?! Despite my lobbying against it, my school uses the King of All Evil software suites: GoGuardian.
"GoGuardian Beacon continuously monitors online activity across school-issued devices, search engines, web apps, Gmail, and more to proactively detect concerning behavior." - What data is being protected? It's being collected and analyzed by everyone but the child's parents.. All you have to do is whisper s a f e t y . . . and data protection is tossed out the window.
I just hope you people are being paid to defend these immoral monstrosities. Google, Microsoft, Meta, etc. comply with nothing. They just pay a minuscule fine when outed a decade later.
As another data point: I’ve never looked at YouTube for database content, but every single DBA I’ve encountered in the corporate world pronounces SQL as ‘sequel’.
Anecdotal, but fair, since I can say the opposite: I've largely heard it said as S Q L, to the point that when someone verbally calls it "sequel" it takes me an extra moment to remember what they are referring to. I rationalize it thus: we don't make words from our other acronyms unless they're already pronounceable to some degree. Exploring that, we don't say "fibby" when we mean FBI or "see ya" when we mean CIA (maybe we should, though?). DARPA, on the other hand, is pronounceable on its own, so we say the word instead of D A R P A. Another one is HIPAA, which is interesting, since it's pronounceable, but people often get the acronym wrong as HIPPA when writing/typing it (FAFSA being another good example), although that is more US-centric.
Anyway, our treatment of saying the word instead of saying the letters of the acronym is interesting to think about, I guess. I reckon there's a lot of cultural and linguistic influence there. Perhaps someone smarter than me can unpack it.
It is interesting. I always thought it began as ’sequel’ and was then abbreviated, but I wasn’t around when it started back in the 1970s or whenever that was… might go google it haha
This is not the case for me. I spent some time on SO today, found only bad answers, gave up, and started reading the source code of the examples from the library I'm using.
I think people have stopped contributing. I stopped some time ago, as the site stopped motivating me to contribute.
Usually, with scraping tools you need to point out where content and other metadata are located. My parser is universal and works with every site out of the box. It automatically understands where the crucial information is located and then tries to parse it.
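To make the contrast concrete, here is roughly what the conventional "point the scraper at the content" style looks like. The selectors are hypothetical, for an imaginary blog layout:

    # pip install beautifulsoup4
    from bs4 import BeautifulSoup

    def parse_example_blog(html: str) -> dict:
        # Every selector below is hardcoded for one particular site's markup;
        # change the site and you rewrite the parser.
        soup = BeautifulSoup(html, "html.parser")
        return {
            "title": soup.select_one("h1.post-title").get_text(strip=True),
            "author": soup.select_one("span.author-name").get_text(strip=True),
            "body": soup.select_one("div.post-content").get_text(strip=True),
        }

A universal parser has to replace those hardcoded selectors with heuristics that locate the title, author, and body on a page it has never seen.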
Can you elaborate on how it does that? My knee-jerk reaction is that it's an LLM API call which, if true, would make me immediately suspicious (so I guess don't elaborate unless it isn't that lol)
Right now my parser uses a combination of open-source parsers and combines the best results they produce. These parsers take different approaches: some have hardcoded patterns and keywords that they search for in the DOM structure, and some use their own ML classification models. As for LLMs, I have plans to try them too, at least for websites that cannot be parsed with existing tools. I'm also thinking about training my own ML model on a huge corpus of HTML files (but that option is too expensive for me so far).
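The ensemble idea looks something like this. A minimal sketch, using trafilatura and readability-lxml purely as stand-ins (the actual parsers combined aren't named here), with "longest non-empty output wins" as a crude placeholder for real scoring:

    # pip install trafilatura readability-lxml lxml
    import trafilatura
    import lxml.html
    from readability import Document

    def extract_with_trafilatura(html: str) -> str | None:
        # trafilatura applies its own heuristics to locate the main content.
        return trafilatura.extract(html)

    def extract_with_readability(html: str) -> str | None:
        # readability-lxml scores DOM nodes (tag names, class hints,
        # text density) and returns the winning subtree as HTML.
        try:
            summary_html = Document(html).summary()
            return lxml.html.fromstring(summary_html).text_content()
        except Exception:
            return None

    def extract_main_content(html: str) -> str:
        # Run every extractor and keep the best candidate. "Best" here is
        # just the longest result; a real combiner would score candidates
        # on boilerplate ratio, link density, and so on.
        candidates = []
        for extractor in (extract_with_trafilatura, extract_with_readability):
            result = extractor(html)
            if result and result.strip():
                candidates.append(result.strip())
        if not candidates:
            raise ValueError("no extractor produced content for this page")
        return max(candidates, key=len)

The same skeleton extends naturally: an LLM-backed extractor would just be one more function in the tuple, reserved for the pages the classical tools fail on.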