Companies must stop using Google Analytics (imy.se)
646 points by pseudotrash on July 4, 2023 | 456 comments


My sister-in-law (girlfriend's brother's girlfriend, not that it matters) recently studied for a data analytics certification. Actually, several.

The entire course (located here: https://medieinstitutet.se) is based on Google Analytics.

Now her entire professional value is tied to the use of Google Analytics, so she will almost certainly fight very hard to ensure that these skills remain relevant; nobody would want to retrain for 6-12 months on new analytics systems (or, god forbid, not be an analyst at all!).

I don't think we really assess the amount of lock-in we accept when we learn something that supposedly makes our lives simpler. Google Analytics was sold as an alternative to building your own analytics, because that's hard! The cost is that Google gets your information too, which most webmasters individually don't care about.

However, we're now in a situation where at least a few thousand people depend on this precise tool existing, and they will be economically useless if it is banned.

Personally I find this astonishingly foolish of the people who train exclusively on these tools instead of first principles and primitives.

That said, we also have "Cloud Engineer" as a job title, so I'm not sure we will learn this lesson.


If the ability to work with the things taught in that course is so dependent on GA, then I dare question the term "data analytics" for that certification. Data analytics is a general area of expertise, that is not bound to GA. Perhaps the certification should be called "Google Analytics Certification", instead of implying more knowledge and skill than is actually there.

Data analytics involves some statistics, and these days probably a pinch of training ML models, using them, and understanding their basics as well. Source: I did some work for a company specializing in creating courses for actually learning data analyst skills, as preparation for switching careers towards data analyst jobs. I myself helped create course content. The course is officially certified for job-seeking people as a means of learning a new job.
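
Just to illustrate what tool-agnostic analyst skills look like in practice, here's a toy sketch in pandas (the event log, column names, and numbers are made up; the point is that the analysis doesn't care which tool collected the data):

    import pandas as pd

    # Toy event log; in GA these would be pageview/purchase hits, but the
    # analysis is identical no matter which tool collected the data.
    events = pd.DataFrame({
        "session": [1, 1, 2, 2, 3, 3],
        "source":  ["ads", "ads", "organic", "organic", "ads", "ads"],
        "event":   ["view", "buy", "view", "view", "view", "buy"],
    })

    # Conversion rate per traffic source: the share of sessions that
    # contain at least one "buy" event.
    converted = (events.assign(buy=events["event"] == "buy")
                       .groupby(["source", "session"])["buy"].any())
    print(converted.groupby(level="source").mean())

That kind of thing transfers to GA, Adobe Analytics, Matomo, or a raw SQL dump equally well.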


Exactly.

It's interesting to observe how the existence of a mediocre course in Sweden is leveraged to make the popularity of Google Analytics a major concern.

And maybe the course is not even that Google-dominated. Looking at the content here https://medieinstitutet.se/utbildningar/digital-analytics-di..., they use both Google Analytics and Adobe Analytics and mention other tools like Hotjar.


This level of dependence on certain tools is neither rare nor unprecedented.

Your run-of-the-mill business drone will be trained on Word/Excel/Outlook and be hard to impossible to retrain on anything else (either because of actual stupidity or resistance to change). This already starts at school, where "Informatik" (the computer science class) is often just learning where to click in Microsoft products.

Similarly, tradespeople often specialize in certain tools and products. Your average car repair guy will often be forced to specialize in one brand of car. Your home appliance guy will preferably sell and repair one brand of washing machine, dryer, and dishwasher.


'Actual stupidity'... gotta love the general disdain for the 'run of the mill business drone'... It's funny, my wife is a run of the mill business drone. She thinks IT is a bunch of assholes. I would say she is probably right. Way to keep things going.


and IT "assholes" think the run of the mill business drones are "assholes" as well. Their inability to be effective at their jobs tend to make IT lives worse because they can't understand what IT workers do but IT workers can understand what the basic run of the mill business drones do...and their work tends to be a bunch of pointless meetings.

Yeah, I work at a corporate office and have made it a mission to see what kind of work they do, and the majority of the time... it's pointless meetings, and meetings that involve pointing at IT workers and saying "do this". I check many of their daily schedules and see what kind of stuff they talk about in meetings... just wow. Am I an "asshole"? Sure, you can call me that, but I can call them useless in turn, because I wonder how many more qualified people are out there who could replace these workers.


> but IT workers can understand what the basic run of the mill business drones do...and their work tends to be a bunch of pointless meetings.

What an ironic comment.

Sorry, any early-career worker looking down their nose at anyone else (or pretending to have any idea what their job entails, especially because they “looked at a calendar”) might as well go back to middle school. They definitely need to grow up.


While true, it’s very fair to say that many (possibly most) meetings in a corporate environment contain a lot of absolutely pointless time wasting.

I say this with the perspective of someone who has slowly had to have many more of these meetings added to his calendar over the years.

Some are very important, some are reasonable but often bloated, but so many are a waste of time.

At the very least: they’re 5-10 mins of work spread over an hour. It’s occasionally maddening.


I find that a lot of engineers don't understand good communication, and definitely don't understand the value of relationships.

When I was really junior, I'd go to meetings and think that the vast majority of the time was wasted. As I became more senior, I realized that a lot of that wasted time is for providing context, relationship building, and alignment. You may not need those things for your current task, but your leadership and partner teams may need these things.

Yes, a lot of meetings could be emails, and a lot of meetings could be better run (agendas and objectives in the invite, action items assigned at the end), but unless you're working somewhere awful, most meetings probably have a reasonable purpose and aren't all filler. Lots of jobs require way more meetings, and probably aren't filled with context relevant to you.

Looking down on non-engineering positions is a personality trait I associate with inexperience. It's absolutely something I'd consider when denying a promo.


> When I was really junior, I'd go to meetings and think that the vast majority of the time was wasted. As I became more senior, I realized that a lot of that wasted time is for providing context, relationship building, and alignment. You may not need those things for your current task, but your leadership and partner teams may need these things.

100% agree. In early or IC roles, it's easy to think "just let me go do X" (or worse, "talking about X or Y is a waste of time when X is the obvious answer") without seeing the bigger picture that there's tremendous value in making sure other teams are aware of what X is, why it's important, and having a chance to weigh in or ask questions. Certainly there are valid complaints about some people's meetings, but those shouldn't overshadow the alignment/communication value meetings can have.


> I realized that a lot of that wasted time is for providing context, relationship building, and alignment.

A lot of that seems to be about office politics, which has historically been something that engineers, and office workers in general, have disliked. It is a generally unhappy fact that relationship building and office alignments dictate who gets promoted, who gets raises, who gets the desired assignments, and who doesn't.

It might be true that those who refuse to play that game are seen as inexperienced. In my experience, employees who get tired of it generally leave large companies, which leaves behind only inexperienced employees and those who enjoy the game.


Looking down on, yes. Suffering nonsense meetings silently? No.

My entire point was that there is a major difference between the two, and that while the instinct to look down on others for this organizational symptom is immature, it's not unfounded or without basis to highlight the issue; they're just blaming the wrong thing.

And I often find that in those types of meetings, communication & relationship building is the absolute last thing happening. Most of these meetings are CYA, checklist-type meetings.

Meetings that literally only exist to allow someone to demonstrate they had a meeting about something.

Worse, the actual communication that is happening is usually in side channels.


I'm guessing you're junior, or you'd have more control over these meetings, or have the ability to decline ones that you believed weren't going to be good use of your time.

Having good meeting culture requires everyone involved to improve it. If you want meetings to be better, set them up, add an agenda and objectives, and run the meeting so that it's effective. If you can't run the meeting, if it doesn't have an agenda or objectives, ask the person who created it for them. Ask for action items at the end of the meeting, if no one is calling for them. If it's mostly status meetings, propose a better process to track and communicate status.

If you're working through side channels, you're part of the problem.

Calling people assholes, rather than improving the situation, is an indicator of inexperience.


This is the most wrong, yet condescending and naive, comment I’ve encountered on HN so far.

I’m actually impressed.

Call me in 20 years, maybe you’ll have learned something.


Getting real "toxic positivity" vibes off your whole line of arguments...


Being proactive in changing the situation isn't toxic positivity, and being empathetic with folks in different jobs isn't toxic positivity either.

Assuming people are assholes, and blaming them for situations is just run of the mill toxic. Working through side-channels rather than addressing a problem is also run of the mill toxic.


Nah, your stuff came across as blind tolerance, which is pointless, unconstructive, and exists solely to placate people's feelings (in such instances unjustifiable/irrational).

Just because Joe or Judy "feels" a certain way doesn't mean it should actually have bearing on anything. Really...

Enough placating those with the least logic and self control.


This attitude is the kind of engineer stereotype we can live without.

People's feelings matter in the long term, because it's the difference between them wanting to work with you and them being forced to work with you. If I had to pick between a genius coder with awful people skills and an average coder with exceptional people skills, I'd essentially always pick the average one.


Someone's feelings do matter in the long term, which is where they should be dealt with, not in the immediate.

Do you not see the difference?


People's feelings about you are based on a series of accumulated interactions. You can't just "deal with it" later, because that's not how people work. This isn't like tech debt, where you can accumulate some and then spend some time working it down later. If you're consistently an asshole to people, it doesn't matter if you take a little time now and then to try to repair that. Good relationships require taking people's feelings into account and acting consistently.


The context of the discussion was accommodating feelings during meetings, right?

Well that's absolutely not the place for it.

If your little fee fees get hurt, you keep it to yourself and focus on the task at hand. Then, after the task is complete, you either pull the offender aside to address the problem or you bring it to a superior to be addressed. It should in no way have any bearing on the work at hand.

This is basics of work place decorum, right?


> And IT "assholes" think the run of the mill business drones are "assholes" as well. Their inability to be effective at their jobs tends to make IT workers' lives worse, because they can't understand what IT workers

Have you ever dealt with an average IT department in a non-tech company? This attitude doesn't help anyone and I really want to believe that only a small minority of tech people think of any other worker anywhere as an "asshole".


I'm a Sr. SE at a large medical device company and can confirm that our IT dept is filled with assholes. We do everything we can to keep systems out of their hands because they are so difficult to work with compared to every other part of the company.

I get that they have to deal with a bunch of technically inept users constantly falling for phishing attacks, and that occasionally teams will make outrageous requests of them that simply can't be done, but their attitude is terrible.

If you ask for something simple but "scary", like a firewall or internal network change, they will immediately assume you are just some idiot and speak dismissively to you in a very obvious manner. It's extremely frustrating, because they won't even bother to read the emails justifying your change; they will just invent some unrelated excuses about why they can't do it, or say they will get back to you later (they don't).

Ironically the only way to get anything done through them is to have my team members create a bunch of duplicate tickets (1 per person), and schedule multiple pointless meetings with them that essentially just consist of me reading my emails to them out loud.

Non-technical teams in the company get the same treatment but lack the technical background to push back. Frequently I've had team leaders come to me for a second opinion on the stuff IT tells them, and it bothers me how much they seem to clearly exaggerate the difficulty of things. To the point where I can't help but wonder if they are just pretending to know what they are doing, and using their better-than-you attitude to mask their own ineptitude.

So overall I feel the negative reputation of IT departments is earned.


> and it bothers me how much they seem to clearly exaggerate the difficulty of things.

Scotty Engineering principle at work. I'm no stranger to that, it's often enough the only strategy keeping higher management from completely swamping you with work.


It usually isn't the Scotty Engineering principle. More often it's a hedge against unforeseen complications: if you give a realistic estimate that holds for 90% of tickets, the other 10% will come back to bite you. So, to avoid getting chewed out for providing realistic 90%-true estimates, you give 99.9%-true estimates that are far higher, with just a 0.1% chance of being screwed instead of a 10% chance. Which, at 10 tickets per day, is the difference between getting chewed out almost daily and getting chewed out once every few months.
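
To put rough numbers on that (a quick back-of-the-envelope sketch in Python; the 10-tickets-per-day figure is from the comment above, the rest is elementary probability):

    # Chance of at least one blown estimate in a day, given a per-ticket
    # risk and the 10-tickets-per-day figure from the comment above.
    def p_bad_day(per_ticket_risk, tickets_per_day=10):
        return 1 - (1 - per_ticket_risk) ** tickets_per_day

    for risk in (0.10, 0.001):
        p = p_bad_day(risk)
        print(f"per-ticket risk {risk:.1%}: a blown estimate every ~{1 / p:.0f} working days")

At 10% per-ticket risk you blow an estimate roughly every other day; at 0.1% it's roughly once every hundred working days. That's the whole incentive in two lines.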


Reminds me of the saying: "don't tax you, don't tax me, tax that guy behind the tree".

It's always us vs. them.


[flagged]


What are some typical everyday scenarios where this is applicable to a real business?

Please include all the steps that chatgpt would independently take, like deciding what meetings to schedule, attending them, and presenting at them for example, and specify who would verify chatgpt’s correctness.


Sadly, citing off-topic ChatGPT woes and reducing Americans to a broad generalization is not a sign of intelligence either.


Statistically speaking, assuming a normal distribution of intelligence, there is a specific percent where this generalization will start to apply.

Now, I don't know what this percent is, but let's give it a name: the ChatGPT percentile cut line.

My intuition says this line sits to the left of the median, so in a sense you are right, meaning that fewer than 50% of people have a measurable intelligence lower than this line.

However, it can also be higher than 5%, and this means many millions of people can be easily replaced by an automation tool, without any bad consequence.
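
For what it's worth, the arithmetic is easy to play with using just Python's standard library (the mean of 100 and SD of 15 are the conventional IQ scale; the 5% cut line is purely the hypothetical from above):

    from statistics import NormalDist

    iq = NormalDist(mu=100, sigma=15)   # conventional IQ scale

    # If the hypothetical "ChatGPT percentile cut line" sat at the 5th
    # percentile, this is the IQ it would correspond to, and the rough
    # headcount at a world population of ~8 billion.
    cut = 0.05                          # the 5% figure from the comment
    print(f"cut line ~ IQ {iq.inv_cdf(cut):.0f}")   # ~ IQ 75
    print(f"people below it: ~{cut * 8e9:,.0f}")    # ~ 400,000,000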


I find it hard to believe that the intelligence you assume is required to do a person's job is an accurate indication of that person's intelligence.


Those people probably aren't office workers


If someone can't be retrained, and it's not because they're being stubborn, what other reason do you have?

Is it better or worse to assume they're being a problem on purpose?

I won't say 'average' but I will say 'common enough to make changing software a huge issue'.



Many devs are the same, hence loyalty to a language


Unless it's webdev, because from the outside looking in, it seems there's some new fad moving through every few months.


Who's the a-hole:

The person trying to make things better and seeking out knowledge at great cost?

Or

The person who is willfully ignorant and actively refuses knowledge handed on a platter?

Because one of those is the typical IT pro and the other is the worker drone....


Don’t worry, the people in IT don’t like the other people in IT. Devs think that security are a bunch of a-holes, DBAs hate the devs, etc etc.


lol

Seriously, some looking down the nose comments


> Your average car repair guy will often be forced to specialize in one brand of car.

Not because of skill issues, but maybe forced because they work for a dealership that sells a particular make exclusively or, less often, a specialty shop; most “car repair guys” outside of those environments have to be generalists.

> Your home appliance guy will preferrably sell and repair one brand of washing machine, dryer, dishwasher.

IME, the sales are done by shops that carry many brands, and delivery, installation, and repair are done by firms that often have relationships with the retailers and handle whatever you get from them, including multiple units of different brands that come together with the same team. They may also have relationships with the manufacturers, but those don't usually seem to be exclusive.


> Not because of skill issues, but maybe forced because they work for a dealership that sells a particular make exclusively or, less often, a specialty shop; most “car repair guys” outside of those environments have to be generalists.

Yes, I'd expect any car guy to be able to change your tires. Or change your oil. But even resetting the oil-change alarm or tire-pressure sensor can be a hurdle here:

Manufacturers also use skill issues to their advantage to bind tradespeople. Modern cars need manufacturer-specific diagnostic devices that used to be unobtainable for independent shops. Since that practice has been largely forbidden by the authorities, the software, cabling, and diagnostic output are now made intentionally hard to understand without taking the corresponding lessons that the manufacturer provides for a modest fee.


Perhaps you're using a broad/generic example to try and make the point but I'll say this:

If a seasoned mechanic is unable to figure out how to reset the Maintenance Reminder or look up how to sync Tire Pressure sensors, run away.

In the same way that one can use knowledge of one programming language as a means to leapfrog into other languages, other skilled trades are similar. Perhaps there's something that could be said about an ICE mechanic trying to dabble in electric, but that's not the point you're making. So yeah, I know you're trying to make a point about lock-in, but when I think of people I want to hire for tasks who might say "Oh, sorry, you have a Volkswagen and I only know how to work on GMC", I wouldn't take my GMC to them either. It shows a fundamental lack of skill, in that they don't understand the broader concepts and their universal applications. If I, a programmer, can figure out my Volkswagen, my GMC, my Mazda, my Nissan, certainly a mechanic can. If my appliance repair specialist can only do Whirlpool when I ask for help on a Bosch, that's a red flag.

One might specialize, sure. But to refuse? Weird. But I fear I might be getting lost in the weeds here, because it's all about the approach. "Sorry, too busy to take on work on things that aren't my specialty": yep, understood. "Sorry, I don't know <model>, I only know <other model>": bad.


I know mechanics in particular can be quite chauvinistic.

In the US, for a very long time, you had to find an "import specialist" mechanic, even long past the point where Japanese brands had gone mainstream. Part of this might have been because of the availability of metric tools at the time; my family had a set of metric wrenches specifically because they had to do occasional light maintenance on their early Datsuns and Toyotas.

I can recall that the mechanic in my neighbourhood was decidedly unwilling to service a new Hyundai in the late '90s. He complained they were 'disposable'.


Mechanics are in business to be profitable.

Specialized items require specialized tools. Specialized tools, like all other tools, require maintenance and they change.

A shop dealing with domestic produced automobiles can significantly reduce profit-bleed by not servicing vehicles that require special tools, special diagnostics, special machines, etc.

It's simply a math equation. Do I serve enough of these vehicles daily/quarterly/yearly to make these expenditures profitable for me? The shops you're referring to answered no to that question.


Try to get a French car serviced in Poland; most shops are used to dealing with VW and Audi.

It is not impossible, but if you go to a random shop you found on Google to fix your Citroen or Renault, you might be surprised.


It could just be a parts issue. A lot of mechanics will work on pretty much any car, if they have the parts, but if it's not a popular model then they don't want it sitting in their shop for a week while they order parts.

I guess some mechanics will prefer to work with a smaller number of models, because they're much faster if they're familiar with the model, but new models come out every year, and they need to learn how to fix those. If a mechanic can learn to fix the newest VW, they can learn to fix the newest Renault, it just might not be worth their time if they have enough work to do.


I'm not from Poland, but I hope this is hyperbole. Parts delivery is once or twice a day in most cities in Europe, no matter if it's for a Renault, a VW, or a Kawasaki motorcycle. Of course a part can have a longer delivery time, but not because it is a Renault instead of an Audi. At least not at any reputable parts business in the modern parts of Europe (which I would think Poland is a part of, even though I haven't been there since the 90's).


So, circling back to the original topic.

I believe a person who learned GA could learn any other analytics tool. It is just not worth their time.


That's often the case for certain German cars in the US. E.g., some shops will just not work on modern Minis. Plenty of shops will avoid weird, niche cars.

Any mechanic can fix a Citroen, but is it worth the floor time it'd take to get the parts and figure out the French quirks, versus working on something they know, where they'd make the same money in a third of the time?

Having done shade tree work on various cars, I'd totally turn down any Subaru engine bay work if I was already close to swamped.


Once upon a time I was the proud owner of a third gen RX7, proud that is until the powertrain warranty ended and I had to go outside the dealership network for minor repairs. Basically your options were... go back to the dealership and pay unsubsidized warranty rates (double or triple what independent mechanic charged for "normal" engine work), or go to questionable looking characters running "performance" shops who wanted to side port everything they could get on a lift. And still pay double or triple the normal mechanic rate.


I think you are missing at least one important point. The reason the mechanic can’t work on the other brand of car is not knowledge or skill, but equipment. It costs money to buy the full suite of equipment required to correctly service a particular manufacturer’s vehicles. It often makes much more sense to specialize and make more use of fewer expensive tools than to have tools for everything and have only marginally more business.


Well, that's a very American way of doing things that most places simply don't follow. The norm is for tools to be compatible with all brands, with at best an added option to unlock, or pay-per-use. You can do 99% of the diagnostics with a Bluetooth dongle from Wish and a free Android app if you wish, since by law it is an open standard.

Most specialty equipment costs less than a mechanic can earn in a day. You even order the parts from the same company no matter if the bumper is for a Mazda, VW, or an Alfa. Or a Kawasaki motorcycle for that matter. This lock-in behavior is, luckily, mostly illegal.
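
That open standard is OBD-II. To illustrate how brand-agnostic the tooling is, here's a minimal sketch using the python-OBD library (assuming an ELM327-style dongle is plugged in and paired; the exact response shape is an assumption based on that library's docs):

    import obd  # python-OBD: speaks the standardized OBD-II protocol

    # obd.OBD() auto-detects the serial port an ELM327-style dongle exposes.
    connection = obd.OBD()

    # The same standardized request works on any OBD-II car, whether the
    # badge says Mazda, VW, or Alfa: read the stored trouble codes.
    response = connection.query(obd.commands.GET_DTC)
    for code, description in response.value:
        print(code, description)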


Diagnostics, yes, but try to reset the warning light on a German car after fixing the fault. This can apply even when replacing seemingly purely mechanical parts (I think some suspension parts on BMW/Mercedes). You will need VAG-COM for any VW/Audi and something else proprietary for Mercedes or BMW.


I have only done so on a Seat, with no trouble. Seems to me that if this is true it is against EU law.


There’s actually a far better reason they don’t retrain. There aren’t better tools. Or the tools are basically carbon copies (Google Docs and Sheets).

Show me a product that provides so much more value for my team than Excel that it would be worth a retrain.


My first office software was WordPerfect and Lotus 1-2-3 on DOS. Then Lotus Smartsuite, Microsoft Office, OpenOffice, now LibreOffice, with occasional bits of Google Docs. I never had to be retrained.

Here's the thing, for 90% of use cases, those are all effectively equivalent. You really shouldn't need any retraining whatsoever. A spreadsheet's a spreadsheet, and a word processor is a word processor. You type the text in the box and then hit print or whatever. Nothing new to learn.

Now admittedly, there are some power-user features which are different, which is why I said they're only 90% equivalent. But most people don't use those anyway. Yet they will intensely oppose using a different but 90%-equivalent thing because they haven't spent years being trained on it, even though it's almost exactly the same thing they're already using.

It's just a weird and bizarre mental hangup that seems to be natural to many humans.

If you're in tech, you will see the same thing with programming languages, frameworks, applications, etc. And it's on both sides, not just the users, but also the people hiring them too. "Oh, you've only worked with WordPress, you haven't been trained in Drupal?" "Oh, that's PHP, I only work in Python." "Well we're looking for a Ruby developer, not a C# developer." "That's React, I only know Vue.js"

It's mostly all general-purpose programming languages, libraries, and frameworks. Sure some details are different. There's a bit of a learning curve. But if you are actually capable with one, then picking up another nearly equivalent alternative should not be viewed as some impossibly complex thing that will take years of retraining.


You are both right and wrong. It depends on the intelligence of the person you are dealing with.

There is a thing called "intelligence". There are several definitions of it, the one I'd like to use here is "the ability to infer general principles and common workings from small isolated samples and apply those principles and workings".

So if you are sufficiently intelligent, you can infer, from observing a few door handles (or even just one) being pushed, that this is the general way to open doors. You can then apply this principle to different doors, windows, rotating knobs, etc. The fewer samples you need to learn from and the broader your application range after learning, the more intelligent you are. In the stupidest case, one only learns to open one specific kind of door in one specific way, like a cat might.

You are writing from the point of view of someone sufficiently intelligent to derive the working principles of software and apply them to other software packages that generally serve the same purpose. However, there are people who are not intelligent enough to do that. Those people get by just by following instructions, learning by rote which buttons to click for which purpose. They are the "door-opening cats" of the office application world.

Less intelligent people like those do exist (50% of people have an IQ < 100, after all...), they do get jobs, and they can be successful within limits. Just as Stack Overflow/ChatGPT copy-and-paste programmers get by somehow.

Which is why I'm also a fan of intelligence-test-type job application processes. The ability to learn, for higher-level jobs, is far more important than preexisting knowledge. And intelligence is the best known predictor for the ability to learn.


Mathcad is the only thing that comes to mind. It is very unique and a massive productivity boost for engineering calcs.


> There aren’t better tools. Or the tools are basically carbon copies

LibreOffice, free open source software. Just as good as Excel and... free.


LibreOffice is a perfectly functional spreadsheet for basic spreadsheet tasks. It is, however, not in the same league as Excel for more advanced spreadsheet tasks. Sometimes as a user you just want to be able to use the software that works better, you know?


Yeah, that is the trade-off: if you keep using the proprietary software that you're familiar with, you don't have to relearn, but you'll be more dependent on one company. But if you spend some time up front learning a free software alternative, you won't be dependent on a single company.

But yeah, for some things there are not yet complete free software alternatives, though the gap is really getting small these days. I think for most common use cases for most people, LibreOffice is more than enough. If you need more advanced features that are only available in the proprietary variant, try to find ways to not be dependent on that feature, or find an alternative way to do it with free software. If you are creative, there are usually many ways to do your work using free software.

But yeah, it does take some dedication to this idea. What you get back, though, is probably going to be of more benefit to you in the long run than investing the same time in learning the proprietary software. Unless you really have a special love for the proprietary software company, and you are sure you want to grow more and more into their ecosystem and have no need for any personal freedom over the software you use.


Yeah, but for retraining to be worth it, your people then have to end up better or more productive than they would be with Excel, and it'll also impact every new hire you make, who will need the same retraining. Just as good is not good enough. That the tool is free is rarely a strong enough reason, especially if there are licenses around already.


Home appliance repair is not a specialized job. Maybe you have some sort of memory about the Maytag repair man, but that is just branding, just like DeBeers with their diamond monopoly.

Home appliance repair is knowing the general layout of each major appliance, a bit of specialized knowledge on how to dismantle them without breaking them and which modules are responsible for which function, and finally knowing how to source the replacement parts and which common items are best kept on hand for convenience.

Very few people are doing board-level repair for appliances, especially when the replacement boards are typically very inexpensive for any modern machine (save for main boards and for induction plates in induction stoves, which are typically so expensive that it is smarter to replace the entire appliance rather than repair the broken module).


> Your run-of-the-mill business drone will be trained on Word/Excel/Outlook and be hard to impossible to retrain on anything else (either because of actual stupidity or resistance to change).

Change purely for the sake of change is bad and people are right to resist it.

The vast majority of businesses have no compelling reason to switch off Microsoft Office. Would the world be a better place if there were more feasible options? Maybe. But that's not the concern of either a random business or its employees.


I'm old enough to remember when every young self-taught sysadmin was getting their MCSE or A+ certs.


> That said; we also have "Cloud Engineer" as a job title, so I'm not sure we will learn this lesson.

Comparing something as vast and broadly reaching as "Cloud" to something as specific as a tool like Google Analytics is disingenuous, kind of a weird comparison IMO. The entire direction of tech is towards the "cloud"; even if you refuse to use the big ones like Amazon, Microsoft, or Google, it's still technically "cloud" if it's managed servers (wherever they may be).

To be honest, I never got the hype behind Google Analytics, and I'm glad that I never spent more than 5 minutes at a time dropping the occasional tag on sites I built. (I've also never worked for anyone big enough where the analytics ultimately proved useful or valuable anyway.) These tags are now easy to remove by deleting a few lines of code. I really wonder if the larger orgs should have spent the extra few hours/weeks of development to build an in-house solution all along...


> Comparing something as vast and broadly reaching as "Cloud" to something as specific as a tool like Google Analytics is disingenuous

Is it? Tons of cloud people I know are very narrowly specialized and certified on AWS or Azure. They certainly don't ever apply for jobs using the other...

I'm sure they could retrain. But I'm also sure they don't want to.


At this point, being a "cloud" engineer essentially means that you're good at understanding and adapting to new services / value propositions, since they become available quite regularly. It doesn't mean, "really good at EC2, and incapable of learning more."


If I were reviewing resumes and saw someone list themselves as a "cloud engineer", I would make very sure their skills listed the cloud provider used where I was hiring. That title would make me assume they were a specialist in a cloud until I saw otherwise.


You know that the big clouds (and to a lesser extent the smaller ones) have more in common than they are different?

They use different jargon, they love to market themselves on their differences (because why compete on price?), but the fundamentals are really very similar and the skills transfer.

Anyone who’s telling you that Cloud X is vastly different from Cloud Y is either trying to sell you something, or has gotten their knowledge from someone who was selling them something.


That says more about you than them, though. You’re projecting your own inability to translate knowledge into new systems and assume the same of others.


I'm not projecting my inability, I'm assuming incompetence. I'm a cynic, not insecure.

I wouldn't blanket block a resume that said "cloud engineer", I'd just make sure to probe that they aren't just an "AWS engineer" or "Azure engineer".


That is entirely reasonable.

A title I have frequently worked under is “Cloud Engineer”.

I’m strongest in AWS, secondly Azure. But I also am extensively using Linode’s platform and Kubernetes too.

My best friend’s title has also been “Cloud Engineer” and he is exclusively in AWS. He doesn’t really know more about Azure than he needs to get things connected to AAD.

How anyone could know that without asking eludes me. If you’re hiring for a position, you have a responsibility to know.


I work at AWS Professional Services; people move around among AWS ProServe, Google/GCP, and Microsoft/Azure all the time.

Heck, plenty of people come into ProServe with no AWS experience. But they know their areas of specialty well, and it only takes a couple of months to pick up the AWS-specific services.


> it's still technically "cloud" if it's managed servers (wherever they may be).

Thank you for this. No more second thoughts about pointing cloud.example.com at our local HPE rack.


You can take the most generic course out there and still turn out clueless; you can also learn the most specific tool and come out able to generalize. As far as I can tell, this depends entirely on the individual's ability to be resourceful. In that respect, a GA-specific course might help them get their first job faster, so maybe that's not a bad thing at all.


If you already have the background to understand computation and software, you're: a) not one of the people at risk of making this bad decision or getting stuck there; b) already in possession of much more valuable skills than knowing some software suite.

Getting this background requires a non-trivial amount of time. It's easy to take our ability to generalize across different computer-based tools for granted when you already understand digital computer architecture and know a few programming languages. The vast majority of people do not start from such a broad base of knowledge when choosing a software tool to learn.


> Now her entire value is tied to the use of Google Analytics, she will almost certainly fight very hard to ensure that these skills remain relevant

Or this could be the right time to check out one of the self-hosted alternatives (Matomo, Snowplow, etc.), apply what she learned about Google Analytics, learn how to do the same on those systems, and sell her skills to two different classes of customers: the ones that will keep using Google Analytics, and the ones that will try alternatives, at least to avoid being fined if not out of genuine compliance with the local laws.
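
And the concepts really do carry over. As a minimal sketch (assuming a self-hosted Matomo at a hypothetical matomo.example.com with site ID 1; parameter names per Matomo's documented HTTP tracking API), recording a page view is the same "send a hit with a URL and an action name" model GA uses:

    import requests

    # Hypothetical self-hosted Matomo instance and site ID.
    MATOMO = "https://matomo.example.com/matomo.php"

    # One page view, sent through Matomo's HTTP tracking API.
    requests.get(MATOMO, params={
        "idsite": 1,        # which configured site this hit belongs to
        "rec": 1,           # required flag: actually record the request
        "url": "https://shop.example.com/checkout",
        "action_name": "Checkout",
        "apiv": 1,          # tracking API version
    })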


As someone whose total compensation went up 600% over 5 years as a 'cloud engineer' from a sysadmin background, I think your comment is kinda silly.

Those skills are very transferable to new products and very little of the worth I bring is from my certs


Sysadmin background is akin to first principles when it comes to Cloud.

I also have a sysadmin background and my journey to the cloud has been "I can learn new things that make things easier or just use some pretty standard Linux VMs at any time".

Most new entrants to cloud learn the following:

* an object storage system (GCS, S3)

* A functions as a Service system (Cloud functions, Lambda) -- if you are lucky, Cloud Run; since that also gives you Docker.

* Message systems (Pub/Sub, SQS)

* Maybe an orchestrator (GKE, ECS), but only surface level.

* Some very minor information on how to create and access VMs; but it's clumsy, and since you would also have to learn Linux, this goes unused.

For me, I can always fall back to my foundational knowledge of DNS, Networking, Linux and the systems I used to run, like databases, app services, mail systems, queue systems etc;

People who are trained only on FaaS and SQS do not have foundational knowledge to fall back on. That's not to say they can't get it, but it's not helping them make money, it's usually not taught, and worse: it's not something you ever reach for; people typically learn through failure or by doing.

For me: Cloud just makes my life easier.

But I can also use an iPad as a consumption device; if I was only ever given iPads I would not be able to write C++ or Perl. That's just the nature of exclusively using simplified tools and abstractions.


What you are describing is a "developer doing devops" arrangement, or maybe some junior guy with an associate-level certification. A well paid Cloud Engineer is 100% expected to know architecture, Linux, networking and all the things you mention.


There are engineers specialising in these systems who have been in the industry long enough to be called senior and have never touched anything non-cloud.

Most people who are “devops” with <8 YoE are unlikely to have touched non-cloud systems. Worse still, whether you want to admit it or not: some people are "DevOps" with no prior developer or sysadmin experience. (Since the term "Systems Administrator" is out of vogue but the need for systems administrators has never gone away.)

There are so many bootcamps for this too, and they mainly focus on AWS skills.

Here's a few bootcamps that people might decide to take to break into "DevOps" of which none are assuming prior knowledge, though techworld with Nana does teach a little Linux.

https://www.techworld-with-nana.com/devops-bootcamp

https://clarusway.com/aws-devops/

https://www.udemy.com/course/aws-devops-bootcamp/

https://techproeducation.com/courses/aws-devops-engineering/

https://aws.amazon.com/training/classroom/devops-engineering...


I know where you're coming from but the reality is that a lot of those things are increasingly less relevant. I'm not sure what real advantage I get today from having done racking and cabling of physical servers back in the day. In a managed Kubernetes world, I haven't leveraged my ability to run a massive pool of Linux servers in a long time (I kind of miss that btw). For all intents and purposes you can be a great DevOps / Cloud Engineer / SRE or whatever you want to call it without ever seeing a non-cloud system.


I think we're in agreement which is the entire point of the thread we're in.

You don't need certain skills today; you can use higher-order systems instead. That doesn't mean there's no value in understanding (to use a programmer example) a linked list.

Equally knowing how a queue system works from the OS to bytes on a wire can make a world of difference in some contexts.

You can live in the higher-order world and use the tools that make life simple (Google Analytics, in the case of this thread), but you are jailed into not understanding the systems they are made from, and while you are exposed to some concepts, not everything transfers cleanly. "What is the PostgreSQL equivalent of ML.PREDICT in Google Spanner?!"

To give another contrived example: a huge reason people learn Latin or complete computer science courses is not because they will be speaking Latin or using comp sci concepts directly; it is because it sets a foundation for learning other systems, a sort of proto-field that lets you see the relationships and building blocks on which other systems are built.


This is terrifying.


> will be economically useless if it is banned.

Economically set back, maybe. "Useless" (with its implication of permanence) is way OTT.


How many can afford to spend 6 months retraining, and even worse: how many companies are willing to hire people who have no experience with their specific tool?

I don't think it's permanent, but it does make them economically useless until such a time as they retrain.


You are stretching credibility when you talk about 6 months to retrain from GA to some other analytics tool. 1-2 weeks is a much more realistic timeline, considering standard industry terms and the similarity between tools.


If it was a 1-year bootcamp, then I think saying 6 months is... "fine".

Consider transitioning from Excel to LibreOffice or Google Sheets. On the surface it's the same, but doing advanced things requires considerable time investment and is very uncomfortable.


>Consider transitioning from Excel to LibreOffice or Google Sheets. On the surface it's the same, but doing advanced things requires considerable time investment and is very uncomfortable.

Silly hypothetical. I can't imagine a scenario where a company heavily utilizes advanced Excel, and then decides they want to use Google Sheets instead.

Besides, we're programmers, and we learn new tools all the time. Things are deemed obsolete regularly.


> I can't imagine a scenario where a company heavily utilizes advanced Excel, and then decides they want to use Google Sheets instead.

I can't imagine a company that has built its foundations on AWS migrating off of AWS. Such an endeavour would be more painful than transitioning spreadsheet tools, by at least multiple orders of magnitude on basically every metric you can come up with.

That's also a broad definition of programmer. Most people (even programmers) come in a few categories:

1) People just solving a problem, tinkerers and explorers, people who are not really programmers first but it solves a need to get further work done.

2) People who just want a job that pays; lots of these, bootcamp folks mostly though I don't mean to make it sound negative -- nothing wrong with people that just want a decent paying job.

3) People who learned enough skills as teenagers to be well paid and are coasting or specialising in that area. I know lots of people like this; I believe on some level that even I am like this. Though generally curious, I tend to mainly focus on my area and only expand slightly around it, and slowly. If you swapped out Linux for VAX I would be terribly displeased. See also: systemd

4) People who love to learn about computers and how they work. This is probably the rarest kind of person, and I was this person in my teenage years. The economic viability of a project doesn't matter to this kind of person: the only thing that matters is that they do something. These are the people who make Game Boy Color games in 2023, or who write console emulators or do demoscene work.

The majority of people don't keep learning, they learn their area and improve upon it.

I firmly believe that an AWS Cloud Engineer (or AWS programmer) would strongly prefer to move to another AWS shop.


> If you swapped out Linux for VAX I would be terribly displeased.

Having used both I’d say that the differences between Unix and VMS are much greater than the differences between Excel and Google Sheets.

A better comparison might be between Linux and BSD.


A company that relies on VBA scripts


If Google Analytics were banned, companies would have no choice but to hire and retrain people.


There's a saying: you'll never get rich harvesting in someone else's garden. It's not just Google Analytics; all Google products can be killed off at Google's whim at any time, and it's not just Google. That's why it's so important to work with tools you can own.


What if you don't want to get rich and just want to work enough to be comfortable and then come home and do other things besides tending a garden? In that case pay me to tend your garden so I can ignore it when I am not on the clock.


So are you suggesting that the standard company that uses 39 different SaaS products (https://financesonline.com/saas-software-statistics/) should give up all the products that meet their needs and only use free software?


This is a strawman: I didn't say a word about free software.


Or be ready to move to a new garden.

Off-the-grid homesteaders aren't more profitable than people who engage in the compromises of society.


Wow, jobs for people who only know Google Analytics exist? I thought at minimum SQL was required, and that today's analysts also need to know Python/pandas at a minimum.

Imagine being a 'frontend developer' who can only use squarespace.


> Imagine being a 'frontend developer' who can only use squarespace.

More likely: Imagine being a 'frontend developer' who can only use React.


It's sad because it's a reality for many


Seems horribly limiting regardless of how we slice it.


IDK, I used to work with "Drupal consultants" who barely knew PHP. They delivered the sites they were asked to, usually on time and within budget.


Drupal was the only accessible door into this entire industry for me in 2010, with no CS knowledge or available mentors. I was one of those people, and I barely fed my family, but I eventually learned PHP, JS, the hosting stack, automating the hosting stack, Varnish and HAProxy, Linux sysadmin, how the internet works, and a decade of topics since.

We all need a door into this stuff, a place to be dropped in and start putting it together. Maybe the OP's sister-in-law is totally out of luck, or maybe she now has a few of the hundreds of tools she's going to need to build out a career. Luckily she has a brother-in-law in the industry; hopefully he's the helpful type.


And they could migrate to WordPress if all their customers had to migrate.


There are people who are trained on and mainly know WordPress, so that's no surprise. However, if they can transition to a second product, they will be able to generalize to an n-th as well.


When I got my AWS certification it didn't promise any applicability to Azure or GCP. But I brought skills, and took knowledge away, that would apply. Some of this is just on the practitioner.


> That said; we also have "Cloud Engineer" as a job title, so I'm not sure we will learn this lesson.

As long as you don't depend too much on highly vendor-specific stuff, most of what a "cloud engineer" uses day-to-day is fundamentally the same: EC2/Azure VM/GCE, ECS/Azure Container Apps/Cloud Run, Security Group/Azure Network Security Group/Google Firewall, whatever. Different names, same or very similar stuff.


I can tell you that's not the case…


I’m curious what you’re working with that is significantly different from one CSP to the next. I put most of my effort into AWS, but when I get staffed on a Google Cloud project, there isn’t much that I can’t figure out pretty quickly, especially with IaC managing everything. As long as the CSP has good docs, I find it relatively easy to move from one to the next.

There are certainly some cases where that breaks down, but it’s usually in specialized areas that I’d have to do some upskilling on in my preferred cloud anyway.

The benefit of the cloud is its service and resource (i.e. building block) oriented nature. There’s a level of transparency to cloud-based services that just didn’t really exist before.


> I’m curious what you’re working with that is significantly different from one CSP to the next.

I work here:

https://aws.amazon.com/professional-services/

There is a lot more to any of the cloud providers than just VMs and networking. I have hardly done anything with a raw EC2 instance in 5 years, except for one or two deployment pipelines. AWS alone has 130 services. True, many of them are hosted versions of open source products.

I work with call centers (Connect), Athena (Apache Presto), Step Functions, and I have done some IoT work, and of course Lambda and a lot more. I don't do anything with traditional VMs. My specialty is "application modernization", meaning my work is a combination of DevOps and traditional application development using AWS services.

There are all kinds of specialties within any of the major cloud providers.


> I work with call centers (Connect), Athena (Apache Presto), Step Functions, and I have done some IoT work, and of course Lambda and a lot more.

Well, Lambda has a multitude of competitors (although to my knowledge they compete only on the principle of serverless computing, so you'll still have to rewrite scripts when moving between them), and the same goes for IoT integration.
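
To make that concrete: even for a trivial function, the entry-point contracts differ, so code gets rewritten when you switch providers. A sketch of the same logic against the two Python signatures (handler names are arbitrary):

    # AWS Lambda entry point: takes (event, context).
    def lambda_handler(event, context):
        name = event.get("name", "world")
        return {"statusCode": 200, "body": f"hello {name}"}

    # Google Cloud Functions HTTP entry point: takes a Flask request.
    def gcf_handler(request):
        name = request.args.get("name", "world")
        return f"hello {name}", 200

Same logic, different plumbing; the rewrite is mechanical but unavoidable.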

The rest I'd say is pretty exotic stuff... and thanks for mentioning AWS Connect; that looks like something I'll have a deeper look into. Do I get it correctly that this is something like a combination of JIRA Service Desk/OTRS, some form of SIP telephony service, plus a webchat and AI assistant?


It was originally the call center software that Amazon Retail used, later ported to become an AWS service. For the conversational interface it uses Lex, the AWS version of Alexa. For other integrations you have it call a Lambda.

It’s the standard type of software you use when calling into a call center with a mixture of automated help and operators.

Like I said above, if you know your specialty well, it's not hard to map your expertise to AWS services. It took me two years to go from never having opened the AWS console (but having literally decades of software development/architecture experience) to working at AWS in consulting. I worked at a 60-person startup before.

I’m more challenging the notion that all any of the cloud providers offer is a bunch of VMs and the surrounding networking infrastructure.


> I’m more challenging the notion that all any of the cloud providers offer is a bunch of VMs and the surrounding networking infrastructure.

Granted, I'm biased because I work at a development-focused shop, so my experience is the development/infra side of AWS and Azure, plus a healthy load of legacy on-prem servers (I keep my fingers off GCP though; I've heard too many horror stories). We follow KISS, so just from a quick grep through our Terraform files it's almost all EC2, S3, CloudFront, ELB, ACM, RDS, EFS, Beanstalk, ECS and EKS, plus CloudWatch for logging/monitoring, all well wrapped in modules. That's stuff one can find pretty much everywhere, especially as most of our workloads are shifting to EKS.

The things you use are IMHO more targeted for specialist use cases, and I can clearly see the value-add... I'd pay good money to never have to see JIRA again in my life.


For practical things, it's hard to teach in the abstract - you have to do - and doing means choosing some sort of toolchain; the most popular one is the usual choice.

As an example, imagine learning programming without any real-world practice. And if you do some real-world practice, you have to choose which tools to use.

I take your broader point - but I think it's inevitable that most courses of this type are based around a particular tool chain.

Most Data Science courses use Python for example.


Choosing a tool doesn't mean locking yourself into it forever and refusing to ever use anything else though.

Well before I even started my first development job, I had used BASIC, Turbo Pascal, Assembler, C, Euphoria, Java, C++, Javascript, and Visual Basic.

My first dev job didn't use any of those, however. I had to ramp up on PHP, SQL, Perl, Python, and a little Ruby. Took two weeks to become productive, albeit not a master.

Over the years since then, I've used a wide variety of other tools (languages, frameworks, compilers, editors and IDEs, etc.) I can't imagine where I would be if I still insisted on using BASIC and writing code like

10 PRINT "Hello!"

20 GOTO 10

In the end, they're all just tools to do the job. You don't refuse to use a screwdriver just because you learned to use a hammer first.


I see your point to a certain... eh, point. During my studies we also had our ERP course accompanied by some specific tasks in one tool (don't ask me for the name now, it's been some time). BUT it was accompaniment only; we usually had the concepts presented beforehand. If you want to go a step further, show an alternative from time to time.

But I don't agree with the Python comparison. Python is only a language, and even NumPy/Pandas still need you to know the concepts, and the knowledge attained using them is definitely transferable.


It comes down to the course - a course that just teaches a cargo-cult-like set of steps is a bad course; one that involves a proper discussion of the fundamentals is a good one.

All I'm saying is that the fact that a course uses a particular toolchain isn't the determining factor in whether a course is good or not.


>It comes down to the course - a course that just teaches a cargo-cult-like set of steps is a bad course; one that involves a proper discussion of the fundamentals is a good one.

totally agree here

>All I'm saying is that the fact that a course uses a particular toolchain isn't the determining factor in whether a course is good or not.

I totally agree! My comment was related to the comment that claimed the whole 6-month training is worthless if GA gets blocked.

I guess we aren't that far apart :)


Naively I would assume that the purpose of a data analyst is about presenting the relevant information to a company. How that data is collected, stored, processed and indexed is the role of system administrators, web developers and database designers.

It seems very inappropriate to allow a data collecting tool to dictate what information is relevant for a specific company.


Once upon a time I might have agreed with this take, but having years' worth of battle scars on me now: the interface that any company's data presents about its operations is entirely coupled to the implementation of how it's collected and to the assumptions with which the system generating it was built. This is a good thing, because there is very little new under the sun, especially in business.

However, most young businesses will waste tons of time and money reinventing the wheel of these systems and trying to customize them to their business’ unique needs, but the much more effective path is to really (re)think through your business process and figure out how to align it with the grain of the tool instead. This option is only obvious to those with experience in failing to execute on the former option, unfortunately.

To your point, an effective analyst doesn’t just present data, they have to understand the entire world around that data - tooling, people, processes.


Not really important to the rest of your comment, BUT: sister-in-law means they have to be a "sister in the law"; a girlfriend has no legal basis in fact.

All that said, there are other analytics systems out there mixpanel, amplitude, roll your own, etc. they might not be quite as full-featured but 95% if the value comes from a few features everyone has


> Personally I find this astonishingly foolish of the people who train exclusively on these tools instead of first principles and primitives.

The argument for Universities right here.

I always find the opinion that Universities should make students job ready to be naive, even if well intentioned. There's a place for certifications that focus on job readiness, and there needs to remain a place that focuses on first principles and primitives.

I went to University about a decade into my career as a programmer to fill in the pot holes and absorb the first principles and primitives. I advocate that route every time I can. It's great if people can get a certification and start working with GA right away, and they have a place to level up their career with the money they make if they want to.


Many people depend on YouTube or Instagram to make money. If YouTube or Instagram bans them - and it happens - they lose the ability to make money.


Then they will learn an important lesson.


Don't specialize in something that you are good at?


What is the lesson here?


The business lesson is if you are entirely dependent on somebody else you are not in a good position. Especially if your interests aren't aligned.

When people are banned, it's the platform deciding the reputational risk of association isn't worth the money you bring in. Given reputation impacts can be huge, it's hard to see how you'd ever be on the right side of that equation.

Even worse, the platform may decide it's not even economic to make sure each banning is fair....

Ultimately I suspect the only way to redress the balance of power is to use collective power.

So: having a large number of friends on the platform who will campaign on your behalf, taking out insurance (another pooled method), or even having formal unions.

In essence that puts some of the economic cost of getting the decisions right onto the platform users (as the friends/union do the work and make the case). Pooled insurance has a similar economic basis (platform users bear the cost of the insurance).


Don't put all your eggs in one basket. We're taught that for everything from investing to dating.


This gets far more difficult as one competitor in an industry nears monopoly status.

Let's say for example that you somehow make $120,000 a year over expenses with Instagram (don't ask how, it's just an example). This is double what you made in your last job. The problem is it takes nearly 100% of your working time to make this income on that single platform. Any less effort and your income drops significantly. Now you are in a trap where you cannot split your efforts between platforms; you have to go all in on one.

Your solution would be to make far less money... um, safely? Whereas a far more realistic solution would be to ensure that you don't live too close to the edge of your means and put a third of your income back into savings against the day the platform fails or kicks you off.


You could define all your eggs in one basket a different way.

So you could think about not operating as an individual who can be picked off, but operating in a collective way - either through friends, insurance, or unions.

i.e. what's your support network if you are dropped through no fault of your own?


If you save half your income, then for every year you spend on Instagram, you can spend another year figuring out what to do after Instagram collapses.


Own your customer relationship.


> However now we're in a situation where at least a few thousand people depend on this precise tool existing, and will be economically useless if it is banned.

If a 6-12 month training has no skill transferable to another analytics tool, I strongly suspect the training was useless to begin with. Other analytics tools are not so dramatically different from GA that you'd lose all methodology on what to monitor, how to conduct a study, etc.

To make an analogy, you don't suddenly become useless if you move from Java to C#.


Java to C# is pretty close though to be fair.

Going from Haskell or Scheme to Rust or even Python is going to take some time before you're completely comfortable with all the built-ins, the standard libraries, the "pythonic" or "rustic" way of writing, the tools and so on.

It's a lot of hidden things. You're not completely useless, of course, but it's not like you'll write "production quality" code and have the ability to work completely independently or be an SME (like you probably were) within a month or even two. It's a lot of little work to get back to where you were professionally.

Because it's not just a training course that is lost, it's all the incidental knowledge that was picked up on the job too.


I chose the languages used in the comparison with that in mind. Moving from GA to e.g. Matomo or Plausible does not require you to completely reshape the way you think about your problems; you don't have to change the way you work, you just have to learn how your new tool implements it.

(Also, Haskell to Rust is pretty straightforward: the type-system knowledge you learn in Haskell usually means the harder parts of learning Rust are made easy. Having done that transition, 1 month is reasonable to be productive in Rust.)


IT is full of tools that have so many undiscoverable but (arcanely) documented problems that there's an entire market for people who spent years studying them.

The side effect is that for each of those tools, there's an army of people who spent years studying them and will push them at every opportunity they get.

And interestingly, those tools keep being pushed at places even when perfectly fine alternatives exist that won't give you almost any of those problems and don't require any specialization.


What you wrote is mostly true but also partially incorrect: many competencies are transferable.

There are privacy-compliant products in that sense; unless you've been literally told "click here and click there", you should be able to employ old concepts with new tools.


Given GP's paragraph:

> Personally I find this astonishingly foolish of the people who train exclusively on these tools instead of first principles and primitives.

I wouldn't be surprised if this were a "click here and click there" kind of training. I've seen a painful amount of those. And then, when the inevitable "new and improved UI" comes along, these people are lost and require a new training.


These are essentially "IT Factory Workers". To be honest, I think there's value in that, with the same economics of "traditional" factory workers.


You'd be surprised how many competent-looking people cannot connect the dots between technologies / tools. They excel at one tool and have a very hard time migrating to a new one, since they cannot connect the concepts and similarities between the two.

Not all, but there are many.


There was a time when "Six Sigma" was all the statistical-analytics rage for everything and anything corporate. However, if one already knew statistics they'd encounter a franchise-branded school of terminology, different formulas than the accepted standards, and their own separate "z-tables" that apparently baked in corrections for their use of non-standard formulas. It was a hugely successful re-branding of standard statistics, with a branded hierarchy of made-up human ranks and Scientology-level made-up technical terms. All the "Six Sigma Black Belts" - an actual title in corporate pointy-head world - are 1000% useless now, unless they are at some dinosaur still following that nonsense.

This is how corporations and the will to profit undermine first-principles knowledge and leave a wake of fake education that ultimately needs to be unlearned, or is unwisely held onto as a fragment of something useful adrift in an island of illogical nonsense.


> Personally I find this astonishingly foolish of the people who train exclusively on these tools instead of first principles and primitives.

Even if we trained them on first principles and primitives, recruiters wouldn't view it that way. That's even true in the software dev world. If you don't have 3 years of experience in Java, then it doesn't matter that you're a software engineer with 5 years of experience across all kinds of languages.


> That said; we also have "Cloud Engineer" as a job title, so I'm not sure we will learn this lesson.

I don't bill myself as such, but I am basically a "Cloud Engineer". My expertise is in no way dependent on a particular cloud, and I've done work in GCP, AWS, Azure, Rackspace, and even a private cloud or two. A VM is a VM, Postgres is still (more or less) PG regardless of who is hosting it. Sure, there are specifics, but even cloud-specific stuff really doesn't differ too much, and it's pretty easy to find the common patterns among them. I can assign a role to a VM in AWS, an IAM service account to a VM in GCP, and an "identity" in Azure. All 3 then permit the VM to make API calls to the respective cloud. All 3 fetch their access token in basically the same (but incompatible, of course) way: an HTTP request to a magic link-local IP.
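To make "basically the same, but incompatible" concrete, here's a sketch of those three fetches (endpoints and headers from memory, so double-check them against each provider's docs; "my-role" is a placeholder):

    import requests

    # AWS (IMDSv1): instance-role credentials, no special header needed
    aws = requests.get("http://169.254.169.254/latest/meta-data/"
                       "iam/security-credentials/my-role").json()

    # GCP: service-account token, gated by a mandatory Metadata-Flavor header
    gcp = requests.get("http://metadata.google.internal/computeMetadata/v1/"
                       "instance/service-accounts/default/token",
                       headers={"Metadata-Flavor": "Google"}).json()

    # Azure: managed-identity token; different header, resource as a query param
    azure = requests.get("http://169.254.169.254/metadata/identity/oauth2/token",
                         params={"api-version": "2018-02-01",
                                 "resource": "https://management.azure.com/"},
                         headers={"Metadata": "true"}).json()

Same shape, three mutually unintelligible dialects.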

A lot of the concerns I deal with, such as "can we survive an outage? what types?" depend on concepts like failure domains that apply equally to a cloud or to a datacenter.

But at some point, I had to dip my toe into a new cloud. I started a new job, and they used this thing called "Azure", and at that point, I'd never heard of it before. But you approach it with an open mind and the right balance of "some of my old knowledge might be relevant, but this new thing might also work differently and I should be prepared to build a separate mental model around it if the old knowledge is leading me astray."

… and I'd expect the same from someone doing "data analytics"; I'd expect something like "math is math, how I collect the data might change, what APIs I use to process it might change but the math is the same."


>> Google Analytics was sold as a solution to you making your own analytics, because that's hard

It's not hard at all, it's just that we have become too lazy and mentally dependent on big tech companies!

If all you want is user tracking, a few lines of JavaScript on the frontend is all you need. A popular WordPress plugin named Jetpack gives you almost all the data needed for site analytics, for example.

There are other tools too, like Tableau, and Python-based tools like pandas and NumPy, which help you with all kinds of analysis.

Humble techies are everywhere with their tools, you just have to trust them a little bit, that's all! It's almost like trusting your Uncle Joe's pizza dude next door instead of the familiar Domino's or McDonald's. It takes a while but you'll eventually discover there's no difference.


Some business area pivots could be into user analytics or marketing analytics. Product management even.

Data engineering could be another path to explore.

I'd view your sister-in-law's certification course as more of a first step than an end. It could open doors, but she'll still have to stay relevant with broader skills.


> Personally I find this astonishingly foolish of the people who train exclusively on these tools instead of first principles and primitives.

Completely agree. So it is their personal decision. It has been forever.

People were getting Macromedia Dreamweaver certifications instead of learning web development, and so on.


> Personally I find this astonishingly foolish of the people who train exclusively on these tools instead of first principles and primitives.

That's an elitist perspective that doesn't include the average worker making a living by knowing their tools and not much else.


Please tell me more about what you think is elitist here.

My personal situation is about as far from elite as possible, and even I know that first principles are important in a field that is shifting -- which happens to be most fields; tech is just a bit faster at churning.


That's just as true as it used to be with proprietary statistics tools such as SPSS, Minitab, SAS, STATA, JMP, etc. They used to own the market, pre-cloud; and all the university courses and commercial trainings. Eventually, people migrated off those platforms in favor of the current cloud-vendor ones (or else migrated to code in R or Python, or even MATLAB).

> However now we're in a situation where at least a few thousand people depend on this precise tool existing, and will be economically useless if it is banned.

Not really, there is no Google Analytics-industrial complex yet, but yes, apparently they have quite a lot of lock-in on nonprofits. I see this as a privacy regulation story driven by the EU and GDPR. They will order Google Analytics to fix violations, and then Google will roll out another version of GA. Customers who want to take a stronger stance on privacy will migrate off GA.

I doubt there is anyone whose entire livelihood depends on Google Analytics (I don't think your relative's "entire value" does, for example), and even if there were, they could reskill in the medium term. But anyway, you could make the same comments about certification, lock-in and perverse incentives regarding AWS, or plenty of other companies in previous decades.


This is why I suspect most cloud certifications are garbage. Most often they are just teaching you their product offerings. Even the implementation details aren't that useful because they change so often.

Compared to something like a Cisco networking certification: The CCNA will cover practical use of their products, sure, but they're also going to teach you subnetting, both standard and Cisco proprietary routing protocols and how they work, in theory, as well as how to employ them in practice. I've mostly moved on from using Cisco products day to day, but all of the understanding was directly translatable to any other platform I've worked with.


> Now her entire value is tied to the use of Google Analytics, she will almost certainly fight very hard to ensure that these skills remain relevant

On one level it's an important observation, on another it's mundane: DBAs will fight one another over Oracle vs DB2 vs SQL Server. Traditional bare-metal DBAs will fight RDS. C programmers are upset by Rust. People who invested a lot of effort in shell scripting for SysV init dislike systemd or s6.


This is almost like U.S. high schools, which almost exclusively require students to have Texas Instruments TI-84 series calculators in math class.


I had a TI-85, which was just a little different than the required calculator and didn't do everything quite like the book described. That was good. It meant that I had to learn to program it to do the things it didn't do like the 5 number summary and something else. Which meant I actually learned that stuff pretty well.


With a student:teacher ratio around 30:1 I'm not sure what the alternative would be


Probably not totally wasted. If you learn the principles of analytics (what KPIs to measure, why, how to diagnose based on analytics data, etc.), you can hopefully transition to another analytics platform such as Matomo. Kind of like how it isn't wasted time to learn a programming language even if you later have to switch languages.


How many bootcamp "developers" tooled up in one framework and struggle outside of that?


> we also have "Cloud Engineer" as a job title

Not quite the same; the core concepts and skills of a Cloud Engineer should be easily transferable between providers and even to on-prem infra.


> where at least a few thousand people depend on this tool

Just taking Google employees alone, a few *tens* of thousands of people depend on this tool. Millions of non-Googlers depend on the tool.


> she will almost certainly fight very hard to ensure that these skills remain relevant

Skills are very seldom tied to a specific product these days, so she will be good.


Sounds a lot like the tech programs when I grew up: MS-exclusive, with active efforts to undermine non-MS products....


No snark intended--sounds like it may not have been a well designed course. Hopefully it was at least explicit about being GA-based.


I'm still waiting for the Scrum fad to pass over and the entire scrum-master guild to be without a job.


A bit like Instagram etc - there has to be a new fad for those people to move to for that to happen.

So sadly these things don't tend to go away, they just evolve.


Occupational lock-in was a theme explored repeatedly and with nuance in The Sopranos.


It is nothing new that using Google Analytics is in violation of the European GDPR; I guess this was covered in the course. So why would she learn a technology that is mostly illegal to use?


Maybe because most companies use it. Also, it is quite new for it to be considered illegal, GDPR only came into effect a few years ago. It would only seem "nothing new" if you're extraordinarily young and inexperienced.


Or if you work in web development in Europe. Moving away from Google Analytics and Google Fonts was a huge deal in the last 3-5 years. Companies kept checking intensively whether they still used them somewhere, and it was a lot of work.


One could still use Google Analytics by proxying the tracking events. Afaik only the IP is considered private data. So one could mask or (non-reversibly) hash the IP, remove anything else which might be considered private data and then send the event to Google. A simple PHP script with a few lines of code could do that.
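As a sketch of that proxy idea (in Python rather than PHP; "/collect" and the `uip`/`cid` parameters are from the old UA Measurement Protocol, and the stripping logic is illustrative only - note that the decisions discussed elsewhere in this thread found even setups like this insufficient):

    import hashlib

    import requests
    from flask import Flask, request

    app = Flask(__name__)
    DAILY_SALT = "rotate-me-every-24h"  # see the salting discussion below

    @app.route("/collect")
    def proxy():
        params = request.args.to_dict()
        params.pop("uip", None)  # never forward an explicit IP override
        # pseudonymize the client id so Google never sees the raw value
        cid = params.get("cid", "")
        params["cid"] = hashlib.sha256((cid + DAILY_SALT).encode()).hexdigest()
        # the outbound request originates from the server, not the visitor
        requests.get("https://www.google-analytics.com/collect", params=params)
        return "", 204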

But Google lost me by:

A) Making it impossible to convert your old data into the new Analytics version

B) Abandoning the API which allowed you to code your own reports. Over the years, I wrote a ton of code that talks to the API. This is all worthless now.

I recently switched to self-hosted Matomo. At first I did not think much about it, but now after I got used to it, I have to say it is much better than GA. The interface is so much nicer and snappier. And more logical.

Apart from that, I like that it is open source. If there ever is a point in the road where the makers of Matomo decide on a non-compatible fork, I'm sure the community will write a converter that converts the old data into the new format.

And after using it for a while, it hit me: You can write your own reporting tools by just querying the MariaDB database! Using SQL is so much better than it was to fight the insanely complex and unintuitive Google Analytics API.
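For example, a visits-per-day report is one query away (a sketch; the table and column names are from memory of Matomo's schema, so verify them against your install):

    import mysql.connector  # pip install mysql-connector-python

    conn = mysql.connector.connect(user="matomo", password="...",
                                   host="localhost", database="matomo")
    cur = conn.cursor()
    cur.execute("""
        SELECT DATE(visit_first_action_time) AS day, COUNT(*) AS visits
        FROM matomo_log_visit
        WHERE idsite = 1
        GROUP BY day
        ORDER BY day
    """)
    for day, visits in cur:
        print(day, visits)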

If I really wanted to still use Google Analytics, I would just write a converter, which pumps all the Matomo events into Google Analytics. That would be a GDPR-compliant way to use Google's tools. But I don't. I'm done with Google Analytics forever. Matomo is the promised land for me.


Hashing the IP is not enough according to IMY's decisions; none of the companies are allowed to use GA going forward.

CDON used GA's IP anonymization through truncation, and it was not deemed enough. [1] The IP itself is not personal data after truncation, but it's unclear whether the truncation happens before it leaves the country. And combined with the other personal data (e.g. cookies), it is still considered personal data. [2] (A sketch of what that truncation does follows the references below.)

Coop proxied all calls to GA and used the same generic IP address for all users. [3] They didn't get a fine but have to stop using GA.

[1] "1.3.15 Effektiviteten hos vidtagna skyddsåtgärder av Google och CDON" https://www.imy.se/globalassets/dokument/beslut/2023/beslut-...

[2] "2.2.2 Integritetsskyddsmyndighetens bedömning" https://www.imy.se/globalassets/dokument/beslut/2023/beslut-...

[3] "1.3.14.2 Coops implementering av server side container" https://www.imy.se/globalassets/dokument/beslut/2023/beslut-...


Automated translation of 1.3.14.2:

1.3.14.2 Coop's implementation of the server-side container

The purpose of the server-side container that Coop has implemented is to improve the security of the data being sent: more specifically, to protect the personal privacy of the data subjects in a good and secure way. The server-side container acts as a proxy between the data subject's browser and the Tool, and Coop has implemented it in such a way that the data subject's public IP address is never transmitted to the Tool.

The implementation can be described as follows. A data subject visits www.coop.se in their browser. The Google Analytics script is downloaded from the server-side container instead of directly from Google Analytics' servers. As a result, the data subject's IP address, together with information about user behavior, device information, customer status, online identifiers and transaction data (per points 1–5 above under section 1.3.10), is transferred to the server-side container instead of directly to Google Analytics. Once the script has been downloaded, a new call is made from the server-side container to Google Analytics' servers. Since that call is made from the server-side container, the data subject's public IP address is never transferred to Google Analytics. Coop has configured the container so that all of the data above, except the data subject's public IP address, passes through to Google Analytics, where it is compiled into reports by the measurement set up on www.coop.se.

The processing described (receiving, converting and forwarding the call) takes place in the working memory of the server-side container. All processing happens in real time and no data is stored permanently; the data subjects' public IP addresses are neither stored in the server-side container nor exposed to Google Analytics' servers. All communication from the browser, via the server-side container, to the Tool is also encrypted.

The process cannot be reversed, since the information is not stored and the conversion is not based on a one-to-one relationship that would enable a "key" to recreate the public IP addresses. Coop has also activated Google's IP anonymization feature, which truncates the IP address sent to the Tool: Google removes part of the IP address before it is stored to disk. For an IPv4 address the last octet is replaced with a zero, and for an IPv6 address the last 80 bits are replaced with zeros. This cannot be reversed either, but since it is performed by Google, Coop chose to implement the server-side container as well. In Coop's case the IP anonymization feature is applied to the generic IP address sent via the server-side container, making it redundant in context, given that the container already prevents the data subjects' public IP addresses from being sent to the Tool. Coop's assessment is that the server-side container is a sufficient protective measure on its own, but that it does no harm to also have IP anonymization activated in the Tool.

Do you happen to know which section of the ruling it is where they discuss why Coop needs to stop doing this? It's a PDF and the translation tool I'm using on my phone is a pain.


In section 2.2.2 they expand on their reasoning. The claim is basically that unique identifiers stored in cookies ("_gads", "_ga" and "_gid") (they also mention more unique identifiers in the same context in section 2.4.2.3.2), together with information about the page that was visited, the visitor's browser fingerprint and the generic IP address, can be used to identify individual users.


You can't hash an IPv4 address; it's trivial to brute-force all possibilities given the limited problem space.


How about this:

Server-wide salt. Randomly generated every 24h or server reboot (whichever is sooner).

The salt is not saved alongside the hashed IP, it is not saved anywhere whatsoever. There is no log of previous salts.

You can still track a user session across multiple page calls, but the hash can not track them across different sites.
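A minimal sketch of that scheme (hypothetical names; HMAC-SHA-256 standing in for the hash):

    import hashlib
    import hmac
    import os
    import time

    SALT_TTL = 24 * 60 * 60  # rotate at least daily
    _salt, _born = os.urandom(32), time.monotonic()

    def pseudonymize(ip: str) -> str:
        # keyed hash under the current salt; the salt lives only in memory
        # and is never logged, so old hashes become unlinkable after rotation
        global _salt, _born
        if time.monotonic() - _born > SALT_TTL:
            _salt, _born = os.urandom(32), time.monotonic()
        return hmac.new(_salt, ip.encode(), hashlib.sha256).hexdigest()

Within a rotation window the same visitor hashes to the same value, so sessions still line up; after rotation, nothing links the old hashes back.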


I have not researched it but I wonder if you can even hash IPv6 addresses. The issue I see is that allocations could be too regular so even if the full space is huge most addresses may occupy a small and predictable part of it.


Depends how you define "IP" in this sense. Brute-forcing each full individual IPv6 address is unlikely to work, as most consumer devices get a somewhat random interface address on their network.

Now, with IPv6, the first 64 bits are generally enough to define the edge network device that would be covered by a single IPv4 address these days.


> the first 64 bits are generally enough to define the edge network

One complication is that hashing removes this structure. If you use any good algorithm, you will need to test the entire address to recover any part of it.

I am very wary of IPv6 addresses being so heavily biased into 00 or ff segments that the address space doesn't actually add much entropy. So, I'd go with no, it's not safe to hash them. But if you get some random ones, I am really not sure.


That's why you use a salt, which is what I assume is meant by "non-reversibly".


Salts provide resistance against bulk bruteforce by making it so that you can't identify which hashes are the same plaintext without actually computing all of them. The issue is still that there are not that many IPv4 addresses and so even with very heavy algorithms it would be trivial to break.


But there's an infinite number of possible salts. Please explain how this could be brute-forced as long as the salt is used correctly? What am I missing?


The salt is stored in plaintext alongside the hash and simply concatenated. If your scheme, for example, is h := H(ip, salt) := sha256(ip :: salt), bruteforcing is simply a matter of trying all values of ip:

    import hashlib

    for ip in range(1 << 32):  # the entire IPv4 space
        if hashlib.sha256(ip.to_bytes(4, "big") + salt).digest() == h:
            print(ip)
            break


The salt doesn't have to be plaintext; that's an implementation detail of the common password-hashing algorithms, for obvious reasons. The requirement was that the hash should be non-reversible. Store the salt in an (http-only) session cookie and concatenate it to the IP before the hashing rounds. Put your entropy in the salt and any brute-force attempt is theoretical. For every session you need to compute the exact combination of IP + salt (which isn't even known to the server).


At that point, what value does the salted+hashed IP address give you over a randomly generated number (say, a UUID) per session?


None; I'm not arguing for the solution. Proxying through a PHP script just to keep using Google Analytics is overkill when private self-hosted solutions exist. I'm simply showing how you can anonymize IPs even from yourself; if the goal is to anonymize only from Google and not from the server, it could be useful across sessions.

The solution being overkill does not mean my first comment ('That's why you use a salt, which is what I assume is meant by "non-reversibly"') is wrong.


Could you be more precise about what you envision by "correctly"? If you use a shared salt for all or a significant portion of rows, then it's a realistic matter of brute-forcing to cover your entire table, given that the attacker has been able to recover the salt.

Even without any cracking at all, shared salt would mean that rows can be correlated if the attacker can identify a single row and correlate that to the target.

Let's say you up the game and use per-row salts, like here: https://stackoverflow.com/questions/4159827/another-question...

Given an attacker wanting to pull out information about a specific user, and they have on their hands your salted dataset, the salts, and a handful of IP addresses that the target is known to be associated with from other datasets, it's still trivial to brute-force.

Even increasing it to a set of a few thousand IP addresses (say, a handful of /24s), it should be perfectly realistic, assuming you don't use so many rounds that your infrastructure spends a majority of its CPU time just performing pseudonymizing hashing.

Oh, and if you use per-row salts, is any of that data still usable in the first place?

The above is beside the point that the IPv4 address space is small enough to exhaust, and shows why this is an issue for IPv6 addresses as well.


If it's shared, it's not a salt.

The term of art in that case is 'pepper'.


That's backwards.

> In cryptography, a pepper is a secret added to an input such as a password during hashing with a cryptographic hash function. This value differs from a salt in that it is not stored alongside a password hash, but rather the pepper is kept separate in some other medium, such as a Hardware Security Module.

https://en.wikipedia.org/wiki/Pepper_(cryptography)


Apparently we understand words differently.

A salt is a unique random value added to each value when hashing. Obviously, such a salt needs to be stored alongside each value because it’s random and unique. If you don’t store it how could you possibly ever figure out what it is?

A pepper is a shared value added to lots of different things before they are hashed. It isn't unique to each value, so it doesn't need to be stored alongside them, although it still needs to be stored somewhere. But hey, you could also store it alongside every value you store. This is a thing sometimes done when you do "per-user" pepper, where for example the pepper is itself a username or something - so your pepper is also in the database alongside the passwords - but that doesn't make it a salt.

If you are using the same piece of data to season multiple things before hashing, you are using ‘pepper’ not ‘salt’.


I think you may be alone in understanding the words "pepper" and "salt" this way.


Try reading the rest of the Wikipedia article beyond the first paragraph. Particularly the section on ‘shared-secret pepper’ and ‘unique pepper per user’.


Isn't it the other way around? Most salted hash implementations share the salt by embedding it in the hash. It's pepper that isn't shared.

After rereading, I suspect by 'shared' you meant 'non unique' rather than 'public'?


Why would ‘shared’ mean ‘public’? And in most password hashing contexts the salt and hash are not ‘public’ either.


First time I hear this interpretation. A shared salt is still a (very poor) salt. A pepper would not be stored alongside the ciphertexts, if at all.


No matter what strategy you use to hash IPs, if you can correlate to an IP you can find the original IP by just trying all options. It doesn’t matter what you do because 4B unique possibilities is just too low to prevent brute forcing while maintaining utility.

If you use a random salt, then you need to store it or else the stored value has no utility. However you implement retrieval of that salt it can just be brute forced.


The salt could be stored as a cookie and you can follow the session but never be able to reverse the hash yourself. Any match you get in the brute-force attempt might as well be a collision.

The entropy can be in the salt, you're all making it sound way too easy. The requirement is "non-reversible". Given infinite time everything can be brute-forced, but this is the mossad/not-mossad problem.

It's good enough for storing passwords, where the salt is plain-text.


That’s just unique identifiers with extra steps.


You need to protect it from yourself.

If you know the salt then you can trivially brute-force it yourself and now you are not GDPR-compliant.

If you don't know the salt then you'll have to use a new salt for each IP and then all hashed IPs will be unique and you have no way of correlating them so it is all completely worthless.


But you could use a salt, right?


Only if the salt is kept secret. There also needs to be a different salt value per ip, obviously. But given those conditions, it works.

Of course, it would be just as simple to use the salt as-is, in that case, since you have to look it up anyway.


A salt will render rainbow tables useless, but AIUI will not prevent brute-forcing after the fact. IPv4 has ~4 billion addresses, which would be too expensive for data analytics but could be brute-forced if someone really wants this one piece of data in particular.
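For scale, a rough back-of-the-envelope (assuming ~10 million SHA-256 hashes per second on one core, a deliberately modest figure):

    minutes = (2 ** 32) / 10_000_000 / 60  # ≈ 7.2 minutes to sweep all of IPv4

Minutes per salt for a targeted lookup; only sweeping an entire analytics table, salt by salt, gets expensive.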


A, especially, is atrocious and caused multiple small companies I know to ditch GA entirely. They had years of GA data that is impossible to export without BigQuery shenanigans.

It is impossible to believe a team as well-funded as Analytics could/would not find a way to automatically migrate gtag.js to GA4. They should have made the upgrade transparent by simply converting requests behind the scenes so legacy analytics properties could exist perpetually, while forcing new property registrations to use GA4.


One of the sites (coop.se) in this decision did use a server-side GTM container to mask the IP before it was sent to Google. They were still told to stop using GA, though they weren't fined. The DPA said that the _gads, _ga, and _gid cookies were enough to be identifiable. I don't follow the logic there, but that rules out using a proxy for compliance (at least done as Coop did it).


My company uses Matomo too, and wanted to use it to track how users use our web app. But Custom Actions are deprecated and Custom Dimensions are not made for that. Do you use any analytics tool that is GDPR-compliant to track what users are doing?

I am looking for something where I can track which buttons were pressed and how many times, which parts of the app are the most used and underused, and build funnels of the happy paths, things like that. Not to know who the user is, but app-based: how the user uses the app. Before, I used Google Tag Manager for that, but it's not GDPR-compliant so I can't use it.


Depending on your knowledge of the browser, it's not extremely hard to roll out your own barebones analytics.

If you're only interested in session data without collecting any cross-session knowledge, all you have to do, basically, is tag your page and listen to DOM events.


Last time we had a privacy officer make a report on our setup, we were (unknowingly) sending much more personal data to Google than the IP. The fact that you e.g. clicked the "like button" on profiles/xxx leaks that you have access to xxx, can like it, and have liked it. There were many more like this. Ours was a business tool.

The data we were leaking was e.g. the fact that Foo was an employee at ACME, simply because we sent events occurring on the estate of ACME for user Foo.

It's not as straightforward as proxying. Or, as we did, removing some bits from the IP.


This case is about the old Analytics that was replaced by Google Analytics 4 in 2020. So they must stop using a version that Google definitively killed on July 1st this year.

There are arguments that GA4 would fail the same requirements. Denmark holds that view, but it hasn't been tried. Their argument is that an EU citizen who goes to Asia and visits a site there will have their information sent to US servers and not EU servers. I find this argument objectively absurd considering how the internet works, but it's possible that's how the law works. We won't know before it has been tried, though, and I'd be sceptical of anyone claiming to know the result.


> Their argument is that a EU-citizen that goes to Asia and visits a site there, will have his information sent to US servers and not EU servers.

Agreed, this is absolutely ridiculous. And I say this as an EU resident! I’d rather NOT have websites start checking residency/citizenship to decide my data ownership.


They already do.


I’ve literally never had to provide such proof, and I would say I use the internet quite a bit. Any example websites?


No need for travels - all you need is to use a VPN/proxy/tor. IP geolocation is not a reliable proxy for physical location (let alone residency - while citizenship doesn't enter the equation at all). I don't find it problematic that the law recognizes this.


AFAIK it's often accepted that the law does accept IP geolocation as a reasonable effort to detect EU residents. GP's point here is that Denmark does not hold that view.

On the one hand, I kinda agree with DK; on the other hand, that would bring on the fears of US-liberal HNers who'd lose all their freedom to sell user data.


Yet IP address is the "personal data" they are claiming falls under the law. It's somehow both personal and useless at identification.


Not so strange at all. The IP address _can_ be used to identify you (together with other data-points), which brings it into PII territory. That it can not be used to reliably determine your physical location is besides the point.

Consider an online store shipping physical goods. It asks you for a shipping address. This shipping address is PII and must be treated as such. The facts that you may reside elsewhere and that multiple other people may be residing at that address are both irrelevant to the GDPR.

https://www.fieldfisher.com/en/services/privacy-security-and...

https://commission.europa.eu/law/law-topic/data-protection/r...


My opinion is that this applies to GA4 as well.

The decisions don't explicitly mention a version; they say these particular sites "...shall cease to use the version of the Google Analytics tool used on 14 August 2020". They don't say whether that's UA or GA4. The original complaints from NOYB refer to UA, but the issues cited in this decision would apply to GA4 as well.

So when the DPA says "Companies must stop using Google Analytics", there's no reason to think they only mean the version that was already shut off when they published that post.


I guess they can't ban a product for all eternity. In the decision [1] they are a bit more specific:

"This shall be done in particular by ceasing to use that version of the tool Google Analytics as used on August 14, 2020, if not sufficient protective measures have been taken."

[1] https://www.imy.se/globalassets/dokument/beslut/2023/beslut-...


The result is the balkanization of the internet, although most folks in the EU want that for nationalistic reasons and to prop up their own industries, under a thin veneer of opposing US imperialism (while being unable to do anything about the things which actually threaten the EU, such as an overdependence on Russia).


I don't want my data in the hands of the NSA as much as you don't want yours in the hands of the BND, FSB, or the People's Liberation Army.


Fair enough, but if this were actually the reason, simply adding a statement such as "if a non-EU member state wants to request data about an EU national, said data cannot be released to the non-EU member's law enforcement without an MLAT being served to the EU national's government" would have been enough. However, I don't see that, and many Europeans do give "dominance of Americans" as the reason, so I simply go by the information available to me.


Fair enough, but if this were actually sufficient then the US wouldn't have a law saying they can require US based companies to give them the data without regards to the laws of their countries the data is actually being taken from.

However I don't see that, and thus Europeans familiar with the relevant cases give “dominance of Americans” as the reason.


Someone has not heard about the CLOUD Act.

The European Commission has repeatedly tried to figure out a framework that would let US providers access EU markets safely, respecting EU laws. Every single attempt has been struck down, because there is no way for a US company to respect EU law and also comply with the CLOUD Act.

The bit of law you suggest would essentially make it impossible for a company to respect both EU law and US law.


Are you aware of the history of this case and the CLOUD Act?

The CLOUD Act requires US based companies to comply with US requests for data even when that data is stored exclusively outside the US. It's in direct conflict with the GDPR.


I would rather the NSA get my data than the BND though. But GDPR doesn't protect me from the latter. The intelligence agency at home is always a bigger threat to you than any foreign actor.


> although most folks in the EU want that for nationalistic reasons and to prop up their own industries

Sources for this extraordinary claim, please.

As an EU citizen I'm interested in the EU protecting my privacy, not for nationalistic reasons, not to prop up EU's industries. I care that my data isn't willy-nilly given away under some opaque mechanisms controlled by large corporations, because as many here on HN like to remind me: capitalism is amoral, this is some regulation instilling morals into the system.

I want to be aware of where my data is being used and for what purposes, and I want to have the power to control how corporations can use my data, or whether they can use it at all. This kind of data can be modeled into a version of what a system sees as "me", through the interactions I had with it, building a profile of what moves and interests me. I want to be able to know and control who can know who I am, and to what degree.

If you are against that, please explain why.


The best and only real way to protect your data is to not give it away in the first place. But that would require a browser that doesn't leak data like a firehose, or not using the website at all.

The regulation just makes sure Europe stays unimportant in tech. We will always be a secondary market that services get taken to once they've become successful elsewhere.

It's not even about any specific regulation anymore. Just the fact that they've been so trigger happy with regulations is enough to chase away investors and startups.

I would also like to remind people that the EU did adopt the Data Retention Directive in the past that forced ISPs to keep logs of every website people visited. That kind of soured any belief I had of EU politicians caring about our privacy.

>According to the Data Retention Directive, EU member states had to store information on all citizens' telecommunications data (phone and internet connections) for a minimum of six months and at most twenty-four months, to be delivered on demand to police authorities.

https://en.wikipedia.org/wiki/Data_Retention_Directive


I don’t object to the consent requirements for marketing, however, please see [1] for the justification behind my stated position.

[1] https://news.ycombinator.com/item?id=36584739


Your citation is... your own still-unsupported opinion? It's not really responsive is it :-)

The EU has a fundamentally different viewpoint on data privacy to the USA. And they are entitled to it. The EU has roughly as many citizens as the USA under a single governance; why should they not collectively argue?

Many Europeans do, also, think the USA is mad on other issues and are unwilling to see a situation where the USA's chosen solutions to things are the de facto solutions. They see data privacy as one of the last opportunities to resist that.

(Alas here in the UK we decided we didn't want to be part of that solidarity, and we are apparently desperate to capitulate.)

As a side note, why is it only ever non-American states that are said to "prop up" their own businesses? It's a two-way street.


EU is much more threatened by an over dependence on USA.


Analytics right now is basically "you won't get any useful information for your website because we value users' privacy, don't worry we see all of the data anyway"

Remember, you aren't the customer if you embed Google Analytics, Google is.

edit: if you want analytics, honestly just roll your own... you can't trust advertising companies with your users' data


This is true not just for analytics but pretty much all features.

Imagine you are a great speaker and instructor and have an audience. Right now you GIFT it to YouTube, Twitter, etc. and they monetize it for you, give you a tiny percentage, and even constantly direct your audience to competitors and other distractions. In fact YouTube even sells an option to advertise your videos on your competitor’s videos!

I say — opt out. Run your own everything! Your own community software (instead of Discord). Your own videoconferencing, livestreaming, chats, presentations, gated content, accept payments with crypto in addition to PaymentRequest. It’s hard to build an open-source alternative that is good enough (no, Mastodon and Bluesky aren’t — yet).

Which is why (shameless plug warning) I spent 12 years and $1 million with my team to build it. https://github.com/Qbix/Platform

Use it — as 1 of hundreds of features, you can have your own analytics on your own database on your own community site. The other features are here: https://qbix.com/features.pdf

PS: Don't get me wrong. Keep using YouTube to host your content, etc. But relegate it to hosting short-form teasers and highlights and testimonials, all of which link to your site. People can discover you on the big sites, but if they are serious about your long-form content and community they should buy a membership on YOUR site and have a direct relationship - then deplatforming or coercion will be the least of your worries.


All the videos on qbix.com itself seem to be hosted on YouTube?

Also, the features PDF lists "nodejs" and "php" as features... I don't mean to be snarky here, but I am simply not sure what this product is?


In addition to PHP, it is a "Distributed Operating System for The Web" https://qbix.com/ecosystem#Distributed-Operating-System


Watch the videos for a start


Who has time to watch videos without a high level overview? :)


The high level overview is at https://qbix.com/platform


I read that and don't feel like I have any more idea what this is. A CMS with some "community" features I guess?

But all those overlapping screen shots make it look like there's been an explosion at the website factory and the smart thing is for me to run in the other direction.

1/3 of the Youtube videos are just "This video is unavailable".

I couldn't look at it that much longer because the colorful stars falling out of my mouse cursor was so distracting I had to close the tab.


Holy 2008 what a website! I won't comment on any specific design choices here, but accessibility-wise: the text on the team page seems fuzzy? Looks like it's a text-shadow.


What is 2008 about this website? I don't know any websites from 2008 that look like that. What do current websites have which websites from 2008 don't?

When I visit YouTube, Facebook or Twitter, they seem extremely "busy", overrun with ads, and rather ugly, but we are used to them. I am not sure it's so bad to have a clean layout. But I am open to constructive criticism.


I get where they're coming from. It kinda reminds me of an old-school iOS app (pre-iOS 7, before everything went all flat design). The skeuomorphism, the arrows in the menu, the big black borders on the menu. The layout, styling and font kind of remind me of the jQuery era (but of course, I was primed to look for this stuff by the GP comment).

None of what I'm saying is criticism BTW. Just observation.


Does it make it bad or offputting?


I don't know about the 2008 stuff but I found the site to be pretty broken, as in things clearly not working and the code blowing up. The design is very busy on desktop; I suspect your "clean design" is what you see on mobile. The desktop site is packed to the gills and every page has multiple things animating and bouncing at me. Let me make it clear--this site is busier than most, not cleaner.

On desktop, it's pretty easy to open one of the menus on the top and then have it fail to close. Since the site disables the scrollbars while a menu is open, it breaks the site until you figure out the magic spot to move the mouse to make the menu close again. The magic spot doesn't seem to be in the same place every time. Seems buggy. I spent most of my time on this site with one of the menus open, unable to scroll down and see beyond the first page.

The worst part? The links in the menus don't work. A peek in the source code suggests they are supposed to be links, but clicking them doesn't do anything because they're just <div>s (not real <a> links) and the click event handler simply calls preventDefault (QTools.js line 107). That finicky navbar nearly ruins the entire site. I'm almost entirely unable to navigate around. This site is, unfortunately, pretty broken on desktop Chrome. Test your site on desktop in addition to mobile.

I got the site to hit an explicit debugger breakpoint in Q.js line 10293 just by clicking around the top menu buttons. The author of that code didn't bother writing an exception handler, they just had it trigger the debugger.

I do find it offputting; the site definitely has the feel of "a programmer hacked this together without any input from a designer." The massive drop shadow from the embedded videos actually covers up some of the text on desktop. The font is VERY thin--make sure to check your site on Windows and not just macOS. More generally: hire a designer. Programmer designs stick out in a bad way, and users seeing a broken marketing site will assume the product is broken, too. I certainly do.


I want to fix what you're talking about, but I was not able to reproduce the bug. Can you please tell me how you got the menu to not close, for instance? Also, clicking on the menu items clearly opens the page they're linked to. I couldn't get it to not do that.


Reproduction steps:

1. Install Chrome for Windows from the Google website. As of today, that version is 114.0.5735.199 (Official Build) (64-bit). I am testing on Windows 11 and I used a fresh install of Chrome on a machine that has never had Chrome before. This machine has a standard 60Hz display which may matter for my theory at the end.

2. Go to qbix.com.

3. Hover the mouse over "Communities". Now hover over its submenu items. Observe that they do not highlight on rollover like they're supposed to, and clicking on them does not do anything.

4. Now quickly move the mouse outside of the menu. Observe that the mouse escapes the menu, and the menu does not close. Move the mouse around the rest of the page. Observe that the menu continues to stay open. Observe that you can't scroll the page. In this state, the site is unusable.

5. Move the mouse back inside the menu, then slowly move the mouse across the edge of the menu. Observe that now the menu closes.

I have reproduced the same in Edge and Vivaldi; the issues appear to manifest in Chromium-based browsers on Windows. I tested on macOS and iOS and the issue does not show up there. I can provide a screen capture if needed. Without looking deeper, I wonder if this page is trying to use JavaScript to close the menu based on a mouse event that it misses when you move the mouse too fast. I wonder if the entire navbar is implemented in JavaScript instead of a modern CSS-only technique with regular links.


Tried it with Chrome and Edge on Windows. Still can't reproduce those bugs. Strange.

The worst thing is when some users see a heisenbug that you can't seem to reproduce on a similar environment.


I've tried it again on both Windows 10 and Windows Server 2022 and it reproduces every time. I have yet to see the site working properly on Windows Chromium.

Please try creating a fresh cloud Windows instance rather than using your usual computer so you can be sure you are seeing what a fresh user on a new computer would see. I have done so--this reproduces in the preinstalled Edge on a brand new c6a.large Windows Server 2022 instance in AWS. I can provide a click-by-click screen capture starting at the AWS Management Console if desired--I've found this is a good way to prove bug reports to companies and demonstrate that it has nothing to do with my computer.


https://youtu.be/NTujUB8t4tE

Here is the screen capture. This video shows the creation of a fresh Windows instance in AWS EC2 and then the reproduction of all of the above issues in that fresh instance. Follow my exact clicks and you will see it, too.


It's just rough around the edges. Nothing that isn't fixable. Nothing that stops you using the site. It just looks a little bit dated - things like the gradients on the menus, a trail behind the mouse pointer.

It would definitely be more appealing to the masses if it was brought more in line with a more minimalist 2023 aesthetic, IMO.


Since you ask, I'll be honest. I don't think your layout is clean at all:

- There's a confetti effect following my cursor around

- There are constant animations playing everywhere, even over video embeds

- A lot of the spacing is uncomfortably tight

- Shadow effects overlap other content

- There are some 30 menu options, buttons everywhere, many video embeds that I'm supposed to listen to?

None of this is enticing me to consume any content on your website. I'm feeling uncomfortable even trying to browse the site to find out what your product is. I've learned that a significant portion of the web runs on PHP and that you have an app and a token of some sort? And you're looking for investors? I'm sure there's very cool ideas in here, but I'm lost in the vast amount of information with inscrutable organisation.

All that aside, there's a big reason that a lot of landing pages look similar: it works. First impressions count for a lot.


> Which is why (shameless plug warning) I spent 12 years and $1 million with my team to build it.

How much of that million dollars went to creating the confetti effect following the mouse cursor around on https://qbix.com/ ?


> they monetize it for you, give you a tiny percentage

It’s actually about 55% for YouTube. Creators are in demand and it’s competitive to keep them.


How do you plan to monetize?


That's essentially what I did. I don't need to know the browser or device or where they are from (the product is UK-only, so it's largely irrelevant) - since I control the server side, I just log page visits and certain CTAs, and can see a rough journey of where people have been and whether they placed an order or not.

That's far more reliable than hoping that people have JS enabled and don't have tracking blocked.

(I.e. I can see that they visited an order page, then went back to the FAQs, then clicked on a "what is x" link - so I should probably update the content on the order page to explain what x is.)

Obviously it depends on what data you actually need, but that gets me most of the way there without gathering a load of data that isn't needed.
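A sketch of that kind of server-side logging, assuming a Flask app (the log format is made up; swap in whatever your framework's request hook looks like):

    from flask import Flask, request

    app = Flask(__name__)

    @app.after_request
    def log_hit(response):
        # one line per request is enough to reconstruct rough journeys later
        app.logger.info("%s %s -> %s ref=%s",
                        request.method, request.path,
                        response.status_code, request.referrer or "-")
        return response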


Most alternatives are not made by advertising companies, but they also frequently aren't free... Rolling your own from the ground up is not necessary or typically advisable when there are so many good options, including many self-hosted and open source options if you're wanting that level of control.

I usually describe the cost of GA as "subsidized by your customers' data".


Matomo is a good self hosted analytics option.


I am a bit concerned about how Matomo deals with security. All their PHP code is located in the public folder and they use nginx[1] rules to block access to scripts that are dangerous.

[1] https://github.com/matomo-org/matomo-nginx/blob/master/sites...



Deploying the Snowplow backend is non-trivial (to put it mildly) and extremely expensive if you don't want to host it yourself.


€0.02/day and easy enough for anyone with basic cloud skills: https://www.dumky.net/posts/own-your-web-analytics-pipeline-...

(With caveats: This is a toy pipeline. More work required to make it robust and probably not a great option for any site with reasonable traffic. But easy to get started and play.)


Plausible is a better alternative.


We found it to be nothing but trouble. Disappearing data, database servers bogging down under what should be trivial load because of inefficient SQL design, painful upgrade process, etc.


"Roll your own" is hubris, unless you have the time, energy, inclination, theoretical knowledge, etc; I wouldn't try and solve solved problems if you can help it.

But, host your own is definitely recommended IMO; a lot of the GDPR issues are resolved if you just host your own, because no data is shared to a 3rd party. Then you only need to worry about getting some approval and data retention. I'm sure data retention is a non-issue if you process raw analytics data (that can be traced back to a user) into generalized statistics, too.


I would love to know other people's opinions on this.

But I am coming around to the idea that self-hosted (at least partially) might be cheaper and better if you are a single dev or a small team.

The learning curve on the tools/hosting providers out there has become very steep. Plus the costs are unclear with a lot of cloud providers, and monthly subscription charges across the services you need can quickly stack up, or the prices can suddenly change.

I tried deploying an app to AWS a few months back. You get a year's worth of credit when you start out. But the database I used was not covered (I did not realize all the database options were not covered). I got charged a pretty penny (luckily it was not life-shattering, but it was a shock; I hate to imagine what would have happened if I did this for a bigger app).

I tried Google Analytics a few times over the last few years. But again, it has become complex, so I would have had to spend a ton of time learning it to even just get started.

I've had a few SaaS providers suddenly hike up prices, change pricing models, or just shut down.

We have been using dedicated servers of late. A single server seems to be able to handle multiple client apps along with hobby/test apps for a fixed price. Yes, it is not as easy as putting in an email and credit card and using a service. But the price and peace of mind have been worth it. Plus we just write a few scripts to automate things. If the workload becomes too much we can hire a person to do that, and one person will do, compared to having to hire a specialist for each major cloud service we use.


I thought (though apparently with these comments I'm in the minority) that the most recent conventional wisdom has reverted to agreeing that yes, starting on self-hosted / managed servers is the best path. Seems like naysayers in here don't know how to set up a server with NGINX / SSL / whatever else you need. IMO these are _required_ experiences before trying to do it on a much more complex platform like AWS. These self-managed server setup tasks are more than manageable for a small team, or even a solo dev, in a matter of a day or two.

My go-to 'cloud' path is just a DigitalOcean droplet, using Docker containers to spool up whatever you need and connecting it all on a Docker network.

A single server (2-4 cores) running quite literally ANY modern backend framework (Node, Go, C#) should be able to handle _thousands_ of requests per second. I'm not sure where or when this idea disappeared; it seems like everyone automatically assumes their small SaaS or webshop needs an autoscaling Kubernetes 20-rack workhorse of a server. Not the case at all!

Scale when you need to - if you're getting the kind of traffic where you need to, by then you won't need to worry about the added cost to do the actual scaling / upgrading.

Sorry for the long-winded / ranty answer. I've done this 20+ times at this point and always had to battle against the "let's put it on AWS with kubectl and 2349023 redundant instances!", when in the long run it was never needed...


> I tried deploying an app to AWS a few months back.

AWS is a toolbox you can use to construct your app deployment/hosting environment. Unless you know or want to know how to (for example) set up routing on VPCs, it may be better to go with someone actually hosting apps rather than infrastructure.


I'm much more productive since I realised I don't have to use "standard" tools. It saves me so much time.

With that being said, I don't make everything myself, but just the thought that I'm not obliged to use the best-practice standard solution to the problem is liberating to me and it makes me so much more productive in actually doing stuff.


Besides, even if you use the standard tool today, five years from now this will be "the tool that was standard five years ago" unless you actively migrate every so often. There are a few domains where the standard tools remain the same for a long while (Google Analytics being one), but often the standard tooling shifts more frequently than that.


Yeah, I heavily suspect the changes in "standard" stuff isn't due to there being revolutionary improvements every 4 years or so, it's just because the new stuff becomes trendy.

When I research programming languages I'm always amazed that almost every feature existed in the 70s as well; just no one bothered to use it.


> unless you have the time, energy, inclination, theoretical knowledge

Why does anyone need all that?


Because setting up useful analytics, for almost any definition of "useful", is not a trivial "I could do this in a weekend" project.


The better question is: what analytics do you actually need? I don't think cross-page tracking of the user is something most websites need...

Useful information is more like: what users clicked, what platform they used, how much time they spent on the website, and things like that. Those aren't that hard to just track yourself.
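As a sketch of how little code that takes (a hypothetical first-party tracker; the /collect endpoint and the data-track attribute are assumptions for the example):

    const pageStart = Date.now();

    // Post an event to your own backend; sendBeacon survives page unloads.
    function track(event: string, data: Record<string, string> = {}): void {
      navigator.sendBeacon(
        "/collect",
        JSON.stringify({ event, path: location.pathname, ...data }),
      );
    }

    // Count clicks on any element marked with a data-track attribute.
    document.addEventListener("click", (e) => {
      const el = (e.target as HTMLElement).closest("[data-track]");
      if (el) track("click", { id: el.getAttribute("data-track") ?? "" });
    });

    // Report time on page when the user leaves.
    window.addEventListener("pagehide", () => {
      track("leave", { seconds: String(Math.round((Date.now() - pageStart) / 1000)) });
    });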


I reckon server-render-based analytics (as opposed to cookies and pixels etc. on 3rd-party domains) will be better because they won't be blocked by ad blockers. If the page got rendered and sent, well, I know about it unless you blocked the site entirely.


What I am curious about is how much people actually use ALL the analytics information provided by a lot of these tools. I know Matomo and other such open source/self-hostable solutions, but how much info do you really use?

I think for most use cases users would want to know if their content is consumed/read, maybe how long someone spends on it and where they came from. For this sort of stuff you can write a small script to parse your logs. I did something along these lines to parse Caddy logs to get some idea of how many people visit a link. That's really all I needed, and the great part is that I run it whenever I want an update, so it's not consuming resources constantly. The output is saved before the logs are cleared, so I know Article 1 had 39 views (or fewer!) and Article 2 had 5 views and so on...

So I think we're overdoing it and we would benefit from taking a few minutes before going down the rabbit hole of analyzing EVERYTHING.
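For a rough illustration of such a script (Caddy writes JSON-lines logs by default; the field names and the /articles/ prefix here are assumptions about one particular setup):

    import * as fs from "fs";
    import * as readline from "readline";

    // Count views per article path from a JSON-lines access log.
    async function countViews(logPath: string): Promise<void> {
      const counts = new Map<string, number>();
      const rl = readline.createInterface({ input: fs.createReadStream(logPath) });
      for await (const line of rl) {
        try {
          const uri: string = JSON.parse(line).request?.uri ?? "";
          if (uri.startsWith("/articles/")) counts.set(uri, (counts.get(uri) ?? 0) + 1);
        } catch {
          // Skip malformed lines.
        }
      }
      for (const [uri, n] of [...counts].sort((a, b) => b[1] - a[1])) {
        console.log(`${n}\t${uri}`);
      }
    }

    countViews("access.log");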


Analytics and Business Intelligence in general tend to play a big part in modern enterprise organisations, at least in my experience. Often what happens with corporations is that the larger they grow, the more risk-averse decision makers become, and suddenly things like analytics become nice foundations to lean on for when a decision is questioned.

What I'd be curious to see is the ROI on these tools. They obviously work in some cases, but do they always work? We currently employ three business intelligence developers and two developers who actually build products. What's most hilarious about it, however, is that despite employing three BIs I can't tell you if they earn their keep, because their data doesn't show that.


I'm using Plausible Analytics. It logs and displays very little data. And no PII.

It's more than enough for me.

And a few clients for whom I enabled it told me they very much liked the simplicity. Less data as a feature!


Plausible rubbed me the wrong way because of the attitude of their staff, but maybe I was the asshole?

I found a bad bug in their JS which means that on some pages it just silently fails and doesn't log anything, which means your analytics are even more inaccurate than ever (given the browser restrictions). I was totally broke and I wanted to use their paid service for a few months, so I offered them the fix in exchange for a few months free service (maybe $30 credit?). They told me basically "don't worry, we'll find the bug ourselves one day, we don't need your help."


I obviously don't know this exact situation.

But I've been in one, where a customer offered "patches", despite our software not being open for contributions. Not only were they inconsistent with our standards, they were hard to read and had some subtle security issues on careful review. I'm still suspecting it was an attempt to plant a backdoor.

In any case, even if legit, it was a lot of work on our side to just review and clean it. Far more than if we just did it ourselves.

This is different for OSS, which should have external contributions as a main workflow. Ours wasn't prepared for external contributions.

Maybe it's the same with Plausible?


Plausible is great, and they just added conversion funnel analytics last week which was the big feature I was missing.


With SPAs and mobile apps, server logs won't be accurate or that useful.

Tracking events is actually useful to see which features are used etc.

It's not all marketing and evil ads.

Having said that, GA4 is awful as a casual user.


Depends on the SPA, plenty of them fire enough requests to the server for server side logs to be useful.


Or even if you don't want to use your web server's logs for this purpose for whatever reason, this is quite trivial to implement in JS yourself. No need for GA and other bloated analytics frameworks.


If you could trivially implement Matomo (a project that has been developing over 16 years) in JS, please open source it. Would love to get rid of the PHP in our stack.


If you're using the docker container it really shouldn't matter.


The security risks are still there.


Well, you still need some kind of backend to store the data. You can send it to a 3rd party, but then you'll run into all the same GDPR issues.


If the website is not just a static html page, there is likely a web-server with a database that can store information.


Oh, sure, but a LOT of websites are static html pages — or, at least, should be.


> all the same GDPR issues

Not necessarily. If I read the article correctly, it is about sending data to the US:

> The complaints allege that the companies, in violation of the law, transfer personal data to the United States.

So if the 3rd party is inside the EU, you might be fine. Or at least you may run into different GDPR issues.


> how much people actually use ALL the analytics information

I sometimes check access logs and pipe some grep queries into a line counter, or uniq by IP address, to get a rough idea of how many people look at a particular part of, or tool on, my website. Maybe twice a year or so. It helps prioritise which things are worth maintaining/updating based on what's still being read (found by search engine or linked from third parties).
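The same idea as a small script instead of a shell pipeline, counting unique IPs per path; the regex assumes a common-log-format file, so adjust it to your server's actual layout:

    import * as fs from "fs";

    const visitors = new Map<string, Set<string>>();
    for (const line of fs.readFileSync("access.log", "utf8").split("\n")) {
      // Common log format: IP ident user [date] "METHOD /path HTTP/x" ...
      const m = /^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+)/.exec(line);
      if (!m) continue;
      const [, ip, path] = m;
      if (!visitors.has(path)) visitors.set(path, new Set());
      visitors.get(path)!.add(ip);
    }
    for (const [path, ips] of visitors) console.log(`${ips.size}\t${path}`);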


With a SaaS application, we use it for monitoring customer activity to drive support and sales renewal activity, to determine which features particular customers are using, to determine how they are using it, and how these things are changing over time. It's a vital part of everything we do from a product and sales perspective.


Do you know a static host that makes logs available? I happen to be looking to do something like this right now, but I would rather not run my own web server for my simple static blog.


I recently deployed a static website on Bunny.net using their object storage and their CDN, and they make logs available in this format: https://docs.bunny.net/docs/cdn-log-format


Thanks. I hadn't heard of bunny.net before. I'm going to give this a shot.


Nearlyfreespeech.net does. I’ve used them for many years to host static sites.


I’ve been tinkering around on nearlyfreespeech.net for about an hour now, and I love it. Thanks.


You bet! They’ve been good to me.


> For this sort of stuff you can write a small script to parse your logs.

IFF you have access to your logs.


So you can setup GA but you can't check your web server logs?


That covers everything hosted on GitHub Pages, as one example.


Why would you not have access to your logs? Not having access to logs doesn't even make sense to me. We honor HIPAA and GDPR and we can access logs. Beyond that, I am a proponent of structured logging and log aggregators that can help you see trends and analyze the logs, like Splunk or, to a lesser extent, DataDog.


I decided not to convert to Google Analytics 4 because I used it as a glorified visitor counter. I opted for a websocket to measure active users, the page they are on, and some basic hourly peak and total user counts split out over logged-in and anonymous visits.
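For what it's worth, that websocket approach fits in a handful of lines; a sketch assuming Node with the ws package (the port and reporting interval are arbitrary):

    import { WebSocketServer } from "ws";

    const wss = new WebSocketServer({ port: 8080 });
    const active = new Map<string, number>();

    // Each open page holds a socket, so live connections = active visitors.
    wss.on("connection", (socket, req) => {
      const page = req.url ?? "/";
      active.set(page, (active.get(page) ?? 0) + 1);
      socket.on("close", () => active.set(page, (active.get(page) ?? 1) - 1));
    });

    // Hourly snapshot; peak and total counters would hang off this.
    setInterval(() => {
      for (const [page, n] of active) if (n > 0) console.log(`${n} active on ${page}`);
    }, 60 * 60 * 1000);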


Unrelated: I used ChatGPT to generate remark.js presentation HTML code from some content. It did generate the code, but it inserted a GA snippet along with a random GA account code at the bottom of the code. I did not even catch it immediately (laziness, totally my fault), but noticed it a couple of days later when I was modifying the presentation.


> totally my fault

IMHO, it's not totally your fault; private information shouldn't be shared with others. It's your responsibility to verify what ChatGPT is writing before you use it; it's OpenAI's responsibility to not share private information.


Google Analytics tracking ID is public information.


Here you are assuming this was somebody's valid GA ID and not dummy data...


Just like how google wasn't going to train ML models on your private possibly confidential information...


Wait… People are actually using ChatGPT to write production code? That's not just a meme?


Yes. It's an incredibly useful tool that saves a lot of time and mental energy. It's wrong to use it without checking its output, but even with corrections required it is useful. Lately I used it to modify some AWK scripts since I'm not familiar with AWK, and from the changes made I was able to grok enough of how it works to make the changes I wanted.


Hope that's not sarcasm lol.

Yeah we have copilot subscriptions at work and an Azure GPT-4 instance that is being trained with enterprise data.


[flagged]


If the world operated on a "who cares" basis, nothing interesting would ever happen.


Who needs analytics? I'm confused.

When I worked at companies using Google Analytics, 99.9% of the time they could have gotten this data from server logs with something like AWStats or GoAccess.

To this day, I still don't get the point of embedding some JavaScript to make extra requests, or a tracking pixel, when the data was already given once.


It's a problem introduced at least in part by SPAs. When the application runs entirely in the client browser, save for a few API calls (that may be shared between multiple pages), it's difficult to tell which pages are actually being viewed, unless you have the client application report back to you (or to Google).


Easily addressable by including a header like x-from-page where you declare which "SPA route" the backend requests were made from.
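A sketch of what that looks like on the client (the x-from-page name comes from the comment above; any custom header would do):

    // Wrap fetch so every API call carries the SPA route it was made from.
    function apiFetch(input: string, init: RequestInit = {}): Promise<Response> {
      const headers = new Headers(init.headers);
      headers.set("x-from-page", window.location.pathname);
      return fetch(input, { ...init, headers });
    }

    // Server-side logs can then group requests by the x-from-page header
    // to reconstruct page views, e.g.: apiFetch("/api/cart")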


I've worked at a web agency creating many websites for mid-sized companies (millions of views each month). My conclusion is that it is about feelings. It feels good to know you have some data about your users, even if in most cases you are not going to look at it. And if someone is hired to look at it, the findings are not acted upon.

The best way to get feedback is to talk to your users face to face, or do a questionnaire.


The question I have isn't why you need analytics but why you'd ever need any PII in the data. I don't care whether Bob clicked the button I only care whether 1% or 50% of users click the button. Or if those who clicked button A are likely to click button B so they should be closer together. Analytics should be anonymous usage statistics not tracking individuals. We are clumping two things together where one is bad and the other is useful and mostly harmless to integrity.


That's the idea, but to know that an anonymous user who has clicked button A goes on to click button B requires you to track that user via some kind of random ID that uniquely identifies their browser/device. This new Swedish ruling says that ID is itself personal data.


How does the ruling come to that conclusion? How can an ID that uniquely identifies a user, but can't be used to trace back to a physical person, be PII?

Reading the linked rulings, it seems like it's not the IP (even though only the last octet is blanked) but rather other cookie values, which may in turn be traceable to the user?

Of course if a cookie value is sent and in some other system that same cookie value is stored next to a user's name, then that cookie value is definitely PII and can't be sent via GA, that much I understand.

The key passage from the longest ruling (DI) seems to be (translated from the Swedish):

These identifiers were created for the purpose of being able to distinguish individual visitors, such as the complainant. The unique identifiers thereby make the visitors to the Website identifiable. Even if such unique identifiers (under point 1 above) were not in themselves considered to make individuals identifiable, it must be taken into account that in the present case these unique identifiers can be combined with further elements (under points 2–4 above), and that it is possible to draw conclusions in relation to information (under points 2–4 above), which means that the data constitute personal data, regardless of whether the IP address was not transferred in its entirety.

Basically: the random IDs aren't enough by themselves, nor is the IP, but the IDs together with partial IPs and something else are.

I don't know what the bottom line is, though. And that worries me a bit. Any analytics will be at risk of doing this. In my desktop app analytics we blank IPs etc., but just storing some hardware data (RAM amount, CPU frequency, Windows version, screen resolution...) means that we eventually have enough entropy to say with certainty that each user we have has a unique set of parameters in the data we log. It's almost impossible NOT to fingerprint perfectly if you gather even just basic hardware and OS info, for example. But there is of course zero possibility that we could use the data backwards and say "OK, which single physical person is it that has a 16-core machine and 16 GB of RAM", so does that make it "not PII"?

I think the key issue in these cases with GA is that it's more a chain leading to actual PII. E.g. the cookie value that GA has access to, can realistically be stored somewhere where there is also PII such as an email address. And that's enough to violate the GDPR.


The question is not just whether the raw data is available (Google Analytics also makes the data accessible in Google BigQuery) but whether business stakeholders have the ability to easily access the data and drive decision-making.

You need an interface that visualizes the data and decentralizes access and analytics as much as possible.

Since Google Analytics is free and more or less part of one of the biggest marketing stacks (Google Ads), you will find a lot of marketing stakeholders with at least some knowledge of the tool. But perhaps the landscape will change with the very rocky start of Google Analytics 4.


Back in the day before responsive design, I loved having stats on things like screen resolution you couldn’t otherwise get from server logs. You also get stats on keywords driving traffic to individual pages.


What makes you think it’s legal to use those log files for that purpose? ;)


> Who needs analytics? I'm confused.

SEO and Marketing Dept. of any company.


It's just something data goblins like to collect and obsess over instead of making an actually good product.


Marketing teams with large budgets.

Not that they actually get questioned properly about actual stats, but they can confidently say they have GA set up and it’s showing some numbers, so just trust us.

Google is “trusted”. Why would the person setting their budget put faith in some hand rolled/open source solution ?! /s


With server-side tracking you're not able to identify and properly track non-logged-in users. GA (and other client-side tools) take care of this via cookies.

Additionally, a common argument is that server-side logs contain a lot of entries from bots/crawlers, and GA (and the like) can filter them. The other side of the coin is that GA (and the like) are not able to track users with ad blockers.

EDIT: not sure why I'm downvoted - the OP asked for some reasons why people use client-side tracking and I listed them. I didn't say that I support these practices, but maybe I should have made that explicit to comply with the overall sentiment of this site.


For the vast majority of sites I would bet using IP address alone would be enough to track enough users individually over their single sessions to be statistically useful.

Does anyone have any experience? How badly does CG-NAT mess this up over a large enough cohort of users?


You can set a tracking cookie when the user first accesses your website without needing them to log in.
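A minimal sketch of that in browser code (the cookie name is made up; note that the ruling discussed above treats exactly this kind of random ID as personal data, so consent rules still apply):

    // Assign a random first-party visitor id on first visit, no login needed.
    function visitorId(): string {
      const existing = /(?:^|; )vid=([^;]+)/.exec(document.cookie)?.[1];
      if (existing) return existing;
      const id = crypto.randomUUID(); // available in modern browsers
      document.cookie = `vid=${id}; Max-Age=31536000; Path=/; SameSite=Lax`;
      return id;
    }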


They also ensure that you need the annoying popup asking for consent to tracking to comply with the law, rather than doing no tracking and not annoying your users.


I had an engagement with a client a few years ago and GA came up. Folks on our side tried to avoid Google where possible, and I'd suggested some alternatives: Matomo, Fathom (IIRC), or others - multiple folks on the team had experience with these alternatives, but the client was insistent on GA. "This is the industry standard - look at all the billion-dollar companies running GA - this is what we should use." I pointed out those comparison companies also had dozens of engineers per project; we had 3 part-time people.

The argument kept coming down to "GA is the standard; GA is what people know". Which is... true, if not somewhat circular.

My other suggestion was to try multiple: run GA and Matomo together, for example, for a bit. Or GA on just the public marketing site, and something else on the internal application. Nope, because they wanted to track every single ad spend all the way through to registered-user usage of the internal business application. Knowing that the $70 spent in the Tacoma geo led to 3 users registering, and then that those 3 people routinely used a budgeting tool more than the 8 people who registered from the $90 spent in Toronto... apparently those sorts of analytics might be needed in the future, so we had to have this.

Instead of "let's just install both for a few weeks and try them", this became "let's 'investigate' multiple options and write reports about the pros and cons of each". Nuts. My larger concern was that, for testing/dev purposes, we'd not have as easy a time 'resetting' an analytics DB that was not under our control (resetting, or maybe creating new/unlimited sandboxes for each test run). I didn't find any way in GA (or really any hosted solution) to handle testing well. But maybe that's not a big concern among 'enterprise' analytics users?


Every time a client makes me implement google analytics or facebook pixel code I die a little inside. And even though some actually use google ads, they have zero benefit from using analytics. I know, because I'm the one adjusting their campaigns.

It's just another thing everyone does and one would be stupid not to, right, right? The lemming mentality always makes me sad because so many bad things in our society are a result of it.

And every time someone says that rolling your own is a waste of time... I roll my own everything, including CMS / SPA frameworks, because it's a giant waste of time to do otherwise in the long run. The only time I waste regarding rolling my own is when tobacco is involved.


Could you link to a site where you’ve rolled your own SPA? Or ideally the source code. I’d like to take a look as not many people do that, even less do so with enough attention to detail not to cause UX regressions (not that third-party solutions are brilliant either).


I'm an enjoyer of anonymity online and would rather not doxx myself. However, if you have a specific question, I'd be happy to answer. I use JS/jq for the front end and PHP for the backend. Once you have your own CMS, turning it into a SPA means turning your index.php into an index.html that is PHP-free and relies on ajax calls to change the content. So at minimum you need a mainbody.php and a head.php that accept inputs. After that it's just onclick actions on buttons that trigger the ajax function changeMainbody(targetPage), or changeHead() after a user logs in. On the phone app side you use the regular WebView to avoid any cross-origin blocks and problems. Alternatively you can run it locally and pass the cookies as inputs, depending on the app's needs. Should be safe using HTTPS, right? I fully expect to be scolded by someone with 20 years more experience, but I guess that's how you learn.

Of course there's more to it, depending on the app's needs. In my case it also auto-refreshes the contents on a JS timer.

What UX regressions did you have in mind as troubling? Things like resetting one's password if forgotten? Well, all those things need to be turned into their own ajax calls and PHP scripts as well, and sometimes reworked to fit mobile users' needs. For resetting passwords specifically, I just copied what Twitch.com does.


(Not OP) The kind of regressions I'd think of would be things like:

- updating the URL state when someone clicks on a button

- proper back-button support in the browser that takes me back to the prior 'page'

- being able to navigate to any URL deep in your app and get a valid response (ideally rendered server-side so there's no client-side loading delay).

Things like these are hard, and the reason why it's common advice to use a framework and not hand roll. If you hand roll but don't support these things gracefully, you're making a case for not hand rolling.


True. One needs to build one's own "router", if you will. But in practice that means writing functions that modify the URL (some pushState(url) and scrollTo(top) stuff) and making them part of the primary function, so you can forget about it (I'm a functional programmer). Same with adding/subtracting from the history stack. An hour of work each. Is that too much?

It really just worked without much troubleshooting. Most trouble I've had was with cookies and cross-origin problems (or weird client requests).
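For readers following along, a minimal sketch of the router described above (loadMainbody and the mainbody.php endpoint are stand-ins for whatever the CMS exposes):

    // Fetch the new content and swap it into the page.
    async function loadMainbody(url: string): Promise<void> {
      const res = await fetch(`/mainbody.php?page=${encodeURIComponent(url)}`);
      document.querySelector("#mainbody")!.innerHTML = await res.text();
      window.scrollTo(0, 0);
    }

    // pushState updates the address bar and adds to the history stack.
    function navigate(url: string): void {
      history.pushState({}, "", url);
      loadMainbody(url);
    }

    // Back/forward buttons re-render without adding new history entries.
    window.addEventListener("popstate", () => loadMainbody(location.pathname));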



Haha, thanks. I actually have one of those. I roll faster without it :)


When I visit a company site and it uses google analytics I know they are either: lazy, ignorant or hostile towards their (potential) clients.

This set of possibilities spans all cases and none is actually a positive signal.

Companies (and any entity that has an online presence for that matter) are entitled to know what people are doing in their platform and use any appropriate tool for that purpose. They are not entitled to share that with anyone without the explicit warning and approval of their users.

The Web as a digital predation ground where the amoral fleece the ignorami must stop.

While (commercial) life is not exactly an ethical showcase, the digital version as it has come to evolve is particularly out of kilter with common norms.


How many do you really think are hostile?

Your average person at your average company will one day think, "how are people finding out about us?". They do a Google search for how to answer this question for their company and find Google Analytics.

This is certainly not hostile. May be slightly ignorant, but can you blame them?


For sure I have come across websites where my eyes rolled ("come on guys, I know you are better than this").

But the web has not been transformed from a web of users to a web of data mined "product" without very conscious moral choices by many commercial actors.


Government web sites also use Google Analytics. Which means we are tracked much beyond our shopping profile.


"They are not entitled to share that with anyone without the explicit warning and approval of their users."

So using the legally required (GDPR) consent management (cookie banner), where the user has the chance to opt out of any tracking, would make them not "lazy, ignorant or hostile" anymore?

I think users should have tight control over their own data and what they share, but being against all 3rd-party ad and analytics vendors would work against digital user acquisition for 99% of websites out there.


For those who look for an alternative https://plausible.io is a great replacement.


A nice list of many more analytics solutions: https://european-alternatives.eu/category/web-analytics-serv...


Plausible is OK but needs work. For example, it isn't even multilingual.


Or if you're after free analytics, Cloudflare has something. Should be GDPR compliant since they don't use cookies or local storage.


GDPR has nothing to do with cookies or local storage. They are just mediums that are potentially impacted by GDPR.

GDPR simply makes collecting personal data without consent illegal. This is why a lot of American centric sites block us from accessing them, they want your data, and they don't want to ask for it.


> This is why a lot of American centric sites block us from accessing them, they want your data, and they don't want to ask for it.

Also, requiring it when it is not technically required for the product is illegal. So even throwing up a splash screen for EU visitors with a single "allow all" button would be illegal.

The GDPR is actually a quite well-designed law for what it tries to do; it's just that enforcement lags behind.


> This is why a lot of American centric sites block us from accessing them, they want your data, and they don't want to ask for it.

Or they just think that the costs to adapt their solution, or any law-infringement implications, aren't worth the effort.


You say "or" but then just give examples of what I said.

> law infringement implications

Those come from using people's data in ways that you have not asked permission for. They don't want to ask for it because it's quite hard to spin "we want to mine your data for Cambridge Analytica-style social manipulation".

Adapting isn't that difficult; the cost comes from people saying no. They don't want to give that option.


Yes, well, I don't assume bad intentions from every single website there; for some it just doesn't justify the costs and possible headaches.

> You say "or" but then just give examples of what I said.

> law infringement implications

No, by "law infringement implications" you mean them actually selling your data; I mean some law office running after non-compliant companies and attempting what is literally money extortion against them.



I generally found it much less accurate than something like Plausible; it seems Cloudflare's default analytics are more about where requests are coming from.


> Should be GDPR compliant since they don't use cookies or local storage.

That's not how it works. If there's personal data being transferred to the US, you are in violation according to the Schrems II ruling. If you only collect non-PII, you should be fine. Make sure, though, that your definition of PII matches the regulator's definition.


GDPR is not about cookies or local storage. It's about knowing users' personal data and doing things with it.


The only personal data that you can get from HTTP requests without doing tracking or fingerprinting is the IP address, which Cloudflare also isn't using.


If data about an EU citizen goes outside the EU, it is illegal.


Cloudflare is also questionable when it comes to GDPR. Lots of folks conflate privacy and cookies with GDPR. Compliance is much more than that.


Cloudflare Web Analytics is extremely simplistic and does not allow for any persistent identification of users or storage of personal information. It uses HTTP Referrers to count visitors and that's it.

One could argue that since it's a US-based company it can't be Schrems II compliant, but you can make that argument about a lot of things.


As a US-based company, they process (even if they don't store) the IP address. As such, the personal data of the EU users is transmitted under the control of the US Surveillance Act. No SCCs nor commercial contracts can shield this data.

You might have a legitimate interest in processing the IP, but because of the aforementioned issues, you cannot provide sufficient controls nor protection of Personal Data.

As such, using Cloudflare as your Data Processor, exposes You, the Data Controller, to DPA scrutiny. As always with GDPR/DPA and EU, whether it is illegal/non-compliant depends on each DPA.

https://medium.com/@christhaefner/shopify-illegal-in-germany...


Great for plausible.io of course. But what is the difference for the end user?

- GDPR: hosted in the EU vs the US, so your data is traveling less far. The things Plausible can do with the data are more or less the same.

- No cookies: I don't see the point of that, tbh; they will probably perform even more invasive tricks like fingerprinting to replace the cookie requirement.

Bottom line: the website visitors' data is still logged, stored and tracked - only now by a different actor.


It's like two dudes developing the solution and, more importantly, charging you for it. If you don't see the radical difference in incentive structures, then I don't know what to tell you.


Sorry, had a bit of time left today. It's more like 7 dudes, and their whole proposition is underwhelming TBH. Mostly gratuitous statements against the ruling order. Half their website is a rant against the 'capitalist' competition. And the whole Christmas tree of doing good is on display. But nothing really sticks:

- Simple and easy: wait until the product matures.

- Open source: but no foundational governance, like Apache for example.

- They promise never to sell to investors, but nothing is in place to actually prevent that from happening. Note this is commonly done via a social enterprise.

- 45 kg reduction of CO2 compared to Google per average website(!): a clear violation of EU law (2006/114/EG) in my opinion.

- They suggest proxying their service to circumvent consumers who actively block traffic to Plausible. This is OK, because they are good.[0]

[0] https://plausible.io/docs/proxy/introduction


> they will probably perform even more invasive tricks like finger printing to replace the cookie requirement

It's clear you didn't even bother to look at Plausible's data policy [1] before assuming what it does and doesn't collect. The TL;DR: it does not fingerprint, and it does not collect any identifiable information, be it about your device or your person.

> Bottom line, the website visitors data is still logged, stored and tracked - only now with a different actor.

Only basic device info is logged (not even IP addresses are stored). And it's very easy to self host so that different actor may be yourself.

[1]: https://plausible.io/data-policy#first-thing-first-what-we-c...


I indeed do not know Plausible and any of their motivations.

Google Analytics also does not provide PII to its end users per se. But I have seen many tools and solutions do just about anything to circumvent that: merging analytics with transactional data and site logs, adding company info to visitor data. There is an entire industry there.

So an imaginable use case would be to self-host it and intercept to circumvent the limitation.

The reason why I am so cynical is not because of the motivations of Google Analytics or Plausible. It is what motivates the end users, the companies who are using these statistics.


I do know Plausible, and their motivation is to make a sustainable business providing basic web analytics, which is why they charge for their service and Google doesn't. The data they provide to the users of their service is like an order of magnitude less detailed than what Google provides.

I get the cynicism about the industry in general since Google led this merger between web analytics and advertising, but there are plenty of providers in the analytics space that aren't following that path.


Plausible can also be self-hosted, unlike GA.


But then you still do the same thing, except you host it yourself. Meaning: it is installed and left running for years without updates and monitoring. In that case I'd rather have Google handle things.


I've finally come to the conclusion that user tracking is generally a poor practice and should be regulated.

As a web developer, I didn't see it as a big problem. We always do it to maximize ad revenue, find out where users leave to increase conversion rates, and simply to improve UX. But even when the intent is to improve UX, tracking is inappropriate.

Imagine if a robot vacuum recorded videos of your home and uploaded them so that a bunch of ML engineers could watch and use them to improve the algorithm. Or the videos from your car's cameras (both inside and outside). I mean, I wouldn't be surprised if this were already happening, but it's a disturbing thought and should be regulated.

We can certainly develop functional services without tracking users.


"Any reason why my Xiaomi Robot Vacuum uploads 11.5GB of data per month to the internet?" https://www.reddit.com/r/Xiaomi/comments/9tgyrg/any_reason_w...


> Imagine if a robot vacuum recorded videos of your home and uploaded them so that bunch of ML engineers can see and use it to improve the algorithm

No need to imagine, they do:

https://homesupport.irobot.com/s/article/964


> Imagine if a robot vacuum recorded videos of your home and uploaded them so that bunch of ML engineers can see and use it to improve the algorithm.

https://www.technologyreview.com/2022/12/19/1065306/roomba-i...

Yup, happening. There was recent news about a lady whose vacuum took a pic of her on the toilet, which was then leaked.


Except you are not in your home. You are in an app, a commercial property. Imagine trying to say that grocery stores are not allowed to have security cameras, or track which aisles are busiest, or count the number of each item sold.

All these things are under attack. I agree that cross-site tracking for ad purposes is bad, but this obsession with privacy goes too far. If you run around outside naked, sorry, you don't get to demand that no one look. There are private spaces and non-private spaces, and I don't believe in eliminating non-private spaces.

edit: and to clarify, an app on your home computer controlling your lights or appliances should be a private space with opt-in usage tracking for UX improvement; a server on the internet that you are interacting with is not a private space. While you shouldn't be allowed to track across servers, yes, I believe the server owner has every right to anonymously track the views and areas of the website that people spend time on, and certainly they have every right to track purchases and do analytics on them.


Advertising is the big gotcha here. What I think we need is for more and more companies to own their advertising rather than farming it out to third parties. We'd get better advertising AND it would be less intrusive.


It just scales better to have a web intern paste a GA code snippet into the website and that's that - you get revenue and web analytics all in one.

If you have to set up your own server (or at least your own subdomain that points to a GA server IP), then it's more likely to go wrong. I'm sure it'll happen, though.


It's almost certainly happening.


If you need a powerhouse like Google Analytics and are not afraid of a complex UI, go with Matomo. Even better if you self-host and have people to support it.

If you want something lighter that is just a turnkey solution but lets you grow (collecting more data for users who gave you consent, or being super strict about privacy without consent), then go with Wide Angle Analytics (our product).

The time when GA was the only option is long gone.


Luckily for them, Google has basically forced everyone to stop using Analytics as of July this year (I don't consider GA4 to be a replacement).


It is deeply bizarre how much worse GA4 is as a product. I don't understand it.


It's much worse in order to comply with onerous EU regulations. They make it painstakingly useless.


And, as a EU citizen, that's a good thing.


I am so grateful for the progressive policies in Europe that help the entire globe.


Me, too! Although I suspect you're an American and are being sarcastic...


There are millions of people in this industry that feel moral choices should not stand in the way of lining their pockets... Ridiculing the laws of countries and people that happen to have some traces of a moral compass shows that basically people have every reason to be suspicious.


I am completely sincere.


Yes. Every time I see a cookie consent dialog, I do a quick "thumbs up" to our pals in Europe.


And you don't consider that the website puts up a consent dialog because they want to (ab)use your personal data?

No need to put a consent dialog if you are not stalking people.


I see those as a dark pattern protest against reasonable data protections. I get why some might hate them, but it's just a mark of being against users.


Missing a /s ?


No.

I wish the US had stronger antitrust enforcement .


Good. I hope the same authority does a round of fines for companies using noncompliant tracking opt out UX too. A nice chunk of total revenue as a fine without prior warning for anyone showing the "Accept all/Show purposes" question would be delicious.


To get the process started for a specific company, submit a complaint to the supervisory authority of the country the company is based in. Contacts: https://edpb.europa.eu/about-edpb/about-edpb/members_en

As an example, the article mentions these specific audits were triggered by complaints by NOYB.


That's why we built Usermaven.com, a privacy-friendly website and product analytics tool.

Our website analytics module is simple and gets the job done in one single easy-to-use dashboard.

However, if you want to dig deep, you can use funnels, journeys and other features to get more insights out of our analytics.

Usermaven collects all client-side events automatically, so it makes it really easy for marketing teams to get insights without involving devs.

We also offer simple ready-made reports for SaaS businesses to get product insights.


> For purposes of data protection laws, Userrmaven Inc., a company duly incorporated and organized under the laws of the United States of America, having its registered address at 2055 Limestone Road STE, 200-C, Wilmington, Delaware 19808, is the “data controller”

> To integrate your website or SaaS app with Usermaven, you'll need to add a simple tracking script into the Header (<head></head>) section of your website. Make sure this snippet is present on every page that you want to track.

(The tracking script's URL is https://t.usermaven.com/lib.js)

So similar issues as with Google Analytics – site visitor's data is being shared with an US company.


Do you have any info page for how you implement stuff like funnels, journeys, etc. without storing any PII?

Also this paragraph from your GDPR page had me scratching my head a bit:

> Usermaven agrees to abide by the standard contractual clauses where data is transferred from the EU to the US.

Is that written before Schrems II?


Well, it was only a cat-and-mouse game here (with a local exemption for France, for example, that provides an exit pass but renders the tools without much interest afterwards).

The focus on Google Analytics is really funny because plenty of other companies use similar tech to track users (Pardot pixel, HubSpot, etc.), and both parent companies are US-based, so a similar 'transfer to the US' is being made with much more PII than Google Analytics.

(NOYB is probably coming for you as well as for Facebook.)


Good news then, Google has deliberately and bizarrely broken its API, so thousands or possibly millions of legacy sites will never correctly report their analytics again.


I'd like to note here that NOYB seems to be doing great work, and is one of the very few institutions I donate to. I think they're worth a donation:

https://noyb.eu/en/donations-other-support-options


Yes, you should absolutely not be using Google Analytics. They don't need more data, your users don't want to see cookie banners and most of you really don't need 99% of the data that you can filter through...

I can't recommend Fathom (https://usefathom.com) enough. They have a huge focus on privacy-first tracking. You don't need to show a cookie banner and you can still track events etc.

If you want $10 credit for signing up, use https://usefathom.com/james but otherwise, https://usefathom.com

Seriously, Google Analytics sucks. Use anything other than that.


While I appreciate the push for privacy and anti-tracking, ultimately the tools to prevent tracking are in the hands of users and organizations. The concept that countries have jurisdiction, or even exist, within the confines of the web is a laughably antiquated idea projecting itself into a realm where it doesn't belong. Google and all of the usual suspects will continue to collect information about the public in all of the ways that they want, while the naive public believes in some false notion that their leaders are protecting them from the big bad wolf. If you don't want to be tracked, the only person who can prevent that is you. Government agencies are the Keystone Cops of this world. All they're doing is a Chinese fire drill.


I'm afraid this is not going to help much.

Instead, we should have a law against a panopticon.

I wonder how fast we'd have such a law if Google were a Chinese company ...

Perhaps the way to get rid of Google Analytics is thus to start a Chinese company and make everybody use their analytics tool.


Nothing surprising; this was kinda clear already not too long after the regulation was passed, and since then quite a lot of court decisions bordering on that topic have painted a very clear picture of "it's not really compatible with the law/regulation, but you might get away with it anyway".

Also, given some scammy things Google was found to be doing in their ad business, and personal experiences people I know had when running different statistics and ad providers alongside Google and noticing gross divergence, I _personally_ really wouldn't trust Google Analytics or Ads at all if I were a business.


The heading seems very strong considering this is a governmental agency, and since they audited a "version of Google Analytics from 14th of August 2020" and presumably not GA4, which works differently.


Rereading it, now it seems like "companies" in the heading only refers to the three fined companies, and that the decision may be applicable to other companies.


OK, what is it that Google/Facebook analytics provides that has people so obsessed with harming their users' privacy and slowing their page loads?

I really don't get it: you don't need to sell out your users to Google, Facebook, etc. to get page view counts, time page loads, get browser statistics, etc. What is it that site developers actually think they're getting out of abusing their users?


I am using a self-hosted Plausible [1] instance, which is GDPR-compliant out of the box with no cookies required. I am super happy with it. The only downside is that you need to run Postgres and Clickhouse which is overkill for my small sites (an option that only uses SQLite would be great). I don't want to track my users. I just want to see which pages get traffic. Sometimes I am also curious about where visitors come from (by country) and what devices they are using.

In a newer update, they allow region tracking based on cities. I think this is too much information. I did not enable this and hope they won't add other more intrusive features.

[1] https://plausible.io/


I'm using the hosted version of plausible.

After I hit the front page of HN two times in a month, their billing warned me of overusage. One email in which I explained the situation, and I got a very friendly email back, from a human, in which they allowed me to stay on my small plan despite the overuse.


It used to be that Google needed GA to see how users used a site.

But I think they just track at the Chrome-level now.

So using GA is really just a way for you to see what Google sees about your site.

Blocking GA use... I don't think it really hurts Google any more. I think they get all they need -- more than they ever got through GA -- through trackers in Chrome.


I think companies should stop using fixed navigation banners, but we can't all get what we want now, can we?

Like, holy crap, I agree with you, but your website is unreadable with that may-as-well-be-a-banner-ad of a navigation bar that keeps popping in and out of existence every time I scroll down.


Seems like we have a lot of good GA alternatives on here already. Thought I'd add one: goatcounter.com

Not mine, and I only just started using it. But it's easy to implement, and shows "just enough" analytics data for me. Nice simple option.


Interesting. I thought Google had built the tooling needed to keep European data in EU servers ages ago for compliance on this topic. Maybe I'm thinking of just Google Cloud?



I put Google Analytics on my forum because I thought that maybe having it would help it be found in Google Search.

Google Lighthouse immediately started pissing and bitching about slow page load times because it had to wait for Google Analytics to load.

My site still does not really show up much in Google Search.

I binned Google Analytics because it basically did fuck all of any use.

I don't have one of those fucking idiotic cookie popups, because it doesn't need one, no-one needs one, and they're entirely meaningless noise.


I wonder if this also will apply to Google's ad related spyware? Going after just Analytics seems like quite a small step.


This authority does not go after anybody. They received a complaint from NOYB regarding these specific companies and were therefore forced to investigate specifically them.


Could Google outsmart them by hosting some of the data in the EU, but still keep spying on customers?


I'm not saying it is morally correct to use Google Analytics. But I still find it amusing that the Nordic countries see it as OK for everyone to know everyone else's salary, while it is not OK for Google to know your IP.

https://www.dailyscandinavian.com/income-tax-transparency-no...


I wouldn’t call this amusing at all. That’s consistent with a commitment to protection of the common citizen. Salary transparency benefits the worker.

If I know my colleague doing the same work makes more money than me, that gives me leverage to request and receive a raise. If I know the CEO of my company makes 1000x my salary, that gives the workers collective bargaining leverage.

The only people who benefit from keeping their income private are the wealthy.


Those looking for alternatives can take a look at my book which evaluates 15 different options: https://gaalternatives.guide

I also have a google sheet listing the basics of each of those tools: https://gaalternatives.guide/sheet


Google or other providers could mitigate this by allowing the Analytics subscriber to configure which fields to "exclude" or "include" when logging requests.

Regulators are only going to get tougher with service providers; it's wise to prepare.


This has also happened in Denmark.


Currently using self-hosted Matomo Analytics. Very easy to use and intuitive.


What analytics solutions are there that you can host yourself?


I tried the self-hosted version of Matomo [1][2] a few years back but I remember it was a bit underwhelming for the effort required to set it up.

[1] https://matomo.org

[2] https://github.com/matomo-org


As long as it is via domains I am able to block, it is fine.


This is yet another ruling, after Austria, Finland, France, Denmark and Italy:

https://wideangle.co/blog/is-google-analytics-illegal-under-...

The writing was on the wall for years now.

Some DPAs like CNIL fire warning shots first, giving 4 months to comply. Then the fines keep rolling.


So, a dumb question. What's the easiest way to run privacy friendly analytics on static github pages? "Privacy friendly" as in unambiguously no need for cookie/gdpr permission popups. "Analytics" can be as simple as page loads per day count. Anything beyond that is a bonus.


Put it behind CloudFlare free plan. You’ll get total uniques for the domain. No hosting or JS required.


I guess you can use any JS snippet integration available. There are plenty of alternatives:

https://european-alternatives.eu/category/web-analytics-serv...

I'm the co-founder of Pirsch (pirsch.io), so if you have any questions regarding analytics (any, not just ours), let me know. For our solution I can assure you that it's GDPR compliant and doesn't require a cookie consent banner.


Are those typically blocked by adblockers?


Yes, all of them. For a non-blockable approach you can use a proxy on your own domain or use a server-side integration.

https://docs.pirsch.io/get-started/proxy


> According to the data protection regulation, GDPR, personal data may be transferred to third countries, i.e. countries outside the EU/EEA, if the European Commission has decided that the country in question has an adequate level of protection for personal data that corresponds to that within the EU/EEA. However, the CJEU ruled through the Schrems II ruling that the United States could not be considered to have such an adequate level of protection at the time of the ruling.

- European Court of Justice (CJEU)

I always thought that by asking for permission in the privacy statement (and analytics cookies are usually also explicit in the cookie banner), it would be OK.

But indeed, even if you refuse the analytics cookies (I do that automatically; who doesn't?), that still does not stop the website from transferring PII to Google Analytics. I am assuming that here, as I'm not a user of Analytics, but I suppose it will still work without cookies, maybe just a little less accurately.


Thing is, the IP address is also considered personal information (since it can be combined with other data to identify a person), and it is getting transferred with every request.

The CJEU ruling about the US is mainly due to the fact that US service providers have to hand over all data if US government agencies request it.


Yes - it is more or less the same. In the EU they have to provide the data to the local authorities as well.

Using any analytics, whether hosted in the US, in the EU, or hosted by myself, will involve moving and storing PII.

To be clear, I agree we should keep PII in the EU. But I doubt that an EU solution will improve anything for the end user.


I only use GA because our ad provider, Mediavine, requires it


That's cool and all, but in the end you are ultimately responsible for it; you can't hide behind your ad provider if the GDPR police come after you.


I don't care about GDPR


This entire thread is discussing a data protection authority’s decision which was based on the GDPR. It wasn’t a blog post arguing against GA due to concerns about Google or similar.

If GDPR is irrelevant to whatever you’re trying to say, I think you’re in the wrong thread.


I wasn't "trying to say" anything beyond why a lot of websites are forced into sticking with GA. In the content business industry you have to have GA to work with ad providers or if you want to ever sell your site it is a very important part of the due diligence

You can argue with that all you want but its just the reality of the industry

And am I saying it is an excuse or anything? No, I'm just stating how things are

Sad I have to put so many disclaimers in such a simple comment but people like to read into things that aren't there or jump to conclusions.


I had expected such orders after GDPR went into effect. I guess I was young and naive back then...


Government is a lot of things, but seldom fast.


Just another thing that's going to leave the EU in the stone age, falling further and further behind the USA economically.

15 years ago, US and EU GDP per capita were about the same. Now the USA is 50% higher. Even West Virginia is richer per person than France.


If that "stone age" means I'm less likely tracked and logged by a US megacorp to whom I never inteded to share information like what buying and what my medical problems are, GOOD.

I hope all Alphabet IP ranges get blackholed on the ISP level if they continue to perpetuate this hellscape we call targeted advertising.


Kneecapping tech for amorphous "privacy" concerns is very much a valid choice. I'd bet on the countries that don't make that trade off.


If the subject of your bet is "whichever system can extract the most wealth in the most efficient manner from its people at the cost of their wellbeing", then you are correct.


Why?

Then what are the alternatives to Google Analytics? Google is a big giant collecting all the world's data, and alternative platforms are doing the same. We are not secure anywhere, I think.

Privacy is already broken; there are no options.


> Privacy is already broken; there are no options.

And thus you question every attempt to fix it?


Yes, we need analytics for our website, even for ranking on search engines. There is no other option.

GA, GTM, and GSC are the top tools we have to use if you are a business website owner.


Spin up your own analytics from logs and voila!


There are plenty of business websites that rank on search engines and don't use GA.


> then what are the alternatives

Don't stalk people is the alternative.


I am not stalking anyone. I am asking, not suggesting anything. Just read the comment first and react to that. No manners.


Because of the law in the US that gives the government the right to all data stored on any server that a US company or its subsidiaries own. That clashes with the EU GDPR and with the fact that Sweden categorizes IP addresses as personal data.

I.e., they can switch to a European vendor that does tracking and analytics.


> the fact that Sweden categorizes IP addresses as personal data

Static IP addresses are considered identifying by the EU court[0]. There have been several verdicts where the EU court has ruled them subject to GDPR.

[0] https://curia.europa.eu/juris/liste.jsf?language=en&num=C-70...


Yes? You can still store data within Sweden/the EU that contains GDPR-covered data; you just need to have a valid reason and comply with the GDPR rules, i.e. remove the data when you no longer need it for the reason you were saving it. Storing the data outside the EU is never OK.


But if you wash the IP and any other PII then you can store it anywhere? So why store IP? Is it because GA doesn't offer an option to wipe PII from the messages?


This is where it gets tricky, and why I need to find out more details about the new rulings. Google does not store the IP when the user is from the EU, but this still seems inadequate to the IMY.

https://support.google.com/analytics/answer/12017362?hl=en


So long as all PII is cleared client-side and not merely dropped before storage, I can't see any issues with GDPR?


Yeah, this is a great option also.



