
I always get excited when I see these "tutorials for dummies" (like me). Like "finally, I get to take an evening to understand this concept that's eluded me for years." Generally, I get let down. This time is no exception.

They always start off well, then inevitably there's a concept or key terminology that gets glossed over without sufficient explanation.

"The random variable is described by the probability density function. The probability density function is characterized by moments.

The moments of the random value are expected values of powers of the random variable. We are interested in two types of moments"

This is where my journey ended this time. OK, so is that the expected value of the exponent of the random value, or the expected value of the random value raised to some power... and what makes the power of a random value so special versus just operating on the random value itself?

It's like these authors get tired of "dumbing" things down at random points in their chain of thought and just decide to skip over stuff. Or, perhaps they don't understand the underlying concepts themselves and simply can't explain them to others.

Who knows. Just frustrating. At least on Udemy you can write to instructors and ask questions, unlike book authors, who aren't paid to respond.



One of the biggest challenges in learning, especially in learning something new, is that it’s a chain. If you don’t have something’s prerequisite knowledge, you can’t understand it (by definition). This means that a single insufficient explanation by the author blows it. As a general rule, it’s really hard to never mess up!

You make a good distinction between someone being paid to respond and not. A feedback loop helps out a lot here.

Failing that, you need to take this into your own hands. Honestly, you’re probably going to have to tell yourself a new story to get there. Maybe it’s having empathy for the difficulty of teaching. Maybe it’s finding some inner drive. I don’t know. But you need to look at that paragraph, accept that it’s insufficiently explained, and take responsibility for understanding it.

I’m not saying to read it over and over until you “get it”. (I don’t know why people try that, it’s kinda foolish). A simple strategy works most of the time. Read it until you find a word or phrase you don’t sufficiently understand. Maybe that’s “random variable”. Maybe it’s “probability density function”. Find an explanation for that (Wikipedia, ChatGPT, textbooks, videos). The fun thing is this algorithm is recursive. So you’ll likely run into something you don’t know again. That’s okay, just keep going. If it’s really tough, a lot of the value of a tutor is steering this depth first search.

Get each concept to the point where you understand it very well for the problem at hand. You don’t have to know everything about PDFs, but don’t hand-wave it either. After this process, you’ll be able to understand this paragraph and continue.

This may take a while! If something is in a new domain, it’s normal to spend more time backfilling knowledge than on the main content itself. That may make it not worth it for you, but it’s not inherently bad. And the next time, for something similar, it should be faster.


The pdf is just the pullback measure.

A random variable is a function X(ω) taking (e.g.) real values. In your probability space you already have an ambient measure space and an ambient probability measure P, which takes sets in the measure space to [0,1]. The pdf is then the set function P(X^-1(q)), where X^-1 is the set-valued inverse.

Ok, consider coin flips. Then X takes each element of the sample space to either 1 or -1. The set-valued inverse of 1 is the set of outcomes that map to 1. Then we take the ambient probability measure of that set.

You don’t really have to cope with measure theory in full to take this tiny step.
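
To make that tiny step concrete, here is a minimal Python sketch of the coin-flip example (the outcome names and the uniform measure are just my own illustration): it builds the set-valued inverse X^-1({1}) and sums the ambient probabilities over it.

    # Sample space with two outcomes; the ambient probability measure P is uniform.
    P = {"heads": 0.5, "tails": 0.5}   # ambient probability measure
    X = {"heads": 1, "tails": -1}      # random variable X(w) taking values in {1, -1}

    def preimage(value):
        # Set-valued inverse: all outcomes w with X(w) == value.
        return {w for w, x in X.items() if x == value}

    def prob_of_value(value):
        # Ambient probability of the preimage: P(X^-1({value})).
        return sum(P[w] for w in preimage(value))

    print(preimage(1))        # {'heads'}
    print(prob_of_value(1))   # 0.5
    print(prob_of_value(-1))  # 0.5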


I don’t think the set of people who couldn’t understand the quoted paragraph but could understand your comment is very large.


I'd be shocked if somebody knows what an ambient measure space is but doesn't understand the nth moment of a random variable.


1/ I think you are referring to the pushforward measure (https://en.wikipedia.org/wiki/Pushforward_measure): the random variable "pushes" the probability measure to its codomain. 2/ A pdf requires a stronger condition: the pushforward measure needs to be absolutely continuous with respect to the sigma-finite measure (usually the Lebesgue measure) on the codomain.
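
In symbols (my notation, just restating the standard definitions, with \lambda the Lebesgue measure on the codomain):

    (X_* P)(A) = P(X^{-1}(A)) \quad \text{for measurable } A, \qquad
    f = \frac{d(X_* P)}{d\lambda} \ \text{exists iff}\ X_* P \ll \lambda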


If anyone was frustrated like me on this concept of "moments" I found the following insightful and fascinating:

https://gregorygundersen.com/blog/2020/04/11/moments/

I have no idea how I got through undergrad and graduate school without internalizing this concept that seems so foundational. Whacky


I thought I had a pretty good grasp on this, but the idea that an infinite sum of higher-order moments uniquely defines a distribution, in a way analogous to a Taylor series, was new and super interesting! It gives credence to the shorthand that the lower-order moments (mean, variance, etc.) are the most important properties of a distribution to capture, and that they are how you should approximate an unknown distribution given limited parameters.
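
For reference (standard facts, not from the article): when the moment generating function exists in a neighborhood of 0, the raw moments are exactly its Taylor coefficients, which is where the analogy comes from, and in that case the moments do determine the distribution uniquely:

    M_X(t) = E\left[e^{tX}\right] = \sum_{k=0}^{\infty} E\left[X^k\right] \frac{t^k}{k!}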


The hardest part of teaching is that it's impossible to remember what it was like to not already know something. You can't un-know it to get back your previous perspective. So you forget all the things you take as known.

This is the problem I find with the 3blue1brown videos. They're pretty, but I never get any understanding from them. People who already know the material nod along and see all the concepts they're already familiar with presented in a neat way, but for some of us (like me) they don't generally produce understanding. Too many pre-reqs or something.


> This is the problem I find with the 3blue1brown videos. They're pretty, but I never get any understanding from them.

So relieved to learn that I am not the only one!


> Generally, I get let down. This time is no exception.

This is why I don't even bother to read through such tutorials. To understand the Kalman filter, one first needs to understand the basics of probability and then the importance of the Gaussian distribution (the Kalman filter's mathematical derivation assumes that all the probability distributions involved are Gaussian). Then one notices that a Gaussian distribution is uniquely defined once you know its first and second moments (yes, you cannot dance around introducing moments at some point). And then pretty nasty math follows :-) The Kalman filter is not an easy thing. Rudolf Kalman claimed in one of his interviews that without his filter the American landing on the Moon would not have been possible.
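
Concretely, "uniquely defined by its first and second moments" just means the Gaussian density has no parameters beyond the mean (first raw moment) and the variance (second central moment):

    f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
    \qquad \mu = E[X], \quad \sigma^2 = E[(X-\mu)^2]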


There are quite a few similarities between the Elo rating system and Kalman filters. I’ve always thought this would be a good way to teach them, because you can start with a simplified univariate case, then modify and generalize from there.
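
Roughly, the analogy (my own simplification, not a derivation) is that both methods keep a point estimate and nudge it toward each observation by a gain: Elo uses a fixed gain K, while a scalar Kalman filter computes its gain from the tracked variance. A minimal Python sketch:

    def elo_update(rating, opponent, score, k=32):
        # Elo: nudge the rating toward the observed result with a fixed gain k.
        expected = 1 / (1 + 10 ** ((opponent - rating) / 400))
        return rating + k * (score - expected)

    def kalman_update(x, p, z, r):
        # Scalar Kalman measurement update: the gain comes from the variances.
        gain = p / (p + r)  # state variance p, measurement noise variance r
        return x + gain * (z - x), (1 - gain) * p

    print(elo_update(1500, 1600, 1.0))        # rating rises after beating a stronger player
    print(kalman_update(0.0, 1.0, 0.5, 1.0))  # estimate moves halfway toward the measurement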


I would love to know more about the similarities. Can you share any resources that utilize Kalman filters in rating systems?


I feel this frustration in my bones, and I felt the same way up until the proliferation of LLMs. Just paste your comment to Claude (or ChatGPT maybe?) and it will explain everything you want to know in realtime.


Sure, if you don't care if it's actually correct or not.


There are so many explanations of moments in the training set, it should be correct.


Ha, no joke. I definitely find myself doing that more and more. Good reminder.


> They always start off well, then inevitably there's a concept or key terminology that gets glossed over without sufficient explanation.

Obligatory meme: how to draw an owl: https://www.memedroid.com/memes/detail/265779.



Yep. When I write tutorials I try very, very hard to avoid having a single moment like this. Takes more effort, but I find that it makes for a better explanation.


> and what makes the power of a random value so special versus just operating on the random value itself?

Literally four sentences later you would have found:

* The first raw moment E(X) – the mean of the sequence of measurements.

* The second central moment E((X − μ_X)^2) – the variance of the sequence of measurements.

And if you had gotten as far as the part you quoted, you would have seen an extended example of why one is interested in means and variances.


I did, of course, read that next section, and beyond, but I respectfully disagree that the expressions provided need no further explanation. For example, the author neglects to define 'X' as the set of support values for some probability distribution. That's left to the reader to figure out for some reason. Further, nowhere is 'E(X)' defined as the integral of x*f(x) dx, nor is it explained that the exponent 'k' applies only to the first 'x' term in that expression (i.e. if k=3, then E(X^3) is the integral of (x^3)*f(x) dx). How is the reader supposed to know all that?

That was left up to me to hunt down... which is fine I guess, but I certainly wouldn't say this is "from the ground up". At the very least, link to some external content that provides the necessary definitions.
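
For anyone following along, the definitions being described here, for a continuous random variable with density f, are (in the usual notation):

    E[X] = \int x\, f(x)\, dx, \qquad
    E[X^k] = \int x^k\, f(x)\, dx, \qquad
    E\left[(X - \mu_X)^k\right] = \int (x - \mu_X)^k\, f(x)\, dx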





