
Fun fact: this is only valid for domains that have a notion of "selfness", i.e. that there is such thing as an "identity matrix" for the quantities.

Consider the following square matrix:

            TSLA AAPL GOOG MSFT
    Alice | 100   5     0    1
    Bob   |  0    30   100   5
    Carol |  2    2     2    2
    Dan   |  0    0     0  1000
An input vector of stock prices gives an output vector of net worths. However, that is about the only way you can use this matrix. You cannot transform the table arbitrarily and still have it make sense, such as applying a rotation matrix -- it is nonsensical to speak of a rotation from Tesla-coordinates to Google-coordinates. The input and output vectors lack tensor transformation symmetries, so they are not tensors.
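To make the "prices in, net worths out" reading concrete, here's a minimal numpy sketch (the prices are made up):

    import numpy as np

    # Rows: Alice, Bob, Carol, Dan; columns: TSLA, AAPL, GOOG, MSFT
    holdings = np.array([
        [100,  5,   0,    1],
        [  0, 30, 100,    5],
        [  2,  2,   2,    2],
        [  0,  0,   0, 1000],
    ])

    prices = np.array([200.0, 170.0, 140.0, 400.0])  # hypothetical share prices
    net_worths = holdings @ prices                   # one net worth per person
    print(net_worths)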

This is also why Principal Component Analysis and other data-science notions in the same vein are pseudoscience (unless you work with the logarithms of the quantities, but nobody seems to recognize the significance of unit dimensions and of multiplicative vs. additive quantities).



There's a little more nuance:

1. Technically, the table you shared is better thought of as a two-dimensional tensor, rather than a "graph-like matrix" -- which as you point out must be a linear map from a (vector) space to itself.

2. While not technically "Principal Component Analysis", one could do "Singular Value Decomposition" for an arbitrarily shaped 2-tensor. Further, there are other decomposition schemes that make sense for more generic tensors.

3. (Rotations / linear combinations in such spaces) Given a table of stock holdings, it can be sensible to talk about linear combinations / rotations etc. Eg: The "singular vectors" in this space could give you a decomposition in terms of companies held simultaneously by people (eg: SaaS, energy sector, semiconductors, entertainment, etc). Likewise, singular vectors on the other side would tell you the typical holding patterns among people (and clustering people by those, eg. retired pensioner invested for steady income stream, young professional investing for long-term capital growth, etc). As it turns out, this kind of approximate (low-rank) factorization is at the heart of recommender systems.
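To make points 2 and 3 concrete, a rough numpy sketch of the low-rank factorization idea (reading the singular vectors as "sectors" and "holding patterns" is the hopeful part; the math only guarantees a best low-rank fit):

    import numpy as np

    holdings = np.array([
        [100,  5,   0,    1],
        [  0, 30, 100,    5],
        [  2,  2,   2,    2],
        [  0,  0,   0, 1000],
    ], dtype=float)

    # SVD works for any m x n matrix, square or not
    U, s, Vt = np.linalg.svd(holdings, full_matrices=False)

    # Rank-2 approximation: keep the two strongest "patterns"
    k = 2
    approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    # Rows of Vt are candidate "holding patterns" over stocks;
    # columns of U say how much each person loads on each pattern
    print(np.round(approx, 1))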


Yeah, this also fails if the table is not square, since the values can't represent edges any more. So it's more that the rows and columns should index the same "thing".

By the way, by changing the graph representation we can give meaning even to non-square matrices, as described in this article: https://www.math3ma.com/blog/matrices-probability-graphs
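The gist, in a toy sketch: read a non-square matrix as a weighted bipartite graph, with row nodes on one side, column nodes on the other, and nonzero entries as edges:

    import numpy as np

    # 2x3 matrix: rows are one node set, columns another,
    # entries are edge weights in a bipartite graph
    M = np.array([
        [2, 0, 1],
        [0, 3, 0],
    ])

    edges = [(f"r{i}", f"c{j}", int(M[i, j]))
             for i in range(M.shape[0])
             for j in range(M.shape[1])
             if M[i, j] != 0]
    print(edges)  # [('r0', 'c0', 2), ('r0', 'c2', 1), ('r1', 'c1', 3)]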


I think the most common application is describing mesh connectivity in Finite Element Methods, where each entry in the matrix represents the influence each node has on the others. Basically, any N^2 table can be constructed to describe the general dependency of components in any simulation or system in general.
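A toy sketch of that idea, assuming a 1-D mesh where each node only influences the nodes it shares an element with (the +1/-1 weights are a stand-in for real element contributions):

    import numpy as np

    # Toy 1-D mesh: 5 nodes, 4 elements, each element joins two nodes
    elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
    n_nodes = 5

    # N x N influence table: nonzero where two nodes share an element
    K = np.zeros((n_nodes, n_nodes))
    for i, j in elements:
        K[i, i] += 1.0
        K[j, j] += 1.0
        K[i, j] -= 1.0  # off-diagonal coupling between neighbours
        K[j, i] -= 1.0

    print(K)  # tridiagonal: each node couples only to its mesh neighbours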


Another notable example is the PageRank algorithm [0]: take the graph whose nodes are web pages and whose edges are the links between them, build its adjacency matrix, and use the algorithm to sort the pages by "popularity" (intuitively, which pages have more links pointing to them).

In most cases you start with a graph and consider the corresponding matrix; going the other way is less useful in practice, except in some cases as explained in the article.
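A minimal power-iteration sketch on a made-up four-page web (real implementations handle dangling nodes and sparse storage more carefully):

    import numpy as np

    # links[i][j] = 1 if page i links to page j (made-up 4-page web)
    links = np.array([
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [1, 0, 0, 1],
        [0, 0, 1, 0],
    ], dtype=float)

    # Column-stochastic transition matrix: follow a random outgoing link
    M = (links / links.sum(axis=1, keepdims=True)).T

    d = 0.85                 # damping factor
    n = M.shape[0]
    rank = np.full(n, 1.0 / n)
    for _ in range(100):     # power iteration
        rank = (1 - d) / n + d * (M @ rank)

    print(rank)  # pages with more (and better) inbound links score higher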

[0]: https://en.wikipedia.org/wiki/PageRank


You make a good point about which types of matrices a graph representation makes sense for, but it seems a bit much to say that PCA is pseudoscience?

If you had a lot of people and a lot of stocks, a low-rank representation of the matrix (probably not PCA per se with that particular matrix, but something closely related) could convey a lot of information about, e.g., submarkets and how they're valued together. Or not, depending on how those prices covary over time.


I disagree... there are more ways to creatively extract information from this matrix.

For instance, you can normalize along the columns, and build a "recommender system" using matrix factorization.

With that, when a new person comes along with a portfolio, the system will output, for each asset they don't already hold, a probability that they will acquire it.

It's the (very basic) idea of how Netflix recommends movies.
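A sketch of that idea, using a plain SVD as the factorization (real recommenders use regularized factorizations on much sparser data, but the shape of the computation is the same):

    import numpy as np

    # Rows: people, columns: assets; 1 = holds it, 0 = doesn't (toy data)
    R = np.array([
        [1, 1, 0, 0],
        [0, 1, 1, 0],
        [1, 0, 0, 1],
        [0, 0, 1, 1],
    ], dtype=float)

    # Low-rank factorization: people and assets in a shared latent space
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    k = 2
    P = U[:, :k] * s[:k]   # person factors
    Q = Vt[:k, :]          # asset factors

    scores = P @ Q         # reconstructed affinity scores
    # High scores where R was 0 suggest assets a person might acquire next
    print(np.round(scores, 2))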


When I try to get this point across about techniques like PCA, I like to show that the measurement units strongly affect the inference.

Really, if your conclusions change depending on whether you measure in inches or centimeters, there’s something wrong with the analysis!
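A quick numpy demonstration with made-up height/weight data: rescaling one column from metres to centimetres swings the first principal direction, even though nothing about the data has changed:

    import numpy as np

    rng = np.random.default_rng(0)
    height_m = rng.normal(1.7, 0.1, 200)  # heights in metres
    weight_kg = 60 + 40 * (height_m - 1.7) + rng.normal(0, 5, 200)

    def first_pc(X):
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Vt[0]  # direction of highest variance

    X_m  = np.column_stack([height_m, weight_kg])
    X_cm = np.column_stack([height_m * 100, weight_kg])

    print(first_pc(X_m))   # dominated by weight's larger variance
    print(first_pc(X_cm))  # now dominated by height, purely from the units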


I would disagree and here is why:

> When I try to get this point across about techniques like PCA, I like to show that the measurement units strongly affect the inference.

In such a case the problem is not with PCA but with its application. PCA is just a rotation of the original coordinate system: it projects the data onto new axes aligned with the directions of highest variability. It is not the job of PCA to parse out the origin of that variability (whether it comes from different units or from different effects).

> Really, if your conclusions change depending on whether you measure in inches or centimeters, there’s something wrong with the analysis!

To get a statistical distance one should: subtract the mean if the measurements differ in origin; divide by the standard deviation if the measurements differ in scale; rotate (or equivalently compute the Mahalanobis distance) if the measurements are dependent (co-vary). PCA itself is closely related to the Mahalanobis distance: Euclidean distance on PCA-transformed data (once the components are also scaled by their standard deviations, i.e. whitened) is equivalent to Mahalanobis distance on the original data. So, saying that something is wrong with PCA because it doesn't take units of measurement into account is close to saying that something is wrong with dividing by the standard deviation because it doesn't subtract the mean.
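That equivalence is easy to check numerically; a sketch, assuming the PCA transform includes the whitening step (scaling each component by 1/sqrt of its eigenvalue):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.multivariate_normal([0, 0], [[4.0, 1.5], [1.5, 1.0]], size=500)

    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)

    # PCA whitening: rotate onto the eigenvectors, scale by 1/sqrt(eigenvalue)
    vals, vecs = np.linalg.eigh(cov)
    Z = (Xc @ vecs) / np.sqrt(vals)

    # Euclidean distance in whitened space vs Mahalanobis in the original
    diff = Xc[0] - Xc[1]
    d_mahal = np.sqrt(diff @ np.linalg.inv(cov) @ diff)
    d_euclid = np.linalg.norm(Z[0] - Z[1])
    print(d_euclid, d_mahal)  # agree up to floating-point error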


Is the effect of measurement units eliminated by applying something like zero-mean, unit-variance normalization prior to dimensionality reduction?


I dunno, there are some semi-useful things you can do.

For example, the transform from (Alice, Bob, Carol, Dan) to (Male, Female) is linear -- it's another matrix that you can compose with the individual-ownership one you have here.

Or, call your individual-ownership matrix A, and say that P is the covariance of daily changes to the prices of the four stocks listed. Then A P A' is the covariance of daily changes to the people's wealths. The framing as linear algebra hasn't been useless.
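A quick sketch of that covariance propagation, with a made-up (diagonal) price covariance P:

    import numpy as np

    A = np.array([           # holdings: rows are people, columns stocks
        [100,  5,   0,    1],
        [  0, 30, 100,    5],
        [  2,  2,   2,    2],
        [  0,  0,   0, 1000],
    ], dtype=float)

    # Hypothetical covariance of daily price changes (symmetric PSD)
    P = np.diag([4.0, 1.0, 2.0, 9.0])

    wealth_cov = A @ P @ A.T  # covariance of daily changes in net worth
    print(np.round(wealth_cov))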

I kinda get what you're saying though. Like, why would powers of this matrix be useful? It only makes sense if there's some implicit transform between prices and people, or vice versa, that happens to be an identity matrix.

You can make up a story. Say the people can borrow on margin some fraction of their wealth. Then say that they use that borrowing to buy stock, and that that borrowing affects prices. Composing all these transforms, you could get from price to price, and then ask what the dynamics are as the function is iterated.

But, ok, "I'm just going to do an SVD of the matrix and put it in a slide" isn't going to tell anybody much.

Maybe there's a use for a rank-one approximation to this system? Like, "this is pretty close to a situation where there's a single ETF with those stocks in these proportions, and where the people own the following numbers of shares in the ETF"? Maybe if you have millions of people and millions of stocks and want to simulate this "stock market" at 100Hz on a TI-83?

I dunno. You can make up stories.


Is there any domain where you can apply arbitrary transformations to a table and still have it make sense? I feel there is some depth to your argument that I can't infer just from the content of your comment, and I'd be keen to look into it further. E.g. in your domain example, would a currency be a coordinate, so you could move to alternate currencies? Would that be the identity you're looking for?



