Hacker News | arthurcolle's comments

Modal's CUDA Book is cool

Renters paying mortgage holders to cover their mortgages, in exchange for conceptual gains like "freedom"

We've been there for a long fucking time


If you're in a market where landlords can charge the full mortgage, you desperately need more housing. Rent around here costs slightly more than the interest on the mortgage.

I don’t think there’s anywhere in the UK where a mortgage is more than the rent for the same property. Renters are generally the breadwinners in their landlord’s families.

Well in that case, yeah, I can't imagine any reason you would rent. But in my neck of the woods renting a place is a little more than half the cost of the equivalent mortgage on the same property.

You would rent because you’d lack the deposit and income a bank would require of you for the same property.

Now that I have a mortgage, it’s much less than the rent I was paying for a similar property. This is the case in most developed cities on Earth.


Where on earth is this land of beneficent landlords? Do you live in Hobbiton or on the island from Lost?

Every city is like this

Why are you making something cheap more expensive than it needs to be?

It's not cheap. It costs anywhere from millions to $100 million depending on the model. I was responding to this tradeoff:

"A 10x increase in training costs is not necessarily prohibitive if you get a 10x decrease in inference costs."

Given millions and up, I'd like that to be 10x cheaper while inference was 10x more expensive. Then, it could do research or coding for me at $15/hr instead of $1.50/hr. I'd just use it carefully with batching.


Calculating the gradient requires a forward pass (inference) and a backward pass (back propagation).

They're on the same order, with the backward pass typically about twice as expensive as the forward. So a full training step is roughly three times the cost of a forward pass.

You can't make training faster by making inference slower.
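The ratio above matches a common rule of thumb for dense transformers (an estimate, not the commenter's measured data): a forward pass costs about 2N FLOPs per token for N parameters, the backward pass about twice that, so a training step costs roughly 6N. A minimal sketch, with the 7B parameter count purely illustrative:

```python
# Rough FLOPs rule of thumb for a dense transformer (a common estimate,
# not exact): forward pass ~ 2*N FLOPs per token for a model with N
# parameters; backward pass ~ 2x the forward; training step ~ 6*N.

def forward_flops_per_token(n_params: int) -> int:
    return 2 * n_params

def train_flops_per_token(n_params: int) -> int:
    # forward (2N) + backward (~4N) = ~6N per token
    return 6 * n_params

n = 7_000_000_000  # e.g. a 7B-parameter model (illustrative)
fwd = forward_flops_per_token(n)
train = train_flops_per_token(n)
print(train / fwd)  # -> 3.0: one training step ~ three forward passes
```

This is why you can't trade training cost against inference cost at the algorithmic level: the forward pass is a fixed component of both.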


I was responding to their claim by starting with an assumption that it may be correct. I don't know the cost data myself. Now, I'll assume what you say is true.

That leaves computation and memory use of two passes plus interlayer communication.

I think backpropagation doesn't occur in the brain, since the brain appears to use local learning, though global optimization probably happens during sleep/dreaming. I have a lot of papers on removing backpropagation, Hebbian learning, and local learning rules.
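As one concrete example of a local learning rule: Oja's rule (a stabilized Hebbian update) adjusts a weight using only the pre-synaptic input and post-synaptic output, with no backpropagated error signal. A minimal sketch, with the data distribution and learning rate chosen purely for illustration:

```python
import random

# Oja's rule: a local, Hebbian-style update that needs only the
# pre-synaptic input x and the post-synaptic output y -- no global
# backpropagated gradient:
#     w += lr * y * (x - y * w)
# The subtractive term keeps the weight norm bounded (it tends to 1),
# and w converges toward the principal component of the inputs.

def oja_step(w, x, lr=0.05):
    y = sum(wi * xi for wi, xi in zip(w, x))  # post-synaptic activation
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

random.seed(0)
w = [random.uniform(-0.1, 0.1) for _ in range(2)]
for _ in range(2000):
    # 2-D inputs correlated along the (1, 1) direction
    s = random.gauss(0, 1)
    x = [s + random.gauss(0, 0.1), s + random.gauss(0, 0.1)]
    w = oja_step(w, x)

norm = sum(wi * wi for wi in w) ** 0.5
print(norm)  # hovers near 1; w points along the dominant input direction
```

Rules like this are purely local, which is part of why they're interesting as backpropagation alternatives, though matching backprop's task performance at scale remains the open problem.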

From there, many groups are publishing methods for training at 8-bit precision and below. A recent one mixed low-bit training with sub-1-bit storage for weights. The NoLayer architecture might address interlayer communication better.
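The basic building block behind most low-bit schemes is quantization: rounding real-valued weights to a small set of levels and keeping only the integer codes plus a scale. A generic sketch of symmetric per-tensor int8 "fake quantization" (not any particular paper's method):

```python
# Symmetric per-tensor 8-bit quantization: map floats to one of 256
# integer levels plus a single scale factor, then map back. Low-bit
# training schemes build on variants of this idea (generic sketch,
# not a specific paper's method).

def quantize_int8(values):
    scale = (max(abs(v) for v in values) / 127) or 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

w = [0.31, -0.87, 0.02, 1.25, -1.25]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(max_err <= s / 2 + 1e-9)  # -> True: error bounded by half a step
```

Storage drops from 32 bits to 8 bits per weight here; the sub-1-bit schemes mentioned above push further by sharing codes across groups of weights.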

People keep trying to build analog accelerators. Historically there have been mismatches between neural-network architectures and what analog hardware can actually do. Recent work has produced analog NN's designed to map well onto analog hardware.

A combination of those would likely get cost down dramatically on both inference and training. Also, energy use would be lower.


Move fast & break things

Is this the most transparent form of grift ever seen? TikTok's valuation must be upwards of $60B.

Not all of TikTok, only the American stuff.

Bytedance was recently valued at $330B. $14B is ridiculous.

Claude Code for its first 3-4 months was a monster; it's since been optimized.

It's very unstable; indexing doesn't work anymore.

the model is the product

Agent-1

You should get a lawyer and try to sue in small claims court; it is the fastest path vs. anything they will surface to you. Even by saying this I put myself at risk, but they are truly a demonic organization.

s/demonic/pernicious


As others have pointed out, legal is the fastest approach with them.
