cweill · on March 4, 2023

If you ever need to join two large dataframes, but are OOMing on the join, write them to disk as parquet files then use DuckDB to do the join. It's amazing what you can do on one machine thanks to DuckDB.

qolop · on March 4, 2023

This isn't unique to duckdb. Almost all databases allow for sorting and joins of large tables that don't fit into memory.

cweill · on March 4, 2023

Yes but if you're in a Jupyter notebook, you may not be directly connected to a DB. If you're using pandas, this unlocks some scalability before needing dask and a cluster.

cweill · on Feb 9, 2023

Do you think he was on garden leave? Some kind of non-compete?

karpathy · on Feb 9, 2023

there were no non-competes, garden leaves or etc., i just took some time for myself and then ~late last year started to feel an itch again.

jackallis · on Feb 9, 2023

Are you going to continue your youtube experience?

captn3m0 · on Feb 9, 2023

He said yes on twitter: https://twitter.com/karpathy/status/1623492347739910145

pfdietz · on Feb 9, 2023

I appreciate your videos. Thanks!

nunez · on Feb 9, 2023

Wow; appreciate the transparency dude and thanks for helping build autopilot and fsd.

redox99 · on Feb 9, 2023

Really happy for you. I thoroughly enjoyed your YouTube NN series, and your Lex interview.

beecafe · on Feb 9, 2023

Foss?

tasuki · on Feb 9, 2023

Some people enjoy not working some of the time. If that's hard for you to believe, I'd love to know what job you have! :)

htrp · on Feb 9, 2023

100% ... i would be shocked if that wasn't written into his contract.

motoxpro · on Feb 9, 2023

Classic. "100%" certainty but then the guy himself responds and says it's not the case.

cweill · on Jan 4, 2023

I also have this question. Is the RL MDP actually encoding cause and effect? Or just learning (bidirectional) correlations between states and actions?

I wonder if Pearl thinks that RL replicates his do-calculus under the hood, or if that's an innovation we're missing.

cweill · on Jan 3, 2023

For reference, most gaming channels in the US have an RPM of $4-$5. I'm guessing their RPM is $1 because the channel owner is based in Europe. I imagine toy channels have higher RPMs in the US.

cweill · on Jan 1, 2023

Look into Coach Sommer and https://www.gymnasticbodies.com/. Probably the best guide from beginner to advanced out there. I've been following their program for over 15 years, and there's something for every level of strength and flexibility. It's a program you can follow your entire life.

The big thing with calisthenics/gymnastics is developing the tendon and ligament strength in your joints, which takes several months to years but is the foundation for the advanced stuff like planches and levers. Good luck!

cweill · on Dec 24, 2022

How far back can this go? Would you be able to do 1-3 years of daily data?

noelblanc · on Dec 24, 2022

Not very far, unfortunately: up to 7 days in the past with my version of Twitter API access. I could get more data with academic/enterprise access.

https://developer.twitter.com/en/docs/twitter-api/tweets/cou...

cweill · on Dec 18, 2022

Having been on both sides, as a customer and as a founder, I love subscriptions.

Every business needs a business model, and subscriptions are a very clear one. I distrust any free service that doesn't rely on ad revenue, because it has an incentive to make money in shadier ways, like selling your data to third parties who will use it against your best interests.

Subscriptions are a nice business model because the costs/revenue are predictable for both customers and the business.

What I don't like are subscriptions that don't grandfather you in when prices go up, and pay-as-you-go plans, because it's so easy to forget about them and their rules and get charged a nasty bill later.

As a customer IMO subscriptions are the lesser of all the evils, and the best alternative to ads, eg YouTube Premium.

falcolas · on Dec 18, 2022

Here's the problem with that - there's not enough room for every company and their subscription model. So where you once had a chance to get money from someone with a one-time purchase, you're now going to have to fight for another piece of their subscription budget.

And more and more - you're not going to get that piece.

Customers only have so much money and so much time. Expecting them to make a long term commitment is going to be a worse and worse business model in the coming years.

Oxidation · on Dec 18, 2022

And it's the time almost more than anything else. A subscription means I need a login to some subscription portal. I need to monitor the subscription and the payments. When I want to cancel it, quite possibly that will take some time and need me to call. There's a non-zero chance it'll slip through the cracks and end up costing me four figures.

You know what? I didn't really want your software enough to get into any kind of ongoing relationship here, I'll make do without it. I don't have the energy for dozens of services.

It does work a little better for business as I don't need to personally deal with the admin, but even then on a financial basis, per-user-per-month costs still add up very fast.

cweill · on Dec 11, 2022

Google Colab is $49/mo to get an A100. Trust me when I say you can build multimillion-dollar ML companies with just that (and maybe an extra $100 per month of spot credits).

cweill · on Dec 10, 2022

I disagree with this comment, and anyone reading it should take it with a big grain of salt. Let's go back to 2016 and replace "LLM" with "Reinforcement Learning". Everyone thought every problem could be solved by RL because it puts looser restrictions on the problem space. But then RL failed to deliver real-world benefits beyond some very specific circumstances (well-defined games), and supervised learning is/was still king for 99% of problems.

Yes, LLMs are amazing, but they won't be winning every single Kaggle competition or displacing every other ML algorithm in every setting.

PartiallyTyped · on Dec 10, 2022

Sure enough LLMs are not going to win every kaggle competition. But... I am fairly certain that transformers may. Embed all categorical values, scale continuous features by embeddings, and run it through a graph neural network, with high probability it will beat nearly everything.

janalsncm · on Dec 10, 2022

Transformers require a lot of data to converge. There’s a reason tree models are still king of kaggle even though transformers have been around for 5 years now.

flooo · on Dec 11, 2022

To me, the vast majority of people in the field seem to hold on to the idea that different technology suits different problems.

Also, are you aware that one of the most prominent AI tools of this month (ChatGPT) was obtained with RL?

cweill · on Nov 23, 2022

So many accomplishments are the result of someone just putting one foot in front of the other with a vague idea of where they're going. Hopefully, their north star is something others find respectable once achieved. They're then labeled "geniuses" if it works out and "fools" if it doesn't. Accomplishments are only impressive with hindsight.