What I find most interesting with this is that it shows they believe there is nothing unique at Meta related to AI. There is no resource, neither people nor computing power, that they can't get elsewhere for whatever they believe would be more interesting to them.
I mention this because it feels analogous to military research, where people "dream" of how advanced the military is, how forward they are compared to public research... and yet, it seems to be a recurring myth they love to sustain.
So the signal I get here is that AI "labs" in BigTech have nothing worth waiting for around the corner; it's just more of the same, and boring for the people who stick around.
I think you might be reading a bit too much into this.
He’s been with Meta for 11 years and is likely in a very comfortable financial position, given the substantial stock options he’s received over that time.
He also mentioned the arrival of a new child, and it’s well known that Meta's work-life balance isn’t always ideal.
On top of that, Meta, like many major tech companies, has been shifting its focus toward LLM-based AI, moving away from more traditional PyTorch use cases.
Considering all of this, it seems like a natural time for him to move on and pursue new, more exciting opportunities.
> On top of that, Meta, like many major tech companies, has been shifting its focus toward LLM-based AI, moving away from more traditional PyTorch use cases.
This is very wrong. Meta is on the forefront of recommendation algorithms and that's all done with traditional ML models made using PyTorch.
Some recommendations are uncanny, except that I don't want any of them in my Facebook news feed and no matter how often I select "never show me this feed again," it keeps trying.
Everyone is fine-tuning constantly, though. Training an entire model in excess of a few billion parameters is pretty much on nobody's personal radar; you have a handful of well-funded groups using PyTorch to do that. The masses are still using PyTorch, just on small training jobs.
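For a sense of scale, a "small training job" in this sense is just a plain PyTorch loop over data that fits in memory. A minimal sketch, with a toy model and synthetic data (nothing here is from the thread or specific to Meta):

```python
# Purely illustrative "small training job": a tiny model trained on
# synthetic in-memory data with a standard PyTorch loop.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy dataset: 256 samples, 16 features, roughly separable binary labels.
X = torch.randn(256, 16)
y = (X.sum(dim=1) > 0).float().unsqueeze(1)

# In a real fine-tune you'd load pretrained weights; a fresh MLP stands in here.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

first_loss = None
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    if first_loss is None:
        first_loss = loss.item()

final_loss = loss.item()
print(final_loss < first_loss)  # training should reduce the loss
```

The same loop shape covers most fine-tuning people actually do day to day; the "well-funded groups" difference is in cluster orchestration, not in the loop itself.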
Fine-tuning is great for known, concrete use cases where you have the data in hand already, but how much of the industry does that actually cover? Managers have hated those use cases since the beginning of the deep learning era — huge upfront cost for data collection, high latency cycles for training and validation, slow reaction speed to new requirements and conditions.
That's wrong. Llama.cpp / Candle don't offer anything, design-wise, that PyTorch cannot do. What they offer is a smaller deployment footprint.
What's modern about LLMs is the training infrastructure and the single-coordinator pattern, which PyTorch has only just started on and which is still inferior to many internal implementations: https://pytorch.org/blog/integration-idea-monarch/
PyTorch is still pretty dominant in cloud hosting. I'm not aware of anyone not using it (usually by way of vLLM or similar). It's also completely dominant for training; I'm not aware of anyone using anything else.
It's not dominant in self-hosting, where llama.cpp wins, but there's also not really that much self-hosting going on (at least compared with the volume of requests that hosted models are serving).
> What I find most interesting with this is that it shows they believe there is nothing unique at Meta related to AI
Whether or not this is the case, I don't get this as being the reason for Soumith leaving - it sounds as if he is just ready for a change.
Still, it is noticeable that with many of the AI companies claiming that their version of "AGI" is just around the corner, developers and staff don't appear to be particularly excited about this (I assume they realize it is just hype, not some momentous advance around the corner), and leave to pursue different things, such as Mira Murati starting a fine-tuning company, Karpathy going back to education, others jumping ship (typically from OpenAI to Anthropic), etc.
> Still, it is noticeable that with many of the AI companies claiming that their version of "AGI" is just around the corner, developers and staff don't appear to be particularly excited about this
Why would they be excited about it? There's little in it for them.
Nobody who has to work for a living should be excited about AI; they should be genuinely afraid of it. AGI will have vast, deeply negative consequences for almost everyone who has to work for a living.
"Ready for change" is just the polite way to say, "I can't stand it here anymore. I'd rather roll the dice on a new place because reversion-to-mean means it's probably going to be better than whatever this has become."
There are a lot of things I don't like about my current job, but not enough for it to make sense to gamble on a new place. It's easier to push for change from my current position than to count on any new place being any better.
But if it gets worse and I do leave, I'll definitely be telling the interviewer, "I was just ready for a change."
On the other hand, while I know nothing about Soumith, he clearly has enough financial runway (see my calc below) to not have to work again.
As far as I know, we all get one life. If one can help it (modulo other constraints), one should not get trapped by prestige, achievement, short-term admiration from others, impact, and other external-facing factors. To see an alternate reality, it helps to escape the bubble, for example, by spending time in a completely different culture or environment where no one knows or cares about what one did.
I admire people taking such decisions. It's easy to be on autopilot in life. People who wear their success lightly are rare, but more philosophically aware, in my opinion at least. I wish him good luck!
> see an alternate reality, it helps to escape the bubble, for example, by spending time in a completely different culture
I'm in a similar position now and need to make a decision. The problem is that after leaving the IT world for a while, it will be hard to get back. I'll have to change my life completely and discard all the knowledge and expertise I have. That will be fun, interesting, eye-opening, etc., but there's no way back.
I don't know you, don't know your situation, but this does not seem to match the experiences of many of my friends who left for a while and then came back. "Spent two years starting a restaurant" and "had to take care of my parents" were not blockers for getting another computer related job in due time. There are few truly irrevocable decisions in our life.
Now, the current job market makes this significantly harder than it was in the 2010s, but that's floating over all of us: if your company does an Amazon tomorrow, would you get a job as nice as you currently have? Maybe, maybe not.
In executive roles, your expertise really is in management acumen a lot of the time. But as an individual contributor (or adjacent), once you're out of a technical space for a few years, it's increasingly hard to get back in even if you've casually kept a finger in.
Exactly, the only way to stay current is to keep doing something at least half time. The good thing is it doesn't have to be the same as your previous job. Just keep the brain working and learning.
Agree and disagree. Yes, keep brain working and learning of course. But, if you've dropped out of some space, you're going to be pretty rusty about what is currently going on.
I think age plays an important part in the decision to move away from a place. I think in your 20s or very early 30s you have far more leeway to kind of go away and start again, but a lot of the hope to actually be able to find that unicorn workplace fades away as you approach your late 30s. Once into your 40s, depending on your trade, you're dead on arrival unless you successfully manage to rebrand yourself as a consultant, whatever the fuck that means.
Can be*, that's not necessarily always true. I've quit jobs plenty of times without having any plan for the future or particular drama-reason for leaving, just "It's not as fun here anymore, despite this being a great place to work", I'm sure I'm not the only one who does so.
What I've never done, though, is leave a place without being 100% honest about exactly why I'm leaving. I won't say "I was just ready for a change" if that wasn't the reason; I have no reason not to be honest about why I'm leaving.
I've generally had 10+ year tenures other than a return to school that was basically always in my plan and dot-bomb (leaving a company I wasn't really a fit with anyway). But, yeah, I've always been ready to move on at about that ten year point which is actually fairly long by a lot of people's standards in the tech industry.
I do disagree, though: unless there's some actionable change that would specifically benefit you, like more money, my answer outside of private conversations with people I know well is going to be some variant of "time for a change." Anything else just invites arguments and conversations I don't want to have.
I do want to push back on this a little. People leave all the time for this "I wanna see what else is out there" feeling, especially at such senior levels and with as much financial security as he inevitably has from working at Meta for 11 years. It is not always a gamble, and many of them are not so skeptical and cynical about the other places they could go and bring their expertise to.
About the military, from my limited experience, they are significantly behind the civilian state of the art, except for technology that has few applications outside of the military, like stealth.
In fact everything secret tends to be behind. Secrecy is a huge burden, and seriously limits all forms of collaboration.
In addition, because military projects are often big and highly politicized, you get all the inefficiencies that go with that. Classification is also convenient for hiding screwups and corruption.
I spent a dozen years as a US defense contractor across a broad spectrum of places (from R&D for the future to working with E-3s today), and worked at internet scale and start-up B2B stuff in the other dozen years of my working career.
I think that the major difference about deployed military technologies, in contrast to both military R&D and the entire commercial side, is that they are, by and large, incredibly rock solid and reliable. If they aren't, they don't actually get used. It takes a lot of effort to get them that way. I remember once at a testing ground for our robot tanks of the far future, right next door was an outdoor test track, and they were testing a kitchen trailer (a kitchen for ~200 men that can be towed by a Humvee). They drove it around the track continuously for three weeks, stopping only long enough to change drivers/vehicles; four times a day they would halt, make meals for 200 people, then pack up and get back to driving. This was one of several reliability tests that the kitchen trailer had to get through before it was accepted for service.
Our R&D stuff couldn't handle that (it needed 3-4 engineers to carefully monitor it at all times), but the stuff that needed to be in the hands of some random 18 year old with a two week training course had to be rock solid to use, do regular maintenance on, and fix, even when they were only getting four hours of sleep a night. If it wasn't up to that level, then the troops ended up ignoring it, leaving it behind when they went out to do their job. And by and large, from what I could tell, most of the stuff they had was that reliable. There were some cool things that we were doing in the R&D space, but we were a long way from that level.
One thing I meant to add: this extensive testing- and the enormous amount of documentation/training materials necessary to take an 18 year old with average ASVAB scores and produce someone who can cook meals for 200 other soldiers on four hours of sleep a night- is both why military things cost so much, relative to commercial grade stuff, and why they don't get updated particularly often. Frequent software updates that change menus around play havoc with the detailed training powerpoints that the military relies on to produce those 18 year old tool operators.
Secret Squirrel projects (which I was near but never read into) can get away with lower reliability because they can count on the users to be much better trained and prepared, though again, from my brief encounters with these sorts, they will ignore anything they don't trust to be completely reliable. Reliability matters far more than cutting edge for like 99.9% of military gear.
The funny thing is when you have that spill over to civilians who then take it to 11.
Case in point: firearms. The standard-issue M4A1 is actually pretty good on that front already, but for civilian ARs, there's a whole cottage industry around making improved components that can handle even more abuse.
Knives, as well. Your average military field knife is something like 80 years behind the curve on materials, especially steel. Which isn't necessarily a bad thing - it's "good enough" (given what they're realistically used for) and cheap at that. But civilians can and do drop 10x money for knives that you can baton wood with and still have a shave after, even though there's no practical use for that kind of thing.
> drop 10x money for knives that you can baton wood with and still have a shave after, even though there's no practical use for that kind of thing.
Excuse you, I just came back from a 6 month backpacking trip where I had to split my own kindling along the way AND shave regularly and I didn't have weight for a knife/axe AND razor blade
/s
> a 318-page report [...] said the SAIC software was incomplete, inadequate and so poorly designed that it would be essentially unusable under real-world conditions. Even in rudimentary tests, the system did not comply with basic requirements
I figured the reason Palantir was so successful was because it was a SV software company instead of a defense contractor dabbling in IT or specialized government consultancy.
The military doesn't have the luxury of things being unreliable. It puts a pressure on them that corporations don't necessarily have: they'd rather have a less-effective but proven system than a potentially-more-effective but riskier system (especially since each system they have comes with massive logistics support).
Ironically, corporations can afford to take more risks of failure (financially and project-wise) than militaries because failure for them doesn't mean actual human death (and when it can, you see processes come in that look a lot more like military processes).
It's actually the commercial/consumer side that gets more reliability than the military side.
The military should have very reliable systems, and they often know the point at which their systems will fail (MTBF calculations are easier to develop with their record keeping). However, the military also has an almost unlimited budget and body count to keep just reliable enough things working much better than they should. It's also really bad about actually competing companies against each other.
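On the MTBF point: the calculation itself is trivial once you have the records; it's the fleet-wide logging that makes it tractable. A sketch with made-up numbers, just to show the bookkeeping:

```python
# MTBF = total operating hours / number of failures. The numbers below are
# invented for illustration; the point is that thorough record keeping is
# what makes both inputs easy to obtain.
fleet_hours = 120_000.0  # logged operating hours across the whole fleet
failures = 48            # failures recorded over the same period

mtbf = fleet_hours / failures
print(mtbf)  # → 2500.0 hours between failures, on average
```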
The commercial sector, targeting consumers, is where you actually get reliable systems. Why? Because consumers will go towards either the cheapest option (reliability is replaced with ubiquity in the market, it's replaceable) or the more reliable but more expensive options. They (individuals) don't have an unlimited budget or unlimited time to maintain everything in their life. There's competition in the commercial world that's completely absent in the military world
The two major exceptions are where COTS products have taken over (definitionally, DOD is using commercial, often consumer-targeted, products instead of military-specific products) and special forces. Special forces often bypass normal acquisition processes and so end up having a better chance to compete vendors against each other than other parts of the military.
This doesn't mean everything the DOD procures through normal acquisitions is inherently unreliable, but reliability is only one of many factors and often only really discovered after selection and full-rate production has started. By that point, the DOD is committed to it for years to come. Each DOD procurement is separate enough from others that you don't even get huge opportunities for reuse. The F-35, to pick something from this century, didn't get components that were shared with other aircraft in the DOD fleet. It's almost all new, which means a lot of things were learned about its reliability after it started flying. It has new comms, new radar, new almost everything. Even the engine (though that probably used many subcomponents shared with other engines) was a new engine just used by the F-35.
Post Cold War, most militaries shifted to COTS and less boutique development. Turns out, you only need to put resources in a few places to stay ahead (stealth, sensing and measuring, space, hypersonics, drones, etc).
I don't think that's the read? Guy says he wants to work on something small. If you want to work on something big you probably want to be in a big corp to have the resources to do the big thing.
Also absolutely unknown if the "new thing" is AI-related at all!
> If you want to work on something big you probably want to be in a big corp to have the resources to do the big thing.
If anything, the reverse seems to be true, if you want to work on something big, you want to be in a small company, sufficiently funded, filled with great people, yet not "big", that's when "something big" seems to be more likely to happen.
In contrast, as far as I can tell, the bigger a company gets, the less likely they are to actually come up with "something big". It seems like most of the time you need (creative) constraints for the results to end up being actually innovative; otherwise you end up like IBM and Meta, throwing money at stuff and getting some results, but nothing really out of the ordinary considering what's happening elsewhere in their ecosystems.
Well, he left, so whatever is coming next, AI related or not, "small" or not (small for them might be reaching just a million people; he wrote that he "led the software layer that powers the entire AI industry", so his notion of scale is probably unlike mine, maybe yours too), is more exciting to him than whatever he could do next with all of Meta's resources.
Edit: to be clear, I didn't mean to imply their next thing is AI related, solely that they obviously know more about AI at Meta than e.g. XR at Meta, just because that's their expertise.
Pretty crazy/bizarre that a VP/Fellow engineer would have such little say in what they do at Meta. In my mind, companies would do everything possible to retain them. They are a special and rare breed.
> where people "dream" of how advanced the military is
If you've ever worked on "advanced military grade" equipment, you'd know better.
It tends to be what you'd euphemistically call "well-proven technology", built down to a price by the lowest bidder, by comparatively unskilled labour.
The most shocking thing about the "captured" Russian drones is they use name-brand Raspberry Pis inside. I'm prepared to bet the American versions use whatever AliExpress crap is on special this week. The UK stuff definitely does.
I mean, these things do exist. There are always tons of big and small tech projects floating around in the special operations community. Cutting-edge sets of hybrid night/thermal vision. Classified helicopters. Hand-built rifles with custom cartridges. Classified medical tech. Advanced fixed-wing aircraft with unique capabilities. Advanced dive gear. And so on.
"Big Army" doesn't see that stuff for decades, if ever, and mostly never due to cost. And I'm not even getting into classified submarine and nuclear tech, fixed wing drones and aircraft flying at night out of known test facilities, etc.
There's tons of actually advanced tech out there in military circles.
Yeah. The DOD is enormous and definitely has your boring everyday stuff, but also tons of skunk works doing R&D, just not very publicly. An organization that big has all kinds of nooks and crannies, so it isn't really that monolithic.
If you can afford to support yourself, which I'm sure he can, there's a serenity to working on small projects that are not in the public eye. It may simply be that he craves some quiet time that enables him to focus on his family and himself.
Negative; what you should have taken away is that it's the people. He mentions standing up clusters. Small shops can't afford clusters. Ignore the technical aspect of this article and read it for what it is: a thank-you note to the people he has worked with on amazing projects. Research in a bubble of 1 isn't very useful. Research in a small team with a Meta budget is extremely useful. With the right people.
> I mention this because it feels analogous to military research, where people "dream" of how advanced the military is, how forward they are compared to public research... and yet, it seems to be a recurring myth they love to sustain.
I don't think that you can read this from the blog post at all, but it gives me a chuckle to think how the quest for AGI at Meta may be "The Men Who Stare at Goats" all over again.
I'm totally speculating. I have no extra information there.
It just makes me think of all the staff, technical staff, that left OpenAI recently. Altman was making grand claims about what was coming next.
Well, we know what followed; namely, I don't think any researcher who left knowing what was in the pipeline feels like they missed much in terms of access.
The non-fiction book behind it is probably a better comparison than the film adaptation, if you think Meta is doing goat-staring (I don't think they're especially bad on this issue compared to their rivals).
Having friends who are at or near both FAIR and other AI parts of Meta: resources are not the issue, anymore at least (there had been a massive squeeze for the last two years though). But PyTorch and FAIR use(d) an AWS-based cluster. (PyTorch is used everywhere else inside Facebook though. Well, not everywhere...)
There are plenty of interesting things happening at big tech, and Meta specifically. If you like computer vision, then Meta is pretty much still the world leader. Much as it pains me to say it.
Unlimited compute resources aren’t literally unique but there are only a small handful of places in the world that have that.
Vast quantities of private data, especially text communications and images. Very few places have that. Coupled with a culture that puts zero privacy protections on that data. Even Google likes to think they’re doing the right thing, so I think that makes Meta unique.
Yes, but that's my point: he's willing to give up on that, and I believe that's more meaningful than the money (that others highlighted), specifically in a field where it does matter.
It's not what you or I believe, it's what he believes.
The fact that he's ready to give up on something unique means that, from what he knows internally, he can't foresee anything interesting enough, on a short enough timeframe, to make him want to stay.