Sundar's email mentions something critical: Jeff Dean is going to be Chief Scientist of the new DeepMind, coordinating back to Sundar. This is a big deal; that move tells you Google is taking being behind on public-facing AI seriously. Dean is an incredibly valuable, incredibly scarce resource.
If we wind way back to the Google Docs, Gmail and Android strategies, they took market share from leaders by giving away high-quality products. If I were in charge of strategy there, I would double down on the Stability / Facebook plan and open-source Chinchilla-optimal foundation models on the PaLM architecture, stat. Then I'd build tooling to run and customize the models on GCP, so open + cloud. I'd probably start selling TPUv4 racks immediately as well. I don't believe they can win on a direct API business model this cycle, but I think they could do a form of embrace-and-extend by going radically open and leveraging their research + deployment skills.
Jeff Dean is clearly one of the greatest software developers/engineers ever but there isn’t much evidence that he is a brilliant ML researcher
And indeed, Google AI has achieved very little product-wise during his time at the head of it. That kind of suggests he is a big part of the bureaucratic challenges they have faced.
Of course he is someone any technology organisation would want to have as a resource. But probably not as chief scientist or CEO of an ML company, based on the available evidence.
>Jeff Dean is clearly one of the greatest software developers/engineers ever but there isn’t much evidence that he is a brilliant ML researcher
Google has an oversupply of brilliant ML researchers. What they need is an engineer who sees the applications of the technology so it can be turned into a product; someone who can bridge the gap between the R&D team and the bureaucracy.
Want an idea for a stupid product? Input: a description of a girl, her hobbies, some minor flaws. Output: a poem. I have been using Vicuna quite successfully for that purpose.
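For concreteness, the input-to-output shape of that idea is basically a prompt template fed to a local model such as Vicuna. The function name and wording below are my own invention, just a sketch:

```python
def poem_prompt(name: str, hobbies: list[str], flaws: list[str]) -> str:
    """Build the instruction string you'd feed a local model (e.g. Vicuna)."""
    return (
        f"Write a short, affectionate poem about {name}, "
        f"who loves {', '.join(hobbies)} "
        f"and is endearingly {' and '.join(flaws)}."
    )

# The model call itself is out of scope here; this is only the input side.
print(poem_prompt("Maya", ["climbing", "bad sci-fi"], ["always late"]))
```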
He had the perfect balance of being legendary engineer and an ML researcher
I can't emphasize enough how much rigorous engineering practice can accelerate research delivery. It is THE key to a productive research-oriented team.
Good research engineers are underrated, and very difficult to find.
Furthermore, Jeff is not a great (or even good...) manager/director/leader.
There were a lot of internal and external dramas under his leadership that he failed to address.
How often do you hear about dramas involving the Chief Scientists at other, comparably sized companies?
He should stay a Fellow, in a "brilliant consultant" role.
He absolutely should have gotten rid of the troublemaker. Many folks used this publicity to leave for higher positions or higher pay (which is very common at google) but made it look like Jeff Dean was the problem.
That's my bias as well. To me, it seems like every day someone releases a new AI toy, but the thing you would actually want is for a real software engineer to take the LLM or whatever, put it inside a black box, and then write actually useful software around it. Like off the top of my head, LLM + Google Calendar = useful product for managing schedules and emailing people. You could make it in a day of tinkering as a langchain demo, but actually making a real product that is useful and doesn't suck will require good old fashioned software engineering.
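The "LLM inside a black box" pattern described above can be sketched quickly. Everything here is hypothetical: `call_llm` stands in for whatever model you use (stubbed with a canned reply so the sketch is self-contained), and the surrounding validation is the "good old fashioned software engineering" part:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; stubbed with a canned reply for illustration."""
    return json.dumps({
        "title": "Coffee with Dana",
        "start": "2023-05-02T10:00:00",
        "duration_minutes": 30,
        "attendees": ["dana@example.com"],
    })

SCHEMA_PROMPT = (
    "Extract a calendar event from the request below and reply with ONLY a "
    "JSON object with keys: title, start (ISO 8601), duration_minutes, "
    "attendees.\n\nRequest: {request}"
)

def parse_event(request: str) -> dict:
    """The black box: the LLM handles language, plain code handles rigor."""
    raw = call_llm(SCHEMA_PROMPT.format(request=request))
    event = json.loads(raw)
    # Ordinary software engineering: validate before touching a calendar API.
    for key in ("title", "start", "duration_minutes", "attendees"):
        if key not in event:
            raise ValueError(f"LLM reply missing field: {key}")
    return event

event = parse_event("Set up 30 minutes of coffee with Dana Tuesday at 10am")
print(event["title"])  # Coffee with Dana
```

The real product work is everything around this: retries when the model emits malformed JSON, permission handling, and the actual calendar integration.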
Based on the multitask generalisation capabilities LLMs have shown so far, I'm kinda in the opposite camp: if we can figure out more data-efficient and reliable architectures, base language models will likely be enough to do just about anything and take general instructions. You could just tell the language model to operate directly on Google Calendar, with suitably supplied permissions, and it can do it with no integration needed.
Exactly this. There is a reasonable chance the GUI goes the way of the dodo and some large (75% or something) percentage of tasks are done just by typing (or speaking) in natural language and the response is words and very simple visual elements.
People are building toy demos in a day that are not actual useable products. It’s cool, but it’s the difference between “I made a Twitter clone in a weekend” and real Twitter.
1 - companies are deploying real products internally for productivity, especially technical and customer support, and in data science to enable internal people to query their data warehouse in natural language. I know of 2 very large companies with the first in production and 1 with the second, and those are just ones I'm aware of.
2 - you are conflating the problems of engineering a system to do a thing for billions of users (an incredibly rare situation requiring herculean effort regardless of the underlying product) with the ability of a technology to do a thing. The above mentioned systems couldn't handle billions of users. So what? The vast majority of useful enterprise saas could not handle a billion users.
OpenAI, from a research point of view, hasn't really had any "big innovations". At least I struggle to think of any published research they have done that would qualify. They probably keep the good stuff for themselves.
But Ilya definitely had some big papers before and he is widely acknowledged as a top researcher in the field.
I think the fact that there are no other publicly available systems comparable to GPT-4 (and I don't think Bard is as good) points to innovation they haven't released.
> Jeff Dean is clearly one of the greatest software developers/engineers ever
Based on what? I've heard all the Chuck Norris type jokes, but what has Jeff Dean actually accomplished that is so legendary as a software developer (or as a leader) ?
Per his Google bio/CV his main claims to fame seem to have been work on large scale infrastructure projects such as BigTable, MapReduce, Protobuf and TensorFlow, which seem more like solid engineering accomplishments rather than the stuff of legend.
Seems like he's perhaps being rewarded with the title of "Chief Scientist" rather than necessarily suited to it, but I guess that depends on what Sundar is expecting out of him.
When I joined Brain in 2016, I thought the idea of training billion/trillion-parameter sparsely gated mixtures of experts was a huge waste of resources and incredibly naive. But it turns out Jeff was right, and it would take ~6 more years before that was abundantly obvious to the rest of the research community.
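For readers unfamiliar with the idea, here is a toy sketch of sparsely gated top-k routing: a learned gate picks a few experts per token and the rest do no work, which is where the parameter count can grow without the compute growing with it. All names and sizes below are mine, not from any Google codebase:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d_model, top_k = 4, 8, 2

# Each "expert" is just a linear map here; in practice a small FFN.
experts = [rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts)) / np.sqrt(d_model)

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts; the others are never evaluated."""
    logits = x @ gate_w                            # (tokens, n_experts)
    top = np.argsort(-logits, axis=-1)[:, :top_k]  # indices of chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        weights = np.exp(chosen - chosen.max())
        weights /= weights.sum()                   # softmax over chosen experts only
        for w, e in zip(weights, top[t]):
            out[t] += w * (x[t] @ experts[e])
    return out

y = moe_forward(rng.standard_normal((3, d_model)))
print(y.shape)  # (3, 8)
```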
As a leader, he also managed the development of TensorFlow and the TPU. Consider the context and time frame: the year is 2014/2015 and a lot of academics still don't believe deep learning works. Jeff pivots a >100-person org to go all-in on deep learning, invests in an upgraded version of Theano (TF) and then gives it away to the community for free, and develops Google's own training chip to compete with Nvidia. These are highly non-obvious ideas that show much more spine and vision than most tech leaders. Not to mention he designed and coded large parts of TF himself!
And before that, he was doing systems engineering on non-ML stuff. It's rare to pivot as a very senior-level engineer to a completely new field and then do what he did.
Jeff certainly has made mistakes as a leader (failing to translate Google Brain's numerous fundamental breakthroughs to more ambitious AI products, and consolidating the redundant big model efforts in google research) but I would consider his high level directional bets to be incredibly prescient.
OK - I can see the early ML push as obviously massively impactful, although by 2014/2015 we were already a couple of years past AlexNet, and other frameworks such as Theano and Torch (already 10+ yrs old at that point) existed, so the idea of another ML framework wasn't exactly revolutionary. I'm not sure how you'd characterize Jeff Dean's role in TensorFlow given that you're saying he led a 100-person org yet coded much of it himself... a hands-on technical lead, perhaps?
I wonder if you know any of the history of exactly how TF's predecessor DistBelief came into being, given that this was during Andrew Ng's time at Google - whose idea was it?
The Pathways architecture is very interesting... what is the current status of this project? Is it still going to be a focus after the reorg, or too early to tell ?
Jeff was the first author on the DistBelief paper - he's always been big on model-parallelism + distributing neural network knowledge on many computers https://research.google/pubs/pub40565/ . I really have to emphasize that model-parallelism of a big network sounds obvious today, but it was totally non-obvious in 2011 when they were building it out.
DistBelief was tricky to program because it was written all in C++ and Protobufs IIRC. The development of TFv1 preceded my time at Google, so I can't comment on who contributed what.
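To make the model-parallelism mentioned above concrete: the idea is to shard a single layer's weights across machines so each holds and computes only a slice. A pure-NumPy stand-in (real systems overlap compute with cross-machine communication; here concatenation plays the role of the communication step):

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_out, n_devices = 8, 12, 3

W = rng.standard_normal((d_in, d_out))
# Column-shard the layer across "devices": each holds d_out / n_devices columns.
shards = np.split(W, n_devices, axis=1)

def parallel_layer(x: np.ndarray) -> np.ndarray:
    # Each device computes its slice of the output independently.
    partials = [x @ shard for shard in shards]
    # Gathering the slices is the cross-device communication step.
    return np.concatenate(partials, axis=-1)

x = rng.standard_normal((4, d_in))
# The sharded computation matches the single-device layer exactly.
assert np.allclose(parallel_layer(x), x @ W)
```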
1. What was the reasoning behind thinking billion/trillion parameters would be naive and wasteful? Perhaps parts of it were right and could inform improvements today.
2. Can you elaborate on the failure to translate research breakthroughs, of which there are many, into ambitious AI products? Do you mean commercializing them, or pursuing something like AlphaFold? This question is especially relevant; everyone is watching to see whether the recent changes can bring Google to its rightful place at the forefront of applied AI.
> large scale infrastructure projects such as BigTable, MapReduce, Protobuf and TensorFlow
If you initiated and successfully landed large-scale engineering projects and products that transformed the entire industry, more than ten times over, that qualifies you as a "legend".
Only if you did it at a company like Google where it's being talked about and you've got that large a user base. Inside most of corporate America internal infrastructure / modernization efforts get little recognition.
I wrote an entire (Torch-like - pre PyTorch) C++-based NN framework myself, just as a hobbyist effort. Ran on CPU as well as GPU (CUDA). For sure it didn't compete with TensorFlow in terms of features, but was complete enough to build and train things like ResNet. A lot of work to be sure, but hardly legendary.
> Only if you did it at a company like Google where it's being talked about and you've got that large a user base
Google has lots of folks who had access to a similar level of resources, and no one but Jeff and Sanjay pulled it off. Large-scale engineering is not just about writing some fancy infra code; it is a very rigorous project of convincing thousands of people to onboard, which typically requires them to rewrite a significant fraction of their production code, often referred to as "replacing the wheels on a running train". You need lots of evidence, credibility and vision to make them move.
Yeah - just finished migrating a system of 100+ Linux processes all inter-communicating via CORBA to use RabbitMQ instead. Production system with 24x7 uptime and migration spread over more than a year with ongoing functional releases at the same time. I prefer to call it changing the wheels on a moving car.
No doubt it's worse at Google, but these type of infrastructure projects are going on everywhere, and nobody is getting medals.
This announcement, including the leadership changes, sounds more like they've shut down DeepMind and moved everyone over to Google Brain. Keeping the DeepMind name for the new team is a clever trick to make it look like more positive news than it actually is.
He is key to defining the culture (secrecy etc). There is a huge culture difference between Brain and DM and, with Demis at the helm, I'm concerned that it'll be Brain moving towards DM culture, not vice versa.
ehhhhhhh sure but for how long? these statements stick out to me:
- "I’m sure you will have lots of questions about what this new unit will look like" aka we're not going to talk about specifics in public comms
- "Jeff Dean will take on the elevated role ... reporting to me. ... Working alongside Demis, Jeff will help set the future direction of our AI research" aka Demis isn't the only Big Dog in the room anymore
It seems that DeepMind has now gone from what had appeared to be a blue sky research org to almost a product group, with Google Research now being the primary research group.
Jeff Dean's reputation has always been as an uber-engineer, not any kind of visionary or great leader, so it's not obvious how well suited he's going to be to this somewhat odd role of Chief Scientist both to Google DeepMind and Google Research.
How things have changed since OpenAI was founded on the fear that Google was becoming an unbeatable powerhouse in AI!
I'd hope that a direct report to the top also has the strength of character to call for shutting it down if it doesn't fly, or to push back appropriately on requests to commercialise something with huge latent risks.
Rushing it out the door in a space race model is probably not a good idea. At this point, the value proposition is moot.
As a client and even a paying customer of Google, I don't want this AI intruding into my product experiences without a big fat OFF switch. Not because of some Terminator/Skynet fantasy: I want the opportunity to distinguish reality as projected from PageRank and classic NLP algorithms from the synthetic responses of a model.
That sounds like a fairly brilliant counter to "Open"AI. Something tells me that Google is still too scared of this tech in the hands of the public to go there though.
Piles of previous statements talking about safety and a general unwillingness to put any of its models in the hands of anyone not under NDA. Bard is the first counterexample I can think of and that was forced by OpenAI.
Anecdotally, Bard is much better at guiding the responses than ChatGPT. They're slowly releasing more and more "features" as they feel comfortable. You can see what they're up to with the AI Test Kitchen app.
I kinda liked how open ChatGPT was before the heavy filtering, but I see why we need to rein in the chaos overall.
The AI primitives are pretty basic. The real brains are in figuring out how to make the best model; the engineering integration is pretty straightforward.