Different isn't always better, better is always different
I’ve built products and led companies through multiple transformations. The LLM AI transition is — I am quite sure — already the most significant of my lifetime.
Companies struggle through transformations for many reasons: Technology and investments tied to the last era. Leaders whose instincts are actively wrong in the new world they are entering. All the petty and very human conflicts that are part and parcel of change.
Having just released NewsArc, a product made possible by advancements in LLMs, I am stoked LLMs are receiving so much focus and funding. But the experience of past transitions also makes me aware of the challenges this kind of concentration creates for teams, organizations, and product development.
The trouble with transitions
The Facebook mobile transition illustrates many of these dynamics. Mark could push “mobile first” messaging all day long, but Facebook’s engineering teams lacked the expertise and technical foundations to deliver it. Worse, Mark’s finely honed PHP instincts and product development expertise were actively misleading for native mobile development, a problem compounded by his otherwise valuable engagement with product developers.
Similarly, all of Facebook’s infrastructure and processes had been built around PHP assumptions: Easy rollbacks. Everything is a web push. Functional partitioning per endpoint. And of course, the jockeying at a very valuable, pre-IPO company with a set of established leaders who would rather be wrong than give up one iota of power or compensation.
Mr. But Actually says “Oh, come on. Well-compensated, brilliant people aren’t petty idiots!” Sure. A little over six months into my time at Facebook, I had my first performance review. Schrep had already asked me to investigate mobile, and I was trying hard not to step on toes while finding numerous problems. I was getting promoted (unusual for such a new hire) and had already cleaned up a few troubled teams. Then, in my review, a longer-tenured engineering director explained that he had decided I shouldn’t be part of the discretionary equity program. Since I didn’t know what the program was, it didn’t really register. Six months later, when multiple engineering directors were reporting to me, I learned a) that every other engineering director received it and b) that the equity grant was significantly larger than my entire compensation over my Facebook career. Absent a very strong culture and management, pettiness and fear will exist. Worse, most senior executives will be blind to them, since they’re almost always expressed downward.
Huge credit to Mark and Schrep that Facebook navigated the mobile transition despite all the reasons to fail. On top of that, right after, Deep Learning exploded into our collective awareness and Schrep had to manage the first huge AI transition at Facebook: Deep Learning and Convolutional Networks as the foundation for the future of AI. Facebook’s ML teams were world class (News Feed had recently changed to pure ML ranking), so you can imagine the challenges created when Yann LeCun joined and built FAIR (now Meta AI).
Facebook navigated it well. Google had to solve similar dynamics with the creation of Google DeepMind. These transitions break companies that aren’t well managed. Look at the struggles Apple has endured in the LLM era despite the lead Siri gave them.
The next AI revolution
The AI rocketship of the last four years has been closely tied to a particular field of AI: generative models, in particular diffusion models (like Stable Diffusion) and Large Language Models. If you’re reading this blog, you probably already know this work springs largely from the attention paper (“Attention Is All You Need”), extending the foundational work that caused the last AI revolution. Work that had languished for decades (the AI winter) until it turned out that the ideas worked; we just needed many orders of magnitude more compute to see it.
My framing around the struggle with change applies here. Classical Machine Learning had demonstrated real value in classification, similarity, and ranking (everything I talked about in Rethinking Attention), so funding and graduate students were pivoted away from Deep Learning. Because whatever stories of pettiness I can tell from professional settings, stories from academia will top them.
As Sayre’s law states:
Academic politics is the most vicious and bitter form of politics, because the stakes are so low
Of course, we now know the stakes aren’t low. Plus, we’ve so completely entangled academic and industry research that we have a simultaneous path to tenure and spectacular compensation. All of this is good for LLM research, which has seen impressive progress. But what about all the other AI research?
Of course, LLMs aren’t the only game in town. A few people are shouting into the void about this, like Gary Marcus. I’m not going to pretend to be an expert on different, competing AI approaches.
But I am an expert on how transformation impacts organizations and product development. Through that lens, it is a certainty that work that is highly effective and highly complementary to LLMs is being neglected. Not just because of pettiness and organizational boundaries (though those both count for a lot) but because of everything else, too. The $500 billion being spent on data centers, NVIDIA AI chips, and new power generation does not make it easy to fund technologies perceived as competing with LLMs. Nor do powerful, high-profile teams like to share the spotlight internally.
Worse, when you hire the world’s experts on LLMs, what do you expect them to say about other technologies?
Early in my time at Google, I was following my “build an ethnography” approach to learning new organizations: asking lots of questions and then asking who else to talk to. One conversation was with an executive who had a great deal of influence over Search and Google Assistant, and who was an expert in classical ML and ranking. I was asking about the attention paper and how central voice was likely to be (as well as how awkward voice was in so many settings) when this senior executive interrupted me with, “Interfaces will never be conversational.”
Doing video the hard way
Video and interactive models perfectly illustrate the dynamic.
Veo 3 is spectacular, and Genie 3 is perhaps even more so. If you haven’t looked at them, Veo 3 is a text/image-to-video model that includes audio generation and is superior to prior models at image and world stability. Genie 3 does the same for video game worlds, letting you explore a world synthesized and modeled for you in real time. These are huge steps forward. Meanwhile, Microsoft is creating a playable Quake 2 using similar techniques. This is all super cool stuff. The use cases aren’t hard to imagine, the progress is impressive, and the teams are likely very well funded.
It’s also pretty hard to imagine a more challenging way to try to solve the problem.
Video games have solved stable visualizations for decades, including incredibly flexible generative approaches. Almost five years ago, Intel researchers demonstrated astonishing style transfer work. A week doesn’t go by without advances in Gaussian splatting. The list goes on and on.
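If you haven’t followed the splatting work: the idea is to represent a scene as a huge set of colored, semi-transparent Gaussians and render by projecting and compositing them. The rendering core is simple enough to sketch. Here’s a toy 2D version in Python (my own illustrative example with made-up splat values; it assumes isotropic Gaussians already sorted front to back, where real systems use anisotropic 3D Gaussians projected and sorted per screen tile):

```python
import numpy as np

# Toy 2D splat compositing: each splat is a soft, semi-transparent blob,
# accumulated front to back, which is the core of Gaussian-splat rendering.
H, W = 64, 64
ys, xs = np.mgrid[0:H, 0:W]

# Hypothetical splats: (center_x, center_y, radius, opacity, RGB color),
# pre-sorted nearest to farthest.
splats = [
    (20.0, 24.0, 6.0, 0.8, np.array([1.0, 0.2, 0.2])),
    (36.0, 32.0, 10.0, 0.6, np.array([0.2, 0.4, 1.0])),
]

image = np.zeros((H, W, 3))
transmittance = np.ones((H, W))  # how much light still reaches each pixel

for cx, cy, r, opacity, color in splats:
    # Per-pixel coverage of this splat, falling off as a Gaussian.
    alpha = opacity * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * r * r))
    # Standard front-to-back alpha compositing.
    image += (transmittance * alpha)[..., None] * color
    transmittance *= 1.0 - alpha
```

Real systems add the projection, the tile sorting, and differentiable optimization of the splat parameters from photos, but the rendering core is this compositing pass.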
This is not an argument against neural rendering approaches. I’m a huge fan: who wouldn’t want to shift the computational complexity from world space to screen space? Especially when DLSS creates incredibly cheap, high-quality upscaling. Instead, I’m pondering what would accelerate and transform the neural rendering efforts and make them more useful and impactful.
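To make the appeal concrete, here’s a toy cost model (my own illustrative sketch with made-up unit costs, not a benchmark): world-space work grows with scene complexity, screen-space work is bounded by the pixels you shade, and DLSS-style upscaling shrinks even that.

```python
# Toy cost model: why screen space is attractive. Unit costs are made up;
# only the scaling behavior matters.

def world_space_cost(num_triangles: int, cost_per_triangle: float = 1.0) -> float:
    """Classic pipeline: work grows with scene complexity."""
    return num_triangles * cost_per_triangle

def screen_space_cost(width: int, height: int, cost_per_pixel: float = 1.0,
                      upscale_factor: float = 1.0) -> float:
    """Screen-space pass: work grows with shaded pixels. DLSS-style
    upscaling lets you shade at reduced resolution, then upscale."""
    shaded_pixels = (width / upscale_factor) * (height / upscale_factor)
    return shaded_pixels * cost_per_pixel

# A 10M-triangle scene vs. a 4K frame shaded at half resolution:
print(world_space_cost(10_000_000))                       # grows with the world
print(screen_space_cost(3840, 2160, upscale_factor=2.0))  # fixed by the screen
```

The numbers are meaningless; the point is that the screen-space budget stays fixed no matter how rich the world gets.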
Obviously, there are big, very talented teams working on these approaches, too. They’re mostly inside NVIDIA and a few of the largest game studios. They’re using LLMs and diffusion models, but they aren’t working alongside those researchers. And neither the big tech researchers nor game developers are well connected to yet another set of parallel work going on quietly inside studios and special effects shops.
What about even more distant AI research areas like symbolic reasoning? What could you create if those teams were aligned and cooperating? What existing multi-billion dollar efforts are going to be rendered valueless by the companies that figure this out?
Shipping org charts
Companies large and small often ship their org charts and operate in silos. Big acquisitions and senior executive attention often reinforce these tendencies. Great leaders, vision, and goals help mitigate them. As we move beyond the initial period of “who has the most access to raw training data?” it’s going to be the perfect moment to create the teams and goals that leverage more valuable AI work rather than less.