Taking Risks

The thing about taking risks is that they don’t always work out.

Two years ago, I took a risk — technically somewhere between “large” and “a flier” — to leave my role as the head of Core Experience at Google to join a startup in a country where I didn’t speak the language. Why? I’d never really lived outside of the US. I knew what was possible with LLMs and was itching to build something with them.

Moreover, the challenges facing news in the United States, driven by polarization and the battle for attention, are important and worth taking a run at. Startups — for all their risks and challenges — give space to maneuver that large companies lack.

Finally, later in your career, it’s easy to no longer do things that scare you.

So, I took the risk.

We spent two years understanding the problem, building new technologies, hiring great people, and launching a very different way to explore news. I found the early results incredibly exciting, but ultimately the company thought otherwise. We spent much of the last month working together to find ways to keep NewsArc going, but sadly, we didn’t find a way forward.

Which hurts.

If you are reading this and you’re looking for some really talented folks, drop me a line or go spelunking on LinkedIn. Whether it’s AI researchers in New York or the whole mix of skills you would expect for mobile development in Palo Alto, the NewsArc team is a group of people you’d be stoked to work with.

What it doesn’t change is the incredible adventure of the last two years. So much food, the infinite depth of Japan, building connections with an incredible group of engineers in our Tokyo office, becoming a local in Shibuya, and the joy of pushing so hard against a really challenging problem.

We built technology and a product that people really connected with and enjoyed using. It’s all too fresh to really consider next steps, but let’s be clear: news matters and conventional attention reinforcement isn’t enough.

Stay tuned for what’s next.

SEAaaS: Social Engineering Attacks-as-a-Service

Thanks to Dan Kaufman, I am an advisor to Badge. Badge has done something incredibly powerful with identity: enabling strong, cryptographic identity without any stored secrets. At Google, my teams contributed to the compromise-o-rama that is PassKey — an improvement over passwords, no doubt, but if you were to ask yourself “exactly how is Apple syncing PassKeys when I get a new device?” you wouldn’t love the answers — so when I met them I was excited to help out in any way I could.

Why provably human matters more now than ever before

Cheaply and reliably authenticating your presence on any device without having to store the crown jewels of either secret keys or — way worse — a centralized repository of biometrics is a Holy Grail challenge of cryptography, which is why Badge’s advancement is so powerful. For all the obvious use cases — multi-device authentication, account recovery, human-present transactions — Badge is going to change how companies approach the problem and make auth, transactions, and identity fundamentally safer for people around the world.

And just in time. Because one of the many impacts of LLMs and GenAI is that a whole class of cyber attacks are about to become available to script kiddies around the world. Think of it as “Social Engineering Attacks as a Service” — SEAaaS, most definitely pronounced “See Ass.”

One of Badge’s founders, Dr. Charles Herder, and I just wrote an op-ed on the topic, “In an AI World, Every Attack Is a Social Engineering Attack.” What was remarkable about writing it was how many of the ideas we were discussing made headlines between starting the article and completing it.

As we wrote:

With the emergence of Large Language Models (LLMs) and Generative AI, tasks that previously required significant investments in human capital and training are about to become completely automatable and turnkey. The same script kiddies who helped scale botnets, DDoS (distributed denial of service), and phishing attacks are about to gain access to Social Engineering as a Service.

As we were drafting, the story broke about Claude being used in a wide-ranging set of attacks:

Anthropic, which makes the chatbot Claude, says its tools were used by hackers “to commit large-scale theft and extortion of personal data”.

The firm said its AI was used to help write code which carried out cyber-attacks, while in another case, North Korean scammers used Claude to fraudulently get remote jobs at top US companies.

What all of these attacks apply more pressure to is the need to know if an actor — or the author of a piece of code — is who they claim to be. Increasingly sophisticated attackers leveraging cutting edge frontier models will exploit any form of identity vulnerable to replay or credential theft.
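To make “vulnerable to replay” concrete, here is a minimal sketch of the classic countermeasure: a fresh challenge for every attempt, so a captured response is useless the next time. It is deliberately generic, not Badge’s scheme, and it still relies on a stored shared key, which is exactly the crown jewel Badge’s approach avoids; the names and key handling are illustrative only.

```python
import hashlib
import hmac
import secrets

# Illustrative only: a real deployment would not hold a raw shared key like this.
SHARED_KEY = secrets.token_bytes(32)

def issue_challenge() -> bytes:
    """Server sends a fresh random nonce for every authentication attempt."""
    return secrets.token_bytes(16)

def client_response(key: bytes, nonce: bytes) -> bytes:
    """Client proves possession of the key by MACing the server's nonce."""
    return hmac.new(key, nonce, hashlib.sha256).digest()

def server_verify(key: bytes, nonce: bytes, response: bytes) -> bool:
    """Server recomputes the MAC over the nonce it issued and compares in constant time."""
    expected = hmac.new(key, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

nonce = issue_challenge()
response = client_response(SHARED_KEY, nonce)
assert server_verify(SHARED_KEY, nonce, response)

# Replaying the captured response against a new challenge fails,
# because the server never reuses a nonce.
assert not server_verify(SHARED_KEY, issue_challenge(), response)
```

A static credential (a password, an API key, a synced secret) has no per-attempt freshness, which is what makes it replayable once stolen.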

As we wrote:

The same AI that is being used today to generate fraudulent content and influence discussions on the Internet is also capable of generating synthetic accounts that are increasingly indistinguishable from real, human accounts. It is now becoming economical to completely automate the process of operating millions of accounts for years to emulate human behavior and build trust.

All this even if we’re incredibly careful about how we use LLMs.

Come talk about this more

Scared? Curious? In the Bay Area? Come join us at the MIT Club of Northern California to hear Charles and me in conversation with Ida Wahlquist-Ortiz. It should be a very interesting conversation.

What the hell is a CTO?

Apparently it’s CTO week in the technosphere. I just gave a keynote at the AWS CTO Night and Day conference in Nagoya. A “Why I code as a CTO”-post got ravaged on Hacker News via incandescent Nerd Rage. Then there was some LinkedIn discussion about “how to become a CTO”-posts, and how they tend to be written by people who’ve never been in the role. My framing is that being a CTO is generally about delivering the impossible — if your job was easy, your CEO would have already hired somebody cheaper.

Like with planning, these are tricky discussions to navigate because a) nobody really agrees on what the hell a CTO is and b) even if we did, it’s so company — and company stage — dependent that the agreement would be an illusion. The CTO role has only really been in existence for 40 years, so it isn’t shocking that defining it can prove challenging.

AWS asked me to weigh in anyway, so let’s give it a go.

But first, a brief incentive digression

Incentives. A CTO title from a high-flying company can be the springboard to future funding, board seats, C-level roles elsewhere, and all kinds of Life on Easy Mode opportunities. It can make you Internet Famous (tm), lead to embarrassingly high-paying speaking engagements and invitations to lucrative carry deals as a VC, and put you on the speed dial of journalists at The Information looking for inside information.

For the non-business, non-CEO founder, CTO is a weighty title that implies a certain balance in power and responsibilities1. During fundraising, the CTO role can help early stage companies look more grown up2, signal weight of experience, level of technical moat, etc. All good.

A developer might also have grown up thinking of CTO as the brass ring they’re aspiring to.

These are all perfectly reasonable career incentives. There are similar incentives around being a CEO. Pretending they don’t exist is foolish, but after acknowledging them, I want to focus instead on what matters for technology companies and organizations.

Building the right way

As CTO, you are one of the few leaders well positioned to own how you build, prioritize, and allocate technical resources. In particular, are you chasing a well-understood product/problem/goal or are you venturing boldly into terra incognita? This distinction matters, because the tools for the former — hill-climbing, metrics-driven OKRs and KPIs — are much more challenging (and sometimes actively destructive) when applied to the unknown. Similarly, highly unstructured R&D adventures aren’t the most efficient or effective way to deliver a well-understood product. Neither is better in all cases and (almost) no company is wholly one or the other, but as CTO you must be opinionated here.

Learning and rate of positive change

I’ve written about this elsewhere, but how fast you learn and how you measure the rate of positive change delivered to customers are both on the CTO.

My favorite rule of thumb from Linden Lab: 1/n is a pretty good confidence estimate when judging a developer’s time estimate of n weeks.
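Read literally (and assuming the rule means a task estimated at n weeks has roughly a 1-in-n chance of landing on time), it looks like this:

```python
def estimate_confidence(estimate_weeks: float) -> float:
    """Linden Lab rule of thumb: confidence in an n-week estimate is roughly 1/n."""
    if estimate_weeks <= 1:
        return 1.0  # assumption: treat sub-week estimates as high confidence
    return 1.0 / estimate_weeks

for weeks in (1, 2, 4, 8):
    print(f"{weeks}-week estimate -> roughly {estimate_confidence(weeks):.0%} confidence")
```

In other words, an eight-week estimate is less a plan than a coin flip you need to win three times in a row.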

Stay in the Goldilocks zone

In astronomy, there’s the idea of the Goldilocks zone. It’s the distance from a star where water is liquid. Too close, everything boils off. Too far, everything freezes. CTOs (like product/tech CEOs) have a very similar tightrope to walk. Stay too close to the tech, too close to all the critical decisions, and you deprive your company and teams of the chance to grow as leaders and technologists. Suddenly you’re trying to lead weak, disempowered leaders through a micro-management hellscape. On the other hand, drift too far away and your team — and CEO — loses a critical voice and thought partner. You’ll find yourself guessing at the technology direction, and actively misdirecting it, because you’re out of the loop.

What’s the right balance? It depends. On scale, on your skills, level of technical risk around you, etc. It’s also not static. Take a week to go through engineer onboarding. Challenge a deputy to deeply explore emerging tech. Explore the tech decisions that are being routed around you.

Two full time jobs

At any stage, a company that is dependent on technology innovation and delivery has distinct — but equally critical — challenges to solve: org health and tech vision.

Org health. Are developers able to do their best work? Are they set up for success? Are there minimal impediments to doing great work? Are they able to hire and fire effectively? Are speed and experimentation properly balanced against risk and debt? How does the tech org fit into the company and cooperate with other orgs? Do developers and other tech employees have career paths? Are ladders, levels, and comp aligned with company principles? Is the culture working?

Tech vision. Is the company looking around technology corners? Are the deeply technical decisions correct, tested, and working? Is the tech org staffed to solve both the current and next set of technology problems? Is the technology vision correct? Is the tech organization delivering against company mission, vision, and goals?

For most people, one of these two challenges is likely to be more exciting and interesting. My past mistakes as CEO or CTO have been on the org health side. I’m an introvert with a great act, so I’ve learned to seek out strong partnership to reduce that risk.

Sometimes early stage companies can split this across CEO and CTO, or two tech cofounders can split it up. No matter how you solve it, recognize that you do need to solve it.

There are even options where the CTO has neither of these responsibilities, which can also work so long as somebody does have them.

Don’t just import hyperscaler crap

Real Talk (tm): you probably aren’t a hyperscaler. I hope you get there, but you’re not there yet. All those fancy toys Google, Meta, et al brag about? They solve problems you probably don’t have yet. Worse, they often generate high fixed costs/loads that hyperscalers don’t care about but will materially impact your business.

A few last thoughts

I’ve known quite a few extremely successful CTOs and if there’s one commonality it’s how differently they approached their role and day-to-day activities. Several wrote code every day. One acted more as a private CEO R&D lab than an org leader. Another was 85% VPE but had the keenest sense for emerging tech I’ve ever seen. Yet another was mostly outbound and deal focused3.

All of them rocked.

So, think about the core of the job that your company needs, is compatible with your CEO’s style, and fits your skills. Figure out how to really know how your team and company are performing. Rinse and repeat.

Footnotes

  1. It isn’t of course, since the CEO hires and fires.

  2. Despite Wired promoting me!

  3. Philip and I used to joke about the cartoon version of this type of CTO. Live and learn.

OODA Loops and Setec Astronomy

For the last 20 years, I have been alternately amused and terrified by the military cosplaying via lingo in the tech sector — with references to “S-2s” being among the most eye-rolling. I will however make a rule-proving exception with Bruce Schneier’s latest article about AI security: “Agentic AI’s OODA Loop Problem.”

Few people have thought longer or more deeply about cyber security than Bruce, and his reasoning behind adopting the OODA-loop framework is dead-on.

Traditional OODA analysis assumes trusted inputs and outputs, in the same way that classical AI assumed trusted sensors, controlled environments, and physical boundaries. This no longer holds true. AI agents don’t just execute OODA loops; they embed untrusted actors within them.

The OODA Loop

For those unfamiliar with the term, the OODA loop is a fighter pilot term originated by Air Force Colonel John Boyd. Boyd is credited with inventing basically everything about modern jet-fighter combat, from energy being the core currency of fighter engagements to the decision-making framework known as OODA:

  • Observe
  • Orient
  • Decide
  • Act

The influence and debate around OODA is far-ranging, but the important concepts to take away are the idea of gathering information and data, processing it in the context of goals, making a decision comparatively late in the process, and then acting. Then you repeat the loop, with new data from your actions.
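As a structural sketch (my own illustrative Python, not anything from Boyd or Schneier), the loop itself is tiny. The point is that the decision comes late, after observation and orientation, and every pass feeds new data into the next:

```python
from dataclasses import dataclass, field

@dataclass
class OODAAgent:
    """Minimal sketch of an OODA-style loop; method bodies are left to the implementer."""
    goals: dict
    memory: list = field(default_factory=list)

    def observe(self) -> dict:
        """Gather raw data from sensors, tools, or feeds."""
        raise NotImplementedError

    def orient(self, observation: dict) -> dict:
        """Interpret the observation in the context of goals and recent history."""
        return {"observation": observation, "goals": self.goals, "history": self.memory[-5:]}

    def decide(self, situation: dict) -> str:
        """Commit to an action as late as possible, once the situation is framed."""
        raise NotImplementedError

    def act(self, action: str) -> dict:
        """Execute the action and return the outcome."""
        raise NotImplementedError

    def run_once(self) -> None:
        outcome = self.act(self.decide(self.orient(self.observe())))
        self.memory.append(outcome)  # new data feeds the next iteration of the loop
```

Schneier’s point is that in an agentic AI system, untrusted content can reach every one of those four steps.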

It’s the core of most agile thinking. “Strong opinions, loosely held” is OODA shorthand. It’s a very effective methodology in many circumstances and is even built to be resilient to noise/misdirection in observations. Unfortunately, it is not designed to tolerate a hostile actor running their own OODA loop within each step.

And that’s the world we’re stepping into.

The Threat Surface

Schneier’s article breaks down the implications for each step:

Observe: The risks include adversarial examples, prompt injection, and sensor spoofing. A sticker fools computer vision, a string fools an LLM. The observation layer lacks authentication and integrity.

Orient: The risks include training data poisoning, context manipulation, and semantic backdoors. The model’s worldview—its orientation—can be influenced by attackers months before deployment. Encoded behavior activates on trigger phrases.

Decide: The risks include logic corruption via fine-tuning attacks, reward hacking, and objective misalignment. The decision process itself becomes the payload. Models can be manipulated to trust malicious sources preferentially.

Act: The risks include output manipulation, tool confusion, and action hijacking. MCP and similar protocols multiply attack surfaces. Each tool call trusts prior stages implicitly.

These are all supply-chain and compiler attacks as a service. It used to be that these types of attacks required significant time, money, and/or technical expertise — consider the cleverness of Ken Thompson’s 40-year-old backdooring of the C-compiler — but these are now available to pretty much anyone with an LLM.

Suddenly, rather than debating “Fast, Cheap, Good”, we’re debating “Fast, Smart, Secure”:

This is the agentic AI security trilemma. Fast, smart, secure; pick any two. Fast and smart—you can’t verify your inputs. Smart and secure—you check everything, slowly, because AI itself can’t be used for this. Secure and fast—you’re stuck with models with intentionally limited capabilities.
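To make the “check everything” corner of that trilemma concrete, here is a hedged sketch of gating the act step: a hypothetical tool allowlist, human approval for anything with side effects, and fetched content labeled as data rather than instructions. It is illustrative, not a vetted defense, and it costs exactly the speed and capability the trilemma says it will.

```python
# Hypothetical policy layer sitting between an agent's "decide" and "act" steps.
ALLOWED_TOOLS = {"search_web", "read_file"}        # no shell, no email, no payments
SIDE_EFFECT_TOOLS = {"send_email", "run_shell"}    # anything here needs a human in the loop

def guard_tool_call(tool: str, human_approved: bool = False) -> None:
    """Refuse tool calls the policy does not explicitly allow, however persuasive the model was."""
    if tool in ALLOWED_TOOLS:
        return
    if tool in SIDE_EFFECT_TOOLS and human_approved:
        return
    raise PermissionError(f"tool '{tool}' blocked by policy")

def frame_untrusted(text: str) -> str:
    """Label fetched content as data, not instructions, before it reaches the orient step.
    This reduces, but does not eliminate, prompt-injection risk."""
    return f"<untrusted_content>\n{text}\n</untrusted_content>"
```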

Alignment and Integrity

OpenAI had a chance to discuss these issues around the launch of Atlas but was deafeningly quiet about it initially. Their CISO posted a long piece to Twitter, which Simon Willison pulled into a manageable post. It’s pretty sobering reading. Sure, their goals are admirable:

Our long-term goal is that you should be able to trust ChatGPT agent to use your browser, the same way you’d trust your most competent, trustworthy, and security-aware colleague or friend.

Sure, and I want a pony. The how gets much thinner. On the one hand, they advocate for logged-out mode and forced human observation — basically Schneier’s “slower, less smart” tradeoff — but then we get this absolutely brutal comment:

New levels of intelligence and capability require the technology, society, the risk mitigation strategy to co-evolve. And as with computer viruses in the early 2000s, we think it’s important for everyone to understand responsible usage, including thinking about prompt injection attacks, so we can all learn to benefit from this technology safely.

Let’s be clear: nobody understands responsible usage for LLMs. If they did, we wouldn’t have daily reports of successful data exfiltration. Or LLM psychosis. Or “error-ridden” rulings by US District Judges.

The good news — such as there is — is that the big model developers have every incentive to solve the alignment problem and make architectural improvements at every stage from training through inference. Fighting AI slop requires it. So do model integrity and user safety.

What about right now? My recommendation: if you are exploring agentic browsers — and anyone working in AI really should — do it in logged-out, locked-down, and sandboxed ways. I would avoid browser makers known for abusing robots.txt and user data. Yolo mode only in very controlled ways.
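As one concrete, hypothetical version of “logged out, locked down, and sandboxed”: drive a throwaway browser from code, with a fresh profile, no saved credentials, and nothing worth exfiltrating in the environment, ideally inside a container or VM as well. A sketch with Playwright:

```python
from playwright.sync_api import sync_playwright

# Throwaway browsing session for agent experiments: headless, fresh profile,
# no existing cookies or logins, and discarded when the block exits.
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()   # isolated context: no shared state with your real browser
    page = context.new_page()
    page.goto("https://example.com")  # placeholder URL for the experiment
    print(page.title())
    context.close()
    browser.close()
```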