AI shopping is years away. The infrastructure isn't.

tl;dr

AI shopping is like a self-driving car -- it's years away.
Value migrates to the infrastructure layers, and Stripe is a winner.
It will favour big brands and scale players. Smaller businesses suffer.

I recently bought a shampoo brand I'd never heard of. I ran some very specific filters through both Gemini and ChatGPT to minimise bias and get a "neutral" view on the best product for me, then bought what they pointed to. (I won't name the shampoo, but if someone wants to sponsor me, I'm all ears.)

I'd also recently seen a few headlines on Shopify CLI, UCP and ACP, and noticed "AI shopping" tools popping up. Anthropic just had Claude agents transact on behalf of 69 of their employees in an internal marketplace experiment. So the logical rabbit hole for me to dig into was: what is the future of online shopping?

Stripe with the assist

Stripe's annual letter is one of the better reads in tech. This year's had a framework tucked about halfway through that guided how I think about AI shopping.

They sketched out five levels of how AI agents interact with commerce, from filling out forms for you to buying things before you've asked.

Hand-drawn sketch of Stripe's five levels of agentic commerce: L1 Autofill, L2 Recommendations, L3 Personalised Recs, L4 Personal Shopper, L5 Proactive Shopper. L1 is circled with 'WE ARE HERE'. A dashed line marked 'THE GAP' sits between L3 and L4. A straight arrow labelled 'MORE AUTONOMY' points right beneath the row of boxes.

Level 1 is the agent typing in your credit card number and clicking buy. Let's call this "autofill".
Level 2 is you describing a situation like "stationery for a 9 year old who likes KPop and tennis" and the agent finding stuff that matches. Let's call this "recommendations".
At Level 3, the agent remembers your preferences and the suggestions get sharper. You still choose. Let's call this "personalised recommendations".

After this the agent buys for you.

Level 4 is where you stop choosing. You say "sort out my back to school shopping + keep it under $400", and the agent just does it. Let's call this "personal shopper".
Level 5 has no prompt at all. The agent just buys. Let's call this "proactive shopper".

Stripe's own read on where the industry actually is: hovering on the edge of autofill and recommendations.

That's actually pretty insightful.

Most blogs and strategy decks I've read in the last year have been about personal/proactive shopping, aka 'auto buying': agents buying on your behalf. Meanwhile... the reality is agents searching for you and filling out forms?!

The gap between personalised recommendations vs personal shopper is structural. At Level 3, existing business models all survive -- agents search, humans decide, merchants keep their customer relationships. At Level 4, the agent sits between you and the merchant and makes the call. That's a different purchasing motion and therefore a different economy entirely.
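
If it helps to see the framework as data, here is a minimal sketch of the five levels and the one thing that flips between L3 and L4. The level names are the nicknames used in this post, and the two helper functions are my own shorthand for illustration, not anything Stripe published.

```python
from enum import IntEnum


class Level(IntEnum):
    """Stripe's five levels, labelled with the nicknames used in this post."""
    AUTOFILL = 1
    RECOMMENDATIONS = 2
    PERSONALISED_RECOMMENDATIONS = 3
    PERSONAL_SHOPPER = 4
    PROACTIVE_SHOPPER = 5


def who_decides(level: Level) -> str:
    """Up to L3 the human picks; from L4 onwards the agent makes the call."""
    return "human" if level <= Level.PERSONALISED_RECOMMENDATIONS else "agent"


def needs_prompt(level: Level) -> bool:
    """L5 is the only level with no prompt at all."""
    return level < Level.PROACTIVE_SHOPPER


for level in Level:
    print(f"L{level.value} {level.name.lower()}: "
          f"decides={who_decides(level)}, prompt={needs_prompt(level)}")
```

Everything about L1 to L3 returns "human"; the gap in this post is the single line where that return value changes.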

If the Level 3 vs Level 4 framing reminds you of self-driving cars, that's not an accident (pardon the pun). Level 2 driver assistance (e.g. lane keeping, adaptive cruise) showed up around 2016. Everyone assumed Level 4 self-driving (robotaxis, basically) was a few years behind. It's 2026 and we're still not quite there, apart from pockets of trials, because Level 4 brought different engineering problems, different regulatory problems and different liability/insurance models.

The gap was and is structural, not incremental. Agentic commerce has the same sort of gap.

So what actually changes at personal shopper?

Every shift in internet commerce so far has been about where you discover products.

Search engines moved discovery from shelves to web pages.
Social media moved it from search results to feeds.
Marketplaces consolidated it onto one platform...

...but YOU still made the decision. You saw the options, weighed the tradeoffs, and clicked buy.

That's everything up to personalised recommendations (L3). The agent helps you find things. You pick.

Personal shopper (L4) is different. The AGENT makes the decision. You set a budget, set a goal, and the agent evaluates, compares, and buys. You approve the result, maybe, but the judgement happened without you.

Past tech shifts moved where you discover. This one moves who decides. A reorganisation of who captures value and why.

This might be called 'the delegation economy' -- the shift where value migrates from whoever attracted your attention to whoever absorbs your delegated decision.

It's worth taking this seriously even though we're only at the autofill and recommendations stage, because the companies building the infrastructure right now are making bets that only pay off when we hit personal shopper.

Early days, but the direction seems clear

Per-token inference costs fell ~1,000x while enterprise AI spending grew 320% over the same three years. Agentic commerce interactions burn 20 to 50x more tokens than a simple search query because each agent runs multiple steps -- evaluate, compare, negotiate, transact.

Cheaper to compute, more expensive to deploy

Per-token cost vs enterprise AI spend, 2023 → 2026. Inference costs collapsed (a16z: ~1,000x cheaper). Enterprise AI spend grew 320% over the same period (Stripe).

Cheaper tokens didn't make agentic commerce cheap. They just made it possible at all.
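
A back-of-envelope sketch of that arithmetic, using only the two figures quoted above (~1,000x cheaper tokens, 20 to 50x more tokens per agentic interaction). The big assumption, which is mine, is that a "simple search query" burns the same number of tokens now as it did three years ago. Everything is relative; no real dollar prices are assumed.

```python
# Figures quoted in this post; all values are ratios, not dollar prices.
TOKEN_PRICE_DROP = 1_000      # per-token cost fell ~1,000x
TOKEN_MULTIPLIER = (20, 50)   # agentic interaction vs a simple search query


def relative_cost(token_multiplier: float,
                  price_drop: float = TOKEN_PRICE_DROP) -> float:
    """Cost of one agentic interaction today, as a fraction of what a
    simple search query cost at the old per-token price."""
    return token_multiplier / price_drop


low, high = (relative_cost(m) for m in TOKEN_MULTIPLIER)
print(f"vs a simple query at the old price: {low:.0%} to {high:.0%}")
print(f"vs a simple query today: {TOKEN_MULTIPLIER[0]}x to {TOKEN_MULTIPLIER[1]}x")
```

On these assumptions an agentic interaction today costs roughly 2% to 5% of what a simple query used to cost, which is why it is possible at all, and still 20 to 50x what a simple query costs today, which is why it isn't cheap.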

The adoption gap is another story. Orders to Shopify stores from AI-sourced traffic are up roughly 15x since January 2025. That's off a tiny base, sure, but the more interesting number is this: Shopify activated all 5.6 million stores for agent access, and only about a dozen merchants ever went live with agentic tools during OpenAI's Instant Checkout pilot. OpenAI pulled the whole thing back in March 2026, less than six months after launch. The Instant Checkout retreat tells you most of what you need to know, i.e. infrastructure is running light years ahead of adoption.

THIS is the strongest evidence that we're genuinely at the autofill/recommendations stage in practice, not just in theory, and that there is a big gap to the next levels.

I mean, think about it... OpenAI has the largest consumer AI user base on earth. They tried to turn ChatGPT into a personal shopper and about a dozen Shopify merchants integrated. Conversion was weak. OpenAI's wording on the shutdown: "the initial version of Instant Checkout did not offer the level of flexibility that we aspire to provide." Purchases now route back to merchants' own storefronts. If the leader in agentic commerce couldn't make AI personal shoppers work in 2026, the timeline to mainstream personal shopper is longer than the roadmaps claim.

Ok but who actually captures this future value?

Every era of internet commerce has had a value stack -- layers of infrastructure and services, with the most important layer capturing the most value. In the delegation economy, I see five:

1. Settlement: payment processing, clearing, compliance. Stripe, Visa, card networks. Every delegated decision ends in a transaction, and someone settles it.

2. Merchant infrastructure: checkout, inventory, subscriptions, loyalty, returns, tax, shipping. e.g. Shopify, and Amazon's internal systems.

3. Protocol and routing: how agents connect to merchants. Shopify's Universal Commerce Protocol, Stripe's Agentic Commerce Protocol, and about eight more protocols entering the field in Q2 2026 -- Mastercard Verifiable Intent, Visa Ready, Stripe Machine Payments, etc. Ten active protocols, zero interoperability.

4. Agent platform: where the delegation actually happens. ChatGPT, Google Gemini, Amazon Rufus. This is the layer that absorbs your decision.

5. Consumer identity: who the agent works for. Preferences, budget authority, purchase history. Held by the agent platform, but belonging ultimately to you.

At the autofill/recommendations stage (where we are), most value sits where it always has -- in merchant infrastructure and whatever platform owns the consumer relationship (today that means places like Google search and Instagram/TikTok). The agent is a tool, not a decision maker.

At the future personal shopper (L4), value migrates. The agent platform absorbs the pre-purchase value that used to be split across an entire industry -- search, advertising, comparison shopping, brand marketing. The settlement layer captures more because there are more transactions to settle. The layers in between -- merchant infrastructure, protocols, brand -- compress.

Who captures value as consumers delegate more

Illustrative value distribution across the five-layer delegation stack. Agent platform rises gradually L1-L3 (it is doing real work -- autofilling, recommending, personalising) then jumps at personal shopper and again at proactive shopper. Brand holds through L4 because consumers still encode preferences or name brands explicitly; only collapses at L5 when the agent buys before being asked. Merchant infra compresses as the agent absorbs the front end.

Shopify's tell: Harley Finkelstein told a room of Morgan Stanley analysts that "OpenAI will run the front end... Shopify still runs the back end". He framed it as a feature. In every digital business that has actually mattered, the front end is where the relationship lives. Google, Facebook, Amazon -- all front-end companies.

Isn't AI based discovery a good thing? Isn't it fairer?

Shopify's pitch for agentic commerce is that it levels the playing field. Finkelstein calls it "merit based discovery at scale". Every merchant, no matter how small, gets surfaced to every consumer through every agent. More doors, more chances.

There's truth to it -- Shopify's newer cohorts are outperforming older ones; European GMV grew 45% in Q4. Real merchants, getting real distribution.

But "merit" to an agent is price, delivery speed, reviews, specifications. The columns with data. An agent optimising on those dimensions systematically favours merchants with scale advantages. Lower unit costs produce lower prices. Bigger logistics networks produce faster shipping. More customers produce more reviews.
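
To make that concrete, here is a toy sketch of what "merit" could look like if an agent scored offers on exactly those columns. The scoring formula, the equal weighting, the merchant names and every number are made up for illustration; no real agent or protocol is being described.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    merchant: str
    price: float        # lower is better
    delivery_days: int  # lower is better, must be at least 1
    review_count: int   # higher is better


def merit_score(offer: Offer, offers: list[Offer]) -> float:
    """Average of three 0-1 scores, each normalised against the best offer."""
    best_price = min(o.price for o in offers)
    best_delivery = min(o.delivery_days for o in offers)
    most_reviews = max(o.review_count for o in offers)
    return (
        best_price / offer.price
        + best_delivery / offer.delivery_days
        + offer.review_count / most_reviews
    ) / 3


# Hypothetical offers for an identical product from two invented merchants.
offers = [
    Offer("scale_player", price=18.0, delivery_days=1, review_count=12_000),
    Offer("small_shop", price=21.0, delivery_days=4, review_count=150),
]
ranked = sorted(offers, key=lambda o: merit_score(o, offers), reverse=True)
print([o.merchant for o in ranked])
```

Nothing in the scoring function mentions size, yet the larger merchant wins on every column, because each column is downstream of scale.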

Spotify is the analogy here. Streaming removed the gatekeeping physical distribution used to require -- any artist, anywhere, could upload a track. After a decade of streaming, the top 1% of artists earn roughly 90% of revenue, and the major labels people expected streaming to displace grew their share. Access was democratised but outcomes ended up concentrated.

Shopify's "merit based discovery" for merchants looks structurally identical.

Stripe's own data says the sorting is well under way. The top third of US public companies hold 2/3 of total market capitalisation -- the highest profit concentration since tracking began in 1963. Small business loans under US$1 million have fallen 5% since 2010; loans over US$1 million are up 68%.

Capital flows toward businesses that are ALREADY winning.

Stripe calls this "the sorting machine." AI agents accelerate this sorting and "inequity", if you want to call it that.

Does anyone win the protocol war?

Right now, there are several competing protocols: UCP, ACP, Shopify Agents, Mastercard Verifiable Intent, Stripe Machine Payments, Klarna Agentic Product Protocol, Visa Agentic Ready (and likely many more). No two ship the same identity or payment model.

Finkelstein says they'll converge. "These things are going to converge... they'll be open source in this way, that will be the new language." The industry's quietly shifted from "convergence" to "coexistence" -- each protocol serving different moments of intent.

If the protocols coexist technically, value accrues to whoever controls the default payment route inside each one. That's whoever already sits beneath them.

That's Stripe. Their Agentic Commerce Suite supports both ACP and UCP from a single integration. They co-designed ACP with OpenAI while powering the payment infrastructure underneath Shopify's UCP. They process US$1.9 trillion annually -- about 1.6% of global GDP. Protocol-agnostic because they sit beneath.

The Visa precedent is the cleanest version of this story. Between 2000 and 2010, one question that consumed analysts was Amazon vs eBay vs (insert online marketplace here). Visa processed payments for both of them and more. Visa's market cap today, roughly US$550 billion, exceeds every combatant from that era except Amazon. The most durable value didn't accrue to whoever won the front end. It accrued to whoever settled the transactions beneath it.

Stripe is adding NOS to this - Vin Diesel would be proud. Stripe Capital grew 45% YoY in 2025, funding 81,000 businesses, with recipients growing 27 percentage points faster than comparable non-recipients. Stripe sees every transaction, lends to whoever's growing fastest, and the agents recommend them more. The flywheel is the difference.

Visa was a passive beneficiary of commerce concentration. Stripe is an active accelerator of it.

What about brands? Can an agent tell the difference?

The easy argument is that agents only see structured data -- price, specs, reviews -- and are blind to brand. End of brands.

I think that's too simple. Agents read the internet. They can parse editorial reviews, Reddit threads, expert recommendations etc. They can work out that Patagonia stands for something different than a generic outdoor jacket with the same specs... but there's still a gap between summarising what a brand's customers SAY and feeling what a brand STANDS for.

The agent doesn't experience the brand. It doesn't walk past the store, see the editorial, feel aspirational. Brand loyalty survives when the consumer already has it. The agent won't build it on its own.

Merit to an agent is a very sophisticated spreadsheet. The columns with data get optimised. The things that don't reduce to data get compressed.

So the real question is whether agents can create the desire that makes a consumer willing to pay a premium. Probably not. Agents can discover, evaluate, and fulfil brand preferences. They can't create them.

The sorting machine produces a barbell. Established brands with name recognition benefit -- a consumer who says "buy me Nike running shoes" bypasses the algorithm entirely; the brand name is the instruction. Niche brands survive through specificity -- nobody searches for "buy me the cheapest ceramic" and ends up at a Japanese workshop.

The brand barbell

Brand survival probability when an agent does the buying. Niche / specialty (e.g. Japanese ceramics, artisan makers; consumer asks by name): survives. Mid market (undifferentiated DTC, generic mid-tier retailers; no scale, no specificity): squeezed. Luxury / named brands (e.g. Nike, Hermès, Patek; brand IS the instruction): survives. At the extremes, brands survive -- niche by specificity, luxury by name recognition. The middle (undifferentiated DTC, mid market retailers) gets sorted out by the agent.

The brands that get sorted out are in the middle: too small to dominate on agent-readable metrics, not big enough to be recognisable brands.

Mid market retailers, undifferentiated DTC... the middle tier's economics collapse.

What brands should actually do about this is a Part 2 question. The structural fact is that the middle gets squeezed.

Two threads worth pulling on

The framework above tells you what the delegation economy is. Value migrates from attention to delegation. The sorting favours scale players (either via brand name or operational efficiency). The settlement layer takes its toll. Brand equity gets compressed to what the agent can measure.

Two questions follow.

First, timing. If the structural dynamics activate at personal shopper (L4), and we're still at autofill/recommendations, how far away is any of this really? The gap matters a lot since it changes what the smart move is right now -- pick the right infrastructure, not the best agent. That's a Part 2 article.

Second, direction. The whole Western conversation assumes agents work for consumers. There's another delegation economy where agents work for merchants. China has been running it at scale for years. Western B2B is accidentally copying it. By most measures, it's further along than anything in the B2C conversation. That's a Part 3 article.

Both coming soon... stay tuned.

Disclaimer: Thoughts are my own and do not represent any other parties.