Walmart Says ChatGPT Checkout Converted 3x Worse Than Its Website: The Real AI Commerce Lesson
Walmart says ChatGPT checkout converted 3x worse than its website. Here’s what that means for ecommerce SEO, AI traffic quality, and conversion design.

Hacker News had a fun little moment recently. Someone surfaced a Search Engine Land story saying Walmart tested a ChatGPT checkout flow and it converted roughly 3x worse than the normal Walmart.com experience.
Not a little worse. Not “needs iteration”. Three times worse.
And if you run ecommerce, or you own SEO, or you sit in product and you are being asked, weekly, “what’s our AI strategy?”, this is the kind of datapoint that matters. Because it forces the uncomfortable question underneath the hype.
If AI assistants are going to “replace websites”, why did the assistant checkout underperform a regular site journey so badly?
The lesson here is not “AI is useless”. It’s that AI-assisted discovery and AI-assisted checkout are different animals than classic web sessions. Different intent. Different context. Different trust. Different control. Different instrumentation. So you can very easily mistake novelty traffic for profitable traffic.
Let’s break down what’s actually happening, what it means for your SEO and product roadmap, and how to run experiments that tell you the truth.
The signal: conversion rate is not portable
In normal ecommerce thinking, you take a high-intent user, you remove friction, and conversion goes up. AI assistants promise both: more intent and less friction.
But conversion isn’t a universal constant you can “move” into a new container.
Conversion rate is an outcome of an entire system:
- The source and its pre-framing
- The amount of context the user has when they land
- The merchandising environment (comparison, reviews, alternatives, bundles, inventory cues, returns policy)
- The trust state
- The UX control you have over the path to purchase
- The measurement you have to diagnose drop-offs
- The feedback loops you use to improve the funnel
Change the container and you change the system. Which is why some AI commerce flows will print money. And others will quietly bleed.
Also, this aligns with the bigger trend we have been writing about. AI answers are already reshaping click patterns and starving pages of the “warm up” traffic that used to convert later. If you want the broader picture, this piece on Google AI summaries and traffic loss is worth reading: Google AI summaries killing website traffic and how to fight back.
Now, let’s get specific.
Why AI-assisted commerce can underperform site-native flows
1. Intent compression: the assistant skips the “shopping” part
Classic ecommerce journeys have a lot of “pre-conversion work” baked in:
- scrolling category pages
- filtering
- reading reviews
- comparing specs
- opening multiple tabs
- checking shipping and returns
- looking for coupons
- sanity checking the brand
This isn’t wasted time. It’s intent formation. It’s the user convincing themselves.
AI assistants compress all of that into a few turns of chat. Great for speed, but it can create a weird outcome: the user arrives at checkout with less internal commitment, even if they sound decisive in the chat.
A chat message like “yeah that one seems fine, buy it” is not the same as someone who spent six minutes comparing two SKUs, checked sizing, and read 10 reviews. The second person has already done the emotional work. The first is still half browsing.
So you get more “almost” buyers hitting checkout. And checkout hates “almost”.
This is one reason AI commerce can generate what looks like high-intent traffic (short sessions, direct product selection) that actually behaves like mid-funnel traffic when you measure completion.
2. Weak merchandising context: AI can pick, but it can’t fully merchandise
Your website is not just a catalog. It’s a persuasion machine. The best ecommerce sites do a thousand little things:
- social proof density
- “why this one” bullet points
- comparison tables
- bundles
- upsells that actually help
- stock urgency that is real
- shipping promise clarity
- returns reassurance
- fit guides
- compatibility checks
- user generated photos
An assistant might recommend a product, but it often strips away the context that makes the product feel safe to buy.
Even if the AI shows some of this, you don’t control how it’s presented, what’s omitted, or what’s simplified. Sometimes that’s fine. Sometimes it’s fatal. Especially for products where the purchase decision is really about risk management.
Also, AI can collapse differentiation. Your brand work gets reduced to “seems good” unless the assistant is explicitly fed structured reasons to trust you.
We saw a similar dynamic in the broader AI shopping space, where the discovery layer becomes the “storefront” and your PDP becomes optional. If you are tracking this shift, this is relevant background: ChatGPT search shopping update and what it means for ecommerce SEO.
3. Trust friction: users trust the assistant, then suddenly they don’t
This part is subtle.
People will trust an assistant to summarize. They will trust it to compare. They will even trust it to recommend. But paying is different.
Payment is where the user stops thinking “this is informational” and starts thinking “this is real money leaving my account”. That’s where they want:
- clear merchant identity
- clear fulfillment responsibility
- clear returns
- clear customer support path
- reassurance they’re not being tricked
On your site, all of that is visible and familiar. Inside an assistant experience, it can feel like a handoff. Or worse, a blur.
Also, fraud anxiety is not rational. It’s vibes. Users are trained to be suspicious of new payment flows. And AI checkout is new.
If you’ve followed brand trust issues around AI, you already know how quickly confidence collapses when something feels off. Even unrelated AI incidents have trained users to be cautious. (This is adjacent but relevant: Meta AI celebrity impersonator detection and brand trust.)
4. Checkout UX control: you lose the levers that usually save conversion
When conversion drops on your site, you can do things:
- reduce steps
- change copy
- add express pay
- add guest checkout
- improve error handling
- tune address validation
- show shipping earlier
- localize payment methods
- fix mobile layout bugs
- implement abandoned checkout recovery
In an AI-mediated flow, you often lose direct control of the UX. Even if the assistant is “powered by” your systems, the interaction layer is not yours. That means fewer levers, slower iteration, and more dependency on a third party’s design decisions.
It also means your usual conversion best practices may not map. A chat interface can be great for choices. It can be clunky for forms. And checkout is mostly forms, edge cases, validation, and compliance.
So Walmart’s result might not be “ChatGPT is bad at commerce”. It might be “checkout UX is not something you can casually abstract away.”
5. Attribution gets weird, and weird attribution creates bad decisions
AI commerce flows can break your analytics in a few ways:
- traffic can be mislabeled or lumped into “referral”
- deep links may skip pages that normally set cookies and sessions
- cross-domain handoffs can drop parameters
- server side events may not tie cleanly to user journeys
- you lose view of “assist interactions” that shaped the decision
When attribution is fuzzy, you can’t tell what’s happening.
And then the really dangerous thing happens. Teams start using proxy metrics:
- “we got mentioned in ChatGPT”
- “we show up in AI answers”
- “AI traffic is up”
- “engagement time is high”
- “users asked about us”
All nice. None are profit.
If you want to think about AI like a channel you can actually operate, you need to map it into the same measurement discipline as SEO and paid. Incrementality. Cohorts. Contribution margin. Not vibes.
(If your org is still building manual reporting and duct taping workflows together, it gets worse. Here’s a good internal read on moving faster with automation: AI workflow automation to cut manual work and move faster.)
6. Optimization blind spots: you can’t improve what you can’t see
On your site, you can watch recordings, analyze funnels, run A/B tests, inspect logs, and ship fixes daily.
In AI assistant commerce, you may not see:
- what the user was told before they arrived
- what alternatives were suggested
- what objections the user raised
- what the assistant answered (and whether it was accurate)
- whether your product data was misinterpreted
- whether your shipping promise was represented correctly
That’s brutal because a lot of conversion problems in commerce are not “the product is wrong”. They’re “the user got the wrong expectation.”
If an assistant tells them something slightly off, the checkout becomes the first moment they realize it. That is when they bounce, annoyed. And you never know why.
This is where AI search strategy starts to overlap with structured data, feed hygiene, and content that is written for extraction not just ranking. You’re not only optimizing for Google’s blue links. You’re optimizing for a model’s ability to represent you accurately.
If you want one practical angle on how AI systems compare and choose, this is a useful contrast piece: Amazon vs Perplexity AI shopping and the SEO implications.
What brands should measure before celebrating AI-commerce traffic
If you’re getting traffic from AI assistants, do not start with “sessions” or “mentions”. Start with unit economics and funnel integrity.
Here’s a clean measurement list that works even if the AI platform is messy.
Measure conversion quality, not just conversion rate
- Checkout start rate (AI referred sessions that reach checkout)
- Payment attempt rate (how many try to pay)
- Authorization failure rate (fraud filters, payment errors)
- Time to purchase (how long from landing to order confirmation)
- New vs returning mix (AI will skew new, which usually converts worse)
- AOV and items per order
- Return rate and cancellation rate (this is huge; intent compression can inflate returns)
- Customer support contacts per order (signals expectation mismatch)
- Contribution margin per visit (not revenue per visit)
If AI referrals “convert” but they come with higher returns and more support load, you may be buying expensive chaos.
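That last metric is the one that settles arguments, so it is worth computing the same way for every channel. A minimal sketch in Python, assuming you can join per-session revenue with COGS, returns, and support costs (all field names here are hypothetical):

```python
def contribution_margin_per_visit(sessions):
    """Contribution margin per visit for a cohort of sessions.

    Each session dict carries revenue and the costs that usually get
    ignored when teams quote "revenue per visit".
    """
    if not sessions:
        return 0.0
    margin = sum(
        s["revenue"] - s["cogs"] - s["returns_cost"] - s["support_cost"]
        for s in sessions
    )
    return margin / len(sessions)

# Hypothetical cohorts: AI-referred vs matched site traffic.
ai_sessions = [
    {"revenue": 40.0, "cogs": 22.0, "returns_cost": 6.0, "support_cost": 3.0},
    {"revenue": 0.0, "cogs": 0.0, "returns_cost": 0.0, "support_cost": 1.0},
]
site_sessions = [
    {"revenue": 35.0, "cogs": 20.0, "returns_cost": 2.0, "support_cost": 0.5},
    {"revenue": 0.0, "cogs": 0.0, "returns_cost": 0.0, "support_cost": 0.0},
]

print(contribution_margin_per_visit(ai_sessions))    # 4.0
print(contribution_margin_per_visit(site_sessions))  # 6.25
```

In this toy example the AI cohort has the higher revenue per order but the lower margin per visit, because the returns and support columns eat it. That is exactly the pattern "expensive chaos" looks like in a spreadsheet.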
Measure trust and expectation mismatch
- PDP bounce rate (if they land on PDPs at all)
- Shipping info interactions (do they immediately look for shipping and returns?)
- Policy page views
- Coupon field focus rate (sounds small, but it’s a trust signal in many verticals)
- Post purchase survey: “did anything surprise you?”
And yes, sometimes you need qualitative research. Five user interviews can explain what 50,000 sessions can’t.
Measure category and SKU sensitivity
AI doesn’t hit all products equally. Segment by:
- commodity vs considered purchase
- price bands
- size and fit complexity
- regulated categories
- products with high variant complexity
You’ll usually find that AI performs “fine” on simple replenishment and gets wrecked on nuanced selection.
How to design experiments that compare AI referrals vs classic organic and on-site flows
You need experiments that are boring. Controlled. A little annoying to set up. The kind that produce an answer you can defend to a skeptical CFO.
1. Create a clean channel definition for AI traffic
At minimum:
- UTM standards for AI partnerships or tracked links
- referral domain rules (ChatGPT, Perplexity, Copilot, etc.)
- separate reporting views for “AI discovery” vs “AI checkout” if those are different flows
Don’t let AI sessions fall into “Direct”. That’s how you lose the plot for six months.
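The referral-domain rule can start as a simple lookup table. A sketch, assuming you can read the raw referrer and `utm_source` on each session; the domain list and the `ai_` prefix convention are illustrative, not a standard:

```python
from urllib.parse import urlparse

# Illustrative list; extend it as new assistants send traffic.
AI_REFERRER_DOMAINS = {
    "chatgpt.com",
    "chat.openai.com",
    "perplexity.ai",
    "copilot.microsoft.com",
}

def classify_channel(referrer: str, utm_source: str = "") -> str:
    """Bucket a session so AI traffic never silently falls into 'direct'."""
    if utm_source.startswith("ai_"):  # your own tracked links / partnerships
        return "ai"
    host = urlparse(referrer).netloc.lower().removeprefix("www.")
    if host in AI_REFERRER_DOMAINS:
        return "ai"
    return "referral" if host else "direct"

print(classify_channel("https://chatgpt.com/c/abc123"))  # ai
print(classify_channel("", utm_source="ai_chatgpt"))     # ai
print(classify_channel(""))                              # direct
```

Whatever analytics stack you run, the point is the same: the classification rule lives in one place, is versioned, and is applied before any reporting happens.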
2. Match cohorts, not just channels
Comparing AI traffic to “site average” is lazy. Match by:
- device type
- geography
- new vs returning
- category
- price band
- time of day (seriously, it matters)
- promo exposure
Then compare conversion and margin.
If AI traffic is mostly top-of-funnel research users, your benchmark should be informational SEO landings, not branded search.
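In code, cohort matching just means keying conversion by the attributes above before comparing anything. A rough sketch, assuming flat session records with hypothetical field names:

```python
from collections import defaultdict

def conversion_by_cohort(sessions):
    """Conversion rate keyed by (device, geo, new-vs-returning)."""
    buckets = defaultdict(lambda: [0, 0])  # cohort -> [orders, sessions]
    for s in sessions:
        key = (s["device"], s["geo"], s["is_new"])
        buckets[key][0] += s["converted"]
        buckets[key][1] += 1
    return {k: orders / total for k, (orders, total) in buckets.items()}

def compare_channels(ai_sessions, seo_sessions):
    """Compare conversion only within cohorts both channels actually share."""
    ai = conversion_by_cohort(ai_sessions)
    seo = conversion_by_cohort(seo_sessions)
    return {k: (ai[k], seo[k]) for k in ai.keys() & seo.keys()}

ai_sessions = [
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 1},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
]
seo_sessions = [
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 1},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
    {"device": "desktop", "geo": "DE", "is_new": False, "converted": 1},
]

print(compare_channels(ai_sessions, seo_sessions))
# {('mobile', 'US', True): (0.5, 0.25)}
```

Note that the desktop/DE cohort drops out entirely because only one channel has it. That is the feature, not a bug: unmatched cohorts are exactly the comparisons that mislead.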
3. Use holdouts where possible
If you can influence exposure (some brands can, some can’t):
- hold out a subset of SKUs from AI commerce programs
- hold out a subset of geo regions
- time-based holdouts (two weeks on, two weeks off)
Then measure incrementality. Not correlation.
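The incrementality read from a holdout is a one-liner, but it is worth writing down so everyone computes it the same way:

```python
def incremental_lift(treatment_cr: float, control_cr: float) -> float:
    """Relative lift of the exposed group over the holdout.

    treatment_cr: conversion rate where the AI program was live
    control_cr:   conversion rate in the held-out SKUs / geos / weeks
    """
    if control_cr <= 0:
        raise ValueError("control conversion rate must be > 0")
    return (treatment_cr - control_cr) / control_cr

# 2.3% conversion with the program on vs 2.0% in the holdout -> ~15% lift.
print(round(incremental_lift(0.023, 0.020), 3))
```

If the lift is indistinguishable from zero once you account for noise, the AI channel is shuffling orders you would have gotten anyway, not creating them.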
4. Build a “friction map” for the AI flow
Document:
- what steps exist
- where the user leaves your controlled environment
- what identity and payment artifacts appear
- what confirmation and support paths exist
Then instrument everything you can control. Especially:
- add to cart
- checkout start
- shipping step completion
- payment errors
- order confirmation
- post purchase email open and support interactions
Even a rough map will show you why the experience might convert worse. Usually, it’s not magical. It’s one or two sharp edges.
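Once the events above are flowing, a crude step-to-step completion table is usually enough to locate those edges. A sketch with made-up counts:

```python
FUNNEL = [
    "add_to_cart",
    "checkout_start",
    "shipping_complete",
    "payment_attempt",
    "order_confirmed",
]

def step_completion(counts):
    """Completion rate of each funnel step relative to the step before it."""
    return {
        cur: counts[cur] / counts[prev] if counts[prev] else 0.0
        for prev, cur in zip(FUNNEL, FUNNEL[1:])
    }

# Hypothetical counts from an AI-referred cohort.
counts = {
    "add_to_cart": 1000,
    "checkout_start": 600,
    "shipping_complete": 540,
    "payment_attempt": 480,
    "order_confirmed": 300,
}

for step, rate in step_completion(counts).items():
    print(f"{step}: {rate:.1%}")
```

If AI-referred users fall off between payment attempt and confirmation while site traffic does not, that points at payment trust or errors, not at product selection.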
5. Run A/B tests on your own site to mimic AI compression
You can simulate intent compression by sending users into deeper steps:
- land on PDP vs collection
- land on “quick buy” modules
- hide some comparison content
- change how shipping is disclosed
If conversion changes in similar ways, you’ve learned something important: the assistant isn’t the only variable. The “missing context” is.
So what should you actually do? Practical moves that help
A few steps that tend to pay off regardless of which assistant wins.
Make your product data and policies easy for machines to represent
This means:
- clean titles, variants, attributes
- consistent shipping and returns info
- clear stock status
- structured data where applicable
- feed hygiene if you run shopping feeds
Not glamorous. But this is the substrate AI uses.
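For the structured-data line, the common target is schema.org `Product` markup in JSON-LD. A minimal sketch that builds it in Python; all values are placeholders, and real markup should be validated against schema.org before shipping:

```python
import json

def product_jsonld(name, sku, price, currency, in_stock):
    """Minimal schema.org Product JSON-LD for a PDP <script> tag."""
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "Product",
            "name": name,
            "sku": sku,
            "offers": {
                "@type": "Offer",
                "price": f"{price:.2f}",
                "priceCurrency": currency,
                # Clear stock status is exactly the cue assistants extract.
                "availability": (
                    "https://schema.org/InStock"
                    if in_stock
                    else "https://schema.org/OutOfStock"
                ),
            },
        },
        indent=2,
    )

print(product_jsonld("Example Widget", "WID-123", 19.99, "USD", True))
```

The point is less the snippet than the discipline: titles, SKUs, prices, and stock status generated from one source of truth, so the version a model reads matches the version a human sees.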
Build content that answers purchase blocking questions
AI surfaces answers. So give it answers that remove risk:
- sizing and fit guidance
- compatibility matrices
- “what’s in the box”
- warranty clarity
- returns steps
- shipping cutoffs
This isn’t just for ranking. It’s for accurate representation inside AI results. If you need a workflow for creating and optimizing content at scale without it turning into generic mush, this is relevant: AI SEO tools for content optimization.
Treat AI visibility like SEO, but with profit constraints
You still want to show up. You just don’t want to confuse visibility with business value.
This is where a platform like SEO Software fits naturally. Not as “AI will write 10,000 posts, congrats”, but as an operating system for building and updating content that is meant to rank, be extracted correctly, and drive profitable demand. Research, writing, optimization, publishing, and ongoing refresh. The boring stuff that wins.
If you want the deeper workflow view, here’s a good internal guide: An AI SEO content workflow that ranks.
The real lesson from Walmart’s 3x worse conversion
AI assistants are not a cheat code that lets you skip the hard parts of ecommerce. They move the hard parts around.
They compress intent. They strip context. They introduce trust and handoff friction. They reduce your UX control. They complicate attribution. And they hide the very signals you normally use to optimize conversion.
Which is why some AI commerce initiatives will look exciting in screenshots and disappointing in revenue reports.
So yes, pursue AI commerce. But pursue it like an operator.
Measure profit per visit, not mentions. Run cohort matched experiments, not hype driven launches. And build your SEO and content engine around the reality that AI is becoming a front door, whether we like it or not.
If you want to get serious about that, and optimize for profitable AI traffic, not novelty traffic, take a look at SEO Software. It’s built for teams that want to automate the SEO work, keep quality high, and stay visible in both Google and the AI assistants that are quietly eating the top of the funnel.