Walmart Says ChatGPT Checkout Converted 3x Worse Than Its Website: The Real AI Commerce Lesson
Walmart says ChatGPT checkout converted 3x worse than its website. Here’s what that means for ecommerce SEO, AI traffic quality, and conversion design.

Hacker News had a fun little moment recently. Someone surfaced a Search Engine Land story saying Walmart tested a ChatGPT checkout flow and it converted roughly 3x worse than the normal Walmart.com experience.
Not a little worse. Not “needs iteration”. Three times worse.
And if you run ecommerce, or you own SEO, or you sit in product and you are being asked, weekly, “what’s our AI strategy?”, this is the kind of datapoint that matters. Because it forces the uncomfortable question underneath the hype.
If AI assistants are going to “replace websites”, why did the assistant checkout underperform a regular site journey so badly?
The lesson here is not “AI is useless”. It’s that AI-assisted discovery and AI-assisted checkout are different animals than classic web sessions. Different intent. Different context. Different trust. Different control. Different instrumentation. So you can very easily mistake novelty traffic for profitable traffic.
Let’s break down what’s actually happening, what it means for your SEO and product roadmap, and how to run experiments that tell you the truth.
The signal: conversion rate is not portable
In normal ecommerce thinking, you take a high-intent user, you remove friction, and conversion goes up. AI assistants promise both: more intent and less friction.
But conversion isn’t a universal constant you can “move” into a new container.
Conversion rate is an outcome of an entire system:
- The source and its pre-framing
- The amount of context the user has when they land
- The merchandising environment (comparison, reviews, alternatives, bundles, inventory cues, returns policy)
- The trust state
- The UX control you have over the path to purchase
- The measurement you have to diagnose drop-offs
- The feedback loops you use to improve the funnel
Change the container and you change the system. Which is why some AI commerce flows will print money. And others will quietly bleed.
Also, this aligns with the bigger trend we have been writing about. AI answers are already reshaping click patterns and starving pages of the “warm up” traffic that used to convert later. If you want the broader picture, this piece on Google AI summaries and traffic loss is worth reading: Google AI summaries killing website traffic and how to fight back.
Now, let’s get specific.
Why AI-assisted commerce can underperform site-native flows
1. Intent compression: the assistant skips the “shopping” part
Classic ecommerce journeys have a lot of “pre-conversion work” baked in:
- scrolling category pages
- filtering
- reading reviews
- comparing specs
- opening multiple tabs
- checking shipping and returns
- looking for coupons
- sanity checking the brand
This isn’t wasted time. It’s intent formation. It’s the user convincing themselves.
AI assistants compress all of that into a few turns of chat. Great for speed, but it can create a weird outcome: the user arrives at checkout with less internal commitment, even if they sound decisive in the chat.
A chat message like “yeah that one seems fine, buy it” is not the same as someone who spent six minutes comparing two SKUs, checked sizing, and read 10 reviews. The second person has already done the emotional work. The first is still half browsing.
So you get more “almost” buyers hitting checkout. And checkout hates “almost”.
This is one reason AI commerce can generate what looks like high-intent traffic (short sessions, direct product selection) that actually behaves like mid-funnel traffic when you measure completion.
2. Weak merchandising context: AI can pick, but it can’t fully merchandise
Your website is not just a catalog. It’s a persuasion machine. The best ecommerce sites do a thousand little things:
- social proof density
- “why this one” bullet points
- comparison tables
- bundles
- upsells that actually help
- stock urgency that is real
- shipping promise clarity
- returns reassurance
- fit guides
- compatibility checks
- user generated photos
An assistant might recommend a product, but it often strips away the context that makes the product feel safe to buy.
Even if the AI shows some of this, you don’t control how it’s presented, what’s omitted, or what’s simplified. Sometimes that’s fine. Sometimes it’s fatal. Especially for products where the purchase decision is really about risk management.
Also, AI can collapse differentiation. Your brand work gets reduced to “seems good” unless the assistant is explicitly fed structured reasons to trust you.
We saw a similar dynamic in the broader AI shopping space, where the discovery layer becomes the “storefront” and your PDP becomes optional. If you are tracking this shift, this is relevant background: ChatGPT search shopping update and what it means for ecommerce SEO.
3. Trust friction: users trust the assistant, then suddenly they don’t
This part is subtle.
People will trust an assistant to summarize. They will trust it to compare. They will even trust it to recommend. But paying is different.
Payment is where the user stops thinking “this is informational” and starts thinking “this is real money leaving my account”. That’s where they want:
- clear merchant identity
- clear fulfillment responsibility
- clear returns
- clear customer support path
- reassurance they’re not being tricked
On your site, all of that is visible and familiar. Inside an assistant experience, it can feel like a handoff. Or worse, a blur.
Also, fraud anxiety is not rational. It’s vibes. Users are trained to be suspicious of new payment flows. And AI checkout is new.
If you’ve followed brand trust issues around AI, you already know how quickly confidence collapses when something feels off. Even unrelated AI incidents have trained users to be cautious. (This is adjacent but relevant: Meta AI celebrity impersonator detection and brand trust.)
4. Checkout UX control: you lose the levers that usually save conversion
When conversion drops on your site, you can do things:
- reduce steps
- change copy
- add express pay
- add guest checkout
- improve error handling
- tune address validation
- show shipping earlier
- localize payment methods
- fix mobile layout bugs
- implement abandoned checkout recovery
In an AI-mediated flow, you often lose direct control of the UX. Even if the assistant is “powered by” your systems, the interaction layer is not yours. That means fewer levers, slower iteration, and more dependency on a third party’s design decisions.
It also means your usual conversion best practices may not map. A chat interface can be great for choices. It can be clunky for forms. And checkout is mostly forms, edge cases, validation, and compliance.
So Walmart’s result might not be “ChatGPT is bad at commerce”. It might be “checkout UX is not something you can casually abstract away.”
5. Attribution gets weird, and weird attribution creates bad decisions
AI commerce flows can break your analytics in a few ways:
- traffic can be mislabeled or lumped into “referral”
- deep links may skip pages that normally set cookies and sessions
- cross-domain handoffs can drop parameters
- server side events may not tie cleanly to user journeys
- you lose view of “assist interactions” that shaped the decision
When attribution is fuzzy, you can’t tell what’s happening.
And then the really dangerous thing happens. Teams start using proxy metrics:
- “we got mentioned in ChatGPT”
- “we show up in AI answers”
- “AI traffic is up”
- “engagement time is high”
- “users asked about us”
All nice. None are profit.
If you want to think about AI like a channel you can actually operate, you need to map it into the same measurement discipline as SEO and paid. Incrementality. Cohorts. Contribution margin. Not vibes.
(If your org is still building manual reporting and duct taping workflows together, it gets worse. Here’s a good internal read on moving faster with automation: AI workflow automation to cut manual work and move faster.)
6. Optimization blind spots: you can’t improve what you can’t see
On your site, you can watch recordings, analyze funnels, run A/B tests, inspect logs, and ship fixes daily.
In AI assistant commerce, you may not see:
- what the user was told before they arrived
- what alternatives were suggested
- what objections the user raised
- what the assistant answered (and whether it was accurate)
- whether your product data was misinterpreted
- whether your shipping promise was represented correctly
That’s brutal because a lot of conversion problems in commerce are not “the product is wrong”. They’re “the user got the wrong expectation.”
If an assistant tells them something slightly off, the checkout becomes the first moment they realize it. That is when they bounce, annoyed. And you never know why.
This is where AI search strategy starts to overlap with structured data, feed hygiene, and content that is written for extraction not just ranking. You’re not only optimizing for Google’s blue links. You’re optimizing for a model’s ability to represent you accurately.
If you want one practical angle on how AI systems compare and choose, this is a useful contrast piece: Amazon vs Perplexity AI shopping and the SEO implications.
What brands should measure before celebrating AI-commerce traffic
If you’re getting traffic from AI assistants, do not start with “sessions” or “mentions”. Start with unit economics and funnel integrity.
Here’s a clean measurement list that works even if the AI platform is messy.
Measure conversion quality, not just conversion rate
- Checkout start rate (AI referred sessions that reach checkout)
- Payment attempt rate (how many try to pay)
- Authorization failure rate (fraud filters, payment errors)
- Time to purchase (how long from landing to order confirmation)
- New vs returning mix (AI will skew new, which usually converts worse)
- AOV and items per order
- Return rate and cancellation rate (this is huge; intent compression can inflate returns)
- Customer support contacts per order (signals expectation mismatch)
- Contribution margin per visit (not revenue per visit)
If AI referrals “convert” but they come with higher returns and more support load, you may be buying expensive chaos.
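That last metric is the one that settles arguments, so it is worth computing the same way for every channel. A minimal sketch in Python, assuming you can join per-session revenue with COGS, returns, and support costs (all field names here are hypothetical):

```python
def contribution_margin_per_visit(sessions):
    """Contribution margin per visit for a cohort of sessions.

    Each session dict carries revenue and the costs that usually get
    ignored when teams quote "revenue per visit".
    """
    if not sessions:
        return 0.0
    margin = sum(
        s["revenue"] - s["cogs"] - s["returns_cost"] - s["support_cost"]
        for s in sessions
    )
    return margin / len(sessions)

# Hypothetical cohorts: AI-referred vs matched site traffic.
ai_sessions = [
    {"revenue": 40.0, "cogs": 22.0, "returns_cost": 6.0, "support_cost": 3.0},
    {"revenue": 0.0, "cogs": 0.0, "returns_cost": 0.0, "support_cost": 1.0},
]
site_sessions = [
    {"revenue": 35.0, "cogs": 20.0, "returns_cost": 2.0, "support_cost": 0.5},
    {"revenue": 0.0, "cogs": 0.0, "returns_cost": 0.0, "support_cost": 0.0},
]

print(contribution_margin_per_visit(ai_sessions))    # 4.0
print(contribution_margin_per_visit(site_sessions))  # 6.25
```

In this toy example the AI cohort has the higher revenue per order but the lower margin per visit, because the returns and support columns eat it. That is exactly the pattern "expensive chaos" looks like in a spreadsheet.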
Measure trust and expectation mismatch
- PDP bounce rate (if they land on PDPs at all)
- Shipping info interactions (do they immediately look for shipping and returns?)
- Policy page views
- Coupon field focus rate (sounds small, but it’s a trust signal in many verticals)
- Post purchase survey: “did anything surprise you?”
And yes, sometimes you need qualitative research. Five user interviews can explain what 50,000 sessions can’t.
Measure category and SKU sensitivity
AI doesn’t hit all products equally. Segment by:
- commodity vs considered purchase
- price bands
- size and fit complexity
- regulated categories
- products with high variant complexity
You’ll usually find that AI performs “fine” on simple replenishment and gets wrecked on nuanced selection.
How to design experiments that compare AI referrals vs classic organic and on-site flows
You need experiments that are boring. Controlled. A little annoying to set up. The kind that produce an answer you can defend to a skeptical CFO.
1. Create a clean channel definition for AI traffic
At minimum:
- UTM standards for AI partnerships or tracked links
- referral domain rules (ChatGPT, Perplexity, Copilot, etc.)
- separate reporting views for “AI discovery” vs “AI checkout” if those are different flows
Don’t let AI sessions fall into “Direct”. That’s how you lose the plot for six months.
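The referral-domain rule can start as a simple lookup table. A sketch, assuming you can read the raw referrer and `utm_source` on each session; the domain list and the `ai_` prefix convention are illustrative, not a standard:

```python
from urllib.parse import urlparse

# Illustrative list; extend it as new assistants send traffic.
AI_REFERRER_DOMAINS = {
    "chatgpt.com",
    "chat.openai.com",
    "perplexity.ai",
    "copilot.microsoft.com",
}

def classify_channel(referrer: str, utm_source: str = "") -> str:
    """Bucket a session so AI traffic never silently falls into 'direct'."""
    if utm_source.startswith("ai_"):  # your own tracked links / partnerships
        return "ai"
    host = urlparse(referrer).netloc.lower().removeprefix("www.")
    if host in AI_REFERRER_DOMAINS:
        return "ai"
    return "referral" if host else "direct"

print(classify_channel("https://chatgpt.com/c/abc123"))  # ai
print(classify_channel("", utm_source="ai_chatgpt"))     # ai
print(classify_channel(""))                              # direct
```

Whatever analytics stack you run, the point is the same: the classification rule lives in one place, is versioned, and is applied before any reporting happens.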
2. Match cohorts, not just channels
Comparing AI traffic to “site average” is lazy. Match by:
- device type
- geography
- new vs returning
- category
- price band
- time of day (seriously, it matters)
- promo exposure
Then compare conversion and margin.
If AI traffic is mostly top-of-funnel research users, your benchmark should be informational SEO landings, not branded search.
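In code, cohort matching just means keying conversion by the attributes above before comparing anything. A rough sketch, assuming flat session records with hypothetical field names:

```python
from collections import defaultdict

def conversion_by_cohort(sessions):
    """Conversion rate keyed by (device, geo, new-vs-returning)."""
    buckets = defaultdict(lambda: [0, 0])  # cohort -> [orders, sessions]
    for s in sessions:
        key = (s["device"], s["geo"], s["is_new"])
        buckets[key][0] += s["converted"]
        buckets[key][1] += 1
    return {k: orders / total for k, (orders, total) in buckets.items()}

def compare_channels(ai_sessions, seo_sessions):
    """Compare conversion only within cohorts both channels actually share."""
    ai = conversion_by_cohort(ai_sessions)
    seo = conversion_by_cohort(seo_sessions)
    return {k: (ai[k], seo[k]) for k in ai.keys() & seo.keys()}

ai_sessions = [
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 1},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
]
seo_sessions = [
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 1},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
    {"device": "mobile", "geo": "US", "is_new": True, "converted": 0},
    {"device": "desktop", "geo": "DE", "is_new": False, "converted": 1},
]

print(compare_channels(ai_sessions, seo_sessions))
# {('mobile', 'US', True): (0.5, 0.25)}
```

Note that the desktop/DE cohort drops out entirely because only one channel has it. That is the feature, not a bug: unmatched cohorts are exactly the comparisons that mislead.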
3. Use holdouts where possible
If you can influence exposure (some brands can, some can’t):
- hold out a subset of SKUs from AI commerce programs
- hold out a subset of geo regions
- time-based holdouts (two weeks on, two weeks off)
Then measure incrementality. Not correlation.
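The incrementality read from a holdout is a one-liner, but it is worth writing down so everyone computes it the same way:

```python
def incremental_lift(treatment_cr: float, control_cr: float) -> float:
    """Relative lift of the exposed group over the holdout.

    treatment_cr: conversion rate where the AI program was live
    control_cr:   conversion rate in the held-out SKUs / geos / weeks
    """
    if control_cr <= 0:
        raise ValueError("control conversion rate must be > 0")
    return (treatment_cr - control_cr) / control_cr

# 2.3% conversion with the program on vs 2.0% in the holdout -> ~15% lift.
print(round(incremental_lift(0.023, 0.020), 3))
```

If the lift is indistinguishable from zero once you account for noise, the AI channel is shuffling orders you would have gotten anyway, not creating them.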
4. Build a “friction map” for the AI flow
Document:
- what steps exist
- where the user leaves your controlled environment
- what identity and payment artifacts appear
- what confirmation and support paths exist
Then instrument everything you can control. Especially:
- add to cart
- checkout start
- shipping step completion
- payment errors
- order confirmation
- post purchase email open and support interactions
Even a rough map will show you why the experience might convert worse. Usually, it’s not magical. It’s one or two sharp edges.
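Once the events above are flowing, a crude step-to-step completion table is usually enough to locate those edges. A sketch with made-up counts:

```python
FUNNEL = [
    "add_to_cart",
    "checkout_start",
    "shipping_complete",
    "payment_attempt",
    "order_confirmed",
]

def step_completion(counts):
    """Completion rate of each funnel step relative to the step before it."""
    return {
        cur: counts[cur] / counts[prev] if counts[prev] else 0.0
        for prev, cur in zip(FUNNEL, FUNNEL[1:])
    }

# Hypothetical counts from an AI-referred cohort.
counts = {
    "add_to_cart": 1000,
    "checkout_start": 600,
    "shipping_complete": 540,
    "payment_attempt": 480,
    "order_confirmed": 300,
}

for step, rate in step_completion(counts).items():
    print(f"{step}: {rate:.1%}")
```

If AI-referred users fall off between payment attempt and confirmation while site traffic does not, that points at payment trust or errors, not at product selection.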
5. Run A/B tests on your own site to mimic AI compression
You can simulate intent compression by sending users into deeper steps:
- land on PDP vs collection
- land on “quick buy” modules
- hide some comparison content
- change how shipping is disclosed
If conversion changes in similar ways, you’ve learned something important: the assistant isn’t the only variable. The “missing context” is.
So what should you actually do? Practical moves that help
A few steps that tend to pay off regardless of which assistant wins.
Make your product data and policies easy for machines to represent
This means:
- clean titles, variants, attributes
- consistent shipping and returns info
- clear stock status
- structured data where applicable
- feed hygiene if you run shopping feeds
Not glamorous. But this is the substrate AI uses.
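For the structured-data line, the common target is schema.org `Product` markup in JSON-LD. A minimal sketch that builds it in Python; all values are placeholders, and real markup should be validated against schema.org before shipping:

```python
import json

def product_jsonld(name, sku, price, currency, in_stock):
    """Minimal schema.org Product JSON-LD for a PDP <script> tag."""
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "Product",
            "name": name,
            "sku": sku,
            "offers": {
                "@type": "Offer",
                "price": f"{price:.2f}",
                "priceCurrency": currency,
                # Clear stock status is exactly the cue assistants extract.
                "availability": (
                    "https://schema.org/InStock"
                    if in_stock
                    else "https://schema.org/OutOfStock"
                ),
            },
        },
        indent=2,
    )

print(product_jsonld("Example Widget", "WID-123", 19.99, "USD", True))
```

The point is less the snippet than the discipline: titles, SKUs, prices, and stock status generated from one source of truth, so the version a model reads matches the version a human sees.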
Build content that answers purchase blocking questions
AI surfaces answers. So give it answers that remove risk:
- sizing and fit guidance
- compatibility matrices
- “what’s in the box”
- warranty clarity
- returns steps
- shipping cutoffs
This isn’t just for ranking. It’s for accurate representation inside AI results. If you need a workflow for creating and optimizing content at scale without it turning into generic mush, this is relevant: AI SEO tools for content optimization.
Treat AI visibility like SEO, but with profit constraints
You still want to show up. You just don’t want to confuse visibility with business value.
This is where a platform like SEO Software fits naturally. Not as “AI will write 10,000 posts, congrats”, but as an operating system for building and updating content that is meant to rank, be extracted correctly, and drive profitable demand. Research, writing, optimization, publishing, and ongoing refresh. The boring stuff that wins.
If you want the deeper workflow view, here’s a good internal guide: An AI SEO content workflow that ranks.
The real lesson from Walmart’s 3x worse conversion
AI assistants are not a cheat code that lets you skip the hard parts of ecommerce. They move the hard parts around.
They compress intent. They strip context. They introduce trust and handoff friction. They reduce your UX control. They complicate attribution. And they hide the very signals you normally use to optimize conversion.
Which is why some AI commerce initiatives will look exciting in screenshots and disappointing in revenue reports.
So yes, pursue AI commerce. But pursue it like an operator.
Measure profit per visit, not mentions. Run cohort matched experiments, not hype driven launches. And build your SEO and content engine around the reality that AI is becoming a front door, whether we like it or not.
If you want to get serious about that, and optimize for profitable AI traffic, not novelty traffic, take a look at SEO Software. It’s built for teams that want to automate the SEO work, keep quality high, and stay visible in both Google and the AI assistants that are quietly eating the top of the funnel.