How Consumers Navigate High-Stakes Purchases In AI Mode via @sejournal, @Kevin_Indig
New research shows AI Mode is reshaping buying decisions. Learn how to secure visibility, trust, and top placement. The post How Consumers Navigate High-Stakes Purchases In AI Mode appeared first on Search Engine Journal.
AI Mode is compressing the stage where buyers compare, reject, and discover brands on their own. Our new usability study of 185 documented purchase tasks shows that 74% of AI Mode final shortlists came directly from the AI’s output – no external check, no triangulation, no second opinion.
This analysis will cover:
- How the comparison search phase has collapsed.
- What this means for brands competing in categories with high competitor AI Mode saturation.
- The three levers that determine whether your brand shows up.

Why We Conducted The Study
AI transforms Search from a list of results into a list of recommendations (a shortlist). Until now, we have had no idea how users treat AI shortlists. Do they take them at face value or thoroughly validate them?
That’s why I partnered with Citation Labs and Clickstream Solutions to record real users and their interactions when facing high-stakes purchases. This usability study of 48 participants completing 185 major-purchase tasks reveals that AI Mode operates as a recommendation environment, not a comparison one.
In traditional search, people click through results, comparing across sources to assemble a candidate set. In AI Mode, they accept the AI’s candidates and move on. 74% of AI Mode shortlists came directly from the AI’s output with no external check. In traditional search, more than half of users built their own shortlist from scratch.
The study covers four categories (televisions, laptops, washer/dryer sets, and car insurance). Participants completed tasks using both AI Mode and traditional search in a within-subjects A/B design, producing 149 AI Mode task observations and 36 search observations. The behavioral patterns are consistent enough across categories and participants to carry weight. (Full study design is at the end.)
From Garret French, founder of Citation Labs:
“In AI Mode, buyers often use a shortlist synthesis to shortcut the cognitive effort of Standard Searching and comparing. This raises the value of onsite decision assets and third-party sources that provide AI with clear trade-offs, specific evidence, and sufficient contextual structure to describe a brand’s offering with confidence.”
From Eric Van Buskirk:
“The absence of narrowness frustration is the most intellectually significant finding: 15% in AI Mode vs. 11% in Search, with no meaningful statistical difference. That’s the finding that rules out the obvious alternative explanation – that users accepted the AI’s shortlist because they felt trapped. They didn’t push back. They weren’t frustrated. They were satisfied. That makes the acceptance harder to dismiss.”
Here’s what happened.
1. 88% Of Users Took The AI’s Shortlist Outright
Across the laptop and insurance tasks, where participants used both search surfaces (classic search and AI Mode), the gap in constructing a product shortlist was stark.
Image Credit: Kevin Indig
Definitions:
- AI Adopted: The participant took the AI’s recommended candidates as their shortlist with no changes or external verification.
- User Built: The participant ignored the AI’s (or Search’s) suggestions and assembled their own candidate list from independent sources.
- AI Verified: The participant started with the AI’s candidates but checked them against an outside source (a retailer site, a review, a manufacturer page) before finalizing.
- Hybrid: The participant combined AI-suggested candidates with at least one candidate they found independently.

In classic search, 56% of participants built their own shortlist from multiple sources. In AI Mode, only 8 out of 147 codeable tasks produced a genuinely self-built shortlist. The user’s comparison process didn’t just shrink when using AI Mode. For most participants, it didn’t happen at all.
64% of AI Mode participants clicked nothing at all during their task. They read the AI’s text, sometimes scrolled through inline product snippets, and declared their finalists. The no-click rate varied by category:
Image Credit: Kevin Indig
Insurance participants delegated most heavily. Washer/dryer participants clicked the most, likely because appliance decisions involve specific physical constraints (capacity, stacking compatibility, dimensions) that the AI summary didn’t always resolve.
The 36% who did interact with individual results within AI Mode broke into two groups:
- About 15% of the AI Adopted group (17 of 117 participants) verified inside AI Mode: They opened inline product cards or merchant pop-ups to check a price or spec, then returned to the AI’s list. Others used follow-up prompts as verification tools, asking the AI for prices or narrowing by constraints.
- A separate 23% of all AI Mode tasks involved at least one visit to an external website, mostly retailers (Best Buy appeared in 10 of 34 tasks with external visits) and manufacturer sites.

The destination pattern matters: Users left AI Mode to confirm a candidate they’d already accepted from the AI’s list, not to find new ones.
Of the 117 participants who adopted the AI’s shortlist directly, roughly 85% showed no internal verification behavior at all. Participants who built their own lists took an average of 89 seconds longer and consulted more than twice as many sources.
“Given that the first paragraph says Lenovo or Apple… going with that,” said one user about laptops when searching via AI Mode. Position one in the AI response was the entire decision. Another AI Mode user remarked: “I liked it more than anything else I’ve ever used for product searching. It made it a lot quicker to find the options.” They experienced speed as a valuable feature, not a shortcut.

In classic search, the pattern reversed: Nearly 89% of participants clicked on something.
One insurance participant clicked out to Progressive and GEICO independently, read both landing pages, consulted an Experian article, and then arrived at a shortlist. A laptop participant applied hardware filters and flagged a review score discrepancy: “It shows 4.6 out of 5 stars for the reviews, but when you actually click the link: not reviewed yet.” Active skepticism of aggregated data was a behavior absent from AI Mode transcripts.

2. The AI’s Top Pick Becomes The User’s Top Pick 74% Of The Time
Just like in classic search, the top answer carries outsized weight. 74% of participants chose the item ranked first in the AI’s response as their top pick. The mean rank of the final choice was 1.35. Only 10% chose something ranked third or lower.
Image Credit: Kevin Indig
Position one in the AI’s output carries an outsized advantage because of where it sits: inside a curated section that typically contains two to five items, after the AI has already done the filtering. The first item is the AI’s top pick. When people engage with AI Mode, we know they read almost all of the output: The first AI Mode study found users spend 50 to 80 seconds reading AI Mode output, more than double the dwell time on AI Overviews. Users are reading carefully. They just read within a set the AI already narrowed.
However, 26% of participants in this study overrode rank order. The driver: brand recognition. They spotted a brand lower on the list and preferred it regardless of where the AI placed it. TV and laptop categories saw this most, where participants arrived with existing preferences for Samsung, LG, Apple, or Lenovo. But overriding rank did not mean rejecting the AI’s output: 81% of rank-override participants still chose from the AI’s candidate set.
3. The AI’s Words Become The Trust Signal
“Travelers and USAA actually tell me how much, whereas State Farm and GEICO give percentages. Just knowing the exact amount makes me want to pick Travelers or USAA right off the bat.”
That quote captures a core pattern in AI Mode trust. The AI’s formatting shaped the decision: Dollar amounts versus percentage discounts determined which brands made the shortlist.
AI framing (37%), meaning how the AI talks about the product, and brand recognition (34%) were the top two trust drivers in AI Mode. They run nearly even:
Brand recognition led when participants arrived with brand preferences. AI’s wording filled the gaps where participants didn’t already have preferences.
Image Credit: Kevin Indig
In classic search, the dominant trust mechanism was multi-source convergence: Participants built confidence by checking whether multiple independent sources agreed about a product.
Essentially, users triangulated. One checked Progressive, then GEICO, then an Experian article. Another compared aggregated star ratings against reviews on the actual site. They were building a case from separate inputs.
That behavior was almost absent in AI Mode (5%). Instead, AI framing (how the AI worded its description of a product) and brand recognition were the top two trust drivers.
The split between these two signals tracked closely with product category:
Image Credit: Kevin Indig
For televisions and laptops, where most participants arrived with existing brand preferences, brand recognition dominated. For insurance and washer/dryer, where participants had less prior knowledge, AI framing dominated.
When you lack a prior view, the AI’s description becomes the trust signal. In AI Mode, the synthesis is the corroboration. Participants treated the AI’s summary as if the cross-checking had already been done for them.
The first study showed a related pattern from the supply side: AI Mode matches site type to intent, surfacing brands for transactional queries and review sites for comparisons. This study shows the demand side of the same behavior: When the AI surfaces a brand the user already knows, brand recognition drives the decision; when it doesn’t, the AI’s own framing fills that role. The site-type matching and the trust mechanism reinforce each other.
4. If You’re Not In The List, You Don’t Exist
Purchase outcomes in AI Mode concentrated heavily. For laptops, three brands captured 93% of all AI Mode final choices. In classic search, the distribution was broader: HP EliteBook variants appeared three times, ASUS once, and other brands got consideration they never received in AI Mode.
Image Credit: Kevin Indig
Two distinct problems emerged:
Brands that never appeared in the AI’s output were never considered. Participants didn’t see them, so they couldn’t evaluate them. The AI decided who made the list, not the buyer.

Brands that did appear but lacked recognition faced a different problem: They weren’t seriously considered. Erie Insurance showed up in AI Mode results, but multiple participants eliminated it on name recognition alone. The brand was present but hadn’t built enough awareness to survive the moment of selection. One participant dropped a brand because it lacked a hyperlink in the AI output, reading that formatting gap as a credibility signal: “There’s not even a link there.”

Another participant said when using AI Mode: “I’m already eager to believe these are good recommendations because it mentions LG and Samsung, two brands I consider very reliable.” The AI didn’t say those brands were better. The participant inferred it from familiarity.
Participants didn’t feel constrained by the narrower set. Narrowness frustration appeared in 15% of AI Mode tasks and 11% of classic search tasks, statistically indistinguishable. The option set shrank, but the feeling of having enough options didn’t change. The most skeptical AI Mode participant in the comparison set, who complained the AI kept pointing to “teen drivers, teen drivers, teen drivers,” still chose GEICO and Travelers: the consensus AI result.
5. Users Leave To Buy, Not To Research
23% of AI Mode tasks involved an external site visit, but keep in mind these prompts reflect high-stakes situations. In standard search, that figure was 67%.
Image Credit: Kevin Indig
The volume difference matters less than the intent difference:
- AI Mode participants who left went to retailer sites and manufacturer pages to verify a price or spec for a candidate they’d already selected.
- Standard Search participants left to discover candidates: Reddit for peer opinions, editorial review sites for expert takes, insurance aggregators for comparison.

In the first AI Overviews study, we found that high risk leads users to verify AI claims more and reference against answers from other users on UGC platforms (like Reddit).
In this study, Reddit appeared in 19% of standard search tasks and only twice across all 149 AI Mode sessions. The peer-opinion layer that shapes a large share of traditional Search barely exists in AI Mode behavior.
There’s irony in that pattern. Google leans heavily on Reddit content to train its models, yet the source users rely on most in standard search is the one they almost never visit once the AI synthesizes those same sources for them.
The first study found the same pattern at a different scale. Across 250 sessions, clicks were “reserved for transactions”: Shopping prompts drove the highest exit share, while comparison prompts drove the lowest. The exit destinations were retailers and brand sites, not editorial or peer-opinion sources. Six months and a different task set later, the pattern holds: When users leave AI Mode, they leave to buy.
6. 3 Levers: Visibility, Framing, And Pricing Data
The three things that excite me most about the study:
First, the mental model of rankings (higher = better) applies to AI Mode as well. Most users choose the first product. We can now apply this to prompt tracking by focusing on prompts that lead to shortlists and using our position as a goalpost.
Second, trust trumps rank. We have known this since the first user behavior studies I published, but this study reinforces the importance of building trust with users before they search. It’s the ultimate cheat code.
Third, we now know buyers trust AI’s recommendations. Obviously, there’s a high risk here if the AI is wrong, but seeing how quickly buyers take the AI’s recommendation also shows us how fast consumers adopt AI. It truly is the future of Search.
Keep in mind:
1. Visibility at the model layer is the new threshold. If AI Mode doesn’t surface your brand, you have a visibility problem at the model layer. Query your own category the way a buyer would (i.e., “best car insurance for a family with a teen driver,” “best washer dryer set under $2,000”) and document which brands appear, in what order, and with what framing. Do this across multiple prompt variations. Do it regularly, because AI responses shift over time.
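The monitoring loop above can be sketched in a few lines. This is a hypothetical helper, not part of the study’s tooling: it assumes you have already captured the AI Mode response text by some means, and the prompt, brand list, and response shown here are invented for illustration.

```python
import re
from datetime import date

def brand_positions(response_text, brands):
    """Return (brand, rank) pairs for tracked brands that appear in an AI
    response, ordered by first mention -- rank 1 is the first brand named."""
    hits = []
    for brand in brands:
        match = re.search(re.escape(brand), response_text, re.IGNORECASE)
        if match:
            hits.append((brand, match.start()))
    hits.sort(key=lambda pair: pair[1])  # earlier mention = better rank
    return [(brand, rank) for rank, (brand, _) in enumerate(hits, start=1)]

# Illustrative response text and tracked-brand list (not real output).
response = ("For a family with a teen driver, Travelers and USAA quote exact "
            "dollar discounts, while State Farm advertises a percentage.")
tracked = ["GEICO", "State Farm", "Travelers", "USAA"]

# One dated snapshot per prompt variation; re-run regularly to catch drift.
snapshot = {
    "date": date.today().isoformat(),
    "prompt": "best car insurance for a family with a teen driver",
    "ranking": brand_positions(response, tracked),
}
```

Storing one snapshot per prompt per run gives you a time series of who appears and in what order, which maps directly onto the position-one advantage the study measured.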
2. How the AI describes you matters as much as whether it appears. Brands cited with concrete attributes (specific model, specific price, named use case) held stronger positions than brands described generically. The content on your site that the AI draws from not only affects whether you show up, but also how confidently and specifically you show up. A brand with structured pricing data, clear product specs, and explicit use cases gives the AI better material to work with.
3. For categories with context-dependent pricing, AI Mode creates a false-confidence problem. 63% of insurance participants were rated overconfident about pricing. They accepted AI-quoted rate estimates without checking whether the figures applied to their actual state, driving record, or current insurer. They made elimination decisions based on numbers that may not have applied to them. Where shopping panels showed explicit retailer-confirmed prices (washer/dryer), 85% of participants understood pricing clearly. Where they didn’t (insurance, laptops), confusion and overconfidence filled the gap. Structured pricing data through Merchant Center feeds and schema markup is the most direct lever for brands selling physical products. For services, the lever is editorial: Make sure your landing pages and FAQ content frame pricing as conditional (“your rate depends on X, Y, Z”) so the AI has that framing to draw from.
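For physical products, the structured-pricing lever above usually means schema.org Product/Offer markup on the product page. A minimal sketch, with hypothetical catalog values (the product name, SKU, and price are invented for illustration), generated here in Python for clarity:

```python
import json

# Hypothetical product values -- swap in your real catalog data.
product_schema = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "ExampleBrand Stackable Washer/Dryer Set",  # illustrative name
    "sku": "EB-WD-2700",                                # illustrative SKU
    "offers": {
        "@type": "Offer",
        "price": "1899.00",              # explicit dollar amount, not a range
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Embed this in the page so the concrete price is machine-readable:
json_ld_tag = ('<script type="application/ld+json">'
               + json.dumps(product_schema)
               + "</script>")
```

The point of the explicit `price` and `priceCurrency` fields is exactly what the insurance quote above showed: a concrete dollar amount gives the AI something specific to repeat, where a vague range invites overconfident paraphrase.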
Study Design
Citation Labs and Clickstream Solutions ran this as a remote, unmoderated usability study with 48 U.S.-based participants recruited through Prolific. Each participant completed up to four major-purchase shortlisting tasks across televisions, laptops, washer/dryer sets, and car insurance.
The comparison between AI Mode and traditional standard search used a within-subjects A/B design: Participants used both surfaces, not one or the other. Significance calculations were normalized for the exact number of participants in each group (149 AI Mode task observations, 36 standard search task observations). This matters because the groups are unequal in size, and raw percentage comparisons between them would overstate confidence without that correction.
Sessions were screen-recorded with think-aloud audio. Trained analysts annotated each recording for behavioral markers (click-through, shortlist origin, trust signals, external site visits) and qualitative markers (stated reasoning, brand mentions, frustration signals). The 185 task-level observations provide a larger analytical base than the 48-participant headcount suggests, but confidence intervals remain wider than a large-scale survey. Findings are directional, not population-level estimates.
Notes on terminology used throughout this report:
- Shortlist: The final set of brands a user would consider buying from.
- AI Adopted: The participant took the AI’s recommended candidates as their shortlist with no changes or external verification.
- User Built: The participant ignored the AI’s (or Search’s) suggestions and assembled their own candidate list from independent sources. In Search, when there was no AIO present, they had no option for relying on AI suggestions.
- AI Verified: The participant started with the AI’s candidates but checked them against an outside source (a retailer site, a review, a manufacturer page, further prompting, or interaction with a panel outside the main AI text block) before finalizing.
- Hybrid: The participant combined AI-suggested candidates with at least one candidate they found independently.
- AI framing: The specific words and structure the AI used to describe a product, such as labels like “best for affordability” or explicit price comparisons.
- Brand recognition: The user chose or eliminated a brand based on prior familiarity, not the AI’s description or any external research.
- AI trust (general): The user accepted the AI’s output as credible without citing a specific reason, such as a particular label or description.
- Source trust: The user trusted a recommendation because of where it came from, such as a retailer, manufacturer, or named publication surfaced in results.
- Multi-source convergence: The user built confidence by checking whether multiple independent sources agreed on the same recommendation.
- Rank override rate: The share of users who chose a brand other than the AI’s top-ranked option, regardless of whether they stayed within the AI’s candidate list.

Featured Image: Tapati Rinchumrus/Shutterstock; Paulo Bobita/Search Engine Journal