We Asked 5 AI Assistants 500 Business Questions. Here's What We Found.
AI assistants are increasingly being used to find local businesses. But which AI platforms actually recommend your clients? And what biases do they have? We tested ChatGPT-4o, Claude 3.5, Gemini 1.5 Pro, Perplexity, and Microsoft Copilot with 500 real business-finding queries to find out.
How We Ran This Test
Between January and February 2025, SurfAI engineered 500 distinct business-finding queries across 10 industries: law, finance, healthcare, real estate, home services, restaurants, e-commerce, consulting, dental, and insurance. We targeted 5 major metropolitan areas: New York, Chicago, Austin, Denver, and Sacramento.
To test whether query phrasing influences recommendations, each business-finding need was phrased in three distinct ways:
- Location-first: "Best accountants in Denver"
- Problem-first: "Who can help me refinance my commercial property in Austin?"
- Category-first: "Top personal injury lawyers in Chicago"
We ran each query against five AI platforms in succession, capturing every business mentioned in the initial response. We then verified whether that business existed, was still operating in the stated location, and offered the claimed service. All testing was completed within a 30-day window to minimize data drift.
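To make the scale of the test concrete, here is a minimal sketch of how phrasing variants could be templated and how many responses the run produces. The templates, the `build_queries` helper, and the sample inputs are hypothetical illustrations modeled on the three example phrasings above, not the actual query set, and this article does not specify how the 500 queries were distributed across industries, cities, and phrasings.

```python
from itertools import product

# Platforms and cities as listed in this article.
PLATFORMS = ["ChatGPT-4o", "Claude 3.5", "Gemini 1.5 Pro", "Perplexity", "Microsoft Copilot"]
CITIES = ["New York", "Chicago", "Austin", "Denver", "Sacramento"]

# Hypothetical templates modeled on the three example phrasings (assumption).
TEMPLATES = {
    "location_first": "Best {provider} in {city}",
    "problem_first": "Who can help me {problem} in {city}?",
    "category_first": "Top {specialty} in {city}",
}


def build_queries(provider, problem, specialty, cities=CITIES):
    """Return one query per (phrasing style, city) pair for a single business need."""
    queries = []
    for (style, template), city in product(TEMPLATES.items(), cities):
        # str.format ignores keyword arguments a template does not use.
        text = template.format(
            provider=provider, problem=problem, specialty=specialty, city=city
        )
        queries.append({"style": style, "city": city, "text": text})
    return queries


def total_responses(num_queries, platforms=PLATFORMS):
    """Each query is sent to every platform once (initial response only)."""
    return num_queries * len(platforms)


if __name__ == "__main__":
    sample = build_queries("accountants", "refinance my commercial property", "tax advisors")
    print(len(sample))           # 3 phrasings x 5 cities
    print(total_responses(500))  # 500 queries x 5 platforms
```

Under these assumptions, 500 queries across five platforms yield 2,500 initial responses to capture and verify.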
Seven Key Findings
1. Enterprise Bias Is Real, But Overstated
Large national brands and well-known regional chains appeared disproportionately often in recommendations. However, this bias is not universal across industries. In 6 of 10 industries tested, local and regional businesses captured 40% or more of recommendations when they possessed strong entity signals—including verified citations on industry platforms, consistent NAP (name, address, phone) data, and press mentions from credible sources.
This suggests that enterprise preference is more about data quality and search visibility than deliberate platform favoritism. Smaller businesses with strong online foundations can compete.
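Consistent NAP data is one of the entity signals named above, and it is easy to check programmatically. The following is a rough sketch, assuming each listing is a dictionary with hypothetical `name`, `address`, and `phone` keys and US-style phone numbers. It only normalizes case, punctuation, and whitespace, so it will not reconcile variants such as "St" and "Street".

```python
import re


def normalize_phone(phone):
    """Keep digits only and drop a leading US country code (assumes US numbers)."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def normalize_text(value):
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", value.lower())
    return " ".join(cleaned.split())


def nap_key(listing):
    """Build a comparable (name, address, phone) tuple for one listing."""
    return (
        normalize_text(listing["name"]),
        normalize_text(listing["address"]),
        normalize_phone(listing["phone"]),
    )


def nap_is_consistent(listings):
    """True when every listing normalizes to the same NAP tuple."""
    return len({nap_key(listing) for listing in listings}) == 1
```

A business whose website, directory listings, and map profile all produce the same key passes the check; any mismatch is a candidate for cleanup.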
2. Perplexity Recommends Small Businesses Far More Often
Across all queries, Perplexity recommended businesses with fewer than 50 employees in 34% of responses. ChatGPT-4o did so in only 19% of responses. Claude 3.5, Gemini 1.5 Pro, and Copilot fell between 22% and 28%.
Why? Perplexity pulls more aggressively from live web data and real-time information, including local business directories, recent reviews, and industry-specific databases. It relies less on training data cutoffs and more on current web indexing, which naturally surfaces smaller, newer, and locally-focused businesses.
3. Query Phrasing Matters Enormously
The same business appeared in 0% of location-first queries ("best accountants in Denver") but 41% of problem-first queries ("who can help me with complex business tax planning in Denver?") when that business specialized in exactly that service.
Problem-framed queries appear to trigger fundamentally different recommendation behavior. Instead of matching location and category keywords, the AI has to interpret specialization and match intent. This creates an enormous opportunity: businesses with clear problem-solving identities in their online presence (website copy, review content, industry citations) see dramatically higher appearance rates when queries are solution-focused rather than geography-focused.
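For readers who want to track this for their own business, an appearance rate per phrasing style can be computed as below. This is an illustrative sketch, assuming each captured response is a dictionary with a hypothetical `style` label and a `businesses` list of names mentioned in the initial answer.

```python
def appearance_rate_by_style(responses, business):
    """Share of responses, per phrasing style, that mention the given business."""
    totals = {}
    hits = {}
    for response in responses:
        style = response["style"]
        totals[style] = totals.get(style, 0) + 1
        if business in response["businesses"]:
            hits[style] = hits.get(style, 0) + 1
    # Styles with no mentions still appear in the result with a rate of 0.0.
    return {style: hits.get(style, 0) / totals[style] for style in totals}
```

Comparing the rates for location-first and problem-first styles side by side shows whether a business's specialization is actually surfacing.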
4. Healthcare & Legal Show the Most Opportunity (and the Worst Data)
These two industries showed the highest "recommendation concentration"—meaning the top 5 businesses captured 62% and 68% of all mentions, respectively. But they also had the highest error rates: 28% of healthcare recommendations and 31% of legal recommendations contained inaccuracies (wrong address, closed practice, services no longer offered, or misattributed specialties).
This creates a paradox: AI models haven't been fed clean, current data in these highly regulated niches. Businesses that invest in accurate, specialized entity data have an outsized advantage. A well-maintained profile with current credentials, correct service offerings, and verified specializations can dominate in these verticals precisely because the baseline is so poor.
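Recommendation concentration can be expressed as the share of all mentions captured by the most-mentioned businesses. The sketch below assumes that definition (the top-5 share of total mentions within an industry) and takes a flat list of business names, one entry per mention. The function name and input shape are illustrative.

```python
from collections import Counter


def top_n_share(mentions, n=5):
    """Fraction of all mentions that go to the n most-mentioned businesses."""
    counts = Counter(mentions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total
```

A value of 0.62 would correspond to the 62% concentration reported for healthcare, meaning the remaining businesses split only 38% of mentions.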
5. Reviews Don't Predict AI Mentions: Citations Do
We analyzed 120 businesses that appeared in recommendations. Those with 200+ Google reviews but no press mentions or industry citations appeared in recommendations just 18% of the time across all five platforms. Businesses with 50+ reviews and 3 or more features from industry publications or credible directories appeared 64% of the time.
The pattern is clear: AI platforms weight external citations and editorial mentions far more heavily than user review volume. Citations act as an "authority signal" that reviews do not. A business featured in a legal directory, industry award list, or trade publication gets more AI visibility than one with thousands of happy reviews but no external validation.
6. Gemini Is Most Geographically Accurate
When we verified the geographic accuracy of local recommendations, Gemini 1.5 Pro correctly identified a business as being in the target city 78% of the time. ChatGPT-4o achieved 61% accuracy. Claude 3.5 and Copilot were between 64% and 67%. Perplexity, despite recommending smaller businesses, had slightly lower accuracy at 59%.
Google's structural advantage is evident here: Gemini has direct access to Google's location data infrastructure, Maps data, and local search signals. For queries where geography is critical (e.g., "dentists near me"), Gemini's inherent accuracy advantage is significant. This is worth monitoring as location-based AI search grows.
7. AI Recommendations Were Wrong 22% of the Time
Across all 500 queries and all platforms combined, 22% of the businesses recommended no longer existed at the given address, had closed, did not offer the services they were credited with, or were fundamentally misidentified (e.g., a spa being recommended for hair services).
This creates a hidden competitive opportunity: accurate, up-to-date entity data is itself a differentiator. Businesses with verified, current information across multiple platforms—including correct addresses, active phone numbers, current service listings, and proper categorizations—stand out. This level of accuracy is rare enough that it becomes a competitive advantage.
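An error rate like this can be derived from the three verification checks described in the test setup: whether the business exists, whether it operates at the stated location, and whether it offers the claimed service. The following is a minimal sketch under the assumption that a recommendation counts as wrong when any one check fails. The `Verification` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Verification:
    platform: str
    exists: bool
    at_stated_location: bool
    offers_claimed_service: bool

    @property
    def is_accurate(self):
        # Assumption: a recommendation is accurate only if all three checks pass.
        return self.exists and self.at_stated_location and self.offers_claimed_service


def error_rate(records):
    """Fraction of verified recommendations that failed at least one check."""
    if not records:
        return 0.0
    errors = sum(1 for record in records if not record.is_accurate)
    return errors / len(records)


def error_rate_by_platform(records):
    """Group records by platform and compute each platform's error rate."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.platform, []).append(record)
    return {platform: error_rate(group) for platform, group in grouped.items()}
```

The same records can be regrouped by industry or city to find where data quality is weakest.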
What This Means for Your Business
These findings suggest that AI visibility is not random or entirely dominated by big brands. Instead, it follows predictable patterns that businesses can optimize for.
1. Optimize for Problem-First Phrasing
Clients increasingly ask AI assistants questions like "Who can help me restructure my debt?" rather than "Best financial advisors in Austin." Your website, service descriptions, and online content should clearly articulate the specific problems you solve, not just the categories you serve. This shift matters enormously for AI visibility.
2. Build Citations Over Reviews
While reviews remain important for local search and user trust, AI platforms weight external citations and authoritative mentions far more heavily. Industry directories, award listings, press features, and trade publication mentions signal credibility to AI systems in ways reviews do not. A small company with 5 industry mentions and 50 reviews will likely appear in more AI recommendations than a company with 500 reviews and no external citations.
3. Close the Data Accuracy Gap
With 22% of AI recommendations containing errors, simply maintaining accurate, current data across all platforms—your website, Google Business Profile, industry directories, and specialty listings—is a measurable advantage. This is particularly true in healthcare and legal, where data quality is currently poorest.
4. Embrace Specialization in Your Online Presence
Businesses with clear, specialized identities perform better in AI recommendations. Rather than positioning yourself as a generalist ("We offer legal services"), own a specific niche ("We specialize in commercial real estate financing for startups"). This signals clearly to AI systems and increases the likelihood you're recommended for specific problem scenarios.
5. Watch the Geographic Accuracy Game
As AI search becomes more location-aware, Gemini's accuracy advantage suggests that location data quality will matter more. Ensuring your business is correctly categorized, mapped, and verified across all platforms—especially Google—is increasingly important.
Methodology Note
This research was conducted over 30 days (January 15 – February 14, 2025) using live, real-time queries against ChatGPT-4o, Claude 3.5 Sonnet, Google Gemini 1.5 Pro, Perplexity Pro, and Microsoft Copilot. Each of the 500 queries was executed in a consistent order and timeframe to minimize data drift. Recommendations were verified against current business registrations, Google Business Profiles, and industry databases. We tested only the initial responses from each platform (no follow-up queries or refinements). Results may differ based on account type, geographic location of the requester, and platform updates after this testing period.
Want to know how your business ranks in AI recommendations? SurfAI's AI Visibility Audit shows you exactly where you appear (and where you don't) across ChatGPT, Claude, Gemini, and Perplexity for queries your customers actually ask.