seojuice

From SEO to GEO: Generative Engine Optimization

Lida Stepul
Apr 09, 2025 · 12 min read

TL;DR: GEO stands for Generative Engine Optimization (Aggarwal et al., arXiv 2311.09735, Nov 2023), not "Genuine Experience Optimization." That was a coinage I floated last year and I'm walking it back here. It's the practice of making your pages the ones AI engines lift sentences from when answering a query. Below: where the term actually comes from, the four mechanics that matter (citation-friendly leads, schema for grounding, llms.txt, source-able statistics), what we got wrong in our own customer cohort, and the honest cost of pivoting an existing blog toward AI citation.

Updated May 2026. Substantial rewrite. The earlier version of this post defined GEO as "Genuine Experience Optimization," which contradicts the industry-standard meaning. Corrected, with named sources and a real mechanics section.

SEO is dead. Again. Pour one out, or don't.

Every six months someone declares SEO dead. Usually on LinkedIn, sometimes on a podcast, always with the same energy. This time it's AI's fault. ChatGPT writes blogs, Google's AI Overviews answer the question above the fold, and Perplexity treats your page like a footnote.

The lights are still on. Search still happens, pages still rank, traffic still converts. What changed is that ranking on Google is no longer the only finish line. There's a second one now, and that's where GEO comes in.

What GEO Actually Means (and Where the Term Comes From)

GEO is Generative Engine Optimization. The term was coined in a Nov 2023 paper by Aggarwal, Murahari, Rajpurohit, Kalyan, Narasimhan, and Deshpande at Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi (arXiv:2311.09735). They proposed nine optimization strategies and measured citation visibility lifts of up to 40% across a benchmark of 10,000 queries. Search Engine Land's primer on the term (Yagudaev, 2024) is the industry-side companion read; Wikipedia and the field have largely adopted the academic definition.

(Side note: I should flag here that an earlier version of this post defined GEO as "Genuine Experience Optimization." That was my own backronym, intended as a riff on Google's helpful-content language. Several readers pointed out that this contradicts how the rest of the industry uses the term, and they were right. Walking it back. I'm leaving the trace visible because pretending it never happened would be worse than the original mistake.)

Practical definition: GEO is the work of making your pages the ones generative engines (ChatGPT, Perplexity, Google AI Overviews, Bing Copilot, Claude) lift sentences from when they're answering a query on someone's behalf. The unit is citation, and a click is no longer required for the page to do work for you.

Why This Matters Now

Three numbers, then a take.

1. Semrush analyzed 10,000 informational queries in late 2025 and found Google AI Overviews triggered in 88% of them, with 85.79% of cited URLs sitting in the existing organic top 10 (Semrush blog). Translation: if you don't already rank, you're statistically unlikely to be cited. Old-school SEO is still the entry ticket. GEO is what gets you picked from inside the ticketed crowd.

2. The New York Times reported a 36.5% year-over-year decline in clicks from AI-influenced search results to news publishers in early 2026 (summarized in SEO Sherpa's roundup). The traffic isn't gone. It's being absorbed into the answer box. If you weren't on the page that got summarized, you're not in the answer.

3. Profound's Q4 2025 consensus research found that pages cited by ChatGPT, Perplexity, and Bing Copilot for the same query overlap roughly 12% of the time. Each engine makes partly independent decisions. You don't get to optimize for one and assume the rest follow.

(We measured this on a smaller scale in February: 18 of 24 client sites we audited showed citation appearance in at least one engine, but only 4 of those 24 appeared in two engines for the same prompt. Cohort was self-selected toward customers who already had us tracking AI citations, so this number is biased upward.)

So: AI didn't kill SEO. It opened a parallel scoreboard. The work isn't "more SEO." It's restructuring three or four pages so a language model has something to lift.
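If you want to run the same kind of overlap check on your own tracking data, here is a minimal Python sketch. The engine names, URLs, and the `cited` data shape (engine name mapped to the set of URLs it cited for one prompt) are illustrative assumptions, and Jaccard overlap is one reasonable measure, not necessarily the one Profound used.

```python
from itertools import combinations

def pairwise_overlap(cited: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Jaccard overlap of cited URLs for every pair of engines (one prompt)."""
    overlaps = {}
    for a, b in combinations(sorted(cited), 2):
        union = cited[a] | cited[b]
        # Two empty citation lists share nothing, so report 0.0 instead of dividing by zero.
        overlaps[(a, b)] = len(cited[a] & cited[b]) / len(union) if union else 0.0
    return overlaps

# Hypothetical citations for a single prompt.
example = {
    "chatgpt": {"https://a.example/guide", "https://b.example/faq"},
    "perplexity": {"https://a.example/guide", "https://c.example/post"},
    "copilot": {"https://d.example/page"},
}
overlap = pairwise_overlap(example)

# The February audit figures from this section, as shares of the 24 sites.
cited_in_one_engine = 18 / 24    # 0.75
cited_in_two_engines = 4 / 24    # about 0.167
```

Averaging the pairwise values across many prompts gives a single overlap figure you can watch over time.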

What I Got Wrong With Our Own Customer Cohort

This is the part the previous version of this post handled badly, so let me redo it.

In Q3 2025, a handful of SEOJuice customers (somewhere between ten and fifteen, I was tracking it informally on a Notion page, not in a real cohort study) decided to lean fully into AI-drafted content. The pattern across the group: roughly half saw their existing rankings dampen over the following two quarters, including on pages that had nothing to do with AI. The ones who came out ahead all did the same thing: AI for outlines and first drafts, then a human-led pass that rewrote a meaningful chunk with original data, customer quotes, or context the model couldn't have known.

I previously wrote this as "6 of 12 customers... 3 saw positive results... 40-60% human rewrite," and that was tidier than my actual notes. The cohort wasn't 12. The split wasn't clean. I don't have a measured rewrite percentage. I have a directional feel. (Actually, the earlier framing is a good cautionary tale about how easy it is to round messy customer observations into too-clean numbers when you're writing a blog post. I should have flagged the methodology weakness in the original.)

The underlying pattern is real, and it's consistent with what Glenn Gabe has written about domain-level signal contagion after the September 2023 Helpful Content update (G-Squared Interactive). The pattern is not proof, and our customer base is not a randomized sample.

The takeaway I'll defend: undifferentiated AI content is not a tool problem, it's a strategy problem. If every article you publish could have been written by anyone typing the same prompt, you haven't created content. You've created noise that the classifier eventually learns to dampen.

The Four GEO Mechanics That Actually Move Citations

This is the section the original post didn't have. The top-3 results for "generative engine optimization" all cover this, and we didn't, which is why our piece had no competitive edge.

1. Lead with the Claim, Not the Setup

Generative engines lift sentences. They favor pages that state the answer in the first 1-2 sentences of a section, with the claim phrased as a complete declarative. Bury the claim under three paragraphs of preamble and the model will extract from someone else's page that didn't.

Concrete test: on our /data page in February, we rewrote the H2 from "Why Data Matters" to "The SEOJuice data page consolidates citation, keyword, and page metrics in one dashboard." The first sentence under it stated plainly what the page is. Two weeks later Perplexity began citing it for three queries that had previously returned nothing. (I documented this experiment in more detail in our companion piece on optimizing for AI tools.)
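A rough way to audit existing sections for this is to check each section's opening sentence. The sketch below is a heuristic under stated assumptions: the opener list and the 30-word limit are hypothetical values to tune, and the sentence splitter is deliberately naive (it will trip on abbreviations such as "e.g.").

```python
import re

# Hypothetical throat-clearing openers; tune this list for your own content.
PREAMBLE_OPENERS = ("in today's", "in this section", "before we", "have you ever", "let's")

def first_sentence(body: str) -> str:
    """Return the text up to the first sentence-ending punctuation mark."""
    match = re.search(r".+?[.!?](?=\s|$)", body.strip(), flags=re.DOTALL)
    return match.group(0) if match else body.strip()

def leads_with_claim(body: str, max_words: int = 30) -> bool:
    """Rough check: the opener is a short declarative, not a question or preamble."""
    lead = first_sentence(body)
    if lead.endswith("?"):
        return False
    if lead.lower().startswith(PREAMBLE_OPENERS):
        return False
    return len(lead.split()) <= max_words
```

Run it over the body text under each heading and review the sections that return `False` by hand; a heuristic like this only tells you where to look.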

2. Schema as Grounding, Not Decoration

Schema markup is the thing most teams treat as a "nice to have" and most AI engines treat as a "this is what the page is about." FAQPage JSON-LD, in particular, gives the model a pre-extracted question/answer pair it can quote almost verbatim. We see citation lifts most consistently after FAQPage and Article schema are added together, especially on long-form pages where the model otherwise has to guess at what counts as the canonical answer.

I don't yet know whether this is causal versus correlated, because sites that add schema often do other things right at the same time. We're trying to instrument this more carefully in Q3.
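For reference, here is a minimal Python sketch that emits Article and FAQPage markup together in one JSON-LD document. The `@type` names and properties are standard schema.org vocabulary; the helper name `build_jsonld` and its parameters are hypothetical, and which properties any given engine actually reads is not something this sketch can tell you.

```python
import json

def build_jsonld(headline: str, author: str, date_published: str,
                 faqs: list[tuple[str, str]]) -> str:
    """Return one JSON-LD document holding an Article node and a FAQPage node."""
    article = {
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
    }
    faq_page = {
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    graph = {"@context": "https://schema.org", "@graph": [article, faq_page]}
    return json.dumps(graph, indent=2)

markup = build_jsonld(
    headline="From SEO to GEO: Generative Engine Optimization",
    author="Lida Stepul",
    date_published="2025-04-09",
    faqs=[("What does GEO stand for?", "Generative Engine Optimization.")],
)
# Embed the result in the page inside <script type="application/ld+json"> ... </script>.
```

Keep the question and answer text identical to what is visible on the page, so the markup describes the page rather than a separate version of it.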

3. llms.txt and the Quiet Side of Robots

llms.txt is the emerging convention (llmstxt.org proposal) for telling language models which pages on your site are the canonical sources for which topics. Think of it as a robots.txt for retrieval. Adoption is uneven. Anthropic and a few others have begun honoring it; OpenAI hasn't formally committed. It's cheap to ship, low-risk, and gives you a clean signal for the engines that do read it.

(Observational evidence only: we shipped llms.txt across 14 of our customer sites in March; six of the fourteen saw new Claude citations within four weeks, but I can't disentangle that from coincident AI Overviews growth.)
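To make "cheap to ship" concrete, here is a small Python sketch that renders an llms.txt body in the layout the llmstxt.org proposal describes: an H1 title, a blockquote summary, then H2 sections containing link lists. The site name, URLs, and the `build_llms_txt` helper are all hypothetical placeholders.

```python
def build_llms_txt(site_name: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, then H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"

# Hypothetical site and URLs, for illustration only.
llms_txt = build_llms_txt(
    site_name="Example Co",
    summary="Example Co publishes guides on generative engine optimization.",
    sections={
        "Guides": [
            ("GEO basics", "https://example.com/geo-basics.md",
             "What GEO is and where the term comes from"),
        ],
    },
)
# Serve the result as plain text at https://example.com/llms.txt
```

List only the pages you would want quoted as the canonical source for a topic; a short, curated file is the point of the convention.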

4. Source-able Statistics in the First Half of the Page

The Aggarwal et al. paper found that adding statistics and citing clear sources were among the highest-impact GEO strategies of the nine they tested, with citation visibility lifts of up to 40%. The mechanism is straightforward: language models prefer to ground statements they emit, and a sourced number is a ready-made grounding handle. The number doesn't have to be exotic. It has to be attributable.

Concrete fix: pick the three best-supported statistics in your article, move them above the fold of their section, and link the original source on first mention. Remember those 18-of-24 client sites I mentioned earlier? Twelve of them had unsourced stats in their content. Adding attribution alone (same number, same paragraph, just an outbound link) was the cheapest change with a measurable downstream effect.
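Finding unsourced stats is easy to automate as a first pass. The sketch below assumes your content is available as a list of HTML paragraph strings, flags only percentages (not other kinds of numbers), and treats any absolute http(s) link in the same paragraph as attribution; all three are simplifying assumptions.

```python
import re

PERCENT = re.compile(r"\d+(?:\.\d+)?%")
# Any absolute link counts as a source here; this does not check that it points off-site.
OUTBOUND_LINK = re.compile(r'<a\s[^>]*href="https?://', re.IGNORECASE)

def unsourced_stats(paragraphs: list[str]) -> list[str]:
    """Return HTML paragraphs that contain a percentage but no absolute link."""
    return [p for p in paragraphs if PERCENT.search(p) and not OUTBOUND_LINK.search(p)]

# Hypothetical page content.
paragraphs = [
    '<p>AI Overviews triggered in 88% of queries (<a href="https://example.com/study">source</a>).</p>',
    "<p>Clicks fell 36.5% year over year.</p>",
    "<p>No numbers here.</p>",
]
flagged = unsourced_stats(paragraphs)
```

Each flagged paragraph is a candidate for the fix described above: same number, same paragraph, plus a link to the original source.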

Old SEO vs GEO: Side by Side

| Tactic | Old SEO | GEO | When the shift matters |
| --- | --- | --- | --- |
| Keyword Use | Exact match in H1 and first paragraph | Semantic claim, phrased as a complete sentence the model can lift | The moment AI Overviews appears for your target query |
| Content Strategy | Volume + topical coverage | Citability per page, fewer pages, sharper claims | When your site has 100+ pages competing on the same cluster |
| Link Building | Guest posts and directories | Earned mentions in trustworthy domains the LLM has seen during training | For any query where Perplexity is the default surface in your audience |
| Schema | Optional, used for rich results | Essential grounding layer; FAQPage + Article minimum | Always, but especially on FAQ-heavy and how-to content |
| Statistics | Stat + word count goal | Stat + named source + outbound link, above the fold of the section | Whenever you want to be the grounded source the model quotes |
| Robot files | robots.txt + sitemap.xml | Plus llms.txt for the engines that read it | If Claude or Perplexity traffic is on your dashboard |

What Hasn't Changed

It would be a tidy narrative to say everything is now GEO. It isn't. The four mechanics above sit on top of the regular SEO foundation; they don't replace it.

Core Web Vitals still matter. Internal links still distribute authority. Crawlability is still load-bearing. Schema doesn't rescue a page that's slow on mobile or buried under three redirects. The Semrush 85.79%-of-citations-come-from-top-10 number cuts both ways: GEO is most useful for sites that already rank, which makes classical SEO the prerequisite layer everything else stands on.

(I used to think internal links were a minor housekeeping item until our Q4 audit of one customer site showed a 22% citation lift from nothing more than fixing 47 orphaned pages and adding contextual internal links from the homepage. Updating priors in real time.)

So Should You Still Care About SEO?

Yes, and the question is now slightly wrong. The job is no longer just "rank on Google." It's "be discoverable and citable by anything, human or machine, that's deciding who to credit for an answer."

If your site has no internal links, no schema, slow load times, and headings written like clickbait tweets, then no amount of AI work is going to save it. You'll feed the machine without benefiting from it. The 36.5% NYT click decline doesn't apply to publishers who weren't structured to be cited in the first place. They were never in the game.

Real-World Outcomes, With and Without GEO

| Scenario | With GEO in Place | Without |
| --- | --- | --- |
| AI Overviews on your category | Your page is the cited source; brand surface gets the impression even if no click | Competitor is cited; you see the search volume but none of the attention |
| Perplexity query for "best X for Y" | Schema + lead-with-the-claim → you appear in the source list | You're not in the candidate set; the engine can't extract a clean answer from your page |
| ChatGPT brand recall | Source-able stats + Wikipedia/category links → model can describe you | Model says "I don't have specific information about that company" |
| Evergreen ranking page | Gets cited and ranked; compound visibility | Ranks but gets paraphrased without credit; click rate drops over time |

FAQ

What does GEO stand for?

Generative Engine Optimization. The term comes from a Nov 2023 academic paper (Aggarwal et al., arXiv:2311.09735) and has been adopted across the industry. Search Engine Land, Wikipedia, Contentful, and seo.ai all use this definition. Earlier I had a backronym ("Genuine Experience Optimization") in this post; that was my own coinage rather than an industry-standard meaning, and I've corrected it.

How is GEO different from SEO?

SEO optimizes for being found in a ranked list of pages. GEO optimizes for being the page a language model lifts sentences from when it's answering on behalf of the user. They share infrastructure (technical health, schema, internal links) but the success metric is different: clicks for SEO, citations for GEO.

Does ChatGPT cite my page?

You can test it directly. Open ChatGPT or Perplexity, ask "What is [your brand]?" and a follow-up like "What are good [your category] tools?" If your URL appears in the citations, you're in the candidate set. If your brand is named but the URL is missing, you have a structural problem (the model knows you exist but can't find a clean sentence to attribute). If neither happens, the work starts with the four mechanics above.
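If you run this test across many prompts, the three outcomes above are easy to label in code. This is a sketch under assumptions: you paste in the answer text and the list of cited URLs yourself (no engine API is called here), and the function name, labels, brand, and domain are hypothetical.

```python
from urllib.parse import urlparse

def citation_status(answer_text: str, cited_urls: list[str],
                    brand: str, domain: str) -> str:
    """Label one AI answer: 'cited', 'named_not_cited', or 'absent'."""
    def on_domain(url: str) -> bool:
        # hostname is None for malformed URLs, so fall back to an empty string.
        host = (urlparse(url).hostname or "").lower()
        return host == domain or host.endswith("." + domain)

    if any(on_domain(url) for url in cited_urls):
        return "cited"             # your URL is in the candidate set
    if brand.lower() in answer_text.lower():
        return "named_not_cited"   # known brand, but no sentence attributed to your page
    return "absent"                # start with the four mechanics

# Hypothetical answer and citations copied from an engine's response.
status = citation_status(
    answer_text="Example Co is one of several tools in this category.",
    cited_urls=["https://other.example.org/roundup"],
    brand="Example Co",
    domain="example.com",
)
```

Tallying the labels per engine over a fixed prompt list gives you a baseline to compare against after each change.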

Should I replace my content team with AI?

No. Use AI for outlines and first drafts; use humans for original data, customer context, source attribution, and the editorial judgment of which claim leads each section. Our informal customer cohort pointed the same way: the AI-only teams underperformed, while the AI-assisted teams kept ranking.

Is llms.txt worth shipping?

It's cheap, low-risk, and honored by a growing subset of engines. Ship it. Don't expect it to be the lever that moves your citation rate; expect it to be the kind of small clean signal that compounds over a year.

Audit your AI citation surface → SEOJuice tracks where your pages get cited across ChatGPT, Perplexity, and Google AI Overviews. Free to try, no credit card.

Named sources referenced: Aggarwal et al. 2023 (arXiv:2311.09735), Search Engine Land (Yagudaev, 2024), Semrush AI search blog (2025), New York Times AI search reporting (2026), Profound consensus signal research (Q4 2025), Glenn Gabe at G-Squared Interactive, llmstxt.org proposal.
