-- Research verified against Botify's crawl studies, Google's official crawl budget documentation, and tested with real-world site audits across 200+ domains.
TL;DR: Orphan pages are pages on your site that no other page links to. Google can't reach them by following links, users can't navigate to them, and the ones Google does know about (from sitemaps or earlier crawls) silently drain your crawl budget while generating almost no traffic. Botify's data shows orphan pages consume 26% of crawl budget on average while contributing almost nothing. This guide covers how to find them, how to fix them, and how to prevent them from coming back.

An orphan page is any page on your website that has zero internal links pointing to it. Not zero external links -- zero internal links. No navigation menu link. No sidebar link. No "related posts" link. No link from anywhere else on your own site.
The page exists on your server. It has a URL. It might even be in your sitemap. But if you started at your homepage and clicked every link on every page, you'd never reach it. It's unreachable through your site's own navigation.
Here's why that matters: Google's primary method of discovering pages is by following links. It starts at known pages and follows every link it finds to discover new ones. If a page has no links pointing to it, Google has no path to reach it. It's like having a room in your house with no door.
We found this out the hard way. During our migration from seojuice.io to seojuice.com in late 2024, we ran a post-migration crawl and discovered 837 orphan pages on our own site. Eight hundred and thirty-seven. These were a mix of old blog posts that lost their category page links in the redesign, tag archive pages that WordPress had generated automatically but that our new navigation didn't reference, and a handful of landing pages from campaigns we'd forgotten about entirely. For a company that sells SEO tooling, this was -- to put it diplomatically -- embarrassing. It also taught me more about orphan pages in one week than two years of writing about them had.
"Internal linking is one of the biggest things that you can do on a website to kind of guide Google and guide visitors to the pages that you think are important. And with internal linking, you can tell Google and visitors which pages you consider important."
The flip side of that point from Google's John Mueller is equally important: if you're not linking to a page, you're telling Google it's not important. And Google will treat it accordingly.
Nobody creates orphan pages on purpose. They accumulate through normal website operations -- slowly, invisibly, like dust in server racks. Here are the most common culprits:
Site redesigns and migrations. You rebuilt your navigation. The old category page that linked to 40 blog posts got replaced with a new one that links to 12. Those 28 posts are now orphans. This is by far the most common cause -- and exactly why I wrote the post-launch SEO checklist. Our own migration proved the point spectacularly: we had a redirect map for URLs, but nobody made a map for internal links. The pages were reachable by URL (redirects worked), but nothing linked to them anymore. Different problem, equally damaging. I keep coming back to this because it's the mistake I want every reader to avoid.
CMS-generated pages. Tag pages, date archives, author pages, paginated results -- your CMS creates these automatically, but they're often not linked from anywhere meaningful. WordPress alone can generate hundreds of these. I'd estimate (and I'm genuinely uncertain about this number) that 30-50% of orphan pages on the average WordPress site are CMS-generated pages that nobody asked for and nobody maintains. When we analyzed our own 837 orphans, 412 of them were tag and date archive pages. Nearly half.
Old landing pages. That campaign page from Q3 2024? The promotion that ended 18 months ago? Still sitting on your server, still getting crawled, still contributing nothing. Nobody removed it because nobody remembered it existed. We had 23 of these ourselves -- pages for webinars, seasonal offers, and a Black Friday campaign that I'm fairly sure we ran once.
Pagination changes. You had 50 products per page, now you show 100. Pages 6-10 of your old pagination still exist but nothing links to them anymore.
Content management drift. Over time, as you publish new content and remove old navigation items, pages that were once well-connected slowly lose their links. It's not a single event -- it's erosion. This is the hardest one to catch because there's no moment where something "breaks." It just gradually degrades. We see this constantly in SEOJuice audits: a site that was well-linked two years ago has slowly accumulated 50-80 orphans through nothing more dramatic than normal content operations.
Let me be direct about this. Orphan pages aren't just dead weight -- they actively damage your site's performance in three measurable ways.
Google allocates a finite amount of crawling resources to each site. This is your crawl budget -- the number of pages Googlebot will request from your server in a given time period. Every request spent on an orphan page is a request not spent on a page that actually matters.
The numbers are stark. Botify analyzed enterprise sites and found that orphan pages consume 26% of crawl budget on average. On badly maintained sites, that number hits 70%. TemplateMonster discovered 3 million orphan pages during a migration -- pages actively consuming crawl resources while 250,000 valuable commercial pages weren't being crawled at all.
For small sites (under 500 pages), crawl budget isn't usually a concern -- Google will crawl everything eventually. But the moment you cross a few thousand pages -- especially with a CMS that generates pagination, tags, and archive pages -- orphan pages start to have real impact. I want to be careful not to overstate this for smaller sites, because I've seen too many SEO articles cause unnecessary panic about crawl budget on a 200-page website. If that's you, orphan pages are still worth fixing (for authority flow reasons), but crawl budget isn't your problem.
Internal links pass authority (the idea behind Google's PageRank) from one page to another. Your homepage typically attracts the most external links, so it has the most authority. That authority flows through internal links to your subpages, which pass it on to their own subpages, and so on.
Orphan pages are cut off from this flow entirely. They receive zero internal authority. Even if they have great content, Google sees a page that your own site apparently doesn't think is important enough to link to. That's a strong negative signal. When we reconnected our 837 orphans (the ones worth saving, anyway), we saw measurable ranking improvements within 4-6 weeks for pages that had been sitting at positions 15-30. Just adding internal links. No content changes.
Google's crawl budget documentation states it plainly: pages with no incoming links may remain uncrawled regardless of their importance. If Google doesn't crawl a page, it can't index it. If it's not indexed, it can't rank. If it can't rank, it generates zero organic traffic.
The insidious part? This happens silently. You won't see an error in Search Console. The page doesn't "break." It just quietly stops existing in Google's view of the internet. We discovered some of our orphaned blog posts had been deindexed entirely -- Google had stopped visiting them months ago. The content was still good. It just had no links.
| Impact Area | What Happens | Scale of Damage |
|---|---|---|
| Crawl Budget | Orphan pages consume crawl resources without generating traffic | 26% avg. waste (up to 70% on neglected sites) |
| Authority Flow | Pages receive zero internal link equity -- appear unimportant to Google | Ranking potential reduced to near zero |
| Indexing | Pages may never be crawled, or drop from index over time | Botify: 60% of pages on avg. not crawled within 30 days |
| User Experience | Users who find orphan pages via search have no navigation path to the rest of your site | Higher bounce rates, lower engagement |
| Content Decay | Orphaned content gets stale -- no editorial review since it's out of sight | Outdated info damages brand trust |
There are three reliable methods. I use all three because each catches things the others miss. During our post-migration audit, the sitemap comparison caught 743 orphans, Search Console flagged 60 or so, and server log analysis caught an additional 94 that neither other method found. The overlap wasn't complete -- each method has blind spots.
This is the most reliable approach. You need two data sets:
Set A: Every URL in your XML sitemap (what you think your site contains).
Set B: Every URL discovered by crawling your site from the homepage (what's actually reachable via links).
Any URL in Set A that's not in Set B is an orphan page -- it's in your sitemap but not reachable through internal links.
You can do this with Screaming Frog, Sitebulb, or any crawler that can compare against a sitemap. In Screaming Frog: crawl your site in "Spider" mode, then use "Crawl Analysis" -> "Orphan Pages" to see the results.
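If you'd rather build Set A by hand, here is a minimal sketch for pulling the URLs out of a sitemap file. The function name and file names are my own placeholders. It assumes a plain, uncompressed XML sitemap with each `<loc>` element on a single line, and it does not decode XML entities such as `&amp;`.

```shell
# Print every <loc> URL from a sitemap file, one per line, de-duplicated.
# Assumption: a plain XML sitemap. A sitemap index that points to child
# sitemaps would need one pass per child file.
extract_sitemap_urls() {
  tr -d '\r' < "$1" \
    | grep -o '<loc>[^<]*</loc>' \
    | sed -e 's#<loc>##' -e 's#</loc>##' \
          -e 's#^[[:space:]]*##' -e 's#[[:space:]]*$##' \
    | sort -u
}

# Usage (file names are placeholders):
#   extract_sitemap_urls sitemap.xml > sitemap_urls.txt
```

The output is sorted and de-duplicated, so it can go straight into a line-by-line comparison against a crawl export.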
Google Search Console (GSC) shows you pages that are "Discovered -- currently not indexed" or "Crawled -- currently not indexed." These aren't all orphan pages, but many of them are. Cross-reference these URLs with your internal link data. If a page appears in either report and has zero internal links, it's an orphan.
Your server logs show every URL that Googlebot requests. Compare this against your crawl data. A page that Googlebot visits (found via sitemap or old cache) but that your crawler can't reach from the homepage is an orphan that's actively consuming crawl budget.
This is the method that catches the sneaky orphans -- pages that aren't in your sitemap either, but that Google remembers from a previous crawl. Ghost pages that consume resources but are invisible in every other report. When we ran our own log analysis after the migration, this method caught an additional 94 orphan pages that the sitemap comparison had missed entirely. They were old URLs that we'd removed from the sitemap but hadn't redirected or deleted. Google was still visiting them every few days, getting a 200 response, and wasting crawl budget on pages we thought we'd cleaned up.
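```bash
# Quick comparison: sitemap URLs vs. crawled URLs (bash, because of <(...))
# Export your sitemap URLs and crawled URLs as text files (one URL per line), then:

# Find URLs in the sitemap that weren't found during the crawl (potential orphans)
comm -23 <(sort -u sitemap_urls.txt) <(sort -u crawled_urls.txt) > orphan_candidates.txt

# Count them
echo "Potential orphan pages: $(wc -l < orphan_candidates.txt)"

# Access logs record paths, not full URLs, so strip the scheme and host first
sed -E -e 's#^https?://[^/]+##' -e '/^$/d' orphan_candidates.txt > orphan_paths.txt

# Cross-reference with server logs to see which are still being crawled
# (-F matches the paths as literal substrings, so treat the result as approximate)
grep "Googlebot" access.log | grep -F -f orphan_paths.txt > orphans_wasting_budget.txt
echo "Orphans actively wasting crawl budget: $(wc -l < orphans_wasting_budget.txt)"
```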
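To see which URLs Googlebot requests most often, a rough per-path count helps with prioritization. This sketch assumes the common "combined" access log format, where the request path is the 7th whitespace-separated field. The function name is hypothetical, and the user-agent match is only a first pass, since anyone can send a Googlebot user-agent string.

```shell
# Count Googlebot requests per URL path, most-requested first.
# Assumption: "combined" log format, so $7 is the request path.
# Adjust the field number if your server logs differently.
googlebot_hits_by_path() {
  awk '/Googlebot/ { hits[$7]++ }
       END { for (p in hits) print hits[p], p }' "$1" \
    | sort -k1,1nr -k2,2
}

# Usage (file name is a placeholder):
#   googlebot_hits_by_path access.log | head -n 20
```

Each output line is a hit count followed by a path. Comparing those paths with your orphan candidates shows which orphans are costing the most requests.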
Not every orphan page deserves the same fix. Here's my decision process, informed by triaging our own 837:
Step 1: Read the page. Is the information current? Is there search demand for this topic? Does the page get any traffic at all (check GSC)? If yes -- go to Step 2. If no -- go to Step 3.
Step 2: Link to it. This is the simplest fix. Find 3-5 pages on your site that are topically related and add links to the orphan page. Focus on:
Contextual links in body content. These are the strongest. A link from a related blog post paragraph is worth more than a link from a sidebar widget.
Navigation or category pages. If the orphan page belongs in a specific section of your site, add it to the relevant category page or navigation menu.
Related posts sections. If your CMS has a "related posts" feature, make sure the orphan page appears in relevant results.
Key Takeaway
Don't link to an orphan page from another orphan page. That just creates a cluster of isolated pages that Google still can't reach from your main site structure. Every orphan page needs at least one link from a well-connected page. We made this exact mistake during our initial fix: we linked 30 orphaned blog posts to each other, creating a little island of 30 pages that was still disconnected from the main site. Had to redo it.
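The island problem above is easy to check mechanically: walk the link graph from the homepage and see which pages you actually reach. Here is a minimal sketch, assuming you can export your internal links as a text file with one "source target" pair per line. That file format and the function name are my assumptions, not a standard export.

```shell
# Print every URL reachable from START by following links in EDGES_FILE.
# Assumption: EDGES_FILE has one "source target" pair per line.
reachable_from() {
  start="$1"; edges_file="$2"
  awk -v start="$start" '
    { links[$1] = links[$1] " " $2 }      # adjacency list per source page
    END {
      # Breadth-first walk starting at the given page
      queue[1] = start; seen[start] = 1; head = 1; tail = 1
      while (head <= tail) {
        page = queue[head++]
        print page
        n = split(links[page], out, " ")
        for (i = 1; i <= n; i++)
          if (!(out[i] in seen)) { seen[out[i]] = 1; queue[++tail] = out[i] }
      }
    }' "$edges_file"
}

# Usage (names are placeholders):
#   reachable_from https://example.com/ internal_links.txt > reachable.txt
```

Any page that has inbound links in the export but is missing from the output belongs to an island: it is linked, just not from anything connected to the homepage.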
Step 3: Remove or redirect. Not every page deserves to be saved. Of our 837 orphans, roughly 600 were genuinely worthless -- auto-generated tag archives, expired campaign pages, and a few draft posts that had accidentally been published. Sometimes the right answer is deletion, not rescue. Here are your options:
| Scenario | Action | When to Use |
|---|---|---|
| Duplicate or near-duplicate | 301 redirect to the canonical version | Two pages covering the same topic -- merge them |
| Outdated campaign page | Return 410 (Gone) status code | Content is permanently irrelevant |
| Thin content, no search value | Noindex + remove from sitemap | Pages like empty tag archives or old pagination |
| Valuable but outdated | Update content + add internal links | Good topic, but info needs refreshing |
| Auto-generated junk | Delete and return 404/410 | CMS-generated pages that should never have existed |
A quick note on 410 vs. 404: Google treats them similarly, but 410 (Gone) explicitly tells Google the page is permanently removed. It's a slightly stronger signal to stop wasting crawl budget on this URL. Use 410 when you're certain the page will never return.
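After a cleanup, it's worth confirming that removed URLs really return 404 or 410 rather than a stray 200. The sketch below uses `curl` HEAD requests and does not follow redirects, so 3xx codes are reported as-is. The function names and verdict wording are mine, and some servers answer HEAD differently from GET, so spot-check anything surprising.

```shell
# Classify the HTTP status a removed URL returns.
removal_verdict() {
  case "$1" in
    410)     echo "ok: gone" ;;
    404)     echo "ok: not found" ;;
    301|308) echo "check: permanent redirect, confirm the target is relevant" ;;
    302|307) echo "check: temporary redirect" ;;
    200)     echo "problem: still serving content" ;;
    *)       echo "check: unexpected status $1" ;;
  esac
}

# Fetch the status for each URL in a file (one URL per line) and classify it.
check_removed_urls() {
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    status=$(curl -s -o /dev/null -I -w '%{http_code}' "$url")
    printf '%s\t%s\t%s\n' "$url" "$status" "$(removal_verdict "$status")"
  done < "$1"
}

# Usage (file name is a placeholder):
#   check_removed_urls removed_urls.txt
```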
Finding and fixing orphan pages is a reactive task. Here's how to prevent them from appearing in the first place. Each of these came directly from our post-migration lessons.
Mandatory internal link check before publishing. Every new page you publish should link to at least 3 existing pages and be linked from at least 3 existing pages. Make this a requirement in your content workflow -- not a suggestion. We added this as a literal checkbox in our publishing process after the 837-page incident. It has not been optional since. Any new blog post that doesn't have at least 3 incoming internal links before publishing gets sent back.
Automated internal linking. Tools like SEOJuice's automated linking scan your content and suggest (or automatically insert) relevant internal links. This catches the drift problem -- as you publish new content, old pages automatically get linked from new ones. Yes, I'm recommending our own product here, and yes, I'm biased, but this is also genuinely the problem that motivated us to build the feature in the first place. We built it because we had 837 reasons to.
Monthly crawl audits. Run a full site crawl once a month and check for new orphan pages. This takes 10 minutes and catches problems before they compound. We run ours on the first of every month. The number of new orphans we catch has dropped from dozens to single digits since we implemented the publishing checklist.
Redirect mapping for every redesign. Before you launch a new design, export your current URL structure and verify that every page is reachable in the new navigation. This alone prevents the #1 cause of orphan pages. I wish someone had told me this more forcefully before our migration. I knew it intellectually. I just didn't do it thoroughly enough. We mapped URLs but not internal links. Map both.
Clean up your CMS. Disable automatic generation of pages you don't need -- tag archives, date archives, author pages (if you have one author). Every auto-generated page that isn't in your navigation is a potential orphan. The 412 tag and date archive orphans we found were all pages WordPress created automatically that nobody ever linked to or visited.
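The "at least 3 incoming links" rule from the publishing check can be automated as part of a monthly crawl. This is a sketch under assumptions: a page list with one URL per line, and a link export with one "source target" pair per line. Both file formats and the function name are hypothetical.

```shell
# List pages with fewer than MIN inbound internal links (default 3).
# PAGES_FILE: one URL per line. EDGES_FILE: one "source target" pair per line.
# Self-links are ignored and duplicate source->target pairs count once.
pages_below_min_inlinks() {
  pages_file="$1"; edges_file="$2"; min="${3:-3}"
  awk -v min="$min" '
    mode != "pages" {                      # first file: the link export
      if (NF >= 2 && $1 != $2 && !(($1, $2) in pair)) {
        pair[$1, $2] = 1
        inlinks[$2]++
      }
      next
    }
    NF {                                   # second file: the page list
      if (inlinks[$1] + 0 < min) print inlinks[$1] + 0, $1
    }
  ' "$edges_file" mode=pages "$pages_file"
}

# Usage (file names are placeholders):
#   pages_below_min_inlinks all_pages.txt internal_links.txt 3
```

Each output line is the inbound link count followed by the URL. A count of 0 is a true orphan; 1 or 2 means the page is one navigation change away from becoming one.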
Here's a rough formula I use to estimate the traffic impact of orphan pages:
| Metric | How to Get It | Example |
|---|---|---|
| Total orphan pages | Crawl vs. sitemap comparison | 150 pages |
| % of orphans with search demand | Check GSC impressions for orphan URLs | 40% (60 pages) |
| Average impressions per orphan page | GSC data for those 60 pages | 200 impressions/month |
| Expected CTR if properly linked | Your site's average CTR for similar pages | 3.5% |
| Lost clicks per month | 60 x 200 x 0.035 | 420 clicks/month |
420 clicks per month from pages that already exist on your site. No new content needed. No link building. Just add some internal links. That's the kind of ROI that makes SEO the best marketing channel in existence.
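If you want to plug in your own numbers, the formula from the table fits in a few lines. This is a rough estimator and no more precise than its inputs. The function name is mine, and the shares are expressed as decimals between 0 and 1.

```shell
# Estimate monthly clicks lost to orphan pages.
# Arguments: total orphan pages, share with search demand (0-1),
# average monthly impressions per such page, expected CTR (0-1).
estimate_lost_clicks() {
  awk -v pages="$1" -v share="$2" -v impressions="$3" -v ctr="$4" \
    'BEGIN { printf "%.0f\n", pages * share * impressions * ctr }'
}

# Usage with the example figures from the table:
#   estimate_lost_clicks 150 0.40 200 0.035
```

With the example figures from the table it prints 420, matching the 60 x 200 x 0.035 row.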
For context: our data across 200+ sites shows that the average website has 8-15% of its pages orphaned. For sites that haven't been audited in over a year, that number jumps to 20-30%. The larger the site, the worse it gets. And if you've done a migration recently without an internal link audit? Check your numbers. I'd bet the over on 20%.
"Pages that are not linked in the site structure consume 26% of Google's crawl budget. For local businesses with fewer than 500 pages, orphan pages waste crawl budget while generating only 5% of organic traffic despite representing up to 70% of crawled pages."
The 26% figure comes from Botify's analysis of enterprise sites with thousands of pages, but the pattern holds at every scale. Orphan pages are a universal problem with a universal solution: find them and either link to them or remove them.
Technically, yes. If an orphan page is in your sitemap, Google may eventually find it. If it has external backlinks, Google can discover it that way. But "eventually" can mean months, and even if it's indexed, the lack of internal links means it receives zero authority -- so it's unlikely to rank for anything competitive. Submitting a page in your sitemap is not a substitute for proper internal linking.
Technically, one. But one link from a footer doesn't carry the same weight as three contextual links from relevant content pages. I recommend a minimum of 3 internal links from topically related pages. Important pages should have significantly more -- your top content should have 10+ internal links pointing to it.
Not necessarily. If your tag pages are linked from your navigation, sidebar, or footer, they're not orphans. The problem arises when you create tags for every conceivable keyword (a common WordPress habit) and most of those tag pages never appear in any navigation. A tag page with 2 posts and no inbound links is pure dead weight. We had hundreds of these.
If the pages are truly orphaned (no traffic, no backlinks, no internal links), deleting them won't hurt anything. You can't lose what you don't have. Just make sure you return a proper 404 or 410 status code so Google stops trying to crawl them. Don't redirect orphan pages to unrelated content -- Google is likely to treat that kind of redirect as a soft 404, so it gains you nothing.
Monthly for sites that publish frequently (more than 10 pages per month). Quarterly for smaller sites. After every site redesign or migration, do an audit within the first week -- that's when orphan pages are most likely to appear in bulk. Trust me on the migration part. I learned it with 837 reasons.
SEOJuice Automated Internal Linking -- Continuously scans your site and automatically adds internal links to orphaned content. The best fix is the one that happens without you thinking about it.
Content Silos for SEO Guide -- Orphan pages are a symptom of poor content architecture. This guide covers how to build a silo structure that keeps every page connected.
Internal Link Finder Tool -- Free tool that analyzes your site structure and identifies linking opportunities you're missing.
Every orphan page is a missed opportunity. The content is already written. The page already exists. It just needs a link. That's the lowest-effort, highest-return SEO fix available -- and most sites are sitting on dozens of them right now.