A transparency post on the technical SEO work we've shipped, the reasoning and evidence behind the choices, and the limits of what any single frontend can do.
Why we dug into this
Every year we analyze the traffic and performance of our website. Ecency's organic search traffic declined slightly over the past year. The instinctive fix people reach for is "submit a sitemap." Before doing that, we measured - and the data reframed the problem.
The most important distinction up front: discovery is not indexing. Sitemaps came up again in recent community discussions, and we shared our thoughts there, but we felt a dedicated post was essential for transparency, not only about sitemaps but also about our recent changes.
Here is the decisive fact: ecency.com has never had a sitemap, and Google has still discovered the vast majority of its URLs - millions of them. That's proof, not theory - discovery was never the bottleneck. The problem is that Google chooses not to index most of what it found - the "Discovered/Crawled – currently not indexed" buckets in Search Console - because it judges much of that content as low demand or low quality. A sitemap does not change that judgment; it would only restate URLs Google already found and already declined.
What a sitemap does and doesn't do
This is well documented by Google and stated repeatedly by their Search team:
Sitemaps help with change signaling (lastmod) and crawl prioritization on very large sites.
The crawler discovers linked URLs on its own perfectly well - a sitemap is not how Google finds content on a well-linked site.
A sitemap does not make Google index pages it has decided aren't worth indexing.
Per Google's own SEO documentation, duplicate or thin content "is not a violation of our spam policies … [but it's] not something that will cause a manual action" - it's simply not indexed or consolidated away. The lever is the content/quality signal, not the URL list.
Including millions of URLs in a sitemap when most are thin posts, auto-generated reports, or duplicate comment pages would make our signal to Google worse, not better - it would spend crawl budget reaffirming low-value pages.
We did ship dynamic sitemap support into Web (Vision) - but for the one reason it genuinely helps: lastmod-driven revalidation of edited posts. It is deliberately scoped (recent + traffic-bearing + curated, not the whole corpus) and is hygiene, not a ranking fix. We want to be clear about that so expectations are calibrated.
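To make the scoping concrete, here is a minimal sketch of a lastmod-driven sitemap limited to a subset of posts. It is an illustration, not the code shipped in Vision: the Post fields, the select_for_sitemap thresholds (30 days, 10 monthly visits), the example.com URLs, and the choice to treat "recent + traffic-bearing + curated" as a union are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


@dataclass
class Post:
    url: str                 # URL we want Google to revalidate
    updated: datetime        # last edit time (timezone-aware)
    monthly_visits: int = 0  # hypothetical traffic signal
    curated: bool = False    # hypothetical editorial flag


def select_for_sitemap(posts, now, max_age_days=30, min_visits=10):
    """Keep recent, traffic-bearing, or curated posts (assumed thresholds)."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        p for p in posts
        if p.updated >= cutoff or p.monthly_visits >= min_visits or p.curated
    ]


def build_sitemap(posts):
    """Serialize posts as a <urlset> with one <lastmod> per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for p in sorted(posts, key=lambda p: p.updated, reverse=True):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = p.url
        # lastmod in W3C datetime format, normalized to UTC
        stamp = p.updated.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        ET.SubElement(url, "lastmod").text = stamp
    return ET.tostring(urlset, encoding="unicode")
```

The point of the filter is that the sitemap stays small and every lastmod in it reflects a real edit, which is the only signal we expect the file to carry.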
The actual lever: indexing quality
The thing that meaningfully changes how Hive content shows up in Google is reducing the low-quality footprint so Google's site-level quality assessment isn't dragged down. What we shipped:
NSFW / adult content noindex - via tags, known adult communities, and conservative title heuristics (it catches legacy posts that deliberately omit the nsfw tag).
Abuse-blacklist + effectively-empty content noindex - folded into one shared indexability decision so it's consistent everywhere.
Comment / short-form canonical consolidation - adopting the model Reddit, Stack Overflow and Quora use: a reply is not its own indexable page; it canonicalizes to the root of its discussion thread. Short-form microblog container anchor posts are noindex; an individual short post is indexed only if it clears a content + engagement bar. This stops thin comment pages from diluting the domain.
These remove drag. They don't manufacture rankings - we want to be honest that no technical change does.
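The idea of "one shared indexability decision" can be sketched as a single function that every page type calls. This is an illustration under stated assumptions, not our production logic: the Entry fields, the placeholder community and account names, the title hints, every threshold, and the order of the checks are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical placeholders; real lists and thresholds would differ.
ADULT_COMMUNITIES = {"hive-000000"}
BLACKLISTED_AUTHORS = {"known-spammer"}
NSFW_TITLE_HINTS = ("nsfw", "18+")
MIN_BODY_CHARS = 50      # below this a post counts as effectively empty
SHORT_MIN_CHARS = 200    # content bar for short-form posts
SHORT_MIN_REPLIES = 2    # engagement bar for short-form posts


@dataclass
class Entry:
    author: str
    permlink: str
    title: str = ""
    body: str = ""
    tags: List[str] = field(default_factory=list)
    community: str = ""
    root_url: Optional[str] = None    # set when the entry is a reply
    is_short_container: bool = False  # microblog container anchor post
    is_short_post: bool = False
    replies: int = 0


@dataclass
class Decision:
    noindex: bool
    canonical: str
    reason: str


def decide(entry: Entry, base: str = "https://example.com") -> Decision:
    """Return one robots/canonical decision for any kind of entry."""
    own_url = f"{base}/@{entry.author}/{entry.permlink}"
    title = entry.title.lower()
    tags = {t.lower() for t in entry.tags}
    body_len = len(entry.body.strip())

    if ("nsfw" in tags or entry.community in ADULT_COMMUNITIES
            or any(hint in title for hint in NSFW_TITLE_HINTS)):
        return Decision(True, own_url, "nsfw")
    if entry.author in BLACKLISTED_AUTHORS:
        return Decision(True, own_url, "blacklisted")
    if entry.is_short_container:
        return Decision(True, own_url, "short-form container")
    if entry.is_short_post:
        ok = body_len >= SHORT_MIN_CHARS and entry.replies >= SHORT_MIN_REPLIES
        reason = "short post above bar" if ok else "short post below bar"
        return Decision(not ok, own_url, reason)
    if entry.root_url is not None:
        # Replies carry no noindex here; they point at the thread root.
        return Decision(False, entry.root_url, "reply")
    if body_len < MIN_BODY_CHARS:
        return Decision(True, own_url, "empty")
    return Decision(False, own_url, "ok")
```

Because the page template, the sitemap filter, and any feed would all call the same decide function, a post cannot end up noindex in one place and listed as indexable in another.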
The canonical change - and the cross-frontend convention
This is the change most relevant to other frontends, so we want to explain it openly.
Historically, Hive frontends have followed a convention: point rel=canonical at the frontend a post was first published from. Ecency followed this for years. The intent is good - consolidate signals so the ecosystem isn't fragmented.
We tested whether it actually does that. It does not. On real ecency.com post pages where our HTML declared a cross-domain canonical to another frontend, Google Search Console's URL Inspection showed no user-declared canonical recorded and Google selecting its own representative URL (an ecency.com one). Google's documentation is explicit that a declared canonical "is a hint, not a rule," and that Google picks the version it judges "objectively the most complete and useful for search users."
A second, measurable factor: rendering. When fetched the way a crawler without JavaScript sees them, some Hive frontends are client-side rendered - the initial HTML is a near-empty application shell with the article body absent until scripts execute - while Ecency server-renders the full article in the initial response. This isn't a criticism; client-side rendering is a legitimate architecture with its own benefits. But it has a concrete SEO consequence: when Google chooses among duplicate copies, it favors the version it can parse immediately.
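The rendering difference is easy to check yourself. The sketch below takes the raw HTML of a page (as returned by any plain HTTP client that does not execute scripts) and estimates how much readable text it contains. The function names and the 500-character threshold are assumptions chosen for illustration; this is a rough heuristic, not how Google evaluates pages.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that is present without running any JavaScript."""

    # Content inside these tags is not article text in the initial HTML.
    SKIP = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def visible_text_length(html: str) -> int:
    """Number of characters of readable text in the raw HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks))


def looks_server_rendered(html: str, min_chars: int = 500) -> bool:
    """Heuristic: an app shell has almost no text before scripts run."""
    return visible_text_length(html) >= min_chars
```

Running this against the same post on several frontends gives a quick, reproducible picture of which ones deliver the article body in the first response.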
Net effect of the old convention for Ecency: we were declining our own indexable, server-rendered content in favor of a canonical Google was ignoring anyway. So we changed it:
Ecency now self-canonicalizes to a single clean URL per post.
We still fully honor an explicit author-declared canonical_url in post metadata - that is genuine syndication intent and is respected first.
We only stopped the inferred, app-based cross-frontend canonical - which was a frontend's guess, not the author's declaration, and which Google treated accordingly.
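The resolution order described above can be sketched in a few lines. This is an illustration, not our production code: the resolve_canonical name, the example.com base, the /@author/permlink URL shape, and the assumption that the publishing frontend is recorded in an "app" metadata field are hypothetical.

```python
from urllib.parse import urlsplit


def resolve_canonical(author: str, permlink: str, metadata: dict,
                      base: str = "https://example.com") -> str:
    """Pick the canonical URL for a post.

    1. An explicit, well-formed canonical_url declared by the author wins.
    2. Otherwise the page canonicalizes to its own clean URL.
    The "app" field (which frontend published the post) is deliberately
    not consulted: that was the inferred convention we stopped using.
    """
    declared = metadata.get("canonical_url")
    if isinstance(declared, str):
        candidate = declared.strip()
        try:
            parts = urlsplit(candidate)
        except ValueError:
            parts = None  # malformed URL, fall through to self-canonical
        if parts and parts.scheme in ("http", "https") and parts.netloc:
            return candidate
    return f"{base}/@{author}/{permlink}"
```

For example, a post whose metadata only names the frontend it was published from resolves to the self-canonical URL, while a post with an author-declared canonical_url pointing at the author's own blog keeps that URL.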
Does this hurt other frontends? Based on the evidence and Google's documentation, no:
There is no cross-domain "duplicate content penalty." Google clusters copies and picks one per query; the others aren't penalized, they're just not the one shown for that query.
The old convention wasn't actually consolidating signals (Google ignored it), so abandoning it doesn't remove protection that was working.
Nothing we changed affects another frontend's inbound links, crawlability, or how Google treats its pages. Google was already choosing per query regardless of the declarations.
We'd genuinely encourage other frontend teams to look at their own Search Console URL-Inspection data on shared posts - the cross-domain canonical behavior is observable, and an honest ecosystem-wide picture helps everyone. We're happy to share our methodology.
What's upstream of any frontend
The largest Hive SEO gaps are not in frontend code:
External backlink authority. Hive content has comparatively little inbound link equity from high-authority sites. No canonical or sitemap change fixes this.
Brand disambiguation. "HIVE Digital Technologies" (publicly traded, NASDAQ/TSXV) dominates branded search for the term "hive," crowding out the blockchain ecosystem. This has improved over the last couple of years but still requires attention and effort. This is an entity/brand-SEO problem - consistent naming ("Hive blockchain"), entity signals, Wikipedia/Wikidata presence - and it needs ecosystem-level effort, not a single app.
These are worth tackling together, alongside the technical hygiene.
An open invitation: compare notes every 3-6 months
SEO on a platform like Hive is a long game, not a one-time fix. Google's systems change, re-crawl of large sites takes months, and - crucially - the frontends are a natural experiment: the same underlying content, rendered by different architectures (server- vs client-side), with different community mixes and technical choices. None of us sees the full picture alone; together the signal is strong.
So we'd like to propose a lightweight, recurring practice: each frontend runs its own measurements and shares findings every 3–6 months. Nothing heavy - from your own Search Console:
indexed vs. discovered counts, and the top "not indexed" reasons;
canonical behavior on a handful of shared sample posts (URL Inspection);
any structural change you shipped, and what you observed afterward.
Post it on-chain under a common tag - we'll use #hiveseo - so it's discoverable and durable. Ecency will go first and keep to this cadence publicly; this post is our first entry. There's no competition on this layer: if Hive content ranks well on any frontend, the whole ecosystem benefits. We're also happy to share the scripts and methodology we used so comparisons are apples-to-apples.
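For the canonical check on sample posts, one simple starting point is to record what each frontend's raw HTML declares, then compare that against what URL Inspection reports. The sketch below is one possible way to do the first half; the classify_canonical name and its three labels are assumptions, and it reads only the HTML you pass in.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class CanonicalFinder(HTMLParser):
    """Find the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "canonical" in rels and attributes.get("href"):
            self.href = attributes["href"]


def classify_canonical(html: str, page_url: str):
    """Return (label, canonical_url) for the canonical declared in html."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    if finder.href is None:
        return "missing", None
    # Resolve relative hrefs against the page URL before comparing hosts.
    canonical = urljoin(page_url, finder.href)
    same_host = urlsplit(canonical).netloc.lower() == urlsplit(page_url).netloc.lower()
    return ("self-domain" if same_host else "cross-domain"), canonical
```

Logging this label next to the user-declared and Google-selected canonicals from URL Inspection, for the same handful of posts on each frontend, is enough to make the comparison repeatable.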
What we're watching, honestly
Search Console data converges after canonical/redirect changes one URL at a time, as each URL is re-crawled - realistically weeks to months for very large URL buckets. We've set up weekly monitoring and will report back rather than declare victory early.
The short-form quality thresholds are deliberately conservative starting points; we'll tune them from real data and have a contingency checkpoint if they're too strict or too loose.
If our self-canonical choice turns out to be rejected by Google at scale (a metric we're tracking), we'll revisit.
We'd rather be transparent about the uncertainty than overstate the impact. The technical work removes drag and stops us handicapping ourselves; sustained improvement still depends on content quality and the upstream ecosystem factors above.
References
Google Search Central - SEO Starter Guide - duplicate content is "not a violation of our spam policies" and "not something that will cause a manual action."
Google Search Central - Canonicalization - a declared canonical "is a hint, not a rule"; Google picks the version "objectively the most complete and useful for search users."
Google Search Central - Consolidate duplicate URLs - Google consolidates link signals onto a single preferred URL.
Google Search Central Blog (2008) - "Demystifying the duplicate content penalty" - the original public statement that there is no duplicate-content penalty.
Questions and corrections welcome - especially from other frontend teams. The goal is Hive content ranking well, on whichever frontend Google can best serve it. We'll publish updated findings in 3–6 months under #hiveseo - and we hope you'll share yours too.