The Crawl Tool Blog

Expert insights, guides, and tips to improve your website's SEO performance

Why Having The Biggest Backlinks Database is Silly
By The Crawl Tool Team
March 29, 2025 · 7 minute read

How Backlinks Apply to SEO

There's no question that backlinks are important in SEO. If you've read anything on the subject, you'll know that a good backlink passes ranking value to your web page. I've read plenty of explanations of how this works, some good and some bad, but for this article we don't need to go into depth. It's enough to know that backlinks pass on some sort of magic numbers that affect how well your website ranks in searches, and which topics it ranks for. A key part of the topic side is the text used when somebody links to your web pages.

Now this might have you thinking "more backlinks = always good". You would not be alone in that. But the reality is that you only need the right backlinks: the strong ones from good sites, which are themselves sites with lots of strong backlinks. Many backlinks are simply not worth the time and effort to get. You shouldn't say no if they're handed to you on a plate, but they're not helping you much either.

Our Backlink Database

Because backlinks are so vital to SEO, a number of backlink databases have sprung up. These databases record which web pages link to which other pages, the anchor text used, and often a lot of extra information that nobody ever asked for but that is used to justify the high prices.

To be an SEO tool these days you really need a backlink database. There are two ways to get one: build your own, or buy access to somebody else's via an API. The first way is massively complicated, so most people opt for the second, but that means paying high prices to resell data to your customers. Ultimately you're just passing on excessive costs.

The Crawl Tool is different. We want value for our customers, so after looking into the cost of data sources, our choice was to build our own database. That way we can offer good value to our customers, and also provide an API service that helps drive down costs in the industry as a whole.

SEO is, of course, a marketing field, and like many things in the industry, a lot of it is marketing. Database size is one of those things. To understand why, we need to go back to the old days of search.

Search Engines of Old

It's 2005 and you've finally recovered from the anxiety of the year 2000 bug that, it turned out, didn't end the world after all. You're celebrating the realization that programmers had been using date offsets for decades and you don't need to worry about the next batch of them until 2030. Wait, what year is it actually now? I digress. You go to Google to find concert tickets for some awful band. You see this:

You pause for a second to admire the modern design. You ask yourself "Am I Feeling Lucky", then just like everyone else decide that you're not and this is the internet so who knows where that button might send you.

Okay, back to 2025. Try to ignore that you're a mere 5 years away from the 2030 bug. You head to the same search engine and you see this:

Or something like that anyway. What do you not see?

Searching 8,048,044,651 web pages

Search for something and it won't even tell you how many results there are! Tell that to someone from 2005.

Back in the day these numbers were important. All the search engines did it - it was their marketing. Bigger is better. Right?

In reality, though, who cares if your query has 6,437,280,000 results when nobody is going to bother clicking past a couple of pages, let alone to page 64,783? What is the point of 8,048,044,651 pages if 8,048,044,650 of them are spam on some unknown site in the basement of the internet? I exaggerate somewhat, of course, but the point stands: once the index contains a useful corpus of pages, the rest, most of which will never be shown, isn't really meaningful or anything to boast about.

At some point in the last decade or so, search engines saw sense and stopped doing this. Marketing your worth on a meaningless metric just drives up internal costs to meet your own marketing claims.

Some of Us Don't Learn So Quick

After that brief search engine history, let's get back to backlink databases for SEO tools. The situation is largely the same: once you have a corpus of all the important web pages, you have all the significant links. While it might be nice to see the link from Uncle Ken's off-topic page that only Auntie Sara visits show up, it doesn't actually do anything for your SEO.

We're building our backlink database. We currently have 23 billion backlinks, and by the end of the first build we should have about 26 billion. If that sounds like a lot, that's because it is: it easily covers that useful corpus. But take a look at the competitors:

Ahrefs - claims to have the largest index of 35 trillion backlinks

Semrush - claims 43 trillion backlinks

DataForSEO - claims 2.8 trillion backlinks

Those are astounding numbers. Ahrefs claims they'd need to spend $300 million a year on cloud services if they didn't have their own infrastructure.

One of the decisions we have to make at The Crawl Tool is whether to chase those numbers. Part of the answer is obvious: when you're the best-value SEO tool around, you don't have a spare $300 million in your back pocket to match them. The other part is perhaps only obvious because of the search engine story: having trillions of backlinks in your database is completely pointless when nearly all of them have no or negligible SEO value. It's just a number, like the ones search engines used to slap on before they saw sense. A very expensive, utterly pointless number.

We'll likely grow the database beyond the 26 billion mark, but the returns diminish rapidly. What matters to us is coverage of useful backlinks, not a marketing number that is exorbitantly expensive to achieve, means storing useless data, and passes the cost on to customers.

It Also Gets Worse Than Just Marketing

While discussing backlinks with people, I was shown Majestic's word cloud of backlink anchor texts. It made the problem with filling databases with meaningless data for the sake of marketing numbers immediately obvious: it creates a fog that makes analysis difficult. You have 7,000 backlinks, great. But how many of them are worthless? Don't worry, the tools provide ways to find out, but they involve more clicks, and the clicks add up. So the vendors add a word cloud, letting you visualize all the anchor texts in one image. Except it's full of stop words and fragments like "https" and "com" because the data hasn't been filtered properly, and it surfaces weird terms because that's what some of the pages contributing nothing to your site's ranking happen to talk about. None of this is weighted out of the chart, because processing all that negligible-value data is too expensive.

It's such a problem that users have difficulty interpreting the data, so the companies themselves try to make it easier, and then realize that actually they can't!
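The filtering these tools skip is cheap at small scale. Here's a minimal sketch of the idea: strip stop words and URL fragments from anchor texts before counting terms for a word cloud. The anchor strings, the `NOISE` set, and the `anchor_terms` helper are all hypothetical examples for illustration, not any vendor's actual pipeline.

```python
import re
from collections import Counter

# Hypothetical anchor texts, as a raw backlink database might return them.
anchors = [
    "https://example.com best crawl tool",
    "click here",
    "the crawl tool",
    "best crawl tool",
    "www example com",
]

# Tokens that carry no ranking signal: stop words, plus URL fragments
# like "https" and "com" that leak into anchor text when data isn't cleaned.
NOISE = {"https", "http", "www", "com", "the", "a", "an", "here", "click"}

def anchor_terms(texts):
    """Tokenize anchor texts, drop noise tokens, and count what remains."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            if token not in NOISE and len(token) > 1:
                counts[token] += 1
    return counts

print(anchor_terms(anchors).most_common(3))
```

With the junk tokens removed, the counts actually reflect how sites describe your pages ("crawl" and "tool" dominate here) instead of being drowned out by "https" and "com".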

Our Approach

We're building a backlink database of important, actionable data that has real influence on your rankings. It should not be surprising that in a technical marketing industry, some marketing claims go a little insane. Big search engines did it, so why not SEO companies? But we won't be following the trend, both because we can't afford to and because it's idiotic. We want a genuinely useful database and a genuinely good price for our customers. If I ever say it has a trillion links, please hit me with a big stick.

Ready to find and fix your website's SEO issues?

Start with a free crawl of up to 1,000 URLs and get actionable insights today.

Try The Crawl Tool Free