To recover from a manual penalty or from Google Penguin, you must identify the bad backlinks pointing to your site in order to remove them. In fact, this backlink analysis should be part of any SEO engagement, to prevent any penalty, whether algorithmic or manual. This is all the more true now that Google Penguin is built directly into the core Google algorithm: it is harder to know whether a site is penalized, because we can no longer correlate the date of an update with a drop in traffic.
Edit 05/10/2016: In addition, Penguin 4 tends to devalue the toxic link rather than penalize the entire site, which also makes it harder to analyze a traffic drop that is now more granular. However, a manual penalty is always possible, and in any case the devaluation of a link always affects part of a site, or the whole site, through the snowball effect on the transmission of PageRank. Not to mention the detrimental effect that a bad link neighborhood has on other algorithmic criteria and on the overall relevance of the site.
It can be a long analysis: you can quickly end up with thousands of backlinks on small to medium sites, and up to several million on large ones. So here are my tips for identifying bad backlinks quickly and easily. All that will remain is to remove them.
What is a poor quality link?
Before getting into the practical steps for finding and removing bad backlinks, it is worth recalling what a poor quality link is. Many sites have abused, and still continue to abuse, Google's algorithm with artificial links. Because inbound links and the PageRank they bring are a ranking factor in SEO, the number of inbound links to a site long outweighed their quality.
But that time is gone: it is better to have a few quality links than many bad ones. More concretely, bad backlinks are links:
- Too numerous and on competitive keywords (often with over-optimized link texts)
- From sites of poor quality or even already penalized
- Non-thematic, non-contextualized links
- And anything that creates too much imbalance in the site's link profile (I discuss the notion of link profile further down)
Basically, these are often directories, link farms, badly placed partner links (in the footer, for example), foreign sites whose theme has nothing to do with yours, etc.
Where do these bad incoming links come from?
Well, they often come from link building by unscrupulous SEO agencies using black hat methods, or simply from incompetent ones. For old links this is understandable: a few years ago, Google was less vigilant and less effective at detecting artificial links. That said, it was already considered bad practice, and personally I have advised against this shortcut for more than 10 years, because I have always approached SEO with the long term in mind.
But this is obviously not the case for everyone, especially as many SEOs still try to outsmart Google with these short-term strategies.
Artificial links can also come from negative SEO attempts, i.e. malicious people who deliberately link your site from poor quality sites so that it gets penalized. It is becoming more and more common; some specialize in it, and where there is an offer, there is a demand: competing sites.
It can also come more simply from sites that automatically aggregate content from other sites, with this kind of side effect.
1 / Listing all the backlinks
First step: establish a list of all the site's backlinks. For this, one or more tools may be used. Personally, I combine Search Console and Majestic SEO (paid). Search Console data alone is not sufficient, because it does not let you extract all the backlinks and it lacks a lot of useful data, such as the link text.
Other tools exist, such as Ahrefs, Open Site Explorer or Searchmetrics, but in any case you will have to go through a paid tool to get the maximum number of backlinks and information such as link date, link quality indexes, matching between linking pages, etc. I prefer Majestic SEO because, according to my tests, it seems to have the largest backlink database, and I like its quality scores: TrustFlow, Topical TrustFlow, and CitationFlow for popularity.
Once the list of inbound links is exported to Excel, I add a "linking domain" column in order to extract the domain of each backlink. The goal is to make it easy to filter and build pivot tables on domains, with an Excel formula that extracts the domain from a URL (E2 being the cell containing the URL whose domain you want to extract).
It’s simpler to analyze domains than each backlink one by one.
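If you prefer a script to a spreadsheet formula, the same "linking domain" column can be produced with a few lines of Python. This is a minimal sketch, with made-up example URLs:

```python
from urllib.parse import urlparse

def linking_domain(url: str) -> str:
    """Return the 'linking domain' of a backlink URL."""
    netloc = urlparse(url).netloc
    # Drop a leading "www." so www.example.fr and example.fr group together.
    return netloc[4:] if netloc.startswith("www.") else netloc

backlinks = [
    "https://www.example.fr/article/seo-tips",
    "https://spam-directory.ru/links?page=3",
]
print([linking_domain(u) for u in backlinks])
# → ['example.fr', 'spam-directory.ru']
```

The same list of domains can then be counted or filtered just like the pivot table would.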
2 / Analyzing new links
Analyzing the evolution of new inbound links is a very good way to spot bad links early. A massive influx of new links over a short period will be considered suspicious by Google, and in most cases they will indeed be poor quality links.
Well, maybe not all of them. For example, I once saw a quality blog link to a site from a relevant article, which is fine, but that same blog had the bad idea of also adding the same link in a "partners" block on every page of the site. In that case: the article, yes, but the partners block, no. This is why, in all cases, you should always perform a manual check before deciding to completely remove a linking domain.
Be careful not to be too quick to remove links; otherwise you will lose popularity and rankings.
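To spot a sudden influx of new links, you can group the links by the "first seen" date that paid tools export and flag abnormal months. A possible sketch, with hypothetical data and an arbitrary threshold of twice the monthly average:

```python
from collections import Counter
from datetime import date

# Hypothetical export: (first-seen date, linking URL), as provided
# by backlink tools such as Majestic SEO or Ahrefs.
new_links = [
    (date(2017, 1, 3), "https://blog-a.fr/article"),
    (date(2017, 2, 7), "https://blog-b.fr/recette"),
    (date(2017, 3, 2), "https://dir-1.info/seo"),
    (date(2017, 3, 5), "https://dir-2.info/seo"),
    (date(2017, 3, 9), "https://dir-3.info/seo"),
    (date(2017, 3, 11), "https://dir-4.info/seo"),
    (date(2017, 3, 14), "https://dir-5.info/seo"),
    (date(2017, 3, 20), "https://dir-6.info/seo"),
]

# Count new links per month, then flag months well above the average.
per_month = Counter((d.year, d.month) for d, _ in new_links)
baseline = sum(per_month.values()) / len(per_month)
suspicious = [month for month, n in per_month.items() if n > 2 * baseline]
print(suspicious)
# → [(2017, 3)]
```

The links from the flagged months are then the ones to check manually first.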
3 / Knowing the distribution of incoming domains
Knowing the distribution of the domains linking to yours is the basic step, and with the famous "linking domain" column it should not be too complicated for an Excel user. Here, the method is simply to look at the top linking domains: with a little common sense, you can tell right away whether a site is poor quality or not.
In the same spirit, it is good to look at the distribution of TLDs, i.e. domain name extensions. If your .fr site receives .ru links, beware.
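Both distributions (linking domains and TLDs) amount to simple frequency counts on the "linking domain" column. A minimal sketch with made-up domains:

```python
from collections import Counter

# Hypothetical "linking domain" column, one entry per backlink.
linking_domains = [
    "blog-cuisine.fr", "annuaire-seo.info", "annuaire-seo.info",
    "spam-links.ru", "spam-links.ru", "spam-links.ru", "partenaire.fr",
]

# Distribution of linking domains: the most frequent ones deserve a manual look.
domain_counts = Counter(linking_domains)
print(domain_counts.most_common(3))

# Distribution of TLDs: a .fr site receiving many .ru links should raise a flag.
tld_counts = Counter(d.rsplit(".", 1)[-1] for d in linking_domains)
print(tld_counts)
```

This is exactly what the pivot table does in Excel; the script version just scales better to millions of rows.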
4 / Identify the distribution of IPs
This time, it is a less conventional method: extract the IP of each linking domain and build a pivot table to see the distribution of the IPs behind the links. To retrieve the IPs, you have a choice of tools: Netpeak Checker, SeoTools for Excel, etc.
The purpose of this analysis is to spot IPs with many linking sites. It is normal to have several sites on the same IP in some cases, when it comes to sites hosted on shared servers, and you will easily recognize IPs from hosts like OVH.
But if you see, for example, 10 different sites on the same IP, it is unlikely that they are 10 quality sites, unless your site is very well known. There is a good chance that these are bad links from sites generated automatically on the same CMS.
And there you go: more domains to add to your list of domains to disavow. No need to contact the webmasters in this kind of case; on the contrary, if it is negative SEO, you would be telling them you have identified them, and they would quickly remove the domains to create others.
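Grouping the linking domains by IP is a simple aggregation once the IPs are available. In this sketch the (domain, IP) pairs are hypothetical; in practice they could be resolved with `socket.gethostbyname(domain)` or exported from a tool like Netpeak Checker:

```python
from collections import defaultdict

# Hypothetical (linking domain, IP) pairs.
domain_ips = [
    ("blog-cuisine.fr", "91.121.10.1"),
    ("annuaire-1.info", "188.165.3.7"),
    ("annuaire-2.info", "188.165.3.7"),
    ("annuaire-3.info", "188.165.3.7"),
    ("partenaire.fr", "91.121.10.2"),
]

sites_per_ip = defaultdict(list)
for domain, ip in domain_ips:
    sites_per_ip[ip].append(domain)

# IPs hosting several of your linking domains deserve a manual check
# (threshold of 3 chosen arbitrarily here).
suspicious_ips = {ip: doms for ip, doms in sites_per_ip.items() if len(doms) >= 3}
print(suspicious_ips)
```

Domains sharing a flagged IP are good candidates for the disavow list, after the usual manual check.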
5 / Profile of links and link texts
Having a good link profile means having a distribution and variety of inbound links that look natural, with ratios close to the average for sites in your industry. For example, a site often receives more than 50% of its inbound links on its brand name; if your site receives more than half of its links on one or a few generic keywords, it will be considered suspicious.
One of the essential elements taken into account by Google Penguin, which now runs in real time, is the link text (or anchor text). You should therefore build a small pivot table in Excel and look at the distribution of the link texts received by the site, then look at the domains that send many links with the same anchors.
Here too, you have to manually check each domain in this situation. There are not necessarily many of them, but it must be done, because the solution will not always be to remove all the inbound links from the domain, only some of them. In this kind of case, you have to contact the webmaster so that they delete, for example, the footer links but not the links from certain articles.
There are other criteria to check in order to have a natural link profile, such as the ratio between follow and nofollow links, the share of links pointing to the homepage versus deep pages, and the share of thematic and geolocated sites. Beyond cleaning up backlinks, when developing a netlinking strategy it is essential to keep a balanced and natural link profile.
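The anchor-text check described above can be sketched as a frequency count with a threshold. The anchors, the brand names, and the 40% cutoff below are all hypothetical; the idea is simply that a dominant brand anchor is normal while a dominant generic keyword is not:

```python
from collections import Counter

# Hypothetical anchor texts exported from a backlink tool.
anchors = [
    "MySite", "MySite", "mysite.fr", "https://mysite.fr", "MySite",
    "cheap car insurance", "cheap car insurance", "cheap car insurance",
    "cheap car insurance", "cheap car insurance", "click here",
]
brand_anchors = {"mysite", "mysite.fr", "https://mysite.fr"}

counts = Counter(anchors)
total = sum(counts.values())

# Flag non-brand anchors that make up an abnormal share of all links.
over_optimized = [a for a, n in counts.items()
                  if n / total > 0.4 and a.lower() not in brand_anchors]
print(over_optimized)
# → ['cheap car insurance']
```

The flagged anchors then point you to the domains to inspect manually.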
6 / Measuring poor quality links
One of the main assets of tools like Majestic SEO is that they give indications of the trust that can be placed in each backlink. Majestic SEO's TrustFlow, for example, is based on a list of trusted sites and a list of suspicious sites. A site's score depends on its link proximity to these lists: high if it is close to trusted sites, low if it is close to suspicious sites and/or far from trusted sites.
CitationFlow, on the other hand, is a more quantitative criterion, representative of the number of links a site receives, i.e. its popularity.
Of course, it's not perfect: it is hard to be exhaustive across all websites worldwide. In addition, young sites often have a very low or zero TrustFlow, which is normal, and these should not be removed.
However, subject to checking them manually, filtering sites with a TrustFlow of 0 lets you quickly rough out the list and identify some bad inbound links before cleaning them up.
What I do is add a new column to the Excel file, which I call "TF/CF": the ratio between TrustFlow (quality) and CitationFlow (popularity). The point is to identify sites whose popularity comes from low-quality links. Filtering domains with a TF/CF ratio below 0.3 often reveals sites that resort to abusive linking strategies, and it is best to avoid receiving links from this type of site.
Another useful ratio is the number of backlinks divided by the number of pages indexed for the linking domain. This makes it easy to reveal sites that link to yours abnormally often. In Majestic SEO, this corresponds to the ratio between the "IndexedURLs" and "ExtBackLinks" criteria.
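Both ratios can be computed in one pass over the exported metrics. The per-domain figures below are invented for illustration, and the 0.3 cutoff is the one mentioned above:

```python
# Hypothetical per-domain metrics as exported from a tool like Majestic SEO:
# (domain, TrustFlow, CitationFlow, backlinks to our site, indexed pages).
domains = [
    ("blog-cuisine.fr",   35, 40,   3, 120),
    ("spam-links.ru",      2, 38, 400,  50),
    ("annuaire-seo.info",  5, 30,  10, 200),
]

flagged = []
for name, tf, cf, backlinks, indexed in domains:
    tf_cf = tf / cf if cf else 0.0                  # quality vs popularity
    links_per_page = backlinks / indexed if indexed else 0.0
    # TF/CF below 0.3, or far more backlinks than indexed pages: suspicious.
    if tf_cf < 0.3 or links_per_page > 1:
        flagged.append(name)
print(flagged)
# → ['spam-links.ru', 'annuaire-seo.info']
```

As always, the flagged domains are candidates for a manual check, not for automatic removal.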
7 / Remove bad links
Once you have identified all the artificial links and spammy domains, it's time to remove them. For this, there are mainly two methods:
- Contact the webmaster: as we have seen, in some cases it is better to contact them, and in others better to avoid it. To find the webmaster of a site, use its contact page or do a Whois lookup.
- Disavow the links (yes, it is still necessary, dixit Gary Illyes, to avoid manual penalties): a method that tells Google not to take into account the poor quality links you have listed. To do so, simply go to Google's disavow page and upload a simple text file with one URL per line. And to disavow all the links from the same domain, just add "domain:example.fr". More on Google Help.
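Building the disavow file from your audit can itself be scripted. This sketch uses the file format Google documents (one URL per line, `domain:` to disavow a whole domain, `#` for comments), with made-up URLs and domains:

```python
# URLs and domains flagged during the backlink audit (hypothetical).
bad_urls = ["https://spam-links.ru/page-1.html"]
bad_domains = ["annuaire-seo.info", "link-farm.biz"]

lines = ["# Disavow file generated after backlink audit"]
lines += bad_urls
lines += [f"domain:{d}" for d in bad_domains]

with open("disavow.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")

print(open("disavow.txt").read())
```

The resulting `disavow.txt` is then uploaded through Google's disavow links tool.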
Although this methodology can speed up the discovery of spammy links, a manual check is often necessary to remove doubts about suspicious links. And while removing bad backlinks will allow the site to recover positions in the search engine results pages, be careful not to go too far and lose good links in the process. You should also monitor new backlinks regularly, so as to deal with them in near real time.