Getting indexed by Google is essential for any website.
If you work in SEO, you already know that unless a page is indexed, nobody can find it through Google search; the only way to reach it is by typing the address in directly.
Getting indexed, therefore, is a necessity.
Yet plenty of websites do not get fully indexed.
If you have worked on a site for any length of time, you have probably noticed that not every page makes it into Google's index, and some pages take weeks to get there.
There are many possible reasons for this, ranging from ranking factors such as content quality and links to deeper technical issues.
Websites built on complex technology stacks have historically struggled with indexing, and the problem has not gone away.
There is even a persistent myth among SEO executives that technical issues are the only reason Google fails to crawl and index their websites.
That is far from the truth.
You certainly do need to send Google consistent signals so it understands that you want your pages indexed.
But high-quality content is just as necessary; without it, getting indexed does you little good.
Most websites, regardless of size, have plenty of content that is not indexed by Google even though it should be.
JavaScript is often blamed for indexing trouble, but that is not the whole story: even a site built in plain HTML can be left out of the index.
So let's clear up the confusion. In this post we will walk through the primary issues behind Google's indexing and show you how to fix them quickly.
Why Is Google Not Indexing Your Website or Web Pages?
First of all, to find out which of your pages are not being indexed, you can simply use a good index checker tool.
In fact, if you run such a tool over the most popular eCommerce websites, you will find that roughly 15% of their pages are missing from Google even though they are supposed to be indexable.
So the basic question that remains is simply: why?
Isn't that the question running through your head right now?
Why does Google skip certain pages when, by all rights, they should be indexed?
In fact, if you open Google Search Console you will find reports on pages that have not been indexed, labelled "Crawled – currently not indexed", "Discovered – currently not indexed", "Duplicate", and so on.
A label on its own does not tell you much, but it is a useful stepping stone to start looking.
Primary Indexing Issues
The most common indexing issues reported in Google Search Console are discussed below.
1. “Crawled – currently not indexed”
This report means that Google has visited your web page but has not indexed it yet.
This is usually related to the quality of the content on those pages.
Google has become noticeably pickier about content quality.
So if your pages are stuck in "Crawled – currently not indexed", your job is to make sure the quality of your content is high enough.
Here are some measures you can take:
- Write high-quality titles, meta descriptions, and body content for the affected pages.
- Make sure the content you publish there is original, not copied.
- Make use of canonical tags where appropriate.
- Use robots.txt rules or noindex tags on areas of low-quality content that you do not want Google to index (a quick way to spot such pages is sketched below).
If you follow these steps consistently, the problem should resolve itself fairly quickly.
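If you are not sure which pages count as low quality, a short script can give you a rough first pass by flagging pages with very little text or repeated titles. This is only a minimal sketch, not an official tool: the URL list and the 300-word threshold are assumptions you should adapt to your own site.

import re
import requests

# Candidate pages to check - replace with your own URLs (e.g. exported from Search Console).
urls = [
    "https://example.com/product-a/",
    "https://example.com/product-b/",
]

seen_titles = {}
for url in urls:
    html = requests.get(url, timeout=10).text
    match = re.search(r"<title>(.*?)</title>", html, re.I | re.S)
    title = match.group(1).strip() if match else ""
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag stripping, good enough for a rough check
    word_count = len(text.split())
    if word_count < 300:  # arbitrary threshold for "thin" content
        print(f"Possibly thin content ({word_count} words): {url}")
    if title and title in seen_titles:
        print(f"Duplicate title '{title}': {url} and {seen_titles[title]}")
    seen_titles.setdefault(title, url)

Pages that this flags are good candidates for rewriting, consolidating, or excluding with noindex.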
2. “Discovered – currently not indexed”
The reason behind this status can be anything from poor content quality to crawling problems.
It occurs mainly on vast e-commerce stores.
Typical reasons for pages being marked "Discovered – currently not indexed" are:
- A crawl budget issue: too many URLs are queued up for crawling, so Google has scheduled them to be crawled and indexed later on.
- A content quality issue: Google does not consider certain pages worth crawling, so it never visits them.
Dealing with this kind of problem takes a fair amount of expertise, but don't worry, we are here to help.
To resolve the issue, you need to:
- Find the pattern among the affected pages; the problem often hits a particular segment of products.
- Check and improve your crawl budget. Identify low-quality pages on which Google spends a lot of crawling time, such as filtered category pages and internal search pages; a vast e-commerce website can have a huge number of these. If Google is free to crawl all of them, it may run out of resources before its bots ever reach the pages that actually need to be indexed (a robots.txt sketch for fencing such URLs off follows this list).
Keep a keen eye on these issues and resolve them as soon as possible.
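For example, if your filtered category pages and internal search results live under predictable URL patterns, a few robots.txt rules can stop Googlebot from burning crawl budget on them. The paths and parameter names below are purely illustrative; use the patterns your own store generates:

User-agent: *
Disallow: /search/
Disallow: /*?filter=
Disallow: /*?sort=

Be careful to disallow only the low-value variants: anything blocked here will no longer be crawled at all.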
3. “Duplicate”
There can be a lot of reasons for duplicate content being present on your website. A few of them are:
- Language variations. If you have different versions of your page in British English, American English, Canadian English or anything similar, Google might consider them duplicates of the original page, and some of them might end up unindexed.
- Content duplicated by competitors. This happens quite frequently with e-commerce websites, where many different stores use the same product description supplied by the manufacturer.
The straightforward fixes are to use rel="canonical" tags, set up 301 redirects, or simply write unique content for the products that you offer.
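As a reminder of what the canonical fix looks like, each duplicate variant carries a link element in its <head> pointing at the version you want indexed (the URL below is a placeholder):

<link rel="canonical" href="https://example.com/preferred-product-page/">

A 301 redirect achieves something similar at the server level by sending both users and Googlebot straight to the preferred URL.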
But for something more out of the box, have a look at how fast-growing-trees.com does it.
They do it quite well by giving users what they are looking for most: detailed FAQs for their products, which are genuinely useful to potential buyers and resolve their questions.
They also provide an option of comparing several products on their website.
Not just that, they also allow customers to post their queries regarding plants and get them answered directly by the community.
Checking Your Website’s Index Coverage
Want to see which pages of your website have not been indexed by Google? You can get that information quite easily through the Index Coverage report in Google Search Console.
The first thing to keep an eye on is how many excluded pages are listed there.
Once you have that list, look for a pattern among the pages that were not indexed.
If you are in charge of an e-commerce website, you will almost certainly find a lot of unindexed pages belonging to it. And, as mentioned earlier, this mostly happens on large e-commerce sites.
The reason is that large e-commerce stores have a high number of pages with duplicate content, as well as pages for products that are out of stock.
The main issue with such pages is that their poor-quality content hurts their chances of being near the top of the list of product pages to be indexed.
Moreover, huge e-commerce stores also struggle with crawl budget. Many stores offer an enormous number of products, yet most of those product pages are marked as "Discovered – currently not indexed".
If you find any of your important pages marked the same way, you should definitely take action.
Increasing the Probability of Getting Indexed by Google
If you have specific pages that have not been indexed, you can try the following to resolve the issue:
- Visit Google Search Console
- Go to the URL inspection tool
- Paste the URL of the page you want Google to index
- Wait for Google to check the page
- Click on “Request indexing”
This is a technique you should definitely use after publishing a new page or post. It effectively tells Google that you have added new content to your website and would like it checked out.
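If you have a lot of URLs to keep an eye on, the Search Console URL Inspection API lets you check coverage programmatically (it can only inspect; the actual "Request indexing" click still happens in the UI). Here is a minimal sketch, assuming the google-api-python-client package and a service account that has been added as a user on the property; the key file and URLs are placeholders:

from google.oauth2 import service_account
from googleapiclient.discovery import build

# Service-account key that has been granted access to the Search Console property (assumption).
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

response = service.urlInspection().index().inspect(
    body={
        "inspectionUrl": "https://yourdomain.com/new-post/",
        "siteUrl": "https://yourdomain.com/",  # must match the verified property exactly
    }
).execute()

# Prints something like "Submitted and indexed" or "Crawled - currently not indexed"
print(response["inspectionResult"]["indexStatusResult"]["coverageState"])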
Requesting indexing alone, however, will not fix the underlying problems we have been discussing. To solve those, work through the fixes below:
1. Remove crawl blocks in your robots.txt file
If you find that Google is not indexing your website at all, there may be a crawl block in your robots.txt file.
You can check by visiting "yourdomain.com/robots.txt".
Look for either of these two snippets in that file:
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /
Both of these rules tell Googlebot that it is not allowed to crawl any pages on your site. To resolve the issue, simply remove them from the file.
If it is just one specific page that Google is not indexing, it may also be blocked by robots.txt.
You can check by pasting the URL into the URL inspection tool in Google Search Console.
Click on the coverage block to view more details, and look for the "Crawl allowed? No: blocked by robots.txt" error.
That error means the page in question is blocked from crawling.
In that case, go through the robots.txt file again and look for any "Disallow" rules that affect the page.
If you find any, just remove them.
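You can also test a specific URL against your robots.txt with a few lines of Python; the standard library's robot parser applies the same Disallow rules (note that it follows the original robots.txt spec and does not understand Google's wildcard extensions). The domain and path below are placeholders:

from urllib.robotparser import RobotFileParser

# Load and parse the live robots.txt file.
rp = RobotFileParser("https://yourdomain.com/robots.txt")
rp.read()

url = "https://yourdomain.com/some-page/"
if rp.can_fetch("Googlebot", url):
    print("Crawl allowed:", url)
else:
    print("Blocked by robots.txt:", url)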
2. Remove rogue noindex tags
If you tell Google that you do not want a page indexed, it won't index it. This is how you keep certain pages of your website out of search results.
You can do that in either of the two ways mentioned below.
Method 1: meta tag
Pages with either of these meta tags in their <head> section are off-limits to Google, which is why they will not be indexed:
<meta name="robots" content="noindex">
<meta name="googlebot" content="noindex">
These are called meta robots tags, and they tell search engines whether or not they may index a page.
The critical part is the "noindex" value: if it appears in the code of a given page, that page will not be indexed by Google.
Of course, you cannot spend all your time hunting for noindex tags in the code of every page on your website.
Instead, run a crawl of your website with Ahrefs' Site Audit tool, open the Indexability report once the crawl is done, and check for any "Noindex page" notices.
If you find any, just remove them.
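If you would rather do a quick spot-check yourself, a short script can scan a handful of URLs for a noindex robots meta tag. This is a minimal sketch; the URL list is an assumption and the regex is deliberately simple (it expects the name attribute before content), so treat it as a rough filter rather than a full audit:

import re
import requests

urls = [
    "https://yourdomain.com/",
    "https://yourdomain.com/blog/",
]

# Matches <meta name="robots"> or <meta name="googlebot"> tags whose content includes "noindex".
noindex_pattern = re.compile(
    r'<meta[^>]+name=["\'](?:robots|googlebot)["\'][^>]*content=["\'][^"\']*noindex',
    re.I,
)

for url in urls:
    html = requests.get(url, timeout=10).text
    if noindex_pattern.search(html):
        print("Rogue noindex found on:", url)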
Method 2: X-Robots-Tag
The X-Robots-Tag HTTP response header can also prohibit crawlers from indexing a page.
It can be set from any server-side scripting language or by altering your server configuration.
As mentioned earlier, the URL inspection tool in Search Console will show whether a page is blocking Google from indexing it.
Ahrefs' Site Audit tool can surface this too; just filter for "Robots information in HTTP header".
Finally, ask your developers to remove the X-Robots-Tag noindex header from any pages that need to be indexed.
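A quick way to confirm the header is gone is to look at the HTTP response yourself. A small sketch (the URL is a placeholder):

import requests

url = "https://yourdomain.com/important-page/"
# Fetch only the response headers for the page.
headers = requests.head(url, allow_redirects=True, timeout=10).headers

tag = headers.get("X-Robots-Tag", "")
if "noindex" in tag.lower():
    print("Blocked by X-Robots-Tag:", tag)
else:
    print("No blocking X-Robots-Tag found")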
3. Include the page in your sitemap
Your sitemap tells Google which pages on your website are important and which are not.
It also gives Google a rough idea of how frequently those pages should be recrawled.
Google can find your pages even if they are not in the sitemap, but listing them there is still considered good practice.
After all, you want Google to understand your website easily so that your important pages get indexed as soon as possible.
You can easily find out whether an important page is present in your sitemap using the same URL inspection tool mentioned before.
If you see "Sitemap: N/A", the page is not listed in any sitemap you have submitted (and may well not be indexed).
You can also open yourdomain.com/sitemap.xml in your browser and search for the page directly.
Add any missing pages back to your sitemap. Once that is done, let Google know via:
http://www.google.com/ping?sitemap=http://yourwebsite.com/sitemap_url.xml
Just swap the last part of that URL for the location of your own sitemap.
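If you want to script this check, the sketch below fetches your sitemap, confirms the page is listed, and then pings Google with the same endpoint mentioned above. It assumes a plain sitemap rather than a sitemap index, and the URLs are placeholders:

import xml.etree.ElementTree as ET
import requests

sitemap_url = "https://yourwebsite.com/sitemap.xml"
page_url = "https://yourwebsite.com/important-page/"

# Parse the sitemap and collect every <loc> entry.
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
listed = [loc.text for loc in root.findall(".//sm:loc", ns)]

if page_url in listed:
    print("Page is in the sitemap")
else:
    print("Page is missing from the sitemap - add it and resubmit")

# Tell Google the sitemap has been updated (same ping endpoint as above).
requests.get("https://www.google.com/ping", params={"sitemap": sitemap_url}, timeout=10)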
You can also try these:
Sidestep the “Soft 404” signals
Make sure your pages do not send out "soft 404" signals, for example by displaying "Not found" text or having "404" appear in the URL.
Utilise internal linking
An internal link from another crawlable page tells Google that the linked page matters, which makes it more likely to be indexed. So make sure your important pages receive internal links and are included in the sitemap.
Use a robust crawling strategy
Make sure Google crawls the important pages of your website first instead of wasting time and crawl budget on less important ones. Perform server log analysis to see where Googlebot is actually spending its time (a rough sketch follows).
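Here is a rough sketch of what that log analysis can look like: count which sections of the site Googlebot requests most often so you can see where the crawl budget actually goes. The log path and the combined-log format are assumptions; adapt the parsing to your own server:

import re
from collections import Counter

googlebot_hits = Counter()
request_pattern = re.compile(r'"(?:GET|POST) (\S+) HTTP')

with open("/var/log/nginx/access.log") as log:
    for line in log:
        if "Googlebot" not in line:  # keep only Googlebot requests
            continue
        match = request_pattern.search(line)
        if match:
            # Group by first path segment, e.g. /search, /category, /product
            path = match.group(1)
            segment = "/" + path.strip("/").split("/")[0]
            googlebot_hits[segment] += 1

for segment, hits in googlebot_hits.most_common(10):
    print(f"{hits:6d}  {segment}")

If most hits land on filtered or search URLs rather than your key product and category pages, that is a strong sign crawl budget is being wasted.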
Remove poor-quality or duplicate content
Check for duplicate or poor-quality content and remove or improve it as soon as possible, because such content can keep Google from indexing other pages of your website.
Send consistent SEO signals
Send Google one clear, consistent signal about every page. A common example of an inconsistent signal is changing canonical tags with JavaScript: if the canonical in the raw HTML differs from the one in the rendered page, Google cannot be sure which version you want indexed. Keep canonicals, noindex directives, and your sitemap in agreement so that the search engine knows exactly what to crawl and index.
Bottom Line
Over the last decade Google has become much better at processing JavaScript, which has made the job of SEO executives considerably easier; most JavaScript-built websites now get indexed fairly quickly.
However, the internet keeps growing by leaps and bounds with every passing day, and Google is doing its best to keep up.
So it is not unusual for a website, or a few of its pages, to miss out on being indexed automatically.
If that happens, you now know how to deal with it.
So, waste no more time, and get started with what needs to be done.
What are you waiting for?