
Everything about Meta Robots and robots.txt

Robots play an important role in SEO. There are two ways to control how search engines handle pages and folders: the Robots META tag and the robots.txt file.

A web page creator can specify which pages search engines should and should not index by placing a Robots META tag in the head section of the HTML.

Here are some common Robots META tags:

<meta content="NOINDEX" name="ROBOTS"> - Ignore content and follow links
<meta content="NOFOLLOW, INDEX" name="ROBOTS"> - Include content and do not follow links
<meta content="NOINDEX,NOFOLLOW" name="ROBOTS"> - Ignore content and do not follow links
<meta content="INDEX,FOLLOW" name="ROBOTS"> - Include content and follow links
<meta content="NOARCHIVE" name="ROBOTS"> - The cached link should not be shown in search results pages
<meta content="NOODP" name="ROBOTS"> - The Open Directory Project (ODP) title and description for the page should not be displayed in search results
<meta content="NOYDIR" name="ROBOTS"> - The Yahoo Directory title and description for the page should not be displayed in search results
<meta content="NOSNIPPET" name="ROBOTS"> - Only the title is displayed in search results, with no description or text snippet for the page
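
To see how a crawler might read these tags, here is a minimal Python sketch that pulls the directives out of a page's Robots META tag. The class name `RobotsMetaParser` and the function `robots_directives` are made up for this illustration, and real search engines apply more rules than this (for example, engine-specific META names).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated and case-insensitive.
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of lowercase robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta content="NOFOLLOW, INDEX " name="ROBOTS"></head></html>'
directives = robots_directives(page)
may_index = "noindex" not in directives
may_follow = "nofollow" not in directives
```

For the sample page, `may_index` is true and `may_follow` is false, which matches the "Include content and do not follow links" entry in the list.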

In addition, a robots.txt file can be used to control user-agent access at the folder level. The file is placed in the root of each host, and its format is plain text, not HTML.

Through this file, a website owner or webmaster can allow access to web page content and disallow access to admin, cgi, and any other files that search engines should not crawl. Note that robots.txt is publicly readable and only advises crawlers, so it does not secure files by itself.

A typical robots.txt file looks like this:

User-agent: *
Allow: /
Disallow: /admin*
Allow: *content*
Disallow: /test/
Disallow: /paypal/
Disallow: /credit/
Disallow: /cgi/


This tells all robots that they may crawl the site except for paths beginning with /admin, that they may crawl any path containing "content", and that they should not crawl the test, paypal, credit, and cgi folders.

I hope this post helps you learn more about robots and their role in SEO. For further analysis, look at Google's robots.txt file at http://www.google.com/robots.txt, and post a comment or tweet @jagadeeshmp if you need a better understanding :)
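
The way these rules combine can be sketched in Python. This is only an illustration: it assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), and individual crawlers may behave differently. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical helpers, not part of any library.

```python
import re

# Rules copied from the sample robots.txt above, as (directive, pattern) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/admin*"),
    ("allow", "*content*"),
    ("disallow", "/test/"),
    ("disallow", "/paypal/"),
    ("disallow", "/credit/"),
    ("disallow", "/cgi/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be crawled under the given rules."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        # re.match anchors at the start of the path, like robots.txt rules.
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len = len(pattern)
                allowed = is_allow
    return allowed


for path in ["/index.html", "/admin/login", "/blog/content/post.html", "/test/page.html"]:
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions, /index.html and /blog/content/post.html are allowed, while /admin/login and /test/page.html are blocked. A path such as /test/content/ would be allowed, because the *content* pattern is longer than /test/.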
