
Adding Custom Robots.txt File in Blogger

Zahri Kahoor | Tuesday, April 12, 2016 |


Description ::

Robots.txt is a small text file that tells search engine crawlers which parts of your blog to index and which parts to skip. When a search engine visits your blog, it reads the robots.txt file first and only then moves on to the other areas of the blog. Think of it as a traffic warden: it can allow or stop search engine crawlers from indexing certain areas of the blog. The robots.txt file applies to all kinds of crawlers or spiders, such as Googlebot, which is Google's search spider. In simple terms, search engines always want to index fresh content on the web, so they send their spiders out to crawl new pages. When a spider finds new pages, it will normally index them, but the robots.txt file steps in on your behalf: well-behaved crawlers only index the pages your robots.txt file allows. Keep in mind that a crawler looks at your robots.txt file first and obeys the rules you have set there. If you have disallowed a page in your robots.txt file, the spiders will follow the rule and will not index that page in their search engine.
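The permission check described above can be sketched with Python's standard-library robots.txt parser. The rules below mirror the default Blogger file discussed in this post, and the URLs are illustrative examples (real crawlers may differ slightly in matching details):

```python
import urllib.robotparser

# The default Blogger rules discussed in this post.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# An ordinary post page is crawlable for general robots...
print(parser.can_fetch("*", "http://www.technotricks.net/2016/04/some-post.html"))  # True
# ...but /search (and label) pages are blocked for them:
print(parser.can_fetch("*", "http://www.technotricks.net/search/label/Softwares"))  # False
```

A well-behaved spider performs exactly this check before fetching a page; a disallowed URL is simply skipped and left out of the index.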

Default Custom Robots.txt of a Blogger Blog :: 

Every time you create a blog on Blogger, a default robots.txt file is generated, and it stays the same until you change it from the dashboard settings:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
Sitemap: http://www.technotricks.net/sitemap.xml


It is the same for every blog and it is AdSense friendly. If you are using a custom domain with Blogger, the default sitemap will be http://www.technotricks.net/sitemap.xml, with your own domain in place of technotricks.net. This Blogger sitemap format is newer than the feed-based sitemaps used previously.

If this default robots.txt satisfies your blog's SEO needs, you do not need to replace it with a custom one in the dashboard settings.

Adding Custom Robots.Txt to Blogger ::

Now for the main part of this tutorial: how to add a custom robots.txt in Blogger. Follow the steps below.

  • Go to your Blogger blog's dashboard.
  • Navigate to Settings --> Search Preferences --> Crawlers and indexing --> Custom robots.txt --> Edit
  • Under Custom robots.txt, select Yes to enable it.






  • Now paste your robots.txt code into the box:

    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /search
    Allow: /
    Sitemap: http://technotricks.net/feeds/posts/default?orderby=UPDATED

    Note :- In the last line, where it says Sitemap:, change technotricks.net to your own blog URL.




    • Click the Save Changes button.




  • Now go to http://www.technotricks.net/robots.txt to confirm the change.

  • Note :- Change technotricks.net to your blog URL.

    • That's it! You're done!

     Explaining the Code ::

    1. User-agent: Mediapartners-Google


    This line is for the Google AdSense robots and helps them serve better ads on your blog. Whether or not you use Google AdSense on your blog, simply leave it as it is.

    2. User-agent: *


    This section applies to all robots, marked with the asterisk (*). In the default settings, our blog's label page links are blocked from indexing by the Disallow rule below, which means web crawlers will not index them. For those who are not programmers: this section tells the spiders what to crawl and what to skip.

    3. Disallow: /search

    Disallow, as the name suggests, stops search engine crawlers from crawling certain areas of your blog; /search covers the search and label pages that you don't want indexed.

    That means any link with the keyword search right after the domain name will be ignored. See the example below, which is a link to a label page named Softwares.


    http://www.technotricks.net/search/label/Softwares
    If we remove Disallow: /search from the code above, crawlers will be free to access the entire blog and index all of its content and pages.


    Disallow Particular Post

    Now suppose we want to exclude a particular post from indexing; we can add the line below to the code.

    Disallow: /yyyy/mm/post-url.html

    Here yyyy and mm refer to the publishing year and month of the post respectively. For example, for a post published in March 2013 we would use the format below.

    Disallow: /2013/03/post-url.html
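Such a rule can be sanity-checked with Python's standard-library parser; post-url.html and the blog URL below are hypothetical examples:

```python
import urllib.robotparser

# "User-agent: *" rules with one particular post excluded.
rules = """\
User-agent: *
Disallow: /search
Disallow: /2013/03/post-url.html
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The excluded post is blocked, while other posts stay crawlable:
print(rp.can_fetch("*", "http://example.blogspot.com/2013/03/post-url.html"))      # False
print(rp.can_fetch("*", "http://example.blogspot.com/2013/03/another-post.html"))  # True
```

Note that Python's parser applies rules in order, so the specific Disallow line is listed before Allow: /; real crawlers such as Googlebot instead use longest-match precedence, which gives the same result here.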


    Disallow Particular Page

    If we need to disallow a particular page, we can use the same method as above: copy the page URL and remove the blog address from it, leaving something like this:


    Disallow: /p/page-url.html

     4. Allow: / 

    As the name suggests, Allow lets search engine crawlers into certain areas of our blog.

    Here Allow: / refers to the homepage, which means web crawlers can crawl and index our blog's homepage.

    5. Sitemap: http://technotricks.net/feeds/posts/default?orderby=UPDATED

    This line points to the sitemap of our blog. By adding the sitemap link here we optimize the blog's crawl rate: whenever web crawlers scan our robots.txt file, they find a path to our sitemap, where all the links to our published posts are present. That makes it easy for the crawlers to reach every post, so there is a better chance they crawl all of our blog posts without missing a single one.

    Note: This sitemap only tells web crawlers about the 25 most recent posts. If you want to increase the number of links in your sitemap, replace the default sitemap with the one below; it covers the 500 most recent posts.


    Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500

    If you have more than 500 published posts on your blog, you can use two sitemaps, like below:

    Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500

    Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500
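These Sitemap: lines follow a simple pattern: non-overlapping pages of 500 posts each, starting at index 1, then 501, then 1001, and so on. As a sketch, a small Python helper (the sitemap_urls function and blog URL are just for illustration) can generate the full list for any post count:

```python
def sitemap_urls(blog_url, total_posts, page_size=500):
    """Build paginated Blogger feed sitemap URLs, 500 posts per page."""
    urls = []
    # Page starts: 1, 501, 1001, ... up to the total post count.
    for start in range(1, total_posts + 1, page_size):
        urls.append(
            f"{blog_url}/atom.xml?redirect=false"
            f"&start-index={start}&max-results={page_size}"
        )
    return urls

# A blog with 1200 posts needs three sitemap lines:
for url in sitemap_urls("http://example.blogspot.com", 1200):
    print("Sitemap:", url)
```

Each generated line can be pasted into the custom robots.txt box exactly as printed.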

    Don’t put any code into your custom robots.txt settings without knowing what it does. Simply ask me in the comments and I’ll resolve your queries in detail. Thanks for reading this tutorial. If you liked it, please support me by sharing this post on your social media profiles. Happy Blogging!

    Please leave your comments below; that helps us a lot ::

