What is robots.txt?
Robots.txt is a text (not HTML) file you place on your website to tell search robots which pages you would like them not to visit. Search engines are by no means required to follow robots.txt, but they generally obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your website (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note "Please, do not enter" on an unlocked door: you can't prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive information, it is too naive to rely on robots.txt to protect it from being indexed and displayed in search results.

The location of robots.txt is very important. It must be in the main (root) directory of the site, because otherwise user agents (search engines) will not be able to find it; they do not search the whole website for a file named robots.txt. Instead, they look first in the main directory, and if they don't find it there, they simply assume that this site does not have a robots.txt file and therefore they index everything they find along the way. So, if you don't put robots.txt in the right place, do not be surprised that search engines index your whole website.
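To make the point about location concrete, here is a minimal Python sketch of how a crawler might work out where to look for robots.txt, given any page on a site. The function name robots_txt_url and the example.com address are made up for illustration; real crawlers may handle more edge cases.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the one place a crawler would look for this site's robots.txt."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page deep inside the site still maps to the file in the main directory.
print(robots_txt_url("https://www.example.com/blog/2010/post.html?x=1"))
# prints: https://www.example.com/robots.txt
```

Whatever page the crawler starts from, the path is thrown away and only the main directory is checked, which is why a robots.txt file placed in a subfolder is never seen.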
Why is it implemented?
It is great when search engines frequently visit your website and index your content, but there are often cases when indexing parts of your online content is not what you want. If you happen to have sensitive data on your site that you do not want the world to see, you will also prefer that search engines do not index these pages (although in this case the only sure way to keep sensitive data from being indexed is to keep it offline on a separate machine). Also, if you want to save some bandwidth by excluding images, style sheets and JavaScript from indexing, you need a way to tell spiders to keep away from these items. One way to tell search engines which files and folders on your website to avoid is with the use of the Robots Meta tag. But since not all search engines read Meta tags, the Robots Meta tag can simply go unnoticed. A better way to inform search engines about your wishes is to use a robots.txt file.

Structure of robots.txt:
The structure of a robots.txt file is pretty simple (and barely flexible): it is an endless list of user agents and disallowed files and directories. Basically, the syntax is as follows:

User-agent:
Disallow:
"User-agent:" Right here user agents are search engines' crawlers and disallow: lists the files and directories to be excluded from indexing. In addition to "user-agent:" and "disallow:" entries, you can incorporate comment lines - just place the # sign at the beginning of the line: # All user agents are disallowed to see the /temp directory. User-agent: *Disallow: /temp/