Searches like "how to make a robot," "robots for kids," or "robot kits" are about physical robots; that is not what this is. A robots.txt file is a file that tells search engines which pages of a site they may crawl and which pages they may not. The robots.txt file sits in the root folder of a website.
How to create a Robots.txt file
The robots.txt file controls which pages search engine bots, crawlers, and spiders will visit and which pages they will not. This mechanism is called the Robots Exclusion Protocol, or the Robots Exclusion Standard. Before creating the file, let us look at the symbols used in it.
Robots.txt Protocol Standard Syntax & Semantics
Part / sign and its description:
User-agent: refers to the robot that a group of rules applies to.
User-agent: * (the wildcard) means the rules apply to all robots.
Disallow: each rule line starts with Disallow, and a path beginning with / can follow it. Robots will then not crawl that path, file, or those pages. If no path is given, the Disallow blocks nothing.
# is used to make a comment. A note is written after it so that anyone reading the file can understand what the code below it does.
The Disallow field can contain a partial or a full URL. A robot will not visit any path that begins with the value given after Disallow:. For example,
Disallow: /help
disallows both /help.html and /help/index.html, while
Disallow: /help/
disallows /help/index.html but allows /help.html.
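Comments let you document what each rule does. A small sketch (the /admin/ path here is just an illustration):

# Keep all crawlers out of the admin area
User-agent: *
Disallow: /admin/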
Some Examples of How to Create a robots.txt File
Allow all robots to visit all files (the wildcard * stands for all robots):
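The file looks like this:

User-agent: *
Disallow:

An empty Disallow value blocks nothing, so everything may be crawled.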
Block all robots from visiting any files:
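Disallowing the root path / blocks the whole site:

User-agent: *
Disallow: /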
Allow only Googlebot to visit; the rest of the robots will be blocked:
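One group of rules is written per bot; the Googlebot group allows everything, while the catch-all * group blocks everything else:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /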
Only Googlebot and Yahoo! Slurp will be allowed to visit; the rest will be blocked:
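The same pattern extends to several bots; Slurp is the name of Yahoo!'s crawler:

User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: *
Disallow: /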
If you want to block one specific bot while allowing all the others:
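For example, to block only Bingbot (substitute the name of whichever bot you want to stop):

User-agent: Bingbot
Disallow: /

User-agent: *
Disallow: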
Note that even if you block crawling of certain URLs or pages of your site with this file, those pages can still show up elsewhere; for example, the URLs may appear in referrer logs. Besides, some search engines have algorithms that are not very advanced, and when they send their spiders/bots to crawl, they may ignore the instructions in the robots.txt file and crawl all of your URLs anyway.