Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

User-agent records in robots.txt: does their order matter, and do they overwrite or complement each other?

User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /

Now, what is disallowed for Googlebot: only /privatedir/, or the whole website (/)?

1 Answer

According to the original robots.txt specification:

  1. A bot must follow the first record that matches its user-agent name.

  2. If such a record doesn’t exist, it must follow the record with User-agent: * (this line may not appear in more than one record).

  3. If such a record doesn’t exist, it doesn’t have to follow any record.

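So a bot never follows more than one record: records neither overwrite nor complement each other, and their order only matters when several records match the same bot.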
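The three rules above can be sketched in Python roughly as follows. This is only an illustration, not a reference parser: the helper names (parse_records, select_record, is_allowed) are made up for this example, only User-agent and Disallow lines are handled, and the case-insensitive substring match on the bot name is an assumption based on what the original specification recommends.

```python
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /
"""


def parse_records(text):
    """Split robots.txt text into records of (agent names, disallowed prefixes)."""
    records = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            # A blank line ends the current record.
            if agents:
                records.append((agents, rules))
            agents, rules = [], []
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow":
            rules.append(value)
    if agents:
        records.append((agents, rules))
    return records


def select_record(records, bot_name):
    """Return the single record the bot follows, or None if no record applies."""
    name = bot_name.lower()
    # Rule 1: the first record that names the bot (assumed substring match).
    for agents, rules in records:
        if any(a and a != "*" and a.lower() in name for a in agents):
            return agents, rules
    # Rule 2: otherwise the record with "User-agent: *".
    for agents, rules in records:
        if "*" in agents:
            return agents, rules
    # Rule 3: otherwise no record has to be followed.
    return None


def is_allowed(records, bot_name, path):
    """Check a path against the one record selected for this bot."""
    record = select_record(records, bot_name)
    if record is None:
        return True
    _, rules = record
    # An empty Disallow value disallows nothing.
    return not any(rule and path.startswith(rule) for rule in rules)


records = parse_records(ROBOTS_TXT)
```

Because select_record returns as soon as it finds a match, the rules of the "*" record are never merged into the record chosen for a named bot.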


For your example this means:

  • A bot that matches the name "Googlebot" follows only the first record: it is not allowed to crawl URLs with a path that starts with /privatedir/, but it may crawl everything else.
  • A bot that doesn’t match the name "Googlebot" is not allowed to crawl any URL.
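One way to check this outcome is Python's standard urllib.robotparser module, which, as far as I can tell, applies the same "first matching record, else the * record" selection. The bot name "OtherBot" and the paths below are made up for the test.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question.
robots_txt = """\
User-agent: Googlebot
Disallow: /privatedir/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Googlebot is bound only by its own record.
googlebot_private = parser.can_fetch("Googlebot", "/privatedir/page.html")
googlebot_public = parser.can_fetch("Googlebot", "/public/page.html")

# Any other bot falls back to the "*" record, which disallows everything.
otherbot_public = parser.can_fetch("OtherBot", "/public/page.html")

print(googlebot_private, googlebot_public, otherbot_public)
```

Real crawlers may implement matching differently from this module, so treat the result as a sanity check rather than a guarantee of how a specific bot behaves.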

...