
Old Rust docs still appear in search results #14111

Closed

@ericsampson

In #9955, @brson added a robots.txt file (thanks!) to prevent search engines from crawling non-current docs. However, I just noticed that this doesn't really address the issue: I used Google to search for 'Rust pointers', and the third result was static.rust-lang.org/doc/0.6/tutorial-borrowed-ptr.html, although the robots.txt did stop Google from providing a description of the result ;)

Apparently robots.txt will stop Google etc. from crawling a file, but not from indexing it if anyone on the internet has linked to it:
https://support.google.com/webmasters/answer/156449?hl=en
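To illustrate the distinction, a robots.txt rule like the sketch below (illustrative only; the actual file from #9955 may differ) stops crawlers from fetching the pages, but already-linked URLs can still show up in results as bare links:

```
# Illustrative robots.txt — not necessarily the exact rules from #9955.
# Blocks crawling of old doc versions, but does NOT prevent indexing
# of URLs that other sites link to.
User-agent: *
Disallow: /doc/0.6/
```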
I'm not sure what web server rust-lang runs on, but in Apache, for example, you can use .htaccess to write an X-Robots-Tag header that sets noindex/nofollow on entire directories, instead of having to add a meta tag in the head of each page:
http://perishablepress.com/taking-advantage-of-the-x-robots-tag/
One note: to make this approach work, I believe you have to not block crawling via robots.txt, or else the crawlers will never see the X-Robots-Tag header :) A sketch is below.
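For illustration, a minimal sketch of that approach, assuming Apache with mod_headers enabled and that the outdated docs live under a directory like /doc/0.6/ (the path is hypothetical):

```apache
# Hypothetical .htaccess placed in the directory serving outdated docs.
# Requires mod_headers; sends the header on every response from this
# directory, telling search engines not to index pages or follow links.
<IfModule mod_headers.c>
    Header set X-Robots-Tag "noindex, nofollow"
</IfModule>
```

Combined with removing the Disallow rule for those paths, crawlers can fetch the pages, see the header, and drop them from the index over time.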

Thanks!
