{
  "DOI": "10.5281/zenodo.1453453",
  "abstract": "Free-range what!?\n\n\nThe robots exclusion standard, a.k.a. robots.txt, is used to tell bots which resources of a website they may scan and crawl.\nInvalid or overzealous robots.txt files can lead to a loss of important data, breaking archives, search engines, and any app that links or remixes scholarly data.\n\n\nWhy should I care?\n\n\nYou care about open access, don\u2019t you? This is about open access for bots, which fosters open access for humans.\n\n\nMind your manners\n\n\nThe standard is purely advisory: it relies on the politeness of the bots. Disallowing access to a page doesn\u2019t protect it: if it is referenced or linked to, it can be found.\nWe don\u2019t advocate the deletion of robots.txt files. They are a lightweight mechanism to convey crucial information, e.g. the location of sitemaps. We want better robots.txt files.\n\n\nBots must be allowed to roam the scholarly web freely\n\n\nMetadata harvesting protocols are great, but there is a lot of data, e.g. pricing and recommendations, that they do not capture, and, at the scale of the web, few content providers actually use these protocols.\nThe web is unstable: content drifts and servers crash; this is inevitable. Lots of copies keep stuff safe, and crawlers are essential to maintain and analyze the permanent record of science.\nWe want to start an informal open collective to lobby publishers, aggregators, and other stakeholders to standardize and minimize their robots.txt files and related directives such as noindex tags.\n\n\nOur first victory\n\n\nIn September, we noticed that Hindawi prevented polite bots from accessing pages relating to retracted articles and peer-review fraud. Hindawi fixed their robots.txt after we brought the problem to their attention via Twitter. We can fix the web, one domain at a time!",
  "author": [
    {
      "family": "Boruta",
      "given": "Luc"
    }
  ],
  "id": "1453453",
  "issued": {
    "date-parts": [
      [
        "2018",
        "10",
        "09"
      ]
    ]
  },
  "language": "eng",
  "publisher": "Zenodo",
  "title": "Free-Range Spiderbots!",
  "type": "graphic"
}