How to add user-agent / bot to robots.txt file to prevent crawling / trolling
- Article Type: General
- Product: Primo
Desired Outcome Goal:
Add a specific user-agent / bot to the robots.txt file to prevent crawling or trolling
Procedure:
IMPORTANT NOTE: THIS IS RELEVANT FOR LOCAL CUSTOMERS ONLY! IF YOU ARE A DIRECT-DEDICATED OR TOTAL-CARE CUSTOMER PLEASE CONTACT EXLIBRIS SUPPORT FOR ASSISTANCE WITH THIS PROCESS!
See the "Additional Information" section for further details on the robots.txt file.
The robots.txt file is located at: /exlibris/primo/p4_1/ng/primo/home/system/tomcat/search/webapps/ROOT/robots.txt
By default (out of the box, OTB), the file contains:
---
# Disallow all robots from this directory structure.
User-agent: *
Disallow: /
---
This asks all robots to stay out of the entire site. If a specific robot continues to crawl your site, you can add its user-agent to the file explicitly.
Using the same format, add a block for each user-agent/bot you want to exclude. For example:
---
# Disallow Googlebot from this directory structure.
User-agent: Googlebot
Disallow: /
---
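To check how a standards-following robot would interpret the rules, the file contents can be run through a parser. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name is_blocked and the sample bot name SomeOtherBot are hypothetical and used for illustration only. It shows how such a parser reads the file and does not guarantee that any particular robot will obey it.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, user_agent: str, url: str = "/") -> bool:
    """Return True if the given robots.txt text disallows user_agent from url.

    is_blocked is a hypothetical helper name, not part of Primo.
    """
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Default content plus an explicit block for Googlebot, as in the example above.
ROBOTS_TXT = """\
# Disallow all robots from this directory structure.
User-agent: *
Disallow: /

# Disallow Googlebot from this directory structure.
User-agent: Googlebot
Disallow: /
"""

# A file that names only Googlebot leaves other robots unrestricted.
GOOGLEBOT_ONLY = """\
User-agent: Googlebot
Disallow: /
"""

if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        print(agent, "blocked by full file:", is_blocked(ROBOTS_TXT, agent))
        print(agent, "blocked by Googlebot-only file:", is_blocked(GOOGLEBOT_ONLY, agent))
```

The comparison with GOOGLEBOT_ONLY illustrates why the default "User-agent: *" block should stay in the file when a specific user-agent is added: without it, robots that are not named are not asked to stay out.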
The robots.txt file can only be edited as the root user.
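If several user-agents need to be added, the new block can be generated rather than typed by hand. The sketch below is one possible approach, not an Ex Libris tool: the function name with_disallow_block is hypothetical, and it works on the file content as a string so that it can be tried without root access. Writing the result back to the real robots.txt would still have to be done as root.

```python
def with_disallow_block(robots_txt: str, user_agent: str) -> str:
    """Return robots_txt with a 'Disallow: /' block for user_agent appended.

    If a User-agent line for this agent already exists (case-insensitive),
    the text is returned unchanged. Hypothetical helper for illustration.
    """
    for line in robots_txt.splitlines():
        # Ignore comments, then split "Field: value".
        field, _, value = line.split("#", 1)[0].partition(":")
        if (field.strip().lower() == "user-agent"
                and value.strip().lower() == user_agent.lower()):
            return robots_txt

    # Make sure the existing content ends with a newline before appending.
    if robots_txt and not robots_txt.endswith("\n"):
        robots_txt += "\n"

    block = (
        "\n"
        f"# Disallow {user_agent} from this directory structure.\n"
        f"User-agent: {user_agent}\n"
        "Disallow: /\n"
    )
    return robots_txt + block


# The default (OTB) content described in this article.
DEFAULT_ROBOTS_TXT = (
    "# Disallow all robots from this directory structure.\n"
    "User-agent: *\n"
    "Disallow: /\n"
)

if __name__ == "__main__":
    print(with_disallow_block(DEFAULT_ROBOTS_TXT, "Googlebot"))
```

Keeping the function free of file access means the output can be reviewed first and then copied into the file during a root session.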
Additional Information:
Before making any changes to this file, we highly recommend researching the topic.
See: http://en.wikipedia.org/wiki/Robots_exclusion_standard
- Article last edited: 6/3/2014