Details
- Type: Bug
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: unspecified
- Fix Version/s: None
- Component/s: Spider
- Labels: None
- Environment:
  Operating System: Windows XP
  Platform: PC
- Bugzilla Id: 3589
Description
Prerequisites:
Edit "robots.txt" file in root folder as
--------------
User-Agent: Googlebot
Disallow: /Spider/simple.htm
--------------
and save it.
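For reference, a minimal Python sketch (not AdMax code) of how a robots.txt-honoring spider is expected to evaluate the rule above, using the standard library parser; the user agent and paths are taken from this prerequisite:
--------------
from urllib.robotparser import RobotFileParser

# The exact rule from the prerequisite robots.txt.
robots_txt = """\
User-Agent: Googlebot
Disallow: /Spider/simple.htm
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A crawler identifying itself as Googlebot must skip the disallowed page...
print(parser.can_fetch("Googlebot", "/Spider/simple.htm"))   # False
# ...while other pages on the site remain fetchable.
print(parser.can_fetch("Googlebot", "/Spider/newlink.htm"))  # True
--------------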
Steps:
1. Log into the AdMax application.
2. Navigate to the SEO section and click on "Spider".
3. Enter a valid URL to spider, e.g.
"http://pvwb-of1pvd0010.ri.thesearchagency.com/gurpreet/Spider/newlink.htm"
4. Select "custom spider options" and ensure the "Honor Robots" check box is checked.
5. Spider the URL and download the generated ADR report.
Expected Result:
In the ADR, the link to "Spider/simple.htm" should not be listed in "Spidered
Urls_1", as it is disallowed.
Actual Result:
In the ADR, the link to "Spider/simple.htm" is listed in "Spidered Urls_1".
Note:
Observed that honoring the "robots.txt" file works fine when a wildcard user agent
(User-Agent: *) is used to disallow the page for all crawlers.
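The working wildcard case from the note, in the same sketch form. Assumption: "AdMaxSpider" is a placeholder user agent name, since the spider's actual identification string is not given in this report.
--------------
from urllib.robotparser import RobotFileParser

# Wildcard rule: applies to every crawler, whatever it identifies as.
wildcard_robots = """\
User-Agent: *
Disallow: /Spider/simple.htm
"""

parser = RobotFileParser()
parser.parse(wildcard_robots.splitlines())

# Disallowed regardless of the user agent string ("AdMaxSpider" is hypothetical).
print(parser.can_fetch("AdMaxSpider", "/Spider/simple.htm"))  # False
--------------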