Details
- Type: Bug
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: unspecified
- Fix Version/s: None
- Component/s: Spider
- Labels: None
- Environment: Operating System: Windows XP, Platform: PC
- Bugzilla Id: 3589
Description
Prerequisites:
Edit "robots.txt" file in root folder as 
--------------
User-Agent: Googlebot
Disallow: /Spider/simple.htm
--------------
and save it
Steps:
1. Log in to the AdMax application.
2. Navigate to the SEO section and click "Spider".
3. Enter a valid URL to spider, e.g.
"http://pvwb-of1pvd0010.ri.thesearchagency.com/gurpreet/Spider/newlink.htm"
4. Select "Custom Spider Options" and ensure the "Honor Robots" check box is checked.
5. Spider the URL and download the generated ADR report.
Expected Result:
In the ADR, the link to "Spider/simple.htm" should not be listed under "Spidered
Urls_1", as it is disallowed.
Actual Result:
In the ADR, the link to "Spider/simple.htm" is listed under "Spidered Urls_1".
Note:
Observed that the "Honor Robots" option works fine when a wildcard user-agent
(User-Agent: *) is used in "robots.txt" to exclude all agents; the failure only
occurs when a specific user-agent such as "Googlebot" is named.
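For reference, a minimal sketch (not the AdMax Spider implementation) using Python's standard urllib.robotparser to show the expected interpretation of the robots.txt rule from the prerequisites; the alternate user-agent string is a made-up value for illustration:
--------------
import urllib.robotparser

# The rule from the prerequisites.
ROBOTS_TXT = """\
User-Agent: Googlebot
Disallow: /Spider/simple.htm
"""

parser = urllib.robotparser.RobotFileParser()
parser.modified()                      # mark the rules as loaded before querying
parser.parse(ROBOTS_TXT.splitlines())

path = "/Spider/simple.htm"

# A crawler identifying itself as Googlebot must skip the disallowed page,
# so it should not show up under "Spidered Urls_1" in the ADR.
print(parser.can_fetch("Googlebot", path))      # expected: False
# Agents without a matching User-Agent group are unaffected; only a
# wildcard group (User-Agent: *) would also restrict them.
print(parser.can_fetch("SomeOtherBot", path))   # expected: True
--------------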