Details
Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Cloud Spider 3.02
Component/s: Cloud Spider
Labels: None
Environment:
Test Env: (192.168.70.86)
Services - Trunk Maven Build - 917
AWS - CSA - Trunk Build - 46
UI - AdMax-Trunk Build - 81 (client-pc)
Description
1. Log in to the AdMax Spider application.
2. Enter a valid URL to spider: "http://money.cnn.com/"
3. Select the max number of pages to crawl as "1000" and ADR for generation.
4. Open the Custom Crawl Options pane.
=> Select the "Crawl Specific Subdomains" option button.
=> Enter the subdomains to crawl as
5. Select the spider to generate the ADR.
6. Open the generated ADR and observe.
Found that the spider crawls only the "http://money.cnn.com/" URL and does not crawl http://realestate.money.cnn or any other subdomain URLs.
Note:
1. Also tried a crawl of 5000 URLs with other subdomain URLs and found the same issue (S - 790, 791, 793, 7945, 797, 798):
http://tech.fortune.cnn.com/
http://jobsearch.money.cnn.com/
2. Found that the "spider_service.crawlRequestSubdomains" table is correctly updated with the specified subdomain URLs.
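
Expected behaviour, for reference: with "Crawl Specific Subdomains" selected, a discovered link should be followed when its host is either the seed host or one of the subdomains entered in the pane. The sketch below is only an illustration of that check and is not Cloud Spider's actual code; the function names `host_of` and `is_allowed` are hypothetical, and matching on the exact hostname is an assumption about how the option is meant to work.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase hostname of a URL, or '' if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_allowed(url, seed_url, allowed_subdomains):
    """Decide whether a discovered link may be crawled.

    Assumption: a link is in scope when its host exactly matches the seed
    host or the host of one of the user-specified subdomain URLs.
    """
    allowed_hosts = {host_of(seed_url)}
    allowed_hosts.update(host_of(s) for s in allowed_subdomains)
    allowed_hosts.discard("")  # ignore entries that did not parse to a host
    return host_of(url) in allowed_hosts


seed = "http://money.cnn.com/"
subdomains = [
    "http://tech.fortune.cnn.com/",
    "http://jobsearch.money.cnn.com/",
]

# Links on the specified subdomains should be in scope alongside the seed host.
print(is_allowed("http://tech.fortune.cnn.com/", seed, subdomains))
print(is_allowed("http://money.cnn.com/", seed, subdomains))
```

Under this check, links on the specified subdomains are in scope, whereas the generated ADR contains only "http://money.cnn.com/" URLs.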