AdMax / ADMAX-2919

Cloud Spider: Selecting the "Crawl Any Subdomain Of The Requested Root Domain" option crawls all URLs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: Cloud Spider 3.03
    • Component/s: Cloud Spider
    • Labels: None
    • Environment:

      Test Env: (192.168.70.86)

      AWS - CSA #2 - Bug Fix Build - 19

      Services - CS #2 - Bug Fix Build - 29

      UI - AdMax-Trunk Build - 85 (client-pc)

      Description

      1. Log in to the AdMax Spider application.

      2. Enter "http://money.cnn.com/" as the valid URL to spider.

      3. Set the maximum number of pages to crawl to "1000" and select ADR for generation.

      4. Open the Custom Crawl Options pane and select the "Crawl Any Subdomain Of The Requested Root Domain" option button.

      5. Select the spider to generate the ADR, open the generated ADR, and observe the crawled URLs.

      Actual Result:

      The spider crawls all subdomain URLs, both subdomains of the root domain and subdomains of those subdomains:

      http://jobsearch.money.cnn.com/

      http://sportsillustrated.cnn.com/mobile/

      http://tech.fortune.cnn.com/

      http://ireport.cnn.com/

      Expected Result:

      The spider should crawl only the subdomain URLs of the root domain "http://cnn.com".

      "http://jobsearch.money.cnn.com/" should not be listed, as it is a subdomain URL of the requested subdomain "http://money.cnn.com".
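The issue does not spell out the exact scope rule, so the following is only a sketch of the two readings that seem to be in play. The function names (`host_of`, `is_any_subdomain`, `is_direct_subdomain`) are hypothetical and are not taken from the Cloud Spider code base, and the root domain is passed in explicitly rather than derived from the URL. `is_any_subdomain` accepts hosts at any depth below the root, which matches the observed behavior; `is_direct_subdomain` accepts only hosts one label below the root, which is one possible reading of the expected result.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL ('' if the URL has none)."""
    return (urlsplit(url).hostname or "").lower()


def is_any_subdomain(url: str, root: str) -> bool:
    """True if the host is the root domain or a subdomain of it at any depth."""
    host = host_of(url)
    return host == root or host.endswith("." + root)


def is_direct_subdomain(url: str, root: str) -> bool:
    """True only if the host is the root domain or exactly one label below it."""
    host = host_of(url)
    if host == root:
        return True
    suffix = "." + root
    if not host.endswith(suffix):
        return False
    prefix = host[: -len(suffix)]
    # A single label contains no dots; "jobsearch.money" would be two levels deep.
    return prefix != "" and "." not in prefix


if __name__ == "__main__":
    # URLs taken from the Actual Result section of this issue.
    urls = [
        "http://jobsearch.money.cnn.com/",
        "http://sportsillustrated.cnn.com/mobile/",
        "http://tech.fortune.cnn.com/",
        "http://ireport.cnn.com/",
    ]
    for u in urls:
        print(u, is_any_subdomain(u, "cnn.com"), is_direct_subdomain(u, "cnn.com"))
```

Under the any-depth rule all four URLs are in scope. Under the one-level rule, "http://jobsearch.money.cnn.com/" and "http://tech.fortune.cnn.com/" are excluded, while the other two remain in scope.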

      Note:

      It works as expected for "http://advertsing.monster.com/", listing only URLs from its root domain:

      http://career-advice.monster.com/

      http://jobs.monster.com/

      http://my.monster.com/

      http://excelle.monster.com/

            People

            • Assignee: antony Antony Rajiv (Inactive)
            • Reporter: saravanan.t Saravanan (Inactive)