  1. AdMax
  2. ADMAX-2838

Cloud Spider: "Crawl Specific Subdomains" option works incorrectly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Cloud Spider 3.02
    • Component/s: Cloud Spider
    • Labels:
      None
    • Environment:

      Test Env: (192.168.70.86)

      Services - Trunk Maven Build - 917

      AWS - CSA - Trunk Build - 46

      UI - AdMax-Trunk Build - 81 (client-pc)

      Description

      1. Log in to the AdMax Spider application

      2. Enter a valid URL to spider: "http://money.cnn.com/"

      3. Set the maximum number of pages to crawl to "1000" and select ADR for generation

      4. Open the Custom Crawl Options pane

      => Select the "Crawl Specific Subdomains" option button

      => Enter the subdomain to crawl:

      http://realestate.money.cnn

      5. Select the spider to generate the ADR

      6. Open the generated ADR and observe

      Result: the spider crawls only the "http://money.cnn.com/" URL and does not crawl http://realestate.money.cnn or any other subdomain URLs

      Note:

      1. Also tried a crawl of 5000 URLs with other subdomain URLs and found the same issue (S - 790, 791, 793, 7945, 797, 798)

      "http://tech.fortune.cnn.com/

      http://realestate.money.cnn

      http://jobsearch.money.cnn.com/

      http://realestate.money.cnn"

      2. The "spider_service.crawlRequestSubdomains" table is correctly updated with the specified subdomain URLs
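
      For reference, the behavior expected from the "Crawl Specific Subdomains" option can be sketched as a host filter. This is only an illustration, not the Cloud Spider implementation: the function names host_of and in_allowed_subdomains are hypothetical, and matching by exact host or dot-separated suffix is an assumption about how the option is meant to work.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercase host of a URL, tolerating a missing scheme."""
    # Without a scheme, urlsplit treats the text as a path, so prefix "//"
    # to make it parse the host as the network location.
    parts = urlsplit(url if "://" in url else "//" + url)
    return (parts.hostname or "").lower()


def in_allowed_subdomains(url: str, allowed: list[str]) -> bool:
    """Hypothetical filter: True if the URL's host matches a configured subdomain.

    Assumption: an entry matches its own host and any host nested under it.
    """
    host = host_of(url)
    for entry in allowed:
        allowed_host = host_of(entry)
        if not allowed_host:
            continue  # skip empty or unparseable entries
        if host == allowed_host or host.endswith("." + allowed_host):
            return True
    return False


# Subdomain entries taken from the note above.
allowed = ["http://tech.fortune.cnn.com/", "http://jobsearch.money.cnn.com/"]
print(in_allowed_subdomains("http://tech.fortune.cnn.com/some/page", allowed))
print(in_allowed_subdomains("http://money.cnn.com/", allowed))
```

      Under this sketch, an entry typed as http://realestate.money.cnn (as in step 4) would not match pages on the host realestate.money.cnn.com, because the comparison is made on the full host name. Whether the spider should accept or reject such an entry is not stated in this report.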


            People

            • Assignee: Antony Rajiv (antony, Inactive)
            • Reporter: Saravanan (saravanan.t, Inactive)
