  1. AdMax
  2. ADMAX-2818

Cloud Spider: ADR generated from some aborted crawls contains URLs from previous successful crawls

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Cloud Spider 3.02
    • Component/s: Cloud Spider
    • Labels:
      None
    • Environment:

      Test Env: (192.168.70.86)

      Services - Trunk Maven Build - 895

      AWS - CSA - Trunk Build - 32

      UI - AdMax-Trunk Build - 80 (client-pc)

      Description

      1. Log into the AdMax application and navigate to the SEO -> Spider module

      2. Spider a valid URL, say "http://ec2-184-72-66-227.compute-1.amazonaws.com/", to generate an ADR (S-702)

      => Wait for the crawl to complete and for its ADR generation to start

      3. Spider the same URL "http://ec2-184-72-66-227.compute-1.amazonaws.com/" again to generate an ADR (S-705)

      4. In View Crawl Reports, stop the crawl before it starts (ensure that when the crawl is aborted, the Spider Crawl Details shows the estimated completion time and the total pages spidered as 1/0)

      5. Now download the generated ADR (S-705, R-2134)

      Found that the Spider Crawl Details box (S-705) shows the total number of URLs crawled as 0, but the ADR shows 17 spidered URLs
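
The mismatch above could be caught by a simple regression check that compares the page count shown in Spider Crawl Details with the URLs listed in the downloaded ADR. The following is only a sketch: the function name `check_adr_consistency` and its inputs are hypothetical, and it assumes the ADR's URL list has already been extracted into a Python list by some other means.

```python
def check_adr_consistency(pages_spidered: int, adr_urls: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the counts agree.

    pages_spidered: total shown in Spider Crawl Details (hypothetical input).
    adr_urls: URLs listed in the ADR (hypothetical input, already extracted).
    """
    problems = []
    # Count each URL once in case the report lists a URL on several tabs.
    unique_urls = set(adr_urls)
    if len(unique_urls) != pages_spidered:
        problems.append(
            f"Crawl details report {pages_spidered} pages, "
            f"but the ADR lists {len(unique_urls)} URLs"
        )
    return problems


# Values from this ticket: 0 pages reported, 17 URLs in the ADR.
# The URLs themselves are placeholders, not the real crawled pages.
placeholder_urls = [f"http://example.test/page{i}" for i in range(17)]
for problem in check_adr_consistency(0, placeholder_urls):
    print(problem)
```

An aborted crawl that spidered nothing should pass this check only when its ADR lists no URLs at all.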

      Observed the following in ADRs generated from aborted crawls while testing ADMAX-2771:

      1. In some cases, the ADRs generated for aborted crawls contain a different number of URLs from the number actually crawled

      2. When an ADR (R-2133) is generated with a single page crawled, the summary sheet shows the Remaining URLs (non issues) count as 1 and the Spidered URLs (issues) count as 0, yet the report still contains tabs for other issues

      3. When empty ADRs (R-2094) are generated with zero pages crawled, the summary sheet shows the Remaining URLs (non issues) count as -1 and the Spidered URLs (issues) count as 1, so the remaining URL count is displayed as a negative value
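
The counts in observations 2 and 3 are consistent with the summary sheet computing Remaining URLs as total pages crawled minus URLs with issues (1 - 0 = 1 and 0 - 1 = -1). The ticket does not confirm that this is how the sheet is built, so the sketch below is an assumption used only to illustrate the arithmetic and one possible guard. All names (`SummaryCounts`, `naive_summary`, `guarded_summary`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SummaryCounts:
    issue_urls: int      # "Spidered URLs (issues)" on the summary sheet
    remaining_urls: int  # "Remaining URLs (non issues)" on the summary sheet


def naive_summary(total_crawled: int, issue_urls: int) -> SummaryCounts:
    # Assumed formula: remaining = total - issues, with no bounds checking.
    return SummaryCounts(issue_urls, total_crawled - issue_urls)


def guarded_summary(total_crawled: int, issue_urls: int) -> SummaryCounts:
    # Clamp both inputs so the issue count can never exceed the pages crawled
    # and neither count can drop below zero.
    total_crawled = max(total_crawled, 0)
    issue_urls = min(max(issue_urls, 0), total_crawled)
    return SummaryCounts(issue_urls, total_crawled - issue_urls)


print(naive_summary(1, 0))    # single page crawled, as in R-2133
print(naive_summary(0, 1))    # zero pages crawled, as in R-2094
print(guarded_summary(0, 1))  # same inputs with the guard applied
```

Clamping only hides the symptom: an issue count of 1 for a crawl with zero pages suggests the report is picking up data that does not belong to that crawl, which is the underlying defect described in this ticket.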

        Attachments

        1. ADR-2133.xlsm
          341 kB
          Patrick Wynne
        2. crawled_667_2094.csv
          0.5 kB
          Abhiram Bhagwat
        3. crawled_704_2133.csv
          0.7 kB
          Abhiram Bhagwat
        4. crawled_705_2134.csv
          4 kB
          Abhiram Bhagwat
        5. ec21847213788compute1amazonawscom-ADR-20110908-2094.xlsm
          115 kB
          Saravanan
        6. ec21847266227compute1amazonawscom-ADR-20110909-2133.xlsm
          138 kB
          Saravanan
        7. ec21847266227compute1amazonawscom-ADR-20110909-2134.xlsm
          182 kB
          Saravanan

            People

            • Assignee:
              abhiram Abhiram Bhagwat
              Reporter:
              saravanan.t Saravanan (Inactive)