Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: Cloud Spider 3.01
- Fix Version/s: Cloud Spider 3.02
- Component/s: Cloud Spider
- Labels: None
Description
Bug in Nutch (Related: https://issues.apache.org/jira/browse/NUTCH-578)
When fetching a URL throws an exception, Nutch reschedules that URL for retry and fetches it again on every iteration until all iterations are complete. Such URLs should instead be permanently marked as failed. The current behaviour can also make the URLs Found count wrong, because failed URLs may be added again. The fix should be a one-line change in Fetcher.java.
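
To illustrate the intended behaviour, here is a minimal, self-contained sketch in Java. It does not use Nutch's classes: `FetchFailureSketch`, `Status`, `onException` and `attempts` are hypothetical names invented for this example, and the loop only models a single URL whose fetch always throws. In Nutch itself the corresponding statuses appear to be `CrawlDatum.STATUS_FETCH_RETRY` and `CrawlDatum.STATUS_FETCH_GONE`, but the exact line to change in Fetcher.java is not given in this ticket.

```java
public class FetchFailureSketch {
    // Hypothetical stand-ins for the crawl statuses; not Nutch's own constants.
    enum Status { UNFETCHED, RETRY, GONE }

    /** Status recorded when a fetch throws; the proposed change swaps RETRY for GONE. */
    static Status onException(boolean permanentFailure) {
        return permanentFailure ? Status.GONE : Status.RETRY;
    }

    /** Counts the fetch attempts one always-failing URL receives over the given iterations. */
    static int attempts(int iterations, boolean permanentFailure) {
        Status status = Status.UNFETCHED;
        int attempts = 0;
        for (int i = 0; i < iterations; i++) {
            if (status == Status.GONE) {
                continue; // permanently failed URLs are never scheduled again
            }
            attempts++;                              // the URL is scheduled and fetched
            status = onException(permanentFailure);  // the fetch throws every time
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println("retry on exception: " + attempts(5, false) + " attempts");
        System.out.println("permanent failure:  " + attempts(5, true) + " attempts");
    }
}
```

With the retry status the failing URL is attempted once per iteration; with the permanent-failure status it is attempted once and then skipped, so it cannot be counted again.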