Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Sustaining
Component/s: Cloud Spider
Labels: None
Environment: AWS
Sprint: Sprint 2
Description
Need to find the root cause of the "too many open files" exception.
Wed 2012/08/15 17:05:00.615| |Thread-217|StreamGobbler|STDERR: 12/08/15 17:05:00 INFO mapred.LocalJobRunner: 20 threads, 92101 pages, 128 errors, 1.3 pages/s, 2203 kb/s,
Wed 2012/08/15 17:05:00.728| |Thread-217|StreamGobbler|STDERR: 12/08/15 17:05:00 INFO fetcher.Fetcher: <crawl 3352> total_urls_spidered: 298827 (max:1000000)
Wed 2012/08/15 17:05:00.729| |Thread-217|StreamGobbler|STDERR: 12/08/15 17:05:00 INFO fetcher.Fetcher: <crawl 3352> fetching: http://www.remax.com/property/85812259-60050095/Off-of-Hereford-Road-Taylor-AZ-85939/
Wed 2012/08/15 17:05:01.025| |Thread-217|StreamGobbler|STDERR: 12/08/15 17:05:01 INFO fetcher.Fetcher: <crawlid 3352> Sending status message now
Wed 2012/08/15 17:05:01.051| |Thread-217|StreamGobbler|STDERR: 12/08/15 17:05:01 INFO fetcher.Fetcher: -activeThreads=20, spinWaiting=0, fetchQueues.totalSize=999
Wed 2012/08/15 17:05:01.885| |main|StreamGobbler|Got message: id: 29ea66d2-2f29-4c41-a3b3-26b04cd06cc1 body: {"msgID":"c2341558-9be4-4aee-93b7-2087ea0102e3","replyQueue":"cloudmgmtsvc_production","replyDestType":"JMS_QUEUE","replyDest":"/queue/SpiderService-ServerResponseQueue","msgType":"start_crawl","body":"
"}
Wed 2012/08/15 17:05:01.886| |main|StreamGobbler|Getting RequestMsg...
Wed 2012/08/15 17:05:01.893| |main|StreamGobbler|Getting Start Crawl Msg...
Wed 2012/08/15 17:05:01.893| |main|StreamGobbler|Cleaning up old Crawls...
Wed 2012/08/15 17:05:01.894| |main|StreamGobbler|Execute Command: bash su nutch -c "/nutch/search/scripts/clean-up.sh"
Wed 2012/08/15 17:05:01.895| |main|StreamGobbler|#### ERROR: executeCommand Cannot run program "bash": java.io.IOException: error=24, Too many open files
Wed 2012/08/15 17:05:01.896| |main|StreamGobbler|Starting Crawl...
Wed 2012/08/15 17:05:01.909| |main|StreamGobbler|<CRAWLID 3376> Attempting create crawl config dir, flag false
Wed 2012/08/15 17:05:01.910| |main|StreamGobbler|Execute Command: mkdir /mnt/nutch/CRAWL_config/CRAWL_3376
Wed 2012/08/15 17:05:01.911| |main|StreamGobbler|#### ERROR: executeCommand Cannot run program "mkdir": java.io.IOException: error=24, Too many open files
Wed 2012/08/15 17:05:01.911| |main|StreamGobbler|Execute Command: chmod 777 -R /mnt/nutch/CRAWL_config/CRAWL_3376
Wed 2012/08/15 17:05:01.912| |main|StreamGobbler|#### ERROR: executeCommand Cannot run program "chmod": java.io.IOException: error=24, Too many open files
Wed 2012/08/15 17:05:01.912| |main|StreamGobbler|<CRAWLID 3376> crawl config dir created and permissions set to /mnt/nutch/CRAWL_config/CRAWL_3376
Wed 2012/08/15 17:05:01.913| |main|StreamGobbler|#### ERROR: writeFile java.io.FileNotFoundException: /mnt/nutch/CRAWL_config/CRAWL_3376/subDomains_3376 (No such file or directory)
Wed 2012/08/15 17:05:01.913| |main|StreamGobbler|#### ERROR: writeFile java.io.FileNotFoundException: /mnt/nutch/CRAWL_config/CRAWL_3376/allowed_3376 (No such file or directory)
Wed 2012/08/15 17:05:01.914| |main|StreamGobbler|#### ERROR: writeFile java.io.FileNotFoundException: /mnt/nutch/CRAWL_config/CRAWL_3376/exclude_3376 (No such file or directory)
Wed 2012/08/15 17:05:01.915| |main|StreamGobbler|#### ERROR: writeFile java.io.FileNotFoundException: /mnt/nutch/CRAWL_config/CRAWL_3376/skipParams_3376 (No such file or directory)
Wed 2012/08/15 17:05:01.931| |main|StreamGobbler|Execute Command: chmod 777 /nutch/Crawl_3376_javaexec.sh
Wed 2012/08/15 17:05:01.931| |main|StreamGobbler|#### ERROR: executeCommand Cannot run program "chmod": java.io.IOException: error=24, Too many open files
Wed 2012/08/15 17:05:01.932| |main|StreamGobbler|Execute Command: su nutch -c "/nutch/Crawl_3376_javaexec.sh"
Wed 2012/08/15 17:05:01.933| |main|StreamGobbler|#### ERROR: executeCommandPID Cannot run program "su": java.io.IOException: error=24, Too many open files
Wed 2012/08/15 17:05:01.933| |main|StreamGobbler|Kicking off Crawl Process, ProcessID: -1
Wed 2012/08/15 17:05:01.933| |main|StreamGobbler|Sending Crawl Status...
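Note on the log above: error=24 means the process hit its open file descriptor limit, so every attempt to start "bash", "mkdir", "chmod" and "su" fails. The service also keeps going after each failure (it reports the config dir as created and kicks off the crawl with ProcessID: -1). The excerpt does not show what is holding the descriptors. One common cause, assumed here and not confirmed for this service, is child-process pipes that are never closed after each executeCommand call. Below is a minimal Java sketch for checking that assumption: it reads the JVM's open descriptor count and runs a command in a way that releases its pipes and stops on failure. The names FdCheck, openFds and run are hypothetical and not taken from the spider code.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.io.InputStream;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdCheck {

    /** Open descriptors of this JVM, or -1 where the Unix-specific MXBean is unavailable. */
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Runs a command, drains its output, releases its pipes, and returns the exit code. */
    static int run(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout so one reader is enough
        Process process;
        try {
            process = builder.start();
        } catch (IOException e) {
            // "error=24, Too many open files" surfaces here; stop instead of carrying on.
            throw new IllegalStateException("Cannot start " + String.join(" ", command), e);
        }
        try (InputStream out = process.getInputStream()) {
            process.getOutputStream().close(); // the child's stdin is not used
            byte[] buffer = new byte[4096];
            while (out.read(buffer) != -1) {
                // output is discarded here; a real caller would log it
            }
            return process.waitFor();
        } catch (IOException e) {
            throw new IllegalStateException("I/O error while running " + command[0], e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running " + command[0], e);
        } finally {
            process.destroy(); // no effect if the child has already exited
        }
    }

    public static void main(String[] args) {
        System.out.println("open fds before: " + openFds());
        for (int i = 0; i < 50; i++) {
            run("true");
        }
        System.out.println("open fds after:  " + openFds());
    }
}
```

If the count printed by a check like this keeps growing across executeCommand calls in the real service, the leak is in command execution. If it stays flat there, the fetcher threads (sockets, segment files) are the next place to look. Logging the count next to the existing status messages would show which activity it tracks.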