Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: Sustaining
Component/s: Data Summarization
Labels: None
Environment: dc1bos
Description
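This bug can be triggered by a database connection error or by someone killing whsumm while it is running lowgran. There may also be a race condition when multiple lowgran whsumm runs are in progress for different dates. In each case a dataAvailability row is left with status 'in progress', and that row blocks later lowgran runs for the account.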
=== 1) identifying the problem for account 733 ===
select * from dataSources where accountid=733 and type in ('wh_lowgran','wh_lowgran_hourly');
+--------+-------------------+--------+-----------+
| id     | type              | typeID | accountID |
+--------+-------------------+--------+-----------+
| 252075 | wh_lowgran        | NULL   | 733       |
| 252085 | wh_lowgran_hourly | NULL   | 733       |
+--------+-------------------+--------+-----------+
select * from dataAvailability where dataSourceID in (252075,252085) order by id desc limit 20;
+-----------+------------+------------+---------------------+--------------+-------------+------------+------------+----------+
| id        | date       | oldUpdated | updated             | dataSourceID | status      | statusText | needsRerun | override |
+-----------+------------+------------+---------------------+--------------+-------------+------------+------------+----------+
| 127711205 | 2011-03-31 | 00:00:00   | 2011-04-01 11:23:20 | 252085       | success     | NULL       | false      | false    |
| 127534955 | 2011-03-31 | 00:00:00   | 2011-04-01 11:14:23 | 252075       | in progress | NULL       | false      | false    |
| 127448735 | 2011-03-30 | 00:00:00   | 2011-03-31 03:51:33 | 252085       | success     | NULL       | false      | false    |
| 127273735 | 2011-03-30 | 00:00:00   | 2011-03-31 09:36:03 | 252075       | success     | NULL       | false      | false    |
+-----------+------------+------------+---------------------+--------------+-------------+------------+------------+----------+
=== 2) manual fix for problem above where date=2011-03-31 and updated=2011-04-01 ===
mysql> update dataAvailability set status='success' where id=127534955;
Query OK, 1 row affected (0.02 sec)
Rows matched: 1 Changed: 1 Warnings: 0
=== 3) identifying across all account databases ===
Account shard 5 is the only one that appears to have this problem, and it has it for a few different accounts.
select dataAvailability.id, accountID, date, updated, status, type, dataSourceID from dataSources join dataAvailability on dataSourceID=dataSources.id where type in ('wh_lowgran','wh_lowgran_hourly') and status='in progress' and date > '2011-03-01';
+-----------+-----------+------------+---------------------+-------------+------------+--------------+
| id        | accountID | date       | updated             | status      | type       | dataSourceID |
+-----------+-----------+------------+---------------------+-------------+------------+--------------+
| 120516865 | 706       | 2011-03-03 | 2011-03-10 09:26:20 | in progress | wh_lowgran | 224355       |
| 121242915 | 706       | 2011-03-06 | 2011-03-09 17:28:26 | in progress | wh_lowgran | 224355       |
| 123218655 | 733       | 2011-03-14 | 2011-03-31 07:57:53 | in progress | wh_lowgran | 252075       |
| 125705645 | 523       | 2011-03-24 | 2011-04-01 11:50:27 | in progress | wh_lowgran | 83205        |
+-----------+-----------+------------+---------------------+-------------+------------+--------------+
4 rows in set (0.03 sec)
Once you have verified that nothing is running for those accounts, you can set all of these rows to 'success', and they will stop blocking.
If you are having this issue for dates
=== code ===
grep -r 'before checking to update' * --include '*.java'
src/main/java/com/thesearchagency/perf/WarehouseSummarizer.java: Debug.msg(Debug.INFORMATION, "Sleeping... " + INPROGRESS_RETRY_WAIT_TIME + " ms before checking to update low resoution tables...");
DataSourceTable.TypeValue dataSourceType = (theResolution == DAILY_RESOLUTION)
        ? DataSourceTable.TypeValue.WH_LOWGRAN
        : DataSourceTable.TypeValue.WH_LOWGRAN_HOURLY;
dataSource = DataAvailability.findDataSource(theThreadDatabase, dataSourceType, null, theAccountID.toString());
while (!done) {
    // TODO: this loop will be infinite if the app is killed in here, and will require a manual database update setting the status = something other than 'in progress' to break the loop.
    // This was added for the publisher feed project with the intent of preventing low res from being run twice in the same account. should set a max time to expire.
    if (!isStatusInProgress(dataSource)) {
        setDateRangeDA(dataSource, DataAvailabilityTable.StatusValue.IN_PROGRESS, theStartDate, theEndDate);
        Debug.debug(Debug.INFORMATION, "Starting " + acctDesc + " aggregates for " + theStartDate + " to <" + theEndDate);
        updateLowResTables(theStartDate, theEndDate, false);
        Debug.debug(Debug.INFORMATION, "Finished " + acctDesc + " aggregates for " + theStartDate + " to <" + theEndDate);
        setDateRangeDA(dataSource, DataAvailabilityTable.StatusValue.SUCCESS, theStartDate, theEndDate);
        done = true;
    } else {
        try {
            Debug.msg(Debug.INFORMATION, "Sleeping... " + INPROGRESS_RETRY_WAIT_TIME + " ms before checking to update low resoution tables...");
            Thread.sleep(INPROGRESS_RETRY_WAIT_TIME);
        } catch (InterruptedException ie) {
            Debug.msg(Debug.WARNING, "interrupted while sleeping in checking DataAvailability status retry, ignoring");
        }
    }
}
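The TODO in the loop above asks for a maximum time to expire. The following is only a minimal sketch of what a bounded wait could look like, written in isolation from the real classes. `BoundedWait`, `waitUntilFree`, and the `maxWaitMillis` parameter are hypothetical names that do not exist in WarehouseSummarizer; the `BooleanSupplier` stands in for the `isStatusInProgress(dataSource)` check, and `retryWaitMillis` plays the role of `INPROGRESS_RETRY_WAIT_TIME`.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of WarehouseSummarizer: polls with a deadline instead of forever. */
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Polls inProgress until it reports false or maxWaitMillis has elapsed.
     * Returns true if the status cleared in time, false if the wait expired
     * or the thread was interrupted while sleeping.
     */
    static boolean waitUntilFree(BooleanSupplier inProgress, long retryWaitMillis, long maxWaitMillis) {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (inProgress.getAsBoolean()) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                return false; // expired: the caller decides whether to fail, alert, or take over
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(retryWaitMillis, remainingMillis));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
                return false;
            }
        }
        return true;
    }
}
```

With something like this, the `else` branch of the loop could give up after a configured limit and log an error instead of sleeping indefinitely. What the summarizer should do once the wait expires (fail the run, or overwrite the stuck row) is a separate decision that this sketch does not make.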
private boolean isStatusInProgress(Object aDataSourceID)
{
    boolean ret = false;
    Calendar currentDate = Calendar.getInstance();
    currentDate.setTime((Date)theStartDate.clone());
    // check all dates in date range. might need to eventually return a list
    while (!ret && currentDate.getTime().before(theEndDate)) {
        DataAvailabilityTable.StatusValue daStatus = DataAvailabilityTable.StatusValue.intern(
                DataAvailability.getAvailability(theThreadDatabase, aDataSourceID, currentDate.getTime()));
        if (daStatus != null && daStatus == DataAvailabilityTable.StatusValue.IN_PROGRESS) {
            ret = true;
        }
        currentDate.add(Calendar.DATE, 1);
    }
    return ret;
}
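`isStatusInProgress` looks only at the status value, so it cannot tell a live run from a row abandoned by a killed process. One possible way to distinguish them is to also compare the row's `updated` timestamp against an age limit. The sketch below assumes that approach; `StaleInProgress`, `isStale`, and the `maxAge` threshold are hypothetical and not part of the existing code, and the timestamps in the usage note are treated as UTC purely for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, not part of the existing code: flags 'in progress' rows that look abandoned. */
final class StaleInProgress {
    private StaleInProgress() {}

    /**
     * Returns true when the row is 'in progress' and its updated timestamp
     * is more than maxAge before now. Rows with any other status, or with
     * no updated timestamp, are never reported as stale.
     */
    static boolean isStale(String status, Instant updated, Instant now, Duration maxAge) {
        if (!"in progress".equals(status) || updated == null) {
            return false;
        }
        return Duration.between(updated, now).compareTo(maxAge) > 0;
    }
}
```

As an illustration with an assumed six-hour limit and an assumed current time of 2011-04-01 12:00:00, row 123218655 (updated 2011-03-31 07:57:53) would be flagged as stale, while row 125705645 (updated 2011-04-01 11:50:27) would not. A suitable limit depends on how long a legitimate lowgran run can take, which this ticket does not state.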