Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: None
Fix Version/s: Sustaining
Component/s: Data Summarization
Labels: None
Description
The sources table defines keyword and description as utf8 columns. The temp table being created does not appear to declare those columns as utf8, nor does it set a character set for the table as a whole.
Account 799 (Small Luxury Hotels) was just added on Saturday:
rgardner@xml-06:~$ ls -l /tmp/2011-03-??/sesumm*799*
-rw-r--r-- 1 tsaapp tsaapp 36167 2011-03-26 04:53 /tmp/2011-03-26/sesummarize.sh_d_3_S_3_T_2011_03_25_a_12,16,782,799_04:52:26.log
-rw-r--r-- 1 tsaapp tsaapp 46003 2011-03-27 04:55 /tmp/2011-03-27/sesummarize.sh_d_3_S_3_T_2011_03_26_a_12,16,782,799_04:54:35.log
-rw-r--r-- 1 tsaapp tsaapp 4607 2011-03-28 08:04 /tmp/2011-03-28/sesummarize.sh-799-whsumm-rerun.log
-rw-r--r-- 1 tsaapp tsaapp 41734 2011-03-28 04:53 /tmp/2011-03-28/sesummarize.sh_d_3_S_3_T_2011_03_27_a_12,16,782,799_04:52:26.log
-rw-r--r-- 1 tsaapp tsaapp 39692 2011-03-29 04:53 /tmp/2011-03-29/sesummarize.sh_d_3_S_3_T_2011_03_28_a_12,16,782,799_04:52:29.log
There was a similar error on Sunday:
2011-03-27 04:55:36.350 (2) [P10T1]: Exception [creating staging temp table]:com.mysql.jdbc.exceptions.MySQLIntegrityConstraintViolationException: Duplicate entry '799-hoteldorf grĂ¼ner baum' for key 'description'
There appear to be similar sources for this account:
mysql> select id, keyword, description from sources where accountID = 799 and distributionID = 3 and keyword = 'Im Weissen Rossl';
+-----------+------------------+-----------------------------------+
| id        | keyword          | description                       |
+-----------+------------------+-----------------------------------+
| 925871415 | Im Weissen Rssl  | Keyword: [Im Weissen Rssl] broad  |
| 925871455 | Im Weissen Rssl  | Keyword: [Im Weissen Rssl] exact  |
| 925895145 | Im Weissen Rossl | Keyword: [Im Weissen Rossl] broad |
| 925895175 | Im Weissen Rossl | Keyword: [Im Weissen Rossl] exact |
| 926513915 | im weissen rssl  | Keyword "im weissen rssl"         |
+-----------+------------------+-----------------------------------+
5 rows in set (0.02 sec)
It appears that the temp table is treating these slightly different variations as the same value, which would explain the duplicate-entry error on the 'description' key.
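As a rough illustration of how such a collision could arise, the Python sketch below folds accents and case before comparing descriptions, so variants that differ only in those respects end up with the same key. The `fold` and `find_collisions` helpers are hypothetical names, the "ö" in "Rössl" is an assumption about the character missing from the query output above, and the folding only approximates an accent- and case-insensitive comparison. It is not MySQL's actual collation logic and does not confirm the root cause.

```python
import unicodedata


def fold(text):
    """Return an approximate accent- and case-insensitive comparison key."""
    # Decompose accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold so upper/lower case variants compare equal.
    return stripped.casefold()


def find_collisions(values):
    """Return (first_value, later_value) pairs whose folded keys are equal."""
    first_seen = {}
    collisions = []
    for value in values:
        key = fold(value)
        if key in first_seen:
            if first_seen[key] != value:
                collisions.append((first_seen[key], value))
        else:
            first_seen[key] = value
    return collisions


if __name__ == "__main__":
    # Assumed reconstruction of the descriptions for account 799.
    descriptions = [
        "Keyword: [Im Weissen Rössl] broad",
        "Keyword: [Im Weissen Rössl] exact",
        "Keyword: [Im Weissen Rossl] broad",
        "Keyword: [Im Weissen Rossl] exact",
        'Keyword "im weissen rössl"',
    ]
    for first, later in find_collisions(descriptions):
        print(f"{later!r} collides with {first!r}")
```

Running a check like this against the real descriptions for an account before the staging table is populated would show which rows a unique key on description might reject under such a comparison.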