Advanced Data Mining and Applications: 10th International by Xudong Luo, Jeffrey Xu Yu, Zhi Li

This book constitutes the proceedings of the 10th International Conference on Advanced Data Mining and Applications, ADMA 2014, held in Guilin, China, in December 2014. The 48 regular papers and 10 workshop papers presented in this volume were carefully reviewed and selected from 90 submissions. They deal with the following topics: data mining, social network and social media, recommender systems, database, dimensionality reduction, advanced machine learning techniques, classification, big data and applications, clustering methods, machine learning, and data mining and database.

Definition 3 (utility of an itemset in a database). The utility of an itemset X is denoted as u(X) and defined as u(X) = Σ_{Tc ∈ g(X)} u(X, Tc), where g(X) is the set of transactions containing X.

Example 3. The utility of the itemset {c, e} is u({c, e}) = (u(c, T2) + u(e, T2)) + (u(c, T3) + u(e, T3)) + (u(c, T4) + u(e, T4)) + (u(c, T5) + u(e, T5)) = (6 + 6) + (1 + 3) + (3 + 3) + (2 + 3) = 27. The utility of the itemset {a, d, f} is u({a, d, f}) = u(a, T3) + u(d, T3) + u(f, T3) = −5 + 12 + 5 = 12.
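The sum in Definition 3 can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the names DB, utility_in_transaction and utility are hypothetical, and the database below contains only the per-item utilities quoted in Example 3 (the full transaction table is not part of this excerpt, so any other items or transactions are simply missing).

```python
# Per-item utilities u(i, Tc), restricted to the values quoted in Example 3.
# Assumption: the real database has more items and transactions than shown here.
DB = {
    "T2": {"c": 6, "e": 6},
    "T3": {"a": -5, "c": 1, "d": 12, "e": 3, "f": 5},
    "T4": {"c": 3, "e": 3},
    "T5": {"c": 2, "e": 3},
}


def utility_in_transaction(itemset, transaction):
    """u(X, Tc): sum of the utilities of the items of X in one transaction."""
    return sum(transaction[item] for item in itemset)


def utility(itemset, db):
    """u(X): sum of u(X, Tc) over g(X), the transactions that contain all of X."""
    items = set(itemset)
    return sum(
        utility_in_transaction(items, transaction)
        for transaction in db.values()
        if items.issubset(transaction)  # transaction belongs to g(X)
    )


if __name__ == "__main__":
    print(utility({"c", "e"}, DB))
    print(utility({"a", "d", "f"}, DB))
```

On this partial data the two calls reproduce the totals of Example 3. Note that negative utilities such as u(a, T3) = −5 need no special handling in the sum itself.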

In terms of memory usage, FHN uses much less memory than HUINIV-Mine. 97 GB while FHN was using only up to 250 MB. On kosarak, chess, pumsb and accidents, HUINIV-Mine ran out of memory under our 5 GB memory limit, while FHN used 20 MB, 1179 MB, 100 MB and 350 MB, respectively, for the lowest minutil values. Lastly, for the retail dataset, the memory usage of FHN was about one fifth that of HUINIV-Mine. Overall, FHN used up to 250 times less memory than HUINIV-Mine.

28 P. Fournier-Viger

An interesting observation is that FHN performs very well on dense datasets such as mushroom compared to HUINIV-Mine.

The utility of the itemset {a, e} in the transaction T0 is u({a, e}, T0) = u({a}, T0) + u({e}, T0) = 1×5 + 1×3 = 8. The utility of {a, e} in the database is u({a, e}) = u({a, e}, T0) + u({a, e}, T3) = 8 + 16 = 24. Definition 2 (Problem of HUI mining). An itemset X is a high utility itemset if its utility is no less than a user-specified minimum utility threshold minutil. Otherwise, X is a low utility itemset. The problem of high utility itemset mining is to discover all high utility itemsets in the database.
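Definition 2 can be illustrated with a brute-force sketch that enumerates every itemset and keeps those whose utility reaches minutil. This is a naive baseline for illustration only, not one of the algorithms discussed in the excerpts. The function names are hypothetical, and so is the toy database: T0 uses the utilities stated above (1×5 for a, 1×3 for e), while the split of the 16 units in T3 between a and e is an assumption (2×5 and 2×3), since the excerpt gives only the total.

```python
from itertools import combinations

# Per-item utilities per transaction. T0 follows the text; the T3 split is assumed.
TOY_DB = {
    "T0": {"a": 5, "e": 3},
    "T3": {"a": 10, "e": 6},  # assumption: quantities of 2 with unit profits 5 and 3
}


def utility(itemset, db):
    """u(X): total utility of X over the transactions that contain all of X."""
    items = set(itemset)
    return sum(
        sum(transaction[item] for item in items)
        for transaction in db.values()
        if items.issubset(transaction)
    )


def mine_high_utility_itemsets(db, minutil):
    """Return {itemset: utility} for every itemset X with u(X) >= minutil."""
    all_items = sorted({item for transaction in db.values() for item in transaction})
    result = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            # Skip itemsets that occur in no transaction.
            if not any(set(candidate).issubset(t) for t in db.values()):
                continue
            u = utility(candidate, db)
            if u >= minutil:
                result[frozenset(candidate)] = u
    return result


if __name__ == "__main__":
    for itemset, u in mine_high_utility_itemsets(TOY_DB, minutil=20).items():
        print(sorted(itemset), u)
```

With minutil = 20, only {a, e} qualifies on this toy data, even though neither {a} nor {e} does on its own. A superset can therefore be high utility when its subsets are not, which is why utility cannot be used for pruning the way support is, and why exhaustive enumeration like this grows exponentially with the number of items.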

