Paper accepted at ICDM: DESQ: Frequent Sequence Mining with Subsequence Constraints

The paper "DESQ: Frequent Sequence Mining with Subsequence Constraints" by Kaustubh Beedkar and Rainer Gemulla has been accepted at the 2016 IEEE International Conference on Data Mining (ICDM).

Abstract:

Frequent sequence mining methods often make use of constraints to control which subsequences should be mined; e.g., length, gap, span, regular-expression, and hierarchy constraints. We show that many subsequence constraints—including and beyond those considered in the literature—can be unified in a single framework. In more detail, we propose a set of simple and intuitive “pattern expressions” to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. A unified treatment allows researchers to study jointly many types of subsequence constraints (instead of each one individually) and helps to improve usability of pattern mining systems for practitioners.
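To make the constraint types named in the abstract concrete, here is a minimal sketch of frequent sequence mining under a length constraint and a gap constraint. It is not the DESQ algorithm and does not use the paper's pattern expressions. It is a naive enumeration meant only to show what such constraints do to the set of mined subsequences. The function names, parameters (`max_len`, `max_gap`, `min_support`) and the toy database are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Sequence, Set, Tuple

Pattern = Tuple[str, ...]


def constrained_subsequences(seq: Sequence[str], max_len: int, max_gap: int) -> Set[Pattern]:
    """Return the distinct subsequences of `seq` that have between 1 and
    `max_len` items and skip at most `max_gap` input items between two
    consecutive chosen items (hypothetical helper, naive enumeration)."""
    found: Set[Pattern] = set()

    def extend(prefix: Pattern, last: int) -> None:
        found.add(prefix)
        if len(prefix) == max_len:
            return
        # The next item may sit at most max_gap positions beyond the direct successor.
        for nxt in range(last + 1, min(len(seq), last + max_gap + 2)):
            extend(prefix + (seq[nxt],), nxt)

    for start in range(len(seq)):
        extend((seq[start],), start)
    return found


def frequent_subsequences(
    db: Sequence[Sequence[str]], min_support: int, max_len: int, max_gap: int
) -> Dict[Pattern, int]:
    """Count each constrained subsequence once per input sequence and keep
    those that occur in at least `min_support` input sequences."""
    support: Counter = Counter()
    for seq in db:
        support.update(constrained_subsequences(seq, max_len, max_gap))
    return {p: c for p, c in support.items() if c >= min_support}


# Toy database, invented for this example.
db = [["a", "b", "c"], ["a", "c"], ["a", "x", "y", "c"]]
loose = frequent_subsequences(db, min_support=2, max_len=2, max_gap=1)
strict = frequent_subsequences(db, min_support=2, max_len=2, max_gap=0)
```

With `max_gap=1`, the pattern `("a", "c")` is supported by the first two sequences and is frequent. With `max_gap=0`, it is supported only by the second sequence and is dropped. Each constraint type handled this way needs its own enumeration logic, which is the kind of case-by-case treatment the unified framework in the abstract aims to avoid.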