Pattern mining has attracted many researchers since its introduction. Clickstream mining, a specific form of sequential pattern mining, has proven important in the age of the Internet. However, most previous works have simply applied existing sequential pattern algorithms to the mining of clickstream patterns, and few have studied clickstreams with weights, which also have a wide range of applications. In this paper, we address this problem by proposing an approach based on the average weight measure for clickstream pattern mining and by adapting a previous state-of-the-art algorithm to weighted clickstream pattern mining. We then propose an improved method, named Compact-SPADE, that improves efficiency and reduces memory consumption. Through various tests on both real-life and synthetic databases, we show that our proposed algorithms outperform state-of-the-art alternatives in terms of efficiency, memory requirements and scalability.
- Data mining
- Sequential pattern mining
- Weighted clickstream pattern mining