Compact prediction tree: A lossless model for accurate sequence prediction

Ted Gueniche, Philippe Fournier-Viger, S. Tseng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

30 Scopus citations

Abstract

Predicting the next item of a sequence over a finite alphabet has important applications in many domains. In this paper, we present a novel prediction model named CPT (Compact Prediction Tree), which losslessly compresses the training data so that all relevant information is available for each prediction. Our approach is incremental, offers low time complexity for its training phase, and is easily adaptable to different applications and contexts. We compared the performance of CPT with state-of-the-art techniques, namely PPM (Prediction by Partial Matching), DG (Dependency Graph) and All-K-th-Order Markov. Results show that CPT yields higher accuracy on most datasets (up to 12% more than the second-best approach), has a better training time than DG and PPM, and is considerably smaller than All-K-th-Order Markov.
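The general idea described in the abstract can be sketched as follows. This is a minimal, simplified illustration, not the authors' implementation: the node layout, the inverted index, and the "consequent" rule (count items appearing after the last matched suffix item) are assumptions based on the abstract's description of a lossless prediction tree.

```python
from collections import defaultdict, Counter

class Node:
    """One trie node; parent links let us recover full sequences from leaves."""
    def __init__(self, item=None, parent=None):
        self.item = item
        self.parent = parent
        self.children = {}

class CPT:
    """Simplified compact-prediction-tree sketch (illustrative only)."""
    def __init__(self):
        self.root = Node()
        self.inverted = defaultdict(set)  # item -> ids of sequences containing it
        self.lookup = {}                  # sequence id -> leaf node in the trie

    def train(self, sequences):
        # Insert each training sequence into the trie (lossless: shared
        # prefixes are stored once, every sequence remains recoverable).
        for sid, seq in enumerate(sequences):
            node = self.root
            for item in seq:
                if item not in node.children:
                    node.children[item] = Node(item, node)
                node = node.children[item]
                self.inverted[item].add(sid)
            self.lookup[sid] = node

    def _sequence(self, sid):
        # Walk parent links from the leaf back to the root.
        items, node = [], self.lookup[sid]
        while node.parent is not None:
            items.append(node.item)
            node = node.parent
        return items[::-1]

    def predict(self, suffix):
        if not suffix:
            return None
        # Candidate sequences: those containing every item of the suffix.
        sids = set.intersection(*(self.inverted[i] for i in suffix))
        counts = Counter()
        for sid in sids:
            seq = self._sequence(sid)
            # Count items after the last occurrence of any suffix item.
            last = max(len(seq) - 1 - seq[::-1].index(i) for i in suffix)
            for item in seq[last + 1:]:
                if item not in suffix:
                    counts[item] += 1
        return counts.most_common(1)[0][0] if counts else None
```

For example, after training on `[['a','b','c'], ['a','b','c'], ['a','b','d']]`, calling `predict(['a','b'])` returns `'c'`, the most frequent follower across matching sequences.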

Original language: English
Title of host publication: Advanced Data Mining and Applications - 9th International Conference, ADMA 2013, Proceedings
Pages: 177-188
Number of pages: 12
Edition: PART 2
DOIs
State: Published - 1 Dec 2013
Event: 9th International Conference on Advanced Data Mining and Applications, ADMA 2013 - Hangzhou, China
Duration: 14 Dec 2013 - 16 Dec 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 8347 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on Advanced Data Mining and Applications, ADMA 2013
Country: China
City: Hangzhou
Period: 14/12/13 - 16/12/13

Keywords

  • accuracy
  • compression
  • next item prediction
  • sequence prediction
