Early classification of multivariate time series on distributed and in-memory platforms

Vincent S. Tseng*, Huai Shuo Huang, Chia Wei Huang, Ping Feng Wang, Chu Feng Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the popularity of Internet of Things (IoT) applications, various kinds of active sensors are being deployed and multivariate time series datasets are being generated rapidly. Early classification of multivariate time series is an emerging topic in data mining because of its wide applicability in many domains. What distinguishes early classification is that it uses only the earlier part of a time series to reach classification results with the same accuracy as methods that use the complete time series. Although a number of relevant studies have been presented recently, most of them did not consider the issues of data scale and execution efficiency simultaneously. The main research issue of this paper is how to mine interpretable patterns from multivariate time series data, with which effective classification models can be constructed to ensure accuracy as well as earliness. To address data scale and execution efficiency simultaneously, we explore distributed in-memory computing techniques and multivariate shapelet-based approaches to construct a Spark-based in-memory mining framework that parallelizes the feature extraction process. We implement the framework with the Multivariate Shapelets Detection (MSD) method as a base example. Through empirical evaluation on various kinds of sensory datasets, the scalability of the framework is evaluated in terms of efficiency while ensuring the same accuracy and reliability in early classification of multivariate time series. This work is the first to realize early classification of multivariate time series on the Spark distributed in-memory computing platform, and it can serve as a good base for extension to other time series classification methods based on shapelet feature extraction.

Original language: English
Title of host publication: Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2017 Workshops, MLSDA, BDM, DM-BPM, Revised Selected Papers
Editors: Yang-Sae Moon, U Kang, Jeffrey Xu Yu, Ee-Peng Lim
Publisher: Springer Verlag
Pages: 3-14
Number of pages: 12
ISBN (Print): 9783319672731
State: Published - 1 Jan 2017
Event: 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2017 held in conjunction with the Workshop on Machine Learning for Sensory Data Analysis, MLSDA 2017, Workshop on Biologically Inspired Data-Mining Techniques, BDM 2017, Pacific Asia Workshop on Intelligence and Security Informatics, PAISI 2017 and Workshop on Data Mining in Business Process Management, DM-BPM 2017 - Jeju, Korea, Republic of
Duration: 23 May 2017 → 23 May 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10526 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2017 held in conjunction with the Workshop on Machine Learning for Sensory Data Analysis, MLSDA 2017, Workshop on Biologically Inspired Data-Mining Techniques, BDM 2017, Pacific Asia Workshop on Intelligence and Security Informatics, PAISI 2017 and Workshop on Data Mining in Business Process Management, DM-BPM 2017
Country: Korea, Republic of
City: Jeju
Period: 23/05/17 → 23/05/17

Keywords

  • Early classification
  • Multivariate time series
  • Parallel and distributed computing
  • Shapelets
