Incremental Checkpointing for Fault-Tolerant Stream Processing Systems: A Data Structure Approach

Chia Yu Lin, Li Chun Wang, Shu Ping Chang

Research output: Contribution to journal (Article)

Abstract

As the demand for high-speed stream processing grows, in-memory databases are widely used to analyze streaming data. Because data are not stored on disk, it is challenging for in-memory systems to deliver high throughput and data persistence at the same time. ARIES logging and command logging are two popular logging methods, and current applications need both. However, no existing checkpointing mechanism provides the functions of both methods. Moreover, ARIES logging incurs high overhead in an in-memory database, while command logging records redundant commands and has a high storage cost. To address these issues, we exploit the order-irrelevant characteristics of data structures and the concept of incremental checkpointing to devise a data structure based incremental checkpointing (DSIC) mechanism. DSIC is a very low overhead checkpointing approach that retains the features of both ARIES logging and command logging. Compared with the existing logging scheme, DSIC reduces logging time by more than 70% and storage cost by 40%.
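
The general idea of incremental checkpointing over order-irrelevant updates can be illustrated with a minimal Python sketch. This is not the authors' DSIC implementation, whose details the abstract does not give. The class `IncrementalKVStore` and the function `recover` are hypothetical names, and the sketch assumes a string-keyed key-value store in which only the latest value of each key matters, so repeated writes to one key collapse into a single checkpoint entry.

```python
import json


class IncrementalKVStore:
    """Toy in-memory key-value store with incremental checkpoints (illustrative only)."""

    def __init__(self):
        self.data = {}        # live in-memory state
        self.dirty = set()    # keys written since the last checkpoint
        self.deleted = set()  # keys removed since the last checkpoint

    def put(self, key, value):
        self.data[key] = value
        self.dirty.add(key)
        self.deleted.discard(key)

    def delete(self, key):
        self.data.pop(key, None)
        self.dirty.discard(key)
        self.deleted.add(key)

    def checkpoint(self):
        """Serialize only the changes made since the previous checkpoint."""
        delta = {
            "set": {k: self.data[k] for k in sorted(self.dirty)},
            "del": sorted(self.deleted),
        }
        self.dirty.clear()
        self.deleted.clear()
        return json.dumps(delta)


def recover(deltas):
    """Rebuild the state by applying the checkpoint deltas in order."""
    state = {}
    for raw in deltas:
        delta = json.loads(raw)
        state.update(delta["set"])
        for key in delta["del"]:
            state.pop(key, None)
    return state


store = IncrementalKVStore()
for i in range(100):          # 100 commands on one key...
    store.put("counter", i)
store.put("a", 1)
cp1 = store.checkpoint()      # ...produce a single entry for "counter"

store.delete("a")
store.put("b", 2)
cp2 = store.checkpoint()      # the second delta holds only the new changes

restored = recover([cp1, cp2])
```

In this sketch a command log would hold 103 entries for the same workload, while the two deltas together hold four, which is the kind of redundancy the abstract attributes to command logging.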

Original language: English
Journal: IEEE Transactions on Emerging Topics in Computing
State: Accepted/In press - 2020

Keywords

  • Checkpointing
  • Data structures
  • Databases
  • Fault tolerance
  • in-memory databases
  • incremental checkpointing
  • key-value stores
  • Metadata
  • Real-time systems
  • Servers
  • Stream processing systems
