Introducing Polyglot-Based Data-Flow Awareness to Time-Series Data Stores

The rising interest in extracting value from data has led to a broad proliferation of monitoring infrastructures, most notably composed of sensors, intended to collect this new oil. Gathering data has thus become fundamental for a great number of applications, such as predictive maintenance techniques and anomaly detection algorithms. However, before data can be refined into insights and knowledge, it has to be stored efficiently and prepared for later retrieval. As a consequence of this sensor and IoT boom, time-series databases (TSDBs), designed to manage sensor data, have become the fastest-growing database category since 2019. Here we propose a holistic approach intended to improve the performance and efficiency of TSDBs. More precisely, we introduce and evaluate a novel polyglot-based approach that tailors the data store not only to time-series data, as is done conventionally, but also to the data flow itself, from ingestion to retrieval. To evaluate the approach, we materialize it in an alternative implementation of NagareDB, a resource-efficient time-series database based on MongoDB, which is in turn the most popular NoSQL storage solution. After implementing our approach in the database, we observe a global speed-up: queries are solved up to 12 times faster than with MongoDB’s recently launched time-series capability, and the solution generally outperforms InfluxDB, the most popular time-series database. Our polyglot-based, data-flow-aware solution can ingest data more than twice as fast as MongoDB, InfluxDB, and NagareDB’s original implementation, while using the same disk space as InfluxDB and half of that required by MongoDB.
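The abstract does not describe the internal data models, so the following is only a toy sketch of the general idea of tailoring storage to the data flow: one layout that is cheap to write at ingestion time, and a second layout that is cheap to read at retrieval time. The class `DataFlowAwareStore` and its methods `ingest`, `consolidate`, and `query` are hypothetical names chosen for illustration; they are not NagareDB's API, and the two layouts used here are assumptions rather than the authors' design.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Reading = Tuple[int, str, float]  # (timestamp, sensor_id, value)
Point = Tuple[int, float]         # (timestamp, value)


class DataFlowAwareStore:
    """Toy two-stage store (hypothetical, not NagareDB's actual design)."""

    def __init__(self) -> None:
        # Stage 1 (ingestion): append-only log, the cheapest write path.
        self._ingest_log: List[Reading] = []
        # Stage 2 (retrieval): one time-sorted series per sensor.
        self._columns: Dict[str, List[Point]] = defaultdict(list)

    def ingest(self, timestamp: int, sensor_id: str, value: float) -> None:
        """Append a reading without reorganizing anything."""
        self._ingest_log.append((timestamp, sensor_id, value))

    def consolidate(self) -> None:
        """Move readings from the ingestion layout to the retrieval layout."""
        for timestamp, sensor_id, value in self._ingest_log:
            self._columns[sensor_id].append((timestamp, value))
        for series in self._columns.values():
            series.sort(key=lambda point: point[0])
        self._ingest_log.clear()

    def query(self, sensor_id: str, start: int, end: int) -> List[Point]:
        """Return points with start <= timestamp <= end from both stages."""
        consolidated = [
            p for p in self._columns.get(sensor_id, []) if start <= p[0] <= end
        ]
        pending = [
            (t, v)
            for t, s, v in self._ingest_log
            if s == sensor_id and start <= t <= end
        ]
        return sorted(consolidated + pending, key=lambda point: point[0])


if __name__ == "__main__":
    store = DataFlowAwareStore()
    store.ingest(2, "s1", 20.0)
    store.ingest(1, "s1", 10.0)
    store.consolidate()          # data moves to the read-friendly layout
    store.ingest(3, "s1", 30.0)  # newer data is still in the write-friendly layout
    print(store.query("s1", 1, 3))
```

In this sketch, queries have to merge both stages, which is the price of keeping the write path trivial; a real system would decide when and how to consolidate based on its workload.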

Keywords: time; polyglot based; data flow; time series

Journal Title: IEEE Access
Year Published: 2022
