8 Ways Data Scientists Can Optimize Their Parquet Queries

Some data formats are columnar. This means they store information by column rather than by row. They are popular because they handle certain types of queries more efficiently than row-based formats. Parquet supports parallel query processing, meaning it can split your data into several files so that multiple processors can read them at […]

Deep Dive into Databricks Tempo for Time Series Analytics

Time-series data has typically been fit imperfectly into whatever database we were using at the time for other tasks. Time series databases (TSDBs) are now coming to market. TSDBs are optimized to store and retrieve associated pairs of times and values. A TSDB's architecture focuses on time-stamped data storage and the compression, summarization and life-cycle management […]