Druid

What is Druid?

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. It excels as a data warehousing solution for fast aggregate queries on petabyte-sized data sets, and it supports flexible filters, exact calculations, and approximate algorithms.

A modern cloud-native, stream-native, analytics database

Druid is designed for workflows where fast queries and fast ingestion matter. Druid excels at instant data visibility, ad-hoc queries, operational analytics, and handling high concurrency. Consider Druid as an open source alternative to data warehouses for a variety of use cases.

Easy integration with your existing data pipelines

Druid can natively stream data from message buses such as Kafka and Amazon Kinesis, and batch load files from data lakes such as HDFS and Amazon S3.
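
As a rough illustration of the streaming path, the sketch below builds a Kafka supervisor spec as a Python dict and shows how it might be submitted over HTTP. The datasource, topic, column names, and the `http://localhost:8888` base URL are assumptions for illustration only, and the exact set of supported spec fields depends on the Druid version, so treat this as a starting point to check against the ingestion documentation.

```python
import json
import urllib.request


def build_kafka_supervisor_spec(datasource, topic, bootstrap_servers,
                                timestamp_column, dimensions):
    """Return a Kafka supervisor spec as a plain dict (illustrative values)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes event timestamps are ISO 8601 strings.
                "timestampSpec": {"column": timestamp_column, "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                "granularitySpec": {
                    "segmentGranularity": "hour",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                # Assumes each Kafka message is a single JSON object.
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


def submit_supervisor(spec, base_url="http://localhost:8888"):
    """POST the spec to the supervisor endpoint.

    base_url is an assumed local Router address; adjust for your cluster.
    """
    request = urllib.request.Request(
        base_url + "/druid/indexer/v1/supervisor",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `build_kafka_supervisor_spec("clickstream", "clicks", "localhost:9092", "timestamp", ["user_id", "page"])` produces a spec that could be passed to `submit_supervisor` against a running cluster. All of these names are hypothetical.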

Up to 100x faster than traditional solutions

Druid has been benchmarked to greatly outperform legacy solutions for data ingestion and data querying. Druid’s novel architecture combines the best of data warehouses, timeseries databases, and search systems.

Unlock new workflows

Druid unlocks new types of queries and workflows for clickstream, APM, supply chain, network telemetry, digital marketing, and many other forms of event-driven data. Druid is purpose-built for rapid, ad-hoc queries on both real-time and historical data.
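
To make the idea of an ad-hoc query concrete, here is a minimal sketch that builds a request body for Druid's SQL HTTP endpoint and, optionally, sends it. It combines an exact `COUNT(*)` with the approximate `APPROX_COUNT_DISTINCT` aggregator over the last hour of data. The datasource and column names and the `http://localhost:8888` base URL are assumptions, not details from this page.

```python
import json
import urllib.request


def build_sql_request(datasource, dimension, distinct_column, limit=10):
    """Build a JSON body for the SQL endpoint: top values in the last hour.

    Identifiers are interpolated directly into the SQL text, so they must
    come from trusted code, never from user input.
    """
    query = (
        f'SELECT "{dimension}", COUNT(*) AS events, '
        f'APPROX_COUNT_DISTINCT("{distinct_column}") AS approx_users '
        f'FROM "{datasource}" '
        "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR "
        f'GROUP BY "{dimension}" ORDER BY events DESC LIMIT {int(limit)}'
    )
    return {"query": query, "resultFormat": "object"}


def run_sql(payload, base_url="http://localhost:8888"):
    """POST the payload to the SQL endpoint and return the decoded rows.

    base_url is an assumed local Router address; adjust for your cluster.
    """
    request = urllib.request.Request(
        base_url + "/druid/v2/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_sql(build_sql_request("clickstream", "page", "user_id"))` against a running cluster would return one row per page. Because `__time` is Druid's built-in timestamp column, the time filter lets the same query cover freshly ingested and historical data.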

Deploy in AWS/GCP/Azure, hybrid clouds, Kubernetes, and bare metal

Druid can be deployed in any *NIX environment on commodity hardware, both in the cloud and on premises. Deploying Druid is easy: scaling up and down is as simple as adding and removing Druid services.

official druid.apache.org


src stackshare.io/druid