Platform
The open source data platform
Combining best practices
Popular data apps, simple to use
Stackable gives you a curated selection of the best open source data apps like Kafka, Druid, Trino or Spark. Store, process and visualize your data with the latest versions. Stay with the curve, not behind it.
All apps work together seamlessly and can be added or removed in no time. Based on Kubernetes, the platform runs anywhere, on-premises or in the cloud.
Use it to create unique, enterprise-level data architectures. It supports, for example, modern data warehouses, data lakes, event streaming, machine learning, and data meshes.
Operators of the Platform
Stackable modules are regular Kubernetes operators. We chose Rust as the programming language because of its excellent performance, low memory requirements, and memory and thread safety.
The Stackable Kafka Operator is a tool for automatically rolling out and managing Apache Kafka in Kubernetes clusters. It supports Stackable authorization and monitoring.
The Stackable Druid Operator is a tool that can manage Apache Druid clusters. Apache Druid is a real-time database to power modern analytics applications.
The Spark Operator is a tool that makes it possible to roll out a Spark cluster on Kubernetes in standalone mode. It also offers the possibility to start Spark jobs on the cluster.
The Stackable Apache Superset Operator is a tool that can manage Apache Superset. Apache Superset is a modern data exploration and visualization platform. With Stackable, Superset is configured to work with Trino and Apache Druid.
The Stackable Trino Operator is a tool that can manage Trino clusters, configured to access data stored in HDFS or any S3-compatible cloud storage. Trino is a fast, highly parallel, distributed query engine for big data analytics.
The Stackable Airflow Operator is a tool that can manage Apache Airflow clusters. Airflow is a workflow engine that lets you programmatically create, run, and monitor data pipelines, and it can serve as a replacement for Apache Oozie.
The Stackable NiFi Operator is a tool for automatically rolling out and managing Apache NiFi. NiFi supports powerful and scalable data flows.
The Stackable OPA (OpenPolicyAgent) Operator is a tool that can manage OPA servers. With OPA, rules and guidelines for data access can be flexibly defined “as code”.
The Stackable HBase Operator is a tool that can manage Apache HBase clusters. HBase is a distributed, scalable big data store.
The Stackable HDFS Operator is a tool that can manage Apache HDFS clusters. HDFS is a distributed file system that provides high-throughput access to application data.
The Stackable Hive Operator is a tool that can manage Apache Hive. Currently, it supports the Hive Metastore. The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.
The Stackable ZooKeeper Operator is a tool that can automatically roll out and manage Apache ZooKeeper ensembles. Apache ZooKeeper is used by many big data products as a highly reliable coordinator of distributed systems.
How it works
From simple to complex environments with infrastructure-as-code
Stackable gives you the flexibility to define both simple and complex data scenarios. Either way, the setup always follows the same two steps:
1. Select the Stackable operators for the data apps you need for your data platform and install them using stackablectl or directly via Helm.
2. Install your data apps in the Kubernetes cluster by passing the appropriate configurations (custom resources based on the operators' CRDs) to the operators using stackablectl or directly via kubectl.
All of these definitions are maintained in an infrastructure-as-code fashion, so that even the setup remains testable, repeatable, and open to standardization.
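To illustrate what such a definition might look like in practice, here is a minimal sketch in Python that builds the kind of configuration passed to an operator in step two and prints it as JSON. The API group, kind, and field layout (`zookeeper.stackable.tech/v1alpha1`, `ZookeeperCluster`, `spec.servers.roleGroups`) are assumptions made for illustration and may not match the schema of your operator release, so check the operator's CRD reference before using it. The helper name `build_cluster_manifest` and the cluster name `simple-zk` are hypothetical.

```python
import json


def build_cluster_manifest(name: str, replicas: int) -> dict:
    """Build a hypothetical ZooKeeper custom resource as a plain dict."""
    return {
        # Assumed API group/version and kind; verify against the installed CRD.
        "apiVersion": "zookeeper.stackable.tech/v1alpha1",
        "kind": "ZookeeperCluster",
        "metadata": {"name": name},
        "spec": {
            # Assumed field layout, shown for illustration only.
            "servers": {
                "roleGroups": {"default": {"replicas": replicas}},
            },
        },
    }


# Keeping the definition in code (or in a versioned file) is what makes
# the setup repeatable: the same input always yields the same manifest.
manifest = build_cluster_manifest("simple-zk", 3)
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
print(manifest_json)
```

Because kubectl accepts JSON as well as YAML, the printed output could be saved to a file, committed to version control, and applied with `kubectl apply -f <file>` once the matching operator is installed.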