This is Lakekeeper: an Apache-licensed, secure, fast, and easy-to-use Apache Iceberg REST Catalog written in Rust.
Iceberg brings the transactional consistency of data warehouses to data lakes, while enabling modular horizontal scaling of compute and storage.
Manage data access policies centrally - without duplication in compute engines.
Lakekeeper connects to open external permission systems like OpenFGA and can expose permissions via Open Policy Agent (OPA). This enables best-in-class integration with query engines like Trino that support external permission systems.
Use your own identity provider for authentication. Lakekeeper never generates (API) tokens itself. You already have an IdP. Let's use it!
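To make the idea of centrally managed permissions concrete, here is a minimal sketch in Rust. It is not Lakekeeper's, OpenFGA's, or OPA's API: `PolicyStore`, `grant`, and `check` are hypothetical names, and the (user, relation, object) tuple is only loosely modeled on relationship-based authorization. The point is that several engines ask one store instead of each keeping its own copy of the policy.

```rust
use std::collections::HashSet;

/// Hypothetical relationship tuple: (user, relation, object).
type Tuple = (String, String, String);

/// Hypothetical central policy store. All names here are illustrative
/// assumptions, not part of any real Lakekeeper, OpenFGA, or OPA API.
pub struct PolicyStore {
    tuples: HashSet<Tuple>,
}

impl PolicyStore {
    pub fn new() -> Self {
        PolicyStore { tuples: HashSet::new() }
    }

    /// Record that `user` holds `relation` on `object`.
    pub fn grant(&mut self, user: &str, relation: &str, object: &str) {
        self.tuples
            .insert((user.to_string(), relation.to_string(), object.to_string()));
    }

    /// Return true only if the exact tuple was granted.
    pub fn check(&self, user: &str, relation: &str, object: &str) -> bool {
        self.tuples
            .contains(&(user.to_string(), relation.to_string(), object.to_string()))
    }
}

fn main() {
    let mut store = PolicyStore::new();
    store.grant("alice", "select", "table:sales.orders");

    // Two different engines consult the same store, so the policy
    // is defined once rather than duplicated per engine.
    for engine in ["engine-a", "engine-b"] {
        let allowed = store.check("alice", "select", "table:sales.orders");
        println!("{engine}: alice may select = {allowed}");
    }
}
```

A real deployment would delegate `check` to the external permission system rather than an in-memory set; the sketch only shows the single point of decision.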
Optimize query performance for all your data lake engines with out-of-the-box automated compaction and maintenance strategies. (Coming soon)
Single binary executable for all major platforms; no JVM or Python env required. Native Kubernetes deployments with Helm chart or k8s operator. UI and batteries included.
Lakekeeper secures access to your data for on-premise and cloud deployments using vended credentials and remote signing for S3.
Lakekeeper can emit change events to event queues like NATS or Kafka to keep stakeholders informed.
Lakekeeper uses a normalized relational database model internally. This allows us to add powerful new endpoints and statistics in the future without file-system access!
There is no local state, so the catalog can easily be scaled horizontally. Autoscaling is included in the Helm chart.
See something that's missing? Build it! Lakekeeper is meant to be extended. And because Lakekeeper is written in Rust, you can use powerful Rust traits to do so.
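As a rough illustration of what trait-based extension can look like, the sketch below defines an extension point for the change events mentioned above. Everything in it is an assumption made for the example: `ChangeEvent`, `EventPublisher`, `InMemoryPublisher`, and `record_table_created` are hypothetical names, not Lakekeeper's actual traits or types.

```rust
/// Hypothetical change event; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ChangeEvent {
    pub table: String,
    pub operation: String,
}

/// Hypothetical extension point: anything that can receive change events.
pub trait EventPublisher {
    fn publish(&mut self, event: ChangeEvent) -> Result<(), String>;
}

/// In-memory implementation, standing in for a NATS or Kafka backend.
#[derive(Default)]
pub struct InMemoryPublisher {
    pub events: Vec<ChangeEvent>,
}

impl EventPublisher for InMemoryPublisher {
    fn publish(&mut self, event: ChangeEvent) -> Result<(), String> {
        self.events.push(event);
        Ok(())
    }
}

/// Catalog-side code depends only on the trait, not on a concrete backend.
pub fn record_table_created(
    publisher: &mut dyn EventPublisher,
    table: &str,
) -> Result<(), String> {
    publisher.publish(ChangeEvent {
        table: table.to_string(),
        operation: "create".to_string(),
    })
}

fn main() {
    let mut publisher = InMemoryPublisher::default();
    record_table_created(&mut publisher, "sales.orders")
        .expect("the in-memory publisher never returns an error");
    println!("{:?}", publisher.events);
}
```

Because the calling code only sees `dyn EventPublisher`, a new backend is added by implementing the trait for another type, without touching the code that emits events.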
We are not bound to a query engine vendor. We care about the whole ecosystem and interoperability.
Lakekeeper is written in Rust and based on `iceberg-rust`. No unsafe code - guaranteed!
The first meetup will be on April 2nd in Amsterdam at 17:00. Sign up and find more details on the event page.
Join us on the Trino Community Broadcast on Thursday, March 13th, where we demonstrate Lakekeeper's OPA integration with Trino.
Release 0.7.0 adds support for s3a and s3n filesystems, improves reverse-proxy support, and introduces table and view statistics!
Release 0.6.0 focuses on security, introducing Lakekeeper's OPA bridge for Trino integration, with a new "check" endpoint for simpler permission queries. It also supports Iceberg versions 1.5 to 1.7, adds automatic file cleanup for managed tables, and fixes ADLS cleanup issues.
Release 0.5.0 is the biggest yet, featuring a new UI, detailed docs, and table-level access controls. It adds native support for Kubernetes Service Accounts and improves integration with external IdPs like EntraID and Keycloak.