Lakehouse analytics


EDB Postgres Lakehouse extends Postgres to analytical workloads by adding a vectorized query engine and separating storage from compute. Building a data lakehouse has never been easier: just use Postgres.

Rapid analytics for Postgres

Postgres Lakehouse is a core offering of the EDB Postgres® AI platform, extending Postgres to support analytical queries over columnar data in object storage, while keeping the simplicity and ease of use that Postgres users love.

With Postgres Lakehouse, you can query your Postgres data with a Lakehouse node, an ephemeral, scale-to-zero compute resource powered by Postgres that's optimized for vectorized query execution over columnar data.

Postgres native

Never leave the Postgres ecosystem.

Postgres Lakehouse nodes run either EDB Postgres Advanced Server (EPAS) or EDB Postgres Extended (PGE) as the Postgres engine. Data for analytics is stored as columnar tables in object storage using the open source Delta Lake protocol.

EDB Postgres Lakehouse is “just Postgres” – you can query it with any Postgres client, and it fully supports all Postgres queries, functions and statements, so there's no need to change existing queries or reconfigure business intelligence software.

Vectorized execution

Postgres Lakehouse uses Apache DataFusion's vectorized SQL query engine to execute analytical queries 5-100x faster (30x on average) than native Postgres, while still falling back to native execution when necessary.
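To make "vectorized" concrete, here is a minimal sketch of the general idea in plain Python. It is not DataFusion's implementation and does not reproduce its performance. It only contrasts the two execution shapes: one row per step versus one batch of column values per step. The function names, the `(quantity, price)` schema, and the batch size are assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, float]  # hypothetical schema: (quantity, price)


def row_at_a_time_total(rows: Iterable[Row], min_quantity: int) -> float:
    """Row-oriented execution: filter, multiply, and add one row per step."""
    total = 0.0
    for quantity, price in rows:
        if quantity >= min_quantity:
            total += quantity * price
    return total


def batches(
    quantities: List[int], prices: List[float], batch_size: int
) -> Iterator[Tuple[List[int], List[float]]]:
    """Yield aligned slices of each column, one batch at a time."""
    for start in range(0, len(quantities), batch_size):
        end = start + batch_size
        yield quantities[start:end], prices[start:end]


def vectorized_total(
    quantities: List[int],
    prices: List[float],
    min_quantity: int,
    batch_size: int = 1024,
) -> float:
    """Batch-oriented execution: each operator handles a whole batch per call."""
    total = 0.0
    for q_batch, p_batch in batches(quantities, prices, batch_size):
        # Filter operator: compute a selection mask for the whole batch.
        mask = [q >= min_quantity for q in q_batch]
        # Projection operator: compute quantity * price for the whole batch.
        revenue = [q * p for q, p in zip(q_batch, p_batch)]
        # Aggregation operator: sum only the selected values.
        total += sum(r for r, keep in zip(revenue, mask) if keep)
    return total
```

Both functions return the same result. A real vectorized engine gets its speedup by running each per-batch operator as a tight loop over contiguous column data, which this pure-Python sketch only imitates structurally.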

Columnar storage

Postgres Lakehouse is optimized to query "Lakehouse Tables" in object storage, extending the power of an open source database to open table formats. Currently, it supports querying "Delta Tables" stored according to the Delta Lake protocol.
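As a rough illustration of why a columnar layout suits analytical queries, the sketch below pivots a few row-shaped records into per-column lists and then aggregates a single column without touching the others. This is a toy in-memory model, not the Delta Lake on-disk format. The `orders` data and the helper names are hypothetical, and the sketch assumes every row has the same keys.

```python
from typing import Dict, List


def to_columns(rows: List[dict]) -> Dict[str, list]:
    """Pivot row-shaped records into one list of values per column."""
    columns: Dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def average(columns: Dict[str, list], name: str) -> float:
    """Aggregate a single column; the other columns are never read."""
    values = columns[name]
    return sum(values) / len(values)


# Hypothetical sample data in row form, as a transactional table would hold it.
orders = [
    {"order_id": 1, "region": "EU", "amount": 40.0},
    {"order_id": 2, "region": "US", "amount": 10.0},
    {"order_id": 3, "region": "EU", "amount": 25.0},
]

columns = to_columns(orders)
print(average(columns, "amount"))
```

An analytical query such as an average over one column only needs that column's values, so a columnar layout lets the engine skip the rest of each row.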

Lakehouse Sync

You can sync your own data from tables in transactional sources (initially, EDB Postgres® AI Cloud Service databases) into Lakehouse Tables in Storage Locations (initially, managed locations in S3 object storage).

Fully managed service

You can launch Postgres Lakehouse nodes using the EDB Postgres AI Cloud Service (formerly EDB BigAnimal). Point a Lakehouse node at a storage bucket with some Delta Tables in it, and get results of analytical (OLAP) queries in less time than if you queried the same data in a transactional Postgres database.

Postgres Lakehouse nodes are available now for customers using EDB Postgres AI - Hosted environments on AWS, and will be rolling out to additional cloud environments soon.

Try it today

It's easy to start using Postgres Lakehouse. Provision a Lakehouse node in five minutes, and start querying pre-loaded benchmark data like TPC-H, TPC-DS, ClickBench, and the 1 Billion Row Challenge.

Concepts

Learn about the ideas and terminology behind EDB Postgres Lakehouse for analytics workloads.

Quick Start

Launch a Lakehouse node and query sample data.

Reference

Things to know about EDB Postgres Lakehouse.

Lakehouse Sync

How to perform a Lakehouse Sync.

