Alternatives
Below is a summary of how BemiDB compares to other solutions for analytical workloads.
BemiDB Performance Highlights
BemiDB can run complex analytical queries much faster than PostgreSQL. The table below shows total run times for the TPC-H benchmark (22 queries run sequentially) at two scale factors:
Scale Factor | Configuration | Time | Notes |
---|---|---|---|
0.1 | BemiDB (unindexed) | 2.3s | |
0.1 | PostgreSQL (unindexed) | 1h23m13s | ~2,170× slower than BemiDB |
0.1 | PostgreSQL (indexed) | 1.5s | ~99.97% less time than unindexed PostgreSQL |
1.0 | BemiDB (unindexed) | 25.6s | |
1.0 | PostgreSQL (unindexed) | ∞ | Could not complete |
1.0 | PostgreSQL (indexed) | 1h34m40s | ~220× slower than BemiDB |
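The ratios in the Notes column can be re-derived from the times in the table. The following is a minimal sketch of that arithmetic; the helper `to_seconds` and the variable names are illustrative and not part of BemiDB.

```python
def to_seconds(hours=0, minutes=0, seconds=0.0):
    """Convert an h/m/s duration into seconds."""
    return hours * 3600 + minutes * 60 + seconds

# Total run times copied from the benchmark table.
sf01_bemidb = to_seconds(seconds=2.3)
sf01_pg_unindexed = to_seconds(hours=1, minutes=23, seconds=13)
sf01_pg_indexed = to_seconds(seconds=1.5)
sf1_bemidb = to_seconds(seconds=25.6)
sf1_pg_indexed = to_seconds(hours=1, minutes=34, seconds=40)

# How many times longer PostgreSQL took than BemiDB.
slowdown_sf01 = sf01_pg_unindexed / sf01_bemidb
slowdown_sf1 = sf1_pg_indexed / sf1_bemidb

# Share of the unindexed PostgreSQL run time removed by adding indexes.
index_reduction = (1 - sf01_pg_indexed / sf01_pg_unindexed) * 100

print(f"SF 0.1: unindexed PostgreSQL is about {slowdown_sf01:,.0f}x slower than BemiDB")
print(f"SF 0.1: indexes cut PostgreSQL run time by about {index_reduction:.2f}%")
print(f"SF 1.0: indexed PostgreSQL is about {slowdown_sf1:,.0f}x slower than BemiDB")
```

The printed values agree with the table's rounded figures to within about 1%.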
BemiDB vs. PostgreSQL
PostgreSQL Pros
- Widely adopted general-purpose transactional (OLTP) database.
- Works for small-scale analytical queries.
PostgreSQL Cons
- Slower for analytical (OLAP) queries on medium/large datasets.
- Requires creating indexes tailored to specific analytical queries, which slows down writes.
- Materialized views need manual upkeep and become slow to refresh as data grows.
- Tuning for known queries may not help when ad-hoc analytical queries vary.
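The point about query-specific indexes can be illustrated with a small sketch. It uses SQLite from the Python standard library as a stand-in, not PostgreSQL, so the plan text differs from what PostgreSQL's `EXPLAIN` prints; the `orders` table, its columns, and the index name are hypothetical.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a row-oriented OLTP store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, status, amount) VALUES (?, ?, ?)",
    [(i % 100, "shipped" if i % 2 else "pending", float(i)) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable query plan that SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

by_customer = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"
by_status = "SELECT SUM(amount) FROM orders WHERE status = 'shipped'"

# Without an index, the aggregate has to scan the whole table.
plan_before = plan(by_customer)

# An index built for this one query shape lets the engine seek instead of scan.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(by_customer)

# A different ad-hoc filter gets no benefit from that index.
plan_other = plan(by_status)

print("before index:   ", plan_before)
print("after index:    ", plan_after)
print("different query:", plan_other)
```

Each new query shape needs its own index, and every extra index has to be maintained on each insert or update, which is the write-side cost noted in the list above.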
BemiDB vs. PostgreSQL Extensions
PostgreSQL Extensions Pros
- Large ecosystem of open-source extensions.
- Community-driven development.
PostgreSQL Extensions Cons
- Analytical queries run through extensions add overhead that can slow down transactional queries on the same server.
- Many managed PostgreSQL services limit which extensions you can install.
- Increase maintenance overhead when upgrading PostgreSQL versions.
- Often require manual data syncing or schema mapping if they store data differently.
Main categories of PostgreSQL extensions for analytics:
- Foreign Data Wrapper (FDW) extensions (e.g., parquet_fdw, parquet_s3_fdw):
  - Pros: Query columnar formats like Parquet directly from PostgreSQL.
  - Cons: Not optimized for large-scale analytical queries.
- OLAP query engine extensions (e.g., pg_duckdb, pg_analytics):
  - Pros: Provide an integrated analytical engine.
  - Cons: Require creating foreign tables and functions; data layers are usually separate and unoptimized.
BemiDB vs. DuckDB
DuckDB Pros
- Built for OLAP workloads.
- Simple setup with a single binary.
DuckDB Cons
- Limited support across the data ecosystem, such as notebooks and BI tools.
- Requires manual syncing and schema mapping for best performance.
- Does not offer all features of a full database (e.g., no native Iceberg writes).
BemiDB vs. Real-Time OLAP Databases (ClickHouse, Druid, etc.)
Real-Time OLAP Databases Pros
- Very fast for real-time analytics.
- Purpose-built for high-performance queries.
Real-Time OLAP Databases Cons
- Complex to deploy and maintain at scale.
- Data mutability can be restricted.
- Steeper learning curve.
- Often need manual data syncing and schema mapping.
BemiDB vs. Big Data Query Engines (Spark, Trino, etc.)
Big Data Query Engines Pros
- Distributed SQL engines for large-scale datasets.
- Handle batch analytics well.
Big Data Query Engines Cons
- Require multiple components (ZooKeeper, JVM, etc.), increasing setup complexity.
- Typically lack a native storage layer.
- Rely on manual data syncing and schema mapping.
BemiDB vs. Proprietary Solutions (Snowflake, Redshift, BigQuery, Databricks, etc.)
Proprietary Solutions Pros
- Fully managed cloud data warehouses or lakehouses.
- Optimized for large-scale OLAP queries.
Proprietary Solutions Cons
- Costs can be higher than those of the other alternatives.
- Lock-in to a single vendor’s ecosystem.
- Require separate systems or workflows for data syncing and schema management.