A plain-language guide to what Apache Iceberg is, why it won the table format race, how it works under the hood, and what makes it the right foundation for a modern lakehouse.
What Is a Table Format? (The Filing Cabinet Analogy)
Imagine object storage (S3, GCS, Azure Blob) as a giant warehouse full of boxes. Each box is a Parquet file — it contains data, neatly organized in columns. But the warehouse has no index, no catalog, no version history. You can throw boxes in, but you can't reliably find, update, or delete anything. Want to know what changed yesterday? Good luck.
A table format is the filing system you put on top of that warehouse. It provides:
- A catalog — "here are all your tables and where they live"
- An index — "this file contains rows where date = 2026-05-01, skip the rest"
- A version history — "here's what the table looked like an hour ago"
- Transaction guarantees — "two writers can't corrupt the same table"
Without a table format — a library where books are dumped in piles on the floor. You can store millions of books, but finding one requires searching every pile.
With a table format — that same library with a card catalog, Dewey Decimal system, and a checkout log. Same books, but now you can find anything in seconds and know exactly who has what.
Three table formats competed for the industry standard:
- Apache Iceberg (Netflix, 2018) — designed for correctness and engine independence
- Delta Lake (Databricks, 2019) — designed for the Spark ecosystem
- Apache Hudi (Uber, 2019) — designed for streaming upserts
Iceberg won. It has been adopted by Snowflake, Databricks (which now supports it alongside Delta), AWS, Google Cloud, Apple, Netflix, and effectively every major data platform vendor. It is the only format that all major engines support natively.
What Iceberg Actually Does (Plain Language)
Here's what Iceberg gives you, explained without jargon:
ACID Transactions — Two ETL jobs writing to the same table at 2am won't corrupt each other's work. Every write is an atomic commit: it either fully succeeds or nothing changes. No "half-written" tables.
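The atomic commit can be pictured as a compare-and-swap on the table's metadata pointer. The following is a minimal sketch in plain Python, not Iceberg's API: `ToyCatalog`, its methods, and the integer "metadata versions" are hypothetical names invented for illustration.

```python
import threading


class ToyCatalog:
    """Hypothetical in-memory catalog: one metadata pointer per table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pointer = {}  # table name -> current metadata version

    def current(self, table):
        return self._pointer.get(table, 0)

    def compare_and_swap(self, table, expected, new):
        # The pointer moves only if no other writer committed in between.
        with self._lock:
            if self._pointer.get(table, 0) != expected:
                return False
            self._pointer[table] = new
            return True


catalog = ToyCatalog()

# Two writers start from the same base version.
base_a = catalog.current("orders")
base_b = catalog.current("orders")

ok_a = catalog.compare_and_swap("orders", base_a, base_a + 1)  # first commit wins
ok_b = catalog.compare_and_swap("orders", base_b, base_b + 1)  # stale base, rejected

# Writer B re-reads the table state and retries on top of A's commit.
base_b = catalog.current("orders")
ok_b_retry = catalog.compare_and_swap("orders", base_b, base_b + 1)
```

The rejected writer loses nothing: it simply rebuilds its commit on the newer table state and tries again, so readers never observe a half-applied write.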
Schema Evolution — Need to add a column? Rename one? Drop one? Iceberg handles it without rewriting a single data file. Your historical data stays untouched, and new queries see the updated schema instantly.
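One way to see why no data files need rewriting: Iceberg tracks columns by a stable field ID rather than by name, so a rename or an added column is a metadata change. The sketch below is a toy model of that idea; the dictionaries and the `read` helper are assumptions for illustration, not a real file layout.

```python
# Rows in an old data file, keyed by field ID rather than column name.
old_file_rows = [{1: 101, 2: "alice"}, {1: 102, 2: "bob"}]

schema_v1 = {1: "id", 2: "name"}
# Metadata-only changes: field 2 is renamed and field 3 is added.
schema_v2 = {1: "id", 2: "full_name", 3: "email"}


def read(rows, schema):
    """Project stored rows through a schema; fields absent from the file read as None."""
    return [{name: row.get(fid) for fid, name in schema.items()} for row in rows]


rows_v1 = read(old_file_rows, schema_v1)
rows_v2 = read(old_file_rows, schema_v2)
```

The same untouched file yields `name` under the old schema and `full_name` plus an empty `email` under the new one.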
Partition Evolution — Started partitioning by day, but now need to partition by hour? Iceberg changes the strategy going forward without rewriting existing data. Other formats require a full rewrite — which can take hours or days for large tables.
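A rough sketch of how old and new layouts can coexist: each data file remembers which partition spec it was written under, and planning applies the matching spec per file. The spec IDs, file names, and `plan` function below are hypothetical.

```python
from datetime import datetime

# Hypothetical partition specs: spec 0 partitions by day, spec 1 by hour.
specs = {
    0: lambda ts: ts.strftime("%Y-%m-%d"),
    1: lambda ts: ts.strftime("%Y-%m-%d-%H"),
}

# Each file records the spec it was written with; old files are never rewritten.
files = [
    {"path": "old-1.parquet", "spec_id": 0, "partition": "2026-04-30"},
    {"path": "old-2.parquet", "spec_id": 0, "partition": "2026-05-01"},
    {"path": "new-1.parquet", "spec_id": 1, "partition": "2026-05-02-09"},
    {"path": "new-2.parquet", "spec_id": 1, "partition": "2026-05-02-10"},
]


def plan(ts):
    """Files that may hold rows with exactly this timestamp."""
    return [f["path"] for f in files if specs[f["spec_id"]](ts) == f["partition"]]


before_change = plan(datetime(2026, 5, 1, 10, 15))
after_change = plan(datetime(2026, 5, 2, 10, 15))
```

A query spanning the change date is planned against both layouts, which is why switching from daily to hourly partitions does not force a rewrite of history.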
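Hidden Partitioning — Users just write normal SQL. They filter on the original column (for example, an event timestamp), and Iceberg maps that filter to the matching partitions under the hood. No one has to remember to add a separate partition predicate such as WHERE date = '2026-05-01' to avoid full scans. The format does it for you.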
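As an illustration, the sketch below derives a day partition from a timestamp and uses the same transform to turn a timestamp filter into a partition filter. It is a toy model: `day_transform`, `files_by_day`, and `plan_files` are made-up names, and the file listing is invented.

```python
from datetime import date, datetime


def day_transform(ts):
    """Partition transform: derive the day from a timestamp."""
    return ts.date()


# Hypothetical file listing keyed by the derived partition value.
files_by_day = {
    date(2026, 4, 30): ["f1.parquet"],
    date(2026, 5, 1): ["f2.parquet", "f3.parquet"],
    date(2026, 5, 2): ["f4.parquet"],
}


def plan_files(ts_from, ts_to):
    """Files that may hold rows with ts_from <= event_ts <= ts_to."""
    lo, hi = day_transform(ts_from), day_transform(ts_to)
    return [
        path
        for day in sorted(files_by_day)
        if lo <= day <= hi
        for path in files_by_day[day]
    ]


# The user filters on the timestamp only; the day partition is derived.
selected = plan_files(datetime(2026, 5, 1, 8), datetime(2026, 5, 1, 17))
```

The query author never mentions a `date` column; the partition value is computed from the timestamp on both the write path and the read path.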
Time Travel — Query data as it looked at any point in time. "Show me the customer table as of last Tuesday at 3pm." Built in. Every write creates an immutable snapshot, so you can always go back.
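Resolving "as of last Tuesday at 3pm" amounts to finding the latest snapshot committed at or before that moment. A minimal sketch, assuming a snapshot log sorted by commit time; the log contents and the `snapshot_as_of` helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical snapshot log: (commit time in epoch ms, snapshot id), oldest first.
snapshot_log = [(1_000, 11), (2_000, 22), (3_000, 33)]


def snapshot_as_of(log, ts_ms):
    """Return the id of the latest snapshot committed at or before ts_ms."""
    times = [t for t, _ in log]
    idx = bisect_right(times, ts_ms) - 1
    if idx < 0:
        raise ValueError("no snapshot exists at or before that time")
    return log[idx][1]


between_commits = snapshot_as_of(snapshot_log, 2_500)
exact_commit = snapshot_as_of(snapshot_log, 3_000)
```

Because snapshots are immutable, the answer for a given timestamp stays the same for as long as that snapshot is retained.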
Row-Level Updates and Deletes — Before table formats, object storage was append-only. You couldn't delete one row without rewriting the entire file. Iceberg makes UPDATE and DELETE on individual rows possible and efficient.
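One of the ways Iceberg supports row-level deletes is to leave the data file untouched and record the deleted row positions in a separate delete file, which readers apply at scan time. The sketch below models that merge-on-read idea with plain lists; the file name and helper are hypothetical.

```python
# Rows of one immutable data file, addressed by position.
data_file_path = "data-001.parquet"
data_file_rows = ["row0", "row1", "row2", "row3"]

# A delete file records (data file path, row position) pairs.
position_deletes = {(data_file_path, 1), (data_file_path, 3)}


def read_with_deletes(path, rows, deletes):
    """Return the rows of a data file, skipping positions marked as deleted."""
    return [row for pos, row in enumerate(rows) if (path, pos) not in deletes]


live_rows = read_with_deletes(data_file_path, data_file_rows, position_deletes)
```

The alternative is copy-on-write, where the affected file is rewritten without the deleted rows; which approach is cheaper depends on how often the table is written versus read.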
Metadata Layer — Iceberg's manifest files track, for every data file, its partition values, record count, and per-column statistics such as min/max values. This enables partition pruning and file skipping (ignore files that cannot match the filter), alongside column projection and predicate pushdown in the file reader (read only the columns and rows you need). Queries touch a fraction of the data.
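File skipping from min/max statistics can be sketched as a simple range-overlap test. This is an illustrative model, not Iceberg's planner: `FileStats`, `prune`, and the `order_id` column are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    path: str
    lower: int  # minimum order_id stored in the file
    upper: int  # maximum order_id stored in the file


def prune(files, lo, hi):
    """Keep only files whose [lower, upper] range overlaps the query range [lo, hi]."""
    return [f.path for f in files if f.upper >= lo and f.lower <= hi]


manifest = [
    FileStats("a.parquet", 1, 1000),
    FileStats("b.parquet", 1001, 2000),
    FileStats("c.parquet", 2001, 3000),
]

# WHERE order_id BETWEEN 1500 AND 1600 only needs b.parquet.
selected = prune(manifest, 1500, 1600)
```

The decision is made from metadata alone, so files that cannot match are never opened.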
Why Iceberg Beat Delta Lake and Hudi
Apache Iceberg
- Vendor-neutral — Apache Foundation project, no single company controls it
- Designed by Netflix for correctness at petabyte scale
- Adopted by Snowflake, Databricks, AWS, Google, Apple, Netflix, Dremio, StarRocks, Trino, Spark, and Flink
- Best-in-class schema and partition evolution
- Hidden partitioning — unique to Iceberg
Delta Lake
- Created by Databricks
- Historically tied to Spark — other engines had limited support
- Now more open (UniForm bridges to Iceberg), but remains Databricks-first
- Databricks itself now supports Iceberg natively
- Technically solid, but carries vendor baggage
Apache Hudi
- Created by Uber for streaming upserts
- More complex operational model
- Smaller ecosystem, fewer engine integrations
- Strong for specific CDC and streaming use cases
- Less momentum in the broader market
Iceberg is the only table format that every major engine supports natively. If you bet on Iceberg, you can use any engine, today or tomorrow. That's why it won.
Iceberg Under the Hood
Iceberg organizes data using a metadata tree. Think of it like a chain of pointers: each layer tells you where to find the next, and every layer is immutable — never changed, only replaced.
Catalog → Metadata File → Manifest List → Manifest Files → Data Files (Parquet)
- Catalog: "where is this table?"
- Metadata File: current schema, partitioning, properties
- Manifest List: which snapshots exist
- Manifest Files: which data files, stats, partition info
- Data Files (Parquet): in S3/GCS/ADLS
Each write creates a new metadata snapshot. Old snapshots remain untouched — that's how time travel works. Reads always see a consistent, complete snapshot.
Three properties follow directly from this design:
- Immutability = safety. Old data is never overwritten. If something goes wrong, roll back to a previous snapshot.
- Metadata = speed. Manifest files contain statistics (min/max values, row counts, partition info) so the engine can skip every file that cannot match a query's filter without reading it. On selective queries, that is often the vast majority of files.
- Snapshots = auditability. You can always prove what the data looked like at any point in time — critical for regulated industries.
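The pointer chain and its immutability can be modeled in a few lines. The sketch below is a simplified assumption of the structure (the manifest list is folded into `Snapshot`, and all class and function names are invented), meant only to show that a write adds a snapshot while older ones stay readable.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Manifest:
    data_files: Tuple[str, ...]


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    manifests: Tuple[Manifest, ...]  # stands in for the manifest list


@dataclass(frozen=True)
class TableMetadata:
    snapshots: Tuple[Snapshot, ...]
    current_snapshot_id: int


def append(meta, new_files):
    """Return new metadata with one extra snapshot; existing objects are reused, not modified."""
    current = next(s for s in meta.snapshots if s.snapshot_id == meta.current_snapshot_id)
    new_id = meta.current_snapshot_id + 1
    snap = Snapshot(new_id, current.manifests + (Manifest(tuple(new_files)),))
    return TableMetadata(meta.snapshots + (snap,), new_id)


def files_in(meta, snapshot_id=None):
    """List the data files visible in a snapshot (the current one by default)."""
    sid = meta.current_snapshot_id if snapshot_id is None else snapshot_id
    snap = next(s for s in meta.snapshots if s.snapshot_id == sid)
    return [f for m in snap.manifests for f in m.data_files]


m0 = TableMetadata((Snapshot(0, ()),), 0)
m1 = append(m0, ["a.parquet"])
m2 = append(m1, ["b.parquet"])

# Rolling back just points the table at an older snapshot.
rolled_back = replace(m2, current_snapshot_id=1)
```

Here `files_in(m2)` sees both files, `files_in(m2, 1)` still sees only the first, and the rollback changes a single pointer without touching any data.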
What Iceberg Means for Lock-In
Iceberg fundamentally changes the lock-in equation. Your data is stored as standard Parquet files plus open Iceberg metadata in your own object storage (S3, GCS, ADLS). Any engine that speaks Iceberg can read your data. If you want to switch engines tomorrow, your data stays right where it is — fully readable by Spark, Trino, StarRocks, Snowflake, Databricks, Flink, or any other Iceberg-compatible engine.
Zero migration. Zero export. Zero conversion.
Compare that to the legacy world:
| Vendor | Storage Model | What Leaving Looks Like |
|---|---|---|
| Oracle | Proprietary .dbf files, ASM disk groups | Must export every table; multi-month migration projects |
| Teradata | Proprietary storage on proprietary hardware | Data extraction requires TPT or JDBC exports; historically the hardest platform to leave |
| Snowflake | Internal micro-partitions | Must COPY INTO or UNLOAD; Iceberg tables are a separate table type, and native tables remain proprietary |
| Iceberg lakehouse | Parquet + open Iceberg metadata in your own S3 | Switch engines with a configuration change; data never moves |
"You're not locked in to a vendor. You're locked in to an open standard that is supported by every major vendor in the industry. That's the whole point."
Iceberg Table Maintenance Explained
Even though Iceberg is brilliantly designed, tables require ongoing maintenance. Think of it like a car: the engine is great, but you still need oil changes.
| Maintenance Task | What Happens Without It | What It Does |
|---|---|---|
| Compaction | Hundreds of tiny files accumulate (especially from streaming). Queries slow down because each file has overhead. | Merges small files into larger, optimally sized files. Read performance improves dramatically. |
| Snapshot Expiration | Old snapshots pile up. Storage costs grow even if the underlying data doesn't. Metadata operations slow down. | Removes snapshots older than a configured retention period. Frees storage, speeds up metadata. |
| Orphan File Cleanup | Failed writes leave data files that no snapshot references. They sit in object storage indefinitely, accumulating cost. | Identifies and deletes files no longer referenced by any snapshot. |
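Compaction planning is essentially a bin-packing problem: group small files so that each group can be rewritten as one file near a target size. A minimal greedy sketch follows; the `plan_compaction` function, the 512 MB target, and the sample sizes are assumptions chosen for illustration.

```python
def plan_compaction(file_sizes_mb, target_mb=512):
    """Greedily group files (largest first) so each group stays within target_mb.

    A single file larger than the target ends up in a group of its own.
    """
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


groups = plan_compaction([100, 200, 50, 300, 10, 400])
```

Each resulting group would become one rewritten file, so six input files shrink to three in this example.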
Without maintenance — a filing cabinet where you keep adding folders but never remove old ones, never consolidate loose papers, and never throw away misfiled documents. Eventually the drawers won't close and finding anything takes forever.
With automated maintenance — a filing clerk who works overnight: consolidates loose papers into proper folders, removes expired records, and cleans up misfiled documents. You come in each morning to a clean, fast filing system.
Many teams leave Iceberg maintenance to manual Spark jobs scheduled with cron. When maintenance falls behind, query performance degrades and storage costs balloon. Automating compaction, snapshot expiration, and orphan file cleanup — and making them observable — is what separates a well-run lakehouse from one that slowly deteriorates.
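The difference between snapshot expiration and orphan cleanup is easy to blur, so here is a small sketch of both using sets. The snapshot dictionaries, the `expire` and `referenced` helpers, and the file names are hypothetical; a real job would read this information from table metadata and a storage listing.

```python
def expire(snapshots, cutoff_ts):
    """Split snapshots into (kept, expired); the newest snapshot is always kept."""
    newest = max(snapshots, key=lambda s: s["ts"])
    kept = [s for s in snapshots if s["ts"] >= cutoff_ts or s is newest]
    expired = [s for s in snapshots if s["ts"] < cutoff_ts and s is not newest]
    return kept, expired


def referenced(snapshots):
    """All data files reachable from the given snapshots."""
    return set().union(*(s["files"] for s in snapshots))


snapshots = [
    {"id": 1, "ts": 100, "files": {"a.parquet", "b.parquet"}},
    {"id": 2, "ts": 200, "files": {"b.parquet", "c.parquet"}},
    {"id": 3, "ts": 300, "files": {"c.parquet", "d.parquet"}},
]
storage = {"a.parquet", "b.parquet", "c.parquet", "d.parquet", "failed-write.parquet"}

# Orphans: present in storage but never referenced by any snapshot.
orphans = storage - referenced(snapshots)

# Expiration: files reachable only from expired snapshots become reclaimable.
kept, expired = expire(snapshots, cutoff_ts=250)
reclaimable = referenced(expired) - referenced(kept)
```

Tracking the sizes of `orphans` and `reclaimable` over time is one simple way to make maintenance observable rather than assumed.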
Key Questions for Evaluating Your Current Setup
If you're assessing whether your current data platform is holding you back, these are the right questions to ask:
"What format is your data stored in today?" — If it's a proprietary format (Oracle, Teradata, Snowflake native), you're locked in. If it's raw Parquet without a table format, you have no transactions or schema evolution.
"Can you query last week's version of a table?" — If not, you lack time travel. Debugging data issues means guessing. Audit and compliance are manual.
"What happens when two ETL jobs write to the same table at the same time?" — If the answer is "we schedule them sequentially," you don't have real ACID. Sequential scheduling wastes time and creates fragile pipelines.
"How do you handle schema changes in production?" — If it involves downtime, rewriting tables, or coordinating across teams, you're paying a large hidden tax on every schema change.
"Who maintains your Iceberg tables?" — If it's custom scripts, a dedicated engineer, or "we don't really," your tables are gradually degrading in performance and accumulating storage waste.
"If you needed to switch your query engine tomorrow, how long would it take?" — If the answer is "months" or "we'd have to export everything," you're locked in — and paying for that lock-in every time you negotiate a renewal.
Iceberg addresses each of these. The open format and immutable metadata design aren't just architectural elegance — they have direct operational and commercial consequences for every team running data at scale.