An iceberg is a piece of ice that has broken off from a glacier and floats freely in open water, a definition that also suits the database table format project named Iceberg.
Open-source Apache Iceberg provides full database functionality on top of cloud object stores. It exemplifies how the separation of storage from compute in modern data stacks has allowed scalable, cost-effective computing and improved interaction between various systems.
Two engineers at Netflix Inc. created Iceberg to overcome the challenges they encountered with existing data lake technologies such as Apache Spark and Apache Hive. The engineers needed a way to navigate their employer’s massive collection of media streaming files stored in Amazon S3.
“We had the same problems that everybody else did, but 10 times worse,” recalled Ryan Blue, co-founder and chief executive officer of Tabular Technologies Inc. and creator of Apache Iceberg. “Every request to [Amazon’s] S3 was not seven milliseconds, it was 70 milliseconds … so all the …