
What Are Open Table Formats (OTFs)?

Learn more about open table formats.

Open table formats make data lakes more efficient and easier to manage. They add a layer of abstraction on top of the files in a data lake, introducing the organization that traditional data lakes often lack and bringing database-like features to them. This structure enables more efficient querying and analysis, because data is stored in a manner optimized for access patterns and query performance.

One of the key ways table formats streamline data lakes is by enabling schema-on-read capabilities. This allows data lakes to accommodate data from various sources with different formats and structures, without the need for up-front schema definition. As a result, data engineers and analysts can focus on deriving insights from the data, rather than spending time on data preparation and transformation tasks. Where stricter control is needed, table formats can also enforce schema validation at write time, which protects data quality and consistency and reduces the likelihood of errors and anomalies in the data.
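To make write-time schema validation concrete, here is a minimal sketch in plain Python. It is a toy illustration under stated assumptions, not the API of any real table format: the names `SchemaError`, `validate_row`, `append_rows`, and `ORDERS_SCHEMA` are hypothetical, and the table is just an in-memory list. The idea it shows is that a batch is checked against the declared schema before anything is written, so a single bad row rejects the whole write.

```python
# Toy illustration only: not the API of Apache Iceberg, Delta Lake, or Apache Hudi.

class SchemaError(ValueError):
    """Raised when a row does not match the declared table schema."""


def validate_row(row, schema):
    """Check that a row has exactly the schema's columns with matching types."""
    missing = sorted(schema.keys() - row.keys())
    extra = sorted(row.keys() - schema.keys())
    if missing or extra:
        raise SchemaError(f"missing columns {missing}, unexpected columns {extra}")
    for column, expected_type in schema.items():
        if not isinstance(row[column], expected_type):
            raise SchemaError(
                f"column {column!r} expects {expected_type.__name__}, "
                f"got {type(row[column]).__name__}"
            )


def append_rows(table, rows, schema):
    """Validate the whole batch first, so one bad row rejects the entire write."""
    for row in rows:
        validate_row(row, schema)
    table.extend(rows)


# Hypothetical schema and data, used only for this example.
ORDERS_SCHEMA = {"order_id": int, "amount": float, "region": str}
orders = []
append_rows(orders, [{"order_id": 1, "amount": 19.99, "region": "EMEA"}], ORDERS_SCHEMA)

try:
    # "amount" is a string in the second row, so neither row is written.
    append_rows(
        orders,
        [
            {"order_id": 2, "amount": 5.0, "region": "APAC"},
            {"order_id": 3, "amount": "n/a", "region": "AMER"},
        ],
        ORDERS_SCHEMA,
    )
except SchemaError as error:
    print(f"write rejected: {error}")

print(len(orders))  # the table still holds only the first row
```

Validating the full batch before extending the list is a deliberate choice in this sketch: it keeps the table free of partially written batches, which mirrors the all-or-nothing behavior expected from a transactional write.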

Table formats also introduce transactional support and ACID compliance to data lakes, ensuring data integrity and consistency. This is particularly important in environments where data is frequently updated or where multiple users access and modify the data concurrently. By supporting atomic transactions, open table formats ensure that data lakes can serve as a reliable source of truth for the organization, facilitating accurate and timely decision-making. Additionally, features like incremental processing and time travel enhance the flexibility of data lakes, allowing organizations to track changes over time and access historical data as needed. These capabilities make open table formats an indispensable tool for optimizing data lake operations and unlocking the full potential of data assets.
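The ideas of atomic commits, time travel, and incremental processing can be sketched with a small in-memory snapshot log. This is a simplified model under stated assumptions, not how any particular format is implemented: real table formats persist their snapshots or transaction logs as files in storage, while the hypothetical `ToyTable` class below keeps everything in a Python list and supports appends only. Each commit creates a new immutable snapshot, and a writer whose base version is stale is rejected rather than allowed to overwrite another writer's changes.

```python
# Toy illustration only: real formats persist snapshots as metadata in storage;
# this sketch keeps everything in memory and supports appends only.

class CommitConflict(RuntimeError):
    """Raised when a writer's base version is no longer the current version."""


class ToyTable:
    def __init__(self):
        # Version 0 is the empty table; each snapshot is an immutable tuple of rows.
        self._snapshots = [()]

    @property
    def current_version(self):
        return len(self._snapshots) - 1

    def read(self, version=None):
        """Return the rows as of a version (time travel); default is the latest."""
        if version is None:
            version = self.current_version
        if not 0 <= version <= self.current_version:
            raise IndexError(f"unknown version {version}")
        return list(self._snapshots[version])

    def commit(self, base_version, new_rows):
        """Append rows as one new snapshot, or fail if another writer got there first."""
        if base_version != self.current_version:
            raise CommitConflict(
                f"based on version {base_version}, table is at {self.current_version}"
            )
        # The whole batch becomes visible together as a single new snapshot.
        self._snapshots.append(self._snapshots[base_version] + tuple(new_rows))
        return self.current_version

    def rows_added_since(self, version):
        """Incremental read: rows appended after the given version."""
        return self.read()[len(self.read(version)):]


table = ToyTable()
v1 = table.commit(0, [{"id": 1}])
v2 = table.commit(v1, [{"id": 2}, {"id": 3}])

print(table.read(v1))              # historical view as of the first commit
print(table.rows_added_since(v1))  # only the rows added by the second commit

try:
    # This writer still thinks the table is at v1, so its commit is rejected.
    table.commit(v1, [{"id": 4}])
except CommitConflict as error:
    print(f"commit rejected: {error}")
```

Checking the base version before appending is a simple form of optimistic concurrency control: writers do not hold locks while preparing data, and conflicts are detected only at commit time.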

There is functional parity between three common open table formats in the industry today: Apache Iceberg, Linux Foundation Delta Lake, and Apache Hudi. Their ecosystems, developers, and contributor communities differ, so it often makes sense to choose an OTF based on the ecosystem that supports your use cases and on the specific requirements of your workloads. All three OTFs support ACID transactions, versioning, schema evolution, and time travel, and all three can handle complex query workloads with high performance and writes from many concurrent users.

Teradata provides an open ecosystem for OTFs, catalogs, and cloud service providers (CSPs) in multi-cloud and multi-data lake environments.

This open and connected approach to supporting OTFs enables cross-read, cross-write, and cross-query of data stored in Apache Iceberg and Delta Lake tables using open catalogs such as Amazon Web Services (AWS) Glue, Hive Metastore, or Unity Catalog.

This future-ready approach allows enterprises to employ a modern data strategy, with the agility and flexibility to deliver Trusted AI at scale, all without the need to move, replicate, or transform data.
