DENG-251: Building an Open Data Lakehouse Using Apache Iceberg Course Overview

Category: Cloudera
Level: Beginner
Duration: 32 Hours
Price: $1,075

DENG-251: Building an Open Data Lakehouse Using Apache Iceberg equips data engineers, analysts, and architects with the skills to build scalable, efficient data lakehouses. The course focuses on managing large datasets so that organizations can draw on real-time insights and support data-driven decision-making. It is ideal for professionals looking to strengthen their data management expertise.

Course outline & what you'll learn

  • Definition and benefits of data lakehouses
  • Comparison between data lakes and data warehouses
  • Use cases for data lakehouses
  • Introduction to Apache Iceberg
  • Key features and architecture
  • Supported file formats
  • Installation requirements
  • Configuration of Apache Iceberg with Hadoop and Spark
  • Setting up a local development environment
  • Ingesting data into Iceberg tables
  • Schema evolution and management
  • Partitioning strategies for optimized performance
  • Using SQL with Iceberg
  • Integration with Apache Spark and other query engines
  • Performance tuning and optimization techniques
  • Implementing access controls and permissions
  • Data auditing and lineage tracking
  • Best practices for data governance in a lakehouse
  • Time travel and snapshot management
  • Merge, update, and delete operations
  • Handling large datasets and optimizing performance
  • Working with data from various sources (e.g., Kafka, databases)
  • Integration with BI tools and analytics platforms
  • Real-time data processing and streaming
  • Real-world implementations of Iceberg in organizations
  • Lessons learned and best practices for building a data lakehouse
  • Future trends and developments in data lakehouse architecture
  • Designing and implementing a sample data lakehouse using Apache Iceberg
  • Presentation and peer review of projects
  • Course wrap-up and certification of completion

Why train with Traincrest

This Cloudera course is delivered by Traincrest's certified instructors, live online or in the classroom, with hands-on labs and a 98% exam success rate. Trusted by 500+ companies and 50,000+ students worldwide.