DP-203: Data Engineering on Microsoft Azure

Duration: 4 Days (32 Hours)

DP-203: Data Engineering on Microsoft Azure Course Overview:

The course aims to provide students with the essential skills to implement and manage data engineering workloads effectively on Microsoft Azure. Participants will gain hands-on experience with various Azure services, including Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, Azure Databricks, and other relevant tools.

Throughout the course, students will focus on executing common data engineering tasks. They will learn to orchestrate data transfer and transformation pipelines, ensuring efficient data movement. Working with data files in a data lake will be emphasized, covering aspects such as organization and accessibility.

Additionally, the course covers the creation and loading of relational data warehouses, teaching students how to structure and optimize data for analytical purposes. The curriculum also includes capturing and aggregating real-time data streams, enabling learners to process and analyze data as it arrives.

Furthermore, participants will explore the importance of tracking data assets and lineage. They will understand how to maintain a comprehensive understanding of the origins, transformations, and dependencies of data sets, contributing to improved data governance and decision-making.

By the end of the course, students will have a solid foundation in implementing and managing data engineering workloads on Microsoft Azure. They will be equipped to tackle real-world data engineering challenges confidently, using the skills and knowledge acquired throughout the course.

Audience Profile

The target audience for the DP-203 course primarily consists of data professionals, data architects, and business intelligence professionals seeking to expand their knowledge of data engineering and of building analytical solutions with Microsoft Azure’s data platform technologies.

The course is also relevant for data analysts and data scientists who work with analytical solutions built on Microsoft Azure. These professionals can apply what they learn in the course to their own data analytics and data science tasks on Azure.

Overall, the DP-203 course caters to a diverse range of professionals, from data engineers to data analysts and data scientists, all seeking to improve their proficiency with Microsoft Azure’s data platform technologies for building robust and effective analytical solutions.

Job role: Data Engineer

Certification Path:

Introduction to data engineering on Azure

Identify common data engineering tasks
Describe common data engineering concepts
Identify Azure services for data engineering

Introduction to Azure Data Lake Storage Gen2

Describe the key features and benefits of Azure Data Lake Storage Gen2
Enable Azure Data Lake Storage Gen2 in an Azure Storage account
Compare Azure Data Lake Storage Gen2 and Azure Blob storage
Describe where Azure Data Lake Storage Gen2 fits in the stages of analytical processing
Describe how Azure Data Lake Storage Gen2 is used in common analytical workloads

Introduction to Azure Synapse Analytics

Identify the business problems that Azure Synapse Analytics addresses.
Describe core capabilities of Azure Synapse Analytics.
Determine when to use Azure Synapse Analytics.

Use Azure Synapse serverless SQL pool to query files in a data lake

Identify capabilities and use cases for serverless SQL pools in Azure Synapse Analytics
Query CSV, JSON, and Parquet files using a serverless SQL pool
Create external database objects in a serverless SQL pool

Use Azure Synapse serverless SQL pools to transform data in a data lake

Use a CREATE EXTERNAL TABLE AS SELECT (CETAS) statement to transform data.
Encapsulate a CETAS statement in a stored procedure.
Include a data transformation stored procedure in a pipeline.

Create a lake database in Azure Synapse Analytics

Understand lake database concepts and components
Describe database templates in Azure Synapse Analytics
Create a lake database

Analyze data with Apache Spark in Azure Synapse Analytics

Identify core features and capabilities of Apache Spark.
Configure a Spark pool in Azure Synapse Analytics.
Run code to load, analyze, and visualize data in a Spark notebook.

Transform data with Spark in Azure Synapse Analytics

Use Apache Spark to modify and save dataframes
Partition data files for improved performance and scalability.
Transform data with SQL
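
As an illustration of the partitioning objective above, the following is a minimal sketch in plain Python of the folder-per-value layout (for example `year=2023`) that Spark commonly produces when data is written partitioned by a column. It is not Spark code: the function `partition_rows`, the sample rows, and the column names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def partition_rows(rows, partition_key):
    """Group rows into folder names of the form '<key>=<value>'."""
    partitions = defaultdict(list)
    for row in rows:
        folder = f"{partition_key}={row[partition_key]}"
        # The partition value is encoded in the folder name,
        # so it is dropped from the rows stored inside that folder.
        remaining = {k: v for k, v in row.items() if k != partition_key}
        partitions[folder].append(remaining)
    return dict(partitions)

# Hypothetical order data, assumed only for this example.
orders = [
    {"year": 2023, "order_id": 1},
    {"year": 2024, "order_id": 2},
    {"year": 2023, "order_id": 3},
]

layout = partition_rows(orders, "year")
for folder, contents in layout.items():
    print(folder, contents)
```

A query that filters on the partition column only needs to read the matching folders, which is the usual reason partitioning helps performance on large data sets.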

Use Delta Lake in Azure Synapse Analytics

Describe core features and capabilities of Delta Lake.
Create and use Delta Lake tables in a Synapse Analytics Spark pool.
Create Spark catalog tables for Delta Lake data.
Use Delta Lake tables for streaming data.
Query Delta Lake tables from a Synapse Analytics SQL pool.

Analyze data in a relational data warehouse

Design a schema for a relational data warehouse.
Create fact, dimension, and staging tables.
Use SQL to load data into data warehouse tables.
Use SQL to query relational data warehouse tables.

Load data into a relational data warehouse

Load staging tables in a data warehouse
Load dimension tables in a data warehouse
Load time dimensions in a data warehouse
Load slowly changing dimensions in a data warehouse
Load fact tables in a data warehouse
Perform post-load optimizations in a data warehouse
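
To make the slowly changing dimension objective above more concrete, here is a minimal sketch of one common approach, often called Type 2, in which a changed record closes the current dimension row and adds a new one so that history is kept. The course itself works in SQL; this plain Python version, the function `apply_scd_type2`, and the columns `customer_id`, `city`, `valid_from`, `valid_to`, and `is_current` are assumptions made for illustration.

```python
def apply_scd_type2(dimension, incoming, load_date):
    """Apply staged records to a dimension, keeping history (Type 2 style)."""
    # Index the rows that are currently active, by business key.
    current = {row["customer_id"]: row for row in dimension if row["is_current"]}
    for record in incoming:
        existing = current.get(record["customer_id"])
        if existing is not None and existing["city"] == record["city"]:
            continue  # Nothing changed, so no new version is needed.
        if existing is not None:
            # Close the old version instead of overwriting it.
            existing["valid_to"] = load_date
            existing["is_current"] = False
        new_row = {
            "customer_id": record["customer_id"],
            "city": record["city"],
            "valid_from": load_date,
            "valid_to": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[record["customer_id"]] = new_row
    return dimension

# Hypothetical dimension and staging data, assumed only for this example.
dim_customer = [
    {"customer_id": 1, "city": "Oslo", "valid_from": "2023-01-01",
     "valid_to": None, "is_current": True},
]
staged = [
    {"customer_id": 1, "city": "Bergen"},  # changed attribute
    {"customer_id": 2, "city": "Lima"},    # new customer
]

apply_scd_type2(dim_customer, staged, "2024-06-01")
for row in dim_customer:
    print(row)
```

After the load, the dimension holds three rows: the closed Oslo version of customer 1, the current Bergen version, and the new row for customer 2.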

Build a data pipeline in Azure Synapse Analytics

Describe core concepts for Azure Synapse Analytics pipelines.
Create a pipeline in Azure Synapse Studio.
Implement a data flow activity in a pipeline.
Initiate and monitor pipeline runs.

Use Spark Notebooks in an Azure Synapse Pipeline

Describe notebook and pipeline integration.
Use a Synapse notebook activity in a pipeline.
Use parameters with a notebook activity.

Plan hybrid transactional and analytical processing using Azure Synapse Analytics

Describe Hybrid Transactional / Analytical Processing patterns.
Identify Azure Synapse Link services for HTAP.

Implement Azure Synapse Link with Azure Cosmos DB

Configure an Azure Cosmos DB Account to use Azure Synapse Link.
Create an analytical store enabled container.
Create a linked service for Azure Cosmos DB.
Analyze linked data using Spark.
Analyze linked data using Synapse SQL.

Implement Azure Synapse Link for SQL

Understand key concepts and capabilities of Azure Synapse Link for SQL.
Configure Azure Synapse Link for Azure SQL Database.
Configure Azure Synapse Link for Microsoft SQL Server.

Get started with Azure Stream Analytics

Understand data streams.
Understand event processing.
Understand window functions.
Get started with Azure Stream Analytics.
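
As a rough illustration of the window function objective above, the sketch below counts events in fixed, non-overlapping (tumbling) windows using plain Python. It is a simplification and not Azure Stream Analytics query syntax; the function `tumbling_window_counts` and the sample readings are hypothetical, and the way window boundaries are treated here (start inclusive) is an assumption that may differ from how a given streaming engine defines them.

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds):
    """Count events per fixed, non-overlapping window.

    events: iterable of (timestamp_in_seconds, payload) tuples.
    Returns a dict mapping each window's start time to its event count.
    """
    counts = Counter()
    for timestamp, _payload in events:
        # Each timestamp falls into exactly one window:
        # round it down to a multiple of the window size.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))

# Hypothetical sensor readings: (seconds since start, value).
readings = [(1, "a"), (4, "b"), (9, "c"), (10, "d"), (23, "e")]

print(tumbling_window_counts(readings, 10))  # → {0: 3, 10: 1, 20: 1}
```

Because the windows do not overlap, every event is counted once, which is what distinguishes a tumbling window from overlapping window types.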

Ingest streaming data using Azure Stream Analytics and Azure Synapse Analytics

Describe common stream ingestion scenarios for Azure Synapse Analytics.
Configure inputs and outputs for an Azure Stream Analytics job.
Define a query to ingest real-time data into Azure Synapse Analytics.
Run a job to ingest real-time data, and consume that data in Azure Synapse Analytics.

Visualize real-time data with Azure Stream Analytics and Power BI

Configure a Stream Analytics output for Power BI.
Use a Stream Analytics query to write data to Power BI.
Create a real-time data visualization in Power BI.

Introduction to Microsoft Purview

Evaluate whether Microsoft Purview is appropriate for data discovery and governance needs.
Describe how the features of Microsoft Purview work to provide data discovery and governance.

Integrate Microsoft Purview and Azure Synapse Analytics

Catalog Azure Synapse Analytics database assets in Microsoft Purview.
Configure Microsoft Purview integration in Azure Synapse Analytics.
Search the Microsoft Purview catalog from Synapse Studio.
Track data lineage in Azure Synapse Analytics pipeline activities.

Explore Azure Databricks

Provision an Azure Databricks workspace.
Identify core workloads and personas for Azure Databricks.
Describe key concepts of an Azure Databricks solution.

Use Apache Spark in Azure Databricks

Describe key elements of the Apache Spark architecture.
Create and configure a Spark cluster.
Describe use cases for Spark.
Use Spark to process and analyze data stored in files.
Use Spark to visualize data.

Run Azure Databricks Notebooks with Azure Data Factory

Describe how Azure Databricks notebooks can be run in a pipeline.
Create an Azure Data Factory linked service for Azure Databricks.
Use a Notebook activity in a pipeline.
Pass parameters to a notebook.

DP-203: Data Engineering on Microsoft Azure Course Prerequisites:

Successful students start this course with knowledge of cloud computing and core data concepts and professional experience with data solutions.

Specifically completing:

Q: What is DP-203?

A: DP-203 is the exam code associated with the Microsoft Certified: Azure Data Engineer Associate certification. It validates the skills and knowledge required to design and implement the management, monitoring, security, and privacy of data using Azure data services.

Q: What is the purpose of DP-203 training?

A: DP-203 training is designed to equip individuals with the necessary skills to become proficient Azure Data Engineers. It covers various topics such as data storage, data processing, data integration, and data security in the Azure environment. The training helps individuals gain the knowledge required to design and implement effective data solutions using Azure data services.

Q: Who should take DP-203 training?

A: DP-203 training is suitable for data professionals who want to become Azure Data Engineers or enhance their existing skills in designing and implementing data solutions using Azure services. It is particularly relevant for individuals involved in tasks such as data integration, data transformation, data storage, and data analytics.

Q: What are the prerequisites for DP-203 training?

A: To enroll in DP-203 training, Microsoft recommends foundational knowledge of Azure and a basic understanding of data-related concepts and technologies. It is also beneficial to have experience with data processing, data integration, and data analytics.

Q: How can I prepare for DP-203 certification?

A: To prepare for DP-203, Microsoft offers official training courses, online learning paths, and self-paced study materials. You can explore the Microsoft Learn website to access these resources. Additionally, there are various third-party study guides and practice tests available to help you prepare for the certification exam.

Q: What is the format of the DP-203 certification exam?

A: The DP-203 certification exam consists of multiple-choice questions, case studies, and other question formats that assess your ability to design and implement data solutions using Azure services. The exact number of questions and the duration of the exam may vary, so it’s recommended to check the official Microsoft certification website for the latest information.

Q: How long is DP-203 certification valid?

A: Microsoft role-based certifications such as this one are valid for one year. Before the certification expires, you can renew it by passing a free online renewal assessment on Microsoft Learn. Check the official Microsoft certification website for the current policy.

Q: What are the benefits of earning DP-203 certification?

A: Earning DP-203 certification demonstrates your expertise in designing and implementing data solutions using Azure data services, which can enhance your career prospects as an Azure Data Engineer. It validates your skills and knowledge in working with various data technologies and showcases your ability to provide effective data solutions on the Azure platform.

Discover the perfect fit for your learning journey

Choose Learning Modality

Live Online

Convenience
Cost-effective
Self-paced learning
Scalability

Classroom

Interaction and collaboration
Networking opportunities
Real-time feedback
Personal attention

Onsite

Familiar environment
Confidentiality
Team building
Immediate application

Training Exclusives

This course comes with the following benefits:

Practice labs.
Training by Microsoft Certified Trainers (MCT).
Access to the recordings of your class sessions for 90 days.
Digital courseware.
24/7 learner support.