Dagster & ETL
Learn how to ingest data to power your assets. You’ll build custom pipelines and see how to use Embedded ETL and Dagster Components to build out your data platform.
This course is geared towards those who have some familiarity with Dagster. You don't need to be an expert, but you should know your way around a Dagster project.
Intermediate level
You'll need to know the basics of Dagster to complete this course. We recommend completing the Dagster Essentials course if you've never used Dagster before or want a refresher before getting started.
An understanding of data engineering. While you do not need a deep understanding of ETL before taking the class, some examples will be easier to follow if you are familiar with concepts around databases and data warehouses.
While you don’t need to be a Python expert to get started, you do need some Python familiarity to complete this course and use Dagster. In Lesson 2, we’ll cover Dagster’s specific installation requirements. Here are some Pythonic skills that you’ll be using, along with resources to learn about them:
You won’t be writing complex SQL, but you will need to understand the concept of SELECT statements, what tables are, and how to make them. If you’d like a 5-minute crash course, here’s a short article and cheatsheet on using SQL.
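As a rough gauge of the SQL level expected, here is a minimal sketch that creates a table, adds a few rows, and reads them back with a SELECT statement. It uses Python's built-in `sqlite3` module with an in-memory database. The `trips` table, its columns, and the sample rows are hypothetical and are not taken from the course project.

```python
import sqlite3


def select_city_fares(city: str) -> list[tuple[str, float]]:
    """Create a small table, load sample rows, and SELECT the rows for one city."""
    # An in-memory database: nothing is written to disk.
    conn = sqlite3.connect(":memory:")

    # A table is a named set of typed columns (hypothetical schema).
    conn.execute(
        "CREATE TABLE trips (id INTEGER PRIMARY KEY, city TEXT, fare REAL)"
    )

    # Insert a few illustrative rows (made-up data).
    conn.executemany(
        "INSERT INTO trips (city, fare) VALUES (?, ?)",
        [("NYC", 12.5), ("Boston", 8.0), ("NYC", 20.0)],
    )

    # A SELECT statement picks columns, filters rows, and orders the result.
    rows = conn.execute(
        "SELECT city, fare FROM trips WHERE city = ? ORDER BY fare",
        (city,),
    ).fetchall()

    conn.close()
    return rows


if __name__ == "__main__":
    for row in select_city_fares("NYC"):
        print(row)
```

If each statement in this sketch reads naturally to you, you likely have enough SQL background for the course.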
About this course
What is ETL?
ETL and Dagster
Project preview
Prerequisites and installation
Set up local
Set up codespaces
Overview
File import
Data integrity
Partitions
Complex partitions
Triggering partitions
Knowledge check
Cloud storage
Overview
APIs
API resource
ETL with API
API Dagster assets
Knowledge check
Triggering API job
Backfilling from APIs
Overview
dlt
Basic dlt
Dagster and dlt
Knowledge check
Refactoring static data with dlt
Refactoring APIs with dlt
Overview
Database replication
Knowledge check
Sling
Sling database replication set up
Dagster and Sling
Managing Sling assets