IBM: Building ETL and Data Pipelines with Bash, Airflow and Kafka

4.5 stars
6 ratings

This course provides you with practical skills to build and manage data pipelines and Extract, Transform, Load (ETL) processes using shell and Python scripts, Airflow, and Kafka.

5 weeks
2–4 hours per week
Self-paced
Progress at your own speed
Free
Optional upgrade available

There is one session available:

8,467 already enrolled! After a course session ends, it will be archived.
Starts Dec 3

About this course

Well-designed and automated data pipelines and ETL processes are the foundation of a successful Business Intelligence platform. Defining your data workflows, pipelines and processes early in the platform design ensures the right raw data is collected, transformed and loaded into desired storage layers and available for processing and analysis as and when required.

This course is designed to provide you with the critical knowledge and skills that Data Engineers and Data Warehousing specialists need to create and manage ETL, ELT, and data pipeline processes.

Upon completing this course, you’ll have a solid understanding of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes. You’ll practice extracting data, transforming it, and loading the transformed data into a staging area. You’ll also create an ETL data pipeline using Bash shell scripting, build a batch ETL workflow using Apache Airflow, and build a streaming data pipeline using Apache Kafka.

You’ll gain hands-on experience through practice labs throughout the course and build data pipelines with several technologies in a real-world-inspired project, which you can add to your portfolio to demonstrate your ability to perform as a Data Engineer.

This course requires prior experience working with datasets, SQL, relational databases, and Bash shell scripts.

At a glance

  • Associated skills: Extract Transform Load (ETL), Scripting, Business Intelligence, Python (Programming Language), Data Warehousing, Workflows, Apache Kafka, Relational Databases, Apache Airflow, Bash (Scripting Language), Staging Area, Data Pipeline, Shell Script, SQL (Programming Language), Workflow Management

What you'll learn

  • Describe and differentiate between Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes
  • Define data pipeline components, processes, tools and technologies
  • Create ETL processes using Bash shell scripts
  • Develop batch data pipelines using Apache Airflow
  • Create streaming data pipelines using Apache Kafka

This course is part of the Data Engineering Professional Certificate Program

Expert instruction
14 skill-building courses
Self-paced
Progress at your own speed
1 year 2 months
3–4 hours per week

Interested in this course for your business or team?

Train your employees in the most in-demand topics, with edX For Business.