AI: Python and Pandas for Data Engineering

4.5 stars
30 ratings

Master Python essentials and Pandas for data engineering. Learn to set up development environments, manipulate data, and efficiently solve real-world problems.

Python and Pandas for Data Engineering
4 weeks
3–6 hours per week
Self-paced
Progress at your own speed
Free
Optional upgrade available

There is one session available:

After a course session ends, it will be archived.
Starts Nov 21

About this course

In this course, you'll gain the Python and Pandas skills essential for data engineering:

  • Set up version-controlled Python environments with the necessary libraries
  • Write Python programs using key language features and data structures
  • Manipulate and analyze data using the powerful Pandas library
  • Explore alternative data structures like NumPy arrays and PySpark DataFrames
  • Utilize Vim, Visual Studio Code, and Git for productive development

Whether you're a beginner or have some programming experience, you'll learn to harness Python and Pandas to tackle data engineering challenges. Hands-on exercises reinforce your learning each step of the way.

What you'll learn

  • Python environment setup and package management
  • Core Python syntax and data structures
  • Pandas DataFrames for data manipulation
  • Alternatives to Pandas for big data
  • Development with Vim, VS Code, and Git

Module 1: Getting Started with Python (14 hours)

- Overview of Python, Bash and SQL Essentials for Data Engineering (video, 7 minutes)

- Meet your Course Instructor: Kennedy Behrman (video, 0 minutes)

- Overview of Key Concepts (video, 5 minutes)

- Introduction to Setting Up Your Python Environment (video, 0 minutes)

- Installing Packages with pip in Python (video, 6 minutes)

- Saving Requirements File in Python (video, 3 minutes)

- Creating and Using a Python Virtual Environment (video, 5 minutes)

- Expression Statements in Python (video, 3 minutes)

- Assignment Statements in Python (video, 5 minutes)

- Import Statements in Python (video, 4 minutes)

- Other Simple Statements in Python (video, 5 minutes)

- Compound Statements in Python (video, 5 minutes)

- If Statements in Python (video, 6 minutes)

- While Loops in Python (video, 4 minutes)

- Functions in Python (video, 7 minutes)

- Key Terms (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Meet your Supporting Instructors: Alfredo Deza and Noah Gift (reading, 10 minutes)

- Course Structure and Discussion Etiquette (reading, 10 minutes)

- Getting Started and Best Practices (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Evaluating to True or False (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Python Statements (quiz, 30 minutes)

- Assignment Statements (quiz, 30 minutes)

- Import Statements (quiz, 30 minutes)

- If Statements (quiz, 30 minutes)

- While Loops (quiz, 30 minutes)

- Quiz-Setting Up Your Python Environment (assignment, 180 minutes)

- Meet and Greet (optional) (discussion prompt, 10 minutes)

- Install a Package with the pip Command (ungraded lab, 60 minutes)

- Export a Requirements File (ungraded lab, 60 minutes)

- Create a Virtual Environment (ungraded lab, 60 minutes)

- Practicing with Expression Statements (ungraded lab, 60 minutes)

- Decorator Functions (ungraded lab, 60 minutes)

- Setting up a Python Environment (ungraded lab, 60 minutes)

Module 2: Essential Python (11 hours)

- Introduction to Python Essentials (video, 0 minutes)

- Sequences in Python (video, 8 minutes)

- Lists and Tuples in Python (video, 5 minutes)

- Strings in Python (video, 10 minutes)

- Creating Range Objects in Python (video, 2 minutes)

- Creating Dictionaries in Python (video, 4 minutes)

- Accessing Dictionary Data in Python (video, 3 minutes)

- Dictionary Views in Python (video, 2 minutes)

- Sets and Set Operations in Python (video, 6 minutes)

- List Comprehensions in Python (video, 6 minutes)

- Generator Expressions in Python (video, 4 minutes)

- Generator Functions in Python (video, 7 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Essential Python Concepts (quiz, 30 minutes)

- Sequence Operations (quiz, 30 minutes)

- Lists and Tuples (quiz, 30 minutes)

- Range Objects (quiz, 30 minutes)

- Accessing Data in Dictionaries (quiz, 30 minutes)

- Sets and Set Operations (quiz, 30 minutes)

- List Comprehensions (quiz, 30 minutes)

- Generator Expressions (quiz, 30 minutes)

- Practicing with Strings in Python (ungraded lab, 60 minutes)

- Creating Dictionaries in Python (ungraded lab, 60 minutes)

- Dictionary Views in Python (ungraded lab, 60 minutes)

- Comprehensions and Generators in Python (ungraded lab, 60 minutes)

- Practicing Essential Python (ungraded lab, 60 minutes)

Module 3: Data in Python: Pandas and Alternatives (12 hours)

- Introduction to Data in Python: Pandas and Alternatives (video, 0 minutes)

- Creating Pandas DataFrames in Python (video, 4 minutes)

- Investigating Data in a Pandas DataFrame (video, 6 minutes)

- Selecting Data in a Pandas DataFrame (video, 6 minutes)

- Manipulating Pandas DataFrames (video, 4 minutes)

- Updating Pandas DataFrame Data (video, 5 minutes)

- Applying Functions in a Pandas DataFrame (video, 6 minutes)

- Creating NumPy Arrays in Python (video, 15 minutes)

- Spark and PySpark DataFrames in Python (video, 6 minutes)

- Creating Dask DataFrames in Python (video, 6 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Polars (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Pandas and Alternatives (quiz, 30 minutes)

- NumPy (quiz, 30 minutes)

- PySpark (quiz, 30 minutes)

- Dask (quiz, 30 minutes)

- Creating DataFrames (ungraded lab, 60 minutes)

- Looking at Data in DataFrames (ungraded lab, 60 minutes)

- Selecting Data in a Pandas DataFrame (ungraded lab, 60 minutes)

- Manipulating DataFrames (ungraded lab, 60 minutes)

- Updating Data in a DataFrame (ungraded lab, 60 minutes)

- Applying Functions in a Pandas DataFrame (ungraded lab, 60 minutes)

- Manipulate DataFrames with Polars to gain insights (ungraded lab, 60 minutes)

- Pandas and Alternatives (ungraded lab, 60 minutes)

Module 4: Python Development Environments (13 hours)

- Introduction to Python Development Environments (video, 0 minutes)

- Introduction to Vim Normal Mode (video, 6 minutes)

- Switching from Normal to Insert and Visual Modes in Vim (video, 4 minutes)

- Working with the Vim Command Line (video, 6 minutes)

- Vim Configuration (video, 3 minutes)

- Introduction to Visual Studio Code (video, 1 minute)

- Setting Up Visual Studio Code (video, 2 minutes)

- Debugging Visual Studio Code (video, 3 minutes)

- What is Version Control? (video, 3 minutes)

- Introduction to Git and Git Concepts (video, 7 minutes)

- Version Control with GitHub (video, 6 minutes)

- Summary of Python and Pandas for Data Engineering (video, 0 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Key Terms (reading, 10 minutes)

- Lesson Reflection (reading, 10 minutes)

- Next Steps (reading, 10 minutes)

- Cumulative Python and Pandas for Data Engineering Quiz (quiz, 45 minutes)

- Insert and Visual Modes (quiz, 30 minutes)

- Vim Command Line Mode (quiz, 30 minutes)

- Features of Visual Studio Code (quiz, 30 minutes)

- Version Control (quiz, 30 minutes)

- Git Commands (quiz, 30 minutes)

- Hosted Git (quiz, 30 minutes)

- Basic Vim Commands (ungraded lab, 60 minutes)

- Explore Visual Studio Code (ungraded lab, 60 minutes)

- Visual Studio Code Debugger (ungraded lab, 60 minutes)

- Setup and Provision a Python Project (ungraded lab, 60 minutes)

- Pandas Final Challenge: Life Expectancy and Happiness (ungraded lab, 60 minutes)

- Final Jupyter Sandbox (ungraded lab, 60 minutes)

- Final VS Code Sandbox (ungraded lab, 60 minutes)

- Final Sandbox Linux Desktop (ungraded lab, 60 minutes)

This course is part of the Data Engineering Foundations Professional Certificate Program.

Expert instruction
8 skill-building courses
Self-paced
Progress at your own speed
8 months
3–6 hours per week
