Python for Data Analysis

Code: PYDA

Overview

This course comprises four sections.

Section 1 helps you get started with data analysis as quickly and effectively as possible. You’ll learn how to use JupyterLab and Jupyter Notebooks to organize and develop your analyses. You’ll learn how to use a subset of the Pandas module for data analysis and visualization. And you’ll learn how to use a subset of the Seaborn module to create professional data visualizations that can be used for presentations. At the end of this section, you’ll be able to start doing analyses of your own.

Most analysis is descriptive analysis, in which you analyze past data to gain new insights. That’s why Section 2 presents the critical descriptive analysis skills that you need for success on the job. That includes:

  • How to read data into a Pandas DataFrame
  • How to clean the data by dropping unneeded rows and columns and fixing missing values, data types, and outliers
  • How to prepare the data by adding columns, modifying the data in columns, and combining DataFrames
  • How to analyze the data by grouping and aggregating the data, using pivot tables, and more
  • How to analyze time-series data by reindexing, downsampling, and working with rolling windows and running totals
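As a rough illustration of the Section 2 skills listed above, the following minimal sketch walks through reading, cleaning, preparing, and analyzing a small dataset with Pandas. The data, the column names, and the outlier cap of 100 are all made up for this example and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical sales data, inlined so the sketch runs without an external file.
CSV_TEXT = """date,region,units,price
2023-01-01,North,10,2.5
2023-01-02,South,,3.0
2023-01-03,North,12,2.5
2023-01-04,South,8,3.0
2023-01-05,North,500,2.5
2023-01-06,South,9,3.0
"""

# 1. Read the data into a DataFrame, parsing the date column.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

# 2. Clean: fill the missing value with the median, cap the outlier
#    (the cap of 100 is an assumed threshold), and fix the data type.
df["units"] = df["units"].fillna(df["units"].median())
df["units"] = df["units"].clip(upper=100).astype(int)

# 3. Prepare: add a derived column.
df["revenue"] = df["units"] * df["price"]

# 4. Analyze: group and aggregate, then build a pivot table.
units_by_region = df.groupby("region")["units"].sum()
revenue_pivot = df.pivot_table(index="region", values="revenue", aggfunc="sum")

# 5. Time series: index by date, downsample to 2-day bins,
#    then compute a rolling mean and a running total.
ts = df.set_index("date")["units"]
two_day_totals = ts.resample("2D").sum()
rolling_mean = ts.rolling(window=2).mean()
running_total = ts.cumsum()

print(units_by_region)
print(revenue_pivot)
print(two_day_totals)
```

In practice the data would usually come from a file or database rather than an inline string, and the right way to handle missing values and outliers depends on the dataset.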

Predictive analysis takes data analysis to another level by using statistical models to predict unknown or future values. Although a complete treatment of predictive analysis is far beyond the scope of this course, all analysts should know the basic concepts and skills. That’s why Section 3 presents those concepts and gets you started doing your own predictions. This introduction includes how to find the correlations between variables, how to use Scikit-learn to work with linear regression models, and how to use Seaborn to create and plot linear regression models. It also shows you how to select the right variables and the right number of variables for multiple regressions, which is one of the critical skills for making effective predictions.
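To give a feel for the predictive workflow described above, here is a minimal sketch that finds the correlations between variables and then fits a simple linear regression with Scikit-learn. The dataset and its column names (ad_spend, visits, sales) are invented for illustration, and picking the single most correlated variable is only one simple way to choose a predictor.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data, made up for this sketch.
df = pd.DataFrame({
    "ad_spend": [1.0, 2.0, 3.0, 4.0, 5.0],
    "visits": [12.0, 9.0, 14.0, 11.0, 15.0],
    "sales": [3.0, 5.0, 7.0, 9.0, 11.0],  # constructed as 2 * ad_spend + 1
})

# Correlation of each candidate predictor with the target variable.
correlations = df.corr()["sales"].drop("sales")

# Pick the predictor with the strongest absolute correlation.
best = correlations.abs().idxmax()

# Fit a simple linear regression on that predictor.
model = LinearRegression().fit(df[[best]], df["sales"])
slope = model.coef_[0]
intercept = model.intercept_
r_squared = model.score(df[[best]], df["sales"])

# Predict the target for a new, unseen value of the predictor.
new_data = pd.DataFrame({best: [6.0]})
predicted = model.predict(new_data)[0]

print(correlations)
print(f"sales = {slope:.2f} * {best} + {intercept:.2f} (R^2 = {r_squared:.2f})")
print(f"predicted sales: {predicted:.2f}")
```

The same relationship could be plotted with Seaborn, for example with seaborn.regplot(data=df, x="ad_spend", y="sales"). A multiple regression would pass several columns to fit instead of one.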

Section 4 presents a number of case studies that show you how the skills you’ve been learning can be applied to real-world datasets:

  • The polling data for the 2016 presidential election
  • The US Forest Service data for forest fires
  • The US social survey data taken from hundreds of polls
  • The basketball shot location data for NBA player Stephen Curry

Audience

Today, data analysis is an essential skill in the fields of business, science, and social science, and Python has become the preferred language for doing that data analysis. Adding Python data analysis to your skillset can lead to new career opportunities. This course is for anyone who wants to master the fundamentals of data analysis using Python.

Prerequisites

You should have experience with Python, which you can gain by attending our Python Programming courses.

Objectives

You will learn:

  • Section 1
      • Introduction to Python for data analysis
      • The Pandas essentials for data analysis
      • The Pandas essentials for data visualization
      • The Seaborn essentials for data visualization
  • Section 2
      • How to get the data
      • How to clean the data
      • How to prepare the data
      • How to analyze the data
      • How to analyze time-series data
  • Section 3
      • How to make predictions with a linear regression model
      • How to make predictions with a multiple regression model
  • Section 4
      • The Polling case study
      • The Forest Fires case study
      • The Social Survey case study
      • The Sports Analytics case study

Price (ex. VAT)

€ 2.460,00 per person

Duration

4 days

Schedule

Please send us a message using the form below.

Delivery methods

  • Classroom
  • On-site (at your location)
  • Virtual (instructor online)

Inquire

We will contact you to discuss your requirements