Python for Data Analysis

Code: PYDA

Overview

This course consists of four sections.

Section 1 helps you get started with data analysis as quickly and effectively as possible. You’ll learn how to use JupyterLab and Jupyter Notebooks to organize and develop your analyses. You’ll learn how to use a subset of the Pandas module for data analysis and visualization. And you’ll learn how to use a subset of the Seaborn module to create professional data visualizations that can be used for presentations. At the end of this section, you’ll be able to start doing analyses of your own.

Most analysis is descriptive analysis, in which you analyze past data to gain new insights. That’s why Section 2 presents the critical descriptive analysis skills that you need for success on the job. That includes:

How to read data into a Pandas DataFrame
How to clean the data by dropping unneeded rows and columns and fixing missing values, data types, and outliers
How to prepare the data by adding columns, modifying the data in columns, and combining DataFrames
How to analyze the data by grouping and aggregating the data, using pivot tables, and more
How to analyze time-series data by reindexing, downsampling, and working with rolling windows and running totals
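
As a rough illustration of the steps listed above, here is a minimal sketch in Pandas. The sales data, the column names, and the helper functions (load_and_clean, summarize, running_totals) are all hypothetical and invented for this example; they are not taken from the course material. Filling missing values with 0 is only one possible cleaning choice.

```python
import io

import pandas as pd

# Hypothetical sales data, invented for illustration only.
CSV_TEXT = """date,region,units,price,notes
2024-01-01,North,10,2.5,a
2024-01-02,South,,3.0,b
2024-01-03,North,7,2.5,c
2024-01-04,South,5,3.0,d
"""


def load_and_clean(csv_text):
    # Read: load the CSV text and parse the date column while loading.
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
    # Clean: drop an unneeded column, then fill missing values and fix the type.
    # Filling with 0 is an assumption made for this sketch.
    df = df.drop(columns=["notes"])
    df["units"] = df["units"].fillna(0).astype(int)
    # Prepare: add a derived column.
    df["revenue"] = df["units"] * df["price"]
    return df


def summarize(df):
    # Analyze: group by region and aggregate the revenue.
    return df.groupby("region")["revenue"].sum()


def running_totals(df):
    # Time series: index by date, then compute a two-row rolling mean
    # and a running total of revenue.
    ts = df.set_index("date")["revenue"]
    return pd.DataFrame(
        {"rolling_mean": ts.rolling(2).mean(), "running_total": ts.cumsum()}
    )


sales = load_and_clean(CSV_TEXT)
by_region = summarize(sales)
totals = running_totals(sales)
```

Each function maps to one item in the list, so the read, clean, prepare, and analyze stages can be inspected separately in a notebook cell.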

Predictive analysis takes data analysis to another level by using statistical models to predict unknown or future values. Although a complete treatment of predictive analysis is far beyond the scope of this course, all analysts should know the basic concepts and skills. That’s why Section 3 presents those concepts and gets you started doing your own predictions. This introduction includes how to find the correlations between variables, how to use Scikit-learn to work with linear regression models, and how to use Seaborn to create and plot linear regression models. It also shows you how to select the right variables and the right number of variables for multiple regressions, which is one of the critical skills for making effective predictions.
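
To give a feel for the correlation and linear regression skills mentioned above, here is a minimal sketch using Pandas and Scikit-learn. The data is hypothetical and deliberately trivial (y is exactly 2x + 1), so it only shows the mechanics, not a realistic modeling workflow.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data, invented for illustration: y is exactly 2*x + 1.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0]})
df["y"] = 2 * df["x"] + 1

# Correlation: Pearson r between the two columns.
r = df["x"].corr(df["y"])

# Fit a simple linear regression. Scikit-learn expects a 2-D feature
# table, hence the double brackets around the column name.
model = LinearRegression()
model.fit(df[["x"]], df["y"])

# Predict the target for a value that was not in the training data.
new_data = pd.DataFrame({"x": [6.0]})
predicted = model.predict(new_data)[0]

print(f"r = {r:.3f}, slope = {model.coef_[0]:.3f}, "
      f"intercept = {model.intercept_:.3f}, prediction = {predicted:.3f}")
```

With real data the correlation would be below 1 and the fit would be approximate, which is where variable selection for multiple regression becomes important.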

Section 4 presents a number of case studies that show you how the skills you’ve been learning can be applied to real-world datasets:

The polling data for the 2016 presidential election
The US Forest Service data for forest fires
The US social survey data taken from hundreds of polls
The basketball shot location data for NBA player Stephen Curry

Audience

Today, data analysis is an essential skill in the fields of business, science, and social science, and Python has become the preferred language for doing that data analysis. Adding Python data analysis to your skillset can lead to new career opportunities. That's where this course comes in: it's for anyone who wants to master the fundamentals of data analysis using Python.

Prerequisites

You should have experience with Python, which you can gain by attending our Python Programming courses.

Objectives

You will learn:

Section 1
Introduction to Python for data analysis
The Pandas essentials for data analysis
The Pandas essentials for data visualization
The Seaborn essentials for data visualization

Section 2
How to get the data
How to clean the data
How to prepare the data
How to analyze the data
How to analyze time-series data

Section 3
How to make predictions with a linear regression model
How to make predictions with a multiple regression model

Section 4
The Polling case study
The Forest Fires case study
The Social Survey case study
The Sports Analytics case study

Price (ex. VAT)

€ 2.460,00 per person

Duration

4 days

Schedule

Please send us a message with the form below

Delivery methods

Classroom
On-site (at your location)
Virtual (instructor online)

Questions?

Write to us and we will contact you to discuss your requirements.