Are you looking for a hands-on way to learn data preprocessing techniques quickly?
Do you need to start learning Python for data preparation from scratch?
This book is for you.
This book is dedicated to data preparation. It explains how to apply different data preparation techniques to a variety of datasets using several data preparation libraries written in Python. It is intended as a guide to data preparation itself, not as a general introduction to data science or machine learning.
For the application of data preparation in data science and machine learning, read this book in conjunction with dedicated books on machine learning and data science.
This book explains the process of data preparation from scratch using various libraries. All code and datasets are provided. However, you will need internet access to download the data preparation libraries.
Besides beginners in data preparation with Python, intermediate and experienced programmers can use this book as a reference manual, since it contains data preparation code samples that use multiple data preparation libraries.
What this book offers…
The book follows a very simple approach. It is divided into nine chapters. Chapter 1 introduces the basic concept of data preparation, along with the installation steps for the software needed to perform data preparation in this book. Chapter 1 also contains a crash course on Python. Chapter 2 gives a brief overview of different data types. Chapter 3 explains how to handle missing values in the data, while Chapter 4 explains how to encode categorical data as numeric values. Chapter 5 presents data discretization. Chapter 6 explains the process of handling outliers, while Chapter 7 explains how to scale the features in a dataset. Chapter 8 covers the handling of mixed and DateTime data types, and Chapter 9 covers data balancing and resampling. A full data preparation final project is also available at the end of the book.
In each chapter, different types of data preparation techniques are first explained theoretically and then demonstrated with practical examples. Each chapter also contains an exercise that students can use to evaluate their understanding of the concepts explained in the chapter.
Clear and Easy to Understand Solutions
All solutions in this book have been extensively tested by a group of beta readers. The solutions are simplified as much as possible so that they can serve as examples to refer to when you are learning a new skill.
- What Is Data Preparation
- Python Crash Course
- Different Libraries for Data Preparation
- Understanding Data Types
- Handling Missing Data
- Encoding Categorical Data
- Data Discretization
- Outlier Handling
- Feature Scaling
- Handling Mixed and DateTime Variables
- Handling Imbalanced Datasets
- A Complete Data Preparation Pipeline
- Project 1 – Data Preparation
- Project 2 – Classification Project
- Project 3 – Regression Project