Last updated: 2021-07-09 18:52:29
Cover
Copyright Information
Credits
Preface
Part 1. Module 1
Chapter 1. Introducing Data Analysis and Libraries
Data analysis and processing
An overview of the libraries in data analysis
Python libraries in data analysis
Summary
Chapter 2. NumPy Arrays and Vectorized Computation
NumPy arrays
Array functions
Data processing using arrays
Linear algebra with NumPy
NumPy random numbers
Chapter 3. Data Analysis with Pandas
An overview of the Pandas package
The Pandas data structure
The essential basic functionality
Indexing and selecting data
Computational tools
Working with missing data
Advanced uses of Pandas for data analysis
Chapter 4. Data Visualization
The matplotlib API primer
Exploring plot types
Legends and annotations
Plotting functions with Pandas
Additional Python data visualization tools
Chapter 5. Time Series
Time series primer
Working with date and time objects
Resampling time series
Downsampling time series data
Upsampling time series data
Time zone handling
Timedeltas
Time series plotting
Chapter 6. Interacting with Databases
Interacting with data in text format
Interacting with data in binary format
Interacting with data in MongoDB
Interacting with data in Redis
Chapter 7. Data Analysis Application Examples
Data munging
Data aggregation
Grouping data
Chapter 8. Machine Learning Models with scikit-learn
An overview of machine learning models
The scikit-learn modules for different models
Data representation in scikit-learn
Supervised learning – classification and regression
Unsupervised learning – clustering and dimensionality reduction
Measuring prediction performance
Part 2. Module 2
Chapter 1. Getting Started with Predictive Modelling
Introducing predictive modelling
Applications and examples of predictive modelling
Python and its packages – download and installation
Python and its packages for predictive modelling
IDEs for Python
Chapter 2. Data Cleaning
Reading the data – variations and examples
Various methods of importing data in Python
The read_csv method
Use cases of the read_csv method
Case 2 – reading a dataset using the open method of Python
Case 3 – reading data from a URL
Case 4 – miscellaneous cases
Basics – summary, dimensions, and structure
Handling missing values
Creating dummy variables
Visualizing a dataset by basic plotting
Chapter 3. Data Wrangling
Subsetting a dataset
Generating random numbers and their usage
Grouping the data – aggregation, filtering, and transformation
Random sampling – splitting a dataset into training and testing datasets
Concatenating and appending data
Merging/joining datasets
Chapter 4. Statistical Concepts for Predictive Modelling
Random sampling and the central limit theorem
Hypothesis testing
Chi-square tests
Correlation
Chapter 5. Linear Regression with Python
Understanding the maths behind linear regression
Making sense of result parameters