Module 5: Pandas

NumPy arrays are fast, but they work best when all elements are of the same type. Pandas combines the speed of NumPy with the general applicability of a database table. Pandas is built on top of NumPy, which is one reason to study NumPy first. Pandas defines the one-dimensional Series and the two-dimensional DataFrame. It lets you attach labels to data, makes it easier to import data, handles missing data better, and supports grouping and pivoting.
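The contrast described above can be sketched in a few lines. This is only an illustration: the column names, labels, and values are made up, and the example assumes NumPy and Pandas are already installed.

```python
import numpy as np
import pandas as pd

# A NumPy array has a single dtype, so mixed numeric input is coerced
# to one common type (here the integers become floats).
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# A DataFrame keeps a separate dtype per column and labels both rows
# and columns. The data below is hypothetical.
df = pd.DataFrame(
    {"city": ["Lima", "Oslo", "Pune"], "temp_c": [19.5, np.nan, 31.0]},
    index=["a", "b", "c"],
)
print(df.dtypes)

# Rows can be addressed by label rather than by position.
print(df.loc["c", "city"])  # Pune

# Missing values (NaN) are skipped by default in reductions such as mean().
print(df["temp_c"].mean())  # 25.25
```

Note that the text column and the numeric column live side by side, which a single NumPy array cannot do without falling back to a generic object type.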

As usual, you get Pandas either as part of a package or by installing it via pip3. Pandas also offers optional extras that you can install alongside it, such as "pandas[performance]", which pulls in recommended dependencies that speed up some operations.

5.1. Pandas Data Series

A Pandas Series is a one-dimensional array of values with either a default index (the integers 0, 1, 2, ...) or an explicit index. It behaves a bit like a dictionary that maps index labels to values. You can create a Series from a dictionary, a list, a NumPy array, or a single scalar value, or by reading it from a file. The basic way to create a Series is the Series constructor.
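As a minimal sketch of the in-memory creation routes listed above (the variable names and values are chosen only for illustration), the constructor can be called like this:

```python
import numpy as np
import pandas as pd

# From a list: Pandas supplies the default index 0, 1, 2.
from_list = pd.Series([10, 20, 30])

# From a dictionary: the keys become the index labels.
from_dict = pd.Series({"a": 1, "b": 2})

# From a NumPy array, with an explicit index passed alongside.
from_array = pd.Series(np.array([1.5, 2.5]), index=["x", "y"])

# From a scalar: the value is repeated once for every index label.
from_scalar = pd.Series(7, index=["p", "q", "r"])

print(list(from_list.index))   # [0, 1, 2]
print(from_dict["b"])          # 2, dictionary-style lookup by label
print(from_array["y"])         # 2.5
print(from_scalar.tolist())    # [7, 7, 7]
```

The lookup from_dict["b"] shows the dictionary-like behavior: values are retrieved by label rather than by position.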