Pandas Basics — 1

Devil’s Advocate
3 min read · Feb 22, 2023

Pandas is a popular Python library used mainly for data manipulation and data analysis.

Let’s look at some of the common use cases for Pandas.

Data I/O: Pandas can read data from, and write data to, many sources, including CSV files, Excel files, and SQL databases.

Data Cleaning and Transformation: Pandas provides a wide range of tools for cleaning, transforming and merging data, making it an important library for tasks involving data wrangling.

Data Exploration and Analysis: Pandas supports data exploration and analysis with easy-to-use tools for summary statistics, grouping data by different criteria, and basic plotting (built on Matplotlib).
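The three use cases above can be sketched in a few lines. This is only a minimal illustration: the order and customer data, the column names, and the choice to fill missing amounts with 0 are all made up for the example.

```python
import io
import pandas as pd

# Hypothetical sample data, inlined so the sketch runs without any files.
orders_csv = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,10,25.0\n"
    "2,11,40.0\n"
    "3,10,\n"
    "3,10,\n"
)
customers = pd.DataFrame({"customer_id": [10, 11], "city": ["Pune", "Delhi"]})

# Data I/O: read CSV text into a DataFrame.
orders = pd.read_csv(orders_csv)

# Cleaning: drop the exact duplicate row, then fill the missing amount with 0.
orders = orders.drop_duplicates().fillna({"amount": 0.0})

# Merging: attach each customer's city to their orders.
merged = orders.merge(customers, on="customer_id", how="left")

# Exploration: total order amount per city.
totals = merged.groupby("city")["amount"].sum()
print(totals)
```

Whether 0 is a sensible fill value depends entirely on the data; it is used here only to keep the example short.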

Apart from these, Pandas is also used for working with time series data (for example, resampling and rolling window calculations), and it has several methods for handling missing data, such as interpolation and filling.
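Here is a small sketch of the time series and missing-data tools mentioned above. The daily readings are invented for illustration, and the window sizes are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with two missing values.
idx = pd.date_range("2023-01-01", periods=6, freq="D")
s = pd.Series([1.0, np.nan, 3.0, 4.0, np.nan, 6.0], index=idx)

# Missing data: linear interpolation vs. carrying the last value forward.
interpolated = s.interpolate()
filled = s.ffill()

# Resampling: average the interpolated values over 2-day bins.
two_day_mean = interpolated.resample("2D").mean()

# Rolling window: 3-day moving average (the first two entries are NaN).
rolling = interpolated.rolling(window=3).mean()

print(interpolated.tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(filled.tolist())        # [1.0, 1.0, 3.0, 4.0, 4.0, 6.0]
print(two_day_mean.tolist())  # [1.5, 3.5, 5.5]
```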

Let’s now look at some of the most commonly used methods in Pandas.

read_csv(): reads data from a CSV file and returns a DataFrame. For example:

import pandas as pd
df = pd.read_csv('data.csv')
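If you do not have a data.csv file at hand, you can try read_csv() on in-memory text instead. The sketch below uses made-up sensor data and the parse_dates parameter to convert a column to dates; the column names are assumptions for the example.

```python
import io
import pandas as pd

# Hypothetical CSV content held in memory instead of a file on disk.
csv_text = io.StringIO(
    "date,device,temperature\n"
    "2023-02-20,a1,21.5\n"
    "2023-02-21,b2,22.0\n"
)

# parse_dates converts the 'date' column from text to datetime values.
df = pd.read_csv(csv_text, parse_dates=["date"])

print(df.shape)   # (2, 3)
print(df.dtypes)
```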

head(): returns the first n rows of a DataFrame. For example:

import pandas as pd
df = pd.read_csv('data.csv')
df.head(10)

By default, head() returns the first 5 rows of a DataFrame. If your DataFrame has fewer rows than the number you ask for, head() returns all the available rows.

For example, if you call df.head(10) on a DataFrame with only 7 rows, it will return all 7 rows.
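You can check this behaviour yourself with a small DataFrame built in code. The single column "n" below is just a placeholder.

```python
import pandas as pd

# A hypothetical DataFrame with exactly 7 rows.
df = pd.DataFrame({"n": range(7)})

print(len(df.head()))           # 5, the default
print(len(df.head(10)))         # 7, all available rows
print(df.head(3)["n"].tolist()) # [0, 1, 2]
```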

tail(): returns the last n rows of a DataFrame. For example:

import pandas as pd
df = pd.read_csv('data.csv')
df.tail(10)

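tail() otherwise behaves like head(): it returns 5 rows by default, and it returns all the available rows if the DataFrame has fewer rows than the number you ask for.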
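A quick sketch with a made-up 7-row DataFrame shows tail() in action. Note that the rows keep their original index labels.

```python
import pandas as pd

# A hypothetical DataFrame with exactly 7 rows.
df = pd.DataFrame({"n": range(7)})

print(df.tail()["n"].tolist())   # [2, 3, 4, 5, 6], the default last 5 rows
print(len(df.tail(10)))          # 7, all available rows
print(df.tail(2).index.tolist()) # [5, 6], original index labels are kept
```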

describe(): returns summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for each numeric column of a DataFrame. For example:

import pandas as pd
df = pd.read_csv('data.csv')
df.describe()

By default, if the DataFrame contains a mix of numeric and non-numeric columns, the non-numeric columns are excluded from the output. To include them, pass include='all' to describe().
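The sketch below compares the default output with include='all' on a small made-up DataFrame that has one text column and one numeric column.

```python
import pandas as pd

# Hypothetical data: one text column and one numeric column.
df = pd.DataFrame({
    "device": ["a1", "b2", "a1"],
    "temperature": [20.0, 22.0, 24.0],
})

# Default: only the numeric column is summarised.
numeric_summary = df.describe()
print(numeric_summary.columns.tolist())  # ['temperature']

# include='all' adds the text column, with count, unique, top and freq rows.
full_summary = df.describe(include="all")
print(full_summary)
```

In the full summary, statistics that do not apply to a column (such as the mean of a text column) show up as NaN.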

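describe() also works on a single column: select the column and call describe() on the resulting Series. Numeric columns get the same statistics as above, while non-numeric columns get the count, the number of unique values, the most frequent value, and its frequency. For example: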

import pandas as pd
df = pd.read_csv('sensors.csv')
df['device'].describe()
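To try this without a sensors.csv file, here is a sketch that assumes the device column holds text labels. The values are invented for the example.

```python
import pandas as pd

# Hypothetical stand-in for the 'device' column of sensors.csv.
df = pd.DataFrame({"device": ["a1", "b2", "a1"]})

summary = df["device"].describe()
print(summary["count"])   # 3
print(summary["unique"])  # 2
print(summary["top"])     # a1
print(summary["freq"])    # 2
```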

We will look at more such useful methods in subsequent posts.
