Pandas is a Python library for analyzing and modifying tabular data. It brings SQL-like operations such as filtering, grouping, and joining into Python, which makes data analysis more efficient than working with plain lists and loops.


Pandas is written mostly in Python, with its performance-critical parts implemented in Cython and C. It is a high-performance, high-level data analysis library. It lets us work with large sets of data through a tabular structure called a DataFrame.

Purpose of Pandas

  • Calculating statistics and answering questions about the data, like the average, median, max, and min of each column
  • Finding correlations between columns
  • Tracking the distribution of one or more columns
  • Visualizing the data with the help of matplotlib, using bar plots, histograms, etc.
  • Cleaning and filtering data, such as missing or incomplete values, by applying a user-defined function (UDF) or a built-in function
  • Loading tabular data into Python so you can work with it
  • Exporting the data to a CSV file, another file format, or a database
  • Engineering new feature columns that can be used in your analysis
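Several of the tasks above can be sketched in a few lines. This is only a minimal illustration: the column names `price` and `quantity` and all of the values are made up for the example, and the CSV is written to an in-memory buffer rather than to a real file.

```python
import io
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "price": [2.0, 4.0, 6.0, 8.0, None],   # last value is missing
    "quantity": [1, 2, 3, 4, 5],
})

# Statistics: missing values are skipped by default
mean_price = df["price"].mean()
median_quantity = df["quantity"].median()

# Correlation between two columns (rows with a missing value are ignored)
corr = df["price"].corr(df["quantity"])

# Cleaning: drop rows that contain missing values
clean = df.dropna()

# Feature engineering: derive a new column from existing ones
clean = clean.assign(total=clean["price"] * clean["quantity"])

# Exporting: write CSV text to a buffer (a file path would work the same way)
buffer = io.StringIO()
clean.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing a file name such as "out.csv" to `to_csv` instead of the buffer would write the same content to disk.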

Pandas data types

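  1. Series ➤ One-dimensional labeled array capable of holding data of any type
  2. DataFrame ➤ Two-dimensional labeled table, similar to a spreadsheet, in which each column is a Series
  3. Axis ➤ Direction an operation runs along: axis = 0 runs down the rows (one result per column); axis = 1 runs across the columns (one result per row)
  4. Record ➤ A single row
  5. dtype ➤ Data type of a Series or of each column in a DataFrame
  6. Time Series ➤ Series object indexed by timestamps, like weather readings tracked by the hour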
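The terms above are easier to tell apart in a small example. The following is a rough sketch with made-up values; the variable names are chosen only for illustration.

```python
import pandas as pd

# Series: a one-dimensional labeled array
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# DataFrame: a two-dimensional table; each column is a Series
df = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

# dtype: each column has its own data type (x holds integers, y holds floats)
column_types = df.dtypes

# Axis: axis=0 collapses the rows, axis=1 collapses the columns
col_sums = df.sum(axis=0)   # one result per column
row_sums = df.sum(axis=1)   # one result per row

# Record: a single row, selected here by position
first_record = df.iloc[0]

# Time series: a Series whose index is made of timestamps
hours = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"]
)
temperature = pd.Series([12.5, 13.1, 12.8], index=hours)
```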

How to create a dataframe in Pandas

The example below creates a DataFrame from a dictionary. What is the purpose of the seed() method? It sets the starting state of the random number generator, so the same sequence of random numbers is produced on every run.

import random
import pandas as pd

# Fix the seed so the same random ages are generated on every run;
# any fixed value works, but a different value gives different ages
random.seed(3)

names = ["Jess", "Jordan", "Sandy", "Ted", "Barney", "Tyler", "Rebecca"]
ages = [random.randint(18, 35) for _ in names]  # one age from 18 to 35 (inclusive) per name
people = {"names": names, "ages": ages}

# Build the DataFrame: dictionary keys become the column names
df = pd.DataFrame.from_dict(people)
print(df)

The output

     names  ages
0     Jess    25
1   Jordan    35
2    Sandy    22
3      Ted    29
4   Barney    33
5    Tyler    20
6  Rebecca    18
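To see the effect of seed() more directly, the sketch below builds a smaller DataFrame twice with the same seed and compares the results. The helper name `make_people` is hypothetical and exists only for this illustration. It also passes the dictionary straight to `pd.DataFrame`, which for a dictionary of lists like this one gives the same result as `from_dict`.

```python
import random
import pandas as pd

def make_people(seed):
    """Hypothetical helper: build a small DataFrame using a fixed seed."""
    random.seed(seed)
    names = ["Jess", "Jordan", "Sandy"]
    ages = [random.randint(18, 35) for _ in names]
    # Passing the dictionary to the constructor; keys become column names
    return pd.DataFrame({"names": names, "ages": ages})

first = make_people(3)
second = make_people(3)

# Same seed, so both DataFrames hold identical ages
print(first.equals(second))  # True
```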
