INTRODUCTION TO PANDAS
A LIBRARY FOR DATA MANIPULATION AND ANALYSIS
USING POWERFUL DATA STRUCTURES
TYPES OF DATA STRUCTURES IN PANDAS
Data Structure   Dimensions   Description
Series           1            1D labeled homogeneous array, size-immutable.
DataFrame        2            General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.
SERIES
• A Series is a one-dimensional, array-like structure with homogeneous data. For example, the following Series is a collection of integers:
• 10 23 56 17 52 61 73 90 26 72
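A minimal sketch of how the Series on this slide could be built. The variable name `s` is an illustrative choice, not something taken from the slides.

```python
import pandas as pd

# The ten integers shown on the slide; with no index given,
# pandas assigns the default labels 0..9.
s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

print(s)         # labels on the left, values on the right
print(s.dtype)   # one dtype for the whole Series (homogeneous data)
print(s[2])      # look up the value stored under label 2
```

Because all values share one dtype, a Series behaves much like a labeled NumPy array.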
DataFrame
• DataFrame is a two-dimensional array with heterogeneous data. For example,
Name    Age   Gender   Rating
Steve   32    Male     3.45
Lia     28    Female   4.6
Vin     45    Male     3.9
Katie   38    Female   2.78
Data Type of Columns
Column   Type
Name     String
Age      Integer
Gender   String
Rating   Float
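The table above can be reproduced as a DataFrame to check the column types. This is a sketch using the values from the slide. The exact dtype names printed depend on the pandas version; string columns are commonly reported as `object`.

```python
import pandas as pd

# One dict key per column; each list holds that column's values.
df = pd.DataFrame({
    "Name": ["Steve", "Lia", "Vin", "Katie"],
    "Age": [32, 28, 45, 38],
    "Gender": ["Male", "Female", "Male", "Female"],
    "Rating": [3.45, 4.6, 3.9, 2.78],
})

print(df)
print(df.dtypes)   # one dtype per column: integer for Age, float for Rating
```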
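PANEL
• A Panel is a three-dimensional data structure with heterogeneous data. It is hard to represent a Panel graphically, but it can be pictured as a container of DataFrames.
• Note: Panel was deprecated and has been removed from current pandas releases (since 0.25). A DataFrame with a MultiIndex is the usual replacement.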
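Since Panel is not available in recent pandas releases, the "container of DataFrames" idea can be approximated with a dict of DataFrames combined by `pd.concat`. The sketch below is one possible substitute, not the Panel API itself. The frame contents and the keys `item1` and `item2` are hypothetical.

```python
import pandas as pd

# Two hypothetical 2x2 frames standing in for the "items" of a Panel.
frames = {
    "item1": pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"]),
    "item2": pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"]),
}

# Concatenating a dict uses its keys as the outer level of a MultiIndex,
# so the third dimension becomes an extra index level.
stacked = pd.concat(frames)
print(stacked)

# Selecting one outer label gives back the corresponding 2D frame.
print(stacked.loc["item2"])
```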
DataFrame
• A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
• Features of DataFrame
• Columns can be of different types
• Size-mutable
• Labeled axes (rows and columns)
• Arithmetic operations can be performed on rows and columns
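The features listed above can be illustrated with a small, hypothetical numeric frame. The column names `x`, `y`, `z` and the row labels are assumptions made for this sketch.

```python
import pandas as pd

# Labeled axes: row labels a..c, column labels x and y.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]}, index=["a", "b", "c"])

# Arithmetic along each axis.
col_totals = df.sum(axis=0)   # one total per column
row_totals = df.sum(axis=1)   # one total per row

# Size-mutable: a new column can be added in place.
df["z"] = df["x"] * 2

print(col_totals)
print(row_totals)
print(df)
```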
Structure
pandas.DataFrame
pandas.DataFrame(data, index, columns, dtype, copy)
• data
• data takes various forms such as ndarray, Series, map, lists, dict, constants, or another DataFrame.
• index
• Row labels for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.
• columns
• Column labels for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.
• dtype
• Data type to force on all columns. If omitted, the type is inferred per column.
• copy
• Whether to copy the data from the input. The default is False (None in recent pandas versions, where the behavior depends on the input type).
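A short sketch of the constructor with every parameter spelled out. The array contents and the labels (`r1`..`r3`, `a`, `b`) are hypothetical.

```python
import numpy as np
import pandas as pd

raw = np.array([[1, 2], [3, 4], [5, 6]])

# All five parameters given explicitly.
df = pd.DataFrame(
    raw,
    index=["r1", "r2", "r3"],   # row labels
    columns=["a", "b"],         # column labels
    dtype="float64",            # force one dtype on every column
    copy=True,                  # do not share memory with `raw`
)
print(df)

# With index and columns omitted, both default to 0..n-1.
default_df = pd.DataFrame(raw)
print(default_df)
```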
• Create DataFrame
• A pandas DataFrame can be created from various inputs, such as:
• Lists
• dict
• Series
• NumPy ndarrays
• Another DataFrame
Example 1
• import pandas as pd
• data = [_______]
• df = pd.DataFrame(data)
• print(df)
Example 2
• import pandas as pd
• data = {'Name': ['__', '__'], 'Age': [___]}
• df = pd.DataFrame(data)
• print(df)
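The two templates above leave the values blank. The sketch below fills them with hypothetical values (the numbers, names, and ages are invented for illustration) to show what each form produces.

```python
import pandas as pd

# From a plain list: one unnamed column, labeled 0 by default.
data_list = [1, 2, 3, 4, 5]            # hypothetical values
df_from_list = pd.DataFrame(data_list)
print(df_from_list)

# From a dict of lists: keys become column labels.
# All lists must have the same length.
data_dict = {"Name": ["Tom", "Jack"], "Age": [28, 34]}   # hypothetical values
df_from_dict = pd.DataFrame(data_dict)
print(df_from_dict)
```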
Example
• import pandas as pd
• data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
• df = pd.DataFrame(data)
• # Column 'c' is NaN in the first row because the first dict has no key 'c'
• print(df)
• ________________________________________
• import pandas as pd
• data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
• # Same data, with explicit row labels
• df = pd.DataFrame(data, index=['first', 'second'])
• print(df)
The following example shows how to create a DataFrame with a list of dictionaries, row indices, and column indices.
• import pandas as pd
• data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
• # Two column labels, both matching dictionary keys
• df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
• # Two column labels, one of which ('b1') matches no dictionary key, so that column is filled with NaN
• df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
• print(df1)
• print(df2)
Create a DataFrame from Dict of Series
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
• # The row index is the union of both Series indexes; 'one' has no label 'd', so that cell is NaN
• df = pd.DataFrame(d)
• print(df)
Column Addition
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
• df = pd.DataFrame(d)
• # Add a new column to an existing DataFrame by assigning a new Series to a column label
• print("Adding a new column by passing as Series:")
• df['three'] = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
• print(df)
• # Add a new column computed from existing columns
• print("Adding a new column using the existing columns in DataFrame:")
• df['four'] = df['one'] + df['three']
• print(df)
Column Deletion
• # Starting from the previous DataFrame, delete columns
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']), 'three': pd.Series([10, 20, 30], index=['a', 'b', 'c'])}
• df = pd.DataFrame(d)
• print("Our dataframe is:")
• print(df)
• # Using the del statement
• print("Deleting the first column using the del statement:")
• del df['one']
• print(df)
• # Using the pop method, which removes the column and returns it
• print("Deleting another column using the pop method:")
• df.pop('two')
• print(df)
Slicing rows
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']), 'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
• df = pd.DataFrame(d)
• # Rows at positions 2 and 3 (labels 'c' and 'd'); df.iloc[2:4] selects the same rows
• print(df[2:4])
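Addition of rows
• # df is assumed to be an existing DataFrame with columns 'a' and 'b'
• df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
• # DataFrame.append was removed in pandas 2.0; pd.concat is its replacement
• df = pd.concat([df, df2])
• print(df)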
Deletion of rows
• df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
• # Drop rows by index label; every row labeled 0 is removed
• df = df.drop(0)
• print(df)
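The two row operations above can be combined into one runnable sketch. The starting frame `df` is an assumption (the slides do not define it), and `pd.concat` is used for adding rows because `DataFrame.append` is not available in pandas 2.0 and later.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])    # assumed starting frame
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# Adding rows: the original index labels are kept, so they repeat (0, 1, 0, 1).
combined = pd.concat([df, df2])
print(combined)

# Deleting rows by label: both rows labeled 0 are dropped.
dropped = combined.drop(0)
print(dropped)

# ignore_index=True renumbers the rows 0..3, so drop(0) removes only one row.
renumbered = pd.concat([df, df2], ignore_index=True)
print(renumbered.drop(0))
```

Dropping by label rather than by position is why duplicate labels matter here.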
Reindexing
• import pandas as pd
• import numpy as np
• df1 = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
• df2 = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])
• # Give df1 the same row and column labels as df2, which keeps only rows 0..6
• df1 = df1.reindex_like(df2)
• print(df1)
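The example above uses random data, so its output changes on every run. The sketch below shows the same idea with fixed, hypothetical values and an explicit list of labels passed to `reindex`, which makes the effect on missing labels visible.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3]}, index=["a", "b", "c"])

# Reindexing reorders rows to match the requested labels.
# Labels missing from the original ('z') get NaN; labels left out ('b') are dropped.
reindexed = df.reindex(["c", "a", "z"])
print(reindexed)
```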
Concatenating objects
• import pandas as pd
• one = pd.DataFrame({'Name': ['__'], 'subject_id': ['__'], 'marks': ['__']}, index=[])
• two = pd.DataFrame({'Name': ['__'], 'subject_id': ['__'], 'marks': ['__']}, index=[])
• print(pd.concat([one, two]))
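The template above leaves the values and index blank. A sketch with hypothetical names, subject ids, marks, and row labels shows what `pd.concat` returns; the `keys` argument is an optional extra that tags each row with its source frame.

```python
import pandas as pd

# All values and index labels below are hypothetical.
one = pd.DataFrame(
    {"Name": ["Alex", "Amy"], "subject_id": ["sub1", "sub2"], "marks": [98, 90]},
    index=[1, 2],
)
two = pd.DataFrame(
    {"Name": ["Billy", "Brian"], "subject_id": ["sub2", "sub4"], "marks": [89, 80]},
    index=[1, 2],
)

# Rows of `two` are stacked under the rows of `one`; index labels are kept as-is.
result = pd.concat([one, two])
print(result)

# keys adds an outer index level identifying the source frame.
keyed = pd.concat([one, two], keys=["x", "y"])
print(keyed)
```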
Handling categorical data
• Many kinds of data are repetitive; for example, gender, country, and codes take the same values again and again.
• Categorical variables can take on only a limited number of possible values.
• The categorical data type is useful in the following cases:
• A string variable consisting of only a few different values. Converting such a string variable to a categorical variable will save some memory.
• The lexical order of a variable is not the same as the logical order ("one", "two", "three"). By converting to a categorical and specifying an order on the categories, sorting and min/max will use the logical order instead of the lexical order.
• As a signal to other Python libraries that this column should be treated as a categorical variable (e.g. to use suitable statistical methods or plot types).
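The point about lexical versus logical order can be sketched with the "one", "two", "three" example from the list above. The variable names and the order of the sample values are illustrative assumptions.

```python
import pandas as pd

s = pd.Series(["three", "one", "two", "one"])

# Plain strings sort lexically: "three" comes before "two".
print(s.sort_values().tolist())    # ['one', 'one', 'three', 'two']

# An ordered categorical sorts by the order given in `categories`.
cat = pd.Categorical(s, categories=["one", "two", "three"], ordered=True)
cs = pd.Series(cat)
print(cs.sort_values().tolist())   # ['one', 'one', 'two', 'three']
print(cs.min(), cs.max())          # one three
```

Passing `ordered=True` is what makes `min`, `max`, and comparisons meaningful for the categories.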
• import pandas as pd
• cat = pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])
• print(cat)
• ____________________________________________
• import pandas as pd
• import numpy as np
• # Category 'b' is declared but never used; np.nan marks a missing value
• cat = pd.Categorical(["a", "c", "c", np.nan], categories=["b", "a", "c"])
• df = pd.DataFrame({"cat": cat, "s": ["a", "c", "c", np.nan]})
• print(df.describe())
• print(df["cat"].describe())