Python Numpy/Pandas Libraries
Machine Learning
Portland Data Science Group
Created by Andrew Ferlitsch
Community Outreach Officer
July, 2017
Libraries - Numpy
• A popular math library in Python for Machine Learning
is ‘numpy’.
import numpy as np
import : keyword to import a library
as : keyword to refer to the library by an alias (shortcut) name
Numpy.org : NumPy is the fundamental package for scientific computing with Python.
• a powerful N-dimensional array object
• sophisticated (broadcasting) functions
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and random number capabilities
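The four capabilities listed above can be tried in a few lines. This is a minimal sketch, not taken from the slides; the arrays, the seed, and the variable names are illustrative assumptions.

```python
import numpy as np

# Broadcasting: the 1D row is stretched across both rows of the 2D array.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
broadcast_sum = matrix + row              # result has shape (2, 3)

# Linear algebra: solve A @ x = b for x.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)

# Fourier transform of a constant signal: only the zero-frequency term is non-zero.
spectrum = np.fft.fft(np.ones(4))

# Random numbers: a seeded generator gives reproducible draws (seed chosen arbitrarily).
rng = np.random.default_rng(seed=0)
samples = rng.random(3)                   # three floats in [0, 1)
```

Note that `matrix + row` works without a loop even though the shapes differ, which is what "broadcasting" refers to.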
Libraries - Numpy
The most important data structure for scientific computing in Python
is the NumPy array. NumPy arrays are used to store lists of numerical
data and to represent vectors, matrices, and even tensors.
NumPy arrays are designed to handle large data sets efficiently and
with a minimum of fuss. The NumPy library has a large set of routines
for creating, manipulating, and transforming NumPy arrays.
Core Python has an array data structure, but it’s not nearly as versatile,
efficient, or useful as the NumPy array.
http://www.physics.nyu.edu/pine/pymanual/html/chap3/chap3_arrays.html
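A small comparison may make the point concrete. The sketch below uses a plain Python list as the core-Python counterpart (an assumption; the slide does not say which core structure it has in mind), and the values are made up for illustration.

```python
import numpy as np

values = [1.0, 2.0, 3.0]

# Plain Python list: arithmetic needs an explicit loop, and * repeats the list.
doubled_list = [2 * v for v in values]
repeated = values * 2            # six elements, not doubled values

# NumPy array: arithmetic applies to every element at once.
arr = np.array(values)
doubled_arr = arr * 2
total = arr.sum()
```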
Numpy – Multidimensional Arrays
• Numpy’s main object is a multi-dimensional array.
• Creating a Numpy Array as a Vector:
data = np.array( [ 1, 2, 3 ] )
Numpy function to create a numpy array
Value is: array( [ 1, 2, 3 ] )
• Creating a Numpy Array as a Matrix:
data = np.array( [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7, 8, 9 ] ] )
The outer brackets enclose the whole matrix; each inner bracket is one row.
Value is: array( [ [ 1, 2, 3 ],
[ 4, 5, 6 ],
[ 7, 8, 9 ] ] )
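Once created, the vector and matrix above can be inspected and indexed. A minimal sketch, reusing the slide's values; the variable names are illustrative.

```python
import numpy as np

vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

vector_shape = vector.shape      # one dimension of length 3
matrix_shape = matrix.shape      # three rows, three columns
second_row = matrix[1]           # indexing starts at 0
center = matrix[1, 1]            # row 1, column 1
first_column = matrix[:, 0]      # ':' means all rows
```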
Numpy – Multidimensional Arrays
• Creating an array of Zeros:
data = np.zeros( ( 2, 3 ), dtype=int )
np.zeros: Numpy function to create an array of zeros
Value is: array( [ [ 0, 0, 0 ],
[ 0, 0, 0 ] ] )
• Creating an array of Ones:
data = np.ones( ( 2, 3 ), dtype=int )
np.ones: Numpy function to create an array of ones
( 2, 3 ): shape, given as ( rows, columns )
dtype: data type (default is float)
Value is: array( [ [ 1, 1, 1 ],
[ 1, 1, 1 ] ] )
And many more functions: size, ndim, reshape, arange, …
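The functions named above can be sketched as follows. This is an illustrative example, not from the slides; the variable names are assumptions.

```python
import numpy as np

zeros = np.zeros((2, 3), dtype=int)
default_ones = np.ones((2, 3))       # no dtype given, so the elements are floats

sequence = np.arange(6)              # integers 0 through 5
grid = sequence.reshape(2, 3)        # same data, viewed as 2 rows x 3 columns

n_elements = grid.size               # total number of elements
n_dims = grid.ndim                   # number of dimensions
```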
Libraries - Pandas
• A popular library for importing and managing datasets in Python
for Machine Learning is ‘pandas’.
import pandas as pd
import : keyword to import a library
as : keyword to refer to the library by an alias (shortcut) name
PyData.org : high-performance, easy-to-use data structures and data analysis tools for the
Python programming language.
Used for:
• Data Analysis
• Data Manipulation
• Data Visualization
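As a sketch of importing and managing a dataset, the example below reads CSV text from memory so that it runs without an external file. The CSV content and the `total` column are hypothetical, made up for illustration.

```python
import io
import pandas as pd

# Hypothetical dataset; in practice pd.read_csv would usually be given a file path.
csv_text = """x1,x2,x3,x4
1,2,3,4
4,5,6,7
8,9,10,11
"""

df = pd.read_csv(io.StringIO(csv_text))                    # importing
summary = df.describe()                                    # analysis: count, mean, std, ...
df["total"] = df["x1"] + df["x2"] + df["x3"] + df["x4"]    # manipulation: derived column
```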
Pandas – Indexed Arrays
• Pandas is used to build indexed arrays (1D) and matrices (2D),
where columns and rows are labeled (named) and can be accessed
via the labels (names).
Raw data:
1 2 3 4
4 5 6 7
8 9 10 11
Pandas indexed matrix (row index = samples, column index = features):
       x1  x2  x3  x4
one     1   2   3   4
two     4   5   6   7
three   8   9  10  11
Pandas – Series and Data Frames
• Pandas Indexed Arrays are referred to as Series (1D) and
Data Frames (2D).
• Series is a 1D labeled (indexed) array and can hold any data type,
and a mix of data types.
s = pd.Series( data, index=[ 'x1', 'x2', 'x3', 'x4' ] )
pd.Series: creates the Series; data: raw data; index: index labels
• Data Frame is a 2D labeled (indexed) matrix and can hold any
data type, and a mix of data types.
df = pd.DataFrame( data, index=[ 'one', 'two', 'three' ], columns=[ 'x1', 'x2', 'x3', 'x4' ] )
pd.DataFrame: creates the Data Frame; data: raw data; index: row index labels; columns: column index labels
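A runnable sketch of both constructors, using the 3 x 4 raw data from the indexed-matrix slide. The name `raw` and the Series values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# 1D: four values labeled x1..x4.
s = pd.Series([1, 2, 3, 4], index=["x1", "x2", "x3", "x4"])

# 2D: three labeled rows (samples) and four labeled columns (features).
raw = np.array([[1, 2, 3, 4], [4, 5, 6, 7], [8, 9, 10, 11]])
df = pd.DataFrame(raw, index=["one", "two", "three"],
                  columns=["x1", "x2", "x3", "x4"])

third_value = s["x3"]              # access by label
cell = df.loc["two", "x3"]         # row label, then column label
mixed = pd.Series([1, "a", 2.5])   # mixed types are stored with dtype object
```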
Pandas – Selecting
• Selecting One Column
x1 = df[ 'x1' ]
Selects column labeled x1 for all rows
1
4
8
• Selecting Multiple Columns
x1 = df[ [ 'x1', 'x3' ] ]
Selects columns labeled x1 and x3 for all rows
1 3
4 6
8 10
x1 = df.loc[ :, 'x1':'x3' ]
Selects columns labeled x1 through x3 for all rows
1 2 3
4 5 6
8 9 10
.loc: label-based slicing; the first position ( : ) selects all rows, the second selects the columns
Note: df[ 'x1':'x3' ] does not select columns; a slice placed directly inside df[ ] is applied to the rows.
And many more functions: merge, concat, stack, …
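The selections above, plus one of the combining functions, can be sketched end to end. The label-based slice is written with `.loc`, which current pandas releases provide for this purpose. The `extra` frame and its `x5` column are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4], [4, 5, 6, 7], [8, 9, 10, 11]],
    index=["one", "two", "three"],
    columns=["x1", "x2", "x3", "x4"],
)

one_column = df["x1"]                    # a Series
two_columns = df[["x1", "x3"]]           # a DataFrame with two columns
column_range = df.loc[:, "x1":"x3"]      # label slices include both endpoints
by_position = df.iloc[:, 0:3]            # position slices exclude the end

# concat with axis=1 places frames side by side, aligned on the row index.
extra = pd.DataFrame({"x5": [0, 1, 0]}, index=["one", "two", "three"])
combined = pd.concat([df, extra], axis=1)
```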
Libraries - Matplotlib
• A popular library for plotting and visualizing data in Python
import matplotlib.pyplot as plt
import : keyword to import a library
as : keyword to refer to the library by an alias (shortcut) name
matplotlib.org: Matplotlib is a Python 2D plotting library which produces publication quality
figures in a variety of hardcopy formats and interactive environments across platforms.
Used for:
• Plots
• Histograms
• Bar Charts
• Scatter Plots
• etc
Matplotlib - Plot
• The function plot plots a 2D graph.
plt.plot( x, y )
Function to plot
X values to plot
Y values to plot
• Example:
plt.plot( [ 1, 2, 3 ], [ 4, 6, 8 ] ) # Draws plot in the background
plt.show() # Displays the plot
[Figure: line plot through the points (1, 4), (2, 6), (3, 8); X on the horizontal axis, Y on the vertical axis]
Matplotlib – Plot Labels
• Add Labels for X and Y Axis and Plot Title (caption)
plt.plot( [ 1, 2, 3 ], [ 4, 6, 8 ] )
plt.xlabel( "X Numbers" ) # Label on the X-axis
plt.ylabel( "Y Numbers" ) # Label on the Y-axis
plt.title( "My Plot of X and Y" ) # Title for the Plot
plt.show()
[Figure: the same line plot with x-axis label "X Numbers", y-axis label "Y Numbers", and title "My Plot of X and Y"]
Matplotlib – Multiple Plots and Legend
• You can add multiple plots in a Graph
plt.plot( [ 1, 2, 3 ], [ 4, 6, 8 ], label='1st Line' ) # Plot for 1st Line
plt.plot( [ 1, 2, 3 ], [ 2, 4, 6 ], label='2nd Line' ) # Plot for 2nd Line
plt.xlabel( "X Numbers" )
plt.ylabel( "Y Numbers" )
plt.title( "My Plot of X and Y" )
plt.legend() # Show Legend for the plots
plt.show()
[Figure: two line plots on the same axes, labeled "X Numbers" and "Y Numbers", titled "My Plot of X and Y", with a legend listing "1st Line" and "2nd Line"]
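The same two-line graph can also be built from a Numpy array, which ties the libraries together. This is a sketch under two assumptions: the Y values are computed as 2x + 2 and 2x (which reproduces the numbers used above), and the off-screen "Agg" backend is selected so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

plt.figure()
x = np.array([1, 2, 3])
first_line, = plt.plot(x, 2 * x + 2, label="1st Line")   # Y values 4, 6, 8
second_line, = plt.plot(x, 2 * x, label="2nd Line")      # Y values 2, 4, 6
plt.xlabel("X Numbers")
plt.ylabel("Y Numbers")
plt.title("My Plot of X and Y")
legend = plt.legend()

# The legend entries come from the label= arguments, in plotting order.
legend_labels = [text.get_text() for text in legend.get_texts()]
```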
Matplotlib – Bar Chart
• The function bar plots a bar graph.
plt.bar( [ 1, 2, 3 ], [ 4, 6, 8 ] ) # Draw a bar chart: X positions, then bar heights
plt.show()
[Figure: bar chart with bars of height 4, 6, and 8 at X = 1, 2, 3]
And many more functions: hist, scatter, …
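A minimal sketch of bar, hist, and scatter side by side. The histogram data, the figure size, and the in-memory PNG buffer are illustrative assumptions, and the "Agg" backend is selected only so the code runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; omit for interactive use
import matplotlib.pyplot as plt

plt.figure(figsize=(9, 3))

plt.subplot(1, 3, 1)                       # 1 row, 3 columns, first panel
bars = plt.bar([1, 2, 3], [4, 6, 8])       # X positions, bar heights
plt.title("bar")

plt.subplot(1, 3, 2)
counts, bin_edges, _ = plt.hist([1, 1, 2, 3, 3, 3], bins=3)  # count values per bin
plt.title("hist")

plt.subplot(1, 3, 3)
plt.scatter([1, 2, 3], [4, 6, 8])          # one marker per (x, y) pair
plt.title("scatter")

plt.tight_layout()
buffer = io.BytesIO()
plt.savefig(buffer, format="png")          # use plt.show() instead when a display is available
```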

Python - Numpy/Pandas/Matplot Machine Learning Libraries
