Python for Science and Engineering
Dr Edward Schofield

A*STAR / Singapore Computational Sciences Club Seminar
June 14, 2011
Scientific programming in 2011

 Most scientists and engineers are:
   programming for 50+% of their work time (and rising)
   self-taught programmers
   using inefficient programming practices
   using the wrong programming languages: C++,
   FORTRAN, C#, PHP, Java, ...
Scientific programming needs

 Rapid prototyping
 Efficiency for computational kernels
 Pre-written packages!
   Vectors, matrices, modelling, simulations, visualisation
 Extensibility; web front-ends; database backends; ...
Ed's story:
How I found Python
 PhD in statistical pattern recognition: 2001-2006

 Needed good tools for my research!

 Discovered Python in 2002 after frustration with C++, Matlab,
 Java, Perl

 Contributed to NumPy and SciPy:

   maxent, sparse matrices, optimization, Monte Carlo, etc.

   Managed six releases of SciPy in 2005-6
1. Why Python?
Introducing Python


 What is it?

 What is it good for?

 Who uses it?
What is Python?

 interpreted
 strongly but dynamically typed
 object-oriented
 intuitive, readable
 open source, free
 ‘batteries included’
‘batteries included’

 Python’s standard library
 is:
   very large
   well-supported
   well-documented
Python’s standard library
 data types          strings            networking    threads
 operating system    compression        GUI           arguments
 CGI                 complex numbers    FTP           cryptography
 testing             multimedia         databases     CSV files
 calendar            email              XML           serialization
What is an efficient
programming language?


Native Python code
executes 10x more slowly
than C and FORTRAN
Would you build a racing car ...
... to get to Kuala Lumpur ASAP?
Date         Cost per GFLOPS (US $)    Technology
1961         $1.1 trillion             17 million IBM 1620s
1984         $15,000,000               Cray X-MP
1997         $30,000                   Two 16-CPU clusters of Pentiums
2000, Apr    $1,000                    Bunyip Beowulf cluster
2003, Aug    $82                       KASY0
2007, Mar    $0.42                     Ambric AM2045
2009, Sep    $0.13                     ATI Radeon R800

Source: Wikipedia: “FLOPS”
Unit labor cost growth
Proxy for cost of programmer time
Efficiency


 When FORTRAN was invented, computer time was more
 expensive than programmer time.

 In the 1980s and 1990s that reversed.
Efficient programming



 Python code is 10x faster
 to write than C and
 FORTRAN
What if ...
... you now need to reach Sydney?
Advantages of Python

 Easy to write

 Easy to maintain

 Great standard libraries

 Thriving ecosystem of
 third-party packages

 Open source
Question
What is the date 177 days from now?
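A minimal sketch of how the standard library can answer this question, using the `datetime` module. The helper name `days_from` is made up for illustration; the second call assumes the seminar date (14 June 2011) as the starting point.

```python
from datetime import date, timedelta


def days_from(start, days):
    """Return the calendar date `days` days after `start`."""
    return start + timedelta(days=days)


# Relative to today: the answer depends on when this is run.
print(days_from(date.today(), 177))

# Relative to the date of this seminar, 14 June 2011.
print(days_from(date(2011, 6, 14), 177))
```

Date arithmetic, including month lengths and leap years, is handled by `timedelta`, so no third-party package is needed.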
Natural applications of Python

 Rapid prototyping

 Plotting, visualisation, 3D

 Numerical computing

 Web and database
 programming

 All-purpose glue
Python vs other languages
Languages used at CSIRO

   Python   Fortran       Java


   Matlab     C          VB.net


    IDL      C++           R


    Perl      C#      +5-10 others!
Which language do I choose?


 A different language for each task?

 A language you know?

 A language others in your team are using: support and help?
                              Python      Matlab
Interpreted                   Yes         Yes
Powerful data input/output    Yes         Yes
Great plotting                Yes         Yes
General-purpose language      Powerful    Limited
Cost                          Free        $$$
Open source                   Yes         No
                              Python    C++
Powerful                      Yes       Yes
Portable                      Yes       In theory
Standard libraries            Vast      Limited
Easy to write and maintain    Yes       No
Easy to learn                 Yes       No
                                                Python    C
Fast to write                                   Yes       No
Good for embedded systems, device drivers
  and operating systems                         No        Yes
Good for most other high-level tasks            Yes       No
Standard library                                Vast      Limited
                                    Python    Java
Powerful, well-designed language    Yes       Yes
Standard libraries                  Vast      Vast
Easy to learn                       Yes       No
Code brevity                        Short     Verbose
Easy to write and maintain          Yes       Okay
Open source

Python is open source software

Benefits:

  No vendor lock-in

  Cross-platform

  Insurance against bugs in the platform

  Free
Python success stories

 Computer graphics:

   Industrial Light & Magic

 Web:

   Google: News, Groups, Maps, Gmail

 Legacy system integration:

   AstraZeneca - collaborative drug discovery
Python success stories (2)

 Aerospace:

   NASA

 Research:

   universities worldwide ...

 Others:

   YouTube, Reddit, BitTorrent, Civilization IV
Industrial Light & Magic


 Python spread from
 scripting to the entire
 production pipeline

 Numerous reviews since
 1996: Python is still the
 best tool for them
United Space Alliance


 A common sentiment:

 “We achieve immediate functioning code so much faster in
 Python than in any other language that it’s staggering.”

                       - Robin Friedrich, Senior Project Engineer
Case study: air-traffic control

 Eric Newton, “Python for Critical Applications”:
 http://metaslash.com/brochure/recall.html

 Metaslash, Inc: 1999 to 2001

 Mission-critical system for
 air-traffic control

 Replicated, fault-tolerant
 data storage
Case study: air-traffic control
 Python prototype -> C++ implementation -> Python again

 Why?

   C++ dependencies were buggy

   C++ threads, STL were not portable enough

 Python’s advantages over C++

   More portable

   75% less code: more productivity, fewer bugs
More case studies



 See http://www.python.org/about/success/ for lots more case
 studies and success stories
2. The scientific Python ecosystem
Scientific software
development


 Small beginnings

 Piecemeal growth, quirky interfaces

 ... Large, cumbersome systems
NumPy
An n-dimensional array/matrix package
NumPy
Centre of Python’s numerical computing ecosystem
NumPy


The most fundamental tool for numerical computing in
Python
Fast multi-dimensional array capability
What NumPy defines:

 Two fundamental objects:
 1. n-dimensional array
 2. universal function

 a rich set of numerical data types
 nearly 400 functions and methods on arrays:
   type conversions
   mathematical
   logical
NumPy's features

 Fast. Written in C with BLAS/LAPACK hooks.
 Rich set of data types
 Linear algebra: matrix inversion, decompositions, …
 Discrete Fourier transforms
 Random number generation
 Trig, hypergeometric functions, etc.
Elementwise array operations

 Loops are mostly unnecessary
 Operate on entire arrays!
>>> import numpy
>>> a = numpy.array([20, 30, 40, 50])
>>> a < 35
array([True, True, False, False], dtype=bool)
>>> b = numpy.arange(4)
>>> a - b
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
Universal functions

 NumPy defines 'ufuncs' that operate on entire arrays
 and other sequences (hence 'universal')
 Example: sin()
>>> a = numpy.array([20, 30, 40, 50])
>>> c = 10 * numpy.sin(a)
>>> c
array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
Array slicing


 Arrays can be sliced and indexed powerfully:
>>> a = numpy.arange(10)**3
>>> a
array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
>>> a[2:5]
array([ 8, 27, 64])
Fancy indexing


 Arrays can be used as indices into other arrays:
>>> a = numpy.arange(12)**2
>>> ind = numpy.array([ 1, 1, 3, 8, 5 ])
>>> a[ind]
array([ 1, 1, 9, 64, 25])
Other linear algebra features

 Matrix inversion: mat(A).I

 Or: linalg.inv(A)
 Linear solvers: linalg.solve(A, x)
 Pseudoinverse: linalg.pinv(A)
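A short sketch of these calls on a small system, assuming NumPy is installed. The 2x2 matrix `A` and right-hand side `b` are made-up illustrative values, not from the talk.

```python
import numpy as np

# Illustrative 2x2 system A x = b (values chosen only for the example).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

# Solve the linear system directly; usually preferable to forming the inverse.
x = np.linalg.solve(A, b)

# Explicit inverse and Moore-Penrose pseudoinverse.
A_inv = np.linalg.inv(A)
A_pinv = np.linalg.pinv(A)

print(x)
print(np.allclose(A_inv @ b, x))    # inverse gives the same solution
print(np.allclose(A_pinv, A_inv))   # for an invertible matrix, pinv equals inv
```

For a square, well-conditioned matrix the three routes agree; `solve` is generally the cheaper and more accurate choice when only the solution vector is needed.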
What is SciPy?


 A community
 A conference
 A package of scientific libraries
Python for scientific software


 Back-end: computational work

 Front-end: input / output, visualization, GUIs

 Dozens of great scientific packages exist
Python in science (2)


 NumPy: numerical / array module
 Matplotlib: great 2D and 3D plotting library
 IPython: nice interactive Python shell
 SciPy: set of scientific libraries: sparse matrices, signal
 processing, …
 RPy: integration with the R statistical environment
Python in science (3)


 Cython: C language extensions
 Mayavi: 3D graphics, volumetric rendering
 Nitime, Nipype: Python tools for neuroimaging
 SymPy: symbolic mathematics library
Python in science (4)

 VPython: easy, real-time 3D programming

 UCSF Chimera, PyMOL, VMD: molecular graphics

 PyRAF: Hubble Space Telescope interface to IRAF astronomical
 data

 BioPython: computational molecular biology

 Natural language toolkit: symbolic + statistical NLP

 Physics: PyROOT
The SciPy package
BSD-licensed software for maths, science,
engineering


 integration      signal processing         sparse matrices
 optimization     linear algebra            maximum entropy
 interpolation    ODEs                      statistics
 FFTs             n-dim image processing    scientific constants
 clustering       interpolation             C/C++ and Fortran integration
SciPy optimisation example
Fit a model to noisy data:
y = (a / x^b) sin(cx) + ε
Example: fitting a model with
scipy.optimize

 Task: Fit a model of the form y = (a / x^b) sin(cx) + ε
 to noisy data.
 Spec:
 1. Generate noisy data
 2. Choose parameters (a, b, c) to minimize sum squared
 errors
 3. Plot the data and fitted model (next session)
SciPy optimisation example
import numpy
import pylab
from scipy.optimize import leastsq

def myfunc(params, x):
    (a, b, c) = params
    return a / (x**b) * numpy.sin(c * x)

true_params = [1.5, 0.1, 2.]
def f(x):
    return myfunc(true_params, x)

def err(params, x, y): # error function
    return myfunc(params, x) - y
SciPy optimisation example
# Generate noisy data to fit
n = 30; xmin = 0.1; xmax = 5
x = numpy.linspace(xmin, xmax, n)
y = f(x)
# Add uniform noise scaled to 20% of the data range
y += numpy.random.rand(len(x)) * 0.2 * (y.max() - y.min())

v0 = [3., 1., 4.]  # initial parameter estimate
# Fit: leastsq returns the solution and an integer status flag
v, success = leastsq(err, v0, args=(x, y), maxfev=10000)

print('Estimated parameters:', v)
print('True parameters:', true_params)
# Plot the noisy data (red dots) and the fitted model on a finer grid
X = numpy.linspace(xmin, xmax, 5 * n)
pylab.plot(x, y, 'ro', X, myfunc(v, X))
pylab.show()
Ingredients for this example


 numpy.linspace

 numpy.random.rand for the noise model (uniform)

 scipy.optimize.leastsq
Sparse matrix example
Construct and solve a sparse linear system
Sparse matrices
Sparse matrices are mostly zeros.
They can be symmetric or
asymmetric.
Sparsity patterns vary:
  block sparse, band matrices, ...
They can be huge!
Only non-zeros are stored.
Sparse matrices in SciPy



 SciPy supports seven sparse storage schemes
 ... and sparse solvers in Fortran.
Sparse matrix creation

 To construct a 1000x1000 lil_matrix and add values:
>>> from scipy.sparse import lil_matrix
>>> from numpy.random import rand
>>> from scipy.sparse.linalg import spsolve

>>>   A = lil_matrix((1000, 1000))
>>>   A[0, :100] = rand(100)
>>>   A[1, 100:200] = A[0, :100]
>>>   A.setdiag(rand(1000))
Solving sparse matrix
systems
 Now convert the matrix to CSR format and solve Ax=b:
>>> A = A.tocsr()
>>> b = rand(1000)
>>> x = spsolve(A, b)

# Convert it to a dense matrix and solve, and
# check that the result is the same:
>>> from numpy.linalg import solve, norm
>>> x_ = solve(A.todense(), b)
# Compute norm of the error:
>>> err = norm(x - x_)
>>> err < 1e-10
True
Matplotlib

 Great plotting package in Python
 Matlab-like syntax
 Great rendering: anti-aliasing etc.
 Many ‘backends’: Cairo, GTK, Cocoa, PDF
 Flexible output: to EPS, PS, PDF, TIFF, PNG, ...
Matplotlib: worked examples
Search the web for 'Matplotlib gallery'
Example: NumPy
vectorization
1. Use a Monte Carlo algorithm to
   estimate π:

   1. Generate uniform random points (x, y) over the unit
      square [0, 1] x [0, 1].

   2. Estimate π from the proportion p that land inside the unit
      circle (π ≈ 4p, since the quarter circle has area π/4).
2. Time two ways of doing this:
   1. Using for loops

   2. Using array operations (vectorized)
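One possible way to carry out this exercise is sketched below, assuming NumPy is installed. The function names `pi_loop`, `pi_vectorized` and `timed`, the sample size and the seeds are all illustrative choices, not taken from the talk.

```python
import random
import time

import numpy as np


def pi_loop(n, seed=0):
    """Estimate pi with a plain Python for loop."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n


def pi_vectorized(n, seed=0):
    """Estimate pi with NumPy array operations (no explicit loop)."""
    rng = np.random.default_rng(seed)
    xy = rng.random((n, 2))                      # n points in the unit square
    inside = np.count_nonzero((xy ** 2).sum(axis=1) <= 1.0)
    return 4.0 * inside / n


def timed(func, n):
    """Return (estimate, elapsed seconds) for one call of func(n)."""
    start = time.perf_counter()
    estimate = func(n)
    return estimate, time.perf_counter() - start


n = 200_000
for func in (pi_loop, pi_vectorized):
    estimate, seconds = timed(func, n)
    print(f"{func.__name__}: pi ~ {estimate:.4f} in {seconds:.4f} s")
```

The two estimates differ slightly because they use different random number generators; the timings depend on the machine, but the array version is typically much faster.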
3. Scaling
HPC
High-performance computing
Aspects to HPC
   Supercomputers       Distributed clusters / grids


 Parallel programming            Scripting


Caches, shared memory          Job control


    Code porting          Specialized hardware
Python for HPC
 Advantages              Disadvantages
 Portability             Global interpreter lock
 Easy scripting, glue    Less control than C
 Maintainability         Native loops are slow

Profiling to identify hotspots

 Vectorization with NumPy
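As a rough illustration of profiling to find hotspots, the sketch below uses the standard library's `cProfile` and `pstats` modules. The functions `slow_sum_of_squares`, `fast_constant` and `workload` are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately slow: a native Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant(n):
    """Deliberately cheap, for contrast in the profile."""
    return n


def workload():
    slow_sum_of_squares(200_000)
    fast_constant(200_000)


# Profile one run of the workload.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the report as text, sorted by cumulative time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed report the loop-based function should dominate the run time, which marks it as the candidate for vectorization.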
Large data sets

 Useful Python language features:
   Generators, iterators
 Useful packages:
   Great HDF5 support from PyTables!
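The sketch below shows how generators let a program stream through data that would not fit in memory, using only the standard library. The names `read_values` and `running_mean` are illustrative, and `fake_file` is a stand-in for a real large file.

```python
def read_values(lines):
    """Lazily yield one float per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield float(line)


def running_mean(values):
    """Consume an iterator and return its mean using constant memory."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
    return total / count if count else float("nan")


# Stand-in for a large file: lines are produced on demand, never stored.
fake_file = (f"{i}\n" for i in range(1_000_000))
print(running_mean(read_values(fake_file)))
```

An open file object is itself an iterator over lines, so `read_values(open(path))` would work the same way for a real file.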
Hierarchical data
Databases without the relational baggage
Great interface for HDF5 data
Efficient support for massive data sets
Applications of PyTables

     aeronautics       telecommunications


   drug discovery          data mining


  financial analysis     statistical analysis


  climate prediction           etc.
Breaking news: June 2011

PyTables Pro is now being open sourced.
  Indexed searches for speed
Merging with PyTables
Working project name: NewPyTables
PyTables performance

OPSI indexing engine speed:
  Querying 10 billion rows can take hundredths of a
  second!
Target use-case:
  mostly read-only or append-only data
Principles for efficient code
Important principles

1. "Premature optimization is the root of all evil"
      Don't write cryptic code just to make it more efficient!


2. 1-5% of the code takes up the vast majority of the
   computing time!
      ... and it might not be the 1-5% that you think!
Checklist for efficient code
 From most to least important:

 1. Check: Do you really need to make it more efficient?
 2. Check: Are you using the right algorithms and data
    structures?
 3. Check: Are you reusing pre-written libraries wherever
    possible?
 4. Check: Which parts of the code are expensive?
    Measure, don't guess!
Relative efficiency gains

 Exponential-order and polynomial-order speedups are
 possible by choosing the right algorithm for a task.
   These require the right data structures!
 These dwarf 10-25x linear-order speedups from:
   using lower-level languages
   using different language constructs.
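To make the contrast concrete, here is a small sketch in which a change of algorithm (adding a cache, i.e. a suitable data structure) turns an exponential number of calls into a linear one. The Fibonacci example and the `calls` counter are illustrative choices, not from the talk.

```python
from functools import lru_cache

# Count how many times each version is called.
calls = {"naive": 0, "memo": 0}


def fib_naive(n):
    """Exponential time: recomputes the same subproblems repeatedly."""
    calls["naive"] += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Linear time: each subproblem is computed once and cached."""
    calls["memo"] += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


n = 20
result_naive = fib_naive(n)
result_memo = fib_memo(n)
print(result_naive, result_memo)
print(calls)
```

The gap in call counts widens rapidly as `n` grows, which no constant-factor speedup from a lower-level language could match.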
4. About Python Charmers
The largest Python training provider in South-East Asia
Delighted customers include:
Most popular course topics
         Python for Programmers            3 days

    Python for Scientists and Engineers    4 days

         Python for Geoscientists          4 days

        Python for Bioinformaticians       4 days

New courses:

       Python for Financial Engineers      4 days
    Python for IT Security Professionals   3 days
Python Charmers:
Topics of expertise
 Python: beginners, advanced
 Scientific data processing with Python
 Software engineering with Python
 Large-scale problems: HPC, huge data sets, grids
 Statistics and Monte Carlo problems
Python Charmers:
Topics of expertise (2)
 Spatial data analysis / GIS
 General scripting, job control, glue
 GUIs with PyQt
 Integrating with other languages: R, C, C++, Fortran, ...
 Web development in Django
How to get in touch


 See PythonCharmers.com
 or email us at: info@pythoncharmers.com
Python for Science and Engineering: a presentation to A*STAR and the Singapore Computational Sciences Club, Edward Schofield, Python Charmers, June 2011

Mais conteúdo relacionado

Mais procurados

B4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, Kenya
B4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, KenyaB4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, Kenya
B4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, KenyaIceland Geothermal
 
Continual/Lifelong Learning with Deep Architectures
Continual/Lifelong Learning with Deep ArchitecturesContinual/Lifelong Learning with Deep Architectures
Continual/Lifelong Learning with Deep ArchitecturesVincenzo Lomonaco
 
Present Indian energy scenario
Present Indian energy scenarioPresent Indian energy scenario
Present Indian energy scenarioBISHAL DAS
 
High performance computing for research
High performance computing for researchHigh performance computing for research
High performance computing for researchEsteban Hernandez
 
Speed Up Your Kubernetes Upgrades For Your Kafka Clusters
Speed Up Your Kubernetes Upgrades For Your Kafka ClustersSpeed Up Your Kubernetes Upgrades For Your Kafka Clusters
Speed Up Your Kubernetes Upgrades For Your Kafka ClustersVanessa Vuibert
 
Zero shot-learning: paper presentation
Zero shot-learning: paper presentationZero shot-learning: paper presentation
Zero shot-learning: paper presentationJérémie Kalfon
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance ComputingDell World
 
Edge Computing Architecture using GPUs and Kubernetes
Edge Computing Architecture using GPUs and KubernetesEdge Computing Architecture using GPUs and Kubernetes
Edge Computing Architecture using GPUs and KubernetesVirtualTech Japan Inc.
 
Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Animesh Singh
 
TFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU DelegatesTFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU DelegatesKoan-Sin Tan
 
DTW18 - code08 - Everything You Need To Know About Storage with Kubernetes
DTW18 - code08 - Everything You Need To Know About Storage with KubernetesDTW18 - code08 - Everything You Need To Know About Storage with Kubernetes
DTW18 - code08 - Everything You Need To Know About Storage with KubernetesKendrick Coleman
 
Aircraft Simulation Model and Flight Control Laws Design Using Scilab and XCos
Aircraft Simulation Model and Flight Control Laws Design Using Scilab and XCosAircraft Simulation Model and Flight Control Laws Design Using Scilab and XCos
Aircraft Simulation Model and Flight Control Laws Design Using Scilab and XCosScilab
 
10 Reasons for Choosing OpenSplice DDS
10 Reasons for Choosing OpenSplice DDS10 Reasons for Choosing OpenSplice DDS
10 Reasons for Choosing OpenSplice DDSAngelo Corsaro
 
Google edge tpu
Google edge tpuGoogle edge tpu
Google edge tpuRouyun Pan
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle Databricks
 

Mais procurados (20)

B4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, Kenya
B4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, KenyaB4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, Kenya
B4 - Quantum Power - Menengai 35MW Geothermal Project Nakuru County, Kenya
 
Continual/Lifelong Learning with Deep Architectures
Continual/Lifelong Learning with Deep ArchitecturesContinual/Lifelong Learning with Deep Architectures
Continual/Lifelong Learning with Deep Architectures
 
Present Indian energy scenario
Present Indian energy scenarioPresent Indian energy scenario
Present Indian energy scenario
 
High performance computing for research
High performance computing for researchHigh performance computing for research
High performance computing for research
 
Compressed Air Energy Storage CAES
Compressed Air Energy Storage CAESCompressed Air Energy Storage CAES
Compressed Air Energy Storage CAES
 
Meta learning tutorial
Meta learning tutorialMeta learning tutorial
Meta learning tutorial
 
Speed Up Your Kubernetes Upgrades For Your Kafka Clusters
Speed Up Your Kubernetes Upgrades For Your Kafka ClustersSpeed Up Your Kubernetes Upgrades For Your Kafka Clusters
Speed Up Your Kubernetes Upgrades For Your Kafka Clusters
 
Zero shot-learning: paper presentation
Zero shot-learning: paper presentationZero shot-learning: paper presentation
Zero shot-learning: paper presentation
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance Computing
 
Edge Computing Architecture using GPUs and Kubernetes
Edge Computing Architecture using GPUs and KubernetesEdge Computing Architecture using GPUs and Kubernetes
Edge Computing Architecture using GPUs and Kubernetes
 
Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)Kubeflow Pipelines (with Tekton)
Kubeflow Pipelines (with Tekton)
 
TFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU DelegatesTFLite NNAPI and GPU Delegates
TFLite NNAPI and GPU Delegates
 
DTW18 - code08 - Everything You Need To Know About Storage with Kubernetes
DTW18 - code08 - Everything You Need To Know About Storage with KubernetesDTW18 - code08 - Everything You Need To Know About Storage with Kubernetes
DTW18 - code08 - Everything You Need To Know About Storage with Kubernetes
 
Aircraft Simulation Model and Flight Control Laws Design Using Scilab and XCos
Aircraft Simulation Model and Flight Control Laws Design Using Scilab and XCosAircraft Simulation Model and Flight Control Laws Design Using Scilab and XCos
Aircraft Simulation Model and Flight Control Laws Design Using Scilab and XCos
 
10 Reasons for Choosing OpenSplice DDS
10 Reasons for Choosing OpenSplice DDS10 Reasons for Choosing OpenSplice DDS
10 Reasons for Choosing OpenSplice DDS
 
Serving models using KFServing
Serving models using KFServingServing models using KFServing
Serving models using KFServing
 
Google edge tpu
Google edge tpuGoogle edge tpu
Google edge tpu
 
Yarn
YarnYarn
Yarn
 
MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle MLFlow: Platform for Complete Machine Learning Lifecycle
MLFlow: Platform for Complete Machine Learning Lifecycle
 
Apache KAfka
Apache KAfkaApache KAfka
Apache KAfka
 

Destaque

Austin Python Learners Meetup - Everything you need to know about programming...
Austin Python Learners Meetup - Everything you need to know about programming...Austin Python Learners Meetup - Everything you need to know about programming...
Austin Python Learners Meetup - Everything you need to know about programming...Danny Mulligan
 
Python for System Administrators
Python for System AdministratorsPython for System Administrators
Python for System AdministratorsRoberto Polli
 
B sc_I_General chemistry U-IV Ligands and chelates
B sc_I_General chemistry U-IV Ligands and chelates  B sc_I_General chemistry U-IV Ligands and chelates
B sc_I_General chemistry U-IV Ligands and chelates Rai University
 
Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...
Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...
Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...Dierk Raabe
 
Python Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the FuturePython Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the FutureWes McKinney
 
Mossbauer spectroscopy - Principles and applications
Mossbauer spectroscopy - Principles and applicationsMossbauer spectroscopy - Principles and applications
Mossbauer spectroscopy - Principles and applicationsSANTHANAM V
 
Cordination compounds
Cordination compoundsCordination compounds
Cordination compoundsAnjani Sharma
 
Python for R Users
Python for R UsersPython for R Users
Python for R UsersAjay Ohri
 
Python入門 : 4日間コース社内トレーニング
Python入門 : 4日間コース社内トレーニングPython入門 : 4日間コース社内トレーニング
Python入門 : 4日間コース社内トレーニングYuichi Ito
 
COMPUTATIONAL CHEMISTRY
COMPUTATIONAL CHEMISTRY COMPUTATIONAL CHEMISTRY
COMPUTATIONAL CHEMISTRY Komal Rajgire
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesMatt Harrison
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShareSlideShare
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShareSlideShare
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShareSlideShare
 

Destaque (16)

Austin Python Learners Meetup - Everything you need to know about programming...
Austin Python Learners Meetup - Everything you need to know about programming...Austin Python Learners Meetup - Everything you need to know about programming...
Austin Python Learners Meetup - Everything you need to know about programming...
 
Reverse engineering with python
Reverse engineering with pythonReverse engineering with python
Reverse engineering with python
 
Python for System Administrators
Python for System AdministratorsPython for System Administrators
Python for System Administrators
 
B sc_I_General chemistry U-IV Ligands and chelates
B sc_I_General chemistry U-IV Ligands and chelates  B sc_I_General chemistry U-IV Ligands and chelates
B sc_I_General chemistry U-IV Ligands and chelates
 
Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...
Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...
Ab initio simulation in materials science, Dierk Raabe, lecture at IHPC Singa...
 
Mossbauer spectroscopy
Mossbauer spectroscopy Mossbauer spectroscopy
Mossbauer spectroscopy
 
Python Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the FuturePython Data Ecosystem: Thoughts on Building for the Future
Python Data Ecosystem: Thoughts on Building for the Future
 
Mossbauer spectroscopy - Principles and applications
Mossbauer spectroscopy - Principles and applicationsMossbauer spectroscopy - Principles and applications
Mossbauer spectroscopy - Principles and applications
 
Cordination compounds
Cordination compoundsCordination compounds
Cordination compounds
 
Python for R Users
Python for R UsersPython for R Users
Python for R Users
 
Python入門 : 4日間コース社内トレーニング
Python入門 : 4日間コース社内トレーニングPython入門 : 4日間コース社内トレーニング
Python入門 : 4日間コース社内トレーニング
 
COMPUTATIONAL CHEMISTRY
COMPUTATIONAL CHEMISTRY COMPUTATIONAL CHEMISTRY
COMPUTATIONAL CHEMISTRY
 
Learn 90% of Python in 90 Minutes
Learn 90% of Python in 90 MinutesLearn 90% of Python in 90 Minutes
Learn 90% of Python in 90 Minutes
 
2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare2015 Upload Campaigns Calendar - SlideShare
2015 Upload Campaigns Calendar - SlideShare
 
What to Upload to SlideShare
What to Upload to SlideShareWhat to Upload to SlideShare
What to Upload to SlideShare
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShare
 

Semelhante a Python for Science and Engineering: a presentation to A*STAR and the Singapore Computational Sciences Club, Edward Schofield, Python Charmers, June 2011

Python Intro For Managers
Python Intro For ManagersPython Intro For Managers
Python Intro For ManagersAtul Shridhar
 
Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Pythonic doesn't mean slow!
Pythonic doesn't mean slow!Ronan Lamy
 
What is Python? (Silicon Valley CodeCamp 2015)
What is Python? (Silicon Valley CodeCamp 2015)What is Python? (Silicon Valley CodeCamp 2015)
What is Python? (Silicon Valley CodeCamp 2015)wesley chun
 
Python and its Applications
Python and its ApplicationsPython and its Applications
Python and its ApplicationsAbhijeet Singh
 
Pythonanditsapplications 161121160425
Pythonanditsapplications 161121160425Pythonanditsapplications 161121160425
Pythonanditsapplications 161121160425Sapna Tyagi
 
Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Travis Oliphant "Python for Speed, Scale, and Science"
Travis Oliphant "Python for Speed, Scale, and Science"Fwdays
 
What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)wesley chun
 
The Joy of SciPy
The Joy of SciPyThe Joy of SciPy
The Joy of SciPykammeyer
 
Python 101 for the .NET Developer
Python 101 for the .NET DeveloperPython 101 for the .NET Developer
Python 101 for the .NET DeveloperSarah Dutkiewicz
 
Python @ PiTech - March 2009
Python @ PiTech - March 2009Python @ PiTech - March 2009
Python @ PiTech - March 2009tudorprodan
 
Python_final_print_batch_II_vision_academy.pdf
Python_final_print_batch_II_vision_academy.pdfPython_final_print_batch_II_vision_academy.pdf
Python_final_print_batch_II_vision_academy.pdfbhagyashri686896
 
Python_final_print_batch_II_vision_academy (1).pdf
Python_final_print_batch_II_vision_academy (1).pdfPython_final_print_batch_II_vision_academy (1).pdf
Python_final_print_batch_II_vision_academy (1).pdfrupaliakhute
 
Python_final_print_batch_II_vision_academy.pdf
Python_final_print_batch_II_vision_academy.pdfPython_final_print_batch_II_vision_academy.pdf
Python_final_print_batch_II_vision_academy.pdfsannykhopade
 
Python_vision_academy notes
Python_vision_academy notes Python_vision_academy notes
Python_vision_academy notes rajaniraut
 
PyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsPyCon2022 - Building Python Extensions
PyCon2022 - Building Python ExtensionsHenry Schreiner
 


Python for Science and Engineering: a presentation to A*STAR and the Singapore Computational Sciences Club, Edward Schofield, Python Charmers, June 2011

  • 1. Python for Science and Engineering Dr Edward Schofield A*STAR / Singapore Computational Sciences Club Seminar June 14, 2011
  • 2. Scientific programming in 2011 Most scientists and engineers are: programming for 50+% of their work time (and rising) self-taught programmers using inefficient programming practices using the wrong programming languages: C++, FORTRAN, C#, PHP, Java, ...
  • 3. Scientific programming needs Rapid prototyping Efficiency for computational kernels Pre-written packages! Vectors, matrices, modelling, simulations, visualisation Extensibility; web front-ends; database backends; ...
  • 4. Ed's story: How I found Python PhD in statistical pattern recognition: 2001-2006 Needed good tools for my research! Discovered Python in 2002 after frustration with C++, Matlab, Java, Perl Contributed to NumPy and SciPy: maxent, sparse matrices, optimization, Monte Carlo, etc. Managed six releases of SciPy in 2005-6
  • 6. Introducing Python What is it? What is it good for? Who uses it?
  • 7. What is Python? interpreted strongly but dynamically typed object-oriented intuitive, readable open source, free ‘batteries included’
  • 8. ‘batteries included’ Python’s standard library is: very large well-supported well-documented
  • 9. Python’s standard library: data types, strings, networking, threads, operating system, compression, GUI, arguments, CGI, complex numbers, FTP, cryptography, testing, multimedia, databases, CSV files, calendar, email, XML, serialization
  • 10. What is an efficient programming language? Native Python code executes 10x more slowly than C and FORTRAN
  • 11. Would you build a racing car ... ... to get to Kuala Lumpur ASAP?
  • 12. Date | Cost per GFLOPS (US $) | Technology
        1961 | $1.1 trillion | 17 million IBM 1620s
        1984 | $15,000,000 | Cray X-MP
        1997 | $30,000 | Two 16-CPU clusters of Pentiums
        2000, Apr | $1000 | Bunyip Beowulf cluster
        2003, Aug | $82 | KASY0
        2007, Mar | $0.42 | Ambric AM2045
        2009, Sep | $0.13 | ATI Radeon R800
        Source: Wikipedia: “FLOPS”
  • 13. Unit labor cost growth Proxy for cost of programmer time
  • 14. Efficiency When FORTRAN was invented, computer time was more expensive than programmer time. In the 1980s and 1990s that reversed.
  • 15. Efficient programming Python code is 10x faster to write than C and FORTRAN
  • 16. What if ... ... you now need to reach Sydney?
  • 17. Advantages of Python Easy to write Easy to maintain Great standard libraries Thriving ecosystem of third-party packages Open source
  • 18. ‘Batteries included’ Python’s standard library is: very large well supported well documented
  • 19. Python’s standard library: data types, strings, networking, threads, operating system, compression, GUI, arguments, CGI, complex numbers, FTP, cryptography, testing, multimedia, databases, CSV files, calendar, email, XML, serialization
  • 20. Question What is the date 177 days from now?
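The question on slide 20 can be answered with the standard library's `datetime` module alone. The following is a minimal sketch; the helper name `days_from` is illustrative and does not come from the talk.

```python
from datetime import date, timedelta

def days_from(start, days):
    """Return the calendar date `days` days after `start`."""
    return start + timedelta(days=days)

# Relative to today:
print(days_from(date.today(), 177))

# Relative to the date of the talk (June 14, 2011):
print(days_from(date(2011, 6, 14), 177))
```

`timedelta` arithmetic handles month lengths and leap years, so no manual calendar logic is needed.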
  • 21. Natural applications of Python Rapid prototyping Plotting, visualisation, 3D Numerical computing Web and database programming All-purpose glue
  • 22. Python vs other languages
  • 23. Languages used at CSIRO: Python, Fortran, Java, Matlab, C, VB.net, IDL, C++, R, Perl, C#, plus 5-10 others!
  • 24. Which language do I choose? A different language for each task? A language you know? A language others in your team are using: support and help?
  • 25. Feature | Python | Matlab
        Interpreted | Yes | Yes
        Powerful data input/output | Yes | Yes
        Great plotting | Yes | Yes
        General-purpose language | Powerful | Limited
        Cost | Free | $$$
        Open source | Yes | No
  • 26. Feature | Python | C++
        Powerful | Yes | Yes
        Portable | Yes | In theory
        Standard libraries | Vast | Limited
        Easy to write and maintain | Yes | No
        Easy to learn | Yes | No
  • 27. Feature | Python | C
        Fast to write | Yes | No
        Good for embedded systems, device drivers and operating systems | No | Yes
        Good for most other high-level tasks | Yes | No
        Standard library | Vast | Limited
  • 28. Feature | Python | Java
        Powerful, well-designed language | Yes | Yes
        Standard libraries | Vast | Vast
        Easy to learn | Yes | No
        Code brevity | Short | Verbose
        Easy to write and maintain | Yes | Okay
  • 29. Open source Python is open source software Benefits: No vendor lock-in Cross-platform Insurance against bugs in the platform Free
  • 30. Python success stories Computer graphics: Industrial Light & Magic Web: Google: News, Groups, Maps, Gmail Legacy system integration: AstraZeneca - collaborative drug discovery
  • 31. Python success stories (2) Aerospace: NASA Research: universities worldwide ... Others: YouTube, Reddit, BitTorrent, Civilization IV,
  • 32. Industrial Light & Magic Python spread from scripting to the entire production pipeline Numerous reviews since 1996: Python is still the best tool for them
  • 33. United Space Alliance A common sentiment: “We achieve immediate functioning code so much faster in Python than in any other language that it’s staggering.” - Robin Friedrich, Senior Project Engineer
  • 34. Case study: air-traffic control Eric Newton, “Python for Critical Applications”: http://metaslash.com/brochure/recall.html Metaslash, Inc: 1999 to 2001 Mission-critical system for air-traffic control Replicated, fault-tolerant data storage
  • 35. Case study: air-traffic control Python prototype -> C++ implementation -> Python again Why? C++ dependencies were buggy C++ threads, STL were not portable enough Python’s advantages over C++ More portable 75% less code: more productivity, fewer bugs
  • 36. More case studies See http://www.python.org/about/success/ for lots more case studies and success stories
  • 37. 2. The scientific Python ecosystem
  • 38. Scientific software development Small beginnings Piecemeal growth, quirky interfaces ... Large, cumbersome systems
  • 40. NumPy Centre of Python’s numerical computing ecosystem
  • 41. NumPy The most fundamental tool for numerical computing in Python Fast multi-dimensional array capability
  • 42. What NumPy defines: Two fundamental objects: 1. n-dimensional array 2. universal function a rich set of numerical data types nearly 400 functions and methods on arrays: type conversions mathematical logical
  • 43. NumPy's features Fast. Written in C with BLAS/LAPACK hooks. Rich set of data types Linear algebra: matrix inversion, decompositions, … Discrete Fourier transforms Random number generation Trig, hyperbolic functions, etc.
  • 44. Elementwise array operations Loops are mostly unnecessary Operate on entire arrays!
        >>> a = numpy.array([20, 30, 40, 50])
        >>> a < 35
        array([True, True, False, False], dtype=bool)
        >>> b = numpy.arange(4)
        >>> a - b
        array([20, 29, 38, 47])
        >>> b**2
        array([0, 1, 4, 9])
  • 45. Universal functions NumPy defines 'ufuncs' that operate on entire arrays and other sequences (hence 'universal') Example: sin()
        >>> a = numpy.array([20, 30, 40, 50])
        >>> c = 10 * numpy.sin(a)
        >>> c
        array([ 9.12945251, -9.88031624, 7.4511316 , -2.62374854])
  • 46. Array slicing Arrays can be sliced and indexed powerfully:
        >>> a = numpy.arange(10)**3
        >>> a
        array([ 0, 1, 8, 27, 64, 125, 216, 343, 512, 729])
        >>> a[2:5]
        array([ 8, 27, 64])
  • 47. Fancy indexing Arrays can be used as indices into other arrays:
        >>> a = numpy.arange(12)**2
        >>> ind = numpy.array([ 1, 1, 3, 8, 5 ])
        >>> a[ind]
        array([ 1, 1, 9, 64, 25])
  • 48. Other linear algebra features Matrix inversion: mat(A).I Or: linalg.inv(A) Linear solvers: linalg.solve(A, x) Pseudoinverse: linalg.pinv(A)
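The three routines named on slide 48 can be compared on a small system. This is a sketch only: the 2x2 matrix `A` and right-hand side `b` are made-up illustrative values, and the example uses plain arrays with `numpy.linalg` rather than the `mat(A).I` form.

```python
import numpy as np

# A small, well-conditioned system chosen purely for illustration.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
b = np.array([1.0, 2.0])

A_inv = np.linalg.inv(A)     # explicit matrix inverse
x = np.linalg.solve(A, b)    # solves A x = b without forming the inverse
A_pinv = np.linalg.pinv(A)   # Moore-Penrose pseudoinverse

# For a square, invertible A all three routes agree.
print(np.allclose(A @ x, b))
print(np.allclose(A_inv @ b, x))
print(np.allclose(A_pinv, A_inv))
```

When only the solution of a linear system is needed, `solve` is generally preferable to computing the inverse explicitly, since it is cheaper and numerically more stable.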
  • 49. What is SciPy? A community A conference A package of scientific libraries
  • 50. Python for scientific software Back-end: computational work Front-end: input / output, visualization, GUIs Dozens of great scientific packages exist
  • 51. Python in science (2) NumPy: numerical / array module Matplotlib: great 2D and 3D plotting library IPython: nice interactive Python shell SciPy: set of scientific libraries: sparse matrices, signal processing, … RPy: integration with the R statistical environment
  • 52. Python in science (3) Cython: C language extensions Mayavi: 3D graphics, volumetric rendering Nitime, Nipype: Python tools for neuroimaging SymPy: symbolic mathematics library
  • 53. Python in science (4) VPython: easy, real-time 3D programming UCSF Chimera, PyMOL, VMD: molecular graphics PyRAF: Hubble Space Telescope interface to IRAF astronomical data BioPython: computational molecular biology Natural Language Toolkit (NLTK): symbolic + statistical NLP Physics: PyROOT
  • 54. The SciPy package BSD-licensed software for maths, science, engineering: integration, signal processing, sparse matrices, optimization, linear algebra, maximum entropy, interpolation, ODEs, statistics, n-dim image processing, FFTs, scientific constants, C/C++ and Fortran integration, clustering
  • 55. SciPy optimisation example Fit a model to noisy data: y = (a / x^b) sin(cx) + ε
  • 56. Example: fitting a model with scipy.optimize Task: Fit a model of the form y = (a / x^b) sin(cx) + ε to noisy data. Spec: 1. Generate noisy data 2. Choose parameters (a, b, c) to minimize the sum of squared errors 3. Plot the data and fitted model (next session)
  • 57. SciPy optimisation example
        import numpy
        import pylab
        from scipy.optimize import leastsq
        def myfunc(params, x):
            (a, b, c) = params
            return a / (x**b) * numpy.sin(c * x)
        true_params = [1.5, 0.1, 2.]
        def f(x):
            return myfunc(true_params, x)
        def err(params, x, y):  # error function
            return myfunc(params, x) - y
  • 58. SciPy optimisation example
        # Generate noisy data to fit
        n = 30; xmin = 0.1; xmax = 5
        x = numpy.linspace(xmin, xmax, n)
        y = f(x)
        y += numpy.random.rand(len(x)) * 0.2 * (y.max() - y.min())
        v0 = [3., 1., 4.]  # initial param estimate
        # Fitting
        v, success = leastsq(err, v0, args=(x, y), maxfev=10000)
        print('Estimated parameters: ', v)
        print('True parameters: ', true_params)
        X = numpy.linspace(xmin, xmax, 5 * n)
        pylab.plot(x, y, 'ro', X, myfunc(v, X))
        pylab.show()
  • 59. SciPy optimisation example Fit a model to noisy data: y = (a / x^b) sin(cx) + ε
  • 60. Ingredients for this example numpy.linspace numpy.random.rand for the noise model (uniform) scipy.optimize.leastsq
  • 61. Sparse matrix example Construct and solve a sparse linear system
  • 62. Sparse matrices Sparse matrices are mostly zeros. They can be symmetric or asymmetric. Sparsity patterns vary: block sparse, band matrices, ... They can be huge! Only non-zeros are stored.
  • 63. Sparse matrices in SciPy SciPy supports seven sparse storage schemes ... and sparse solvers in Fortran.
  • 64. Sparse matrix creation To construct a 1000x1000 lil_matrix and add values:
        >>> from scipy.sparse import lil_matrix
        >>> from numpy.random import rand
        >>> from scipy.sparse.linalg import spsolve
        >>> A = lil_matrix((1000, 1000))
        >>> A[0, :100] = rand(100)
        >>> A[1, 100:200] = A[0, :100]
        >>> A.setdiag(rand(1000))
  • 65. Solving sparse matrix systems Now convert the matrix to CSR format and solve Ax=b:
        >>> A = A.tocsr()
        >>> b = rand(1000)
        >>> x = spsolve(A, b)
        # Convert it to a dense matrix and solve, and check that the result is the same:
        >>> from numpy.linalg import solve, norm
        >>> x_ = solve(A.todense(), b)
        # Compute norm of the error:
        >>> err = norm(x - x_)
        >>> err < 1e-10
        True
  • 66. Matplotlib Great plotting package in Python Matlab-like syntax Great rendering: anti-aliasing etc. Many ‘backends’: Cairo, GTK, Cocoa, PDF Flexible output: to EPS, PS, PDF, TIFF, PNG, ...
  • 67. Matplotlib: worked examples Search the web for 'Matplotlib gallery'
  • 68. Example: NumPy vectorization 1. Use a Monte Carlo algorithm to estimate π: 1. Generate uniform random variates (x, y) over the unit square [0, 1] x [0, 1]. 2. Estimate π from the proportion p that land inside the unit circle: π ≈ 4p. 2. Time two ways of doing this: 1. Using for loops 2. Using array operations (vectorized)
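The exercise on slide 68 might be sketched as follows. This is not the code from the talk: the function names `pi_loop` and `pi_vectorized`, the sample size and the seeds are all illustrative assumptions. Points are drawn in the unit square, so the fraction landing inside the quarter circle estimates π/4.

```python
import random
import time

import numpy as np

def pi_loop(n, seed=0):
    """Estimate pi with a pure-Python for loop."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n

def pi_vectorized(n, seed=0):
    """Estimate pi with NumPy array operations (no explicit loop)."""
    rng = np.random.default_rng(seed)
    xy = rng.random((n, 2))                      # n points in the unit square
    inside = np.count_nonzero((xy ** 2).sum(axis=1) <= 1.0)
    return 4.0 * inside / n

n = 200_000
for estimator in (pi_loop, pi_vectorized):
    start = time.perf_counter()
    estimate = estimator(n)
    elapsed = time.perf_counter() - start
    print(f"{estimator.__name__}: pi ~ {estimate:.4f} in {elapsed:.4f} s")
```

Timings depend on the machine, so none are quoted here. The vectorized version is typically much faster because the per-point work runs inside NumPy's compiled code.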
  • 71. Aspects to HPC Supercomputers Distributed clusters / grids Parallel programming Scripting Caches, shared memory Job control Code porting Specialized hardware
  • 72. Python for HPC
        Advantages: Portability; Easy scripting, glue; Maintainability; Profiling to identify hotspots; Vectorization with NumPy
        Disadvantages: Global interpreter lock; Less control than C; Native loops are slow
  • 73. Large data sets Useful Python language features: Generators, iterators Useful packages: Great HDF5 support from PyTables!
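A short sketch of why generators help with large data sets: values are produced one at a time, so the whole data set never has to sit in memory. The function names `read_values` and `running_mean` and the sample input are illustrative, not from the talk.

```python
def read_values(lines):
    """Yield one float per non-empty line, lazily."""
    for line in lines:
        line = line.strip()
        if line:
            yield float(line)

def running_mean(values):
    """Consume an iterable once and return its mean using O(1) memory."""
    total = 0.0
    count = 0
    for v in values:
        total += v
        count += 1
    return total / count if count else float("nan")

# Any iterable of lines works here; an open file object would stream from disk.
sample_lines = ["1.0\n", "2.5\n", "\n", "4.0\n"]
print(running_mean(read_values(sample_lines)))
```

Passing an open file object instead of `sample_lines` would process a file of any size with the same code.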
  • 74. Hierarchical data Databases without the relational baggage
  • 75. Great interface for HDF5 data Efficient support for massive data sets
  • 76. Applications of PyTables aeronautics telecommunications drug discovery data mining financial analysis statistical analysis climate prediction etc.
  • 77. Breaking news: June 2011 PyTables Pro is now being open sourced. Indexed searches for speed Merging with PyTables Working project name: NewPyTables
  • 78. PyTables performance OPSI indexing engine speed: Querying 10 billion rows can take hundredths of a second! Target use-case: mostly read-only or append-only data
  • 80. Important principles 1. "Premature optimization is the root of all evil" Don't write cryptic code just to make it more efficient! 2. 1-5% of the code takes up the vast majority of the computing time! ... and it might not be the 1-5% that you think!
  • 81. Checklist for efficient code From most to least important: 1. Check: Do you really need to make it more efficient? 2. Check: Are you using the right algorithms and data structures? 3. Check: Are you reusing pre-written libraries wherever possible? 4. Check: Which parts of the code are expensive? Measure, don't guess!
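"Measure, don't guess" can be put into practice with the standard library's `cProfile` and `pstats` modules. The sketch below profiles a toy workload; the functions `slow_sum_of_squares`, `cheap_setup` and `main` are hypothetical stand-ins for real code.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Deliberately loop-heavy so that it dominates the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def cheap_setup():
    return list(range(10))

def main():
    cheap_setup()
    return slow_sum_of_squares(200_000)

profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report lists call counts and time per function, which shows where the time actually goes before any optimization effort is spent.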
  • 82. Relative efficiency gains Exponential-order and polynomial-order speedups are possible by choosing the right algorithm for a task. These require the right data structures! These dwarf 10-25x linear-order speedups from: using lower-level languages using different language constructs.
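As one illustration of an algorithmic speedup that comes from the data structure alone, membership tests on a list scan every element, while a set uses hashing. The sizes and names below are arbitrary choices for the sketch.

```python
import timeit

n = 2_000
haystack_list = list(range(n))
haystack_set = set(haystack_list)
needles = list(range(0, 2 * n, 2))   # half are present, half are absent

def count_hits(container):
    """Count how many needles are found in the container."""
    return sum(1 for needle in needles if needle in container)

# Both containers give the same answer; only the cost differs.
assert count_hits(haystack_list) == count_hits(haystack_set)

t_list = timeit.timeit(lambda: count_hits(haystack_list), number=5)
t_set = timeit.timeit(lambda: count_hits(haystack_set), number=5)
print(f"list: {t_list:.4f} s, set: {t_set:.4f} s")
```

The gap widens as `n` grows, because the list version does work proportional to `n` per lookup while the set version does roughly constant work on average.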
  • 83. 4. About Python Charmers
  • 84. The largest Python training provider in South-East Asia Delighted customers include:
  • 85. Most popular course topics: Python for Programmers (3 days); Python for Scientists and Engineers (4 days); Python for Geoscientists (4 days); Python for Bioinformaticians (4 days). New courses: Python for Financial Engineers (4 days); Python for IT Security Professionals (3 days)
  • 86. Python Charmers: Topics of expertise Python: beginners, advanced Scientific data processing with Python Software engineering with Python Large-scale problems: HPC, huge data sets, grids Statistics and Monte Carlo problems
  • 87. Python Charmers: Topics of expertise (2) Spatial data analysis / GIS General scripting, job control, glue GUIs with PyQt Integrating with other languages: R, C, C++, Fortran, ... Web development in Django
  • 88. How to get in touch See PythonCharmers.com or email us at: info@pythoncharmers.com