Python as a Replacement for Commercial Stats Packages

  1. Python as a Replacement for Commercial Stats Packages
     Harold Henson, Hensky Consulting
     November 23, 2017, Codie’s Café, Shopify
  2. Software Choice a Key Element of Business Intelligence Infrastructure
     Many areas in government invest in current data on an ongoing basis
       ◦ Software costs are minor relative to total costs
     Many feel safer with known entities
       ◦ Onus is on experts to champion a new option such as Python
     Python is still considered exotic
     However, it is a viable choice!
  3. Several Core Modules
     Statisticians will focus on a few core modules in the entire ecosystem
     Pandas
       ◦ Can build analytical datasets
       ◦ Has many rudimentary techniques
     NumPy
       ◦ Matrix algebra
     SciPy
     StatsModels
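As a rough illustration of the division of labour among the core modules, the sketch below uses pandas to build a small analytical dataset and compute summary statistics, then uses NumPy for a piece of matrix algebra. The records, column names, and the linear system are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records; the column names are illustrative only.
raw = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "year": [2016, 2017, 2016, 2017, 2017],
    "spend": [10.0, 12.0, 8.0, 9.0, 11.0],
})

# pandas: build an analytical dataset with one row per region and year.
analytic = (
    raw.groupby(["region", "year"], as_index=False)["spend"]
       .sum()
       .rename(columns={"spend": "total_spend"})
)

# pandas: a rudimentary technique, summary statistics for one column.
summary = analytic["total_spend"].describe()

# NumPy: matrix algebra, here solving the small linear system A x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

print(analytic)
print(summary)
print(x)
```

SciPy and StatsModels build on these same array and DataFrame objects, so data prepared this way can be passed to them directly.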
  4. Sample Output
     Very Similar to a Commercial Package

                                 OLS Regression Results
     ==============================================================================
     Dep. Variable:                    A2Y   R-squared:                       0.283
     Model:                            OLS   Adj. R-squared:                  0.268
     Method:                 Least Squares   F-statistic:                     18.94
     Date:                Mon, 04 Sep 2017   Prob (F-statistic):           7.02e-05
     Time:                        18:31:07   Log-Likelihood:                -62.546
     No. Observations:                  50   AIC:                             129.1
     Df Residuals:                      48   BIC:                             132.9
     Df Model:                           1
     Covariance Type:            nonrobust
     ==============================================================================
                      coef    std err          t      P>|t|      [0.025      0.975]
     ------------------------------------------------------------------------------
     Intercept      0.0310      0.635      0.049      0.961      -1.246       1.308
     A1Y            0.5432      0.125      4.352      0.000       0.292       0.794
     ==============================================================================
     Omnibus:                        1.817   Durbin-Watson:                   2.084
     Prob(Omnibus):                  0.403   Jarque-Bera (JB):                1.102
     Skew:                          -0.339   Prob(JB):                        0.576
     Kurtosis:                       3.263   Cond. No.                         27.5
     ==============================================================================

     Warnings:
     [1] Standard Errors assumes that the covariance matrix of the errors is correctly specified.
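A table of this shape is what the StatsModels OLS summary prints. The following is a minimal sketch of how such output might be produced, assuming synthetic data generated on the spot (these are not the data behind the slide, so the numbers will differ) and reusing the variable names A1Y and A2Y from the output above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data for illustration only; NOT the data behind the slide.
rng = np.random.default_rng(0)
a1y = rng.normal(loc=5.0, scale=1.0, size=50)
a2y = 0.5 * a1y + rng.normal(scale=1.0, size=50)
df = pd.DataFrame({"A1Y": a1y, "A2Y": a2y})

# Regress A2Y on A1Y; the formula interface adds the intercept automatically.
model = smf.ols("A2Y ~ A1Y", data=df).fit()

# summary() renders the regression table as plain text.
print(model.summary())
```

The fitted object also exposes the individual figures (for example `model.params`, `model.rsquared`, `model.pvalues`), which is convenient when the results feed into further analysis instead of a printed report.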
  5. Numerous Speciality Modules
     Pretty Pandas
       ◦ Henry Hammond
     TensorFlow
       ◦ Independent big data/neural networks project
     scikit-learn
     PyMC
       ◦ Bayesian statistics
  6. Quality of Communications of Results Mixed
     Graphics Package Second to None
     Tabulation is weak
       ◦ Export to spreadsheet is the easiest way to support professional tables in documents
     Jupyter Notebooks very useful for limited applications
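The export-to-spreadsheet workaround for tabulation can be sketched as follows. This is a minimal example under stated assumptions: the survey records and column names are hypothetical, and CSV is used as the interchange format because any spreadsheet can open it without extra dependencies.

```python
import io

import pandas as pd

# Hypothetical survey records; the column names are illustrative only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "answer": ["Yes", "No", "Yes", "Yes", "No"],
})

# Cross-tabulate counts, adding row and column totals.
table = pd.crosstab(
    df["region"], df["answer"], margins=True, margins_name="Total"
)

# Write CSV text that a spreadsheet can open. An in-memory buffer is used
# here; pass a file path to to_csv instead to save the table to disk.
buffer = io.StringIO()
table.to_csv(buffer)
csv_text = buffer.getvalue()

print(csv_text)
```

pandas also provides `DataFrame.to_excel` for writing a workbook directly, which requires an Excel writer engine such as openpyxl to be installed. Final formatting of the table would then be done in the spreadsheet.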
  7. In Summary
     Perfectly Viable Option
     Increased Power May Come at a Cost of Training
       ◦ More research-oriented environments will favour Python
       ◦ High-turnover environments will favour commercial packages
     Other open source projects of note
       ◦ R – has a very long history
       ◦ Julia – the next generation?
  8. References
     Python for Data Analysis – Wes McKinney
       ◦ Core document for project
       ◦ Crucial details are online
       ◦ 2nd edition just released
     Guide to NumPy – Travis Oliphant
       ◦ Applied matrix algebra
     Learning SciPy for Numerical and Scientific Computing – Rojas et al.
