SlideShare uma empresa Scribd logo
1 de 8
Using
Jython
To
Prototype

      Mahout
Code
        Jonathan
Altman
   Principal
Engineer,
Concur
       Twi=er:
@async_io
Who
Am
I
and
What
Do
I
Do?
• By
day:
principal
engineer
at
Concur
• Architect
of
high‐volume
travel
booking
site
• Architect
of
travel
data
model
for
iKnerary

  storage/expense
integraKon
• Currently
team
lead
for
effort
to
leverage
our

  travel
and
spend
data
into
an
effecKve

  recommendaKon
engine
What
is
Mahout?
• Java
library
of
pre‐built
implementaKons
of

  various
machine
learning
tasks
• Recommenders:
collaboraKve
filtering
• Clustering:
grouping
things
by
similarity
• ClassificaKon:
analysis
of
a
corpus
for
clustering
• Intended
to
run
against
Hadoop‐based
data
sets
• h=p://mahout.apache.org/
What
is
jython?
• ImplementaKon
of
python
that
runs
against

  the
jvm
• Has
full
access
to
any
well‐behaved
java
library
• Started
in
1997
by
Jim
Hugunin,
who
also
later

  did
IronPython
for
the
.Net
CLR
• Version
2.5.2
mirrors
python
2.5
• h=p://www.jython.org/
Why
Do
This?
• I
needed
to
evaluate
Mahout’s
suitability
as

  the
toolkit
for
our
travel
recommender
system
• I
am
not
primarily
a
java
dev
(yet?),
and
I
don’t

  know
how
to
create
a
maven
project
• But
I
do
know
python
• Fastest
way
between
2
points
is
a
straight
line
• Step
1:
adapt
sample
code
from
“Mahout
In

  AcKon”
to
jython
How
Do
I
Do
This?
# Add Mahout jars to jython’s path
sys.path.append(os.environ.get("MAHOUT_CORE"))
for jar in glob.glob(os.environ.get("MAHOUT_JAR_DIR") +
"/*.jar"):
    sys.path.append(jar)



# import classes from Mahout jar…
from org.apache.mahout.cf.taste.impl.model.file import *
# Bunch of imports deleted

def main():
    # and we are using the imported FileDataModel
        model = FileDataModel(File(sys.argv[1]))
What
Did
We
Learn?
• About
3
hours
to
port
first
“Mahout
In
AcKon”

  example
to
jython
• 3
minutes
to
port
the
second
• Includes
learning
how
to
import
jars
into
python
• And
building
a
nice
loop
to
punt
on
jar

  dependency
management
:‐)
• Increases
ability
to
experiment
with
ideas
in

  Mahout
by
reducing
ceremony
Want
Some
Extra
Stuff?
• Python
IDEs
that
work
with
jython:
  – PyCharm
(JetBrains)
  – PyDev
(Eclipse
add‐on)
  – WingIDE
(no
debugger)
• Ported
GroupLens
100k
data
set
example
from

  secKon
2.5
of
“Mahout
In
AcKon”
is
at
h=ps://
  gist.github.com/1041033

Mais conteúdo relacionado

Mais procurados

Monitoring kubernetes with prometheus
Monitoring kubernetes with prometheusMonitoring kubernetes with prometheus
Monitoring kubernetes with prometheuscontinohq
 
CICD Pipeline Using Github Actions
CICD Pipeline Using Github ActionsCICD Pipeline Using Github Actions
CICD Pipeline Using Github ActionsKumar Shìvam
 
Using GitHub Actions to Deploy your Workloads to Azure
Using GitHub Actions to Deploy your Workloads to AzureUsing GitHub Actions to Deploy your Workloads to Azure
Using GitHub Actions to Deploy your Workloads to AzureKasun Kodagoda
 
JAX 2013: Introducing Eclipse Orion
JAX 2013: Introducing Eclipse OrionJAX 2013: Introducing Eclipse Orion
JAX 2013: Introducing Eclipse Orionmartinlippert
 
Ktor 部署攻略 - 老派 Fat Jar 大法
Ktor 部署攻略 - 老派 Fat Jar 大法Ktor 部署攻略 - 老派 Fat Jar 大法
Ktor 部署攻略 - 老派 Fat Jar 大法Shengyou Fan
 
A User Interface for adding Machine Learning tools into GitHub
A User Interface for adding Machine Learning tools into GitHubA User Interface for adding Machine Learning tools into GitHub
A User Interface for adding Machine Learning tools into GitHubRumyana Rumenova
 
Chat+twitter app with lift
Chat+twitter app with liftChat+twitter app with lift
Chat+twitter app with liftk4200
 
Spring Tooling: What's new and what's coming
Spring Tooling: What's new and what's comingSpring Tooling: What's new and what's coming
Spring Tooling: What's new and what's comingmartinlippert
 
Ratpack and Grails 3
Ratpack and Grails 3Ratpack and Grails 3
Ratpack and Grails 3Lari Hotari
 
GitHub Actions demo with mabl
GitHub Actions demo with mablGitHub Actions demo with mabl
GitHub Actions demo with mablBertold Kolics
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow IntroductionLiangjun Jiang
 
Automate your business
Automate your businessAutomate your business
Automate your businesszmoog
 
用 OPENRNDR 將 Chatbot 訊息視覺化
用 OPENRNDR 將 Chatbot 訊息視覺化用 OPENRNDR 將 Chatbot 訊息視覺化
用 OPENRNDR 將 Chatbot 訊息視覺化Shengyou Fan
 
Knative CloudEvents
Knative CloudEventsKnative CloudEvents
Knative CloudEventsNobuhiro Sue
 
Ecs gitlab runners
Ecs gitlab runnersEcs gitlab runners
Ecs gitlab runnersdynnamitt
 
Jenkins-Koji plugin presentation on Python & Ruby devel group @ Brno
Jenkins-Koji plugin presentation on Python & Ruby devel group @ BrnoJenkins-Koji plugin presentation on Python & Ruby devel group @ Brno
Jenkins-Koji plugin presentation on Python & Ruby devel group @ BrnoVaclav Tunka
 

Mais procurados (20)

Monitoring kubernetes with prometheus
Monitoring kubernetes with prometheusMonitoring kubernetes with prometheus
Monitoring kubernetes with prometheus
 
CICD Pipeline Using Github Actions
CICD Pipeline Using Github ActionsCICD Pipeline Using Github Actions
CICD Pipeline Using Github Actions
 
Using GitHub Actions to Deploy your Workloads to Azure
Using GitHub Actions to Deploy your Workloads to AzureUsing GitHub Actions to Deploy your Workloads to Azure
Using GitHub Actions to Deploy your Workloads to Azure
 
JAX 2013: Introducing Eclipse Orion
JAX 2013: Introducing Eclipse OrionJAX 2013: Introducing Eclipse Orion
JAX 2013: Introducing Eclipse Orion
 
Ktor 部署攻略 - 老派 Fat Jar 大法
Ktor 部署攻略 - 老派 Fat Jar 大法Ktor 部署攻略 - 老派 Fat Jar 大法
Ktor 部署攻略 - 老派 Fat Jar 大法
 
A User Interface for adding Machine Learning tools into GitHub
A User Interface for adding Machine Learning tools into GitHubA User Interface for adding Machine Learning tools into GitHub
A User Interface for adding Machine Learning tools into GitHub
 
PR workflow
PR workflowPR workflow
PR workflow
 
2d web mapping with flask
2d web mapping with flask2d web mapping with flask
2d web mapping with flask
 
Chat+twitter app with lift
Chat+twitter app with liftChat+twitter app with lift
Chat+twitter app with lift
 
Spring Tooling: What's new and what's coming
Spring Tooling: What's new and what's comingSpring Tooling: What's new and what's coming
Spring Tooling: What's new and what's coming
 
Ratpack and Grails 3
Ratpack and Grails 3Ratpack and Grails 3
Ratpack and Grails 3
 
GitHub Actions demo with mabl
GitHub Actions demo with mablGitHub Actions demo with mabl
GitHub Actions demo with mabl
 
Apache Airflow
Apache AirflowApache Airflow
Apache Airflow
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow Introduction
 
Automate your business
Automate your businessAutomate your business
Automate your business
 
用 OPENRNDR 將 Chatbot 訊息視覺化
用 OPENRNDR 將 Chatbot 訊息視覺化用 OPENRNDR 將 Chatbot 訊息視覺化
用 OPENRNDR 將 Chatbot 訊息視覺化
 
Knative CloudEvents
Knative CloudEventsKnative CloudEvents
Knative CloudEvents
 
Ecs gitlab runners
Ecs gitlab runnersEcs gitlab runners
Ecs gitlab runners
 
Jenkins-Koji plugin presentation on Python & Ruby devel group @ Brno
Jenkins-Koji plugin presentation on Python & Ruby devel group @ BrnoJenkins-Koji plugin presentation on Python & Ruby devel group @ Brno
Jenkins-Koji plugin presentation on Python & Ruby devel group @ Brno
 
Azkaban
AzkabanAzkaban
Azkaban
 

Destaque

Guide to AngularJS Services - NOVA MEAN August 2014
Guide to AngularJS Services - NOVA MEAN August 2014Guide to AngularJS Services - NOVA MEAN August 2014
Guide to AngularJS Services - NOVA MEAN August 2014async_io
 
NOVA MEAN - Why the M in MEAN is a Significant Contributor to Its Success
NOVA MEAN - Why the M in MEAN is a Significant Contributor to Its SuccessNOVA MEAN - Why the M in MEAN is a Significant Contributor to Its Success
NOVA MEAN - Why the M in MEAN is a Significant Contributor to Its Successasync_io
 
Building a Cauldron for Chef to Cook In
Building a Cauldron for Chef to Cook InBuilding a Cauldron for Chef to Cook In
Building a Cauldron for Chef to Cook Inasync_io
 
Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!
Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!
Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!async_io
 
Javascript Promises/Q Library
Javascript Promises/Q LibraryJavascript Promises/Q Library
Javascript Promises/Q Libraryasync_io
 
Dcjq node.js presentation
Dcjq node.js presentationDcjq node.js presentation
Dcjq node.js presentationasync_io
 

Destaque (6)

Guide to AngularJS Services - NOVA MEAN August 2014
Guide to AngularJS Services - NOVA MEAN August 2014Guide to AngularJS Services - NOVA MEAN August 2014
Guide to AngularJS Services - NOVA MEAN August 2014
 
NOVA MEAN - Why the M in MEAN is a Significant Contributor to Its Success
NOVA MEAN - Why the M in MEAN is a Significant Contributor to Its SuccessNOVA MEAN - Why the M in MEAN is a Significant Contributor to Its Success
NOVA MEAN - Why the M in MEAN is a Significant Contributor to Its Success
 
Building a Cauldron for Chef to Cook In
Building a Cauldron for Chef to Cook InBuilding a Cauldron for Chef to Cook In
Building a Cauldron for Chef to Cook In
 
Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!
Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!
Using npm to Manage Your Projects for Fun and Profit - USEFUL INFO IN NOTES!
 
Javascript Promises/Q Library
Javascript Promises/Q LibraryJavascript Promises/Q Library
Javascript Promises/Q Library
 
Dcjq node.js presentation
Dcjq node.js presentationDcjq node.js presentation
Dcjq node.js presentation
 

Semelhante a Using Jython To Prototype Mahout Code

All about that reactive ui
All about that reactive uiAll about that reactive ui
All about that reactive uiPaul van Zyl
 
Modern Web Framework : Play framework
Modern Web Framework : Play frameworkModern Web Framework : Play framework
Modern Web Framework : Play frameworkSuman Adak
 
Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?GetInData
 
Introduction to React native
Introduction to React nativeIntroduction to React native
Introduction to React nativeDhaval Barot
 
Jython in workflow and rules engines
Jython in workflow and rules enginesJython in workflow and rules engines
Jython in workflow and rules enginesVaclav Tunka
 
An evening with React Native
An evening with React NativeAn evening with React Native
An evening with React NativeMike Melusky
 
Google App Engine Java, Groovy and Gaelyk
Google App Engine Java, Groovy and GaelykGoogle App Engine Java, Groovy and Gaelyk
Google App Engine Java, Groovy and GaelykGuillaume Laforge
 
The Kubernetes Effect
The Kubernetes EffectThe Kubernetes Effect
The Kubernetes EffectBilgin Ibryam
 
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)Panagiotis Kanavos
 
Concurrency Programming in Java - 01 - Introduction to Concurrency Programming
Concurrency Programming in Java - 01 - Introduction to Concurrency ProgrammingConcurrency Programming in Java - 01 - Introduction to Concurrency Programming
Concurrency Programming in Java - 01 - Introduction to Concurrency ProgrammingSachintha Gunasena
 
DevOps: Automate all the things
DevOps: Automate all the thingsDevOps: Automate all the things
DevOps: Automate all the thingsMat Mannion
 
Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)
Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)
Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)OpenBlend society
 
Why real integration developers ride Camels
Why real integration developers ride CamelsWhy real integration developers ride Camels
Why real integration developers ride CamelsChristian Posta
 
[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...
[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...
[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...DevDay.org
 
Immutable infrastructure:觀念與實作 (建議)
Immutable infrastructure:觀念與實作 (建議)Immutable infrastructure:觀念與實作 (建議)
Immutable infrastructure:觀念與實作 (建議)William Yeh
 
Introduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning StudioIntroduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning StudioMuralidharan Deenathayalan
 
Introduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning StudioIntroduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning StudioMuralidharan Deenathayalan
 

Semelhante a Using Jython To Prototype Mahout Code (20)

State of angular ecosystem
State of angular ecosystemState of angular ecosystem
State of angular ecosystem
 
All about that reactive ui
All about that reactive uiAll about that reactive ui
All about that reactive ui
 
Modern Web Framework : Play framework
Modern Web Framework : Play frameworkModern Web Framework : Play framework
Modern Web Framework : Play framework
 
Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?Hot to build continuously processing for 24/7 real-time data streaming platform?
Hot to build continuously processing for 24/7 real-time data streaming platform?
 
Introduction to React native
Introduction to React nativeIntroduction to React native
Introduction to React native
 
Jython in workflow and rules engines
Jython in workflow and rules enginesJython in workflow and rules engines
Jython in workflow and rules engines
 
Lattice yapc-slideshare
Lattice yapc-slideshareLattice yapc-slideshare
Lattice yapc-slideshare
 
An evening with React Native
An evening with React NativeAn evening with React Native
An evening with React Native
 
Google App Engine Java, Groovy and Gaelyk
Google App Engine Java, Groovy and GaelykGoogle App Engine Java, Groovy and Gaelyk
Google App Engine Java, Groovy and Gaelyk
 
The Kubernetes Effect
The Kubernetes EffectThe Kubernetes Effect
The Kubernetes Effect
 
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)Parallel and Asynchronous Programming -  ITProDevConnections 2012 (English)
Parallel and Asynchronous Programming - ITProDevConnections 2012 (English)
 
Concurrency Programming in Java - 01 - Introduction to Concurrency Programming
Concurrency Programming in Java - 01 - Introduction to Concurrency ProgrammingConcurrency Programming in Java - 01 - Introduction to Concurrency Programming
Concurrency Programming in Java - 01 - Introduction to Concurrency Programming
 
DevOps: Automate all the things
DevOps: Automate all the thingsDevOps: Automate all the things
DevOps: Automate all the things
 
mpandya_poster
mpandya_postermpandya_poster
mpandya_poster
 
Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)
Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)
Byteman and The Jokre, Sanne Grinovero (JBoss by RedHat)
 
Why real integration developers ride Camels
Why real integration developers ride CamelsWhy real integration developers ride Camels
Why real integration developers ride Camels
 
[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...
[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...
[DevDay 2017] ReactJS Hands on - Speaker: Binh Phan - Developer at mgm techno...
 
Immutable infrastructure:觀念與實作 (建議)
Immutable infrastructure:觀念與實作 (建議)Immutable infrastructure:觀念與實作 (建議)
Immutable infrastructure:觀念與實作 (建議)
 
Introduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning StudioIntroduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning Studio
 
Introduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning StudioIntroduction to Jupyter notebook and MS Azure Machine Learning Studio
Introduction to Jupyter notebook and MS Azure Machine Learning Studio
 

Último

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 

Último (20)

Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Using Jython To Prototype Mahout Code

Notas do Editor

  1. \n
  2. First we built a travel booking tool\nThen we integrated it with expense and built reporting\nThen we went back and built the trip data storage subsystem to handle increased volumes of data\nNow we are trying to put the combined travel and expense data into Hadoop to do analysis and leverage the knowledge of our customers for their benefit\n
  3. So Mahout looked like it might be a good way to bootstrap our efforts around building recommendations. If nothing else, it might be a fast path to v1 while we write more specialized algorithms tuned to our specific data sets as a v2.\n
  4. It’s very cool: Jim H started both projects as tests: jython to see if jvm would be faster than python’s vm. IronPython to “prove” CLR was slow compared to e.g. JVM (it wasn’t)\nYeah, jython’s definitely on the cutting edge with python 2.5 support\n
  5. Mahout appears to be a good system for doing recommendation engines. We need to find out how good, and what its strengths and limitations are.\n\nI do know some java; enough to do some light recreational Android programming. But not only do I know python, the data scientist who will actually determine the optimal factors to build our recommendation engine on knows python. She also doesn’t know java (yet?). So I have a tool that the team is familiar with\n\nJust building Mahout so I could test it out was painful enough. It requires maven2 to build, but since this is an existing project it was all configured for me to just build after downloading. But I still find it painful to watch maven work.\n\nI shuddered at the thought of having to actually do the maven setup for a new project that would have to be built\n\nMost importantly here, what you end up with when you make Mahout accessible via jython is a rapid prototyping/testing/experimentation tool for building out Mahout code. We’ve taken out the ceremony. That’s all.\n\nWhen you’re done figuring out what you need to do, you could then move to compiled java for speed.\n\nBut, for many/most applications, you can probably stop there. The actual Mahout processing is the serious limiting factor here, not the jython code. My suspicion is that there’s far more performance to be gained optimizing the actual Mahout implementation than moving the jython code (which is native jvm by the time it runs) to java/scala/clojure\n
  6. \n
  7. The single largest chunk of my time was actually spent trying to decide what jars I had to append to my jython path, followed by really grokking the jython path/import stuff\n\nAs you can see, after enough time I just punted on the jar dependencies. Every single jar is on the path, although I only import from the ones I need. Worth some research into jython to see if I’m adding any overhead other than search path like opening/inspecting the jars. I suspect not.\n\nNow, if you knew maven, it might take less time to start a new project and get it up than I would take, but once *I* was done, every subsequent jython script takes almost no time to set up, and the project is ready to run as soon as you’ve saved your source code.\n\nWe can work without having to either build a new app for every experiment, or build in some way to control which experiment runs in some ever-growing app\n
  8. I haven’t really tested either PyCharm or PyDev to do these things. Someone else can do *that* lightning talk at a later meetup\n