Improving Python and Spark Performance
and Interoperability with Apache Arrow
Julien Le Dem
Principal Architect
Dremio
Li Jin
Software Engineer
Two Sigma Investments
© 2017 Dremio Corporation, Two Sigma Investments, LP
About Us
Julien Le Dem (@J_)
• Architect at @DremioHQ
• Formerly Tech Lead at Twitter on Data Platforms
• Creator of Parquet
• Apache member
• Apache PMCs: Arrow, Kudu, Incubator, Pig, Parquet
Li Jin (@icexelloss)
• Software Engineer at Two Sigma Investments
• Building a Python-based analytics platform with PySpark
• Other open source projects:
– Flint: A Time Series Library on Spark
– Cook: A Fair Share Scheduler on Mesos
Agenda
• Current state and limitations of PySpark UDFs
• Apache Arrow overview
• Improvements realized
• Future roadmap
Current state and limitations of PySpark UDFs
Why do we need User Defined Functions?
• Some computation is more easily expressed with Python than with Spark built-in functions.
• Examples:
– weighted mean
– weighted correlation 
– exponential moving average
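As a rough illustration of the kind of logic listed above, here is a minimal plain-Python sketch of a weighted mean and an exponential moving average. The function names and the smoothing parameter `alpha` are assumptions made for this example and do not come from the talk.

```python
def weighted_mean(values, weights):
    """Return sum(v * w) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def exponential_moving_average(values, alpha):
    """Return the EMA series: ema[0] = x[0], ema[i] = alpha * x[i] + (1 - alpha) * ema[i - 1]."""
    result = []
    for x in values:
        if not result:
            result.append(float(x))  # seed the series with the first observation
        else:
            result.append(alpha * x + (1 - alpha) * result[-1])
    return result
```

Both functions need several values at once, which is why they fit the group UDF model discussed later better than a row-at-a-time function.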
What is a PySpark UDF?
• A PySpark UDF is a user-defined function executed in the Python runtime.
• Two types:
– Row UDF: 
• lambda x: x + 1
• lambda date1, date2: (date1 - date2).years
– Group UDF (subject of this presentation):
• lambda values: np.mean(np.array(values))
Row UDF
• Operates on a row by row basis
– Similar to `map` operator
• Example:
from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType
df.withColumn(
    'v2',
    udf(lambda x: x + 1, DoubleType())(df.v1)
)
• Performance:
– 60x slower than built-in functions for a simple case
Group UDF
• UDF that operates on more than one row
– Similar to `groupBy` followed by `map` operator
• Example:
– Compute weighted mean by month
Group UDF
• Not supported out of the box:
– Need boilerplate code to pack/unpack multiple rows into a nested row
• Poor performance
– Groups are materialized and then converted to Python data structures
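The packing step described above can be pictured without Spark. The sketch below is plain Python, not PySpark code: it packs flat rows into one nested row per group (roughly what `collect_list` does), then applies a weighted mean to each nested row. The row layout `(month, value, weight)` and the function names are hypothetical.

```python
from collections import defaultdict


def pack_by_key(rows):
    """Pack flat (month, value, weight) rows into one nested row per month."""
    packed = defaultdict(lambda: {"values": [], "weights": []})
    for month, value, weight in rows:
        packed[month]["values"].append(value)
        packed[month]["weights"].append(weight)
    return dict(packed)


def weighted_mean_udf(nested_row):
    """Group UDF body: operates on the packed lists of a single group."""
    values, weights = nested_row["values"], nested_row["weights"]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


rows = [
    ("2017-01", 1.0, 1.0),
    ("2017-01", 3.0, 3.0),
    ("2017-02", 2.0, 1.0),
]
# Pack the rows (the boilerplate part), then apply the UDF to each nested row.
result = {month: weighted_mean_udf(nested) for month, nested in pack_by_key(rows).items()}
```

Only `weighted_mean_udf` carries the actual logic; everything else exists to reshape the data, and every group must be fully materialized as Python lists before the function can run.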
Example: Data Normalization
(values - values.mean()) / values.std()
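For readers who want to check the formula by hand, here is a minimal sketch using only the standard library. It assumes the sample standard deviation (divisor n - 1), which is also what pandas `Series.std()` computes by default; the function name `normalize` is chosen for this example.

```python
import statistics


def normalize(values):
    """Return (x - mean) / std for each x, using the sample standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample std, divisor n - 1
    return [(x - mean) / std for x in values]


normalized = normalize([1.0, 2.0, 3.0])
```

For the input `[1.0, 2.0, 3.0]` the mean is 2.0 and the sample standard deviation is 1.0, so the normalized values are -1.0, 0.0 and 1.0.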
Example: Monthly Data Normalization
[Code listing not preserved; its annotations mark "Useful bits" and "Boilerplate"]
• Poor performance: 16x slower than the baseline groupBy().agg(collect_list())
Problems
• Packing / unpacking nested rows
• Inefficient data movement (Serialization / Deserialization)
• Scalar computation model: object boxing and interpreter overhead
Apache Arrow
Arrow: An open source standard
• Common need for an in-memory columnar format
• Building on the success of Parquet
• Top-level Apache project
• Standard from the start
– Developers from 13+ major open source projects involved:
Calcite, Cassandra, Deeplearning4j, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, Storm, R
• Benefits:
– Share the effort
– Create an ecosystem
Arrow goals
• Well-documented and cross-language compatible
• Designed to take advantage of modern CPUs
• Embeddable
– In execution engines, storage layers, etc.
• Interoperable
High Performance Sharing & Interchange
Before:
• Each system has its own internal memory format
• 70-80% CPU wasted on serialization and deserialization
• Functionality duplication and unnecessary conversions
With Arrow:
• All systems utilize the same memory format
• No overhead for cross-system communication
• Projects can share functionality (e.g., Parquet-to-Arrow reader)
Columnar data
persons = [{
  name: 'Joe',
  age: 18,
  phones: [
    '555-111-1111',
    '555-222-2222'
  ]
}, {
  name: 'Jack',
  age: 37,
  phones: ['555-333-3333']
}]
Record Batch Construction
Stream: Schema Negotiation, Dictionary Batch, Record Batch, Record Batch, Record Batch
Vectors in a record batch:
• data header (describes offsets into data)
• name (bitmap)
• name (offset)
• name (data)
• age (bitmap)
• age (data)
• phones (bitmap)
• phones (list offset)
• phones (offset)
• phones (data)
Example record: { name: 'Joe', age: 18, phones: ['555-111-1111', '555-222-2222'] }
Each box (vector) is contiguous memory
The entire record batch is contiguous on wire
In-memory columnar format for speed
• Maximize CPU throughput
– Pipelining
– SIMD
– Cache locality
• Scatter/gather I/O
Results
• PySpark Integration: 53x speedup (IBM Spark work on SPARK-13534)
http://s.apache.org/arrowresult1
• Streaming Arrow Performance: 7.75GB/s data movement
http://s.apache.org/arrowresult2
• Arrow Parquet C++ Integration: 4GB/s reads
http://s.apache.org/arrowresult3
• Pandas Integration: 9.71GB/s
http://s.apache.org/arrowresult4
Arrow Releases
[Chart: changes and days per release; plotted values: 178, 195, 311, 85, 237, 131, 76, 17]
Improvements to PySpark with Arrow
How PySpark UDF works
[Diagram: the Executor sends Batched Rows to the Python Worker, which runs the UDF (scalar -> scalar) and sends Batched Rows back]
Current Issues with UDF
• Serialize / Deserialize in Python
• Scalar computation model (Python for loop)
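A rough way to picture both issues, assuming (as a simplification) that rows cross the process boundary as pickled batches: the worker must deserialize each batch into Python objects, call the UDF once per value in a Python loop, and serialize the results again. The sketch below is plain Python rather than PySpark internals, and `run_scalar_udf` is a hypothetical name.

```python
import pickle


def run_scalar_udf(udf, serialized_batch):
    """Deserialize a batch, apply a scalar UDF value by value, reserialize the results."""
    values = pickle.loads(serialized_batch)  # deserialization into boxed Python objects
    results = [udf(v) for v in values]       # one interpreted Python call per value
    return pickle.dumps(results)             # serialization of the results


batch = pickle.dumps([1.0, 2.0, 3.0])
output = pickle.loads(run_scalar_udf(lambda x: x + 1, batch))
```

For a function as cheap as `x + 1`, the work inside the UDF is small compared with the surrounding serialization and per-value call overhead.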
Profile of lambda x: x+1 (actual runtime is 2s without profiling)
[Profiler screenshot not preserved; annotations: 8 Mb/s, 91.8%]
Vectorize Row UDF
[Diagram: the Executor converts Rows -> RB and sends them to the Python Worker, which runs the UDF (pd.DataFrame -> pd.DataFrame); the results are converted RB -> Rows]
Why pandas.DataFrame
• Fast, feature-rich, widely used by Python users
• Already exists in PySpark (toPandas)
• Compatible with popular Python libraries:
– NumPy, StatsModels, SciPy, scikit-learn…
• Zero copy to/from Arrow
Scalar vs Vectorized UDF
[Profiler screenshots not preserved; annotations:]
• 20x speed up (actual runtime is 2s without profiling)
• Overhead removed
• Fewer system calls, faster I/O
• 4.5x speed up
Support Group UDF
• Split-apply-combine:
– Break a problem into smaller pieces
– Operate on each piece independently
– Put all pieces back together
• Common pattern supported in SQL, Spark, Pandas, R …
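The pattern itself is small enough to write out in plain Python. This is a minimal sketch of the three steps, independent of Spark or pandas; the names `split_apply_combine` and `subtract_group_mean` are made up for the example.

```python
from collections import defaultdict


def split_apply_combine(rows, key, apply):
    """Group rows by key(row), run apply(group) on each group, and concatenate the results."""
    groups = defaultdict(list)
    for row in rows:                   # split
        groups[key(row)].append(row)
    combined = []
    for group in groups.values():      # apply
        combined.extend(apply(group))  # combine
    return combined


def subtract_group_mean(group):
    """Example apply step: center each value on the mean of its group."""
    mean = sum(v for _, v in group) / len(group)
    return [(k, v - mean) for k, v in group]


rows = [("a", 1), ("b", 10), ("a", 3)]
result = split_apply_combine(rows, key=lambda r: r[0], apply=subtract_group_mean)
```

Because each group is processed independently, the apply step can be an arbitrary user function.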
Split-Apply-Combine (Current)
• Split: groupBy, window, …
• Apply: mean, stddev, collect_list, rank …
• Combine: Inherently done by Spark
Split-Apply-Combine (with Group UDF)
• Split: groupBy, window, …
• Apply: UDF
• Combine: Inherently done by Spark
Introduce groupBy().apply()
• UDF: pd.DataFrame -> pd.DataFrame
– Treat each group as a pandas DataFrame
– Apply UDF on each group
– Assemble as PySpark DataFrame
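The three steps above can be imitated locally with pandas alone. This sketch does not use the PySpark API under discussion; it only shows the shape of a `pd.DataFrame -> pd.DataFrame` function applied per group. The column names `month` and `v` and the helper `apply_per_group` are assumptions for illustration.

```python
import pandas as pd


def normalize(group: pd.DataFrame) -> pd.DataFrame:
    """Group UDF: receives one group as a DataFrame and returns a DataFrame."""
    v = group["v"]
    return group.assign(v=(v - v.mean()) / v.std())


def apply_per_group(df: pd.DataFrame, key: str, udf) -> pd.DataFrame:
    """Split df by key, apply udf to each group, and reassemble the pieces."""
    pieces = [udf(group) for _, group in df.groupby(key)]
    return pd.concat(pieces).sort_index()


df = pd.DataFrame({
    "month": ["2017-01", "2017-01", "2017-01", "2017-02", "2017-02"],
    "v": [1.0, 2.0, 3.0, 10.0, 20.0],
})
result = apply_per_group(df, "month", normalize)
```

The UDF body is a single vectorized expression over the whole group, with no packing or unpacking code around it.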
[Diagram: groupBy splits Rows into Groups; each group is processed as pd.DataFrame -> pd.DataFrame]
Previous Example: Data Normalization
(values - values.mean()) / values.std()
[Side-by-side code listings not preserved; labels: "Current" and "Group UDF"]
• 5x speed up
Limitations
• Requires Spark Row <-> Arrow RecordBatch conversion
– Incompatible memory layout (row vs column)
• (groupBy) No local aggregation
– Difficult due to how PySpark works. See https://issues.apache.org/jira/browse/SPARK-10915
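The cost of the row/column conversion comes from touching every value: each row has to be taken apart and its fields appended to per-column buffers, and the reverse happens on the way back. A minimal plain-Python sketch of that transposition follows, with hypothetical helper names and Python lists standing in for Arrow vectors.

```python
def rows_to_columns(rows, names):
    """Transpose row tuples into one list per column (row layout -> column layout)."""
    columns = {name: [] for name in names}
    for row in rows:
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


def columns_to_rows(columns, names):
    """Transpose per-column lists back into row tuples (column layout -> row layout)."""
    return list(zip(*(columns[name] for name in names)))


names = ["month", "v"]
rows = [("2017-01", 1.0), ("2017-01", 2.0), ("2017-02", 3.0)]
columns = rows_to_columns(rows, names)
round_trip = columns_to_rows(columns, names)
```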
Future Roadmap
What’s Next (Arrow)
• Arrow RPC/REST
• Arrow IPC
• Apache {Spark, Drill, Kudu} to Arrow Integration
– Faster UDFs, Storage interfaces
What’s Next (PySpark UDF)
• Continue working on SPARK-20396
• Support Pandas UDF with more PySpark functions:
– groupBy().agg()
– window
Get Involved
• Watch SPARK-20396
• Join the Arrow community
– dev@arrow.apache.org
– Slack:
• https://apachearrowslackin.herokuapp.com/
– http://arrow.apache.org
– Follow @ApacheArrow
Thank you
• Bryan Cutler (IBM), Wes McKinney (Two Sigma Investments) for 
helping build this feature
• Apache Arrow community
• Spark Summit organizers
• Two Sigma and Dremio for supporting this work
This document is being distributed for informational and educational purposes only and is not an offer to sell or the solicitation of an offer to buy
any securities or other instruments. The information contained herein is not intended to provide, and should not be relied upon for investment
advice. The views expressed herein are not necessarily the views of Two Sigma Investments, LP or any of its affiliates (collectively, “Two Sigma”).
Such views reflect significant assumptions and subjective of the author(s) of the document and are subject to change without notice. The
document may employ data derived from third-party sources. No representation is made as to the accuracy of such information and the use of
such information in no way implies an endorsement of the source of such information or its validity.
The copyrights and/or trademarks in some of the images, logos or other material used herein may be owned by entities other than Two Sigma. If
so, such copyrights and/or trademarks are most likely owned by the entity that created the material and are used purely for identification and
comment as fair use under international copyright and/or trademark laws. Use of such image, copyright or trademark does not imply any
association with such organization (or endorsement of such organization) by Two Sigma, nor vice versa.

More Related Content

What's hot

APIs and Linked Data: A match made in Heaven
APIs and Linked Data: A match made in HeavenAPIs and Linked Data: A match made in Heaven
APIs and Linked Data: A match made in HeavenMichael Petychakis
 
Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)
Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)
Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)Hafiz Ismail
 
API Athens Meetup - API standards 25-6-2014
API Athens Meetup - API standards   25-6-2014API Athens Meetup - API standards   25-6-2014
API Athens Meetup - API standards 25-6-2014Michael Petychakis
 
Modeling REST API's Behaviour with Text, Graphics or Both?
Modeling REST API's Behaviour with Text, Graphics or Both?Modeling REST API's Behaviour with Text, Graphics or Both?
Modeling REST API's Behaviour with Text, Graphics or Both?Ana Ivanchikj
 
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a TimeWebinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a TimeMongoDB
 
Tracking and business intelligence
Tracking and business intelligenceTracking and business intelligence
Tracking and business intelligenceSebastian Schleicher
 
Better APIs with GraphQL
Better APIs with GraphQL Better APIs with GraphQL
Better APIs with GraphQL Josh Price
 
Share point apps the good, the bad, and the pot of gold at the end of the r...
Share point apps   the good, the bad, and the pot of gold at the end of the r...Share point apps   the good, the bad, and the pot of gold at the end of the r...
Share point apps the good, the bad, and the pot of gold at the end of the r...Bill Ayers
 
SPEngage Raleigh 2017 Azure Active Directory For Office 365 Developers
SPEngage Raleigh 2017 Azure Active Directory For Office 365 DevelopersSPEngage Raleigh 2017 Azure Active Directory For Office 365 Developers
SPEngage Raleigh 2017 Azure Active Directory For Office 365 DevelopersPrashant G Bhoyar (Microsoft MVP)
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB
 
Webtrends and bright starr webinar 01282015 sharepoint is evolving
Webtrends and bright starr webinar 01282015   sharepoint is evolvingWebtrends and bright starr webinar 01282015   sharepoint is evolving
Webtrends and bright starr webinar 01282015 sharepoint is evolvingKunaal Kapoor
 
Maintainable API Docs and Other Rainbow Colored Unicorns
Maintainable API Docs and Other Rainbow Colored UnicornsMaintainable API Docs and Other Rainbow Colored Unicorns
Maintainable API Docs and Other Rainbow Colored UnicornsNeil Mansilla
 

What's hot (12)

APIs and Linked Data: A match made in Heaven
APIs and Linked Data: A match made in HeavenAPIs and Linked Data: A match made in Heaven
APIs and Linked Data: A match made in Heaven
 
Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)
Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)
Introduction to GraphQL (or How I Learned to Stop Worrying about REST APIs)
 
API Athens Meetup - API standards 25-6-2014
API Athens Meetup - API standards   25-6-2014API Athens Meetup - API standards   25-6-2014
API Athens Meetup - API standards 25-6-2014
 
Modeling REST API's Behaviour with Text, Graphics or Both?
Modeling REST API's Behaviour with Text, Graphics or Both?Modeling REST API's Behaviour with Text, Graphics or Both?
Modeling REST API's Behaviour with Text, Graphics or Both?
 
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a TimeWebinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
Webinar: Realizing Omni-Channel Retailing with MongoDB - One Step at a Time
 
Tracking and business intelligence
Tracking and business intelligenceTracking and business intelligence
Tracking and business intelligence
 
Better APIs with GraphQL
Better APIs with GraphQL Better APIs with GraphQL
Better APIs with GraphQL
 
Share point apps the good, the bad, and the pot of gold at the end of the r...
Share point apps   the good, the bad, and the pot of gold at the end of the r...Share point apps   the good, the bad, and the pot of gold at the end of the r...
Share point apps the good, the bad, and the pot of gold at the end of the r...
 
SPEngage Raleigh 2017 Azure Active Directory For Office 365 Developers
SPEngage Raleigh 2017 Azure Active Directory For Office 365 DevelopersSPEngage Raleigh 2017 Azure Active Directory For Office 365 Developers
SPEngage Raleigh 2017 Azure Active Directory For Office 365 Developers
 
MongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business InsightsMongoDB and Hadoop: Driving Business Insights
MongoDB and Hadoop: Driving Business Insights
 
Webtrends and bright starr webinar 01282015 sharepoint is evolving
Webtrends and bright starr webinar 01282015   sharepoint is evolvingWebtrends and bright starr webinar 01282015   sharepoint is evolving
Webtrends and bright starr webinar 01282015 sharepoint is evolving
 
Maintainable API Docs and Other Rainbow Colored Unicorns
Maintainable API Docs and Other Rainbow Colored UnicornsMaintainable API Docs and Other Rainbow Colored Unicorns
Maintainable API Docs and Other Rainbow Colored Unicorns
 

Similar to Improving Python and Spark Performance and Interoperability with Apache Arrow

Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Databricks
 
Improving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache ArrowImproving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache ArrowJulien Le Dem
 
Enabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data CitizenEnabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data CitizenWes McKinney
 
Efficient Data Formats for Analytics with Parquet and Arrow
Efficient Data Formats for Analytics with Parquet and ArrowEfficient Data Formats for Analytics with Parquet and Arrow
Efficient Data Formats for Analytics with Parquet and ArrowDataWorks Summit/Hadoop Summit
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...Dremio Corporation
 
The Ignite Buzz That Drives Digital Transformation Success
The Ignite Buzz That Drives Digital Transformation SuccessThe Ignite Buzz That Drives Digital Transformation Success
The Ignite Buzz That Drives Digital Transformation SuccessDocAuto
 
#ESPC18 how to migrate to the #SharePoint Framework?
#ESPC18 how to migrate to the #SharePoint Framework?#ESPC18 how to migrate to the #SharePoint Framework?
#ESPC18 how to migrate to the #SharePoint Framework?Vincent Biret
 
2019-Nov: Domain Driven Design (DDD) and when not to use it
2019-Nov: Domain Driven Design (DDD) and when not to use it2019-Nov: Domain Driven Design (DDD) and when not to use it
2019-Nov: Domain Driven Design (DDD) and when not to use itMark Windholtz
 
An Incomplete Data Tools Landscape for Hackers in 2015
An Incomplete Data Tools Landscape for Hackers in 2015An Incomplete Data Tools Landscape for Hackers in 2015
An Incomplete Data Tools Landscape for Hackers in 2015Wes McKinney
 
Convert your Full Trust Solutions to the SharePoint Framework (SPFx)
Convert your Full Trust Solutions to the SharePoint Framework (SPFx)Convert your Full Trust Solutions to the SharePoint Framework (SPFx)
Convert your Full Trust Solutions to the SharePoint Framework (SPFx)Brian Culver
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkDatabricks
 
My Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics EngineerMy Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics EngineerGoDataDriven
 
Light Speed Integrations With Anypoint Flow Designer
Light Speed Integrations With Anypoint Flow DesignerLight Speed Integrations With Anypoint Flow Designer
Light Speed Integrations With Anypoint Flow DesignerAaronLieberman5
 
Building Business Applications in Office 365 SharePoint Online Using Logic Apps
Building Business Applications in Office 365 SharePoint Online Using Logic AppsBuilding Business Applications in Office 365 SharePoint Online Using Logic Apps
Building Business Applications in Office 365 SharePoint Online Using Logic AppsPrashant G Bhoyar (Microsoft MVP)
 
DEVNET-1125 Partner Case Study - “Project Hybrid Engineer”
DEVNET-1125	Partner Case Study - “Project Hybrid Engineer”DEVNET-1125	Partner Case Study - “Project Hybrid Engineer”
DEVNET-1125 Partner Case Study - “Project Hybrid Engineer”Cisco DevNet
 

Similar to Improving Python and Spark Performance and Interoperability with Apache Arrow (20)

Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...
 
Improving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache ArrowImproving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache Arrow
 
Enabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data CitizenEnabling Python to be a Better Big Data Citizen
Enabling Python to be a Better Big Data Citizen
 
Jitesh Agrawal plone
Jitesh Agrawal ploneJitesh Agrawal plone
Jitesh Agrawal plone
 
Jitesh agrawal Resume
Jitesh agrawal ResumeJitesh agrawal Resume
Jitesh agrawal Resume
 
Efficient Data Formats for Analytics with Parquet and Arrow
Efficient Data Formats for Analytics with Parquet and ArrowEfficient Data Formats for Analytics with Parquet and Arrow
Efficient Data Formats for Analytics with Parquet and Arrow
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
 
The Ignite Buzz That Drives Digital Transformation Success
The Ignite Buzz That Drives Digital Transformation SuccessThe Ignite Buzz That Drives Digital Transformation Success
The Ignite Buzz That Drives Digital Transformation Success
 
#ESPC18 how to migrate to the #SharePoint Framework?
#ESPC18 how to migrate to the #SharePoint Framework?#ESPC18 how to migrate to the #SharePoint Framework?
#ESPC18 how to migrate to the #SharePoint Framework?
 
2019-Nov: Domain Driven Design (DDD) and when not to use it
2019-Nov: Domain Driven Design (DDD) and when not to use it2019-Nov: Domain Driven Design (DDD) and when not to use it
2019-Nov: Domain Driven Design (DDD) and when not to use it
 
An Incomplete Data Tools Landscape for Hackers in 2015
An Incomplete Data Tools Landscape for Hackers in 2015An Incomplete Data Tools Landscape for Hackers in 2015
An Incomplete Data Tools Landscape for Hackers in 2015
 
Convert your Full Trust Solutions to the SharePoint Framework (SPFx)
Convert your Full Trust Solutions to the SharePoint Framework (SPFx)Convert your Full Trust Solutions to the SharePoint Framework (SPFx)
Convert your Full Trust Solutions to the SharePoint Framework (SPFx)
 
Simplifying AI integration on Apache Spark
Simplifying AI integration on Apache SparkSimplifying AI integration on Apache Spark
Simplifying AI integration on Apache Spark
 
6yearsResume
6yearsResume6yearsResume
6yearsResume
 
My Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics EngineerMy Path From Data Engineer to Analytics Engineer
My Path From Data Engineer to Analytics Engineer
 
Deploy prometheus on kubernetes
Deploy prometheus on kubernetesDeploy prometheus on kubernetes
Deploy prometheus on kubernetes
 
Light Speed Integrations With Anypoint Flow Designer
Light Speed Integrations With Anypoint Flow DesignerLight Speed Integrations With Anypoint Flow Designer
Light Speed Integrations With Anypoint Flow Designer
 
Building Business Applications in Office 365 SharePoint Online Using Logic Apps
Building Business Applications in Office 365 SharePoint Online Using Logic AppsBuilding Business Applications in Office 365 SharePoint Online Using Logic Apps
Building Business Applications in Office 365 SharePoint Online Using Logic Apps
 
SamSegalResume
SamSegalResumeSamSegalResume
SamSegalResume
 
DEVNET-1125 Partner Case Study - “Project Hybrid Engineer”
DEVNET-1125	Partner Case Study - “Project Hybrid Engineer”DEVNET-1125	Partner Case Study - “Project Hybrid Engineer”
DEVNET-1125 Partner Case Study - “Project Hybrid Engineer”
 

More from Two Sigma

The State of Open Data on School Bullying
The State of Open Data on School BullyingThe State of Open Data on School Bullying
The State of Open Data on School BullyingTwo Sigma
 
Halite @ Google Cloud Next 2018
Halite @ Google Cloud Next 2018Halite @ Google Cloud Next 2018
Halite @ Google Cloud Next 2018Two Sigma
 
Future of Pandas - Jeff Reback
Future of Pandas - Jeff RebackFuture of Pandas - Jeff Reback
Future of Pandas - Jeff RebackTwo Sigma
 
BeakerX - Tiezheng Li
BeakerX - Tiezheng LiBeakerX - Tiezheng Li
BeakerX - Tiezheng LiTwo Sigma
 
Engineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee JooEngineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee JooTwo Sigma
 
Bringing Linux back to the Server BIOS with LinuxBoot - Trammel Hudson
Bringing Linux back to the Server BIOS with LinuxBoot - Trammel HudsonBringing Linux back to the Server BIOS with LinuxBoot - Trammel Hudson
Bringing Linux back to the Server BIOS with LinuxBoot - Trammel HudsonTwo Sigma
 
Waiter: An Open-Source Distributed Auto-Scaler
Waiter: An Open-Source Distributed Auto-ScalerWaiter: An Open-Source Distributed Auto-Scaler
Waiter: An Open-Source Distributed Auto-ScalerTwo Sigma
 
Responsive and Scalable Real-time Data Analytics for SHPE 2017 - Cecilia Ye
Responsive and Scalable Real-time Data Analytics for SHPE 2017 - Cecilia YeResponsive and Scalable Real-time Data Analytics for SHPE 2017 - Cecilia Ye
Responsive and Scalable Real-time Data Analytics for SHPE 2017 - Cecilia YeTwo Sigma
 
Archival Storage at Two Sigma - Josh Leners
Archival Storage at Two Sigma - Josh LenersArchival Storage at Two Sigma - Josh Leners
Archival Storage at Two Sigma - Josh LenersTwo Sigma
 
Smooth Storage - A distributed storage system for managing structured time se...
Smooth Storage - A distributed storage system for managing structured time se...Smooth Storage - A distributed storage system for managing structured time se...
Smooth Storage - A distributed storage system for managing structured time se...Two Sigma
 
The Language of Compression - Leif Walsh
The Language of Compression - Leif WalshThe Language of Compression - Leif Walsh
The Language of Compression - Leif WalshTwo Sigma
 
Identifying Emergent Behaviors in Complex Systems - Jane Adams
Identifying Emergent Behaviors in Complex Systems - Jane AdamsIdentifying Emergent Behaviors in Complex Systems - Jane Adams
Identifying Emergent Behaviors in Complex Systems - Jane AdamsTwo Sigma
 
Algorithmic Data Science = Theory + Practice
Algorithmic Data Science = Theory + PracticeAlgorithmic Data Science = Theory + Practice
Algorithmic Data Science = Theory + PracticeTwo Sigma
 
HUOHUA: A Distributed Time Series Analysis Framework For Spark
HUOHUA: A Distributed Time Series Analysis Framework For SparkHUOHUA: A Distributed Time Series Analysis Framework For Spark
HUOHUA: A Distributed Time Series Analysis Framework For SparkTwo Sigma
 
TRIEST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fix...
TRIEST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fix...TRIEST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fix...
TRIEST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fix...Two Sigma
 
Exploring the Urban – Rural Incarceration Divide: Drivers of Local Jail Incar...
Exploring the Urban – Rural Incarceration Divide: Drivers of Local Jail Incar...Exploring the Urban – Rural Incarceration Divide: Drivers of Local Jail Incar...
Exploring the Urban – Rural Incarceration Divide: Drivers of Local Jail Incar...Two Sigma
 
Graph Summarization with Quality Guarantees
Graph Summarization with Quality GuaranteesGraph Summarization with Quality Guarantees
Graph Summarization with Quality GuaranteesTwo Sigma
 
Rademacher Averages: Theory and Practice
Rademacher Averages: Theory and PracticeRademacher Averages: Theory and Practice
Rademacher Averages: Theory and PracticeTwo Sigma
 
Credit-Implied Volatility
Credit-Implied VolatilityCredit-Implied Volatility
Credit-Implied VolatilityTwo Sigma
 

More from Two Sigma (19)

The State of Open Data on School Bullying
The State of Open Data on School BullyingThe State of Open Data on School Bullying
The State of Open Data on School Bullying
 
Halite @ Google Cloud Next 2018
Halite @ Google Cloud Next 2018Halite @ Google Cloud Next 2018
Halite @ Google Cloud Next 2018
 
Future of Pandas - Jeff Reback
Future of Pandas - Jeff RebackFuture of Pandas - Jeff Reback
Future of Pandas - Jeff Reback
 
BeakerX - Tiezheng Li
BeakerX - Tiezheng LiBeakerX - Tiezheng Li
BeakerX - Tiezheng Li
 
Engineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee JooEngineering with Open Source - Hyonjee Joo
Engineering with Open Source - Hyonjee Joo
 
Bringing Linux back to the Server BIOS with LinuxBoot - Trammel Hudson