In this session, SnapLogic's Maneesh Joshi will share his perspective on why the data integration market is ripe for disruption and what that means to data scientists and data integration professionals. The session covers critical topics for companies that desperately need big insights to remain competitive, but also require help as they struggle to digest massive amounts of data from a variety of sources. Because analytics algorithms are only as good as the comprehensiveness of the data they process, keeping data current and relevant is crucial before the battle for data insights even begins.
This session covers:
==> SnapLogic’s vision for the future of integration, and integration’s role in empowering companies to be more agile and competitive
==> Pragmatic techniques for integrating across today’s increasingly disparate data varieties (unstructured vs. flat files), growing volumes of information (Hadoop clusters vs. data warehouses), and increasing velocities of data (real time vs. batch)
==> Tips for integrating Hadoop data with other data sources, including leading Business Intelligence (BI) apps, for better information flow and decision-making
==> Best practices for dramatically lowering integration costs and improving time to value
To learn more, visit: http://www.snaplogic.com/.
About Maneesh:
Maneesh Joshi has over 15 years of experience in the enterprise software space, primarily in application and data integration. In his current role as Senior Director of Product Marketing at leading enterprise cloud integration company SnapLogic, he is responsible for its global go-to market strategy and product marketing. Prior to this position, Maneesh was the head of platform product marketing at Informatica. He started his career as a key member of the team that built Oracle’s Service Oriented Architecture and Business Process Management businesses. Before running product marketing for this group, he managed product planning, architecture, and engineering operations for Oracle’s integration products. Maneesh holds a B.S. in Engineering from the Indian Institute of Technology, Kharagpur, where he graduated with honors. He also received an M.S. in Engineering from the University of California, Davis, and an M.B.A. from The Wharton School at the University of Pennsylvania.
5. Data Containerization with Snaps
BUY
• SnapStore
• Certified and supported by SnapLogic
BUILD
• SDK + API
• Java, Python
• Customer, Partner or SnapLogic
6. Big Data as a Service Architecture
[Diagram: Structured Data, Unstructured Data, DB → (1) Collect → (2) Translate & Enrich → (3) Distribute → Data View, DB, Amazon Redshift]
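The Collect, Translate & Enrich, and Distribute stages named on this slide can be illustrated with a minimal Python sketch. It is not SnapLogic code and does not use the Snap SDK: every function name, the record envelope, and the use of in-memory lists as stand-ins for targets such as a database or Amazon Redshift are assumptions made purely for illustration.

```python
from typing import Iterable, Iterator, List


def collect(structured_rows: Iterable[dict],
            unstructured_lines: Iterable[str]) -> Iterator[dict]:
    """Stage 1 (Collect): wrap records from both source types in one envelope."""
    for row in structured_rows:
        yield {"kind": "structured", "payload": row}
    for line in unstructured_lines:
        yield {"kind": "unstructured", "payload": line}


def translate_and_enrich(records: Iterable[dict]) -> Iterator[dict]:
    """Stage 2 (Translate & Enrich): normalize to dicts and tag the origin."""
    for rec in records:
        if rec["kind"] == "structured":
            doc = dict(rec["payload"])               # copy the row as-is
        else:
            doc = {"text": rec["payload"].strip()}   # wrap free text
        doc["source_kind"] = rec["kind"]             # enrichment: provenance
        yield doc


def distribute(docs: Iterable[dict], sinks: List[list]) -> int:
    """Stage 3 (Distribute): fan every document out to every sink."""
    count = 0
    for doc in docs:
        for sink in sinks:
            sink.append(doc)
        count += 1
    return count


def run_pipeline(structured_rows, unstructured_lines, sinks) -> int:
    """Chain the three stages; returns the number of documents delivered."""
    collected = collect(structured_rows, unstructured_lines)
    return distribute(translate_and_enrich(collected), sinks)
```

Calling `run_pipeline(rows, lines, [warehouse, view])` with two empty lists as sinks fills both with the same normalized documents. Generators keep the stages lazy, so records flow through one at a time rather than being materialized between stages.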
8. Agile Cloud ETL for CouchDB on AWS
BACKGROUND
• BBY Open vision to encourage a vibrant reseller and developer community.
• Data propagation from 15-20 major backend systems, accrued over 12-15 years
• Backend systems are continually changing (30+ per month), so need to move away from hand coding
• Million+ SKUs: Product information, Warranty Plans with 100+ Attributes and Pricing (with 16 Localization Scenarios)
• External traffic is expanding by million hits per month
APPROACH
• Store information on CouchDB in AWS as semi-structured JSON documents with 100+ attributes
• SnapLogic, TIBCO and Java programs to be used in “ingestion” layer to maintain data in CouchDB
• Use AWS Elastic Search Layer on top of CouchDB to provide querying
BENEFITS TO THE CLIENT
• 6x improvement in development time compared to TIBCO and hand-coding
• Intuitive graphic designer allows for agile changes in response to requirements
• Seamless integration between cloud applications and on-premise legacy applications with conversion between structured and semi-structured data
• Building Snaps in SnapLogic was determined to be the fastest way to connect to a new system
• Fully automated one-touch deployment allows for elastic scaling of SnapLogic cluster
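The CouchDB case study stores product information as semi-structured JSON documents and mentions conversion between structured and semi-structured data. The following is a minimal sketch of what such a conversion could look like, not the client's actual implementation. The column names, the dotted-column convention (for example `price.en_US` for a localized price), and the helper name `row_to_document` are all hypothetical. The only CouchDB-specific detail relied on is that documents carry their identifier in an `_id` field.

```python
def row_to_document(row: dict, id_field: str = "sku") -> dict:
    """Turn one flat, structured row into a nested JSON-style document.

    Assumptions (illustrative only): dotted column names such as
    'price.en_US' are nested under their prefix, NULL (None) columns are
    dropped, and the SKU becomes the CouchDB document '_id'.
    A column that is both a scalar and a prefix is not handled.
    """
    doc = {"_id": str(row[id_field])}
    for column, value in row.items():
        if column == id_field or value is None:
            continue  # skip the key column and empty attributes
        *parents, leaf = column.split(".")
        target = doc
        for part in parents:
            # create intermediate objects on demand
            target = target.setdefault(part, {})
        target[leaf] = value
    return doc
```

For a row such as `{"sku": 1234, "name": "HDMI cable", "price.en_US": 9.99, "price.fr_CA": 12.49}`, the localized prices end up grouped under a single `price` object. This keeps sparse attribute sets compact because absent attributes are simply omitted from the document.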
9. Summary
• Algorithm Output = f(Your Data)
• Without comprehensive data inputs, the battle of Big Data is lost before it even begins
• SnapLogic speeds up access of structured and unstructured data in the cloud, and on-premise
Platform not tooling
A la carte – don’t like that word
1990s
• Valuable data was being generated but was really living in siloed environments. The term MDM was not even coined till 2003.
• As long as you could connect different systems together via a nightly, or sometimes even a weekly, feed, that was pretty darn awesome!
• Technologies like ESBs, EAIs, ETLs… flourished.
• Data was mostly structured, sitting in RDBMS systems.
2000s
• Network speeds increased
• Costs went down
• Players like Salesforce and NetSuite started getting traction from the SMB market
• Immense value on cost and agility
• Flexibility to subscribe vs. perpetual licenses
2005: Consumer / Social data
• FB, Twitter, LinkedIn, amazon.com consumer reviews…
• Humans generating massive amounts of preference data, likes and dislikes
• Data was different: non-relational, unstructured. Real-time data
• Huge volumes: petabytes
• Providing immense value to the business on their customers
2010: Machine
• RFID tags, various other sensors, weblogs
• ArcSight got bought out for $1.5B by HP
• Massive amounts of data: exabytes
• Splunk had a successful IPO last month
SnapLogic
• These 4 sources create an impedance mismatch!
• Good luck doing all of this with an ESB
• Structured vs. unstructured
• Streaming vs. batch
• Petabytes and exabytes vs. gigabytes
• Pull vs. push
• Hub and spoke
• Unprecedented opportunity & desire to use data
• Data silos (data fragmentation) unavoidable
• Legacy apps, cloud apps, and Hadoop are driving this
• Different locations, protocols, formats, and architectures
• Data is more distributed & less accessible (less useful)
• Compounding due to volume & variety of apps & data
• ESB is just another connection
• Enterprises must share data between their apps
• Collect, combine, process data into valuable information
• Competitive advantage will become necessity for survival
• SnapLogic = data sharing platform
Apple-like model – we offer an API and about 200 Snaps
Build or buy
Easy to build with Java or Python – an intern out of school built Snaps in 4 days
Build or buy – containerization of access
Abstraction of the end point – so you do not need to know everything
3rd day on the job
2 dozen customers in a variety of industries
IT, because that’s great for start-ups – IT companies are early adopters and risk takers.
Media (web architecture), Retail (facing disruption from Amazon; looking to increase their competitive advantage using software solutions and better access to information)
We are being managed by the security group at GE – GE Security: Apptio performed an independent security assessment on us before going live. RBC, AMEX took that further and performed even more rigorous data plane and control plane security analysis. (Single channel, port usage.)