Scott Yara




THE SWEET SOUND OF

BIG DATA
EMC Greenplum’s Scott Yara on the Planet-Altering
Power of Data Analytics and Collaboration
by Terry Brown
EMC+
That rumbling sound? Information. It’s a dull roar of taps and keystrokes and spinning
disk drives, crowd noise, chaos, until people like EMC’s Scott Yara arrive. Then
the annoying buzz starts to sound more like music—a Big Data symphony of telling
answers to infinite questions that helps us manage a frantic, data-mad planet.

The maestro Yara is a co-founder of Greenplum, the database software company in
San Mateo, California acquired by EMC in 2010. Greenplum specializes in large-scale
data warehousing, analytics, and now, with the introduction of Greenplum Chorus,
enterprise-wide collaboration. Chorus, according to Yara, is extremely big news.

“We have found,” says Yara, now Greenplum’s Senior Vice President for Products,
“that as companies try to gain insight from their data, people and process challenges
are just as vexing as the infrastructure ones. Chorus is really the first system that
focuses on building a collaboration and social environment for doing the work of
data science. It’s really the first of its kind.

“Chorus is a breakthrough in what typically has been a pretty siloed and scattered
process. Now we can integrate all the tasks that a company does to produce an insight
from data and bring it all together in a single environment.”

In other words, for the first time the table is set to take full advantage of Big Data—the
analytics platform to work the data and the collaboration environment to speed
decisions about it. So does that mean that Big Data is ready to do what the pundits
say—change the world? Yara weighs his words carefully when asked about that—he’s
reluctant (and says so) to boil his viewpoint into sound bites. But he does concede
the presence of an inexorable trend.

“We’ve been at this for awhile,” says Yara. “These revolutions take a long time. But
today there really isn’t an organization on the planet that’s not thinking very deeply
about using data, and that just wasn’t the case three or four years ago. It’s moving.
And that’s exciting to see.”

GREENPLUM CHORUS: SOCIAL MEETS BIG DATA—IN A BIG WAY

Asked to sum up Greenplum Chorus, the Big Data collaboration software just announced
by EMC Greenplum, Scott Yara does not mince words. “I think what Java was to the
Internet,” says Yara, “Chorus will be to Big Data.”

Yara, EMC Greenplum’s Senior Vice President for Products, says that Chorus completes
the puzzle of how to analyze gigantic volumes of disparate data. To master Big Data
you need fast processing, sophisticated analytics, and what wasn’t possible until
Chorus—the ability to collaborate.

“Chorus provides an opportunity for a data scientist to see the data assets across a
company with a simple search interface. It gives them the freedom to manipulate and
analyze that data as they see fit. You’ll have your own private workspace and sandbox
where you can manipulate that data as you see fit. And then you have workflow and
collaboration tools to share that data or insight or process back to the company, in
a very agile way.”

It’s data at your fingertips, light-speed analytics for an extended, enterprise-wide
team. “We wanted to make using data inside the enterprise a lot more familiar and
friendly,” says Yara. “And so providing social collaboration interfaces using common
streams, user profiles, the opportunity to share things, hopefully that lends itself to
an organizational dynamic that is a lot more natural.”

•••

“Big Data” is just what it sounds like—huge volumes of data generated by anything
and anyone who works or plays or functions on line and in computer networks. Smart
phones, laptops, PCs, mainframes. Social networks, internet shopping, online banking,
surveillance systems, pavement sensors, call records, health care information,
and so on and so on. Of course data has always been big, in the context of existing
technology—to Ebenezer Scrooge, Big Data was a tall shelf of dusty ledger books.

The recent need to name it in capital letters stems from more than the deluge of
electronic data and the inability of conventional systems to make sense of it. It also names
the fiendishly clever new processing and analytics technologies built by people like
Yara and his Greenplum colleagues. That’s what led EMC, the leader in information
infrastructure, to Greenplum—the need to offer ways to analyze the data stored and
managed on that infrastructure.

“For the last ten years, as the web has exploded and expanded around us,” says Yara,
“the idea of answering questions about customers or buying patterns or whatever
by looking through all this tremendous variety of information was technically impos-
sible. So Greenplum developed a way to support multiple data types, using a parallel
scale-out computing model that mirrored the internet, with analytics software support
that has made some of these really hard things much easier.”

Yara grew up in Minnesota, outside Minneapolis, matriculated at UCLA, studied computer
science, and left early to join an internet startup called Sandpiper Networks,
an early player in content delivery systems. Internet performance was spotty in the
early ’90s, and Sandpiper built caching systems that sped the performance of big
websites. Sandpiper merged with the internet services company Digital Island, went
public, and was sold to the British telecom Cable & Wireless.

In 2000 Yara started a company called Metapa to capture and analyze information on
the Web. In 2002 Metapa merged with a similar startup, called Didera, whose founder
Luke Lonergan became Yara’s partner in the new venture they called Greenplum.
(Where did the name come from? As Yara and Lonergan cast about for a name, one
of their employees asked his young daughter for her advice. She suggested “Apple.”
Told that name was taken, she offered up Greenplum, which stuck. Kids bring a lot
of naming help to the Big Data world—the developer of Hadoop, the open source
software that Greenplum uses to analyze unstructured data, named his product after
his son’s toy elephant.)

“I think what Java was to the Internet, Chorus will be to Big Data.”
SCOTT YARA, EMC GREENPLUM

“It was a natural evolution for EMC as a business,” says Yara. “It’s a huge opportunity
to provide analytic capabilities to customers once they’ve stored all that Big Data on
EMC systems. Here’s the thing: Companies are starting to realize, and consumers are
too, that their most valuable asset isn’t necessarily the intellectual property they’ve
built, but rather the data that they generate as a consequence of their products, so
there is a very aggressive movement to monetize or gain value from that data.

“So what we’re seeing is an economy being built around the data that’s being gener-
ated across all industries and how to unlock the value of that data.”

•••

The first movers in the Big Data world were companies with Internet-enabled busi-
nesses—search engines, online retailers, social networking sites. Now other orga-
nizations—government, universities, offline companies—are learning the potential
of Big Data analytics, and the technology is ready to spread to every corner of the
marketplace because compute and storage costs have dropped dramatically. Now
companies can not only afford to gather and store information—they can also afford
to analyze it.

Now Google can detect regional flu outbreaks a week to ten days faster than the Centers
for Disease Control and Prevention by monitoring increased search term activity
for phrases associated with flu symptoms. Cities are analyzing traffic data in real time
and making decisions to manage congestion before it becomes a story in tomorrow’s
newspaper. Smart electric grids are helping homeowners monitor and manage their
power use. The Federal government’s USAspending.gov website tracks government
spending and charts the data based on queries by anybody who visits the site.

Big Data is woven into the physical fabric of our lives. The “Internet of Things”—the
physical assets that become part of the information infrastructure—is changing how
companies create business models and people live their lives, giving systems and
people the ability to capture, compute, communicate, and collaborate around in-
formation. Embedded with sensors, actuators, and communications capabilities,
such assets or “things” will soon be able to absorb and transmit information on a
massive scale and, in some cases, to adapt and react to changes in the environment
automatically.

So how will lives be changed? Ask Scott Yara.

“Let’s say you’re a big retail bank,” he says. “You might have 60 million customers
that use a huge number of different products—checking account customers, home
loan customers, credit card customers. Some communicate through the website.
Some complain on Twitter. As the business owner, you want to know who your top,
most loyal customers are, and what kinds of products they’re using and not using.
And what makes a great customer?”

Big Data, says Yara, lets a business owner sift through all the data available, answer
thorny questions, and know how to create a business that has more loyal customers
and keeps the bad ones away.

“Let’s say you’re a young woman who lives in a condo in the suburbs and works in an
office downtown,” Yara says. “When it’s time to go to work, you instruct your condo
when to wash the dishes, when to start a load of laundry, when to open or shut the
windows depending on the weather.”

Then you’re at the office, Yara says. Your washing machine sends you a text that says
it’s out of detergent and can’t do the load you requested. The text includes a coupon
for detergent at the store where you most often shop (based on a credit card
spending pattern algorithm). It beeps you when you’re near the store (based on
geolocation data and smart car sensors) so you don’t forget. Your refrigerator texts to
say you’re low on lettuce so if you plan to have a salad with dinner tonight (based
on the menu you programmed into the fridge) you should pick up the veggie when you
get the laundry detergent. Maybe there is a two-for-one coupon included for
your favorite salad dressing.

“The adoption of Big Data technology in the enterprise will be twice as fast and
twice as big as the virtualization cloud computing market.”
SCOTT YARA, EMC GREENPLUM

Or maybe you’re planning a trip or wondering about your bank balance or thinking
about phone services. When you call up the airline or your bank or your telephone
company you won’t be irritated to learn they don’t have your latest purchase history or
account details available. Those simple things will start to become much more com-
monplace and over time the services themselves will seem more personalized to you.

“That’s a big part of it,” says Yara, “but Big Data also represents the services and
business that we provide getting safer and more trustworthy because they can use
this information to better trap the bad guys. So the end result of Big Data is that
hopefully things start to naturally work more efficiently, more securely, and more
personally in a way that feels very natural.”

•••

Of course for some that metaphorical drone of information background noise feels a
bit creepy—the sense of systems behind every wall, in every purse or pants pocket,
on every car dashboard, overhead, in the ground, continuously gathering and forwarding
and analyzing information on everything we do. Yara is aware, but not wary.

“I think with any new technology,” he says, “there will always be concerns over privacy
or security or fraud. People had the same fears about the internet when it first
appeared—the idea of putting your credit card number online was pretty scary to a lot of
people back in the ’90s. And those fears are understandable. A number of companies are
building technologies to help make sure that data analysis has some level of encryption
and access control and security. We will see a need for more awareness around
the ethics or protection of individual rights, and there are already firms tackling these
tough issues—the Electronic Frontier Foundation, Creative Commons, and others.”

“Big Data is about being able to take all this information, from an incredible variety
of sources, and answer questions we couldn’t answer before.”
SCOTT YARA, EMC GREENPLUM

The fact is that the Big Data revolution is here to stay. “It’s only going to get bigger,”
Yara wrote in the Huffington Post last year. “There’s no turning back the tide, no going
back to an era when we knew less.”

How big will Big Data be? “We expect Data Science and data analytics to be perva-
sive,” says Yara, “with far broader reach and impact even than previous-generation
computational science. Big-data computing is perhaps the biggest innovation in
computing in the last decade. We have only begun to see its potential to collect,
organize, and process data in all walks of life.

“My simple guess is that the adoption of Big Data technology in the enterprise will
be twice as fast and twice as big as the virtualization cloud computing market has
been and that’s because while cloud computing is about the bottom line, with more
efficient and optimized infrastructure, Big Data is really about the top line, because the
information itself helps you generate more revenues. It helps you get more profitable
and so I think that the enthusiasm we see growing around Big Data is accelerating
at a pace that is faster than cloud computing itself.”

For John and Jane Doe, Yara says, Big Data works because it can capture who we are
as individuals.

“I think that in the best ways,” says Yara, “Big Data is not something new. It’s an
amplifier for existing human behavior and so when you are looking for things that
you like, whether it’s an individual or music or places to eat or someone to manage
your retirement savings, you have a set of personal preferences that the technology
knows and protects. The serendipity comes when the options available are much more
closely correlated with the things that you already like—it’s an extension of yourself.”

That sounds pretty good.

Mais conteúdo relacionado

Destaque

The Industrial Internet@Work
The Industrial Internet@WorkThe Industrial Internet@Work
The Industrial Internet@WorkEMC
 
Tues palace of versailles
Tues palace of versaillesTues palace of versailles
Tues palace of versaillesTravis Klein
 
Mon wars of religion
Mon wars of religionMon wars of religion
Mon wars of religionTravis Klein
 
RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...
RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...
RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...EMC
 
Federmanager Presentazione Vincenzo Balzani 12 aprile
Federmanager Presentazione Vincenzo Balzani 12 aprileFedermanager Presentazione Vincenzo Balzani 12 aprile
Federmanager Presentazione Vincenzo Balzani 12 aprileMarco Frullanti
 
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...EMC
 
Fri lenin and trotsky
Fri lenin and trotskyFri lenin and trotsky
Fri lenin and trotskyTravis Klein
 
Seize ICT enabledTransformation
Seize ICT enabledTransformationSeize ICT enabledTransformation
Seize ICT enabledTransformationRene Summer
 

Destaque (11)

Printing press
Printing pressPrinting press
Printing press
 
The Industrial Internet@Work
The Industrial Internet@WorkThe Industrial Internet@Work
The Industrial Internet@Work
 
Tues palace of versailles
Tues palace of versaillesTues palace of versailles
Tues palace of versailles
 
Mon wars of religion
Mon wars of religionMon wars of religion
Mon wars of religion
 
RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...
RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...
RSA Laboratories' Frequently Asked Questions About Today's Cryptography, Vers...
 
Federmanager Presentazione Vincenzo Balzani 12 aprile
Federmanager Presentazione Vincenzo Balzani 12 aprileFedermanager Presentazione Vincenzo Balzani 12 aprile
Federmanager Presentazione Vincenzo Balzani 12 aprile
 
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
White Paper: EMC Compute-as-a-Service — EMC Ionix IT Orchestrator, VCE Vblock...
 
Fri lenin and trotsky
Fri lenin and trotskyFri lenin and trotsky
Fri lenin and trotsky
 
Korea1a
Korea1aKorea1a
Korea1a
 
Finland
FinlandFinland
Finland
 
Seize ICT enabledTransformation
Seize ICT enabledTransformationSeize ICT enabledTransformation
Seize ICT enabledTransformation
 

Mais de EMC

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDEMC
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote EMC
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOEMC
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremioEMC
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lakeEMC
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereEMC
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History EMC
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewEMC
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeEMC
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic EMC
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityEMC
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeEMC
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015EMC
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesEMC
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsEMC
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookEMC
 

Mais de EMC (20)

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremio
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis Openstack
 
Modern infrastructure for business data lake
Modern infrastructure for business data lakeModern infrastructure for business data lake
Modern infrastructure for business data lake
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop Elsewhere
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical Review
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or Foe
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for Security
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure Age
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education Services
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere Environments
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBook
 

Último

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Scott Yara: The Sweet Sound of Big Data

  • 1. Scott Yara: THE SWEET SOUND OF BIG DATA. EMC Greenplum’s Scott Yara on the Planet-Altering Power of Data Analytics and Collaboration, by Terry Brown. EMC+
  • 2. That rumbling sound? Information. It’s a dull roar of taps and keystrokes and spinning disk drives, crowd noise, chaos, until people like EMC’s Scott Yara arrive. Then the annoying buzz starts to sound more like music: a Big Data symphony of telling answers to infinite questions that helps us manage a frantic, data-mad planet.

    The maestro Yara is a co-founder of Greenplum, the database software company in San Mateo, California acquired by EMC in 2010. Greenplum specializes in large-scale data warehousing, analytics, and now, with the introduction of Greenplum Chorus, enterprise-wide collaboration. Chorus, according to Yara, is extremely big news.

    “We have found,” says Yara, now Greenplum’s Senior Vice President for Products, “that as companies try to gain insight from their data, people and process challenges are just as vexing as the infrastructure ones. Chorus is really the first system that focuses on building a collaboration and social environment for doing the work of data science. It’s really the first of its kind.

    “Chorus is a breakthrough in what typically has been a pretty siloed and scattered process. Now we can integrate all the tasks that a company does to produce an insight from data and bring it all together in a single environment.”

    In other words, for the first time the table is set to take full advantage of Big Data: the analytics platform to work the data and the collaboration environment to speed decisions about it. So does that mean that Big Data is ready to do what the pundits say and change the world? Yara weighs his words carefully when asked about that; he’s reluctant (and says so) to boil his viewpoint into sound bites. But he does concede the presence of an inexorable trend.

    “We’ve been at this for a while,” says Yara. “These revolutions take a long time. But today there really isn’t an organization on the planet that’s not thinking very deeply about using data, and that just wasn’t the case three or four years ago. It’s moving. And that’s exciting to see.”

    •••

    “Big Data” is just what it sounds like: huge volumes of data generated by anything and anyone who works or plays or functions online and in computer networks. Smart phones, laptops, PCs, mainframes. Social networks, internet shopping, online banking, surveillance systems, pavement sensors, call records, health care information, and so on and so on. Of course data has always been big, in the context of existing technology; to Ebenezer Scrooge, Big Data was a tall shelf of dusty ledger books.

    The recent need to name it in capital letters stems from more than the deluge of electronic data and the inability of conventional systems to make sense of it. It also names the fiendishly clever new processing and analytics technologies built by people like Yara and his Greenplum colleagues. That’s what led EMC, the leader in information infrastructure, to Greenplum: the need to offer ways to analyze the data stored and managed on that infrastructure.

    GREENPLUM CHORUS: SOCIAL MEETS BIG DATA, IN A BIG WAY

    Asked to sum up Greenplum Chorus, the Big Data collaboration software just announced by EMC Greenplum, Scott Yara does not mince words. “I think what Java was to the Internet,” says Yara, “Chorus will be to Big Data.”

    Yara, EMC Greenplum’s Senior Vice President for Products, says that Chorus completes the puzzle of how to analyze gigantic volumes of disparate data. To master Big Data you need fast processing, sophisticated analytics, and what wasn’t possible until Chorus: the ability to collaborate.

    “Chorus provides an opportunity for a data scientist to see the data assets across a company with a simple search interface. It gives them the freedom to manipulate and analyze that data as they see fit. You’ll have your own private workspace and sandbox where you can manipulate that data as you see fit. And then you have workflow and collaboration tools to share that data or insight or process back to the company, in a very agile way.”

    It’s data at your fingertips, light-speed analytics for an extended, enterprise-wide team. “We wanted to make using data inside the enterprise a lot more familiar and friendly,” says Yara. “And so providing social collaboration interfaces using common streams, user profiles, the opportunity to share things, hopefully that lends itself to an organizational dynamic that is a lot more natural.”
  • 3. “For the last ten years, as the web has exploded and expanded around us,” says Yara, “the idea of answering questions about customers or buying patterns or whatever by looking through all this tremendous variety of information was technically impossible. So Greenplum developed a way to support multiple data types, using a parallel scale-out computing model that mirrored the internet, with analytics software support that has made some of these really hard things much easier.”

    Yara grew up in Minnesota, outside Minneapolis, matriculated at UCLA, studied computer science, and left early to join an internet startup called Sandpiper Networks, an early player in content delivery systems. Internet performance was spotty in the early ’90s, and Sandpiper built caching systems that sped the performance of big websites. Sandpiper merged with the internet services company Digital Island, went public, and was sold to the British telecom Cable & Wireless.

    “I think what Java was to the Internet, Chorus will be to Big Data.” SCOTT YARA, EMC GREENPLUM

    In 2000 Yara started a company called Metapa to capture and analyze information on the Web. In 2002 Metapa merged with a similar startup, called Didera, whose founder Luke Lonergan became Yara’s partner in the new venture they called Greenplum.

    (Where did the name come from? As Yara and Lonergan cast about for a name, one of their employees asked his young daughter for her advice. She suggested “Apple.” Told that name was taken, she offered up Greenplum, which stuck. Kids bring a lot of naming help to the Big Data world: the developer of Hadoop, the open source software that Greenplum uses to analyze unstructured data, named his product after his son’s toy elephant.)

    “It was a natural evolution for EMC as a business,” says Yara. “It’s a huge opportunity to provide analytic capabilities to customers once they’ve stored all that Big Data on EMC systems. Here’s the thing: Companies are starting to realize, and consumers are too, that their most valuable asset isn’t necessarily the intellectual property they’ve built, but rather the data that they generate as a consequence of their products, so there is a very aggressive movement to monetize or gain value from that data.

    “So what we’re seeing is an economy being built around the data that’s being generated across all industries and how to unlock the value of that data.”

    •••

    The first movers in the Big Data world were companies with Internet-enabled businesses: search engines, online retailers, social networking sites. Now other organizations (government, universities, offline companies) are learning the potential of Big Data analytics, and the technology is ready to spread to every corner of the marketplace because compute and storage costs have dropped dramatically. Now companies can not only afford to gather and store information; they can also afford to analyze it.

    Now Google can detect regional flu outbreaks a week to ten days faster than the Centers for Disease Control and Prevention by monitoring increased search term activity for phrases associated with flu symptoms. Cities are analyzing traffic data in real time and making decisions to manage congestion before it becomes a story in tomorrow’s newspaper. Smart electric grids are helping homeowners monitor and manage their
  • 4. power use. The Federal government’s USAspending.gov website tracks government spending and charts the data based on queries by anybody who visits the site.

    Big Data is woven into the physical fabric of our lives. The “Internet of Things” (the physical assets that become part of the information infrastructure) is changing how companies create business models and people live their lives, giving systems and people the ability to capture, compute, communicate, and collaborate around information. Embedded with sensors, actuators, and communications capabilities, such assets or “things” will soon be able to absorb and transmit information on a massive scale and, in some cases, to adapt and react to changes in the environment automatically.

    So how will lives be changed? Ask Scott Yara.

    “Let’s say you’re a big retail bank,” he says. “You might have 60 million customers that use a huge number of different products: checking account customers, home loan customers, credit card customers. Some communicate through the website. Some complain on Twitter. As the business owner, you want to know who your top, most loyal customers are, and what kinds of products they’re using and not using. And what makes a great customer?”

    Big Data, says Yara, lets a business owner sift through all the data available, answer thorny questions, and know how to create a business that has more loyal customers and keeps the bad ones away.

    “Let’s say you’re a young woman who lives in a condo in the suburbs and works in an office downtown,” Yara says. “When it’s time to go to work, you instruct your condo when to wash the dishes, when to start a load of laundry, when to open or shut the windows depending on the weather.”

    “The adoption of Big Data technology in the enterprise will be twice as fast and twice as big as the virtualization cloud computing market.” SCOTT YARA, EMC GREENPLUM

    Then you’re at the office, Yara says. Your washing machine sends you a text that says it’s out of detergent and can’t do the load you requested. The text includes a coupon for detergent at the store where you most often shop (based on a credit card spending pattern algorithm). It beeps you when you’re near the store (based on geolocation data and smart car sensors) so you don’t forget. Your refrigerator texts to say you’re low on lettuce, so if you plan to have a salad with dinner tonight (based on the menu you programmed into the fridge) you should pick up the veggie when you get the laundry detergent. Maybe there is a two-for-one coupon included for your favorite salad dressing.

    Or maybe you’re planning a trip or wondering about your bank balance or thinking about phone services. When you call up the airline, your bank, or your telephone company, you won’t be irritated to learn they don’t have your latest purchase history or account details available. Those simple things will start to become much more commonplace and over time the services themselves will seem more personalized to you.

    “That’s a big part of it,” says Yara, “but Big Data also represents the services and business that we provide getting safer and more trustworthy because they can use this information to better trap the bad guys. So the end result of Big Data is that
  • 5. hopefully things start to naturally work more efficiently, more securely, and more personally in a way that feels very natural.”

    •••

    Of course for some that metaphorical drone of information background noise feels a bit creepy: the sense of systems behind every wall, in every purse or pants pocket, on every car dashboard, overhead, in the ground, continuously gathering and forwarding and analyzing information on everything we do. Yara is aware, but not wary.

    “I think with any new technology,” he says, “there will always be concerns over privacy or security or fraud. People had the same fears about the internet when it first appeared; the idea of putting your credit card number online was pretty scary to a lot of people back in the ’90s. And those fears are understandable. A number of companies are building technologies to help make sure that data analysis has some level of encryption and access control and security. We will see a need for more awareness around the ethics or protection of individual rights, and there are already firms tackling these tough issues: the Electronic Frontier Foundation, Creative Commons, and others.”

    “Big Data is about being able to take all this information, from an incredible variety of sources, and answer questions we couldn’t answer before.” SCOTT YARA, EMC GREENPLUM

    The fact is that the Big Data revolution is here to stay. “It’s only going to get bigger,” Yara wrote in the Huffington Post last year. “There’s no turning back the tide, no going back to an era when we knew less.”

    How big will Big Data be? “We expect Data Science and data analytics to be pervasive,” says Yara, “with far broader reach and impact even than previous-generation computational science. Big-data computing is perhaps the biggest innovation in computing in the last decade. We have only begun to see its potential to collect, organize, and process data in all walks of life.

    “My simple guess is that the adoption of Big Data technology in the enterprise will be twice as fast and twice as big as the virtualization cloud computing market has been, and that’s because while cloud computing is about the bottom line, with more efficient and optimized infrastructure, Big Data is really about the top line, because the information itself helps you generate more revenues. It helps you get more profitable, and so I think that the enthusiasm we see growing around Big Data is accelerating at a pace that is faster than cloud computing itself.”

    For John and Jane Doe, Yara says, Big Data works because it can capture who we are as individuals.

    “I think that in the best ways,” says Yara, “Big Data is not something new. It’s an amplifier for existing human behavior, and so when you are looking for things that you like, whether it’s an individual or music or places to eat or someone to manage your retirement savings, you have a set of personal preferences that the technology knows and protects. The serendipity comes when the options available are much more closely correlated with the things that you already like; it’s an extension of yourself.”

    That sounds pretty good.