Hadoop and Vertica at Snagajob: How Big Data 
Technologies Drive Business Results 
Transcript of a BriefingsDirect podcast on how an employment search company is using data 
analysis to bring better matching for job seekers and employers. 
Listen to the podcast. Find it on iTunes. Sponsor: HP 
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm 
Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this 
ongoing sponsored discussion on IT innovation and how it’s making an impact 
on people’s lives. 
Once again, we're focusing on how companies are adapting to the new style of 
IT to improve IT performance and deliver better user experiences, as well as 
better business results. 
This time, we're coming to you directly from the HP Big Data 2014 Conference in Boston. We're 
here the week of August 11 to learn directly from IT and business leaders alike how big data, 
cloud, and converged infrastructure implementations are supporting their goals. 
Our next innovation case study interview highlights how Snagajob in Richmond, Virginia, one of 
the largest hourly employment networks for job seekers and employers, is using big data to 
improve their performance, as well as to better understand how their systems 
provide services to their end users in a very fast-paced environment. 
Snagajob recently delivered almost half a million new jobs in a 
single month to their systems. So the scale here is very impressive. To learn how 
they're managing that, we are here with Robert Fehrmann, the Data Architect at 
Snagajob in Richmond, Virginia. Welcome to the show. 
Robert Fehrmann: Dana, thank you for the introduction. 
Gardner: First, tell us about your organization. How are hourly workers different from regular 
employment? What type of employment are we talking about? You’ve been around since 2000 
and you’ve been doing this successfully. Let's understand the role you play in the employment 
market. 
Fehrmann: Snagajob, as you mentioned, is America's largest hourly network for employees and 
employers. The hourly market means we have, relatively speaking, high turnover. 
Another aspect, in comparison to some of our competitors, is that we provide an inexpensive 
service. So our subscriptions are on the low end, compared to our competitors.
Gardner: Tell us how you've used big data to improve your operations. I believe that among the 
first ways that you’ve done that is to try to better analyze your performance metrics. What were 
you facing as a problem when it came to performance metrics? 
Signs of stress 
Fehrmann: A couple of years ago, we started looking at our environment, and it became 
obvious that our traditional technology was showing some signs of stress. As you mentioned, we 
really have data at scale here. We have 20,000 to 25,000 postings per day, and 
we have about 700,000 unique visitors on a daily basis. So data is coming in 
very, very quickly. 
We also realized that we're sitting on a gold mine and we were able to ingest 
data pretty well. But we had problems getting information and innovation out of 
our big data lake. 
Gardner: And of course, real time is important. You want to catch degradation 
in any fashion from your systems right away. How do you then go about 
getting this in real time? How do you do the analysis? 
Fehrmann: We started using Hadoop. I'll use a lot of technical terms here. From our website, 
we're getting events. Events are routed via Flume directly into Hadoop. We're collecting about 
600 million key-value pairs on a daily basis. It's a massive amount of data, 25 gigabytes on a 
daily basis. 
The second piece in this journey to big data was analyzing these events, and that’s where we're 
using Vertica. Our original use case was to analyze a funnel. A funnel is where people 
come to our site. They're searching for jobs, maybe by keyword, maybe by zip code. A subset of 
that is an interest in a job, and they click on a posting. A subset of that is applying for the job via 
an application. A subset is interest in an employer, and so on. We had never been able to analyze 
this funnel. 
The dataset is about 300 to 400 million rows and 30 to 40 gigabytes. We wanted to make this 
data available, not just to our internal users, but to all external users. Therefore, we set ourselves a 
goal of a five-second response time. No query on this dataset should run for more than five 
seconds, and Vertica and Hadoop gave us a solution for this. 
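The funnel described above can be illustrated with a minimal Python sketch. It assumes a simplified event stream of (visitor, stage) pairs and three hypothetical stage names; the transcript does not describe the real event schema or the actual Vertica queries, so every name here is illustrative only.

```python
from collections import defaultdict

# Hypothetical funnel stages, in order. These names are assumptions,
# not the event names used in the system discussed in the interview.
STAGES = ["search", "posting_click", "application"]


def funnel_counts(events):
    """Count distinct visitors who reached each stage and all stages before it."""
    seen = defaultdict(set)  # visitor_id -> set of stages observed
    for visitor_id, stage in events:
        seen[visitor_id].add(stage)

    counts = []
    for i, stage in enumerate(STAGES):
        required = set(STAGES[: i + 1])
        reached = sum(1 for stages in seen.values() if required <= stages)
        counts.append((stage, reached))
    return counts


sample_events = [
    ("v1", "search"), ("v1", "posting_click"), ("v1", "application"),
    ("v2", "search"), ("v2", "posting_click"),
    ("v3", "search"),
    ("v4", "posting_click"),  # never searched, so not counted at any stage
]

for stage, n in funnel_counts(sample_events):
    print(stage, n)
```

Each stage counts only visitors who also passed through every earlier stage, which is what makes each step a subset of the one before it. At the data volumes mentioned in the interview, this aggregation would run inside the database rather than in application code.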
Gardner: Any metrics of success? How have you been able to increase your performance and reach 
your key performance indicators (KPIs) and service-level agreements (SLAs)? How has this 
benefited you?
Fehrmann: Another application that we were able to implement is a recommendation engine. A 
recommendation engine addresses the case where jobseekers who apply for a specific job may not 
know about all the other jobs that are very similar to this job or that other people have applied to. 
We started analyzing the search results that we were getting and implemented a recommendation 
engine. Sometimes it’s very difficult to have a real comparison between before and after. Here, we 
were able to see that we got an 11 percent increase in application flow. Application flow is how 
many applications a customer is getting from us. By implementing this recommendation engine, 
we saw an immediate 11 percent increase in application flow, one of our key metrics. 
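One common way to surface "jobs that other people have applied to" is to count co-applications: jobs that the same jobseekers applied to together. The Python sketch below shows that general technique under assumed inputs (pairs of seeker and job identifiers). It is not a description of the engine discussed in the interview, whose design the transcript does not specify.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_applications(applications):
    """Map each job to a Counter of other jobs applied to by the same seekers."""
    jobs_by_seeker = defaultdict(set)
    for seeker_id, job_id in applications:
        jobs_by_seeker[seeker_id].add(job_id)

    co_counts = defaultdict(Counter)
    for jobs in jobs_by_seeker.values():
        # Every ordered pair of distinct jobs from one seeker counts once.
        for job_a, job_b in permutations(sorted(jobs), 2):
            co_counts[job_a][job_b] += 1
    return co_counts


def recommend(co_counts, job_id, k=3):
    """Return up to k jobs most often co-applied with job_id."""
    candidates = co_counts.get(job_id, Counter())
    # Sort by descending count, then by job id so ties are deterministic.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [job for job, _ in ranked[:k]]


# Hypothetical sample data: (seeker_id, job_id) pairs.
applications = [
    ("s1", "cashier"), ("s1", "barista"),
    ("s2", "cashier"), ("s2", "barista"), ("s2", "stocker"),
    ("s3", "cashier"), ("s3", "stocker"), ("s3", "driver"),
]

co_counts = build_co_applications(applications)
print(recommend(co_counts, "cashier", k=2))
```

A production system would typically also weigh job similarity, location, and recency; this sketch covers only the co-application signal.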
Gardner: So you took the success from your big-data implementation and analysis capabilities 
from this performance task to some other areas. Are there other business areas, search yield, for 
example, where you can apply this to get other benefits? 
Brand-new applications 
Fehrmann: When we started, we had the idea that we were looking for a solution for migrating 
our existing environment to a better-performing new environment. But what we've seen is that 
most of the applications we've developed so far are brand-new applications that we hadn't been 
able to do before. 
You mentioned search yield. Search yield is a very interesting aspect. It’s a massive dataset. It's 
about 2.5 billion rows and about 100 gigabytes of data as of right now and it's continuously 
increasing. So for all of the applications, as well as all of the search requests that we have 
collected since we have started this environment, we're able to analyze the search yield. 
For example, that's how many applications we get for a specific search keyword in real time. By 
real time, I mean that somebody can run a query against this massive dataset and get a result in a 
couple of seconds. We can analyze specific jobs in specific areas, specific keywords that are 
searched in a specific time period or in a specific location of the country. 
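As a rough illustration of search yield, the Python sketch below computes applications per search for each keyword, optionally filtered by region. The record layout (search id, keyword, region) and the ratio definition are assumptions made for this example; the interview only says that search yield relates applications to a specific search keyword.

```python
from collections import Counter


def search_yield(searches, application_search_ids, region=None):
    """Return {keyword: applications per search}, optionally for one region.

    searches: iterable of (search_id, keyword, region) tuples.
    application_search_ids: iterable of search ids, one per resulting application.
    """
    keyword_of = {}
    search_counts = Counter()
    for search_id, keyword, search_region in searches:
        if region is not None and search_region != region:
            continue
        keyword_of[search_id] = keyword
        search_counts[keyword] += 1

    # Applications tied to searches outside the filter are ignored.
    app_counts = Counter(
        keyword_of[sid] for sid in application_search_ids if sid in keyword_of
    )
    return {kw: app_counts[kw] / n for kw, n in search_counts.items()}


# Hypothetical sample data.
searches = [
    (1, "cashier", "VA"),
    (2, "cashier", "VA"),
    (3, "driver", "VA"),
    (4, "cashier", "NC"),
]
application_search_ids = [1, 1, 2, 3, 3, 3, 4]

print(search_yield(searches, application_search_ids, region="VA"))
```

Filtering by time period would follow the same pattern as the region filter, given a timestamp field on each search record.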
Gardner: And once again, now that you've been able to do something you couldn't do before, 
what have been the results? How has that changed your business? 
Fehrmann: It really allows our salespeople to provide great information during the prospecting 
phase. If we're prospecting with a new client, we can tell them very specifically that if they're in 
this industry, in this area, they can expect an application flow, depending on how big the 
company is, of, let’s say, a hundred applications per day. 
Gardner: How has this been a benefit to your end users, those people seeking jobs and those 
people seeking to fill jobs? 
Fehrmann: There are certainly some jobs that people are more interested in than others. On the 
flip side, if a particular job gets 100 or 500 applications, it's just a fact that only a small number 
are going to get that particular job. Now if you apply for a job that isn't as interesting, you have a 
much, much higher probability of getting the job. 
Gardner: Now that you’ve been here at the Big Data Conference for a day or two, what’s 
jumping out at you? What would you like to see from HP going forward, maybe across the 
HAVEn Portfolio or tighter integration between Hadoop and Vertica? One, what's of interest, and 
two, what would you like to see next year? 
Fehrmann: I attended one of the technical tracks on Maverick. It's fantastic what's coming up in 
Vertica. Second, what I'd like to see from HP is a tighter integration or to continue to integrate 
Hadoop and Vertica. I think it would be great to see Vertica as a sort of front end into Hadoop. 
Vertica has a great analytical engine and has all the SQL-92 compliance. There's not a whole lot 
of competition right now. Most other SQL distributions sitting on top of Hadoop either aren’t as 
compliant in terms of standards or they don't provide all the analytical capabilities. So for HP to 
reach down into the HDFS storage would be a great benefit for us. 
Gardner: Very good. I'm afraid we will have to leave it there. We've been talking with Snagajob, 
based in Richmond, Virginia, about how they are using big data on multiple levels to improve 
their business performance, their system’s performance, and ultimately how they go about 
understanding their new challenges and opportunities. 
With that, I'd like to thank our guest. We’ve been joined by Robert Fehrmann, the Data Architect 
at Snagajob in Richmond, Virginia. Thank you. 
Fehrmann: Thank you, Dana. 
Gardner: And I’d like to thank our audience as well for joining us for this special new style of 
IT discussion coming to you directly from the HP Big Data 2014 Conference in Boston. 
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of 
HP-sponsored discussions. Thanks again for listening, and do come back next time. 
Copyright Interarbor Solutions, LLC, 2005-2014. All rights reserved. 
You may also be interested in: 
• How Waste Management Builds a Powerful Services Continuum Across Operations, Infrastructure, Development and IT Practices 
• GSN Games hits top prize using big data to uncover deep insights into gamer preferences
• Hybrid cloud models demand more infrastructure standardization, says global service provider Steria 
• Service providers gain new levels of actionable customer intelligence from big data analytics 
• How UK data solutions developer Systems Mechanics uses HP Vertica for BI, streaming and data analysis 
• Advanced cloud service automation eases application delivery for global service provider NNIT 
• HP network management heightens performance while reducing total costs for Nordic telco TDC 
• How Capgemini's UK financial services unit helps clients manage risk using big data analysis 
• Perfecto Mobile goes to cloud-based testing so developers can build the best apps faster 
• Software security pays off: How Heartland Payment Systems gains steep ROI via software assurance tools and methods 
• HP ART documentation and readiness tools bring better user experiences to Nordic IT solutions provider EVRY

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Using a Big Data Solution Helps Conservation International Identify and Proac...
Using a Big Data Solution Helps Conservation International Identify and Proac...Using a Big Data Solution Helps Conservation International Identify and Proac...
Using a Big Data Solution Helps Conservation International Identify and Proac...
 
How Big Data Generates New Insights into What’s Happening in Tropical Ecosyst...
How Big Data Generates New Insights into What’s Happening in Tropical Ecosyst...How Big Data Generates New Insights into What’s Happening in Tropical Ecosyst...
How Big Data Generates New Insights into What’s Happening in Tropical Ecosyst...
 
HP Vertica Provides adMarketplace with Big Data Warehousing Solution
HP Vertica Provides adMarketplace with Big Data Warehousing SolutionHP Vertica Provides adMarketplace with Big Data Warehousing Solution
HP Vertica Provides adMarketplace with Big Data Warehousing Solution
 
DevOps by Design -- Practical Guide to Effectively Ushering DevOps into Any O...
DevOps by Design -- Practical Guide to Effectively Ushering DevOps into Any O...DevOps by Design -- Practical Guide to Effectively Ushering DevOps into Any O...
DevOps by Design -- Practical Guide to Effectively Ushering DevOps into Any O...
 
Nottingham Trent University Elevates Big Data’s Role to Improving Student Re...
Nottingham Trent University Elevates Big Data’s Role  to Improving Student Re...Nottingham Trent University Elevates Big Data’s Role  to Improving Student Re...
Nottingham Trent University Elevates Big Data’s Role to Improving Student Re...
 
How INOVVO Delivers Analysis that Leads to Greater User Retention and Loyalty...
How INOVVO Delivers Analysis that Leads to Greater User Retention and Loyalty...How INOVVO Delivers Analysis that Leads to Greater User Retention and Loyalty...
How INOVVO Delivers Analysis that Leads to Greater User Retention and Loyalty...
 
Beyond Look and Feel--The New Role That User Experience Plays in Business App...
Beyond Look and Feel--The New Role That User Experience Plays in Business App...Beyond Look and Feel--The New Role That User Experience Plays in Business App...
Beyond Look and Feel--The New Role That User Experience Plays in Business App...
 
Putting Buyers and Sellers in the Best Light, How Etsy Leverages Big Data for...
Putting Buyers and Sellers in the Best Light, How Etsy Leverages Big Data for...Putting Buyers and Sellers in the Best Light, How Etsy Leverages Big Data for...
Putting Buyers and Sellers in the Best Light, How Etsy Leverages Big Data for...
 
How Veikkaus Digitally Transforms as it Emerges as the New Combined Finnish N...
How Veikkaus Digitally Transforms as it Emerges as the New Combined Finnish N...How Veikkaus Digitally Transforms as it Emerges as the New Combined Finnish N...
How Veikkaus Digitally Transforms as it Emerges as the New Combined Finnish N...
 
Focus on Data, Risk Control, and Predictive Analysis Drives New Era of Cloud-...
Focus on Data, Risk Control, and Predictive Analysis Drives New Era of Cloud-...Focus on Data, Risk Control, and Predictive Analysis Drives New Era of Cloud-...
Focus on Data, Risk Control, and Predictive Analysis Drives New Era of Cloud-...
 
Intralinks Uses Hybrid Computing to Blaze a Compliance Trail Across the Regul...
Intralinks Uses Hybrid Computing to Blaze a Compliance Trail Across the Regul...Intralinks Uses Hybrid Computing to Blaze a Compliance Trail Across the Regul...
Intralinks Uses Hybrid Computing to Blaze a Compliance Trail Across the Regul...
 
With Large Workforce in the Field, Source Refrigeration Selects an Agile Plat...
With Large Workforce in the Field, Source Refrigeration Selects an Agile Plat...With Large Workforce in the Field, Source Refrigeration Selects an Agile Plat...
With Large Workforce in the Field, Source Refrigeration Selects an Agile Plat...
 
IT Support Gains Automation and Intelligence to Bring Self-Service to Both Le...
IT Support Gains Automation and Intelligence to Bring Self-Service to Both Le...IT Support Gains Automation and Intelligence to Bring Self-Service to Both Le...
IT Support Gains Automation and Intelligence to Bring Self-Service to Both Le...
 
SAP Ariba Chief Strategy Officer on The Digitization of Business and the Futu...
SAP Ariba Chief Strategy Officer on The Digitization of Business and the Futu...SAP Ariba Chief Strategy Officer on The Digitization of Business and the Futu...
SAP Ariba Chief Strategy Officer on The Digitization of Business and the Futu...
 
Mexican ISP Telum Gains Operational Advantages Via Project to Identify and Me...
Mexican ISP Telum Gains Operational Advantages Via Project to Identify and Me...Mexican ISP Telum Gains Operational Advantages Via Project to Identify and Me...
Mexican ISP Telum Gains Operational Advantages Via Project to Identify and Me...
 
How New York Genome Center Manages the Massive Data Generated from DNA Sequen...
How New York Genome Center Manages the Massive Data Generated from DNA Sequen...How New York Genome Center Manages the Massive Data Generated from DNA Sequen...
How New York Genome Center Manages the Massive Data Generated from DNA Sequen...
 
Spirent Leverages Big Data to Keep User Experience Quality a Winning Factor f...
Spirent Leverages Big Data to Keep User Experience Quality a Winning Factor f...Spirent Leverages Big Data to Keep User Experience Quality a Winning Factor f...
Spirent Leverages Big Data to Keep User Experience Quality a Winning Factor f...
 
Redcentric Uses Advanced Configuration Database to Bring into Focus Massive M...
Redcentric Uses Advanced Configuration Database to Bring into Focus Massive M...Redcentric Uses Advanced Configuration Database to Bring into Focus Massive M...
Redcentric Uses Advanced Configuration Database to Bring into Focus Massive M...
 
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
How HTC Centralizes Storage Management to Gain Visibility, Reduce Costs and I...
 
'Extreme Apps’ Approach to Analysis Makes On-Site Retail Experience King Again
'Extreme Apps’ Approach to Analysis Makes On-Site Retail Experience King Again'Extreme Apps’ Approach to Analysis Makes On-Site Retail Experience King Again
'Extreme Apps’ Approach to Analysis Makes On-Site Retail Experience King Again
 

Destaque

Destaque (10)

Legal Services Leader Foley & Lardner LLP Achieves Cost Savings and Increased...
Legal Services Leader Foley & Lardner LLP Achieves Cost Savings and Increased...Legal Services Leader Foley & Lardner LLP Achieves Cost Savings and Increased...
Legal Services Leader Foley & Lardner LLP Achieves Cost Savings and Increased...
 
Virtualization Spurs ERP Operations and Disaster Recovery for Sportswear Gian...
Virtualization Spurs ERP Operations and Disaster Recovery for Sportswear Gian...Virtualization Spurs ERP Operations and Disaster Recovery for Sportswear Gian...
Virtualization Spurs ERP Operations and Disaster Recovery for Sportswear Gian...
 
SOA Re-emerges to Provide Needed Support to Enterprise Architecture in Cloud,...
SOA Re-emerges to Provide Needed Support to Enterprise Architecture in Cloud,...SOA Re-emerges to Provide Needed Support to Enterprise Architecture in Cloud,...
SOA Re-emerges to Provide Needed Support to Enterprise Architecture in Cloud,...
 
The Open Group San Diego Panel Explores Global Cybersecurity Issues for Impro...
The Open Group San Diego Panel Explores Global Cybersecurity Issues for Impro...The Open Group San Diego Panel Explores Global Cybersecurity Issues for Impro...
The Open Group San Diego Panel Explores Global Cybersecurity Issues for Impro...
 
The State of Mobile Security and How Identity Advancement Plays an Essential ...
The State of Mobile Security and How Identity Advancement Plays an Essential ...The State of Mobile Security and How Identity Advancement Plays an Essential ...
The State of Mobile Security and How Identity Advancement Plays an Essential ...
 
Cloud and Big Data Come Together in the Ocean Observatories Initiative to Giv...
Cloud and Big Data Come Together in the Ocean Observatories Initiative to Giv...Cloud and Big Data Come Together in the Ocean Observatories Initiative to Giv...
Cloud and Big Data Come Together in the Ocean Observatories Initiative to Giv...
 
Service Virtualization Solves Quality and Performance Bottlenecks Amid Comple...
Service Virtualization Solves Quality and Performance Bottlenecks Amid Comple...Service Virtualization Solves Quality and Performance Bottlenecks Amid Comple...
Service Virtualization Solves Quality and Performance Bottlenecks Amid Comple...
 
Liberty Mutual Insurance Melds Regulatory Compliance with Security Awareness ...
Liberty Mutual Insurance Melds Regulatory Compliance with Security Awareness ...Liberty Mutual Insurance Melds Regulatory Compliance with Security Awareness ...
Liberty Mutual Insurance Melds Regulatory Compliance with Security Awareness ...
 
Learn More About Advances in Identity Management and It's Role in Reducing Cy...
Learn More About Advances in Identity Management and It's Role in Reducing Cy...Learn More About Advances in Identity Management and It's Role in Reducing Cy...
Learn More About Advances in Identity Management and It's Role in Reducing Cy...
 
New Health Data Deluges Require Secure Information Flow Enablement Via Standa...
New Health Data Deluges Require Secure Information Flow Enablement Via Standa...New Health Data Deluges Require Secure Information Flow Enablement Via Standa...
New Health Data Deluges Require Secure Information Flow Enablement Via Standa...
 

Semelhante a Hadoop and Vertica at Snagajob: How Big Data Technologies Drive Business Results

BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...
BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...
BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...
Dana Gardner
 

Semelhante a Hadoop and Vertica at Snagajob: How Big Data Technologies Drive Business Results (20)

Roundtable Discussion: Revlon, SAP and VMware See huge Benefits from Aggressi...
Roundtable Discussion: Revlon, SAP and VMware See huge Benefits from Aggressi...Roundtable Discussion: Revlon, SAP and VMware See huge Benefits from Aggressi...
Roundtable Discussion: Revlon, SAP and VMware See huge Benefits from Aggressi...
 
HP Vertica
HP Vertica HP Vertica
HP Vertica
 
Using Testing as a Service, Globe Testing Helping Startups Make Leap to Cloud...
Using Testing as a Service, Globe Testing Helping Startups Make Leap to Cloud...Using Testing as a Service, Globe Testing Helping Startups Make Leap to Cloud...
Using Testing as a Service, Globe Testing Helping Startups Make Leap to Cloud...
 
HP's Converged Infrastructure and Data Center Transformation Models Define th...
HP's Converged Infrastructure and Data Center Transformation Models Define th...HP's Converged Infrastructure and Data Center Transformation Models Define th...
HP's Converged Infrastructure and Data Center Transformation Models Define th...
 
Internet of Things Brings On Development Demands That DevOps Manages, Say Exp...
Internet of Things Brings On Development Demands That DevOps Manages, Say Exp...Internet of Things Brings On Development Demands That DevOps Manages, Say Exp...
Internet of Things Brings On Development Demands That DevOps Manages, Say Exp...
 
Choice, Consistency, Confidence Keys to Improving Services' Performance throu...
Choice, Consistency, Confidence Keys to Improving Services' Performance throu...Choice, Consistency, Confidence Keys to Improving Services' Performance throu...
Choice, Consistency, Confidence Keys to Improving Services' Performance throu...
 
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
Loyalty Management Innovator AIMIA's Transformation Journey to Modernized and...
 
Closing the Digital Transformation Gap
Closing the Digital Transformation GapClosing the Digital Transformation Gap
Closing the Digital Transformation Gap
 
Synthetic APIs Shape the Future of Data Acquisition and Management
Synthetic APIs Shape the Future of Data Acquisition and ManagementSynthetic APIs Shape the Future of Data Acquisition and Management
Synthetic APIs Shape the Future of Data Acquisition and Management
 
Gaining Digital Business Strategic View Across More Data Gives AmeriPride Cul...
Gaining Digital Business Strategic View Across More Data Gives AmeriPride Cul...Gaining Digital Business Strategic View Across More Data Gives AmeriPride Cul...
Gaining Digital Business Strategic View Across More Data Gives AmeriPride Cul...
 
A Practical Guide to Rapid ITSM as a Foundation for Overall Business Agility
A Practical Guide to Rapid ITSM as a Foundation for Overall Business AgilityA Practical Guide to Rapid ITSM as a Foundation for Overall Business Agility
A Practical Guide to Rapid ITSM as a Foundation for Overall Business Agility
 
Showing Value Early and Often Boosts Software Testing Overhaul Success at Pom...
Showing Value Early and Often Boosts Software Testing Overhaul Success at Pom...Showing Value Early and Often Boosts Software Testing Overhaul Success at Pom...
Showing Value Early and Often Boosts Software Testing Overhaul Success at Pom...
 
GoodData Developers Share Their Big Data Platform Wish List
GoodData Developers Share Their Big Data Platform Wish ListGoodData Developers Share Their Big Data Platform Wish List
GoodData Developers Share Their Big Data Platform Wish List
 
How Tunisian IT Service Provider Tunisie Builds Improved IT Service Managemen...
How Tunisian IT Service Provider Tunisie Builds Improved IT Service Managemen...How Tunisian IT Service Provider Tunisie Builds Improved IT Service Managemen...
How Tunisian IT Service Provider Tunisie Builds Improved IT Service Managemen...
 
Case Study: Sprint Simplifies IT Environment with Speedy Implementation of To...
Case Study: Sprint Simplifies IT Environment with Speedy Implementation of To...Case Study: Sprint Simplifies IT Environment with Speedy Implementation of To...
Case Study: Sprint Simplifies IT Environment with Speedy Implementation of To...
 
Explore the Roles and Myths of Automation and Virtualization in Data Center T...
Explore the Roles and Myths of Automation and Virtualization in Data Center T...Explore the Roles and Myths of Automation and Virtualization in Data Center T...
Explore the Roles and Myths of Automation and Virtualization in Data Center T...
 
Unum Group Architect Charts a DevOps Course to a Hybrid Cloud Future
Unum Group Architect Charts a DevOps Course to a Hybrid Cloud FutureUnum Group Architect Charts a DevOps Course to a Hybrid Cloud Future
Unum Group Architect Charts a DevOps Course to a Hybrid Cloud Future
 
BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...
BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...
BSM and IT Data Access Improvement at Swiss Insurer and Turkish Mobile Carrie...
 
Fast-Changing Demands on Data Centers Drives the Need for Automated Data Cent...
Fast-Changing Demands on Data Centers Drives the Need for Automated Data Cent...Fast-Changing Demands on Data Centers Drives the Need for Automated Data Cent...
Fast-Changing Demands on Data Centers Drives the Need for Automated Data Cent...
 
Manufacturer Gains Advantage by Expanding IoT Footprint from Many Machines to...
Manufacturer Gains Advantage by Expanding IoT Footprint from Many Machines to...Manufacturer Gains Advantage by Expanding IoT Footprint from Many Machines to...
Manufacturer Gains Advantage by Expanding IoT Footprint from Many Machines to...
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty

Hadoop and Vertica at Snagajob: How Big Data Technologies Drive Business Results

Transcript of a BriefingsDirect podcast on how an employment search company is using data analysis to bring better matching for job seekers and employers.
Listen to the podcast. Find it on iTunes. Sponsor: HP
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing sponsored discussion on IT innovation and how it’s making an impact on people’s lives.
Once again, we're focusing on how companies are adapting to the new style of IT to improve IT performance and deliver better user experiences, as well as better business results.
This time, we're coming to you directly from the HP Big Data 2014 Conference in Boston. We're here the week of August 11 to learn directly from IT and business leaders alike how big data, cloud, and converged infrastructure implementations are supporting their goals.
Our next innovation case study interview highlights how Snagajob in Richmond, Virginia, one of the largest hourly employment networks for job seekers and employers, is using big data to improve their performance, as well as to better understand how their systems provide services to their end users in a very rapid environment.
Snagajob recently delivered almost half a million new jobs to their systems in a single month. So the scale here is very impressive. To learn how they're managing that, we are here with Robert Fehrmann, the Data Architect at Snagajob in Richmond, Virginia. Welcome to the show.
Robert Fehrmann: Dana, thank you for the introduction.
Gardner: First, tell us about your organization. How are hourly workers different from regular employment? What type of employment are we talking about? You’ve been around since 2000 and you’ve been doing this successfully. Let's understand the role you play in the employment market.
Fehrmann: Snagajob, as you mentioned, is America's largest hourly network for employees and employers. The hourly market means we have, relatively speaking, high turnover. Another aspect, in comparison to some of our competitors, is that we provide an inexpensive service. So our subscriptions are on the low end, compared to our competitors.
Gardner: Tell us how you've used big data to improve your operations. I believe that among the first ways that you’ve done that is to try to better analyze your performance metrics. What were you facing as a problem when it came to performance metrics?
Signs of stress
Fehrmann: A couple of years ago, we started looking at our environment, and it became obvious that our traditional technology was showing some signs of stress. As you mentioned, we really have data at scale here. We have 20,000 to 25,000 postings per day, and we have about 700,000 unique visitors on a daily basis. So data is coming in very, very quickly.
We also realized that we're sitting on a gold mine and we were able to ingest data pretty well. But we had problems getting information and innovation out of our big data lake.
Gardner: And of course, real time is important. You want to catch degradation in any fashion from your systems right away. How do you then go about getting this in real time? How do you do the analysis?
Fehrmann: We started using Hadoop. I'll use a lot of technical terms here. From our website, we're getting events. Events are routed via Flume directly into Hadoop. We're collecting about 600 million key-value pairs on a daily basis. It's a massive amount of data, 25 gigabytes on a daily basis.
The second piece in this journey to big data was analyzing these events, and that’s where we're using Vertica. Our original use case was to analyze a funnel. A funnel is where people come to our site. They're searching for jobs, maybe by keyword, maybe by zip code. A subset of that is an interest in a job, and they click on a posting. A subset of that is applying for the job via an application. A subset is interest in an employer, and so on. We had never been able to analyze this funnel.
The dataset is about 300 to 400 million rows and 30 to 40 gigabytes. We wanted to make this data available, not just to our internal users, but to all external users.
Therefore, we set ourselves a goal of a five-second response time. No query on this dataset should run for more than five seconds, and Vertica and Hadoop gave us a solution for this.
Gardner: Any metrics of success? How have you been able to increase your performance and reach your key performance indicators (KPIs) and service-level agreements (SLAs)? How has this benefited you?
Fehrmann: Another application that we were able to implement is a recommendation engine. A recommendation engine addresses the case where jobseekers who apply for a specific job may not know about all the other jobs that are very similar to this job or that other people have applied to.
We started analyzing the search results that we were getting and implemented a recommendation engine. Sometimes it’s very difficult to have a real comparison between before and after. Here, we were able to see that we got an 11 percent increase in application flow. Application flow is how many applications a customer is getting from us. By implementing this recommendation engine, we saw an immediate 11 percent increase in application flow, one of our key metrics.
Gardner: So you took the success from your big-data implementation and analysis capabilities from this performance task to some other areas. Are there other business areas, search yield, for example, where you can apply this to get other benefits?
Brand-new applications
Fehrmann: When we started, we had the idea that we were looking for a solution for migrating our existing environment to a better-performing new environment. But what we've seen is that most of the applications we've developed so far are brand-new applications that we hadn't been able to do before.
You mentioned search yield. Search yield is a very interesting aspect. It’s a massive dataset. It's about 2.5 billion rows and about 100 gigabytes of data as of right now, and it's continuously increasing. So for all of the applications, as well as all of the search requests that we have collected since we started this environment, we're able to analyze the search yield.
For example, that's how many applications we get for a specific search keyword in real time. By real time, I mean that somebody can run a query against this massive dataset and get a result in a couple of seconds.
We can analyze specific jobs in specific areas, specific keywords that are searched in a specific time period or in a specific location of the country.
Gardner: And once again, now that you've been able to do something you couldn't do before, what have been the results? How has that changed your business?
Fehrmann: It really allows our salespeople to provide great information during the prospecting phase. If we're prospecting with a new client, we can tell them very specifically that if they're in this industry, in this area, they can expect an application flow, depending on how big the company is, of, let’s say, a hundred applications per day.
Gardner: How has this been a benefit to your end users, those people seeking jobs and those people seeking to fill jobs?
Fehrmann: There are certainly some jobs that people are more interested in than others. On the flip side, if a particular job gets 100 or 500 applications, it's just a fact that only a small number are going to get that particular job. Now if you apply for a job that isn't as interesting, you have a much, much higher probability of getting the job.
Gardner: Now that you’ve been here at the Big Data Conference for a day or two, what’s jumping out at you? What would you like to see from HP going forward, maybe across the HAVEn Portfolio or tighter integration between Hadoop and Vertica? One, what's of interest, and two, what would you like to see next year?
Fehrmann: I attended one of the technical tracks on Maverick. It's fantastic what's coming up in Vertica. Second, what I'd like to see from HP is a tighter integration or to continue to integrate Hadoop and Vertica. I think it would be great to see Vertica as a sort of front end into Hadoop. Vertica has a great analytical engine and has all the SQL-92 compliance.
There's not a whole lot of competition right now. Most other SQL distributions sitting on top of Hadoop either aren’t as compliant in terms of standards or they don't provide all the analytical capabilities. So for HP to reach down into the HDFS storage would be a great benefit for us.
Gardner: Very good. I'm afraid we will have to leave it there. We've been talking with Snagajob, based in Richmond, Virginia, about how they are using big data on multiple levels to improve their business performance, their systems' performance, and ultimately how they go about understanding their new challenges and opportunities.
With that, I'd like to thank our guest. We’ve been joined by Robert Fehrmann, the Data Architect at Snagajob in Richmond, Virginia. Thank you.
Fehrmann: Thank you, Dana.
Gardner: And I’d like to thank our audience as well for joining us for this special new style of IT discussion coming to you directly from the HP Big Data 2014 Conference in Boston.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP sponsored discussions. Thanks again for listening, and do come back next time.
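Editorial illustration: the funnel and search-yield analyses Fehrmann describes (search, then a click on a posting, then an application) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not Snagajob's implementation: the event type names, the `session_id` and `keyword` fields, and the in-memory sample list are all hypothetical. In the setup described in the interview, events would flow through Flume into Hadoop and the queries would run in Vertica.

```python
from collections import Counter, defaultdict

# Hypothetical stage names, ordered from the top of the funnel down.
STAGES = ["search", "click", "apply"]

def funnel_counts(events):
    """Count sessions that reached each stage and every stage before it."""
    seen = defaultdict(set)  # session_id -> set of event types observed
    for event in events:
        seen[event["session_id"]].add(event["type"])
    counts = {}
    for i, stage in enumerate(STAGES):
        required = set(STAGES[: i + 1])
        counts[stage] = sum(1 for types in seen.values() if required <= types)
    return counts

def applications_per_keyword(events):
    """Search yield in its simplest form: applications counted per search keyword.

    Assumes each 'apply' event carries the keyword of the search that led to it.
    """
    return Counter(e["keyword"] for e in events if e["type"] == "apply")

# Made-up sample events, for illustration only.
SAMPLE_EVENTS = [
    {"session_id": "s1", "type": "search", "keyword": "cashier"},
    {"session_id": "s1", "type": "click", "keyword": "cashier"},
    {"session_id": "s1", "type": "apply", "keyword": "cashier"},
    {"session_id": "s2", "type": "search", "keyword": "cashier"},
    {"session_id": "s2", "type": "click", "keyword": "cashier"},
    {"session_id": "s3", "type": "search", "keyword": "barista"},
]

if __name__ == "__main__":
    print(funnel_counts(SAMPLE_EVENTS))
    print(applications_per_keyword(SAMPLE_EVENTS))
```

Each funnel stage is a subset of the one above it, which matches the "a subset of that" wording in the interview. At the row counts quoted (hundreds of millions to billions), the same grouping would be expressed as SQL against a columnar store rather than run in memory.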
Copyright Interarbor Solutions, LLC, 2005-2014. All rights reserved.
You may also be interested in:
• How Waste Management Builds a Powerful Services Continuum Across Operations, Infrastructure, Development and IT Practices
• GSN Games hits top prize using big data to uncover deep insights into gamer preferences
• Hybrid cloud models demand more infrastructure standardization, says global service provider Steria
• Service providers gain new levels of actionable customer intelligence from big data analytics
• How UK data solutions developer Systems Mechanics uses HP Vertica for BI, streaming and data analysis
• Advanced cloud service automation eases application delivery for global service provider NNIT
• HP network management heightens performance while reducing total costs for Nordic telco TDC
• How Capgemini's UK financial services unit helps clients manage risk using big data analysis
• Perfecto Mobile goes to cloud-based testing so developers can build the best apps faster
• Software security pays off: How Heartland Payment Systems gains steep ROI via software assurance tools and methods
• HP ART documentation and readiness tools bring better user experiences to Nordic IT solutions provider EVRY