Presentation at the Big Data Show, part of Internet World, in London, April 2013.
It's an overview of the main technologies used to process big data, followed by a review of some real-world big data use cases, showing how they map onto the axes of complexity of analytics and responsiveness.
1. When Two Seconds Is Too Long
A look at low latency, real-time analytics
Dai Clegg, VP Product Marketing
dai@acunu.com
@daiclegg
2. The new data sources driving big data analytics
Operational Intelligence: mobile marketing, social apps, infrastructure monitoring
Batch Analytics: batch reporting, exploratory analysis, data discovery
Machine Data: infrastructure fabric/logs, smart grids/smart meters, RFID tags, etc.
Social Data: social media, mobile apps, web clicks
3. The Big Data technology landscape
‣ data mining/data warehousing
‣ quantitative analytics
‣ data access
4. The Big Data technology landscape
[Chart: complexity of analytics (data mining/data warehousing, quantitative analytics, data access) plotted against immediacy of results (hours, minutes, seconds, milliseconds)]
Use case: money-off coupons
‣ Identifies items that shoppers are likely to buy in future visits
‣ Coupon redemption rates as high as 24%
5. The Big Data technology landscape
Use case: babies
‣ Neo-natal infant monitoring
‣ 120 babies monitored
‣ 120k messages/second
6. The Big Data technology landscape
Use case: windmills
‣ 2.5 petabytes in Hadoop
‣ weather data, turbine operational data
‣ model weather to optimise wind farms
7. The Big Data technology landscape
Use case: shopping carts
‣ Global e-store
‣ Shopping cart session store
‣ 1000s of transactions per second
8. The Big Data technology landscape
Use case: hi-tech manufacturing
‣ automated hi-tech assembly
‣ 1000s of test readings per second
‣ comparing results to historic metrics
‣ monitoring the test stations’ performance
9. The Big Data technology landscape
Use case: taxis
‣ Real-time visibility of infrastructure
‣ Insight delivered into the cab
‣ Caught competitor ‘stealing’ web data
10. The Big Data technology landscape
Use case: SMS marketing
‣ ingesting 300m SMSs daily
‣ over 5000 events per second
‣ maintaining 90 days history per campaign
‣ Oracle/NetApp only supported 45 days
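A quick back-of-envelope check on the figures above, as a sketch only. 300m messages a day averages out at roughly 3,500 a second, so the quoted 5,000 events per second is presumably a peak rate, or counts more than one event per message (an assumption; the slide does not say). The variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

sms_per_day = 300_000_000        # from the slide
retention_days = 90              # history kept per campaign, from the slide
previous_retention_days = 45     # Oracle/NetApp figure, from the slide

# Average ingest rate if traffic were spread evenly over the day.
avg_per_second = sms_per_day / SECONDS_PER_DAY

# Number of messages that must stay queryable at each retention period.
retained = sms_per_day * retention_days
previously_retained = sms_per_day * previous_retention_days

print(f"average ingest rate: {avg_per_second:,.0f} SMS/second")
print(f"messages held at {retention_days} days: {retained:,}")
print(f"messages held at {previous_retention_days} days: {previously_retained:,}")
```

Doubling the retention window doubles the number of messages that must stay queryable, from 13.5 billion to 27 billion at this ingest rate.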
11. The Big Data technology landscape
Use case: X-factor
‣ 1000s of votes/boos/applauds per second
‣ “We were able to handle our problems with $5,000 and a credit card.” - CTO, Tellybug
12. The Big Data technology landscape
Use cases plotted: money-off coupons, windmills, babies, shopping carts, hi-tech manufacturing, taxis, SMS marketing, X-factor
Chart annotations:
‣ latest data, historic context, analytic insight
‣ deep analytics worth the wait
‣ just need put & get
‣ just need recent data
13. Acunu Analytics
[Figure: Acunu Analytics with a sample sales dashboard]
‣ Acunu Analytics can ingest data, at very high velocity, from any source
‣ The data is pre-processed, as it arrives, to filter, transform and enrich it with other corporate data
‣ And aggregated into roll-up cubes of sums, averages, top k, etc., so query answers are already stored
‣ Then, when a dashboard query is executed, the answer is there for instant response
‣ But not just dashboards; there’s a JSON API so queries can be embedded in other apps
‣ And the original data is stored for further analysis and to share with other analytic tools
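The flow described above (pre-process on arrival, roll up into aggregates, answer queries from stored results) can be sketched in a few lines of Python. This is an illustrative sketch only, not Acunu's implementation or API: the class RollupCube, its methods, the event fields (ts, store, product, value) and the enrichment table are all hypothetical names chosen for the example.

```python
import json
from collections import Counter, defaultdict


class RollupCube:
    """Hypothetical pre-aggregating store: totals are updated on ingest,
    so a query is a dictionary lookup rather than a scan of raw events."""

    def __init__(self, enrich):
        self.enrich = enrich                     # reference data: store -> region
        self.count = defaultdict(int)            # (region, minute) -> event count
        self.total = defaultdict(float)          # (region, minute) -> summed value
        self.by_product = defaultdict(Counter)   # region -> product -> summed value
        self.raw = []                            # original events, kept for later analysis

    def ingest(self, event):
        self.raw.append(event)                   # keep the original data
        if event["value"] <= 0:                  # filter: skip non-positive values
            return
        region = self.enrich.get(event["store"], "unknown")  # enrich
        minute = event["ts"] // 60               # transform: bucket by minute
        key = (region, minute)
        self.count[key] += 1
        self.total[key] += event["value"]
        self.by_product[region][event["product"]] += event["value"]

    def query(self, region, minute, k=2):
        """Read the stored aggregates; no raw events are scanned.
        To keep the sketch short, top_k is per region over all time."""
        key = (region, minute)
        n = self.count.get(key, 0)
        total = self.total.get(key, 0.0)
        return {
            "region": region,
            "minute": minute,
            "sum": total,
            "avg": total / n if n else None,
            "top_k": self.by_product[region].most_common(k),
        }

    def query_json(self, region, minute, k=2):
        """Same answer serialised as JSON, for embedding in another app."""
        return json.dumps(self.query(region, minute, k))


# Usage with made-up events.
cube = RollupCube(enrich={"s1": "north", "s2": "north"})
for e in [
    {"ts": 5, "store": "s1", "product": "tea", "value": 3.0},
    {"ts": 30, "store": "s2", "product": "coffee", "value": 5.0},
    {"ts": 45, "store": "s1", "product": "tea", "value": 4.0},
    {"ts": 50, "store": "s1", "product": "milk", "value": 0.0},  # filtered out
]:
    cube.ingest(e)

result = cube.query("north", 0)
print(result)
print(cube.query_json("north", 0))
```

The design choice the sketch illustrates is the trade of write-time work for read-time speed: each event updates a handful of counters as it arrives, so query cost does not grow with the volume of raw data.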