This talk was presented by Sriskandarajah Suhothayan (WSO2) and Roland Major (Transport for London) at the Strata Data Conference in London, May 23 2017.
Transport for London (TfL) uses a wide range of data for operational purposes, but the underlying data is typically held in multiple disconnected systems. Freedom of Information requests have helped prove the value of sharing this data. TfL is embarking on a journey to make more of this data open and available in real time.
TfL and WSO2 have been working together on broader integration projects. Roland Major and Sriskandarajah Suhothayan share the evolving big data and IoT architectures and services TfL is building to pull together these diverse datasets, both to better support operational teams and to accelerate the identification and classification of disruption so that incidents get a faster response. In particular, they explore WSO2’s solution, which emerged from the Data in Motion hackathon organized by TfL, AWS, and Geovation. The solution combines the TfL Unified API with TfL’s operational data sources, including traffic sensor, air quality, and passenger flow data, to provide better travel time and transit suggestions for Londoners and tourists. It uses the WSO2 Data Analytics Server, WSO2 Complex Event Processor, and WSO2 API Manager, bringing together IoT and big data techniques to feed a real-time dashboard of current and predicted transport network status.
Transport for London: Using data to keep London moving
1. Transport for London, using data
to keep London moving
Roland Major
Enterprise Architect
TfL
Sriskandarajah Suhothayan
Associate Director / Architect
WSO2
2. Agenda
§ Introduction to Transport for London
- Surface Intelligent Transport System
- Conceptual Architecture
§ Introduction to WSO2
§ Data in Motion - Hack Week
- ‘Live Journey Planner’ prototype
- How It’s Done?
§ Data Driven Operational Applications
§ Game Changing Visualizations
§ Learning Outcomes
3. Breadth of Transport for London
§ 30 million journeys daily
§ In addition to all road and rail
transport, we look after rivers, assisted
travel, taxi and private hire regulation
§ We also do much more, from education
to running the world’s largest out-of-home
advertising estate of its type
§ We’re 150 years old and chock full of
heritage and design assets
4. Road Space Management
§ Sponsorship
- Own the improvement strategy for the TLRN
- Engage with a wide range of customers and external stakeholders
- Deliver user benefits that are clearly defined and measured
§ Outcome Delivery
- Highways design and engineering
- Traffic modelling capability
- Intelligent control of traffic signals
- Monitor, analyse and optimise road network performance
§ Operations
- 24/7 Operation to keep London moving
- Real-time incident management through LSTCC
- Assess and coordinate works and schemes to minimise disruption on TfL’s roads
5. Definition
§ Digital transformation is the change associated with the application of digital
technology in all aspects of human society. ...
§ The transformation stage means that digital usages inherently enable new types
of innovation and creativity in a particular domain, rather than simply enhance and
support the traditional methods.
6. Surface Intelligent Transport System
§ SITS will
- provide the capability to unlock significant additional effective capacity on the road
network for the future
- enable and support delivery of a multi-modal approach to transport management
by using and allocating existing and new capacity
- enable and support delivery of a Balanced Scorecard approach to transport
management by using and allocating existing and new capacity based on local
modal demands
7. Conceptual Architecture
Key Characteristics
§ Bringing information and insight closer to
decisions
- Locale, map based UI
- Timely, event driven not batch
- Trusted, consistent and accurate
§ Public cloud hosted
§ Reusable commodity platforms
§ Open Standards
Integration
Data Hub
Collaboration and Innovation
Analytics and Visualisation
Data Driven Operational Applications
Secure Enclave
8. The Data Hub
Commodity Cloud
§ Scalable Object Store S3
§ Persistence Services (SQL and NoSQL) Aurora + Dynamo
§ Elasticsearch
§ Spark
§ Compute EC2
Configured Platform Services
§ Spatial - ESRI
§ Data Warehouse - Oracle RDBMS + Redshift
§ Middleware Platform - WSO2
Data Hub
9. WSO2
§ Enabler for Digital Transformation
§ 100% Open Source Middleware Platform
§ Offices in: Mountain View, New York, London, São Paulo, Colombo
§ 350+ Customers
§ 450 People, 300 Engineers
11. Data In Motion - Hack Week
§ September 26-29
§ Objective : Managing the Capacity of London’s Transport Network
- Maximizing capacity on the public transport network
- Maximizing capacity on the roads network
- Improving air quality
§ Datasets
- TFL APIs (Realtime and Historical)
- SCOOT sensor reading (Realtime)
- Passenger flow (Historical)
- Air quality (from KCL) (Historical)
§ Solution
- ‘Live Journey Planner’ Prototype
13. Chance of getting a seat on the train?
Summarized Historic Data
14. Traffic Control Sensors
§ TfL has about 14,000 sensors measuring traffic approaching junctions
§ This data is currently used by the real-time control system to manage optimization
§ We have been investigating how it can be processed
- Scale: 780 million events per day
- Latency to data center: circa 1 second
- Resolution: 250 ms scans
Junction
SCOOT Sensor
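As a back-of-envelope check on the scale figure above, the daily event count can be turned into an average per-second load. This is only arithmetic on the number quoted on the slide; it assumes events are spread evenly over the day, which real traffic data will not be.

```python
# Average ingest rate implied by the slide's "780 million events per day".
# Assumption: events are spread evenly across 24 hours (peaks will be higher).
EVENTS_PER_DAY = 780_000_000
SECONDS_PER_DAY = 24 * 60 * 60

events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
print(f"average load: {events_per_second:,.0f} events per second")
```

The result is an average of roughly nine thousand events per second, which is the rate any real-time pipeline over this feed has to sustain.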
18. Looking into the Future
§ Collect SCOOT Data
§ Learning traffic patterns using R
- Building a Random Forest classifier (88% accurate for this use case)
§ Exported the model as PMML
§ Use the model to predict traffic in realtime with WSO2 DAS
20. How It’s Done?
Raw SCOOT Data → Traffic and Flow Calculation → Integrating Historic Summarization → Predicting Traffic → Potential Incident Analysis
§ Realtime data processing pipeline
21. Detect Headway and Vehicle Length
§ With raw SCOOT data stream
Sample stream: 1111100000111111100000
define stream ScootStream (scootId string, time long, reading int, seqId long);
from every e1=ScootStream[reading==1], e2=ScootStream[reading==0]+,
e3=ScootStream[reading==1]+, e4=ScootStream[reading==0]
select e3[0].seqId - e2[0].seqId as headway,
e4[0].seqId - e3[0].seqId as vehicleLength, ...
insert into DetectorStream;
Pattern Matching
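The pattern query above can be read as run-length analysis of the detector bit stream: a run of 0s between two vehicles is the headway, and the following run of 1s is the vehicle occupying the sensor. The following is a minimal Python sketch of that idea, not the Siddhi implementation. The function name `detect_vehicles` is hypothetical, and measuring in scan counts (rather than `seqId` differences) is an assumption.

```python
from itertools import groupby

def detect_vehicles(bits):
    """Return (headway, vehicle_length) pairs, measured in scans.

    A detection needs the same shape as the slide's pattern:
    occupied (1s), gap (0s), occupied (1s), then a closing 0.
    """
    # Collapse the stream into runs, e.g. "1100" -> [("1", 2), ("0", 2)].
    runs = [(bit, sum(1 for _ in grp)) for bit, grp in groupby(bits)]
    detections = []
    # Runs alternate between "1" and "0", so a "1" run at index >= 2 that is
    # not the last run is preceded by a gap and closed by a following 0.
    for i in range(2, len(runs) - 1):
        bit, length = runs[i]
        if bit == "1":
            gap = runs[i - 1][1]
            detections.append((gap, length))
    return detections

print(detect_vehicles("1111100000111111100000"))
```

For the sample stream on the slide this reports one vehicle with a headway of 5 scans and a length of 7 scans. At the 250 ms scan resolution quoted earlier, that corresponds to a 1.25 s gap and 1.75 s of occupancy.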
22. Traffic and Flow Calculation
§ Calculated for all SCOOT detectors over the last 10 minutes in the London region
from DetectorStream[str:split(scootId, "-", 0) == 'london']#window.time(10 min)
select count()/60 as flow,
avg(vehicleLength)/60 as traffic,
1/avg(headway) as density, ...
group by scootId
insert into TrafficStream;
§ The results are mapped to links and presented via APIs for visual representation
Filtering
SlidingTimeWindow
Group By
Aggregations
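The filter, sliding time window, group-by, and aggregation steps can be sketched in plain Python as follows. This is an illustrative sketch only: the class name `TrafficWindow` is hypothetical, timestamps are assumed to be in milliseconds, and the three formulas are copied from the query on the slide rather than derived independently.

```python
from collections import defaultdict, deque

WINDOW_MS = 10 * 60 * 1000  # 10-minute sliding window, as on the slide

class TrafficWindow:
    """Per-detector sliding window over (time, headway, vehicle_length) events."""

    def __init__(self, window_ms=WINDOW_MS):
        self.window_ms = window_ms
        self.events = defaultdict(deque)  # scoot_id -> events inside the window

    def add(self, scoot_id, time_ms, headway, vehicle_length):
        # Filter: keep only detectors whose id starts with "london-"
        # (the id layout is taken from the slide's str:split filter).
        if scoot_id.split("-")[0] != "london":
            return None
        window = self.events[scoot_id]  # group by scootId
        window.append((time_ms, headway, vehicle_length))
        # Expire events that have fallen out of the time window.
        while window and window[0][0] <= time_ms - self.window_ms:
            window.popleft()
        n = len(window)
        avg_headway = sum(e[1] for e in window) / n
        avg_length = sum(e[2] for e in window) / n
        return {
            "flow": n / 60,
            "traffic": avg_length / 60,
            "density": 1 / avg_headway if avg_headway else None,
        }

w = TrafficWindow()
w.add("london-001", 0, 5, 7)
print(w.add("london-001", 1_000, 3, 9))
```

Keeping one deque per detector means expiry only touches the oldest events of that detector, so each incoming event is processed in time proportional to the number of events it expires.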
23. Integrating With Historic Summarization
§ Historic data analysis with Apache Spark
§ Joining With Summarized Data
@from(table='rdbms', url='...', ...)
define table TrafficSummery (scootId string, week int, day string, traffic long);
from TrafficStream as ts join TrafficSummery as tt
on ts.week == tt.week and ts.day == tt.day
select ts.traffic as currentTraffic, tt.traffic as usualTraffic, ...
insert into SummeryTrafficStream;
Join
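Conceptually, the stream-to-table join enriches each live event with the usual traffic level for the same week and day. A minimal Python sketch of such a lookup join follows. The function name and the dictionary layout are hypothetical, and keying the summary by detector id as well as week and day is an assumption that goes beyond the join condition shown on the slide.

```python
def join_with_summary(event, summary):
    """Inner-join one live traffic event against a historic summary table.

    `summary` maps (scootId, week, day) -> usual traffic. Events with no
    matching summary row are dropped, as an inner join would do.
    """
    key = (event["scootId"], event["week"], event["day"])
    usual = summary.get(key)
    if usual is None:
        return None
    return {
        "scootId": event["scootId"],
        "currentTraffic": event["traffic"],
        "usualTraffic": usual,
    }

# Hypothetical summary row and live event, for illustration only.
summary = {("london-001", 21, "Tuesday"): 40}
event = {"scootId": "london-001", "week": 21, "day": "Tuesday", "traffic": 55}
print(join_with_summary(event, summary))
```

Holding the summary in a keyed structure keeps the per-event cost of the join constant, which matters at the event rates discussed earlier.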
24. How it’s Done?
§ Predicting traffic in next 15 minutes
from SummeryTrafficStream
#pmml:predict('wso2das-3.1.0/marbel_model.pmml')
select *
insert into PredictedTrafficStream;
Predict Function
25. Potential Incident Analysis
§ Increasing trend in traffic hikes
from PredictedTrafficStream
select currentTraffic - historicTraffic as currentHick,
predictedTraffic - currentTraffic as predictedHick, ...
having currentHick > 0 and predictedHick > 0
insert into TrafficHickStream;
from every e1=TrafficHickStream ->
e2=TrafficHickStream[ (e2.currentHick - e1.currentHick)*100.0 / e1.currentHick > 20 and
(e2.predictedHick - e1.predictedHick)*100.0 / e1.predictedHick > 20]
within 15 min
insert into PotentialIncidentStream;
Pattern Matching
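The two queries above first keep only events where traffic is above its historic level and predicted to rise further, and then flag a potential incident when both hikes grow by more than 20% between two such events within 15 minutes. The Python sketch below approximates that logic. The class name `IncidentDetector`, the millisecond `time` field, and the rule of dropping a pending event once it has matched are assumptions about the pattern semantics, not a statement of how Siddhi evaluates the query.

```python
WITHIN_MS = 15 * 60 * 1000  # "within 15 min"
GROWTH_PCT = 20             # both hikes must grow by more than 20%

def hike(event):
    """Return the hike event, or None unless both hikes are positive."""
    current = event["currentTraffic"] - event["historicTraffic"]
    predicted = event["predictedTraffic"] - event["currentTraffic"]
    if current > 0 and predicted > 0:
        return {"time": event["time"], "currentHick": current, "predictedHick": predicted}
    return None

def grew(old, new):
    # `old` is always > 0 here because hike() filtered non-positive values.
    return (new - old) * 100.0 / old > GROWTH_PCT

class IncidentDetector:
    def __init__(self):
        self.pending = []  # earlier hike events still waiting for a follow-up

    def on_event(self, event):
        h = hike(event)
        if h is None:
            return []
        # Forget earlier events that are now outside the 15-minute limit.
        live = [p for p in self.pending if h["time"] - p["time"] <= WITHIN_MS]
        matched, waiting = [], []
        for p in live:
            if grew(p["currentHick"], h["currentHick"]) and grew(p["predictedHick"], h["predictedHick"]):
                matched.append((p, h))
            else:
                waiting.append(p)
        waiting.append(h)  # the new event may start a later match
        self.pending = waiting
        return matched

d = IncidentDetector()
d.on_event({"time": 0, "currentTraffic": 110, "historicTraffic": 100, "predictedTraffic": 120})
alerts = d.on_event({"time": 60_000, "currentTraffic": 115, "historicTraffic": 100, "predictedTraffic": 130})
print(len(alerts))
```

In the usage above the hikes grow from 10 to 15 within one minute, a 50% increase on both measures, so the second event raises one potential incident.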
27. Data Driven Operational Applications
London Works 2
§ Central Register – a pan London system enabling visibility and management of works and
related activities in London
§ Traffic Management Act Notifications (TMAN) - A dedicated interface between London
boroughs and TfL enabling the balanced delivery of major schemes and works on the
TLRN and SRN
§ Forward Planning Tool - An advance planning tool that allows promoters to provide early
visibility of road and street works
36. Visualization For Situational Awareness
§ Trial of Waze crowdsourced data has just
started
§ Early results are promising for
incident detection
§ Solution combines several data
sources to enrich real time view
§ Management views and dashboard
included
37. Visualization For Situational Awareness …
§ Needs Filters and Alerts
§ Areas of Interest
§ Design is evolving rapidly
38. Learning Points
§ Realtime analytics on data provides an edge
§ Keep focus on usability
§ Use the right tool for the right task
§ Skills are the biggest hurdle
§ Bringing information sets together encourages new thinking
§ Using Agile approaches has transformed outcomes
§ Removing system fragmentation has a big impact on organization
§ Flattening delivery structures & small staged initiatives
§ Platform approach
§ Early indications are that making data easily shared and integrated is improving decisions
39. Thank You !
Roland Major
roland.major@btopenworld.com
S. Suhothayan
suho@wso2.com
www.wso2.com