6. NO. NOT MILK FLOATS
ALL ELECTRIC, COMMERCIAL VEHICLES. (ANYMORE)
Photo courtesy of kenjonbro: http://www.flickr.com/photos/kenjonbro/4037649210/in/set-72157623026469013
41. MEMCACHE
// instantiate
$mc = new Memcache();
// set the memcache server and port
$mc->connect( MEMCACHE_SERVER, MEMCACHE_PORT );
// get data based on a key
$value = $mc->get('key');
58. WHICH BUCKET?
public function getTableNameFromDate( $date ) {
    // we can't query for future data, so if the date is in the future
    // reset it to today
    $date = ( $date > date( 'Y-m-d' ) ) ? date( 'Y-m-d' ) : $date;
    // get the time in seconds since the epoch
    $stt = strtotime( $date );
    // is the query date since we implemented sharding?
    if( $date >= $this->switchOver ) {
        // calculate the year this week is from; 'o' is the ISO-8601 year that
        // the week number belongs to, which also covers week 53 falling in
        // January and week 01 falling in late December
        $year = date( 'o', $stt );
        // add the year and the week number to the table, and return
        return 'data_' . $year . '_' . date( 'W', $stt );
    } else {
        // return the legacy table
        return 'data';
    }
}
75. INTERLACE
* Add an array to the interlation
public function addArray( $name, $array )
* Get the time that we first receive data in one of our arrays
public function getFirst( $field )
* Get the time that we last received data in any of our arrays
public function getLast( $field )
* Generate the interlaced array
public function generate( $keyField, $valueField )
* Break the interlaced array down into separate days
public function dayBreak( $interlationArray )
* Generate an interlaced array and fill for all timestamps within the range of _first_ to _last_
public function generateAndFill( $keyField, $valueField )
* Populate the new combined array with key fields using the common field
public function populateKeysFromField( $field, $valueField=null )
http://www.michaelpeacock.co.uk/interlation-library
For the past 10 months I’ve been involved in a very challenging application focused on big data. This presentation is a case study of that application, covering some of the challenges we faced and the solutions we found in handling huge volumes of data: processing them, storing them, and displaying them within the web application. The challenges ranged from staying available to process incoming data, to storing data at volume, querying the database quickly, and keeping the web application responsive. Our team inherited a large legacy application, which meant we were left with stability problems, hacks and architecture decisions which we didn’t necessarily agree with.
For those of you who don’t know me, I’m Michael Peacock, the web systems developer for Smith Electric Vehicles on their telemetry project. I’m an experienced lead developer; I spent 5 years running a small web design and development firm, writing bespoke CMS, e-commerce and CRM solutions. I’ve written a few PHP-related books, and I volunteer with the PHP North-East user group in the UK. You can find me on Twitter, or contact me if you have any questions later.
Smith is the world’s largest manufacturer of all-electric commercial vehicles. The company was founded over 90 years ago to build electric delivery vehicles, both battery based and cable based. In 2009 the company opened its doors in the US, and at the start of last year the US operation bought out the European company, which brings us to where we are today.
Normally when I tell people I work with electric vehicles, they think of hybrids like the Prius, or of passenger electric vehicles such as the Nissan Leaf. When I tell them it’s COMMERCIAL electric vehicles, they think about milk floats or airport passenger buggies. What we actually make are large-scale, fully electric delivery and transport vehicles.
These vehicles range from depot-based delivery vehicles and home delivery vehicles to utility and military applications, right through to the American school bus.
These are 16,500 to 26,000 pound delivery vehicles, capable of carrying a payload of up to 16,000 pounds, with a top speed of 80 km/h.
Electric vehicles bring us a huge data challenge. As I’m sure you can appreciate, they are a new, continually evolving technology: people are looking for evidence of viability; governments want to do research; and for passenger vehicles, charging infrastructure needs to be planned and developed. Customers who buy these vehicles want to monitor them, whether that means tracking performance metrics or checking that their drivers have been properly trained to drive an electric vehicle, as they are very different to drive.
The solution to these problems was to develop a vehicle telematics service to collect data on the vehicles. Our vehicles, like many other types of vehicle, have a number of internal networks, called CAN buses, which broadcast data continuously around the vehicle in such a way that a central host isn’t required. The problem is that they broadcast data internally hundreds of times per second. We took and developed a CAN bus monitor to sample this once per second. The information includes the drive train: how hard and fast the pedals are being pressed, how fast the motor is spinning, and how hot the motor is. It includes the overall state of the battery: current, voltage, temperature and state of charge. It also covers the individual modules in the battery and how they are performing. We also need to know where the vehicle is, what the outside temperature is and so on, and we need to report any error codes the vehicle raises. Our telemetry unit samples all of this information every second, packages it up and broadcasts it over the GPRS network.
Another problem was that the project was initially designed by a group of contractors, each handling a different aspect of the application, developing what was initially a proof of concept system. As the project grew, requirements changed, the data volumes increased exponentially, data became more important and stability became a big problem. Right about this time the company hired a full-time internal telemetry team. The team consists of two hands-on technical staff: myself as the web application developer, and my colleague as the systems administrator and DBA.
We monitor 2,500 data points from the vehicle’s CAN bus. Because of the per-second sampling, we receive these every second, for every vehicle...
Currently, we have around 500 vehicles with telemetry installed giving us data. Telemetry is now going to be a standard component of the vehicle, meaning we are going to have to deal with the data of every vehicle that rolls off the production line. Our MySQL solution had to deal with around one and a half billion inserts every single day; we have a constant minimum of 4,000 inserts every second. It has recently been publicly announced that new production facilities are being built and new partners are going to be making and selling our vehicles; that means we will have even more data to worry about.
When the project was initially conceived, it was a really simple mandate: we only needed a small number of data points collected for each vehicle, which we would export and provide to grant authorities. A challenge in itself, but not too big of a worry.
The teams of contractors who were tasked with developing the initial proof of concept created a basic web application with a simple database:
* A single table held all of the data
* A single table held all the data descriptors
* These two were joined to match the data, e.g. 100%, with its descriptor, e.g. Battery State of Charge
* A key field was part of the table to link the data back to the vehicle
* The vehicles communicated directly with our servers
Once the proof of concept was delivered and the benefits of the system were seen, a new mandate was given. Why should we just give a small sample away? Why don’t we keep all of the data to monitor the vehicles, monitor the technology, deal with service and warranty issues (after all, we don’t have a dealer on every corner our customers can turn to), and look at vehicle performance data?
Once this mandate came into play, there were still only a small number of vehicles with telemetry installed. The initial team working on the project took a sledgehammer to the database. Where there was initially only one database, they created one for every vehicle. This made it easier to add new hardware, so long as the solution knew which machine a specific database was on, and made the data quicker to query.
This initial application-level sharding meant we could scale out with new databases and new database servers as the number of vehicles went up. It doesn’t help us if the amount of data we collect, or the data retention period, increases; it also still leaves us with slow-running queries.
Problem 0 was the number of inserts into the database. Inserting each data point with its own INSERT query is very slow. We need data to be taken by the server and inserted into the database as quickly as possible so that live, real-time information can be displayed. The solution, an easy win, was to insert the data in large batches. These batches are large in terms of the number of rows, but small in terms of the time they cover: only a second or two. LOAD DATA INFILE is something MySQL can do quickly, so we were able to push data into the solution quickly. However, that’s not enough on its own; we don’t just have inserts to deal with. We need to work with the data too, and deal with a host of other problems.
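As a rough sketch of the batching idea, the snippet below writes one batch of readings to a CSV file and builds the matching LOAD DATA statement. The function names, the column list and the LOCAL keyword are assumptions made for illustration; the talk does not show the real import code.

```php
<?php
// Hypothetical helper: write one batch of readings to a CSV file.
function writeBatchFile(array $rows, string $path): int
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    foreach ($rows as $row) {
        // fputcsv quotes and escapes each field where needed
        fputcsv($handle, $row, ',', '"', '\\');
    }
    fclose($handle);
    return count($rows);
}

// Build the bulk-load statement. The column names are assumed, not the
// real schema.
function buildLoadStatement(string $path, string $table): string
{
    return "LOAD DATA LOCAL INFILE '" . addslashes($path) . "' INTO TABLE `$table` "
         . "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
         . "LINES TERMINATED BY '\\n' "
         . "(vehicle_id, data_point_id, recorded_at, value)";
}
```

One statement then loads a whole second or two of readings, instead of one round trip per data point.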
The initial application architecture involved vehicles communicating directly with our solution.
The problem with this is availability. What happens if our solution goes offline? We need all of that data, because we deliver it to grant authorities. We need it because we don’t want to miss a vehicle fault code. We need it because we want to accurately calculate the vehicle’s energy usage and driving range.
The other problem is capacity; we need to be able to cope with data which fluctuates. We only get data when the vehicle is on, or charging; when the vehicle is on we get data every second, when it charges we get data every minute. We can’t easily plan for new vehicles coming off the production line as some vehicles take a while before they go into active service – customers want to brand the vehicles, they need to train their drivers, then they put them into service.
The other problem is the capacity of our servers. With a large amount of data coming in, and a large number of collection devices giving us this data, we could find ourselves vulnerable to a distributed denial-of-service attack that we ourselves authorised. This would lead to us being unable to process some or all of the data, some data being lost and, potentially, downtime. As more and more vehicles are used more and more regularly, our servers will run the risk of catching fire!
One option when faced with problems like this is, of course, standard cloud-based infrastructure. With the likes of Amazon’s EC2, more machines could be powered on when demand was high, and different availability zones can help in the event of machine downtime or network problems.
However, cloud-based solutions have problems of their own. The first is that virtualised hardware isn’t well suited to large volumes of MySQL inserts and large amounts of I/O. That has changed somewhat recently with cloud-based relational databases, but wasn’t the case at the time. We also had an existing hardware investment as a result of our proof of concept application. Finally, security and legal issues prevented us from storing unencrypted data permanently off-site.
The solution was a message queue.
We integrated a cloud-based message queue into the service, based on the open AMQP standard. Data transfer was encrypted over SSL, and each message was stored in the queue in encrypted form. As the queue was a cloud-based service, it could grow or shrink as our demand changed. This not only allowed us to work around capacity problems but also gave us higher availability, thanks to the availability zones offered by many cloud providers.
But of course, the cloud isn’t perfect. There have been a number of famous cloud-service outages, at a number of cloud hosting providers, which have affected multiple availability zones at the same time. So we needed to take an additional precaution: a small buffer on the vehicle itself.
If our service teams get a call about a vehicle that’s off the road, they need to be able to look and see how the vehicle is operating and performing, in real time. We need to provide a large range of metrics in real time to our users, continually updated to reflect the current state of the vehicle.
Showing data in real time causes a number of headaches:
* Processing the huge number of inserts
* Data and legacy application architecture
* Race conditions
* Accessing the data quickly
* A global view
Imagine viewing a customer’s fleet of 30 vehicles on a map: 60 queries, refreshing every 30 seconds. The second issue was made even more problematic thanks to a management request: a global map.
The initial team made use of a Flash-based charts library. This was both good and bad.
The good:
* Requests for data were asynchronous
* Main page loaded quickly
The bad:
* Each chart was a separate query
* Each chart was a separate request
* Session authentication introduced race conditions
* Sessions in PHP close at the end of the execution cycle
* Unpredictable query times
* Large number of concurrent requests per screen
The result is session locking, which completely locks out a user’s session, as PHP hasn’t closed the session.
session_write_close()
* Added after each write to the $_SESSION array
* Closes the current session
* Requires a call to session_start() immediately before any further reads or writes
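A minimal sketch of that pattern is shown below. The helper name writeSessionValue is hypothetical; session_status(), session_start() and session_write_close() are the standard PHP session functions.

```php
<?php
// Hypothetical helper: open the session, write one value, then release
// the session lock straight away rather than at the end of the request.
function writeSessionValue(string $key, $value): void
{
    if (session_status() !== PHP_SESSION_ACTIVE) {
        session_start();      // acquires the lock on the session data
    }
    $_SESSION[$key] = $value;
    session_write_close();    // saves the data and releases the lock
}
```

With the lock released early, concurrent chart requests from the same user no longer queue behind a slow query in another request.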
We now had a live screen which was stable (it didn’t lock) but was still slow.
We cached the most up-to-date data points for each vehicle in an in-memory key-value store: Memcache! This allows us to quickly get access to live data (which continually changes) without hitting the database.
Although the Lazy Loading Registry works for us, the inclusion of the memcache connection is stretching its flexibility. A better approach is a Dependency Injection Container (DIC): we could instead prepare configuration data and pass it within a DIC to the relevant features.
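The idea of keeping the latest value per vehicle in a key-value store can be sketched as follows. To stay runnable without the Memcache extension, the example uses a small array-backed stand-in; the key scheme and all function names are assumptions made for illustration.

```php
<?php
// Stand-in for a key-value store. In production this role would be played
// by the Memcache extension, whose get() also returns false on a miss.
class ArrayCache
{
    private $items = [];

    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Hypothetical key scheme: one entry per vehicle per data point.
function liveKey(int $vehicleId, string $dataPoint): string
{
    return "live:$vehicleId:$dataPoint";
}

// Called by the import process whenever a new reading arrives.
function cacheLatest($cache, int $vehicleId, string $dataPoint, $value): void
{
    $cache->set(liveKey($vehicleId, $dataPoint), $value);
}

// Called by the live screen; falls back to a loader (e.g. a database
// query) on a cache miss and stores the result for the next request.
function latestValue($cache, int $vehicleId, string $dataPoint, callable $loadFromDb)
{
    $value = $cache->get(liveKey($vehicleId, $dataPoint));
    if ($value === false) {
        $value = $loadFromDb($vehicleId, $dataPoint);
        $cache->set(liveKey($vehicleId, $dataPoint), $value);
    }
    return $value;
}
```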
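To illustrate what a container could look like, here is a deliberately tiny sketch. SimpleContainer, LiveDataService and the service names are hypothetical; this is not the project's code or any particular library's API.

```php
<?php
// A tiny container: services are defined as closures and built once,
// on first use.
class SimpleContainer
{
    private $factories = [];
    private $instances = [];

    public function set(string $name, callable $factory): void
    {
        $this->factories[$name] = $factory;
    }

    public function get(string $name)
    {
        if (!array_key_exists($name, $this->instances)) {
            if (!isset($this->factories[$name])) {
                throw new InvalidArgumentException("Unknown service: $name");
            }
            // The factory receives the container so it can resolve
            // its own dependencies.
            $this->instances[$name] = $this->factories[$name]($this);
        }
        return $this->instances[$name];
    }
}

// Hypothetical feature that needs the cache configuration.
class LiveDataService
{
    public $config;

    public function __construct(array $config)
    {
        $this->config = $config;
    }
}

$container = new SimpleContainer();
$container->set('cache.config', function () {
    return ['host' => '127.0.0.1', 'port' => 11211];
});
$container->set('liveData', function (SimpleContainer $c) {
    return new LiveDataService($c->get('cache.config'));
});
```

The feature receives its configuration through the constructor instead of reaching into a global registry, which keeps the cache connection details in one place.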
* Currently, each piece of “live data” is loaded into a Flash graph or widget, which updates every 30 seconds using an AJAX request
* The move from MySQL to Memcache reduces database load, but the large number of requests still adds strain to the web server
* We are making more use of text and image based representations, all of which can be updated from a single AJAX request
* V1 of the system mixed PHP and HTML
* You can’t re-initialise your session once output has been sent
* All new code uses a template engine, so session interaction has no bearing on output. By the time the template is processed and output, all database and session work has long been completed.
* Race conditions are further exacerbated by the PHP timeout values
* Certain exports, actions and processes take longer than 30 seconds, so the default execution time was set longer
* Initially the project lacked a single entry point, and execution flow was muddled
* A single entry point makes it easier to enforce a lower timeout, which is overridden by intensive controllers or models
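One way a single entry point might enforce a low default timeout while letting intensive controllers raise it is sketched here. The controller names, the 10 second default and the function names are all assumptions; set_time_limit() is the standard PHP function.

```php
<?php
// Pick the time limit for a controller: a low default unless the
// controller is listed as needing longer.
function timeLimitFor(string $controller, array $overrides, int $default = 10): int
{
    return $overrides[$controller] ?? $default;
}

// Hypothetical front controller step, run before routing the request.
function applyTimeLimit(string $controller, array $overrides): int
{
    $limit = timeLimitFor($controller, $overrides);
    set_time_limit($limit);   // restarts the timer for this request only
    return $limit;
}
```

For example, `applyTimeLimit('export', ['export' => 300])` would allow a long-running export five minutes, while every other controller keeps the short default.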
We now had an application which was:
* Stable: sessions were not locking
* Quick in parts: the fleet overview (map) and the vehicle live screen
* Gaining the confidence of users
Unfortunately, the speed was only an appearance.
* Generating performance data
* Backing up and archiving data
* Exporting data for grant authorities (our initial mandate!)
* Viewing historical data
* Viewing analytical data
In order to look at how a vehicle performed on a given day, we needed to analyse a lot of data points. We needed to take several types of data for a single day and perform calculations on that data, to give us detail on efficiency, distance, speeds and driving style. Although this took a little while to load, because we had to deal with lots of data, users were initially willing to wait; they understood that it involved lots of processing. However, soon they wanted to look at more than a day at a time, and more than a vehicle at a time, and they were asking questions such as:
* How far did this customer’s vehicles travel last week?
* How do the efficiencies of vehicles in NY compare to those in OH?
* How far have all of our vehicles ever travelled?
These were questions which we couldn’t answer.
* Introduced regular, automated data processing
* Performance calculations were done overnight, for every vehicle
* Saved as a summary record, one per vehicle per day
* Really, really, really easy and quick to pull out, aggregate, compare and analyse
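A minimal sketch of building one summary record per vehicle per day follows. The reading fields (speed_kmh, distance_km, energy_kwh) and the summary columns are assumptions chosen for illustration, not the real schema.

```php
<?php
// Collapse one vehicle's readings for one day into a single summary row.
// Each reading is assumed to look like:
// ['speed_kmh' => ..., 'distance_km' => ..., 'energy_kwh' => ...]
function summariseDay(int $vehicleId, string $date, array $readings): array
{
    $distance = array_sum(array_column($readings, 'distance_km'));
    $energy   = array_sum(array_column($readings, 'energy_kwh'));
    $speeds   = array_column($readings, 'speed_kmh');

    return [
        'vehicle_id'    => $vehicleId,
        'date'          => $date,
        'distance_km'   => $distance,
        'energy_kwh'    => $energy,
        'max_speed_kmh' => $speeds ? max($speeds) : 0,
        // Efficiency is undefined if the vehicle did not move.
        'kwh_per_km'    => $distance > 0 ? $energy / $distance : null,
    ];
}
```

Fleet-wide questions then become cheap aggregations over a handful of summary rows, for example `array_sum(array_column($summaries, 'distance_km'))`, rather than scans over the raw per-second data.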
Pulling data out was still very slow, especially when analysing data over the course of a day. We decided to shard the database again, this time based on the date: one table per week number. Data stored before we implemented sharding wasn’t stored in a sharded table.
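When a query spans several days, the application needs to know every weekly table involved. A sketch of that lookup is below; it assumes tables are named data_YEAR_WEEK using the ISO-8601 year and week, and it ignores the legacy table for dates before the switch-over. The function name is hypothetical.

```php
<?php
// List the weekly shard tables that cover an inclusive date range.
function shardTablesForRange(string $from, string $to): array
{
    $tables = [];
    $day = new DateTime($from);
    $end = new DateTime($to);
    while ($day <= $end) {
        // 'o' is the ISO-8601 year that the ISO week 'W' belongs to.
        // Using the table name as an array key removes duplicates.
        $tables['data_' . $day->format('o_W')] = true;
        $day->modify('+1 day');
    }
    return array_keys($tables);
}
```

A range from 2012-01-01 to 2012-01-09 touches three tables: the last week of 2011 (1 January 2012 was a Sunday) and the first two weeks of 2012.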
Sharding makes backing up and archiving easier:
* Simply export a sharded-week table and save it
* No long queries which have to find data based on date
* If it’s an archive (not just a backup), DROP the table afterwards
* No need to run a slow DELETE query based on a wide-ranging WHERE clause
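As an illustration, archiving one weekly table could come down to two commands: a dump and a DROP. The helper below only builds the command strings; the function name, database name and archive directory are assumptions, and mysqldump is used as one possible export tool.

```php
<?php
// Build the shell command to export one weekly table, and the SQL
// statement to remove it once the archive has been verified.
function archiveCommands(string $database, string $table, string $archiveDir): array
{
    $file = rtrim($archiveDir, '/') . "/$table.sql";

    return [
        'dump' => 'mysqldump ' . escapeshellarg($database) . ' ' . escapeshellarg($table)
                . ' > ' . escapeshellarg($file),
        'drop' => "DROP TABLE `$table`",
    ];
}
```

Dropping a whole table is a metadata operation, so it avoids the row-by-row work of a DELETE with a date range in its WHERE clause.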
With SQL-based database systems, each data type available for a field uses a set amount of storage space. Integers are a good example: MySQL offers a range of integer types, each able to store a different range of values. The greater the range, the more storage space the field uses, regardless of whether the values actually stored would fit in the next type down. If you know the data in a particular field is always within a specific range, use the smallest data type which supports that range. When you need to store data at scale, an over-eager data type can cost you dearly. Similarly, make sure the data type is optimised for the work you are doing on it. When it comes to ints, floats, doubles and decimals, some are better suited than others to arithmetic work because of the part of the CPU they use.
A few other issues and caveats we faced, which don’t really sit nicely in the timeline I’ve just given.
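To make the sizing point concrete, this sketch picks the smallest MySQL integer type for a known value range. The byte sizes and ranges are MySQL's documented ones; the function name and the example fields are hypothetical.

```php
<?php
// Return the smallest MySQL integer type able to hold [$min, $max].
function smallestIntType(int $min, int $max): array
{
    // [type, bytes, signed min, signed max, unsigned max]
    $types = [
        ['TINYINT',   1, -128,        127,        255],
        ['SMALLINT',  2, -32768,      32767,      65535],
        ['MEDIUMINT', 3, -8388608,    8388607,    16777215],
        ['INT',       4, -2147483648, 2147483647, 4294967295],
    ];
    foreach ($types as [$name, $bytes, $signedMin, $signedMax, $unsignedMax]) {
        if ($min >= 0 && $max <= $unsignedMax) {
            return ['type' => "$name UNSIGNED", 'bytes' => $bytes];
        }
        if ($min >= $signedMin && $max <= $signedMax) {
            return ['type' => $name, 'bytes' => $bytes];
        }
    }
    return ['type' => 'BIGINT', 'bytes' => 8];
}
```

A state of charge between 0 and 100 fits in a 1-byte TINYINT UNSIGNED rather than a 4-byte INT. At one and a half billion rows a day, those 3 saved bytes per row add up to roughly 4.5 GB a day, before indexes.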
* Our telemetry unit broadcasts each data point once per second
* Data doesn’t change every second, e.g. battery state of charge may take several minutes to lose a percentage point, and fault flags only change to 1 when there is a fault
* Make an assumption: we compare the data to the last known value; if it’s the same we don’t insert, and instead assume the value was unchanged
* Unfortunately, this requires us to put additional checks and balances in place
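The compare-with-last-known-value step might look something like the sketch below. The reading structure and the function name are assumptions; the last known values are held in a plain array here, although any shared store would do.

```php
<?php
// Keep only the readings whose value differs from the last known value
// for the same vehicle and data point. $lastKnown is updated in place.
function filterChangedReadings(array $readings, array &$lastKnown): array
{
    $changed = [];
    foreach ($readings as $reading) {
        $key = $reading['vehicle_id'] . ':' . $reading['data_point'];
        if (!array_key_exists($key, $lastKnown) || $lastKnown[$key] !== $reading['value']) {
            $changed[] = $reading;                  // new or changed: insert it
            $lastKnown[$key] = $reading['value'];
        }
        // Otherwise the value is assumed unchanged and nothing is inserted.
    }
    return $changed;
}
```

The cost of this saving is that a missing row no longer distinguishes "unchanged" from "no data received", which is one reason extra checks are needed.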