A real-world case study rather than an academic one. 2-3 minutes of background about myself.
History will be a few slides of overview about why, when, and how we started to move into the cloud. Power of SOA discusses the flexibility and freedom of moving away from a single point of data access. Complications range from data ownership to data freshness to scalability. Example application will cover our Skimmer and Best Sellers applications and their move.
We first started working with services by exposing our structured data to external developers for our TimesOpen event in 2009. Very little of that was being used internally. When the community platform moved to the cloud, some services were created so internal pages could talk to the cloud servers. We had some very structured venue data for our travel section. In the previous version, if an editor updated a venue, it took up to an hour for the change to make it to the SQLite DB and onto the page. A change was published to XML. The XML was picked up by the loader once an hour, and the loader then updated the SQLite DB. We created services that the front end could call in real time, and that let the editor update the MySQL DB in real time via the same services.
The definition is not as important as the culture. It's the concept that data is flexible and self-contained. It doesn't rely on specific servers, environments, or configurations. A lot of our applications were being built around how we access the data we use and what was available on which system. Instead, we should be thinking about what we want the application to actually do, and have the data component be an accent to that, not a hindrance. SOA and cloud are related, but neither requires the other. Ability to output results in multiple data formats. In our case, offering Skimmer feeds in JSON greatly sped up the app because there was no need to parse RSS or XML.
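The multiple-format point can be sketched roughly as below. This is an illustration only, not our actual service code: the `render` function, the record fields, and the example URLs are all made up for the example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical article records; the field names are illustrative only.
ARTICLES = [
    {"title": "First story", "url": "http://example.com/1"},
    {"title": "Second story", "url": "http://example.com/2"},
]


def render(articles, fmt="json"):
    """Serialize the same records in whichever format the caller asks for."""
    if fmt == "json":
        return json.dumps({"results": articles})
    if fmt == "xml":
        root = ET.Element("results")
        for article in articles:
            item = ET.SubElement(root, "item")
            for key, value in article.items():
                ET.SubElement(item, key).text = value
        return ET.tostring(root, encoding="unicode")
    raise ValueError("unsupported format: " + fmt)


# A JavaScript front end can consume the JSON body directly,
# with no RSS or XML parsing step in between.
json_body = render(ARTICLES, "json")
xml_body = render(ARTICLES, "xml")
```

The data is the same in both cases; only the last serialization step differs, so adding a format does not touch the application logic.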
SOA means that you can get the data sliced up many different ways. It is not rigid, but allows a lot of flexibility. Because of this, the same services can be used by many different applications. One of the items that gets lost a lot when people talk about SOA is that it has to be a two-way street. There can be restrictions, but the ability to update the data via the same services is key to keeping these services truly reusable.
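A minimal sketch of the slicing and two-way ideas, assuming a hypothetical `VenueService` class backed by an in-memory dictionary. A real service would sit over a database and be called over HTTP; the venue names and fields are invented for illustration.

```python
class VenueService:
    """In-memory stand-in for a data service (hypothetical, for illustration)."""

    def __init__(self, venues):
        self._venues = {v["id"]: dict(v) for v in venues}

    def query(self, sort_by=None, limit=None, **filters):
        """Read path: filter on exact field matches, then optionally sort and limit."""
        rows = [
            v for v in self._venues.values()
            if all(v.get(key) == value for key, value in filters.items())
        ]
        if sort_by is not None:
            rows.sort(key=lambda v: v[sort_by])
        # Return copies so callers cannot change stored data behind the service's back.
        return [dict(v) for v in rows[:limit]]

    def update(self, venue_id, **changes):
        """Write path: edits go through the same service the readers use."""
        self._venues[venue_id].update(changes)
        return dict(self._venues[venue_id])


service = VenueService([
    {"id": 1, "name": "Cafe A", "city": "Paris"},
    {"id": 2, "name": "Bar B", "city": "Rome"},
    {"id": 3, "name": "Bistro C", "city": "Paris"},
])

paris = service.query(city="Paris", sort_by="name")
service.update(2, city="Paris")
paris_after_edit = service.query(city="Paris")
```

Because the edit goes through the service, the next read sees it immediately, with no hourly load step in between.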
As you can see from the previous slide, using RSS feeds to build applications isn't SOA. But if it starts to allow the applications to not think about where they get their data from, then it's a great start. RSS is very rigid: it doesn't offer the ability to sort or filter, and it doesn't allow two-way data transfer. On our first front-end application, TimesSkimmer, we used RSS to allow a mainly JavaScript application to free itself from a shared mount point for data. But the new incarnation of Skimmer shows the power of having services versus just RSS.
One of the biggest hurdles we ran into was determining data ownership. The core data lived in a database, but was presented and updated via services. So was the group in charge of the interface responsible for cleansing input, or was the service owner? If new fields were added to the data set, how could we decide whether that would break the applications currently using it? Versioning allowed us to solve some of this, but then left older applications orphaned. Say application1 is using version2 and new application2 wants 3 new fields. We would create version3 of the services. But then if application1 wants a new field, what version would get it? If we ask the applications using the services to be tolerant of new functions and fields, then as long as the ones they need stay consistent, updating a current version won't break other applications. When new formatting is truly required, then change the version and ask the applications using the services to upgrade. It's a similar issue to OS updates. Adding keys for each application using the services is useful, but it isn't pure security.
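The "tolerant of new fields" idea could look something like this on the application side. It is a sketch under assumptions: `read_entry`, `REQUIRED_FIELDS`, and the response fields are hypothetical names, not the real Best Sellers schema.

```python
# Hypothetical fields this particular application depends on.
REQUIRED_FIELDS = ("title", "rank")


def read_entry(payload):
    """Pick out only the fields this application needs and ignore the rest."""
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise KeyError("service response is missing: " + ", ".join(missing))
    return {field: payload[field] for field in REQUIRED_FIELDS}


# The same service version before and after another application
# asked for extra fields (field names are made up).
old_response = {"title": "Some Book", "rank": 1}
new_response = {"title": "Some Book", "rank": 1, "format": "e-book", "weeks_on_list": 4}

old_entry = read_entry(old_response)
new_entry = read_entry(new_response)
```

The application gets the same result from both responses, so the extra fields can be added to the current version without forcing a new one. Only removing or renaming a required field would call for a version change.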
After we got past the core data side of things, we realized that there were other complications. Other related items that weren't data also lived on the file system. Some shared navigation modules were reading directly off the shared file system. Information for related content, like inlines, was also being pulled from the file system. One of the key ones was that the way we dealt with ad calls was purely local. So we had to create a service layer to abstract this, but it had to be as responsive as a local call.
The service had existed for a long time, so much of the heavy lifting was done. The addition of E-Books meant there needed to be the ability to add new lists to the data set. Static articles meant that users couldn't easily navigate between weeks.
Right now, the applications poll the services for every request, or on some interval, even if there haven't been any changes. We should change it so that the services push out changes when they happen. The services are fairly flexible, but still require a good deal of maintenance when new data is needed. We are trying to abstract that out to be more tolerant.
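The push idea, as a rough in-process sketch. The `ChangeFeed` class and the "best-sellers" key are hypothetical, and a real implementation would push over the network rather than call a Python function; the point is only that subscribers hear about changes when they happen instead of asking on every request.

```python
class ChangeFeed:
    """Hypothetical in-process publisher standing in for a service that pushes."""

    def __init__(self):
        self._subscribers = []
        self._data = {}

    def subscribe(self, callback):
        """Register a function to call with (key, value) on each change."""
        self._subscribers.append(callback)

    def update(self, key, value):
        """Store the value and notify subscribers only if it actually changed."""
        if key in self._data and self._data[key] == value:
            return False
        self._data[key] = value
        for callback in self._subscribers:
            callback(key, value)
        return True


received = []
feed = ChangeFeed()
feed.subscribe(lambda key, value: received.append((key, value)))

feed.update("best-sellers", "week 1")
feed.update("best-sellers", "week 1")  # no change, so nothing is pushed
feed.update("best-sellers", "week 2")
```

With polling, the application would have made a request for each check whether or not anything changed; here the unchanged update costs the subscribers nothing.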
We've open sourced many of the frameworks that allowed us to make this transition.