2. About Me
• a.k.a. ihower
• http://ihower.tw
• http://twitter.com/ihower
• Rails Developer since 2006
• The Organizer of Ruby Taiwan Community
• http://ruby.tw
• http://rubyconf.tw
3. Agenda
• What’s SOA
• Why SOA
• Considerations
• The tool set overview
• Service-side implementation
• Client-side implementation
• Library packaging
• Caching
4. What’s SOA
Service-oriented architecture
• The “monolithic” approach is not enough.
• SOA is a way to design complex applications by splitting major components out into individual services that communicate via APIs.
• A service is a vertical slice of functionality: database, application code, and caching layer.
5. A monolithic web app example
[Diagram: request → Load Balancer → WebApps → Database]
6. An SOA example
[Diagram components: requests, Load Balancer, WebApp for Administration, WebApps for User, Services A, Services B, and a separate Database under each service]
8. Shared Resources
• Different front-end websites use the same resources.
• SOA helps you avoid duplicating databases and code.
• Why not just share the database?
• code is not DRY
• caching will be problematic
[Diagram: WebApp for Administration and WebApps for User sharing one Database]
9. Encapsulation
• You can change the underlying implementation of a service without affecting other parts of the system
• upgrade library
• upgrade to Ruby 1.9
• upgrade to Rails 3
• you can provide API versioning
10. Scalability 1: Partitioned Data Provides
• The database is the first bottleneck; a single DB server cannot scale. SOA helps you reduce database load.
• Anti-pattern: only splitting the database
• model relationships are broken
• referential integrity is lost
• code complexity increases
[Diagram: WebApps connected to Database A and Database B]
• Myth: database replication is the answer. It cannot help you with speed and consistency.
11. Scalability 2: Caching
• SOA makes it easier to design a caching system
• Cache data in the right place and expire it at the right time
• Cache the logical model, not the physical one
• You do not need to cache views everywhere
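To make "expire it at the right time" concrete, here is a minimal sketch of a time-based cache a service client could keep for logical models. The class name TtlCache, its parameters, and the injectable clock are all assumptions made for illustration; a real deployment would more likely sit in front of a shared cache server.

```ruby
# A tiny in-memory cache with a time-to-live (TTL) per entry.
# TtlCache is a hypothetical name; nothing here comes from a library.
class TtlCache
  # ttl:   seconds an entry stays valid
  # clock: callable returning the current time in seconds (injectable for tests)
  def initialize(ttl, clock = -> { Time.now.to_f })
    @ttl = ttl
    @clock = clock
    @store = {}
  end

  # Return the cached value for key, or compute it with the block,
  # store it with an expiry time, and return it.
  def fetch(key)
    entry = @store[key]
    return entry[:value] if entry && @clock.call < entry[:expires_at]

    value = yield
    @store[key] = { :value => value, :expires_at => @clock.call + @ttl }
    value
  end
end
```

The key would typically identify a logical model (for example "person/1"), so one cached entry can serve every page that shows that record.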
12. Scalability 3: Efficient
• Different components carry different loads; SOA lets you scale each service independently.
[Diagram: WebApps → Load Balancers → two instances of Services A and four instances of Services B]
13. Security
• Different services can sit behind different firewalls
• You can expose only the public web apps and services; everything else stays inside the firewall.
14. Interoperability
• HTTP is the most common interface; SOA helps you integrate:
• Multiple languages
• Internal systems, e.g. a full-text search engine
• Legacy databases and systems
• External vendors
15. Reuse
• Reuse across multiple applications
• Reuse for public APIs
• Example: Amazon Web Services (AWS)
17. Reduce Local Complexity
• Team modularity along the same module
splits as your software
• Understandability: The amount of code is
minimized to a quantity understandable by
a small team
• Source code control
19. How to Partition into Separate Services
• Partitioning on Logical Function
• Partitioning on Read/Write Frequencies
• Partitioning on Minimizing Joins
• Partitioning on Iteration Speed
20. On Iteration Speed
• Which parts of the app have clearly defined requirements and design?
• Identify the parts of the application that are unlikely to change.
• For example: the first version uses MySQL for data storage, but it may move to NoSQL in the future without affecting the front-end app.
21. On Logical Function
• Higher-level feature services
• articles, photos, bookmarks, etc.
• Low-level infrastructure services
• a shared key-value store, a queue system
22. On Read/Write Frequencies
• Ideally, a service will have to work only with
a single data store
• High read and low write: the service should
optimize a caching strategy.
• High write and low read: don’t bother with
caching
23. On Join Frequency
• Minimize cross-service joins.
• But almost all data in an app is joined to
something else.
• How often does a particular join occur? Decide by read/write frequency and logical separation.
• Replicate data across services
(For example: an activity stream built with messaging)
24. API Design Guidelines
• Send everything you need
• Unlike OOP, which has lots of fine-grained method calls
• Parallel HTTP requests
• for multiple service requests
• Send as little as possible
• Avoid expensive XML
25. Versioning
• Be able to run multiple versions in parallel: clients get time to upgrade, rather than client and server having to upgrade in lockstep.
• Ideally, you won’t have to run multiple versions for very long
• Two solutions:
• Including a version in URIs
• Using Accept headers for versioning
(disadvantage: HTTP caching)
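The two versioning options can be sketched as one small lookup. This is only an illustration: the /apis/v1/ path layout matches the URIs used later in the deck, while the vendor media type application/vnd.example.vN+json, the helper name api_version, and the default of version 1 are assumptions.

```ruby
# Hypothetical helper: work out which API version a request is asking for.
def api_version(path, accept_header = nil)
  # Option 1: version embedded in the URI, e.g. /apis/v1/people.json
  if (m = path.match(%r{\A/apis/v(\d+)/}))
    return m[1].to_i
  end

  # Option 2: version carried in the Accept header (assumed media type)
  if (m = accept_header.to_s.match(/vnd\.example\.v(\d+)\+json/))
    return m[1].to_i
  end

  1 # assumed default when the client does not specify a version
end
```

With the header approach, two versions share one URL, so an HTTP cache has to vary on the Accept header, which is the caching disadvantage noted above.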
26. Physical Models & Logical Models
• Physical models are mapped to database tables through an ORM. (They are in 3NF.)
• Logical models are mapped to your business problem. (The external API uses them.)
• Logical models are mapped to physical models by you.
27. Logical Models
• Not relational or normalized
• Maintainability
• can change with no change to data store
• can stay the same while the data store
changes
• Better fit for REST interfaces
• Better caching
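As a rough sketch of the physical-to-logical mapping, the snippet below folds two normalized row sets into one denormalized record that an API could return. The table shapes (users, emails) and the helper name logical_user are invented for illustration.

```ruby
# Physical data: two normalized "tables", as an ORM might load them.
USERS  = [{ "id" => 1, "name" => "ihower" }].freeze
EMAILS = [
  { "user_id" => 1, "address" => "a@example.com" },
  { "user_id" => 1, "address" => "b@example.com" }
].freeze

# Logical model: one denormalized hash per user, shaped for the API client.
def logical_user(user, emails)
  {
    "id"     => user["id"],
    "name"   => user["name"],
    "emails" => emails.select { |e| e["user_id"] == user["id"] }
                      .map { |e| e["address"] }
  }
end
```

The client only ever sees the hash returned by logical_user, so the tables behind it can be split or merged without changing the API.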
29. RESTful Web services
• Rails way
• Easy to use and implement
• REST is about resources
• URI
• HTTP Verbs: GET/PUT/POST/DELETE
• Representations: HTML, XML, JSON...etc
30. The tool set
• Web framework
• XML Parser
• JSON Parser
• HTTP Client
• Model library
31. Web framework
• Ruby on Rails, but we don’t need all of its features.
(Rails 3 can be customized because it’s a lot more modular. We will discuss it later.)
• Sinatra: a lightweight framework
• Rack: a minimal Ruby web server interface library
32. ActiveResource
• Mapping RESTful resources as models in a
Rails application.
• Use XML by default
• But it is not useful in practice. Why?
33. XML parser
• http://nokogiri.org/
• Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser. Among Nokogiri’s many features is the ability to search documents via XPath or CSS3 selectors.
35. HTTP Client
• How to run requests in parallel?
• Asynchronous I/O
• Reactor pattern (EventMachine)
• Multi-threading
• JRuby
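Of the options above, multi-threading is the easiest to sketch with only the standard library. In this illustration fetch is a stand-in that sleeps instead of doing real HTTP I/O, and the URLs are placeholders; it is not how Typhoeus works internally.

```ruby
# Stand-in for a slow HTTP call: sleeps, then returns a fake body.
def fetch(url)
  sleep 0.1 # simulate network latency
  "body of #{url}"
end

# Start one thread per URL, then collect results in the original order.
# Thread#value joins the thread and returns the block's result.
def fetch_all(urls)
  urls.map { |url| Thread.new { fetch(url) } }.map(&:value)
end
```

Because the threads spend their time waiting rather than computing, the total wall time is close to that of the slowest single request instead of the sum of all of them.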
36. Typhoeus
http://github.com/pauldix/typhoeus/
• A Ruby library with native C extensions to
libcurl and libcurl-multi.
• Typhoeus runs HTTP requests in parallel
while cleanly encapsulating handling logic
37. Typhoeus: Quick example
response = Typhoeus::Request.get("http://www.pauldix.net")
response = Typhoeus::Request.head("http://www.pauldix.net")
response = Typhoeus::Request.put("http://localhost:3000/posts/1",
                                 :body => "whoo, a body")
response = Typhoeus::Request.post("http://localhost:3000/posts",
                                  :params => {:title => "test post", :content => "this is my test"})
response = Typhoeus::Request.delete("http://localhost:3000/posts/1")
38. Hydra handles requests, but they are not guaranteed to run in any particular order
HYDRA = Typhoeus::Hydra.new
a = nil
request1 = Typhoeus::Request.new("http://example1")
request1.on_complete do |response|
  a = response.body
end
HYDRA.queue(request1)
b = nil
request2 = Typhoeus::Request.new("http://example1")
request2.on_complete do |response|
  b = response.body
end
HYDRA.queue(request2)
HYDRA.run # a, b are set from here
39. An asynchronous method
def foo_asynchronously
  request = Typhoeus::Request.new( "http://example" )
  request.on_complete do |response|
    result_value = ActiveSupport::JSON.decode(response.body)
    # do something
    yield result_value
  end
  self.hydra.queue(request)
end
40. Usage
result1 = nil
foo_asynchronously do |i|
  result1 = i
end
result2 = nil
foo_asynchronously do |i|
  # Do something for i
  result2 = i
end
HYDRA.run
# Now you can use result1 and result2
41. A synchronous method
def foo
  result = nil
  foo_asynchronously { |i| result = i }
  self.hydra.run
  result
end
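The queue-then-run pattern can be tried without any HTTP at all. The sketch below uses a toy FakeHydra class, which is a hypothetical stand-in and not part of Typhoeus, to show why the synchronous wrapper only has a result after run is called.

```ruby
# A toy job queue: queued blocks run only when `run` is called.
# FakeHydra is a hypothetical stand-in, NOT the Typhoeus API.
class FakeHydra
  def initialize
    @jobs = []
  end

  def queue(&job)
    @jobs << job
  end

  def run
    @jobs.shift.call until @jobs.empty?
  end
end

FAKE_HYDRA = FakeHydra.new

# Asynchronous style: the block is called later, during FAKE_HYDRA.run.
def double_asynchronously(n, &callback)
  FAKE_HYDRA.queue { callback.call(n * 2) }
end

# Synchronous style: queue, run immediately, return the captured value.
def double(n)
  result = nil
  double_asynchronously(n) { |value| result = value }
  FAKE_HYDRA.run
  result
end
```

Callers that need several values can queue all of them with the asynchronous form and call run once; callers that need a single value use the synchronous wrapper.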
45. A basic model
class YourModel
  extend ActiveModel::Naming
  include ActiveModel::Conversion
  include ActiveModel::Validations
  def persisted?
    false
  end
end
46. Without validations
class YourModel
  extend ActiveModel::Naming
  include ActiveModel::Conversion
  def persisted?
    false
  end
  def valid?() true end
  def errors
    @errors ||= ActiveModel::Errors.new(self)
  end
end
48. Serializers
class Person
  include ActiveModel::Serializers::JSON
  include ActiveModel::Serializers::Xml
  attr_accessor :name
  def attributes
    @attributes ||= {'name' => nil}
  end
end
person = Person.new
person.serializable_hash # => {"name"=>nil}
person.as_json # => {"name"=>nil}
person.to_json # => "{\"name\":null}"
person.to_xml # => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<serial-person...
49. Mass Assignment
class YourModel
  # ...
  def initialize(attributes = {})
    if attributes.present?
      attributes.each { |k, v| send("#{k}=", v) if respond_to?("#{k}=") }
    end
  end
end
YourModel.new( :a => 1, :b => 2, :c => 3 )
50. MassAssignmentSecurity
class YourModel
  # ...
  include ActiveModel::MassAssignmentSecurity
  attr_accessible :first_name, :last_name
  def initialize(attributes = {})
    if attributes.present?
      sanitize_for_mass_assignment(attributes).each { |k, v| send("#{k}=", v) if respond_to?("#{k}=") }
    end
  end
end
51. Scenario we want to implement
• A Users web service, which provides basic CRUD functions.
• A web application that uses the Users client library
53. Customized Rails3
• We don’t need some components.
• We can customize ActionController
• Building a fast, lightweight REST service
with Rails 3
http://pivotallabs.com/users/jdean/blog/articles/1419-building-a-fast-lightweight-rest-service-with-rails-3
54. # config/application.rb
%w(
  active_record
  action_controller
  action_mailer
).each do |framework|
  begin
    require "#{framework}/railtie"
  rescue LoadError
  end
end
55. # config/application.rb
[
  Rack::Sendfile,
  ActionDispatch::Flash,
  ActionDispatch::Session::CookieStore,
  ActionDispatch::Cookies,
  ActionDispatch::BestStandardsSupport,
  Rack::MethodOverride,
  ActionDispatch::ShowExceptions,
  ActionDispatch::Static,
  ActionDispatch::RemoteIp,
  ActionDispatch::ParamsParser,
  Rack::Lock,
  ActionDispatch::Head
].each do |klass|
  config.middleware.delete klass
end
# config/environments/production.rb
config.middleware.delete ActiveRecord::ConnectionAdapters::ConnectionManagement
56. # /app/controllers/application_controller.rb
# Before: class ApplicationController < ActionController::Base
class ApplicationController < ActionController::Metal
  include AbstractController::Logger
  include Rails.application.routes.url_helpers
  include ActionController::UrlFor
  include ActionController::Rendering
  include ActionController::Renderers::All
  include ActionController::MimeResponds
  if Rails.env.test?
    include ActionController::Testing
    # Rails 2.x compatibility
    include ActionController::Compatibility
  end
end
http://ihower.tw/blog/archives/4561
57. APIs design best practices (1)
• Routing doesn't need the Rails resources mechanism, but the API design should still be RESTful.
(The service has no views and doesn't need URL helpers, so the resources mechanism is overkill.)
• RESTful APIs are stateless, so each API call should be independent. Requests that depend on one another should be combined into a single API request. (atomic)
58. APIs design best practices (2)
• The best format in most cases is JSON.
(One disadvantage is that we can’t return binary data directly.)
• Use Yajl as the parser.
# config/application.rb
ActiveSupport::JSON.backend = "Yajl"
• Don't convert data to JSON in the model; the conversion to JSON belongs in the controller.
59. APIs design best practices (3)
• I suggest not using include_root_in_json
# config/application.rb
ActiveRecord::Base.include_root_in_json = false
• Note that keys in JSON must be strings. Whether you use symbols or strings in Ruby, they all become strings after JSON encoding.
• Keys for related records should be named xxx_id or xxx_ids, for example:
{ "user_id" => 4, "product_ids" => [1,2,5] }.to_json
• Return a user_uri field in addition to the user_id field if needed
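The point about string keys is easy to check with the standard json library. This small sketch uses a hypothetical round_trip helper to show what a client sees after the data has crossed the wire.

```ruby
require "json"

# Encode a Ruby hash to JSON and decode it again, as an API client would.
def round_trip(hash)
  JSON.parse(JSON.generate(hash))
end

# One symbol key and one string key going in...
mixed = { :user_id => 4, "product_ids" => [1, 2, 5] }

# ...and only string keys coming out.
puts round_trip(mixed).keys.inspect # prints ["user_id", "product_ids"]
```

Client code should therefore read decoded data with string keys, or wrap it in something like HashWithIndifferentAccess as the LogicalModel example later in the deck does.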
60. A return data example
model.to_json and model.to_xml are easy to use, but not useful in practice.
# one record
{ :name => "a" }.to_json
# collection
{ :collection => [ { :name => "a" }, { :name => "b" } ], :total => 123 }.to_json
If you want pagination, you need the total number of records.
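A short sketch of why the total matters: the service returns one page of records plus the overall count, and the client derives the number of pages from it. The helper names collection_envelope and total_pages are hypothetical.

```ruby
require "json"

# Service side: wrap one page of records together with the overall total.
def collection_envelope(records, total)
  JSON.generate(:collection => records, :total => total)
end

# Client side: number of pages needed to show `total` records.
def total_pages(total, per_page)
  (total.to_f / per_page).ceil
end
```

With the total of 123 from the example above and 20 records per page, the client would render 7 page links.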
61. APIs design best practices (4)
• Besides returning a collection, we can also provide a multi-get API through an ids parameter, e.g. /users?ids=2,5,11,23
• The client should sort the IDs first, which makes the cache mechanism much easier to design.
• Another concern is the URL length limit of GET, so this API can also accept POST.
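A minimal sketch of the client side of a multi-get call, assuming a length threshold of 2000 characters (an arbitrary value picked for illustration) and a hypothetical helper name multi_get_request. Sorting the IDs means the same set of records always produces the same URL, which can then double as a cache key.

```ruby
# Assumed limit; real limits depend on the servers and proxies involved.
MAX_GET_LENGTH = 2000

# Returns [http_method, url, body] for fetching several records at once.
def multi_get_request(base_url, ids)
  sorted = ids.map(&:to_i).uniq.sort # stable order => stable cache key
  query  = "ids=#{sorted.join(',')}"
  url    = "#{base_url}?#{query}"

  if url.length <= MAX_GET_LENGTH
    [:get, url, nil]
  else
    [:post, base_url, query] # too long for a URL, send the ids in the body
  end
end
```

For example, the IDs 23, 5, 2, 11 and 11, 2, 23, 5 both map to the URL ending in ?ids=2,5,11,23.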
63. APIs design best practices (5)
• error_codes and errors are optional; define them if you need them.
• errors holds the model's validation errors: model.errors.to_json
64. HTTP status code
We should return a suitable HTTP status code
• 200 OK
• 201 Created (added successfully)
• 202 Accepted (received successfully but not processed yet; now in a queue)
• 400 Bad Request (e.g. model validation error or wrong parameters)
• 401 Unauthorized
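One way to keep status codes consistent across a service is a single lookup that every action goes through. The sketch below builds a Rack-style response triple; the outcome symbols and the helper name api_response are assumptions made for illustration.

```ruby
require "json"

# Hypothetical outcome names mapped to the status codes listed above.
STATUS_FOR = {
  :ok           => 200,
  :created      => 201,
  :queued       => 202,
  :invalid      => 400,
  :unauthorized => 401
}.freeze

# Build a Rack-style [status, headers, body] triple with a JSON body.
# Raises KeyError for an outcome that has no agreed status code.
def api_response(outcome, payload = {})
  status = STATUS_FOR.fetch(outcome)
  [status, { "Content-Type" => "application/json" }, [JSON.generate(payload)]]
end
```

A create action would return api_response(:created, "id" => 7) on success and api_response(:invalid, "errors" => ...) when validation fails.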
67. Note
• No active_record; we get data from the service through an HTTP client (Typhoeus)
• The model can include some ActiveModel modules so we can develop more efficiently.
• This model is a logical model, mapped to the data from the API, not to a database table. It is different from the service's physical model (ORM-based)
68. # config/application.rb
%w(
  action_controller
  action_mailer
).each do |framework|
  begin
    require "#{framework}/railtie"
  rescue LoadError
  end
end
69. Set up a global Hydra
# config/initializers/setup_hydra.rb
HYDRA = Typhoeus::Hydra.new
70. An example you can inherit from (1)
class LogicalModel
  extend ActiveModel::Naming
  include ActiveModel::Conversion
  include ActiveModel::Serializers::JSON
  include ActiveModel::Validations
  include ActiveModel::MassAssignmentSecurity
  self.include_root_in_json = false
  # continued...
end
71. An example you can inherit from (2)
class LogicalModel
  # continued...
  def self.attribute_keys=(keys)
    @attribute_keys = keys
    attr_accessor *keys
  end
  def self.attribute_keys
    @attribute_keys
  end
  class << self
    attr_accessor :host, :hydra
  end
  def persisted?
    !!self.id
  end
  def initialize(attributes={})
    self.attributes = attributes
  end
  def attributes
    self.class.attribute_keys.inject(ActiveSupport::HashWithIndifferentAccess.new) do |result, key|
      result[key] = read_attribute_for_validation(key)
      result
    end
  end
  def attributes=(attrs)
    sanitize_for_mass_assignment(attrs).each { |k, v| send("#{k}=", v) if respond_to?("#{k}=") }
  end
end
72. Model usage example
class Person < LogicalModel
  self.attribute_keys = [:id, :name, :bio, :user_id, :created_at, :updated_at]
  self.host = PEOPLE_SERVICE_HOST
  self.hydra = HYDRA
  # validate only attributes declared in attribute_keys
  validates_presence_of :name, :user_id
  # ...
end
73. paginate
class Person < LogicalModel
  # ...
  def self.people_uri
    "http://#{self.host}/apis/v1/people.json"
  end
  def self.async_paginate(options={})
    options[:page] ||= 1
    options[:per_page] ||= 20
    request = Typhoeus::Request.new(people_uri, :params => options)
    request.on_complete do |response|
      if response.code >= 200 && response.code < 400
        log_ok(response)
        result_set = self.from_json(response.body)
        collection = result_set[:collection].paginate( :total_entries => result_set[:total] )
        collection.current_page = options[:page]
        yield collection
      else
        log_failed(response)
      end
    end
    self.hydra.queue(request)
  end
  def self.paginate(options={})
    result = nil
    async_paginate(options) { |i| result = i }
    self.hydra.run
    result
  end
end
74. will_paginate hack!
• To use will_paginate's helpers, we must set current_page manually, so we hack it this way:
# /config/initializers/hack_will_paginate.rb
# Our search result from the HTTP API is an array that needs to be paginated,
# so we need to assign current_page; otherwise it will always be 1.
module WillPaginate
  class Collection
    def current_page=(s)
      @current_page = s.to_i
    end
  end
end
75. from_json & logging
class LogicalModel
  # ...
  def self.from_json(json_string)
    parsed = ActiveSupport::JSON.decode(json_string)
    collection = parsed["collection"].map { |i| self.new(i) }
    return { :collection => collection, :total => parsed["total"].to_i }
  end
  def self.log_ok(response)
    Rails.logger.info("#{response.code} #{response.request.url} in #{response.time}s")
  end
  def self.log_failed(response)
    msg = "#{response.code} #{response.request.url} in #{response.time}s FAILED: #{ActiveSupport::JSON.decode(response.body)["message"]}"
    Rails.logger.warn(msg)
  end
  def log_ok(response)
    self.class.log_ok(response)
  end
  def log_failed(response)
    self.class.log_failed(response)
  end
end
76. find
class Person < LogicalModel
  # ...
  def self.person_uri(id)
    "http://#{self.host}/apis/v1/people/#{id}.json"
  end
  def self.async_find(id)
    request = Typhoeus::Request.new( person_uri(id) )
    request.on_complete do |response|
      if response.code >= 200 && response.code < 400
        log_ok(response)
        # This from_json is defined by ActiveModel::Serializers::JSON
        yield self.new.from_json(response.body)
      else
        log_failed(response)
      end
    end
    self.hydra.queue(request)
  end
  def self.find(id)
    result = nil
    async_find(id) { |i| result = i }
    self.hydra.run
    result
  end
end
77. create & update
class Person < LogicalModel
  # ...
  def create
    return false unless valid?
    response = Typhoeus::Request.post( self.class.people_uri, :params => self.attributes )
    if response.code == 201
      log_ok(response)
      self.id = ActiveSupport::JSON.decode(response.body)["id"]
      return self
    else
      log_failed(response)
      return nil
    end
  end
  def update(attributes)
    self.attributes = attributes
    return false unless valid?
    response = Typhoeus::Request.put( self.class.person_uri(id), :params => self.attributes )
    if response.code == 200
      log_ok(response)
      return self
    else
      log_failed(response)
      return nil
    end
  end
end
• Normally data writes do not need to occur in parallel
• Or write to a messaging system asynchronously
78. delete & destroy
class Person < LogicalModel
  # ...
  def self.delete(id)
    response = Typhoeus::Request.delete( self.person_uri(id) )
    if response.code == 200
      log_ok(response)
      return self
    else
      log_failed(response)
      return nil
    end
  end
  def destroy
    self.class.delete(self.id)
  end
end
79. Service client library packaging
• Write users.gemspec file
• gem build users.gemspec
• distribution
• http://rubygems.org
• build your local gem server
• http://ihower.tw/blog/archives/4496
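For reference, a users.gemspec for such a client library might look roughly like the sketch below. Every value in it (version, summary, author, file layout, dependency list) is a placeholder chosen for illustration, not taken from the original project.

```ruby
# users.gemspec (all values are placeholders)
Gem::Specification.new do |s|
  s.name          = "users"
  s.version       = "0.1.0"
  s.summary       = "Client library for the Users web service"
  s.authors       = ["Your Name"]
  s.files         = Dir["lib/**/*.rb"]
  s.require_paths = ["lib"]

  # The client models in this deck rely on these two libraries.
  s.add_dependency "typhoeus"
  s.add_dependency "activemodel"
end
```

Running gem build users.gemspec then produces a .gem file (users-0.1.0.gem for the version above) that can be pushed to a gem server.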