This presentation is an overview of API design and management solutions suitable for Cloud Native environments. Its main focus lies on synchronous API design and microservices.
2. Overview
● Introduction
○ Problem statements
○ Multi-dimensional problem
● RPC based solutions
○ gRPC / Thrift / Avro
● REST based solutions
○ Contracting terminology
○ Swagger / RAML / OData / Spring Cloud Contract / PACT
● Historical overview
● License overview
● Trends overview (Google Trends and StackOverflow)
● Conclusion
3. Introduction - problem statements
We have a microservice that passes all unit tests, but integration in a real
production environment is painful or fails.
We have a front-end development team which has to wait 2 weeks because the
back-end is lagging behind.
We have an external customer that wants to use our OpenAPI compliant services
in the spirit of PSD2.
How can we safely evolve APIs while keeping backward/forward compatibility in mind?
4. Introduction - Multi-dimensional problem
● Multiple API interactions
○ Traditional front-back-end multi-tier interaction
○ Intra microservices communication - async vs sync
○ External/Public API consumers (API Gateways)
● Multiple languages
○ Typed: Java (Spring), Scala, GoLang, Python, Swift...
○ Untyped: JavaScript
● Multiple API types
○ Sync: CRUD vs RPC
○ Async: Message Queues
○ (we focus on SYNC only here)
● ...
5. RPC mechanisms
● RPC is like calling local functions, but towards remote entities, without having
to understand the network details
● They use binary-encoded protocols, backed by a schema
● Schemas handle backward and forward compatibility to a certain extent
● For more complex services, RPC provides more flexibility
● Domain specific
● More strongly typed experience via stubs
● To be considered when performance / latency is a primary design goal
7. gRPC - design goals
Build an open source, standards based, best-of-breed, feature rich RPC system
● Efficient and idiomatic
○ create easy-to-use, efficient and idiomatic libraries
● Performant and scalable
○ provide a performant and scalable RPC framework
● Micro-services
○ enable developers to build microservice-based applications
A high-performance, open source, universal RPC framework.
Created by Google, which makes tens of billions of calls per second within its global
datacentres.
8. gRPC - in a nutshell
● A high level service definition to describe the API using Protocol Buffers
● Client and server code generated from the service definition
○ 10+ languages: C++, Java, Objective-C, Python, Ruby, Go, C#, Node.js.
● Efficiency in serialization with Protocol Buffers
● Connections with HTTP/2
● Multiple RPC types
○ Unary
○ Server-side streaming
○ Client-side streaming
○ Bi-directional streaming
● Multiple authentication options
○ SSL/TLS
○ Token based authentication
9. gRPC - example
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

message HelloRequest {
  string first_name = 1;
  string last_name = 2;
}

message HelloReply {
  string message = 1;
}

● gRPC can use protocol buffers as both its IDL and as its underlying message
interchange format
● Service definitions and messages are stored in *.proto files
● From those *.proto files, data access classes with simple accessors are
generated, as well as gRPC client/server code (see the Java client sketch below)
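For illustration, a consumer in Java could then call the generated stub roughly like this (a minimal sketch; the port, plaintext transport and blocking-stub style are assumptions, not part of the slides):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class GreeterClient {
  public static void main(String[] args) {
    // Channel towards the gRPC server (plaintext only to keep the example short)
    ManagedChannel channel = ManagedChannelBuilder
        .forAddress("localhost", 50051)
        .usePlaintext()
        .build();

    // Blocking stub generated from the Greeter service in the *.proto file
    GreeterGrpc.GreeterBlockingStub stub = GreeterGrpc.newBlockingStub(channel);

    HelloReply reply = stub.sayHello(HelloRequest.newBuilder()
        .setFirstName("Ada")
        .setLastName("Lovelace")
        .build());

    System.out.println(reply.getMessage());
    channel.shutdown();
  }
}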
10. gRPC - protocol buffers
message Person {
  required string user_name = 1;
  optional int64 favourite_number = 2;
  repeated string interests = 3;
}
● a language-neutral, platform-neutral, extensible way of serializing structured
data for use in communications protocols, data storage, and more.
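In Java, the classes generated from this message can be used roughly as follows (a minimal sketch; the values and the wrapping class are assumptions for illustration):

import com.google.protobuf.InvalidProtocolBufferException;

public class PersonExample {
  public static void main(String[] args) throws InvalidProtocolBufferException {
    // Build the Person message defined above via the generated builder
    Person person = Person.newBuilder()
        .setUserName("ada")
        .setFavouriteNumber(1815)
        .addInterests("mathematics")
        .build();

    byte[] bytes = person.toByteArray();      // compact binary wire format
    Person decoded = Person.parseFrom(bytes); // parsed back from the binary form
    System.out.println(decoded.getUserName());
  }
}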
11. gRPC - evaluation
Pro’s
● Binary (Fast)
● Based on protocol buffers IDL (google)
● RPC is transport agnostic
Con’s
● Binary (not always compatible with Firewalls, API gateways,...)
● Does not map on REST, you describe operations/procedures
● Poor browser support
● Requires HTTP/2
12. Apache Thrift - introduction
A scalable cross-platform, cross-language binary RPC code generation engine.
● Define all necessary data structures and interfaces for a complex service in a
single short file
● Thrift Interface Definition Language files, or Thrift IDL files (*.thrift), from which code is
auto-generated
○ Support for C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript,
Node.js, Smalltalk, Ocaml, Delphi, etc…
● It handles both data transport and serialization
Developed by Facebook, contributed to the Apache project in 2007 and used by Evernote,
Cassandra, HBase and Hadoop
13. Apache Thrift - protocol stack
Protocols
● Binary
● Compact
● JSON and SimpleJSON
● Debug
Transports
● Socket (websockets!)
● Framed
● File
● Memory
● Zlib
Can be adjusted at runtime without recompilation!
14. Apache Thrift - example
service UserStorage {
void store(1: UserProfile user),
UserProfile retrieve(1: i32 uid)
}
struct UserProfile {
1: i32 uid,
2: string name,
3: string blurb
}
● Interface definitions, thrift types and services are stored in *.thrift files
● From those *.thrift files, thrift client/server code is generated
# thrift --gen java example.thrift
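A consumer could then use the generated Java client roughly as follows (a minimal sketch; the host, port and the blocking socket/binary-protocol combination are assumptions):

import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class UserStorageClient {
  public static void main(String[] args) throws TException {
    // Socket transport with the binary protocol (both can be swapped without touching the IDL)
    TTransport transport = new TSocket("localhost", 9090);
    transport.open();
    TProtocol protocol = new TBinaryProtocol(transport);

    // Client generated from the UserStorage service in the *.thrift file
    UserStorage.Client client = new UserStorage.Client(protocol);
    UserProfile profile = client.retrieve(42);
    profile.setBlurb("updated blurb");
    client.store(profile);

    transport.close();
  }
}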
15. Apache Thrift - evaluation
Pro’s
● Binary (Fast)
● Thrift IDL (originally from facebook)
● Has support for most languages
● Is great on mobile devices and embedded systems
Con’s
● Binary (not always compatible with Firewalls, API gateways,...)
● Lack of tooling
● Relative new
Developed by Facebook, used by Evernote and Cassandra
16. Avro - introduction
● Avro files are used for data serialization (file/message) and data exchange
● It stores both the data definition and the data together in one message/file
● Schema definition is represented in JSON format or Avro IDL
● Does not require a code generation step
● Robust support for data schemas that change over time
○ Separate reader and writer schema for schema evolution
● Avro includes APIs for different languages
○ Supported: Java, Python, Perl, Ruby, PHP, C, C++ and C#
● Selected as first class citizen for asynchronous messaging with Kafka
○ Using a separate Avro Schema registry
● Avro RPC interfaces are specified in JSON.
○ Interface has protocol declaration (message definitions) and wire format
● Can serialize into Avro/Binary or Avro/JSON
17. Avro - schema evolution
● Reader and writer schema don’t have to be the same (only compatible)
● At decode time schema differences are resolved (schema resolution)
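A minimal Java sketch of both points, serializing with the generic API (no generated classes) and then decoding with a newer reader schema that adds a defaulted field; the record and field names are assumptions for illustration:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class AvroEvolutionExample {
  public static void main(String[] args) throws IOException {
    // Writer schema: what the producer used when serializing
    Schema writerSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    // Reader schema: what the consumer expects (adds a field with a default)
    Schema readerSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"email\",\"type\":\"string\",\"default\":\"unknown\"}]}");

    // Serialize with the writer schema, no code generation step required
    GenericRecord user = new GenericData.Record(writerSchema);
    user.put("name", "Ada");
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
    new GenericDatumWriter<GenericRecord>(writerSchema).write(user, encoder);
    encoder.flush();

    // Decode with BOTH schemas: Avro resolves the differences at decode time
    BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
    GenericRecord decoded =
        new GenericDatumReader<GenericRecord>(writerSchema, readerSchema).read(null, decoder);
    System.out.println(decoded.get("email")); // filled in from the reader schema default
  }
}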
19. Avro
● From Apache, started as a subproject from Hadoop
● Binary
● Schemas play the same role as protocol buffers’ proto files, but code does not have to
be generated from them
● Using separate reader and writer schemas makes it more robust for API
evolution
● Can serialize into Avro/Binary or Avro/JSON
20. REST based solutions
● Introducing / clarifying some terminology
○ Bottom up vs. top down design
○ Endpoint declaration vs. validation of an API
○ Declarative vs. scenario based contracts
○ API development lifecycle
● REST maturity levels
○ HATEOAS
○ Hypermedia JSON formats
● Overview of solutions
○ Formal definitions: Swagger / OpenAPI Spec / RAML / OData
○ Consumer Driven Design: Spring Cloud Contract / PACT
21. Contracting - Bottom Up vs Top Down
Bottom Up - Code First
● Annotated backend controllers generate a producer compliant contract
● The consumer uses this contract as the truth (e.g. generate consumer code)
● A lot of responsibility at the producer side
● High risk of “domain model dumping”
Top Down - Contract First
● Contract is written at design time
● Producer and consumer prove they comply using tests
It is highly recommended to choose a top-down approach!
22. Contracting - Endpoint declaration and validation
It is useful to make a distinction between the endpoint declaration AND the
validation of a REST API Contract.
Endpoint declaration:
● Enumeration of the endpoints (/api/user/{id})
● Enumeration of the actions that can be performed (GET/POST)
● A description of the expected structure of the payload (JSON Fields)
● Possible return values (HTTP 201 created, 400 bad request)
Validation:
● Actual constraints / expectations on the content of the payload
● Example: an account number must satisfy a mod 97 check (see the sketch below)
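A minimal sketch of such a payload-level validation in Java, assuming (as an illustration, not from the slides) the Belgian-style convention that the last two digits equal the remainder of the leading digits modulo 97, with a remainder of 0 mapped to 97:

public final class AccountNumberValidator {

  // Assumed rule: check digits (last two) == leading digits mod 97, where 0 maps to 97
  public static boolean isValid(String accountNumber) {
    String digits = accountNumber.replaceAll("[^0-9]", "");
    if (digits.length() < 3) {
      return false;
    }
    long body = Long.parseLong(digits.substring(0, digits.length() - 2));
    int checkDigits = Integer.parseInt(digits.substring(digits.length() - 2));
    long expected = body % 97;
    if (expected == 0) {
      expected = 97;
    }
    return expected == checkDigits;
  }

  public static void main(String[] args) {
    // An endpoint declaration can only say "a string of digits"; this check is validation
    System.out.println(isValid("539-0075470-34"));
  }
}

Declarative specs describe the shape of such a field; a rule like this typically lives in validation code or in scenario-based contract tests.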
23. Contracting - declarative vs scenario based
Declarative based (e.g. Swagger / OpenAPI Spec / RAML)
● Focus on the structure of the API
● Basic assumptions on the structure of payloads
● Can be used to generate consumer code
● Can be used to generate tests
○ But you need to generate/maintain test data => drift!
Scenario based (e.g. Spring Cloud Contract / PACT)
● Enumeration of possible inputs with resulting output
● More work to describe all possible input/output test scenarios
● Mainly used from a testing perspective, because it can be used to auto-generate tests
and stubs/mocks
● Very similar to BDD (Given-When-Then)
26. REST and HATEOAS
Maturity Level 3 - Hypermedia As The Engine Of Application State.
● Make your API navigable by following contextual links
● Looser coupling between producer and consumer (no more hard-coded
resource paths)
● Different standards to enable hypermedia links in JSON (see the sketch below)
○ HAL _links
○ JSON-LD @id
○ Siren links
○ Collection+JSON
○ Cf. https://sookocheff.com/post/api/on-choosing-a-hypermedia-format
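As a rough illustration of the HAL flavour, a Spring HATEOAS controller can wrap a resource with a self link. This is a minimal sketch in which the User type, the lookup and the URL are assumptions, and it presumes Spring HATEOAS (HAL enabled by default) is on the classpath:

import org.springframework.hateoas.EntityModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.linkTo;
import static org.springframework.hateoas.server.mvc.WebMvcLinkBuilder.methodOn;

@RestController
public class UserController {

  @GetMapping("/api/user/{id}")
  public EntityModel<User> getUser(@PathVariable long id) {
    User user = findUser(id); // assumed lookup, e.g. a repository call
    // Renders as HAL: {"name":"Ada","_links":{"self":{"href":".../api/user/1"}}}
    return EntityModel.of(user,
        linkTo(methodOn(UserController.class).getUser(id)).withSelfRel());
  }

  private User findUser(long id) {
    return new User("Ada"); // placeholder for illustration
  }

  public static class User {
    private final String name;
    public User(String name) { this.name = name; }
    public String getName() { return name; }
  }
}

The consumer then navigates via the returned links instead of hard-coding resource paths.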
28. OpenAPI spec aka Swagger 2.0
● OpenAPI Initiative is focused on creating, evolving and promoting a vendor
neutral API Description Format, based on the Swagger Specification
● A Swagger Spec is a concrete description of a REST API:
○ Request methods + URIs
○ Query, Path and Body Parameters
○ Headers
○ Responses
● Swagger UI is a tool for documentation and developer exploration
● Swagger Codegen to generate documentation, API clients and server stubs
● SwaggerHub is an online web tool to design your API
● Swagger-Diff is a tool to detect (breaking) API changes
● Swagger has many language bindings
○ Clojure, D, Erlang, Elixir, Go, Haskell, Java, JavaScript, Jolie, Lua, TypeScript, .NET, Node.js,
Perl, PHP, Python, Ruby, Scala, Swift
32. OpenAPI spec - repository
● Avoid the performance hit of fetching specs just-in-time over the network
● Clients explicitly expect a particular interface
● A convenient place to inject other smart behaviour (e.g. OpenTracing headers)
33. OpenAPI spec - evaluation
Pro’s
● Large community, user base and tooling landscape
● Looks like it is becoming an industry standard
● Supported by all major API Gateways
Con’s
● Can be hard to maintain (single plain text file headaches)
● Less powerful compared with RAML
● Not designed with modularisation in mind
34. RAML
● Based on the YAML 1.2 specification
● Support for XML and JSON Schema
● Support for facets (restriction like min/max, user defined as well)
● Support for templating (avoiding pattern repetition in your API spec)
● Focus on modularization of your API specifications: includes, libraries,
overlays and extensions
● Security scheme types:
○ OAuth 1.0/2.0, Basic/Digest Auth, Pass Through, x-<other>
● More powerful than Swagger
○ programming language concepts, like multiple inheritance
● Steeper learning curve than Swagger
● IntelliJ auto-completion supported
Used by Spotify and VMware
35. RAML types
● Types are similar to Java classes and borrow additional features from JSON
Schema, XSD and more expressive object oriented languages.
● Multiple type inheritance is allowed.
● Types are split into four families: external, object, array, and scalar.
● Types can define two types of members: properties and facets. Both are
inherited.
○ Facets are special configurations. You specialize types based on characteristics of facet
values. Examples: minLength, maxLength
36. RAML - evaluation
Pro’s
● Large community, user base and tooling landscape
● Looks like it is becoming an industry standard
● More powerful compared with OpenAPI Spec
● Designed with modularisation and reuse in mind
● Supported by all major API Gateways
Con’s
● Steeper learning curve the OpenAPI Spec
● Less tooling available compared with OpenAPI Spec
37. OData
● Created by Microsoft, backed by OASIS since 2014
● Open Data Protocol to exchange data over the web
● It is HTTP based and designed with a RESTful mindset
○ Atom or JSON serialization
○ Using HTTP verbs on URIs
● URI based query language (~ SQL over HTTP)
○ similarities with JDBC/ODBC but not limited to relational DBs
○ allows publishers to tag/label data with domain specific vocabularies
● Advanced features
○ Support for asynchronous queries
○ Request only the changes/updates since your last query
● Popular in the .NET space, but little traction elsewhere, or even abandoned
○ e.g. Netflix abandoned OData in 2013, as did eBay
38. OData compared with SQL
SQL Query → OData Request
● SELECT * FROM products WHERE id = 1 → /Products(1)
● SELECT * FROM products WHERE name = 'Milk' → /Products?$filter=name eq 'Milk'
● SELECT name FROM products → /Products?$select=name
● SELECT * FROM products ORDER BY name → /Products?$orderby=name
● SELECT * FROM products OFFSET 10 LIMIT 10 → /Products?$top=10&$skip=10
● SELECT * FROM prices r, products p WHERE r.id = p.id (does not map directly to SQL, but close)
→ /Products(1)?$expand=Prices (returns nested resource data)
41. OData - evaluation
Pro’s
● Backed by Microsoft
Con’s
● High risk for domain model dumping or database exposure
○ You need an extra transformation layer in between to mask your domain model / DB
● Some big vendors have abandoned their OData tracks
42. Consumer Driven Testing
● Traditional end-to-end tests (dependency driven)
○ Too late to get feedback
○ Expensive to change so late
○ Difficult to orchestrate. Your client might not exist yet.
○ Don’t know who is using your API
● Consumer driven (scenario driven)
○ Focus on APIs that provide value
■ Provider should never implement logic a consumer is not asking for!
○ Fail fast
○ Independent communication/collaboration with dependent teams
43. Consumer Driven Design
● Centralised “contract” storage
○ Publish test scenarios on each build
○ Golden source of client dependencies
○ Download and test each build
○ Test both consumer and provider against contract
● Frameworks
○ Spring Cloud Contract
○ Pact (realestate.com.au)
● Stub Provider Service
○ Testing service not downstream
○ Spring @Profile annotation
● NOT suitable for public APIs
● Ideal for intra-microservice APIs
Integration tests provide a false sense of confidence (no silver bullet!)
44. Spring Cloud Contract
Pro’s
● Maps on REST
● IDL with code completion in your IDE (Groovy) and integrating well with the
Spring ecosystem.
● Can be exported to PACT
● Encourages consumer driven contracts
○ Plugin to generate consumer WireMock stubs, letting consumers test the contract while the
provider is not yet implemented (see the sketch below)
○ Plugin to generate unit tests for the provider, letting the provider verify that its
implementation complies with the contract
○ The producer has a clean contract based on what consumers are requesting (via PR)
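As a rough illustration of the consumer side, a test against such a generated stub boils down to something like the WireMock sketch below; the endpoint, port and payload are assumptions, and in practice the Spring Cloud Contract stub runner starts the stub for you:

import com.github.tomakehurst.wiremock.WireMockServer;

import static com.github.tomakehurst.wiremock.client.WireMock.aResponse;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.urlEqualTo;

public class UserApiConsumerTest {
  public static void main(String[] args) {
    // Stand-in for the provider: serves the behaviour described by the contract
    WireMockServer server = new WireMockServer(8089);
    server.start();
    server.stubFor(get(urlEqualTo("/api/user/1"))
        .willReturn(aResponse()
            .withStatus(200)
            .withHeader("Content-Type", "application/json")
            .withBody("{\"id\":1,\"name\":\"Ada\"}")));

    // The consumer code under test can now call http://localhost:8089/api/user/1
    // and be developed and tested before the real provider exists.

    server.stop();
  }
}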
47. Pact
● Pact is a testing tool that guarantees that Consumer Driven Contracts are
satisfied
● PACT scenario’s can be expressed using JSON or specific PACT IDL
● A two step process (consumer & producer side)
○ Set up a mock server and expectations
○ Act as client and make assertions
● Pact broker to centralise pact files
○ Supports auto generated documentation
○ Creates a network graph of service inter-dependencies
58. Conclusion
● Because …
○ REST is the most adopted and “easy”
○ Latency & performance are not our primary design concerns
● … we choose REST over RPC
● Because …
○ Swagger’s popularity (backed by the OpenAPI Initiative)
○ Consumer driven testing is NOT a formal contract
○ PACT is language neutral
● … we choose a combination of Swagger + PACT