Railsplitter: Simplify Your CRUD
Aaron Klish, William Cekan, Dennis McWherter, Clay Reimann, Dan Chen,
Swathi Subramanian, Arun Thomas, Akash Duseja, Jon Kilroy, Mike Musson
{klish, wcekan, dmcwherter, clayreimann, danchen,
swathis1, arunthomas, duseja, kilroy, musson}@yahoo-inc.com
1. INTRODUCTION
Railsplitter is a framework which significantly reduces devel-
opment cost to expose a hierarchical data model as a pro-
duction quality Create, Read, Update, and Delete (CRUD)
web service. Railsplitter adopts the JSON API standard
[10] for the service definition given its focus on consumption
by front-end developers. Inherent in the design of JSON
API are capabilities that reduce the number of round trips
from client to server to fetch or update data. Updates
on disparate models can happen in a single request allow-
ing the server to build atomicity guarantees. Rather than
starting from scratch with a domain-specific language (DSL)
to describe a data model, Railsplitter adopts Java Persis-
tence API (JPA) [6] - a modeling definition that is rich and
has a long tenure of proven provider implementations. Un-
like other approaches, Railsplitter addresses the fundamen-
tal needs of flexible, model driven authorization, interoper-
ability with client side applications, and test automation.
Building a service layer with Railsplitter consists of the fol-
lowing steps:
1. Expose new or existing JPA data models with Rail-
splitter annotations for access and authorization.
2. Embed the Railsplitter library, annotated data model,
and JPA provider into a web serving container.
3. Optionally configure the test automation framework to
verify security was correctly specified.
4. Optionally leverage the JavaScript client library that
simplifies integration and caching as well as insulat-
ing the client from future changes in the JSON API
specification.
Categories and Subject Descriptors
[Experiences]
Keywords
JSON API, Web Services, JPA, Data Model, DSL
2. MOTIVATION
The problems solved by Railsplitter have existed for some
time. After exploring existing solutions, we found that no
existing framework to date encompasses all of the advan-
tages provided by Railsplitter.
Figure 1: Railsplitter libraries and services
2.1 Relay & GraphQL
Facebook has announced plans (but no current public re-
lease) to open source a reference implementation of Re-
lay and GraphQL [5]. Their proposal is to improve re-
liability and evolution of backend services by providing a
queryable backend service and corresponding query optimizing
front-end library. GraphQL defines a new transport-agnostic
grammar for queries. Unfortunately, the protocol can only be
relayed over HTTP POST, effectively disabling caching.
Railsplitter addresses the concerns listed by
Facebook [5] while utilizing Representational State Transfer
(REST) APIs [2]. Namely, we achieve complicated object
graph retrieval in a single round-trip and enforce type-safety
on requests via server-side request validation.
2.2 Katharsis
Katharsis [3] is a Java implementation of JSON API. Sim-
ilarly to Railsplitter, it provides a mechanism of annotat-
ing entity classes and generating a JSON API compliant
REST API. However, Katharsis’ implementation is limited.
The data model must be defined with custom annotations.
Consequently, existing JPA annotated models will not work
out-of-box. Also, Katharsis does not provide the ability to
perform multiple reads or writes in a single atomic action.
Without transactions, concurrency issues such as the ABA
problem [15] may arise within the system. Likewise, the
Katharsis implementation does not address security and
opens up the entire data model. As a result, while Katharsis
may be useful in prototyping, we find the implementation to
be unusable for production services.

/**
 * Includes 100% of the beans in a package at
 * the root of the URL path.
 */
@Include(rootLevel=true)
package com.yahoo.myapplication;
import com.yahoo.railsplitter.annotation.Include;
Figure 2: Example package-info.java

2.3 Spring REST
Spring REST [1], similarly to Katharsis, suffers a feature
deficit that inhibits rapid deployment of production services.
In particular, security and transactions are not explicit fea-
tures of the Spring REST framework. Again, we emphasize
the importance of security and data integrity in any broker
system to a datastore and that implementing these things
should not be arduous for developers. Additionally, every
model is made accessible as a collection at the root of the
URL path (e.g. /comments, /posts, /users). As discussed
later, this makes some security policies impossible to imple-
ment.
3. ENTITY EXPOSITION
Railsplitter limits which models can be interacted with through
two annotations: Include and Exclude. By default, models
are implicitly excluded unless marked otherwise.
Include allows access to a given model or package of models.
Include also specifies which models are accessible at the root
of the URL path. Exclude disallows access to a given entity
or package of entities. Annotations on classes override anno-
tations on packages. See figure 2 for an example annotation
on packages.
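The precedence rules above can be sketched as follows. This is a minimal illustration and not the Railsplitter implementation: the Include and Exclude annotations are redeclared locally as stand-ins so the example compiles on its own, and the PackageRule enum, the Exposure class, and the sample models are hypothetical names introduced here to represent the package-level annotation without reflection on package-info.java.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-ins for the Railsplitter annotations (assumed shape).
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.PACKAGE})
@interface Include { boolean rootLevel() default false; }

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.PACKAGE})
@interface Exclude { }

// Hypothetical: what the enclosing package's package-info.java declares.
enum PackageRule { INCLUDE, EXCLUDE, NONE }

// Hypothetical sample models.
@Include(rootLevel = true) class Post { }
@Include class Comment { }
@Exclude class InternalNote { }
class Unmarked { }

class Exposure {
    /** Returns true if the model may be accessed through the service. */
    static boolean isExposed(Class<?> model, PackageRule packageRule) {
        // Annotations on the class override annotations on the package.
        if (model.isAnnotationPresent(Exclude.class)) {
            return false;
        }
        if (model.isAnnotationPresent(Include.class)) {
            return true;
        }
        // No class-level annotation: fall back to the package, and to the
        // implicit default of "excluded" when the package is unmarked too.
        return packageRule == PackageRule.INCLUDE;
    }

    public static void main(String[] args) {
        System.out.println(isExposed(Post.class, PackageRule.NONE));
        System.out.println(isExposed(Unmarked.class, PackageRule.NONE));
        System.out.println(isExposed(InternalNote.class, PackageRule.INCLUDE));
    }
}
```

Passing the package rule as an argument keeps the sketch independent of how the package-level annotation is discovered at runtime.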
4. AUTHORIZATION
Railsplitter authorization involves a few core concepts:
• Permissions - a set of annotations that describe how
a model (create, read, update, delete) or model field
(read, update) can be accessed.
• Checks - custom code that verifies whether a user can
perform an action on the model. Checks are assigned
to permissions.
• User - an opaque object which represents the user
identity and is passed to checks.
• Model - the entity object being accessed.
4.1 Authorization Enforcement
Security is applied in three ways:
1. Granting or denying access. If a specific model or
model field is accessed and the requesting user does not
belong to a role that has the associated permission, the
request will be rejected with an HTTP 403 error code.
Otherwise the request is granted.
2. Filtering collections. If a model has any associ-
ated read permission checks, these checks are evaluated
against each model that is a member of the collection.
Only the model instances the user has access to are
returned in the response.
3. Filtering model fields. If a user has read access to
a model but only for a subset of a model’s fields, the
disallowed fields are simply excluded from the output
(rather than denying the request). This filtering does
not apply for explicit requests for sparse fieldsets.
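The three enforcement modes can be illustrated with a small sketch. It is not Railsplitter code: the EnforcementSketch class and its method names are hypothetical, models are represented as plain values or field maps, and the success status of 200 is an assumption made only to contrast with the 403 named above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

class EnforcementSketch {
    /** Mode 1: a direct access is either granted or rejected with 403. */
    static int statusFor(boolean permitted) {
        return permitted ? 200 : 403; // 200 is an assumed success status
    }

    /** Mode 2: keep only the collection members that pass the read check. */
    static <M, U> List<M> filterCollection(List<M> members, U user,
                                           BiPredicate<M, U> readCheck) {
        List<M> visible = new ArrayList<>();
        for (M member : members) {
            if (readCheck.test(member, user)) {
                visible.add(member);
            }
        }
        return visible;
    }

    /** Mode 3: drop unreadable fields instead of denying the request. */
    static Map<String, Object> filterFields(Map<String, Object> fields,
                                            Set<String> unreadable) {
        Map<String, Object> output = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : fields.entrySet()) {
            if (!unreadable.contains(field.getKey())) {
                output.put(field.getKey(), field.getValue());
            }
        }
        return output;
    }
}
```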
4.2 Hierarchical Security
JSON API does not qualify whether a URL can traverse the
model-relationship graph. All documented examples only
demonstrate access to a single collection, single resource, or
a field belonging to a resource.
Without a hierarchy, all models must be accessible at the
URL root. This poses a number of problems for the decla-
ration and evaluation of security policies:
1. All models must enumerate all security checks. The
declarations become highly redundant and error prone.
2. Security check implementations may require access to
a number of related models. If a resource has more
than a single parent or if the resource cannot access
a parent (the relationship is unidirectional), and the
parent is needed to evaluate a check, it is not possi-
ble to determine the correct context to grant or deny
authorization.
Given these difficulties, Railsplitter clarifies JSON API by
explicitly allowing nested resource URL composition.
Checks are applied in a sequence based on the order in which
models and their fields are accessed. Consider an update
on ’/post/1/comment/4’ which changes the comment name
field. Checks would be evaluated in the following sequence:
1. Read permission check on the post model
2. Read field permission check on post.comment
3. Read permission check on the comment model
4. Write permission check on the comment model
5. Write field permission check on comment.name
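The ordering can be derived mechanically from the request path. The sketch below is an illustration under stated assumptions rather than Railsplitter's algorithm: the CheckSequence class is hypothetical, paths are assumed to alternate type names and identifiers, and the final entry assumes the write-field check targets the field being changed on the addressed model.

```java
import java.util.ArrayList;
import java.util.List;

class CheckSequence {
    /**
     * Lists the checks for an update of one field on the resource addressed
     * by a nested path of the form /type/id/type/id.
     */
    static List<String> forUpdate(String path, String updatedField) {
        String[] parts = path.replaceAll("^/+", "").split("/");
        List<String> checks = new ArrayList<>();
        String parent = null;
        String model = null;
        // Even positions hold type names, odd positions hold identifiers.
        for (int i = 0; i < parts.length; i += 2) {
            model = parts[i];
            if (parent != null) {
                // The relationship field on the parent is read first.
                checks.add("read field " + parent + "." + model);
            }
            checks.add("read model " + model);
            parent = model;
        }
        // Only the last model in the lineage is written.
        checks.add("write model " + model);
        checks.add("write field " + model + "." + updatedField);
        return checks;
    }

    public static void main(String[] args) {
        for (String check : forUpdate("/post/1/comment/4", "name")) {
            System.out.println(check);
        }
    }
}
```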
4.3 Checks
A check is simply a class which implements the interface as
specified in figure 3.
public interface Check {
    boolean ok(PersistentResource model,
               Object user);
}
Figure 3: Check interface class
4.4 Permission Annotations
Permissions include CreatePermission, ReadPermission, Up-
datePermission, and DeletePermission. Permissions are con-
figured in one of two ways:
1. Any - A list of Check classes. If any of the checks
evaluate to true, permission is granted.
2. All - A list of Check classes. If all of the checks evaluate
to true, permission is granted.
More complex check expressions can be implemented by
composing and evaluating checks inside another check class.
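As an illustration of the Any and All semantics and of composing checks, consider the following sketch. The Check interface mirrors Figure 3, but everything else is an assumption: PersistentResource is a local stand-in with a single owner field, and IsOwner, IsAdmin, OwnerOrAdmin, and the Permissions helper are hypothetical names, not part of Railsplitter.

```java
import java.util.Arrays;
import java.util.List;

// Local stand-in for Railsplitter's PersistentResource (assumed shape).
class PersistentResource {
    final String owner;
    PersistentResource(String owner) { this.owner = owner; }
}

// Same shape as the interface in Figure 3.
interface Check {
    boolean ok(PersistentResource model, Object user);
}

// Hypothetical check: the requesting user owns the model.
class IsOwner implements Check {
    public boolean ok(PersistentResource model, Object user) {
        return model.owner.equals(user);
    }
}

// Hypothetical check: the requesting user is the administrator.
class IsAdmin implements Check {
    public boolean ok(PersistentResource model, Object user) {
        return "admin".equals(user);
    }
}

// A more complex expression built by evaluating checks inside another check.
class OwnerOrAdmin implements Check {
    public boolean ok(PersistentResource model, Object user) {
        return new IsOwner().ok(model, user) || new IsAdmin().ok(model, user);
    }
}

class Permissions {
    /** Any: permission is granted if at least one check evaluates to true. */
    static boolean any(List<Check> checks, PersistentResource model, Object user) {
        for (Check check : checks) {
            if (check.ok(model, user)) {
                return true;
            }
        }
        return false;
    }

    /** All: permission is granted only if every check evaluates to true. */
    static boolean all(List<Check> checks, PersistentResource model, Object user) {
        for (Check check : checks) {
            if (!check.ok(model, user)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Check> checks = Arrays.asList(new IsOwner(), new IsAdmin());
        PersistentResource doc = new PersistentResource("alice");
        System.out.println(any(checks, doc, "alice"));
        System.out.println(all(checks, doc, "alice"));
    }
}
```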
4.5 Audit
Audit assigns semantic meaning to CRUD operations for the
purposes of logging and audit. For example, we may want to
log when users change their password or when an account is
locked. Both actions are PATCH operations on a user entity
that update different fields.
Audit can assign these actions to parameterized, human
readable logging statements that can be logged to a file,
written to a database, or even displayed to users.
All models in Railsplitter are accessed through JPA anno-
tated relationships. For example, if a request URL has
the path ’/company/53/user’, the user model is accessed
through its relationship with a specific company. The se-
quence of prior models traversed to access a particular model
is called that model’s lineage. A model and every prior
model in its lineage are fully accessible to parameterize au-
dit logging in Railsplitter.
4.5.1 Audit Annotations
Railsplitter audits operations on classes and class fields marked
with the Audit annotation. The Audit annotation takes sev-
eral arguments:
1. The CRUD action performed (CREATE, READ, UPDATE,
or DELETE).
2. An operation code which uniquely identifies the se-
mantic meaning of the action.
3. The statement to be logged. This is a template string
that allows ’{}’ variable substitution.
4. An ordered list of Unified Expression Language (UEL)
[14] expressions that are used to substitute ’{}’ in the
log statement. Railsplitter binds the model that is being
audited and every model in its lineage to variables
that are accessible to the UEL expressions. The variable
names map to the model’s type (typically the class
name).
Figure 4 is an example of a class with an audit annotation.

/**
 * Example audit annotation for a password
 * update action
 */
@Entity
public class User {
    ...
    @Audit(action = Audit.Action.UPDATE,
           operation = UPDATE_USER_PASSWORD,
           logStatement = "User {0} from company {1} "
                        + "changed password.",
           logExpressions = {"${user.userid}",
                             "${company.name}"})
    public void setPassword(String password) {
        ...
    }
    ...
}
Figure 4: Audit annotation example

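To make the substitution step concrete, the sketch below renders the log statement from Figure 4. It is a simplified stand-in, not the Railsplitter audit code: the AuditSketch class is hypothetical, the lineage is modeled as nested maps keyed by variable name, and a plain property lookup replaces a real UEL evaluator. The indexed placeholders ({0}, {1}) follow the figure and are filled with java.text.MessageFormat.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;

class AuditSketch {
    /** Resolves a "${variable.property}" expression against the bound models. */
    static Object resolve(String expression,
                          Map<String, Map<String, Object>> bindings) {
        // Drop the leading "${" and the trailing "}".
        String inner = expression.substring(2, expression.length() - 1);
        String[] parts = inner.split("\\.", 2);
        return bindings.get(parts[0]).get(parts[1]);
    }

    /** Evaluates each expression in order and substitutes it into the statement. */
    static String render(String logStatement, String[] logExpressions,
                         Map<String, Map<String, Object>> bindings) {
        Object[] arguments = new Object[logExpressions.length];
        for (int i = 0; i < logExpressions.length; i++) {
            arguments[i] = resolve(logExpressions[i], bindings);
        }
        return MessageFormat.format(logStatement, arguments);
    }

    public static void main(String[] args) {
        // Lineage for a request such as /company/53/user: both models are bound.
        Map<String, Object> company = new HashMap<>();
        company.put("name", "Acme");
        Map<String, Object> user = new HashMap<>();
        user.put("userid", "jdoe");
        Map<String, Map<String, Object>> bindings = new HashMap<>();
        bindings.put("company", company);
        bindings.put("user", user);

        System.out.println(render(
            "User {0} from company {1} changed password.",
            new String[] {"${user.userid}", "${company.name}"},
            bindings));
        // prints "User jdoe from company Acme changed password."
    }
}
```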
5. EXTENDING RAILSPLITTER
It is sometimes required to customize CRUD operations with
business logic. For example, a user model might require a
cryptographic hash applied to a password field or there may
be a limit on the number of model entities that a user can
create.
However, not all business logic belongs in the service layer –
especially when developing client side applications. One of
the benefits of developing client side applications is that it
forces the developer to expose data via APIs. The cleaner
the API, the more likely it is that multiple applications can
be layered on top of it. Logic that was traditionally in a
DAO on a server-side application is often (but not always)
moved to the client.
That said, not all business logic can or should be moved
client side. Any logic related to security must be imple-
mented on the server side. Also, simple integrity checks and
field validations on data can be implemented as database
constraints or via JPA provider validation annotations. Sim-
ple logic that doesn’t reference other entities can also be im-
plemented in the JPA model classes themselves. To date,
we have found that every use case for customized logic can
either be accommodated by moving business logic to the
application, adding customized security checks, adding au-
dit annotations, adding table or validation constraints, or
adding simple logic inside the model class.
6. TESTING METHODOLOGY
Exhaustive testing of an API for a web service with a hierar-
chical data model has always been a challenge for developers
and testers. There are more negative test cases than positive
ones. The negative tests can broadly be classified into
one of the following categories:
• Invalid Endpoints: The scope of tests in this category
is to verify that invalid URL combinations generate
HTTP 404 responses. This includes verifying entities
which should not be exposed (implicitly or explicitly
excluded from Railsplitter) remain hidden.
• Valid but unauthorized endpoints: The scope of these
tests is to verify the security of the web service (e.g.
the URL is valid but the user accessing the URL is
unauthorized to do so). The requests usually return
with HTTP 403 responses.
Positive test coverage should ideally exhaust all valid endpoints
and all CRUD operations for all user roles.
The Railsplitter testing framework automates this process
by requiring an alternate declaration of the security pol-
icy that has been defined by the Railsplitter annotations.
Similar to entering a password twice for a registration page,
declaring the same policy in two different ways and then ver-
ifying the results match significantly reduces the likelihood
of human error. The test framework requires the Railsplit-
ter library, a data store with test data, and a configuration
file which includes the following elements:
• A list of which entities should be exposed through Rail-
splitter, including which entities are accessible from
the root of the URL path.
• A list of user profiles. Each profile is a role or class
of users who share the same permissions. Each profile
includes:
– An alias or name for the profile
– A table which defines the following columns
∗ An entity name
∗ A list of entity permissions (create, read, up-
date, delete) the user role has access to
∗ A list of fields the user role cannot read
∗ A list of fields the user role cannot write
∗ The complete list of entity identifiers (the pri-
mary key) in the data store that the user role
has access to
Rows are only required in the table if an entity has explicit
permission annotations defined. In practice, the number of
entities with explicit checks is a small subset of the total.
Using this configuration file, the test framework can generate
JSON API CRUD requests that completely explore every
row in every table in the test data store for positive as well
as negative test cases.
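The generation step can be sketched as a loop over one profile row and the identifiers in the test data store. The code below is only an illustration of the idea: ProfileRow, GeneratedTest, and TestGenerator are hypothetical names, the entity is assumed to be reachable at the root of the URL path, the mapping of operations to HTTP methods is an assumption, and field-level permissions and create requests are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of one row of a profile table in the configuration file.
class ProfileRow {
    final String entity;            // entity name
    final Set<String> permissions;  // subset of "read", "update", "delete"
    final Set<Long> accessibleIds;  // identifiers the user role has access to

    ProfileRow(String entity, Set<String> permissions, Set<Long> accessibleIds) {
        this.entity = entity;
        this.permissions = permissions;
        this.accessibleIds = accessibleIds;
    }
}

// One generated request together with its expected outcome.
class GeneratedTest {
    final String method;
    final String url;
    final boolean expectAuthorized; // false means an HTTP 403 is expected

    GeneratedTest(String method, String url, boolean expectAuthorized) {
        this.method = method;
        this.url = url;
        this.expectAuthorized = expectAuthorized;
    }
}

class TestGenerator {
    /** Produces one request per operation for every identifier in the store. */
    static List<GeneratedTest> generate(ProfileRow row, List<Long> idsInStore) {
        // Assumed mapping from entity permission to HTTP method.
        Map<String, String> methods = new LinkedHashMap<>();
        methods.put("read", "GET");
        methods.put("update", "PATCH");
        methods.put("delete", "DELETE");

        List<GeneratedTest> tests = new ArrayList<>();
        for (Long id : idsInStore) {
            boolean visible = row.accessibleIds.contains(id);
            for (Map.Entry<String, String> operation : methods.entrySet()) {
                // Positive case only when both the row and the identifier allow it.
                boolean allowed = visible
                        && row.permissions.contains(operation.getKey());
                tests.add(new GeneratedTest(operation.getValue(),
                        "/" + row.entity + "/" + id, allowed));
            }
        }
        return tests;
    }
}
```

Because the expected outcome comes from the configuration file rather than from the annotations, a mismatch between the two declarations shows up as a failing generated test.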
7. JAVASCRIPT LIBRARY
Railsplitter also has a client-side Data Access Layer (DAL).
It is designed to allow efficient access to data from the RESTful
web service. The JavaScript library exposes a fluent,
promise-based API that allows users to express queries naturally.
All that is required to use the JavaScript library is
your favorite promise library and the schema (or portion of
the schema you wish to access) expressed as a JSON object.
Currently, the model objects need to be handwritten, but
eventually they will be generated directly by a build-time
tool that understands the server-side JPA annotated data
model.
There are five basic methods that are exposed by the API:
find, create, update, delete, and commit. The methods
closely align with the CRUD functions of the server. The li-
brary’s methods return promises so that clients do not need
to worry about when their activities access the network.
In its default configuration the library maintains a local
cache of all the objects fetched to improve performance and
reduce load on the server. All write operations are cached
locally until the caller commits them to the server. We send
updates to the server using the JSON Patch specification [7].
Using JSON Patch allows communication with the server to
be compact and efficient. The library is designed to be
extensible, so that other datastores can be implemented
if users decide that the data could be modeled more efficiently
in another database style (e.g. NoSQL) or want to
employ a different caching strategy.
8. PRODUCTIVITY IMPROVEMENT MEASUREMENTS
To compare the productivity gains from using Railsplitter
over a classic web service implementation, we compared our
classic Pulse 1.5 REST web service currently in production
to the next generation Pulse 2.0 version built with Railsplit-
ter.
The classic Pulse 1.5 web service provides a REST interface
to the Pulse 1.5 database tables. Lines of code and person-weeks
were collected from each corresponding Git repository
during the 4 weeks of initial product development.
The work required for Pulse 1.5 was then extrapolated
to estimate the work required for the complete Flurry DevPortal
application if it were written as a new project. These
estimates compare well (∆1%) with the actual lines of code
in the production Flurry product.
In our case we found an estimated savings of over 85% or 109
person-weeks. See table 1 for measurements in comparing
Pulse 1.5 web service to Railsplitter enabled Pulse 2.0.
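The savings figure follows from the Project Total row of Table 1, taking the 128 person-weeks extrapolated for a classic DevPortal implementation against the 19 person-weeks estimated with Railsplitter:

```latex
\[
128 - 19 = 109 \ \text{person-weeks},
\qquad
\frac{109}{128} \approx 0.852 \approx 85\%.
\]
```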
While generating the schema using JPA annotated beans is
the same work as without Railsplitter, in our case over 80%
of the time was spent writing the REST interfaces and tests.
Railsplitter removes most of that work, which is where the
savings come from.
For our sample project, adding the Railsplitter security and
auditing to the existing Flurry beans is all that is required.
Business logic previously implemented with Data Access Ob-
ject (DAO) classes moved to the security constraints. How-
ever, building a project from scratch will require both beans
and security classes to be built together. Table 1 shows that
building the schema is a fraction of the time of a full classic
implementation.
              | Pulse 1.5 Classic (Sample)   | DevPortal (Extrapolated)        | Flurry (Actual) | Pulse 2.0 using Railsplitter | Railsplitter DevPortal (Estimate)
--------------+------------------------------+---------------------------------+-----------------+------------------------------+------------------------------------------------
Entity Beans  | 15                           | 220                             | 220             | 17                           | 263
Schema        | 1,062 lines, 1 person-week   | 15,000 lines, 14 person-weeks   | 31,825 lines    | 1,096 lines, 1 person-week   | 17,000 lines, 14 person-weeks (if from scratch)
DAO/Security  | 1,103 lines, 1 person-week   | 16,000 lines, 14 person-weeks   | 17,326 lines    | 113 lines, 1 person-week     | 1,700 lines, 3 person-weeks
Web Service   | 19,323 lines, 5 person-weeks | 280,000 lines, 70 person-weeks  | 256,242 lines   | 567 lines, <1 person-week    | 567 lines, <1 person-week
Testing       | 1,519 lines, 2 person-weeks  | 22,000 lines, 30 person-weeks   | 11,539 lines    | Automated, <1 person-week    | Automated, <1 person-week
Project Total | 23,007 lines, 9 person-weeks | 333,000 lines, 128 person-weeks | 316,932 lines   | 1,776 lines, 3 person-weeks  | 19,000 lines, 19 person-weeks
Table 1: Comparison of classic Pulse 1.5 web service to Railsplitter enabled Pulse 2.0
9. FUTURE WORK
Interoperability of a web service with client libraries is paramount
to adoption and ease of use. Currently, the frontend Rail-
splitter data access library requires a schema providing meta-
data regarding entities and relationships to be used for domain-
driven development. A proposed extension to the Railsplit-
ter Core library is a compile-time generation of a Railsplitter
JavaScript entity-relationship schema. This would be imple-
mented as a build library such as a maven plugin. The gen-
erated schema can then be exported into a frontend project
implementing the Railsplitter DAL. This will provide a con-
sistent view of the data model between the frontend and the
web service, while reducing schema development effort on
the frontend.
Another proposed extension to the library is providing a dis-
covery web service with the API web service. Similar to the
first proposed extension, a discovery web service would pro-
vide exposition of entity-relationship schema but instead of
at compile-time, it would expose the schema at runtime ei-
ther on another web service endpoint or as inline metadata
with the response payload. The Railsplitter DAL would also
be extended to communicate with the discovery web service.
The discovery web service will provide several advantages.
First, the Railsplitter DAL can be used without schema development.
Instead, the Railsplitter DAL would request the
schema and build its knowledge of the data model and capabilities
at runtime. Related work in this area includes the Hypermedia
as the Engine of Application State (HATEOAS)
[4] and HAL [3] efforts. RESTful web service implementations
of these efforts are seen in Spring HATEOAS [12] and
Jersey [9]/JAX-RS 2.0 Link extension [8] implementing Web
Links (RFC 5988) [11]. However, these implementations are
limited to simple RESTful web services and cannot be used
to express the capabilities of Railsplitter.
The discovery web service can also be the foundation for
documentation and publishing the API web service. The
discovery web service can be templated to provide an API
or data model documentation. Having a runtime accessible
discovery web service that is self-documenting is valuable
for development and adoption. For public programmatic ac-
cess, the discovery web service can provide a rich experience
paired with an interactive API-only user interface. Related
work in this area includes Swagger [13]. Swagger
documents RESTful web services but is currently not capable
of documenting the API functionality of a Railsplitter web
service.
JPA provides for complex relationships, including maps of entities
(where the key, value, or both are separate models).
A JSON API extension could be developed to allow access
to these relationships.
10. CONCLUSION
Railsplitter is a framework to create production quality web
services for hierarchical data. By adopting the JSON API
[10] convention, Railsplitter allows developers to focus on
their business logic through annotated model definitions and
alleviates the need to build several standard components.
Given the inescapable need for hierarchical data services at
Yahoo, we hope Railsplitter can be used as a standard li-
brary to deliver developer productivity, consistency and co-
hesion across the company.
11. REFERENCES
[1] Building a restful web service.
https://spring.io/guides/gs/rest-service.
[2] Graphql introduction.
http://facebook.github.io/react/blog/2015/05/
01/graphql-introduction.html.
[3] Hal - hypertext application language.
http://stateless.co/hal_specification.html.
[4] Hateoas. https://en.wikipedia.org/wiki/HATEOAS.
[5] Introducing relay and graphql.
https://facebook.github.io/react/blog/2015/02/
20/introducing-relay-and-graphql.html.
[6] Java persistence api.
http://www.oracle.com/technetwork/java/javaee/
tech/persistence-jsp-140049.html.
[7] Javascript object notation (json) patch.
[8] Jax-rs 2.0 link class.
https://jax-rs-spec.java.net/nonav/2.0/
apidocs/javax/ws/rs/core/Link.html.
[9] Jersey link class. https://jersey.java.net/
documentation/latest/uris-and-links.html.
[10] Json api. http://jsonapi.org.
[11] Rfc 5988: Web linking.
http://tools.ietf.org/html/rfc5988.
[12] Spring hateoas.
http://projects.spring.io/spring-hateoas.
[13] Swagger. http://swagger.io.
[14] Unified expression language. https://uel.java.net/.
[15] D. Dechev, P. Pirkelbauer, and B. Stroustrup.
Understanding and effectively preventing the aba
problem in descriptor-based lock-free designs. In
Proceedings of the 2010 13th IEEE International
Symposium on Object/Component/Service-Oriented
Real-Time Distributed Computing, ISORC ’10, pages
185–192, Washington, DC, USA, 2010. IEEE
Computer Society.