1. Evolution of a Cloud Start-Up:
From C# to Node.js
Steve Jamieson
Lead Developer at ComputeNext
steve@computenext.com
2. Overview
• ComputeNext started 3 years ago to develop the first open
marketplace for cloud computing services.
• We started by using the technologies we were most familiar with -
C# and SQL Server, and our initial architecture and
implementation was based on these technologies.
• Over time, we have progressively introduced more open source
elements, including MongoDB, RabbitMQ and Node.js.
• Now we are at the point where most of our back-end services rely
on Node.js. This talk covers why we did this, how we did it, and
discusses our experiences - both good and bad.
3. What do we do? (vision)
• Provide a marketplace for Cloud resources
• Choices in resource types (VM/VS etc.)
• Choices in level – IaaS/PaaS/SaaS
• Choices in providers & regions
• Able to search for what you need
• Able to buy what you want
4. How do we do it? (challenges)
• How to define all these different cloud resource types from
different sources in a normalized way?
• How do we provide the appearance of a single cloud over multiple
different cloud providers & regions?
• How do we provide a single interface (API) to multiple cloud
providers?
• How do we manage accounts and keep track of billing?
5. Components (6)
• Web UI
• User Management
• Billing
• Search/Catalog (resources)
• Fulfilment/Provisioning
• Infrastructure & Monitoring
6. Step 1 – C#
• Web UI – ASP.NET & C#
• Users – ASP.NET & SQL Server
• Billing – C# & WCF, SQL Server
• Search – “semantic web” – C/RDF/SPARQL
• Provisioning – C# & WCF, SQL Server
• Federation Server (FS)
• Provider Gateway (PG) - connector model
• REST API – Service Stack
• Infra – Windows services & tracing
• Service restart
7. Step 1 – All C#
[Architecture diagram: the ASP.NET Web UI, Billing, and Resources services call an internal REST API on the Federation Server (C# Windows service), which talks WCF to the Provider Gateway (C# Windows service); connectors C1–C3 inside the gateway speak HTTP to providers P1–P3; services share one SQL database. Legend: C# Windows Service, C# IIS Web Service.]
8. V1 Provisioning
• Workload and Transaction Model
• Declarative model of a workload in JSON
• Workload is a collection of workload elements
• Transaction is a “running workload” (collection of running VM or VS instances)
• V1 API
• WCF first, then REST (ServiceStack)
• Design: AWS and OpenStack, workloads & transactions
• Workload & transactions immutable (cannot be changed after execute)
• Federation Server – C#/WCF
• SQL access via Data Access Layer (DAL)
• Event driven model, graph for dependencies
• Resource types hardcoded (C# class)
• Provider Gateway – C#/WCF
• Connector Model
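The declarative JSON workload model might look something like this. This is a hypothetical sketch; the field names and structure are illustrative assumptions, not the actual V1 schema:

```json
{
  "workload": {
    "name": "web-tier",
    "elements": [
      {
        "id": "web-vm",
        "type": "vm",
        "provider": "p1",
        "region": "us-west",
        "image": "ubuntu-12.04",
        "size": "medium"
      },
      {
        "id": "data-volume",
        "type": "vs",
        "sizeGb": 100,
        "dependsOn": ["web-vm"]
      }
    ]
  }
}
```

Executing the workload would turn this declaration into a transaction (a collection of running instances), with the `dependsOn` edges forming the dependency graph the event-driven model walks.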
9. Why Node.js?
• Deployment Flexibility
• Platform independent
• Might need to deploy services anywhere
• We heard it could scale…
• Small footprint
• Fits well with our new proposed architecture (REST services)
• Lots of packages
• Good fit with NoSql (MongoDB) - JSON
• Good “word on the street”
10. Step 2 - Hybrid
• Web UI – Wordpress & PHP
• Users – SQL Server
• authentication service (Node.js + MongoDB)
• roles (authorization) service (Node.js + MongoDB)
• Billing – C# & SQL Server
• Search – Node.js & MongoDB
• Infrastructure
• iisnode + Nagios + tracing
• Other
• insight service (Node.js)
11. V1 Node.js services (4)
• authentication
• usernames/passwords/cookies/tokens/apikeys
• passwords salted & hashed
• roles (authorization)
• roles based on REST API, nested roles
• GET /workload > get.workload
• insight
• execute prepared SQL queries on demand
• run “sqlcmd” as XML, convert to JSON, REST API
• resources
• “triples” in MongoDB (subject/predicate/object)
12. Step 2 – C# and some Node.js
[Architecture diagram: as in Step 1, the Federation Server and Provider Gateway (C# Windows services, WCF) sit behind the internal REST API, with connectors C1–C3 speaking HTTP to providers P1–P3 and a shared SQL database; the Web UI is now Wordpress/PHP, an external REST API has been added, and new Node.js services (auth, roles, insight) each use their own MongoDB database. Legend: C# Windows Service, C# IIS Web Service, Node.js, M = MongoDB.]
13. Step 3 (V2) – Mostly Node.js!
• Web UI – Wordpress, PHP & Magento (MySql)
• Users – SQL Server & authentication & roles (Node.js)
• Billing – Node.js & MySql, still some C#
• Search – Magento/SOLR, plus resources for API
• Provisioning – Node.js & RabbitMQ & MongoDB
• Infrastructure - iisnode & Nagios & tracing
14. V2 Node.js services (10 + 3)
• instance – requests & instances, MongoDB
• provider – JavaScript connector model (simplified from V1)
• workload – plan/execute model
• insight (V2) - timing & inventory
• gateway – external API
• billing
• resources – catalog and search
• chef – deploy Chef cookbooks (to VM)
• archive – stores data for analytics
• upload – upload private images to regions
15. Step 3 – Mostly Node.js!
[Architecture diagram: a Magento Web UI and an external gateway front the Node.js services (instance, provider, workload, billing, resources, roles/auth, timing & inventory), each with its own MongoDB database; the instance and provider services communicate over RabbitMQ, with connectors C1–C3 speaking HTTP to providers P1–P3; user data remains in SQL Server and billing uses MySQL. Legend: C# IIS Web Service, Node.js, RabbitMQ, M = MongoDB, Other.]
16. Migration
• Instances
• Migrate all the instances from V1 to V2
• Export all active instances from SQL Server
• Import instances into MongoDB
• Solution: Node.js scripts, JSON
• Connectors
• V1 (C#) wants to use the new V2 (JavaScript) connectors
• V2 (JavaScript) wants to use the old V1 (C#) connectors
• Solution: C# and Node.js can talk over RabbitMQ
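The instance migration ("Node.js scripts, JSON") amounts to reshaping flat exported rows into nested documents before bulk-inserting them into MongoDB. A minimal sketch of that transform, with entirely hypothetical field names:

```javascript
// Sketch of the V1 -> V2 instance migration reshaping: flat rows
// exported from SQL Server become nested MongoDB documents.
// All field names are hypothetical, for illustration only.
function rowToInstanceDoc(row) {
  return {
    _id: row.instance_id,
    state: row.state,
    resource: {
      type: row.resource_type,
      provider: row.provider_code,
      region: row.region_code
    },
    created: new Date(row.created_utc)
  };
}

// A migration run is then: export rows as JSON, map, bulk-insert.
function migrate(rows) {
  return rows.map(rowToInstanceDoc);
}
```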
23. What we’ve learned
• A good tracing system is critical
• End-to-end JSON (Node.js + MongoDB + JavaScript) makes life a
lot easier
• Small self-contained services with well defined REST interfaces are
quick to write/change/fix (but lots of communication)
• Async – takes some getting used to - single threaded but highly
concurrent
• Data driven design pays off (esp. if data is JSON)
• Separate concerns (workloads/instances/plan/execute)
• File system is a pretty good database
24. What we’ve learned about clouds
• Lots of diversity in cloud object models (connectors)
• normalization is not straightforward
• concepts don’t line-up exactly
• easier for some resource types than others
• error handling is key
• not always well documented
• things change very fast!
• Diversity across cloud implementations (providers)
• “standard” clouds… are not!
• lots of data to manage and keep up to date
• performance – test, monitor & track
• Relationships with cloud providers are key
• marketplace helps providers understand their customers
• Futures
• beyond virtualization & images – Chef/Puppet/Docker
25. Favorite Node.js packages
• express (REST API)
• async (flow control)
• json-schema (input validation)
• JSONPath (like XPath for JSON - used to parse JSON from connectors)
• string (C# like string functions – startsWith/endsWith)
• request (simple HTTP client)
• node-yaml-config (YAML for config files)
• cjson (JSON with comments, great for data-driven code)
• underscore (useful functions)
• datejs (expire dates)
• xml2json & xml2js (XML to JSON conversions)
• mongodb/amqp/mysql
• tracer (customized)
• node-cache (better than do-it-yourself)
• edge (call C# from Node.js)
26. Great things about Node.js
• NPM – lots of packages
• Version control - it doesn’t break!
• End-to-end JavaScript & JSON
• Single thread eliminates whole class of bugs
• Single thread – no locks, easier to program
• REST is easy (with express)
• No compile
• It’s a scripting language too!
• “functions” beat “objects”?
• Platform independent (Windows or Linux)
27. Not so great…
• Miss tools integration & easy to use debugger, IntelliSense
• Miss C# AppDomains – not so good in Node (Domains)
• Service model limited (iisnode - web services only)
• Error handling & exceptions can be confusing
• No binary model – everyone sees source code
• JavaScript/JSON – no schemas, so it’s easy to forget what a structure looks like
• JavaScript – beware gotchas
• Because not compiled, can get silly runtime errors
• Async - can get no callbacks, or multiple callbacks!
• Packages – some are Linux only, some don’t work “as advertised”
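The "multiple callbacks" gotcha above is common enough that a defensive wrapper is a standard idiom: wrap the callback so only its first invocation counts. A generic sketch, not ComputeNext's actual code:

```javascript
// Defensive pattern for the "multiple callbacks" gotcha: wrap the
// callback so only the first invocation counts.
function once(fn) {
  var called = false;
  return function () {
    if (called) return; // swallow any second (buggy) invocation
    called = true;
    fn.apply(null, arguments);
  };
}

// A deliberately buggy "async" function that calls back twice.
function buggyOperation(callback) {
  callback(null, 'first');
  callback(null, 'second'); // bug: should never happen
}
```

The mirror-image bug (no callback at all) usually needs a timeout guard instead, since nothing ever fires to detect it.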
28. Where are we now?
• www.ComputeNext.com
• ~ 40 providers
• ~ 145 regions
• ~ 25 connectors
• 10 resource types: vm/vs/kp/sg/ip/image/lb/mp/snap/obs
• www.MediaPaaS.com
• External REST API, tools & docs
You need to be able to search for what you need across all providers and regions.
You need to be able to have one account and one bill for all the resources you use across any providers and regions.
WCF = Windows Communication Foundation
The Provider Gateway (PG) implemented a “connector model” by which we could independently develop multiple connectors and load them into the PG as required. This allowed us to develop our various connectors more quickly.
FS and PG were Windows services, which gave us some restart capabilities if the service crashed for some reason.
Circles are essentially process boundaries.
Blue = Windows Service, C#
Yellow = Web Service running under IIS, C#
We developed our V1 API as WCF interfaces first, then extended it to REST using ServiceStack.
We moved away from ASP.NET to gain more speed & flexibility in the UI development.
Having done that we lost the user management capabilities of ASP.NET so we had to develop authentication & authorization (roles) services – this became our “proof of concept” for using Node.js.
Adding our own authentication service allowed us to add new capabilities that we needed such as supporting “apikeys” for our REST API authentication.
The “insight” service was added to allow us to do simple queries on our SQL Server database. At this point the Node.js driver for SQL Server was not available, so Node.js would run “sqlcmd” as a separate process, get the output as XML, and convert to JSON to be made available through a REST interface for our UI.
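The XML-to-JSON step described above can be illustrated with a toy converter. The real service used an XML-to-JSON package on `sqlcmd` output; this regex version handles only simple attribute-only `<row .../>` elements, and the output shape is assumed:

```javascript
// Toy illustration of the sqlcmd XML -> JSON step. The real service
// used an XML-to-JSON package; this regex sketch handles only simple
// attribute-only <row .../> elements and assumes that output shape.
function rowsToJson(xml) {
  var rows = [];
  var rowRe = /<row\s+([^/>]*)\/>/g;
  var attrRe = /(\w+)="([^"]*)"/g;
  var rowMatch;
  while ((rowMatch = rowRe.exec(xml)) !== null) {
    var obj = {};
    var attrMatch;
    while ((attrMatch = attrRe.exec(rowMatch[1])) !== null) {
      obj[attrMatch[1]] = attrMatch[2];
    }
    rows.push(obj);
  }
  return rows;
}
```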
The “M” indicates that the service uses MongoDB. Unlike V1, where we shared one SQL database across multiple services, in V2 each service uses a separate MongoDB database, so every service is “self contained”.
Magento – open source e-commerce platform
Apache SOLR – open source search engine
iisnode is a Node.js package from Microsoft that allows Node.js services to run as IIS web services managed by IIS. This gives us process pooling, process recycling and restart.
Unlike V1, in V2 we now split the API between the “instance service” and the “workload service”.
The “instance service” is what you might expect from AWS or OpenStack – an API that allows single instances of particular resources to be provisioned.
The “workload service” is similar to what we had in V1, a collection of declarative workload elements that specifies what you want.
In V2, the workload service calls the instance service to provision each individual resource.
Also, in V2 the workloads are a lot more flexible – you can add things to them, remove things from them, and deactivate and re-activate them again, unlike in V1 where you basically got one shot at it.
The V2 insight service, although it has a similar name, is quite different from the insight service in V1.
The V2 insight service now provides timing information and inventory for the workload service and the UI.
This slide shows our transition to mostly Node.js services.
The connectors are now JavaScript.
The red arrows show our use of RabbitMQ – you can see that the timing & inventory services (in the V2 insight service) and the billing service “listen-in” on the queue between the instance service and the provider service. This makes communication between services simpler.
As of June 2014. Adding more providers and regions every month!
vm = virtual machine
vs = volume storage
kp = key pair
sg = security group
ip = IP address
image = image
lb = load balancer
mp = MediaPaaS
snap = volume snapshot
obs = object storage