3. Cellular networks are choking
Automatic optimization to the rescue:
1. Collect analytics
2. Analyze and update network configuration
3. Back to 1!
SON – self-optimizing networks
An example: a loaded cell
We’re a proud Python shop
4. Agenda
Why and how we migrated to MongoDB
Do you need an API?
What is a RESTful API?
A review of Intucell’s API
MongoDB best practices
5. Why MongoDB?
Scale and failover just work!
Data center partition tolerance
Development speed
Fast prototyping – schema changes frequently
Slows down when you need joins and transactions
6. Migration Challenges
Migrating from MySQL to MongoDB
People have direct access to the DB
20 developers
40 analysts and tech support
“No joins? SQL? Transactions? GUI?”
A lot to make up for!
7. Why An API?
Complement mongo – reports (joins!) and PQL
Hide implementation – data store(s), short names
Security - auth isn’t enough: {$where: 'while(1){}'}
Resource management – run slow queries on slaves
Schema and referential integrity
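The security bullet above can be made concrete with a small guard that the API layer runs before a query reaches the database. This is only a sketch: the function name `reject_forbidden_operators` and the contents of the blocklist are assumptions, not something taken from the talk.

```python
# Hypothetical guard: walk a Mongo-style query and refuse blocked operators.
FORBIDDEN_OPERATORS = {'$where'}  # assumption: extend with whatever you want to block


def reject_forbidden_operators(query):
    """Return the query unchanged, or raise ValueError if a blocked operator appears."""
    if isinstance(query, dict):
        for key, value in query.items():
            if key in FORBIDDEN_OPERATORS:
                raise ValueError('operator not allowed: {}'.format(key))
            reject_forbidden_operators(value)  # operators can be nested
    elif isinstance(query, (list, tuple)):
        for item in query:
            reject_forbidden_operators(item)
    return query
```

Running the check recursively matters because `$where` can hide inside `$or` or `$and` clauses.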
8. Type Of API
Small layer on top of your driver
Dictionaries and hashes - not OO!
MongoEngine/MongoKit (ODM)
Your own!
RESTful
Cross language
Inherent to web apps
Standards for caching, auth, throttling
9. RESTful
“Representational state transfer”
Not a standard but an architectural style
Basically it’s a bunch of guidelines!
Real world APIs break some of them
HTTP as a communication layer
Implementing CRUD using HTTP
10. RESTful Routes
Resource          Method and Route      Meaning
Users collection  GET /users/           Read users
                  DELETE /users/        Delete users
                  PUT /users/           Update users
                  POST /users/          Create user/s
A user            GET /users/<id>       Read a user
                  DELETE /users/<id>    Delete a user
                  PUT /users/<id>       Update a user
                  POST /users/<id>      Create a user
* RESTful APIs usually don’t support batch operations of create/update/delete
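The route table above boils down to two inputs: the HTTP method picks the CRUD action, and the path says whether the target is a collection or a single resource. A minimal sketch of that mapping, with a hypothetical function name `describe_request`, could look like this:

```python
def describe_request(method, path):
    """Map a method and a path like /users/ or /users/42 to (action, collection, id)."""
    actions = {'GET': 'read', 'POST': 'create', 'PUT': 'update', 'DELETE': 'delete'}
    parts = [part for part in path.split('/') if part]
    if method not in actions or len(parts) not in (1, 2):
        raise ValueError('unsupported request: {} {}'.format(method, path))
    collection = parts[0]
    if len(parts) == 1:
        return (actions[method], collection, None)  # collection-level operation
    return (actions[method], collection, parts[1])  # single-resource operation
```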
11. HTTP Crash Course
GET /search?q=foo&source=web HTTP/1.1
Host: www.google.co.il
Cache-Control: max-age=0
User-Agent: Mozilla/5.0
Accept: text/html,application/xml
Accept-Encoding: gzip,deflate,sdch
Cookie: PREF=ID=9a768e836b317d:U=fd620232bd98bd
* Note that I removed and shortened some headers
* Query strings are limited to about 2k characters! (browser specific)
12. HTTP Crash Course
POST /api/v1/system/auth/users/alonho/ HTTP/1.1
Host: localhost
Content-Length: 20
Content-Type: application/json
User-Agent: python-requests/0.9.3
Cookie: token=6f01a9decd518f5cf5b4e14bddad
{"password": "none"}
* Note that I removed and shortened some headers
* Content (body) is normally sent only with POST/PUT
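For readers who want to reproduce the POST request above from Python, here is a sketch using only the standard library. The capture on the slide came from python-requests; `urllib.request` is swapped in here so the example has no dependencies. Nothing is sent over the network: the request object is only built and inspected.

```python
import json
import urllib.request

# The JSON body from the slide; encoded it is 20 bytes, matching Content-Length.
body = json.dumps({'password': 'none'}).encode('utf-8')

request = urllib.request.Request(
    'http://localhost/api/v1/system/auth/users/alonho/',
    data=body,
    method='POST',
    headers={'Content-Type': 'application/json',
             'Cookie': 'token=6f01a9decd518f5cf5b4e14bddad'},
)

# Calling urllib.request.urlopen(request) would actually send it.
print(request.get_method(), request.selector)
print('Content-Length would be', len(body))
```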
13. CLI for HTTP
A CLI can make your life easier
Each API call is defined by:
A resource
A method
Parameters
% son_cli --create users name='alon'
+--------------------------+------+
| id                       | name |
+==========================+======+
| 5192605a9716ab5a94b37d3c | alon |
+--------------------------+------+
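The "resource, method, parameters" breakdown maps naturally onto `argparse`. The sketch below is not the real `son_cli`; the function name `parse_cli` and the choice of only two flags are assumptions made to keep it short.

```python
import argparse


def parse_cli(argv):
    """Split a command line into (HTTP method, resource, params)."""
    parser = argparse.ArgumentParser(prog='son_cli')
    method = parser.add_mutually_exclusive_group(required=True)
    method.add_argument('-c', '--create', dest='method',
                        action='store_const', const='POST')
    method.add_argument('-r', '--read', dest='method',
                        action='store_const', const='GET')
    parser.add_argument('resource')
    parser.add_argument('params', nargs='*', help='key=value pairs')
    args = parser.parse_args(argv)
    # Split on the first "=" only, so values may themselves contain "=".
    params = dict(pair.split('=', 1) for pair in args.params)
    return args.method, args.resource, params
```

All parameter values arrive as strings; converting `age=3` to an integer is left to the schema on the server side.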
14. Resource Generation
We already use MongoEngine
Declarative
Enforces schema
Supports inheritance (multiple types in one collection)
from mongoengine import Document, IntField, StringField

class User(Document):
    name = StringField(required=True)
    age = IntField(min_value=13,
                   help_text='Years alive',
                   required=True)

register_mongo_resource(User, '/users')
15. Create
% son_cli -c users age=3
{'error': 'Bad Request',
 'code': 400,
 'message': 'Value 3 for field "age" is less than minimum value: 13'}
% son_cli -c users name='alon' age=120
+--------------------------+------+-----+
| id                       | name | age |
+==========================+======+=====+
| 5192605a9716ab5a94b37d3c | alon | 120 |
+--------------------------+------+-----+
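The 400 response above is what a schema violation looks like from the client side. A rough sketch of how a minimum-value check could be turned into that payload follows; the helper `validate_min_value` is hypothetical and stands in for whatever MongoEngine and the API layer do internally.

```python
def validate_min_value(field, value, min_value):
    """Return None when valid, else an error payload shaped like the one above."""
    if value < min_value:
        message = 'Value {} for field "{}" is less than minimum value: {}'.format(
            value, field, min_value)
        return {'error': 'Bad Request', 'code': 400, 'message': message}
    return None
```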
16. Read
% son_cli -r users
+--------------------------+------+-----+
| id                       | name | age |
+==========================+======+=====+
| 5192605a9716ab5a94b37d3c | alon | 120 |
+--------------------------+------+-----+
| 5192608d9716ab5a94b37d3d | john | 100 |
+--------------------------+------+-----+
| 519265909716ab5a94b37d3e | snow | 30  |
+--------------------------+------+-----+
Sane defaults: by default, read returns the first 50 documents
17. Read Less
% son_cli -r users page_size=2 page=0 fields=name,age
+------+-----+
| name | age |
+======+=====+
| alon | 120 |
+------+-----+
| john | 100 |
+------+-----+
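On the server side, `page`, `page_size` and `fields` have to become a skip, a limit and a projection. The sketch below shows one plausible translation; the function names are hypothetical, and suppressing `_id` when it is not requested is an assumption based on the output above showing no id column.

```python
def paging_to_skip_limit(page=0, page_size=50):
    """Translate zero-based paging into (skip, limit); 50 mirrors the default above."""
    if page < 0 or page_size <= 0:
        raise ValueError('page must be >= 0 and page_size > 0')
    return page * page_size, page_size


def fields_to_projection(fields):
    """Turn 'name,age' into a Mongo-style projection document."""
    projection = {name: 1 for name in fields.split(',') if name}
    # Assumption: hide _id unless the caller asked for it explicitly.
    projection.setdefault('_id', 0)
    return projection
```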
18. Read Ordered
% son_cli -r users fields=name,age order=age
+------+-----+
| name | age |
+======+=====+
| snow | 30  |
+------+-----+
| john | 100 |
+------+-----+
| alon | 120 |
+------+-----+
How would you order by ascending age and descending name?
% son_cli -r users order=age,-name
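The `order=age,-name` syntax is easy to convert into the list of `(field, direction)` pairs that PyMongo's `sort` accepts, where 1 means ascending and -1 descending. A minimal sketch, with a hypothetical function name:

```python
def parse_order(order):
    """Turn 'age,-name' into [('age', 1), ('name', -1)]."""
    spec = []
    for token in order.split(','):
        token = token.strip()
        if not token:
            continue
        if token.startswith('-'):
            spec.append((token[1:], -1))  # leading "-" means descending
        else:
            spec.append((token, 1))
    return spec
```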
19. Read Filtered
% son_cli -r users query='age < 40 or name == "john"'
+--------------------------+------+-----+
| id                       | name | age |
+==========================+======+=====+
| 5192608d9716ab5a94b37d3d | john | 100 |
+--------------------------+------+-----+
| 519265909716ab5a94b37d3e | snow | 30  |
+--------------------------+------+-----+
26. Defying REST
Collection level updates are rarely seen
Performance – how long will it take?
Query strings too long for GET (2k)
Fall back to POST/PUT (lose caching)
Extend OPTIONS for route completion
OPTIONS returns supported methods
Added an extension that returns routes
27. Route Discovery
% curl -X OPTIONS http://localhost/api/v1/
{'options': ['users/', 'posts/']}
% curl -X OPTIONS http://localhost/api/v1/users/
{'options': ['alon', 'john']}
% curl http://localhost/api/v1/users/alon
{'name': 'alon', 'twitter': 'alonhorev'}
* Available as a Flask extension called route-options
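The idea behind route discovery can be sketched without any web framework: given the known routes and a prefix, return the distinct next path segments. This is not the route-options extension itself; `route_options` below is a hypothetical helper, and unlike the output above it does not append a trailing slash to collections.

```python
def route_options(routes, prefix):
    """Return the distinct path segments that directly follow prefix."""
    prefix_parts = [part for part in prefix.split('/') if part]
    depth = len(prefix_parts)
    options = []
    for route in routes:
        parts = [part for part in route.split('/') if part]
        if parts[:depth] == prefix_parts and len(parts) > depth:
            segment = parts[depth]
            if segment not in options:  # keep first-seen order, no duplicates
                options.append(segment)
    return options
```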
30. Querying
Let's filter some users by name:
Mongo:
user_names = ['foo', 'bar']
db.users.find({'name': {'$in': user_names}})
SQL:
name_list = ', '.join(map(sql_escape, user_names))
sql = 'select * from users where name in ({})'.format(name_list)
* SQL users: do yourselves a favor and use an ORM.
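If an ORM is not an option, bound parameters avoid hand-escaping altogether. The sketch below uses the standard library's `sqlite3` with an in-memory table purely for illustration; the table contents are made up, and other drivers use a different placeholder style.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('create table users (name text, age integer)')
connection.executemany('insert into users values (?, ?)',
                       [('foo', 30), ('bar', 40), ('baz', 50)])

user_names = ['foo', 'bar']
# One "?" per value; the driver binds the values, so no escaping is needed.
placeholders = ', '.join('?' for _ in user_names)
sql = 'select name from users where name in ({}) order by name'.format(placeholders)
rows = connection.execute(sql, user_names).fetchall()
print(rows)
```

Only the placeholders are formatted into the SQL text; the user-supplied values never are.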
31. Querying
Let's find users older than 60 or younger than 20:
Mongo:
db.users.find({'$or': [{'age': {'$gt': 60}},
                       {'age': {'$lt': 20}}]})
SQL:
sql = 'select * from users where age > 60 or age < 20'
32. PQL
Mongo’s queries are easier to compose
SQL is easier to write when invoking ad-hoc queries
PQL was born – Mongo queries for humans!
>>> pql.find('age < 20 or age > 60')
{'$or': [{'age': {'$lt': 20}},
         {'age': {'$gt': 60}}]}
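To show the idea of translating a Python-like expression into a Mongo query, here is a toy translator built on the standard `ast` module. It is not PQL's code and covers only `and`, `or` and simple comparisons of a field name against a literal; `to_mongo` and `_convert` are hypothetical names.

```python
import ast

COMPARATORS = {ast.Lt: '$lt', ast.LtE: '$lte', ast.Gt: '$gt',
               ast.GtE: '$gte', ast.NotEq: '$ne'}
BOOLEANS = {ast.Or: '$or', ast.And: '$and'}


def to_mongo(expression):
    """Translate e.g. 'age < 20 or age > 60' into a Mongo query dict."""
    return _convert(ast.parse(expression, mode='eval').body)


def _convert(node):
    if isinstance(node, ast.BoolOp):
        return {BOOLEANS[type(node.op)]: [_convert(value) for value in node.values]}
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and isinstance(node.left, ast.Name)):
        field = node.left.id
        value = ast.literal_eval(node.comparators[0])  # literals only, never eval()
        if isinstance(node.ops[0], ast.Eq):
            return {field: value}
        operator = COMPARATORS.get(type(node.ops[0]))
        if operator is not None:
            return {field: {operator: value}}
    raise ValueError('unsupported expression')
```

Because the expression is parsed and never executed, the caller cannot run arbitrary code through the query string.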
34. PQL - Aggregations
Car listing:
{made_on: ISODate("1973-03-24T00:00:02.013Z"),
 price: 21000}
Number of cars and total of prices per year in 1970-1990:
>>> from pql import project, match, group
>>> collection.aggregate(
...     project(made_on='year(made_on)',
...             price='price') |
...     match('made_on >= 1970 and made_on <= 1990') |
...     group(_id='made_on',
...           count='sum(1)',
...           total='sum(price)'))
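For comparison, this is roughly the pipeline one would write by hand for the same report using MongoDB's own aggregation operators. It is a hand-written equivalent, not PQL's actual output, which may differ in shape (for example in how the two range conditions are combined).

```python
# Hand-written aggregation pipeline: cars per year and total price, 1970-1990.
pipeline = [
    # Replace the full date with its year and carry the price along.
    {'$project': {'made_on': {'$year': '$made_on'}, 'price': '$price'}},
    # Keep only the years of interest.
    {'$match': {'made_on': {'$gte': 1970, '$lte': 1990}}},
    # One output document per year.
    {'$group': {'_id': '$made_on',
                'count': {'$sum': 1},
                'total': {'$sum': '$price'}}},
]
```

With PyMongo this list would be passed as `collection.aggregate(pipeline)`.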
38. BSON != JSON
ObjectID and Date are BSON specific!
Convert them to strings
Using a codec is better – symmetrical!
>>> import datetime, bson
>>> from bson import json_util
>>> json_util.dumps(datetime.datetime.now())
'{"$date": 1367970875910}'
>>> json_util.dumps(bson.ObjectId())
'{"$oid": "51896a43b46551eff3f43594"}'
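What "symmetrical" means can be shown with the standard `json` module alone: one function encodes dates on the way out, and a matching hook decodes them on the way in. This sketch handles only timezone-aware datetimes and is a simplified stand-in for `json_util`, not its implementation; the function names are hypothetical.

```python
import datetime
import json

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
ONE_MS = datetime.timedelta(milliseconds=1)


def encode_extended(value):
    """json.dumps `default` hook: datetime -> {'$date': millis since epoch}."""
    if isinstance(value, datetime.datetime):
        # Assumption: value is timezone-aware; naive datetimes raise TypeError here.
        return {'$date': (value - EPOCH) // ONE_MS}
    raise TypeError('cannot serialize {!r}'.format(value))


def decode_extended(document):
    """json.loads `object_hook`: {'$date': millis} -> datetime, else unchanged."""
    if set(document) == {'$date'}:
        return EPOCH + document['$date'] * ONE_MS
    return document
```

A value encoded with `json.dumps(doc, default=encode_extended)` comes back as a datetime from `json.loads(text, object_hook=decode_extended)`, which a plain string conversion cannot guarantee.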
40. Python != JSON
            JSON Document   Python Dictionary
Key type    Only strings    Anything immutable
Key order   Ordered         Unordered
Example: user id to name mapping
Python: {1234: 'Alon Horev', 1038: 'John Wayne'}
JavaScript: [{'id': 1234, 'name': 'Alon Horev'},
             {'id': 1038, 'name': 'John Wayne'}]
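The key-type difference is easy to demonstrate: `json.dumps` silently turns integer keys into strings, so a plain round trip does not give back the original dictionary. Converting to a list of records, as in the example above, keeps the ids as integers.

```python
import json

names_by_id = {1234: 'Alon Horev', 1038: 'John Wayne'}

# Naive round trip: the integer keys come back as strings.
round_tripped = json.loads(json.dumps(names_by_id))

# Record form: ids are values, so their type survives the trip.
as_records = [{'id': user_id, 'name': name}
              for user_id, name in names_by_id.items()]
restored = {record['id']: record['name']
            for record in json.loads(json.dumps(as_records))}

print(round_tripped)
print(restored)
```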
42. References
http://python-eve.org/ - A new RESTful API for MongoDB written in Python
http://flask.pocoo.org/ - A great Python web framework
https://github.com/alonho/pql - The PQL query translator
https://github.com/micha/resty - resty enhances curl for RESTful API calls
Learn from others! Twitter and Facebook have great RESTful APIs
Editor's Notes
Developers use the database for debugging and introspection. Analysts learned SQL and used the database for performance analysis and report generation.
You will not find a spec or a reference implementation. There are good examples out there (Facebook, Twitter) and good frameworks to help you build RESTful APIs.