In this session, you’ll learn how Apache Cassandra is used with Python in the NY Times ⨍aбrik messaging platform. Michael will open with an overview of the NYT⨍aбrik global message bus platform and its “memory” features, then discuss its use of the open source Apache Cassandra Python driver by DataStax. A progression of benchmarks for testing features and performance will be presented, from naive and synchronous to asynchronous with multiple IO loops; these benchmarks are tailored to usage at the NY Times. Code snippets, followed by beer, for those who survive. All code is available on GitHub!
12. A Global Mesh with a Memory
Message-based: WebSocket, AMQP, SockJS
If in doubt:
• Resend
• Reconnect
• Reread
Idempotent:
• Replicating
• Racy
• Resolving
Classes of service:
• Gold: replicate/race
• Silver: prioritize
• Bronze: queueable
Millions of users
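The "if in doubt, resend" rule only works when applying a message twice is harmless. A minimal sketch of that idea follows, using a hypothetical in-memory store keyed by (hash_key, message_id); the class and its names are illustrative assumptions, not code from the talk or from the platform itself.

```python
import uuid


class IdempotentStore:
    """Hypothetical in-memory stand-in for a store keyed by message id."""

    def __init__(self):
        self._rows = {}

    def apply(self, hash_key, message_id, body):
        """Write a message; return True only the first time its key is seen."""
        key = (hash_key, message_id)
        is_new = key not in self._rows
        # Writing the same key again overwrites the row with the same data,
        # so a resend leaves the store unchanged.
        self._rows[key] = body
        return is_new

    def __len__(self):
        return len(self._rows)


store = IdempotentStore()
msg_id = uuid.uuid1()  # time-based UUID, the Python counterpart of a timeuuid
first = store.apply(7, msg_id, b"payload")
resent = store.apply(7, msg_id, b"payload")  # duplicate delivery
```

Because the write is keyed on the message id, a sender that is unsure whether a message arrived can simply send it again.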
14. Message: an event with data
CREATE TABLE source_data (
    hash_key int,        -- real ones are more complex
    message_id timeuuid,
    body blob,           -- whatever
    metadata text,       -- JSON
    PRIMARY KEY (hash_key, message_id)
);
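As a rough illustration of writing to this table from Python, the sketch below prepares one INSERT and issues it asynchronously through a DataStax driver session. The function names, the `hash_key` default, and the keyspace argument are assumptions made for the example; the benchmark code from the talk may be structured quite differently.

```python
import json
import uuid

INSERT_CQL = (
    "INSERT INTO source_data (hash_key, message_id, body, metadata) "
    "VALUES (?, ?, ?, ?)"
)


def connect(hosts, keyspace):
    """Open a driver session (hosts and keyspace are caller-supplied).

    The driver is imported lazily so the rest of this sketch can be
    exercised without cassandra-driver installed.
    """
    from cassandra.cluster import Cluster
    return Cluster(contact_points=hosts).connect(keyspace)


def push_messages(session, messages, hash_key=0):
    """Insert (body, metadata) pairs and return how many were written.

    `session` is expected to behave like a driver Session: prepare()
    returns a statement and execute_async() returns a future with result().
    """
    prepared = session.prepare(INSERT_CQL)
    futures = []
    for body, metadata in messages:
        # uuid.uuid1() is time-based, which is what a timeuuid column expects.
        params = (hash_key, uuid.uuid1(), body, json.dumps(metadata))
        futures.append(session.execute_async(prepared, params))
    for future in futures:
        future.result()  # block until each write completes; raises on failure
    return len(futures)
```

Issuing all writes before waiting on any of them is the simplest step up from a synchronous loop; a real benchmark would also cap the number of in-flight requests.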
39. Push some messages
usage: bm_push.py [-h] [-c [CQL_HOST [CQL_HOST ...]]] [-d LOCAL_DC]
                  [--remote-dc-hosts REMOTE_DC_HOSTS] [-p PREFETCH_COUNT]
                  [-w WORKER_COUNT] [-a] [-t]
                  [-n {ONE, TWO, THREE, QUORUM, ALL, LOCAL_QUORUM,
                       EACH_QUORUM, SERIAL, LOCAL_SERIAL, LOCAL_ONE}]
                  [-r] [-j] [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}]
Push messages from a RabbitMQ queue into a Cassandra table.
40. Push messages many times
usage: run_push.py [-h] [-c [CQL_HOST [CQL_HOST ...]]] [-i ITERATIONS]
                   [-d LOCAL_DC] [-w [worker_count [worker_count ...]]]
                   [-p [prefetch_count [prefetch_count ...]]]
                   [-n [level [level ...]]] [-a] [-t] [-m MESSAGE_EXPONENT]
                   [-b BODY_EXPONENT]
                   [-l {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET}]
Run multiple test cases based upon the product of worker_counts,
prefetch_counts, and consistency_levels. Each test case may be run with up to
4 variations reflecting the use or not of the dc_aware and token_aware
policies. The results are output to stdout as a JSON object.
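The test matrix described above can be sketched with `itertools.product`. This is an illustration only: the function name, the dictionary keys, and the assumption that setting a policy flag runs each case both with and without that policy are guesses, not details taken from run_push.py.

```python
import itertools
import json


def build_test_cases(worker_counts, prefetch_counts, consistency_levels,
                     dc_aware=False, token_aware=False):
    """Return one dict per test case in the cross product of the inputs.

    Assumption: enabling a policy flag adds a with/without variation for
    that policy, giving up to 4 variations per base case.
    """
    dc_options = [False, True] if dc_aware else [False]
    token_options = [False, True] if token_aware else [False]
    cases = []
    for workers, prefetch, level, dc, token in itertools.product(
            worker_counts, prefetch_counts, consistency_levels,
            dc_options, token_options):
        cases.append({
            "worker_count": workers,
            "prefetch_count": prefetch,
            "consistency_level": level,
            "dc_aware": dc,
            "token_aware": token,
        })
    return cases


if __name__ == "__main__":
    plan = build_test_cases([1, 4], [100], ["ONE", "LOCAL_QUORUM"],
                            dc_aware=True, token_aware=True)
    # Emit the plan as a single JSON object on stdout.
    print(json.dumps({"test_cases": plan}, indent=2))
```

With two worker counts, one prefetch count, two consistency levels, and both policy flags set, this yields 2 x 1 x 2 x 4 = 16 cases.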