NoSQL & Dq2 Tracer Service
           Donal Zang (IHEP)
            PH‐ADP‐DDM

  ATLAS Software & Computing Workshop
              July 21, 2011




             ph-adp-ddm-lab@cern.ch
Content
• DQ2 tracer service
• NoSQL experience




DQ2 tracer service
• Records relevant information about dataset/file access and 
  usage on the Grid
   – type, status, local site, remote site, file size, time, usrdn, etc. (an example record follows this list)
• Used by dq2 client tools (dq2-get, dq2-put) and other applications
  (PanDA, Athena)
• Traces can be analyzed for many purposes
   –   dataset popularity (popularity.cern.ch by Angelos)
   –   DDM simulations
   –   User behavior analysis
   –   DDM system monitoring
   –   …
• There are ~5 million traces every day
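For illustration, a hypothetical trace record carrying the fields listed above (the exact field names used in production may differ):

    trace = {
        'eventType':   'get',                    # operation type (dq2-get)
        'clientState': 'DONE',                   # status of the operation
        'localSite':   'CERN-PROD_DATADISK',     # local site
        'remoteSite':  'TOKYO-LCG2_DATADISK',    # remote site
        'filesize':    1073741824,               # file size in bytes
        'timeStart':   1311196995.64,            # unix timestamp
        'usrdn':       '/DC=ch/DC=cern/CN=...',  # user DN (placeholder)
    }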

Tracer monitoring use cases
•   Whole system monitoring (real time)
     – local‐read, local‐write, remote‐read, remote‐write 
     – failed‐operation
     – breakdown by applications, dataset types, sites, DNs
•   Dq2‐get statistics in DDM dashboard (real time)
     – transfer rate in files and GB from each DDM endpoint or from each site/SE
     – https://savannah.cern.ch/support/?121744
•   Specific reports (monthly/yearly)
     – Get the amount of data downloaded via dq2-get, per dataset type, per destination, per
       domain, per DN
     – For all endpoints, get the number of dq2-get operations, broken down by
       distinct user
     – For all groupdisk endpoints, give the number of all operations, reads, writes,
       local-reads, remote-reads and distinct users, broken down by application




Problem
• All these use cases need aggregation (count,
  sum) queries
• On the production Oracle, these usually take tens
  of minutes or even hours
• These queries place a significant I/O workload
  on Oracle
• The aggregation metrics can be highly dynamic
  and numerous
• We want to do the analysis in real time
Possible ways
1. Can we just store the traces in a table (Oracle
   or NoSQL) and run ad-hoc queries on it
   whenever we need to?
2. If not, we may need to pre-compute the traces
   and store indexes or counters, and query
   those instead



NoSQL ‐ Cassandra
•   About Cassandra
    – A distributed database, bringing together Dynamo's fully distributed design 
      and Bigtable's ColumnFamily‐based data model. 
    – Apache open source
•   Some concepts
    –   Column-based data model
    –   Replication factor (N)
    –   Eventual consistency (reads see the latest write when R + W > N; see the sketch after this list)
    –   Partitioning (order-preserving vs. random)
         • An order-preserving partitioner may cause data imbalance between nodes and requires manual
           rebalancing
         • A random partitioner balances very well, but loses the ability to do range queries on keys
    –   MemTable and SSTable
         • Memory >> disk
         • Sequential >> random I/O
    –   Commit log
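To make the R + W > N point concrete, a minimal pycassa sketch (node addresses are placeholders; the keyspace and column family follow the example on the next slide). With replication factor N = 3, QUORUM writes (W = 2) plus QUORUM reads (R = 2) give R + W = 4 > 3, so a read always sees the latest write:

    import pycassa
    from pycassa.cassandra.ttypes import ConsistencyLevel

    pool = pycassa.ConnectionPool('DDMTracer', ['node1:9160', 'node2:9160'])
    traces = pycassa.ColumnFamily(pool, 't_traces')

    # Write and read at QUORUM so that R + W > N holds
    traces.insert('1311196995640667', {'eventType': 'get'},
                  write_consistency_level=ConsistencyLevel.QUORUM)
    row = traces.get('1311196995640667',
                     read_consistency_level=ConsistencyLevel.QUORUM)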



Data model in Cassandra
•   Column
         (name,value,timestamp)
•   Row
         key:{column1,column2,…}
•   Column family
    – Something like a table in a relational database
•   Keyspace
    – Usually one application has one keyspace
•   Example
    Keyspace: DDMTracer
    Column family:
    t_traces = {
      '1311196995640667': {
         'eventType': 'get',
         'localSite': 'CERN-PROD_DATADISK',
         ...
      },
      ...
    }
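A hedged sketch of writing and reading one such trace row with pycassa (the row key is a microsecond timestamp, matching the example key above; the node address is a placeholder):

    import time
    import pycassa

    pool = pycassa.ConnectionPool('DDMTracer', ['node1:9160'])
    t_traces = pycassa.ColumnFamily(pool, 't_traces')

    # Row key: current time in microseconds, like 1311196995640667 above
    trace_id = str(int(time.time() * 1000000))
    t_traces.insert(trace_id, {
        'eventType': 'get',
        'localSite': 'CERN-PROD_DATADISK',
    })
    print t_traces.get(trace_id)   # OrderedDict of the columns just written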


Test results ‐ write performance
•   Using multi-mechanize; run time: 10 minutes, ramp-up: 5 s
•   Row-by-row insertion, each row ~3 KB
•   Tried 2*5, 4*5, 8*5, 16*5 threads, 1 connection per thread




       [Throughput plots: Oracle INTR (8*5 threads), Oracle RDTEST1 (16*5 threads), MongoDB (8*5 threads), Cassandra (16*5 threads)]

    https://svnweb.cern.ch/trac/dq2/wiki/Oracle%20and%20NOSQL%20performance%20study#Writeperformance
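A rough sketch of what the Cassandra virtual-user script for this test might look like; multi-mechanize repeatedly calls run() on a Transaction class in each worker thread, and the script name, node address and payload layout here are assumptions:

    # v_cassandra_insert.py -- hypothetical multi-mechanize user script
    import uuid
    import pycassa

    class Transaction(object):
        def __init__(self):
            self.pool = pycassa.ConnectionPool('DDMTracer', ['node1:9160'])
            self.cf = pycassa.ColumnFamily(self.pool, 't_traces')

        def run(self):
            # one ~3 KB trace row per transaction, as in the test setup
            self.cf.insert(uuid.uuid1().hex, {
                'eventType': 'get',
                'payload': 'x' * 3000,
            })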
Test results ‐ query performance
•   Migrate one month’s traces (90,578,231 rows / 34 gigabytes) to a test table
•   Query 1
      – Get the total number of traces
•   Query 2
      – For each '%GROUPDISK%' endpoint, get the "Total Traces", "Write Traces" and "Total Users" for
        the last month

      Query     Oracle INTR   Oracle RDTEST1   Oracle RDTEST1 cache   Oracle production ADCR   Cassandra
      Query 1   39 seconds    30 seconds       ~1 second              1.14 hours               2.2 minutes
      Query 2   47 seconds    30 seconds       ~3 seconds             >5 hours                 28.3 minutes

•   Notes on Oracle
      –    Thanks to Luca
      –    INTR and RDTEST1 use parallel sequential reads from disk:
               •   /*+ parallel (t 16) */
      –    On RDTEST1 the current IO setup reads at ~1.5 GB/s
      –    In the RDTEST1 cache test, the whole 34 GB table was held in cache
•   Notes on Cassandra
      –    9 nodes, default settings
      –    Random partitioning: good for data balance between nodes, bad for range queries on keys
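For concreteness, a hedged sketch of what Query 2 might look like on the relational side; the column names (localsite, eventtype, usrdn, timestart) and the connection string are assumptions, while the parallel hint is the one quoted in the notes above:

    import cx_Oracle

    QUERY2 = """
        SELECT /*+ parallel (t 16) */
               localsite,
               COUNT(*)                                               AS total_traces,
               SUM(CASE WHEN eventtype LIKE 'put%' THEN 1 ELSE 0 END) AS write_traces,
               COUNT(DISTINCT usrdn)                                  AS total_users
          FROM t_traces t
         WHERE localsite LIKE '%GROUPDISK%'
           AND timestart > SYSDATE - 30
         GROUP BY localsite
    """

    conn = cx_Oracle.connect('user/password@adcr')   # placeholder credentials
    cur = conn.cursor()
    cur.execute(QUERY2)
    for row in cur:
        print row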




Conclusion
• For large amounts of data, aggregation usually involves lots
  of disk I/O, is very slow, and has a significant impact on
  Oracle
• Ad-hoc queries on both Oracle (production) and Cassandra
  don't satisfy our needs
• Oracle 11g on RDTEST1 performs well, and we look forward to it in
  production, but
   – queries would still affect Oracle performance; do we need separate instances?
   – for even larger data sets (e.g. 1 year), queries would still be slow
• I tried another way: exploit the high insertion rate to get
  faster queries
   – build up many pre-defined indexes (slide 12)
   – use distributed counters (slide 13)

Use column family to build index
• Query test
   – Query: get the count and sum of traces, grouped by site and
     eventType, in a specific time period
   – Use a Cassandra CF to build indexes like
      {'site:eventType:traceID' : filesize}
   – Cassandra data model
      t_index = {
          '2011052017:remoteSite:eventType': {
                 'CERN-PROD_DATADISK:put_sm:1304514380628696'  : 23444,
                 'CERN-PROD_DATADISK:get:1304514380628697'     : 32232,
                 'CERN-PROD_GROUPDISK:put_sm:1304514380628696' : 43122,
                 ...
              },
          ....
        }

   – Query results
      Oracle (production, ADCR): 48 minutes (query t_traces)
      Cassandra (CF as index):   10 seconds (query the index)
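This works because columns within a row are stored sorted by name, so the per-site, per-eventType aggregation becomes a single column slice. A hedged pycassa sketch (the key and column layout follow the model above; the '~' upper bound is an assumption for an ASCII comparator):

    import pycassa

    pool = pycassa.ConnectionPool('DDMTracer', ['node1:9160'])
    t_index = pycassa.ColumnFamily(pool, 't_index')

    # All 'get' traces on CERN-PROD_DATADISK in the hour bucket 2011052017;
    # a start/finish pair over the sorted column names selects the group
    cols = t_index.get('2011052017:remoteSite:eventType',
                       column_start='CERN-PROD_DATADISK:get:',
                       column_finish='CERN-PROD_DATADISK:get:~',
                       column_count=1000000)

    count = len(cols)                  # number of traces in the group
    total_bytes = sum(cols.values())   # column values hold the file sizes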


Use distributed counters
•   The process (a sketch of an agent follows below)
     – Agents read trace messages from the queue (ActiveMQ)
     – Buffer N (=10) messages
     – Increment the corresponding counters in
       Cassandra
•   This structure is simple
     – All components are scalable
       (distributed)
     – Persistence is provided by the MQ server
       and Cassandra
     – Trace messages do not need to
       arrive in time order
•   High performance on both write and
    read
     – Can sustain >10,000 updates per second
     – A query usually takes less than 0.1
       second
     – We can replay history to add new
       counters over old data quickly

[Diagram: DQ2 tracer infrastructure — trace messages flow from the ActiveMQ brokers to a set of agents, which increment counters in the Cassandra cluster]
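A hedged sketch of such an agent, assuming Cassandra counter columns (native counters arrived with Cassandra 0.8) via pycassa and a STOMP connection to ActiveMQ via stomp.py; the queue name, message format and counter layout are all assumptions, and the stomp.py calls follow its older 3.x API:

    import json
    import stomp     # stomp.py STOMP client for ActiveMQ
    import pycassa

    pool = pycassa.ConnectionPool('DDMTracer', ['node1:9160'])
    counters = pycassa.ColumnFamily(pool, 't_counters')  # hypothetical counter CF

    class TraceListener(stomp.ConnectionListener):
        def __init__(self):
            self.buffer = []

        def on_message(self, headers, body):
            self.buffer.append(json.loads(body))
            if len(self.buffer) >= 10:     # buffer N(10) messages, then flush
                self.flush()

        def flush(self):
            for t in self.buffer:
                # one counter column per (site, eventType) in an hourly bucket;
                # the trace field names are assumptions
                row = '%s:remoteSite:eventType' % t['hour']   # e.g. '2011052017'
                col = '%s:%s' % (t['remoteSite'], t['eventType'])
                counters.add(row, col)                        # increment by 1
            self.buffer = []

    conn = stomp.Connection([('activemq-host', 61613)])
    conn.set_listener('trace-agent', TraceListener())
    conn.start()
    conn.connect()
    conn.subscribe(destination='/queue/traces', ack='auto')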
Some monitoring plots from counters
[Pie charts: count of dq2-get by data type, June 2011 (user, NTUP, other, AOD, ESD, TAG) and count of dq2-get by destination site, June 2011 (CERN-PROD, ROAMING, TOKYO-LCG2, UKI-SOUTHGRID-OX-HEP, unidentified_BNL, DESY-HH)]

            • Ref. Eric’s talk
            • Will provide a general API for DDM Monitoring




Thanks!
Questions?




Backup - Test-bed setup
•   MongoDB (2 nodes)
     –    Hardware type: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (2/8), 24098 MB (6142 MB)
     –    MongoDB version: 1.8.1 (latest stable)
•   Cassandra (9 nodes cluster)
     –    Hardware type: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (2/8), 24098 MB (28662 MB)
     –    Cassandra version: apache‐cassandra‐0.7.6‐2
     –    Puppet configuration:  https://github.com/ddmlab/cassandra
•   Oracle 
     –   Hardware type: Intel(R) Xeon(R) CPU L5640 @ 2.27GHz (2/12), 48290 MB (16387 MB)
     –   Storage: ASM and 8Gbps dual‐ported HBAs. 2 storage arrays, 24 SAS disks in total. NAS on 
       10GigE also available.
     –   Oracle version: 11g
     –   DB_name: rdtest1

     –    Hardware type: Intel(R) Xeon(R) CPU L5520 @ 2.27GHz (23/8), 24097 MB
     –    Storage: ASM and 4Gbps dual-ported HBAs. 3 storage arrays, 36 SATA disks in total.
     –    Oracle version: 10g
     –    DB_name: intr



