CDR-Stats : VoIP Analytics Solution for Asterisk and FreeSWITCH with MongoDB
2. Problems to solve
- Millions of Call records
- Multiple sources
- Multiple data formats
- Replication
- Fast Analytics
- Multi-Tenant
- Realtime
- Fraud detection
3. Why MongoDB
- NoSQL - Schema-Less
- Capacity / Sharding
- Upserts
- Replication : Increase read capacity
- Async writes : Millions of entries / acceptable losses
- Compared to CouchDB - native drivers
9. Under the hood
- FreeSWITCH (freeswitch.org)
- Asterisk (asterisk.org)
- Django (djangoproject.com)
- Celery (celeryproject.org)
- RabbitMQ (rabbitmq.com)
- Socket.IO (socket.io)
- MongoDB (mongodb.org)
- PyMongo (api.mongodb.org)
- and more...
10. Our Data - Call Detail Record (CDR)
1) Call info :
2) BSON :
CDR = {
    ...
    'callflow':{
        'caller_profile':{
            'username':'1000',
            'destination_number':'5578193435',
            'ani':'71737224',
            'caller_id_name':'71737224',
            ...
        },
        ...
    },
    'variables':{
        'mduration':'12960',
        'effective_caller_id_name':'Extension 1000',
        'outbound_caller_id_number':'0000000000',
        'duration':'3',
        'end_stamp':'2012-05-23 15:45:12.856527',
        'answer_uepoch':'1327521953952257',
        'billmsec':'12960',
        ...
        'hangup_cause_q850':'20',
        'hangup_cause':'NORMAL_CLEARING',
        'sip_received_ip':'192.168.1.21',
        'sip_from_host':'127.0.0.1',
        'tts_voice':'kal',
        'accountcode':'1000',
        'sip_user_agent':'Blink 0.2.8 (Linux)',
        'answerusec':'0',
        'caller_id':'71737224',
        'call_uuid':'adee0934-a51b-11e1-a18c-00231470a30c',
        'answer_stamp':'2012-05-23 15:45:09.856463',
        'outbound_caller_id_name':'FreeSWITCH',
        'billsec':'66',
        'progress_uepoch':'0',
        'answermsec':'0',
        'sip_via_rport':'60536',
        'uduration':'12959984',
        'sip_local_sdp_str':'v=0\no=FreeSWITCH 1327491731\n'
    },
    ...
}
3) Insert Mongo : db.cdr.insert(CDR);
12. Pre-Aggregate - Daily Collection
Produce data that is easier to manipulate :
current_y_m_d = datetime.strptime(str(start_uepoch)[:10], "%Y-%m-%d")
CDR_DAILY.update({
        'date_y_m_d': current_y_m_d,
        'destination_number': destination_number,
        'hangup_cause_id': hangup_cause_id,
        'accountcode': accountcode,
        'switch_id': switch.id,
    }, {
        '$inc': {
            'calls': 1,
            'duration': int(cdr['variables']['duration']),
        }
    }, upsert=True)
Output db.CDR_DAILY.find() :
{ "_id" : ..., "date_y_m_d" : ISODate("2012-04-30T00:00:00Z"), "accountcode" : "1000", "calls" : 1, "destination_number" : "0045277522", "duration" : 23, "hangup_cause_id" : 9, "switch_id" : 1 }
...
- Faster to query pre-aggregate data
- Upsert is your friend / update if exists - insert if not
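The upsert-with-$inc pattern above can be mimicked in plain Python to see what the daily collection ends up holding. This is only a sketch under assumptions: `upsert_inc` and the sample calls are hypothetical, and an in-memory dict stands in for the CDR_DAILY collection. It illustrates the semantics, not how MongoDB or CDR-Stats implements them.

```python
from datetime import datetime

# In-memory stand-in for the CDR_DAILY collection (assumption: a dict keyed
# by the same fields used in the update selector on the slide).
cdr_daily = {}

def upsert_inc(selector, increments):
    """Emulate update(selector, {'$inc': increments}, upsert=True)."""
    key = tuple(sorted(selector.items()))
    doc = cdr_daily.setdefault(key, dict(selector))  # insert if missing
    for field, amount in increments.items():
        doc[field] = doc.get(field, 0) + amount      # $inc semantics
    return doc

# Hypothetical calls: (start time, destination, hangup cause id, account, duration)
calls = [
    (datetime(2012, 4, 30, 9, 15), "0045277522", 9, "1000", 23),
    (datetime(2012, 4, 30, 17, 40), "0045277522", 9, "1000", 7),
    (datetime(2012, 5, 1, 8, 5), "0045277522", 9, "1000", 12),
]

for start, dest, cause, account, duration in calls:
    day = datetime(start.year, start.month, start.day)  # truncate to midnight
    upsert_inc(
        {"date_y_m_d": day, "destination_number": dest,
         "hangup_cause_id": cause, "accountcode": account, "switch_id": 1},
        {"calls": 1, "duration": duration},
    )

for doc in cdr_daily.values():
    print(doc["date_y_m_d"].date(), doc["calls"], doc["duration"])
```

The two calls on 2012-04-30 collapse into one daily document, while the call on 2012-05-01 creates a second one. Nothing needs to check whether a document exists first.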
13. Map-Reduce - Emit Step
- MapReduce processes data in batches
- Applied to the pre-aggregated daily collection (faster / less data)
map = mark_safe(u'''
    function() {
        emit({
                a_Year: this.date_y_m_d.getFullYear(),
                b_Month: this.date_y_m_d.getMonth() + 1,
                c_Day: this.date_y_m_d.getDate(),
                f_Switch: this.switch_id
            },
            {calldate__count: 1, duration__sum: this.duration})
    }''')
14. Map-Reduce - Reduce Step
The Reduce step is trivial: it simply sums the durations and counts the entries :
reduce = mark_safe(u'''
    function(key, vals) {
        var ret = {
            calldate__count: 0,
            duration__sum: 0,
            duration__avg: 0  // not computed here; stays 0
        };
        // Sum the partial counts and durations emitted for this key
        for (var i = 0; i < vals.length; i++) {
            ret.calldate__count += parseInt(vals[i].calldate__count);
            ret.duration__sum += parseInt(vals[i].duration__sum);
        }
        return ret;
    }
''')
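To make the emit and reduce steps easier to follow, here is a rough Python emulation of the same logic. The function names (`map_doc`, `reduce_vals`, `map_reduce`) and the sample documents are hypothetical and exist only for illustration. MongoDB may call reduce more than once for the same key, so the last lines check that reducing a partial result again yields the same totals.

```python
from collections import defaultdict
from datetime import datetime

def map_doc(doc):
    """Python counterpart of the emit step: returns one (key, value) pair."""
    d = doc["date_y_m_d"]
    key = (d.year, d.month, d.day, doc["switch_id"])
    # Mirrors the slide: one count per daily document, plus its duration
    return key, {"calldate__count": 1, "duration__sum": doc["duration"]}

def reduce_vals(key, vals):
    """Python counterpart of the reduce step: sums counts and durations."""
    ret = {"calldate__count": 0, "duration__sum": 0, "duration__avg": 0}
    for v in vals:
        ret["calldate__count"] += int(v["calldate__count"])
        ret["duration__sum"] += int(v["duration__sum"])
    return ret

def map_reduce(docs):
    """Group emitted values by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        key, value = map_doc(doc)
        grouped[key].append(value)
    return {key: reduce_vals(key, vals) for key, vals in grouped.items()}

# Hypothetical pre-aggregated daily documents
daily_docs = [
    {"date_y_m_d": datetime(2012, 5, 13), "switch_id": 1, "duration": 40},
    {"date_y_m_d": datetime(2012, 5, 13), "switch_id": 1, "duration": 25},
    {"date_y_m_d": datetime(2012, 5, 14), "switch_id": 1, "duration": 10},
]

result = map_reduce(daily_docs)

# Re-reduce check: feed a partial result back into reduce with a new value
key = (2012, 5, 13, 1)
partial = reduce_vals(key, [map_doc(daily_docs[0])[1]])
rereduced = reduce_vals(key, [partial, map_doc(daily_docs[1])[1]])

print(result[key])
print(rereduced == result[key])
```

Because reduce only adds counts and sums, the order and grouping of partial results do not change the totals.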
15. Map-Reduce
Query :
out = 'aggregate_cdr_daily'
calls_in_day = daily_data.map_reduce(map, reduce, out, query=query_var)
Output db.aggregate_cdr_daily.find() :
{ "_id" : { "a_Year" : 2012, "b_Month" : 5, "c_Day" : 13, "f_Switch" : 1 }, "value" : { "calldate__count" : 91, "duration__sum" : 5559, "duration__avg" : 0 } }
{ "_id" : { "a_Year" : 2012, "b_Month" : 5, "c_Day" : 14, "f_Switch" : 1 }, "value" : { "calldate__count" : 284, "duration__sum" : 13318, "duration__avg" : 0 } }
...
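In the output above, duration__avg is still 0 because the reduce step only sums and counts. One way to fill it in afterwards is to divide the sum by the count on the client side. The slides do not say how CDR-Stats derives the average, so the `with_average` helper below is a hypothetical sketch, applied to the two rows shown above.

```python
# Rows copied from the aggregate_cdr_daily output shown on the slide
rows = [
    {"_id": {"a_Year": 2012, "b_Month": 5, "c_Day": 13, "f_Switch": 1},
     "value": {"calldate__count": 91, "duration__sum": 5559, "duration__avg": 0}},
    {"_id": {"a_Year": 2012, "b_Month": 5, "c_Day": 14, "f_Switch": 1},
     "value": {"calldate__count": 284, "duration__sum": 13318, "duration__avg": 0}},
]

def with_average(value):
    """Return a copy of a map-reduce value with duration__avg filled in."""
    count = value["calldate__count"]
    avg = value["duration__sum"] / count if count else 0.0  # avoid dividing by zero
    return dict(value, duration__avg=round(avg, 2))

averaged = [dict(row, value=with_average(row["value"])) for row in rows]

for row in averaged:
    day = row["_id"]
    print(day["a_Year"], day["b_Month"], day["c_Day"], row["value"]["duration__avg"])
```

The average is computed after the reduce step, not inside it, because an average of averages would be wrong whenever reduce is called on partial results.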
17. WAT else...?
- Website : http://www.cdr-stats.org
- Code : github.com/star2billing/cdr-stats
- FOSS / Licensed MPLv2
- Get started : Install script
Try it, it's easy!!!
18. Questions ?
Twitter : @areskib
Email : areski@gmail.com