3. Mbox
From santa@northpole.example.org Fri Dec 25 00:06:42 2009
Message-ID: <16159836.1075855377439@mail.northpole.example.org>
References: <88364590.8837464573838@mail.northpole.example.org>
In-Reply-To: <194756537.0293874783209@mail.northpole.example.org>
Date: Fri, 25 Dec 2001 00:06:42 -0000 (GMT)
From: St. Nick <santa@northpole.example.org>
To: rudolph@northpole.example.org
Subject: RE: FWD: Tonight
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Sounds good. See you at the usual location.

Thanks,
-S

-----Original Message-----
From: Rudolph
Sent: Friday, December 25, 2009 12:04 AM
To: Claus, Santa
Subject: FWD: Tonight

Santa -

Running a bit late. Will come grab you shortly. Standby.

Rudy

Begin forwarded message:

> Last batch of toys was just loaded onto sleigh.
>
> Please proceed per the norm.
>
> Regards,
> Buddy
>
> --
> Buddy the Elf
> Chief Elf
> Workshop Operations
> North Pole
> buddy.the.elf@northpole.example.org

From buddy.the.elf@northpole.example.org Fri Dec 25 00:03:34 2009
Message-ID: <88364590.8837464573838@mail.northpole.example.org>
Date: Fri, 25 Dec 2001 00:03:34 -0000 (GMT)
From: Buddy <buddy.the.elf@northpole.example.org>
To: workshop@northpole.example.org
Subject: Tonight
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Last batch of toys was just loaded onto sleigh.

Please proceed per the norm.

Regards,
Buddy

--
Buddy the Elf
Chief Elf
Workshop Operations
North Pole
buddy.the.elf@northpole.example.org
6. Mbox
[
  {
    "From": "St. Nick <santa@northpole.example.org>",
    "Content-Transfer-Encoding": "7bit",
    "To": [
      "rudolph@northpole.example.org"
    ],
    "parts": [
      {
        "content": "Sounds good. See you at the usual location.\n\nThanks,...",
        "contentType": "text/plain"
      }
    ],
    "References": "<88364590.8837464573838@mail.northpole.example.org>",
    "Mime-Version": "1.0",
    "In-Reply-To": "<194756537.0293874783209@mail.northpole.example.org>",
    "Date": "Fri, 25 Dec 2001 00:06:42 -0000 (GMT)",
    "Message-ID": "<16159836.1075855377439@mail.northpole.example.org>",
    "Content-Type": "text/plain; charset=us-ascii",
    "Subject": "RE: FWD: Tonight"
  },
  {
    "From": "Buddy <buddy.the.elf@northpole.example.org>",
    "Content-Transfer-Encoding": "7bit",
    "To": [
      "workshop@northpole.example.org"
    ],
    "parts": [
      {
        "content": "Last batch of toys was just loaded onto sleigh. \n\n...",
        "contentType": "text/plain"
      }
    ],
    "Mime-Version": "1.0",
    "Date": "Fri, 25 Dec 2001 00:03:34 -0000 (GMT)",
    "Message-ID": "<88364590.8837464573838@mail.northpole.example.org>",
    "Content-Type": "text/plain; charset=us-ascii",
    "Subject": "Tonight"
  }
]
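One way JSON like the above might be produced from an mbox file is sketched below with Python's standard mailbox module. The function name mbox_to_json is hypothetical, and the field layout (a "To" list plus a "parts" list with "content" and "contentType") is an assumption modeled on the JSON shown on this slide, not necessarily the converter used for the deck.

```python
import json
import mailbox
from email.utils import getaddresses


def mbox_to_json(mbox_path):
    """Return a list of JSON-friendly dicts, one per message in the mbox.

    Hypothetical helper: the output layout imitates the slide's JSON.
    """
    box = mailbox.mbox(mbox_path)
    docs = []
    try:
        for msg in box:
            # Copy headers into a dict. Repeated headers (e.g. Received)
            # would overwrite each other in this simplified sketch.
            doc = {name: str(value) for name, value in msg.items()}

            # Store recipients as a list of bare addresses.
            if 'To' in doc:
                doc['To'] = [addr for _, addr in getaddresses([doc['To']])]

            # Collect every non-container MIME part with its content type.
            doc['parts'] = [
                {'content': part.get_payload(),
                 'contentType': part.get_content_type()}
                for part in msg.walk()
                if not part.is_multipart()
            ]
            docs.append(doc)
    finally:
        box.close()
    return docs


def dump_mbox_as_json(mbox_path, json_path):
    """Write the converted messages to json_path as one JSON array."""
    with open(json_path, 'w') as f:
        json.dump(mbox_to_json(mbox_path), f, indent=1)
```

Calling dump_mbox_as_json('enron.mbox', 'enron.mbox.json') would write a file of the kind the CouchDB loading script expects as its argument; both file names here are only illustrative.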
11. couchDB + Python
# -*- coding: utf-8 -*-
import sys
import os
import couchdb
try:
    import jsonlib2 as json
except ImportError:
    import json

JSON_MBOX = sys.argv[1]  # e.g. enron.mbox.json
# Name the database after the file's base name, e.g. "enron"
DB = os.path.basename(JSON_MBOX).split('.')[0]

server = couchdb.Server('http://localhost:5984')
db = server.create(DB)

# Load every message and insert them all in one bulk request
with open(JSON_MBOX) as f:
    docs = json.loads(f.read())
db.update(docs, all_or_nothing=True)
12. couchDB - Views
from couchdb.design import ViewDefinition

def dateTimeToDocMapper(doc):
    # Note that you need to include imports used by your mapper
    # inside the function definition
    from dateutil.parser import parse
    from datetime import datetime as dt
    if doc.get('Date'):
        # [year, month, day, hour, min, sec]
        _date = list(dt.timetuple(parse(doc['Date']))[:-3])
        yield (_date, doc)

# Specify an index to back the query. Note that the index won't be
# created until the first time the query is run.
# db is the database handle created when the documents were loaded.
view = ViewDefinition('index', 'by_date_time',
                      dateTimeToDocMapper,
                      language='python')
view.sync(db)
13. couchDB – Map/Reduce
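from couchdb.design import ViewDefinition

def dateTimeCountMapper(doc):
    # Imports used by the mapper go inside the function definition
    from dateutil.parser import parse
    from datetime import datetime as dt
    if doc.get('Date'):
        # Key: [year, month, day, hour, min, sec]; value: a count of 1
        _date = list(dt.timetuple(parse(doc['Date']))[:-3])
        yield (_date, 1)

def summingReducer(keys, values, rereduce):
    # Summing works for both the reduce and the rereduce pass
    return sum(values)

view = ViewDefinition('index', 'doc_count_by_date_time',
                      dateTimeCountMapper,
                      reduce_fun=summingReducer, language='python')
view.sync(db)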
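
To see what this view computes without a running CouchDB server, the map and reduce steps can be imitated locally. This is only a sketch under stated assumptions: run_map_reduce is a hypothetical helper, its group_level argument mimics CouchDB's group_level query option by truncating the array key, and the standard library's parsedate_to_datetime is swapped in for dateutil so the block has no third-party dependencies.

```python
from collections import defaultdict
from email.utils import parsedate_to_datetime


def date_time_count_mapper(doc):
    """Emit ([year, month, day, hour, min, sec], 1) for each dated doc."""
    if doc.get('Date'):
        parsed = parsedate_to_datetime(doc['Date'])
        yield (list(parsed.timetuple()[:6]), 1)


def summing_reducer(keys, values, rereduce):
    """Same shape as the reducer on the slide: add up the emitted counts."""
    return sum(values)


def run_map_reduce(docs, mapper, reducer, group_level):
    """Hypothetical local stand-in for querying a grouped CouchDB view.

    Keys are truncated to group_level elements before grouping, which
    approximates what CouchDB does for array keys.
    """
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            grouped[tuple(key[:group_level])].append(value)
    return {key: reducer(None, values, False)
            for key, values in sorted(grouped.items())}


if __name__ == '__main__':
    sample_docs = [
        {'Date': 'Fri, 25 Dec 2001 00:06:42 -0000 (GMT)'},
        {'Date': 'Fri, 25 Dec 2001 00:03:34 -0000 (GMT)'},
        {'Subject': 'no date header'},  # skipped by the mapper
    ]
    # group_level=3 counts messages per [year, month, day]
    print(run_map_reduce(sample_docs, date_time_count_mapper,
                         summing_reducer, group_level=3))
```

Raising group_level to 6 groups by the full timestamp, so each of the two sample messages would be counted separately.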