Mining The Social Web, Chapter 3

NAVER 아키텍트를 꿈꾸는 사람들 (NAVER community "People Who Dream of Becoming Architects")
Presenter: 김연기
Mail Boxes
Who sends mail?
Is there a time of day when replies tend to arrive?
Who sends mail most often?
What are the hot topics these days?
Mbox
From santa@northpole.example.org Fri Dec 25 00:06:42 2009
Message-ID: <16159836.1075855377439@mail.northpole.example.org>
References: <88364590.8837464573838@mail.northpole.example.org>
In-Reply-To: <194756537.0293874783209@mail.northpole.example.org>
Date: Fri, 25 Dec 2001 00:06:42 -0000 (GMT)
From: St. Nick <santa@northpole.example.org>
To: rudolph@northpole.example.org
Subject: RE: FWD: Tonight
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Sounds good. See you at the usual location.

Thanks,
-S

-----Original Message-----
From: Rudolph
Sent: Friday, December 25, 2009 12:04 AM
To: Claus, Santa
Subject: FWD: Tonight

Santa -

Running a bit late. Will come grab you shortly. Standby.

Rudy

Begin forwarded message:

> Last batch of toys was just loaded onto sleigh.
>
> Please proceed per the norm.
>
> Regards,
> Buddy
>
> --
> Buddy the Elf
> Chief Elf
> Workshop Operations
> North Pole
> buddy.the.elf@northpole.example.org

From buddy.the.elf@northpole.example.org Fri Dec 25 00:03:34 2009
Message-ID: <88364590.8837464573838@mail.northpole.example.org>
Date: Fri, 25 Dec 2001 00:03:34 -0000 (GMT)
From: Buddy <buddy.the.elf@northpole.example.org>
To: workshop@northpole.example.org
Subject: Tonight
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Last batch of toys was just loaded onto sleigh.

Please proceed per the norm.

Regards,
Buddy

--
Buddy the Elf
Chief Elf
Workshop Operations
North Pole
buddy.the.elf@northpole.example.org
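A file in this format can be read with Python's standard-library `mailbox` module. The following is a minimal sketch, not part of the original slides, that tallies messages per sender as one way of approaching the "who sends mail most often?" question. The abbreviated sample text and the helper names `count_senders` and `demo` are assumptions made for illustration.

```python
import mailbox
import os
import tempfile
from collections import Counter
from email.utils import parseaddr

# Abbreviated, illustrative sample in mbox format (two messages).
SAMPLE_MBOX = """\
From santa@northpole.example.org Fri Dec 25 00:06:42 2009
Message-ID: <16159836.1075855377439@mail.northpole.example.org>
From: St. Nick <santa@northpole.example.org>
To: rudolph@northpole.example.org
Subject: RE: FWD: Tonight

Sounds good. See you at the usual location.

From buddy.the.elf@northpole.example.org Fri Dec 25 00:03:34 2009
Message-ID: <88364590.8837464573838@mail.northpole.example.org>
From: Buddy <buddy.the.elf@northpole.example.org>
To: workshop@northpole.example.org
Subject: Tonight

Last batch of toys was just loaded onto sleigh.
"""


def count_senders(mbox_path):
    """Return a Counter mapping sender address -> number of messages."""
    counts = Counter()
    box = mailbox.mbox(mbox_path)
    try:
        for message in box:
            # parseaddr splits "St. Nick <santa@...>" into (name, address).
            _name, address = parseaddr(message.get("From", ""))
            if address:
                counts[address.lower()] += 1
    finally:
        box.close()
    return counts


def demo():
    """Write the sample to a temporary file and count its senders."""
    with tempfile.NamedTemporaryFile("w", suffix=".mbox", delete=False) as handle:
        handle.write(SAMPLE_MBOX)
        path = handle.name
    try:
        return count_senders(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for address, total in demo().most_common():
        print(address, total)
```

The "From " separator line at the start of each message is what lets the module split the file into messages. The `From:` header inside each message is a separate field.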
Mbox
[
  {
    "From": "St. Nick <santa@northpole.example.org>",
    "Content-Transfer-Encoding": "7bit",
    "To": [
      "rudolph@northpole.example.org"
    ],
    "parts": [
      {
        "content": "Sounds good. See you at the usual location.\n\nThanks,...",
        "contentType": "text/plain"
      }
    ],
    "References": "<88364590.8837464573838@mail.northpole.example.org>",
    "Mime-Version": "1.0",
    "In-Reply-To": "<194756537.0293874783209@mail.northpole.example.org>",
    "Date": "Fri, 25 Dec 2001 00:06:42 -0000 (GMT)",
    "Message-ID": "<16159836.1075855377439@mail.northpole.example.org>",
    "Content-Type": "text/plain; charset=us-ascii",
    "Subject": "RE: FWD: Tonight"
  },
  {
    "From": "Buddy <buddy.the.elf@northpole.example.org>",
    "Content-Transfer-Encoding": "7bit",
    "To": [
      "workshop@northpole.example.org"
    ],
    "parts": [
      {
        "content": "Last batch of toys was just loaded onto sleigh. \n\n...",
        "contentType": "text/plain"
      }
    ],
    "Mime-Version": "1.0",
    "Date": "Fri, 25 Dec 2001 00:03:34 -0000 (GMT)",
    "Message-ID": "<88364590.8837464573838@mail.northpole.example.org>",
    "Content-Type": "text/plain; charset=us-ascii",
    "Subject": "Tonight"
  }
]
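The slides do not show how the mbox text becomes JSON like this. One plausible approach, sketched here with the standard-library `email` package, copies the headers of each message into a dictionary, turns `To` into a list of addresses, and stores each body part under `parts`. The function name `message_to_doc` and the sample text are assumptions for illustration. Repeated headers (such as `Received`) would overwrite one another in this simplified version.

```python
import email
import json
from email.utils import getaddresses

# Illustrative single message (headers plus body), without the mbox "From " line.
RAW_MESSAGE = """\
Message-ID: <88364590.8837464573838@mail.northpole.example.org>
Date: Fri, 25 Dec 2001 00:03:34 -0000 (GMT)
From: Buddy <buddy.the.elf@northpole.example.org>
To: workshop@northpole.example.org
Subject: Tonight
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Last batch of toys was just loaded onto sleigh.

Please proceed per the norm.
"""


def message_to_doc(raw_text):
    """Convert one raw message into a JSON-serializable dictionary."""
    msg = email.message_from_string(raw_text)

    # Copy headers; a later duplicate header replaces an earlier one.
    doc = {key: value for key, value in msg.items()}

    # Store recipients as a list of bare addresses.
    recipients = getaddresses(msg.get_all("To", []))
    doc["To"] = [address for _name, address in recipients if address]

    # Keep every non-container part together with its content type.
    doc["parts"] = [
        {"content": part.get_payload(), "contentType": part.get_content_type()}
        for part in msg.walk()
        if not part.is_multipart()
    ]
    return doc


if __name__ == "__main__":
    print(json.dumps([message_to_doc(RAW_MESSAGE)], indent=2))
```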
Mbox + CouchDB
Storing the messages in a database makes it possible to compute statistics.
CouchDB provides a JSON API.
CouchDB
Document-oriented database server
Provides a JSON API
Views
Schema-free
CouchDB
Install CouchDB on CentOS
  yum install couchdb
  /etc/init.d/couchdb start
CouchDB + Python
Install Couchdb Kit (On CentOS)
  $ curl -O http://peak.telecommunity.com/dist/ez_setup.py
    (see http://pypi.python.org/pypi/setuptools#rpm-based-systems)
  $ sudo python ez_setup.py -U setuptools

Python – CouchDB API
  http://packages.python.org/CouchDB
CouchDB + Python
# -*- coding: utf-8 -*-
import sys
import os
import couchdb
try:
    import jsonlib2 as json
except ImportError:
    import json

JSON_MBOX = sys.argv[1]  # e.g. enron.mbox.json

# Name the database after the file: "enron.mbox.json" -> "enron"
DB = os.path.basename(JSON_MBOX).split('.')[0]

server = couchdb.Server('http://localhost:5984')
db = server.create(DB)

# Load the whole JSON array and insert all documents in one bulk request
docs = json.loads(open(JSON_MBOX).read())
db.update(docs, all_or_nothing=True)
CouchDB - Views
from couchdb.design import ViewDefinition

def dateTimeToDocMapper(doc):
    # Note that you need to include imports used by your mapper
    # inside the function definition
    from dateutil.parser import parse
    from datetime import datetime as dt
    if doc.get('Date'):
        # [year, month, day, hour, min, sec]
        _date = list(dt.timetuple(parse(doc['Date']))[:-3])
        yield (_date, doc)

# Specify an index to back the query. Note that the index won't be
# created until the first time the query is run
view = ViewDefinition('index', 'by_date_time',
                      dateTimeToDocMapper,
                      language='python')
view.sync(db)
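To make the idea behind this view concrete without a running server, the sketch below emulates it in plain Python: each document with a `Date` header is keyed by `[year, month, day, hour, min, sec]`, the rows are sorted by key, and a range lookup plays the role of a `startkey`/`endkey` query. It is an illustration only. It swaps `dateutil` for the standard-library `email.utils.parsedate_to_datetime`, and the names `date_key`, `build_index`, and `query_range` are assumptions, not part of the slides or of CouchDB.

```python
from email.utils import parsedate_to_datetime


def date_key(date_header):
    """Turn an RFC 2822 Date header into [year, month, day, hour, min, sec]."""
    return list(parsedate_to_datetime(date_header).timetuple()[:6])


def build_index(docs):
    """Emulate the view: sorted (key, doc) rows for documents that have a Date."""
    rows = [(date_key(doc["Date"]), doc) for doc in docs if doc.get("Date")]
    rows.sort(key=lambda row: row[0])
    return rows


def query_range(index, start_key, end_key):
    """Return documents whose key lies in [start_key, end_key], inclusive."""
    return [doc for key, doc in index if start_key <= key <= end_key]


# Illustrative documents; the last one has no Date and is skipped by the mapper.
DOCS = [
    {"Subject": "RE: FWD: Tonight", "Date": "Fri, 25 Dec 2001 00:06:42 -0000 (GMT)"},
    {"Subject": "Tonight", "Date": "Fri, 25 Dec 2001 00:03:34 -0000 (GMT)"},
    {"Subject": "No date header"},
]

if __name__ == "__main__":
    index = build_index(DOCS)
    hits = query_range(index, [2001, 12, 25, 0, 5, 0], [2001, 12, 25, 0, 10, 0])
    print([doc["Subject"] for doc in hits])
```

Python compares lists element by element, which matches how equal-length arrays of integers sort as view keys. That ordering is why a list-shaped date key lends itself to range queries.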
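CouchDB – Map/Reduce
from couchdb.design import ViewDefinition

def dateTimeCountMapper(doc):
    # Imports used by the mapper must live inside the function
    from dateutil.parser import parse
    from datetime import datetime as dt
    if doc.get('Date'):
        # Emit 1 per message, keyed by [year, month, day, hour, min, sec]
        _date = list(dt.timetuple(parse(doc['Date']))[:-3])
        yield (_date, 1)

def summingReducer(keys, values, rereduce):
    # Summing works for both the reduce and the rereduce step
    return sum(values)

view = ViewDefinition('index', 'doc_count_by_date_time',
                      dateTimeCountMapper,
                      reduce_fun=summingReducer, language='python')
view.sync(db)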
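The effect of this mapper and reducer can be sketched in plain Python. The example below emits `(key, 1)` per dated document, groups rows by a prefix of the key (similar in spirit to CouchDB's `group_level` query option), and sums each group. It is an emulation under assumptions, not the CouchDB implementation: `run_view` is a hypothetical helper, the third sample date is invented, and `email.utils.parsedate_to_datetime` stands in for `dateutil`.

```python
from collections import defaultdict
from email.utils import parsedate_to_datetime


def date_time_count_mapper(doc):
    """Emit ([year, month, day, hour, min, sec], 1) for documents with a Date."""
    if doc.get("Date"):
        key = list(parsedate_to_datetime(doc["Date"]).timetuple()[:6])
        yield (key, 1)


def summing_reducer(keys, values, rereduce):
    """Sum the emitted values; the same logic serves reduce and rereduce."""
    return sum(values)


def run_view(docs, mapper, reducer, group_level):
    """Group mapped rows by the first `group_level` key elements, then reduce."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            groups[tuple(key[:group_level])].append(value)
    return {key: reducer(None, values, False) for key, values in sorted(groups.items())}


# Two dates from the sample mailbox plus one invented date for contrast.
DOCS = [
    {"Date": "Fri, 25 Dec 2001 00:06:42 -0000 (GMT)"},
    {"Date": "Fri, 25 Dec 2001 00:03:34 -0000 (GMT)"},
    {"Date": "26 Dec 2001 09:15:00 -0000"},
]

if __name__ == "__main__":
    # group_level=3 counts messages per day; group_level=4 counts per hour.
    print(run_view(DOCS, date_time_count_mapper, summing_reducer, 3))
    print(run_view(DOCS, date_time_count_mapper, summing_reducer, 4))
```

Grouping at level 4 is one way to approach the question of which time of day sees the most mail.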
CouchDB – Lucene
A Java-based search engine library
Look Who’s Talking
• Query couchdb-lucene for the IDs of messages that match the search term.
• Find all mail that contains those message IDs.
• From those mails, extract the unique email addresses.
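The last two steps can be sketched in plain Python over already-loaded JSON documents. The couchdb-lucene query is not reproduced here. Its result is stood in for by a hard-coded list of message IDs, which is an assumption, as are the helper names `thread_messages` and `unique_addresses`. "Contains the message ID" is interpreted as either being the matching message or listing it in `References`.

```python
from email.utils import parseaddr


def thread_messages(docs, matching_ids):
    """Return documents that are a search hit or reference one of the hits."""
    wanted = set(matching_ids)
    found = []
    for doc in docs:
        references = set(doc.get("References", "").split())
        if doc.get("Message-ID") in wanted or references & wanted:
            found.append(doc)
    return found


def unique_addresses(docs):
    """Collect the distinct sender and recipient addresses, sorted."""
    addresses = set()
    for doc in docs:
        fields = [doc.get("From", "")] + list(doc.get("To", []))
        for field in fields:
            _name, address = parseaddr(field)
            if address:
                addresses.add(address.lower())
    return sorted(addresses)


# Trimmed versions of the two sample documents.
DOCS = [
    {
        "Message-ID": "<16159836.1075855377439@mail.northpole.example.org>",
        "References": "<88364590.8837464573838@mail.northpole.example.org>",
        "From": "St. Nick <santa@northpole.example.org>",
        "To": ["rudolph@northpole.example.org"],
    },
    {
        "Message-ID": "<88364590.8837464573838@mail.northpole.example.org>",
        "From": "Buddy <buddy.the.elf@northpole.example.org>",
        "To": ["workshop@northpole.example.org"],
    },
]

# Stand-in for the search result, e.g. messages matching the term "toys".
SEARCH_HITS = ["<88364590.8837464573838@mail.northpole.example.org>"]

if __name__ == "__main__":
    for address in unique_addresses(thread_messages(DOCS, SEARCH_HITS)):
        print(address)
```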
Analyzing Mail Data
Getmail
Poplib
Imaplib
Graph Your Inbox
  Google Chrome Extension
