semantic and social intraweb for corporate intelligence and watch
1. ISICIL
semantic and social intraweb for
corporate intelligence and watch
ANR project CONTINT 2009-2011
Fabien Gandon, http://fabien.info
Leader Wimmics research team (INRIA, CNRS, Univ. Nice)
W3C AC Rep. for INRIA
5. ISICIL
reconcile latest viral applications of the web
with formal models and business processes
new tools to support business intelligence
and technological watch
interfaces of web 2.0 app. for interaction
(blog, wikis, social bookmarking, feeds, etc.)
semantic web formalisms and processing
social epistemology as theoretical framework
7. proposed overview…
integrating requirement analysis methods
examples of challenges and derived functionalities
overview of this open-source platform
http://isicil.inria.fr
8. extracts of the requirement analysis and specifications
MERGING METHODOLOGIES
9. usage analysis and specification
analyze and model key business processes
analyze interactions between members of the group
ADEME « roadmap for urban mobility »
campaigns of questionnaires at Orange Labs
trend analysis of intelligence market and watch
comparison of the APIs, widgets and other applications
12. convergence matrix
detections of needs or redundancies in key scenarios
Scenario steps | Identified functionalities | IS functions
Present the problem to the SVIC | Mailing, Q&A | send
Ask what is essential and what other engineers are doing | Expert consultation | extract, filter
Take the request into account | Workflow, collaboration tools | communicate
Prepare queries | Search engine, search query | search
Collect results | Subscription, push… | extract, annotate
Check relevance of results | Analyses, filtering tools | filter
Inform the engineer | E-mail, chat, video conference | send
Take ownership of the results and queries | Search query, profile, tags | annotate, organize
Become the recipient of alerts | Distribution by profile | disseminate
18. SOCIAL TAGGING
collaboratively create
and manage tags to
annotate and
categorize content
20. a crowd of users creating massive
categorizations
21. assisted structuring of folksonomies
[Limpens et al.]
[Figure: from flat web 2.0 folksonomies to a SKOS thesaurus. Example tags: pollutant, energy, pollution, soil pollution. Example relations: related, has narrower.]
22. [Figure: tagging graph]
#Freddy hasBookmark #bk34
#bk34 hasTag #tag92
#tag92 hasLabel "industries"
#Fabien hasBookmark #bk81
#bk81 hasTag #tag27
#tag27 hasLabel "industry"
global giant graph
link users, actions, knowledge, resources, groups, etc.
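A minimal sketch of how such a tagging graph could be held and queried as triples. The identifiers come from the slide; the edge directions, the in-memory set and the helper names `objects` and `labels_used_by` are assumptions made for illustration, not the ISICIL implementation.

```python
# Hypothetical in-memory triple store; edge directions are assumed from the slide layout.
TRIPLES = {
    ("#Freddy", "hasBookmark", "#bk34"),
    ("#bk34", "hasTag", "#tag92"),
    ("#tag92", "hasLabel", "industries"),
    ("#Fabien", "hasBookmark", "#bk81"),
    ("#bk81", "hasTag", "#tag27"),
    ("#tag27", "hasLabel", "industry"),
}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def labels_used_by(user):
    """Follow user -> bookmark -> tag -> label and collect the labels."""
    labels = set()
    for bookmark in objects(user, "hasBookmark"):
        for tag in objects(bookmark, "hasTag"):
            labels |= objects(tag, "hasLabel")
    return labels


freddy_labels = labels_used_by("#Freddy")
fabien_labels = labels_used_by("#Fabien")
```

The two users end up with different labels ("industries" and "industry") for what is probably the same notion, which is the kind of near-duplicate that string metrics over tag labels are meant to catch.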
23. [Limpens et al ]
folksonomies → ontologies contributions…
… [Mika, 2005] hierarchies / community inclusion.
… [Heymann et al., 2006] hierarchies / centrality in graph Tag-Resource
… [Schmitz, 2006] hierarchies / conditional probabilities & co-occurrence
… [Cattuto et al., 2008] [Markines et al., 2009] different metrics
… [Specia et al., 2007] [Begelman et al., 2006] tag clustering
variations around metrics & space (tag-resource-user).
24. [Limpens et al ]
folksonomies + ontologies contributions…
... [Gruber, 2005] [Tanasescu et al., 2007] tagging tags
… [Specia et al., 2007][Cattuto et al., 2008][Giannakidou et al., 2008]
[Ronzano et al., 2008] [Tesconi et al., 2008] automated structuring
using external linguistic resources.
... [Good et al., 2007] manual disambiguation referencing a vocabulary
… [Passant et al., 2007] manual disambiguation referencing a thesaurus
… [Huynh-Kim Bang et al. , 2008] structured tagging “Paris<France”
25. [Limpens et al ]
ontologies → folksonomies contributions…
… [Gruber, 2005] [Newman et al., 2005] ontology of the tagging act
… [Breslin et al., 2005] SIOC resources shared on social web sites
… [Kim et al., 2007] SCOT representing tags and their cloud
… [Passant et al., 2008] MOAT, associating a meaning to a tag.
26. SoA… you are here
Criteria: Computed Tag similarity | Tag-Concept mapping | Sem-Web formalism | Multi-points of view | Users' contrib.
Angeletou et al. (2008): ✓ ✓ ✓
Huynh-Kim Bang et al. (2008): ✓ ✓
Passant & Laublet (2008): ✓ ✓ ✓
Lin & Davis (2010): ✓ ✓ ✓ ✓
Braun et al. (2007): ✓ ✓
Limpens et al. (2010): ✓ ✓ ✓ ✓
29. Comparison of the mean value of the Jaro-Winkler metric for each type of semantic relation
Mean value of the difference s(t1,t2) - s(t2,t1), with s being the Monge-Elkan QGram metric, for each set of tag pairs.
determine thesaurus relations
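The asymmetry s(t1,t2) - s(t2,t1) can be illustrated with a small sketch. It assumes whitespace tokenization and, as the inner similarity, a Dice coefficient over padded character bigrams; the exact q-gram variant and parameters used in the study are not given on the slide, so these choices are assumptions.

```python
def qgrams(token, q=2):
    """Set of character q-grams of a token, padded with '#' on both sides."""
    padded = f"#{token}#"
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def qgram_sim(a, b, q=2):
    """Dice coefficient over q-gram sets (an assumed inner similarity)."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def monge_elkan(t1, t2):
    """Average, over tokens of t1, of the best match among tokens of t2.

    Both tags are assumed to be non-empty strings.
    """
    tokens1, tokens2 = t1.lower().split(), t2.lower().split()
    return sum(max(qgram_sim(a, b) for b in tokens2) for a in tokens1) / len(tokens1)


def asymmetry(t1, t2):
    """s(t1, t2) - s(t2, t1): positive when t1 is 'contained' in t2."""
    return monge_elkan(t1, t2) - monge_elkan(t2, t1)


diff = asymmetry("pollution", "soil pollution")
```

Here every token of "pollution" is fully matched inside "soil pollution" while the reverse is not true, so the difference is positive; a sign like this is a hint that the first tag could be the broader one.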
38. handling conflicts
arbitration rules
IF num(narrower)/num(broader) ≥ c
THEN narrower/broader
ELSE related
purely automatic
conflicting
arbitrated conflict
debated
consensual
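The arbitration rule can be sketched as a small function. The slide does not say what is counted nor give a value for c; the sketch assumes the counts are numbers of users supporting each relation and uses c = 1.5 as a placeholder. The handling of zero counts is also an assumption.

```python
def arbitrate(num_narrower, num_broader, c=1.5):
    """Apply: IF num(narrower)/num(broader) >= c THEN narrower/broader ELSE related.

    num_narrower, num_broader: assumed to be counts of users supporting each relation.
    c: threshold; 1.5 is a placeholder, not a value from the slides.
    """
    if num_narrower == 0:
        # Nobody supports the hierarchical relation (assumed edge case).
        return "related"
    # Written without division so that num_broader == 0 needs no special case.
    if num_narrower >= c * num_broader:
        return "narrower/broader"
    return "related"


decision_clear = arbitrate(6, 2)   # ratio 3.0
decision_split = arbitrate(3, 3)   # ratio 1.0
```

With these inputs, a clear majority keeps the hierarchical relation while an even split falls back to the weaker "related" link.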
42. China: 1 600 million
India: 1 200 million
Facebook: 800 million
43. Graphs, graphs, graphs
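[Figure, social network analysis: a graph of people (Fabien, Michel, Marco, Guillaume, Rémi, Nicolas) illustrating in-degree, $d_{in}(p) = |\{x ; rel(x, p)\}|$, with $d_{in}(Guillaume) = 4$]
[Figure, semantic web: an RDF graph in which Fabien (type Researcher) is author of doc.html, titled "Semantic web is not antisocial"; author is a sub-property of owner and Researcher is a sub-class of Adult, so Fabien is also of type Adult and owner of doc.html]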
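Combining the two views, an in-degree can be restricted to a relation type while taking sub-properties into account. The sketch below is an illustration only: the people's names come from the slide, but the edges, the relation names `colleague`, `friend` and `knows`, and the helper functions are invented for the example.

```python
# Hypothetical typed edges (source, relation, target); not data from the slides.
EDGES = [
    ("Fabien", "colleague", "Guillaume"),
    ("Michel", "colleague", "Guillaume"),
    ("Marco", "friend", "Guillaume"),
    ("Rémi", "friend", "Guillaume"),
    ("Guillaume", "friend", "Nicolas"),
]

# Hypothetical sub-property hierarchy: child relation -> parent relation.
SUB_PROPERTY_OF = {"colleague": "knows", "friend": "knows"}


def is_subproperty(rel, target):
    """True if rel equals target or reaches it by climbing the hierarchy."""
    while rel is not None:
        if rel == target:
            return True
        rel = SUB_PROPERTY_OF.get(rel)
    return False


def in_degree(person, rel):
    """Number of distinct x such that rel(x, person) holds, sub-properties included."""
    return len({x for x, r, y in EDGES if y == person and is_subproperty(r, rel)})


knows_degree = in_degree("Guillaume", "knows")
friend_degree = in_degree("Guillaume", "friend")
```

Asking for `knows` counts all four incoming links, while asking for `friend` only counts two: the same graph gives different centralities depending on the relation type chosen.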
44. [Ereteo et al ]
semantic social network analysis
contributions…
… [Golbeck et al 2003] propagating trust
… [Finin et al 2005] power law of degrees & community structure
… [Paolillo et al 2006] classical SNA on FOAF from LiveJournal
… [Golbeck et Rothstein 2008] merging FOAF profiles
… [Anyanwu et al 2007] [Kochut et al 2007] [Corby et al 2004]
[Corby 2008] [Baget et al, 2007] paths in SPARQL
… [Ereteo et al 2009] type-parameterized SNA and SemTagP
… [Rowe et al. 2011] User Behaviour in Online Communities
51. ipernity.com dataset in RDF
61 937 actors & 494 510 relationships
–18 771 family links between 8 047 actors
–136 311 friend links involving 17 441 actors
–339 428 favorite links for 61 425 actors
etc.
c.f. [Erétéo et al.]
52. some interpretations
validated with managers of ipernity.com
friendOf, favorite, message, comment
small diameter, high density
family as expected: large diameter, low density
favorite: highly centralized around Ipernity animator.
friendOf, family, message, comment: power law of
53. some interpretations
existence of a largest component in all sub networks
"the effectiveness of the social network at doing its job" [Newman 2003]
[Figure: bar chart comparing the number of actors and the size of the largest component for the knows, favorite, friend, family, message and comment sub-networks; vertical axis from 0 to 70 000]
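The size of the largest component of one sub-network can be computed with a breadth-first traversal. This is a generic sketch on made-up edges, treating links as undirected; it is not the SPARQL-based computation used on the ipernity.com data.

```python
from collections import defaultdict, deque


def largest_component_size(edges):
    """Size of the largest connected component, treating edges as undirected."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first traversal of the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


# Hypothetical "family" links: one component of 3 actors and one of 2.
family_links = [("a", "b"), ("b", "c"), ("d", "e")]
family_largest = largest_component_size(family_links)
```

Running it once per relation type (family, friend, favorite, and so on) gives the per-sub-network figures compared on the slide.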
54. e.g. of results: different key actors for different kinds of links
c.f. [Erétéo et al.]
55. PERFORMANCES & LIMITS
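Indicator | Relation | Time | Projections
Comp_rel(G) | Knows | 0.71 s | 494 510
Comp_rel(G) | Favorite | 0.64 s | 339 428
Comp_rel(G) | Friend | 0.31 s | 136 311
Comp_rel(G) | Family | 0.03 s | 18 771
Comp_rel(G) | Message | 1.98 s | 795 949
Comp_rel(G) | Comment | 9.67 s | 2 874 170
D_rel,1(y) | Knows | 20.59 s | 989 020
D_rel,1(y) | Favorite | 18.73 s | 678 856
D_rel,1(y) | Friend | 1.31 s | 272 622
D_rel,1(y) | Family | 0.42 s | 37 542
D_rel,1(y) | Message | 16.03 s | 1 591 898
D_rel,1(y) | Comment | 28.98 s | 5 748 340
Shortest paths used to calculate Cb_rel(b) | Knows, path length <= 2 | 14 m 50.69 s | 100 000
Shortest paths used to calculate Cb_rel(b) | Knows, path length <= 2 | 2 h 56 m 34.13 s | 1 000 000
Shortest paths used to calculate Cb_rel(b) | Knows, path length <= 2 | 7 h 19 m 15.18 s | 2 000 000
Shortest paths used to calculate Cb_rel(b) | Favorite, path length <= 2 | 5 h 33 m 18.43 s | 2 000 000
Shortest paths used to calculate Cb_rel(b) | Friend, path length <= 2 | 1 m 12.18 s | 1 000 000
Shortest paths used to calculate Cb_rel(b) | Friend, path length <= 2 | 2 m 7.98 s | 2 000 000
Shortest paths used to calculate Cb_rel(b) | Family, path length <= 2 | 27.23 s | 1 000 000
Shortest paths used to calculate Cb_rel(b) | Family, path length <= 2 | 2 m 9.73 s | 3 681 626
Shortest paths used to calculate Cb_rel(b) | Family, path length <= 3 | 1 m 10.71 s | 1 000 000
Shortest paths used to calculate Cb_rel(b) | Family, path length <= 4 | 1 m 9.06 s | 1 000 000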
62. hierarchical algorithms
output dendrograms of larger and larger
communities from top to bottom.
• agglomerative algorithms [Donetti &
Munoz 2004] [Zhou & Lipowsky 2004]
[Xu et al 2007] [Newman 2004]
• divisive algorithms [Girvan & Newman
2002] [Radicchi et al 2004]
[Eretéo et al., 2011]
63. heuristic based algorithms
• similarity with electrical networks [Wu 2004]
• random walk [Dongen 2000] [Pons et al 2005]
• label propagation [Raghavan et al 2007]
[Eretéo et al., 2011]
66. tags to detect and label communities
extension of algorithm RAK/LP :
from random labels to structured tags
[Figure: users tagged with rugby, foot, hockey, movie, salt, pepper, mustard, water and wine; the detected communities are labelled sport and condiment]
[Eretéo et al., 2011]
67. experimented algorithm
Algorithm SemTagP(RDFGraph network, Type relationType)
  DO
    old_network = copy of network
    FOREACH user IN network.users
      user.tag = mostUsedNeighborTag(user, relationType)
    END FOREACH
  WHILE modularity(network) > modularity(old_network)
  RETURN old_network
inject semantics here
[Eretéo et al., 2011]
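The loop above can be sketched in plain Python. This is an illustration under several assumptions, not the authors' implementation: the network is an undirected edge list with a single relation type, the thesaurus is a one-level `broader` mapping applied when counting neighbour tags, ties are broken alphabetically, and modularity is the standard undirected definition. The toy graph and tags are invented.

```python
from collections import Counter


def modularity(edges, tag_of):
    """Undirected modularity of the partition induced by users' tags."""
    m = len(edges)
    inside, degree = Counter(), Counter()
    for a, b in edges:
        degree[tag_of[a]] += 1
        degree[tag_of[b]] += 1
        if tag_of[a] == tag_of[b]:
            inside[tag_of[a]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def most_used_neighbor_tag(user, neighbors, tag_of, broader):
    """Most frequent neighbour tag, counting each tag under its broader concept."""
    counts = Counter(broader.get(tag_of[n], tag_of[n]) for n in neighbors[user])
    top = max(counts.values())
    return min(tag for tag, count in counts.items() if count == top)  # alphabetical tie-break


def sem_tag_p(edges, initial_tags, broader, max_rounds=100):
    """Propagate tags until modularity stops improving; return the last improving state."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    tags = dict(initial_tags)
    for _ in range(max_rounds):  # safety bound, not part of the slide's pseudocode
        old_tags = dict(tags)
        for user in sorted(neighbors):
            tags[user] = most_used_neighbor_tag(user, neighbors, tags, broader)
        if modularity(edges, tags) <= modularity(edges, old_tags):
            return old_tags
    return tags


# Hypothetical toy network: two triangles joined by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
initial = {"a": "sweetwiki", "b": "wiki", "c": "mediawiki",
           "d": "inria", "e": "mobile", "f": "mobile"}
broader = {"sweetwiki": "wiki", "mediawiki": "wiki"}  # assumed one-level thesaurus

communities = sem_tag_p(edges, initial, broader)
```

On this toy input the users of the first triangle converge on `wiki` and those of the second on `mobile`. Without the `broader` mapping, `sweetwiki`, `wiki` and `mediawiki` would compete as unrelated labels.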
68. semantic tag propagation
exploit folksonomy for label assignment
[Figure: network of users a to g, each carrying a tag among wiki, mobile, inria and sweetwiki]
[Eretéo et al., 2011]
69. semantic tag propagation
apply social pressure of RAK/LP
[Figure: the same network of users a to g, tagged wiki, mobile, inria and sweetwiki]
[Eretéo et al., 2011]
70. semantic tag propagation
take thesaurus into account in propagating
[Figure: thesaurus fragment in which wiki has skos:narrower sweetwiki and mediawiki, shown with the network of users a to g tagged wiki, mobile, inria and sweetwiki]
[Eretéo et al., 2011]
71. semantic tag propagation
take thesaurus into account in propagating
[Figure: the same thesaurus fragment (wiki skos:narrower sweetwiki, mediawiki) and network of users a to g; one user's tag has changed from mobile to wiki]
[Eretéo et al., 2011]
72. semantic tag propagation
etc. leading to 2 communities
[Figure: final network of users a to g, in which only the tags wiki and mobile remain]
[Eretéo et al., 2011]
91. to know more
deployment & test campaign (4… 20… +) .
deliverables and publications
http://isicil.inria.fr
open source code on INRIA forge
https://gforge.inria.fr/projects/isicil/
models
http://ns.inria.fr/
93. social web 2.0
epistemology semantic web
theoretical framework
extensible models
process and interaction
services and interfaces
94. tomorrow, he who controls the metadata
controls the web.
@fabien_gandon
http://fabien.info
95. What is WWW2012?
21st International World Wide Web Conference
an “A-rated” scientific conference
~12% acceptance & 1000-1500 participants
Lyon, France, from 16th to 20th April 2012
RESEARCHERS
USERS INDUSTRIALS
www2012.org @www2012Lyon