Infovore: An Open Source MapReduce Framework For Processing Graph Data
Paul Houle
9. Scaling Limits of Triple Stores
[Diagram: several CPUs, main memory, and hard drive or flash storage, with the label "Random-access bottleneck".]
13. Preprocessing Freebase
• Expand prefixes
• Remove
  • fbase:type.type.instance
  • fbase:type.type.expected_by
  • rdf:type w/ fbase:* subject
• Reverse
  • fbase:type.permission.controls
  • fbase:dataworld_gardening_hint.replaced_by
• Rewrite
  • fbase:type.object.type to rdf:type
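The preprocessing rules on this slide can be sketched as a single pass over triples. This is a minimal illustration, not Infovore's actual code. It assumes that "Reverse" means swapping subject and object, reads the slide's type predicate as `rdf:type`, assumes the removal of typed `fbase:*` subjects applies to input triples before the rewrite, and uses an assumed URI for the `fbase:` prefix. The function names and the sample identifiers in the usage note are hypothetical.

```python
# Sketch of the slide's preprocessing rules (assumptions noted in comments).
PREFIXES = {
    "fbase": "http://rdf.freebase.com/ns/",  # assumed namespace for fbase:
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(term):
    """Expand a prefixed name such as fbase:type.object.type to a full URI."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term  # literals and unknown prefixes pass through unchanged


RDF_TYPE = expand("rdf:type")

# Predicates whose triples are dropped.
REMOVE = {
    expand("fbase:type.type.instance"),
    expand("fbase:type.type.expected_by"),
}

# Predicates whose subject and object are swapped (assumed meaning of "Reverse").
REVERSE = {
    expand("fbase:type.permission.controls"),
    expand("fbase:dataworld_gardening_hint.replaced_by"),
}

# Predicates that are renamed.
REWRITE = {expand("fbase:type.object.type"): RDF_TYPE}


def preprocess(triples):
    """Yield cleaned (subject, predicate, object) triples."""
    for s, p, o in triples:
        s, p, o = expand(s), expand(p), expand(o)
        if p in REMOVE:
            continue
        # Drop type triples already present in the input with an fbase subject.
        if p == RDF_TYPE and s.startswith(PREFIXES["fbase"]):
            continue
        if p in REVERSE:
            s, o = o, s
        p = REWRITE.get(p, p)
        yield (s, p, o)
```

For example, `list(preprocess([("fbase:m.1", "fbase:type.object.type", "fbase:people.person")]))` yields one triple whose predicate is the expanded `rdf:type`; `fbase:m.1` is a made-up identifier used only for illustration.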
15. sort | uniq
:Surgeon a :Occupation .
:Surgeon rdfs:label "Surgeon"@en .
:Surgeon :mustHave :Md .
:Tree a :Plant .
:Tree rdfs:label "Tree"@en .
:Tree :has :Leaves .
:Victory a :AbstractConcept .
:Victory rdfs:label "Victory" .
:Victory :emotionalTone :Positive .
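The effect of `sort | uniq` on line-oriented triples can be sketched in a few lines: duplicates disappear, and because each line starts with its subject, all statements about one subject end up adjacent. This is an illustrative sketch with hypothetical function names. It compares strings by code point, so the order of lines within one subject may differ from what `sort` produces under a given locale.

```python
from itertools import groupby


def sort_uniq(lines):
    """Return distinct non-empty lines in sorted order, like `sort | uniq`."""
    return sorted({line.strip() for line in lines if line.strip()})


def group_by_subject(sorted_lines):
    """Group already-sorted triple lines by their first token (the subject)."""
    return {
        subject: list(group)
        for subject, group in groupby(
            sorted_lines, key=lambda line: line.split(None, 1)[0]
        )
    }
```

Grouping only works because the input is sorted first: `groupby` merges adjacent lines, so unsorted input would split one subject across several groups.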
17. Pig, Hadoop and All That…
Source: http://www.dbis.informatik.hu-berlin.de/forschung/projekte/query-optimization-in-rdf-databases.html
18. Monitoring for Quality Control
Operational Statistics (RDF)
Stages: Preprocess, Partition, Clean, Sort, Classify, Filter