LarKC (Newsfromthefront 2010)
1. The Large Knowledge Collider
Creative Commons License:
you may share & remix,
with attribution, for non-commercial use only
2. Cool technology
1. Platform for building semantic web workflows
– anytime, lazy pipes, remote/distributed/parallel
– Open source, Sourceforge, Apache license
2. Set of plugins (30+)
– identify datasources,
– transform data-formats,
– select relevant subsets,
– reason over data
3. WebPIE: Fastest reasoner on the planet
– OWL Horst on Hadoop, scales linearly
– UniProt, 1.5B triples, 6hrs (32 machines)
– LUBM, 100B triples, 45hrs (32 machines)
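To put the two WebPIE runs above in perspective, here is a back-of-the-envelope throughput calculation. It is only a sketch: it assumes the quoted triple counts are input sizes and that all 32 machines were busy for the whole run, neither of which the slide states. The function name `throughput` is made up for this illustration.

```python
# Rough throughput for the two WebPIE runs quoted on the slide.
# Assumption: triple counts are input sizes; all machines ran the full duration.
RUNS = {
    "UniProt": {"triples": 1.5e9, "hours": 6, "machines": 32},
    "LUBM": {"triples": 100e9, "hours": 45, "machines": 32},
}


def throughput(triples, hours, machines):
    """Return (cluster-wide, per-machine) triples processed per second."""
    per_second = triples / (hours * 3600)
    return per_second, per_second / machines


for name, run in RUNS.items():
    cluster, per_machine = throughput(**run)
    print(f"{name}: {cluster:,.0f} triples/s overall, {per_machine:,.0f} per machine")
```

The two datasets differ in kind (UniProt is real-world data, LUBM is a synthetic benchmark), so the per-machine rates are not directly comparable.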
“a platform for infinitely scalable reasoning on the data-web”
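The four plugin roles listed above (identify, transform, select, reason) can be pictured as stages of a lazy pipeline. The sketch below is a toy illustration in Python, not the LarKC plugin API: every function name, the `ex:` data, and the two RDFS-style rules are assumptions made for this example, and the naive fixpoint loop stands in for a real reasoner such as WebPIE.

```python
from typing import Iterable, Iterator, Set, Tuple

Triple = Tuple[str, str, str]


def identify() -> Iterator[Triple]:
    """Identify: yield triples from a (toy, in-memory) data source."""
    yield ("ex:Cat ", "rdfs:subClassOf", "ex:Mammal")
    yield ("ex:Mammal", "rdfs:subClassOf", "ex:Animal")
    yield ("ex:Tom", "rdf:type", "ex:Cat")
    yield ("ex:Tom", "ex:name", "Tom")


def transform(triples: Iterable[Triple]) -> Iterator[Triple]:
    """Transform: normalise the data format (here: strip stray whitespace)."""
    for s, p, o in triples:
        yield (s.strip(), p.strip(), o.strip())


def select(triples: Iterable[Triple], predicates: Set[str]) -> Iterator[Triple]:
    """Select: keep only the subset relevant to the reasoning task."""
    for triple in triples:
        if triple[1] in predicates:
            yield triple


def reason(triples: Iterable[Triple]) -> Set[Triple]:
    """Reason: apply two rules (subclass transitivity, type inheritance) to a fixpoint."""
    facts = set(triples)
    while True:
        new = set()
        for a, p1, b in facts:
            for c, p2, d in facts:
                if b != c:
                    continue
                if p1 == "rdfs:subClassOf" and p2 == "rdfs:subClassOf":
                    new.add((a, "rdfs:subClassOf", d))
                if p1 == "rdf:type" and p2 == "rdfs:subClassOf":
                    new.add((a, "rdf:type", d))
        if new <= facts:  # nothing new was derived: closure reached
            return facts
        facts |= new


# The first three stages are generators, so no triple is read until reason() asks for it.
closure = reason(select(transform(identify()), {"rdfs:subClassOf", "rdf:type"}))
for triple in sorted(closure):
    print(triple)
```

Because the stages are generators, data flows through on demand, which is the "lazy pipes" idea in miniature; only the final reasoning step materialises a set.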
3. Cool datasets
• LinkedLifeData.com (life sciences)
– 2.7B explicit, 4.1B closure, 580m things
– 20+ datasources: UniProt, GO, Entrez-Gene, …
• LDSR.ontotext.com (general knowledge)
– 1.3B explicit, 2.2B closure, 400m things
– DBPedia, MusicBrainz, Freebase, Geonames, …
• Interest-enhanced DBLP
– all CS authors with their interest profile
• Interest-enhanced PubMed
– all medical authors with their interest profile (MeSH)
• Milan traffic grid
available as SPARQL endpoints
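Since the datasets above are exposed as SPARQL endpoints, a client only needs to send a query over HTTP. The sketch below builds such a request with the Python standard library, following the SPARQL protocol's `query` parameter for GET requests. The endpoint URL is a hypothetical placeholder (the slide does not give the endpoint paths), and `build_request` is a name invented for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder: substitute the real endpoint URL of the dataset you want.
ENDPOINT = "http://example.org/sparql"

# A generic query that works on any RDF store: fetch ten arbitrary triples.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

To actually run the query, pass `request` to `urllib.request.urlopen` and parse the response body with `json.load`; that step is left out here so the sketch runs without network access.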
4. Cool collaborations
We’re open for business
• Build a pipeline:
– easier to get your application scenario to scale
• Provide a plugin:
– easier integration with components by others,
– wider uptake of your own component by others
• Deploy WebPIE on Amazon EC2:
– all you need is a credit card (actually: it’s cheap)
• Use the datasets:
– SPARQL endpoints on a big machine
We will even help you to do this!