Integrating Chemistry Scholarship with Web Architectures, Grid Computing and Semantic Web
Sashi Kiran Challa, Marlon Pierce, Suresh Marru
Indiana University, Bloomington

Microsoft Research’s ORECHEM Project
“A collaboration between chemistry scholars and information scientists to develop and deploy the infrastructure, services, and applications to enable new models for research and dissemination of scholarly materials in the chemistry community.”
http://research.microsoft.com/en-us/projects/orechem/

OAI-ORE and ORE-Chem
Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources.
It is based around the ORE Model, which introduces the Resource Map (ReM). A ReM makes it possible to associate an identity with an aggregation of resources and to make assertions about its structure and semantics.
ReMs are expressed in Atom/XML, RDF/XML, N3, and Turtle formats.
We want to use and extend this to describe all aspects of crystallography experiments: publication links and metadata, and data.

[Figure labels: Citations; Figures; Tables; Chunks; Reactions; Molecular Compounds; NMR Spectra and Structural Data; Experiment data. Partners: Southampton, PSU, Cambridge, Indiana. Services; triple store on the Azure cloud. Source: Carl Lagoze’s OreCHEM eScience presentation slides.]

Our Objective
To build a pipeline to:
1) Fetch ATOM feeds.
2) Transform ATOM feeds into triples and store them in a triple store (using GRDDL/Saxon HE).
3) Extract crystallographically obtained 3D coordinate information.
4) Submit compute-intensive electronic structure calculations and geometry optimization tasks to tools like Gaussian09 on TeraGrid.
5) Transform the Gaussian output into triples and store them in a triple store.

OREChem-Computation Workflow
[Workflow diagram: ATOM feeds from eCrystals or CrystalEye; extract moiety feeds in CML format; moiety files; convert CML to Gaussian input format; Gaussian on TeraGrid; Gaussian output to RDF triples; N3 files or RDF/XML; triple store. Legend: Implemented; Yet to Implement; From Partners.]

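The "Convert CML to Gaussian input format" step can be pictured with a minimal Java sketch. It assumes the atoms have already been parsed out of the CML file; the class name GaussianInputSketch, the Atom type, the route line, and the water coordinates are all hypothetical illustrations and are not taken from the OREChem services. The sketch only shows the general layout of a Gaussian input deck: route section, title, charge and multiplicity, Cartesian coordinates, and a closing blank line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper, not the OREChem converter: formats atoms as a Gaussian input deck. */
final class GaussianInputSketch {

    /** One atom with Cartesian coordinates in angstroms. */
    static final class Atom {
        final String element;
        final double x, y, z;

        Atom(String element, double x, double y, double z) {
            this.element = element;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Builds the text of an input deck from already-parsed atoms. */
    static String toGaussianInput(String route, String title, int charge,
                                  int multiplicity, List<Atom> atoms) {
        StringBuilder sb = new StringBuilder();
        sb.append(route).append("\n\n");   // route section, then a blank line
        sb.append(title).append("\n\n");   // title section, then a blank line
        sb.append(charge).append(' ').append(multiplicity).append('\n');
        for (Atom a : atoms) {
            // element symbol followed by x, y, z
            sb.append(String.format(Locale.ROOT, "%-2s %12.6f %12.6f %12.6f\n",
                    a.element, a.x, a.y, a.z));
        }
        sb.append('\n');                   // blank line closes the molecule specification
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative coordinates only; a real run would read them from a moiety CML file.
        List<Atom> water = Arrays.asList(
                new Atom("O", 0.0, 0.0, 0.117),
                new Atom("H", 0.0, 0.757, -0.471),
                new Atom("H", 0.0, -0.757, -0.471));
        // The route line is an assumed example of a geometry optimization request.
        System.out.print(toGaussianInput("# opt hf/6-31g(d)", "water (sketch)", 0, 1, water));
    }
}
```

Keeping the formatter separate from CML parsing means the same method could be reused behind a REST endpoint or a command-line tool.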
RESTful Web services
* A URI for each resource.
* HTTP GET/POST/PUT/DELETE.
* Very easy to build one using Java APIs: JAX-RS and Jersey (server and client).

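Jersey Skeleton Methods

    import javax.ws.rs.DefaultValue;
    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.Produces;
    import javax.ws.rs.QueryParam;
    // @Singleton and JSONArray come from the Jersey distribution in use.

    @Singleton
    @Path("/cml3d")
    public class MoietyHarvester {

        // GET /cml3d/csv: harvested feed entries as plain text
        @GET
        @Path("/csv")
        @Produces("text/plain")
        public String harvestfeeds(@QueryParam("harvester") String harvester,
                                   @DefaultValue("10") @QueryParam("numofentries") String num_entries) {
            .........
        }

        // GET /cml3d/json: the same entries as a JSON array
        @GET
        @Path("/json")
        @Produces("application/json")
        public JSONArray harvestfeedsJSON(@QueryParam("harvester") String harvester,
                                          @DefaultValue("10") @QueryParam("numofentries") String num_entries) {
            ..........
        }
    }

http://gf18.ucs.indiana.edu/FeedsHarvester/cml3d/csv?parameters
http://gf18.ucs.indiana.edu/FeedsHarvester/cml3d/json?parameters
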
ORECHEM REST Services
http://gf18.ucs.indiana.edu:8146/FeedsHarvester/cml3d/csv?harvester=moiety&numofentries=5
http://gf18.ucs.indiana.edu:8146/CML2GaussianSemCompChem/gauss/inputgenerator

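Testing Services (Jersey Client API)

    import javax.ws.rs.core.MediaType;
    import com.sun.jersey.api.client.Client;
    import com.sun.jersey.api.client.WebResource;

    public class JerseyClient {
        public static void main(String[] args) {
            Client client = Client.create();
            // Resource for the CML-to-Gaussian input generator service
            WebResource cml2gauss = client.resource(
                    "http://localhost:8080" + "/CML2GaussianSemCompChem/gauss/inputgenerator");
            // URL of the moiety CML file to convert
            String cmlfileURL = "http://gridfarm018.ucs.indiana.edu/"
                    + "orechem/moieties/ic0620900sup1_comp9_"
                    + "moiety_1.complete.cml.xml";
            // POST the CML file URL; the response is printed as the Gaussian input URL
            String gaussURL = cml2gauss
                    .accept(MediaType.TEXT_PLAIN_TYPE, MediaType.APPLICATION_XML_TYPE)
                    .post(String.class, cmlfileURL);
            System.out.println(gaussURL);
        }
    }
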
TeraGrid
OREChem Workflow in XBaya

Triple Store
A triple store is a framework for storing and querying RDF data. It provides a mechanism for persistent storage of, and access to, RDF graphs.
Commercial: AllegroGraph, BigOWLIM, Virtuoso
Open Source: Jena SDB, Sesame, Virtuoso, Intellidimension

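To make the idea of storing and querying triples concrete, here is a minimal in-memory Java sketch. It is not how Virtuoso or any of the listed products work internally; the class TinyTripleStore, its methods, and the ex: identifiers are hypothetical. It shows only the core idea: each statement is a (subject, predicate, object) triple, and a query is a pattern in which some positions are left open.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, in-memory illustration of a triple store; real stores add persistence and indexing. */
final class TinyTripleStore {

    /** One RDF-style statement: subject, predicate, object. */
    static final class Triple {
        final String s, p, o;

        Triple(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    void add(String s, String p, String o) {
        triples.add(new Triple(s, p, o));
    }

    /** Returns all triples matching the pattern; a null argument acts as a wildcard. */
    List<Triple> match(String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            boolean hit = (s == null || s.equals(t.s))
                    && (p == null || p.equals(t.p))
                    && (o == null || o.equals(t.o));
            if (hit) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TinyTripleStore store = new TinyTripleStore();
        // Made-up identifiers, for illustration only.
        store.add("ex:moiety1", "ex:partOf", "ex:crystal1");
        store.add("ex:moiety1", "ex:hasEnergy", "ex:energy1");
        store.add("ex:moiety2", "ex:partOf", "ex:crystal1");
        // "Which things are part of ex:crystal1?"
        for (Triple t : store.match(null, "ex:partOf", "ex:crystal1")) {
            System.out.println(t.s);
        }
    }
}
```

A SPARQL triple pattern such as `?s ex:partOf ex:crystal1` plays the same role as the `match(null, ...)` call here.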
Virtuoso Triple Store
* An ORDBMS extended into a triple store.
* Command-line loaders; isql utility (interactive SQL access to a database).
* Support for SPARQL, and a web server for performing SPARQL queries.
* Uploading of data over HTTP or through a WebDAV browser.
http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSRDFWP

What’s in the Triple Store
RDF graph:
* Experiments performed on a particular crystal.
* Journal articles containing this crystal (research groups working with the crystal).
* Moieties in the crystal, their energies, geometries, vibrational frequencies, etc.
All of this information in the triple store can be queried using a single GRAPH IRI.

Virtuoso Triple Store
GRAPH IRI: used to perform SPARQL queries on the RDF triples.
* Unique for every file uploaded:
  http://local.virt/DAV/home/schalla/rdf_sink/oreatomfeed_102.rdf
* A common GRAPH IRI for all the data uploaded into rdf_sink (virt:rdf_graph, virt:rdf_sponger):
  http://localhost:8890/DAV/home/schalla/rdf_sink/

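As a rough illustration of how a GRAPH IRI scopes a query, the following Java sketch builds a SPARQL SELECT restricted to one named graph and encodes it as the `query` parameter of a GET request, as the SPARQL protocol defines. The class GraphQuerySketch and its method names are hypothetical. The graph IRI is the rdf_sink IRI from this slide, and the endpoint is the SPARQL endpoint URL listed elsewhere in this deck. The sketch only constructs the request URL and does not contact any server.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds a SPARQL query scoped to one named graph. */
final class GraphQuerySketch {

    /** Selects up to `limit` triples from the named graph identified by graphIri. */
    static String selectFromGraph(String graphIri, int limit) {
        return "SELECT ?s ?p ?o WHERE { GRAPH <" + graphIri + "> { ?s ?p ?o } } LIMIT " + limit;
    }

    /** Encodes the query as the "query" parameter of a SPARQL protocol GET request. */
    static String toRequestUrl(String endpoint, String query) {
        return endpoint + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String graphIri = "http://localhost:8890/DAV/home/schalla/rdf_sink/";
        String query = selectFromGraph(graphIri, 10);
        System.out.println(query);
        System.out.println(toRequestUrl("http://gf18.ucs.indiana.edu:8890/sparql", query));
    }
}
```

Swapping the common rdf_sink IRI for a per-file IRI narrows the same query to the triples from a single uploaded feed.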
Future Work
Real future work (through Dec 2010)
* Use the OGCE workflow interpreter engine to run the workflow as a service.
* Integrate with simple visualization services (Jmol).
* Store input and output URLs persistently in the triple store, anticipating higher-level services.
* Better support for REST services in OGCE GFAC and XBaya.
Hopeful future work (next year)
* Integrate with services from GridChem/ParamChem.
* Handle larger-scale job submission.
* Develop a full gateway for public browsing and retrieval.
* Investigate push-style publish/subscribe solutions for notifications. We have a great deal of JMS and Web Service experience with this, but very scalable REST messaging for RSS/Atom is coming: PubSubHubbub and Twitter live feeds, for example.
* An OGCE messaging system has been prototyped with REST interfaces for a small iPlant collaboration.

More Information
Come by the IU booth for more information on the OGCE tools used here.
* Mini-symposium: 10-12 noon on Tuesday.
* Interactive presentations all week at the flat screen kiosk.
* NCSA walkup demos: 1-2 PM on Wednesday.
Source code for our ORE-Chem services is available from SourceForge.
Contact: mpierce@cs.indiana.edu

Thank You
Future Work
Google’s PubSubHubbub: as soon as a feed is published, the hub notifies the subscriber, which can then fetch the new entry and start the pipeline.
[Diagram: Publisher, Hub, Subscriber]
http://code.google.com/p/pubsubhubbub/

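The push model described here can be sketched in a few lines of Java. This is an in-process simplification under stated assumptions: the real PubSubHubbub protocol delivers notifications over HTTP callbacks, whereas the hypothetical HubSketch class below just invokes registered callbacks directly. The topic URL and the entry text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-process stand-in for a hub: subscribers are notified when a topic is published. */
final class HubSketch {

    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    /** Registers a callback for a feed (topic) URL. */
    void subscribe(String topicUrl, Consumer<String> callback) {
        subscribers.computeIfAbsent(topicUrl, k -> new ArrayList<>()).add(callback);
    }

    /** Pushes a new entry to every subscriber of the topic; returns how many were notified. */
    int publish(String topicUrl, String entry) {
        List<Consumer<String>> callbacks =
                subscribers.getOrDefault(topicUrl, Collections.emptyList());
        for (Consumer<String> callback : callbacks) {
            callback.accept(entry);
        }
        return callbacks.size();
    }

    public static void main(String[] args) {
        HubSketch hub = new HubSketch();
        // The subscriber would start the harvesting pipeline here instead of printing.
        hub.subscribe("http://example.org/feed", entry ->
                System.out.println("new entry, start pipeline: " + entry));
        hub.publish("http://example.org/feed", "entry-1");
    }
}
```

Compared with polling the feed on a timer, the subscriber does no work until the hub calls it, which is the property the slide is after.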
Questions?
ATOM to RDF/XML
GRDDL is a mechanism for Gleaning Resource Descriptions from Dialects of Languages.
atom-grddl.xsl: XSLT stylesheet

    GRDDLReader grddl = new GRDDLReader();
    grddl.read(defaultmodel, atomfeedURL);

GRDDL W3C documentation: http://www.w3.org/TR/grddl/

ORE Representation of an Aggregation of a Moiety in Turtle format
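ATOM to RDF/XML

    import java.io.ByteArrayOutputStream;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    // xslstream: the XSLT stylesheet; atomstream: the ATOM feed
    ByteArrayOutputStream transformOutputStream = new ByteArrayOutputStream();
    TransformerFactory factory = TransformerFactory.newInstance();
    StreamSource xslSource = new StreamSource(xslstream);
    StreamSource xmlSource = new StreamSource(atomstream);
    StreamResult outResult = new StreamResult(transformOutputStream);
    // Compile the stylesheet, then apply it to the feed
    Transformer transformer = factory.newTransformer(xslSource);
    transformer.transform(xmlSource, outResult);
    transformOutputStream.close();
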
OGCE-Workflow Suite
Tools to wrap command-line applications as lightweight web services, compose workflows from those web services, and execute and monitor the workflows.
1) GFAC: allows users to wrap any command-line application as a web service.
2) XRegistry: the information repository of the workflow suite, enabling users to register, search, and access application service and workflow deployment descriptions.
3) XBaya: Java Web Start workflow composer, used for composing workflows from web services created by GFAC, and for running and monitoring those workflows.
Open Grid Computing Environments Wiki: http://www.collab-ogce.org/ogce/index.php/Workflow

OREChem Services and Workflows

Experiments, protocols? (Experimental data)
Moieties, their energies, latent heats of fusion, vibrational frequencies? (Molecular properties, etc.)
Who? Where? When? (Bibliographic data)

ORE representation of a Resource Map in Turtle format
Moiety and its 3D coordinates: every atom and its X, Y, Z coordinates; bond order; SMILES and InChI representations.
Currently ~30000 moieties in the CrystalEye repository.

OGCE-Workflow Suite
OGCE Workflow Toolkit for Multi-Disciplinary Science Applications, Suresh Marru’s presentation.

Acknowledgements
* Dr. Marlon Pierce, Assistant Director, Community Grid Labs, Pervasive Technology Institute, Indiana University
* Dr. David J. Wild, Assistant Professor of Informatics & Computing, Director of Cheminformatics Program, School of Informatics and Computing, Indiana University
* Orechem Group: Dr. Carl Lagoze (Cornell University); Dr. Peter Murray Rust, Nick Day, Jim Downing (University of Cambridge); Mark Borkum (University of Southampton); Na Li (Penn State); Alex, Lee Dirks (Microsoft Research)
* Suresh Marru, Research Scientist, Pervasive Technology Institute, Indiana University
* Jaliya Ekanayake, Scott Beason, and all the members of the Pervasive Technology Institute

Future Work
* Wrap the tool that generates triples from Gaussian output into a REST service.
* Install the Virtuoso triple store on the Azure cloud.
* Fetch and process the feeds from Southampton and Penn State.

Virtuoso Triple Store
* Windows and Linux versions are installed and tested; the Linux version is currently in use.
* Conductor: http://gf18.ucs.indiana.edu:8890/conductor
* SPARQL endpoint: http://gf18.ucs.indiana.edu:8890/sparql
* Implementing a SPARQL compliant RDF Triple Store using a SQL-ORDBMS: http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSRDFWP