Introduction      Use case      My efforts       Package / Project meta-data   Fin




              Using RDF metadata for traceability among
                      projects and distributions

                  Olivier Berger <mailto:obergix@debian.org>
                          Debian + Télécom SudParis


                             Thursday 04/04/2013
                             Distro Recipes - Paris



                            Quick Introduction
                                   Short bio


      Olivier BERGER
      <mailto:olivier.berger@telecom-sudparis.eu>
      <mailto:obergix@debian.org>
      Research Engineer at TELECOM SudParis, expert on software
      development forges, and interoperability in Libre Software
      development projects. Contributor to FusionForge, Debian, etc.



                                How much duplication ?




          • Think about all the duplicate bug reports, and the number of
              people involved. . . usually not the ones who can help, btw
          • Can you navigate between all of these ?
          • Does Google search help ?



                                   Some issues

          • Universal Free Software Project description format ; (nay)
          • Universal distribution Package description format ; (nay)
          • Common Semantics ? (probably)
          • How to inter-link documents about the same program :
              • Bug reports (upstream and in all distros)
                     • Launchpad
                • Security advisories
                • Debian’s Homepage: control field
          • Are we maintaining Free Software distributions in closed silos ?
          • External directories (FSF, Freshmeat/Freecode, JoinUp, etc.)
          • What’s usually wrong with [ XML | JSON | YAML | RFC822 ]
              format ?



                                       An approach
                               Linked Open Development Meta-Data



          • Let’s try to make as many distro facts as possible available
              to humans + machines ?
          • Adopting the 5-star Open Data principles, using RDF for distro
              meta-data :

                            make your stuff available on the web
                            make it available as structured data
                            non-proprietary format
                            use URLs to identify things, so that people can point
                            at your stuff (RDF)
                            link your data to other people’s data to provide context
                            (Linked RDF)



     Project/program/package traceability over the FLOSS
                         ecosystem

          • Assembling a graph of descriptions of packages/projects
              published as Linked Data (DOAP or ADMS.SW) on their
              forges / project portals.
              For instance :
                • For Debian, from the Debian PTS (already Linked Data proof)
                • For Apache, Gnome, Pypi, from DOAP files (not yet all
                   Linked Data, but close)
                • . . . Add your preferred upstream . . .

          • Consumed by developer/maintainer/packager tools : following
              links between packages, (and their bugs, security alerts), all in
              semantic interoperable meta-data formats (RDF) !



                       Matching project/package descriptions

      Example (SPARQL query to match packages by their
      homepages)
      PREFIX doap: <http://usefulinc.com/ns/doap#>

      SELECT * WHERE
      {
        GRAPH <http://packages.qa.debian.org/>
        {
          ?dp doap:homepage ?h
        }
        GRAPH <http://projects.apache.org/>
        {
          ?ap doap:homepage ?h
        }
      }


      “Semantic query” : Trying to match source packages in Debian
      whose upstream project’s homepages match those of the Apache
      project’s DOAP descriptors.
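
The join performed by this query can be sketched in plain Python, without a triple store. This is only an illustration of the matching logic: the names `debian_homepages`, `apache_homepages` and `match_by_homepage` are made up for the example, the Apache project names are assumed, and the `hello` row is invented to show a package that does not match.

```python
# Hypothetical sample data: (name, homepage) pairs as they might be
# harvested from each source. Homepages use the "a.o" shorthand.
debian_homepages = [
    ("ivy", "ant.a.o/ivy/"),
    ("apr", "apr.a.o/"),
    ("apr-util", "apr.a.o/"),
    ("hello", "www.gnu.org/software/hello/"),  # invented non-matching row
]
apache_homepages = [
    ("ivy", "ant.a.o/ivy/"),
    ("apr", "apr.a.o/"),
]


def match_by_homepage(debian, apache):
    """Return (debian_pkg, homepage, apache_project) for shared homepages."""
    # Index Apache projects by homepage: the shared ?h variable of the query.
    by_homepage = {}
    for project, homepage in apache:
        by_homepage.setdefault(homepage, []).append(project)
    # Keep every Debian package whose homepage is also an Apache homepage.
    return [
        (pkg, homepage, project)
        for pkg, homepage in debian
        for project in by_homepage.get(homepage, [])
    ]


matches = match_by_homepage(debian_homepages, apache_homepages)
for dp, h, ap in matches:
    print(dp, h, ap)
```

Two Debian source packages may share one upstream homepage (here `apr` and `apr-util`), so the result is a list of pairs rather than a one-to-one mapping.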



                                      Matching packages
      Results : 62 matching Apache projects packaged in Debian (for
      which maintainers did set the Homepage Control field consistently).
      Example (Matching upstream Apache project homepages with
      Debian source packages’)
              dp                             h                          ap
              ivy                            ant.a.o/ivy/               ant.a.o/ivy/
              apr                            apr.a.o/                   apr.a.o/
              apr-util                       apr.a.o/                   apr.a.o/
              libcommons-cli-java            commons.a.o/cli/           commons.a.o/cli/
              libcommons-codec-java          commons.a.o/codec/         commons.a.o/codec/
              libcommons-collections3-java   commons.a.o/collections/   commons.a.o/collections/
              libcommons-collections-java    commons.a.o/collections/   commons.a.o/collections/
              commons-daemon                 commons.a.o/daemon/        commons.a.o/daemon/
              libcommons-discovery-java      commons.a.o/discovery/     commons.a.o/discovery/
              libcommons-el-java             commons.a.o/el/            commons.a.o/el/
              libcommons-fileupload-java     commons.a.o/fileupload/    commons.a.o/fileupload/
              commons-io                     commons.a.o/io/            commons.a.o/io/
              commons-jci                    commons.a.o/jci/           commons.a.o/jci/
              libcommons-launcher-java       commons.a.o/launcher/      commons.a.o/launcher/
              ...                            ...                        ...


      Matching program names gives more results but is ambiguous
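
One way name matching can become ambiguous is sketched below. The normalisation rules in `candidate_name` are assumptions chosen for illustration (they are not Debian policy and not the method used in the experiment): strip a leading `lib`, a trailing `-java`, and trailing version digits.

```python
import re


def candidate_name(debian_source_name):
    """Guess an upstream name from a Debian source package name.

    Illustrative rules only: drop a leading "lib", a trailing "-java",
    and any trailing version digits.
    """
    name = re.sub(r"^lib", "", debian_source_name)
    name = re.sub(r"-java$", "", name)
    return re.sub(r"\d+$", "", name)


packages = [
    "libcommons-collections3-java",
    "libcommons-collections-java",
    "commons-io",
]

# Group Debian packages by the upstream name they would be matched to.
by_candidate = {}
for pkg in packages:
    by_candidate.setdefault(candidate_name(pkg), []).append(pkg)

# Candidate names claimed by more than one package are ambiguous matches.
ambiguous = {name: pkgs for name, pkgs in by_candidate.items() if len(pkgs) > 1}
print(ambiguous)
```

Any such heuristic trades recall for precision, whereas the homepage match relies on a value that maintainers set explicitly.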



                                   My current experiment
                                     mining project descriptions



          • Running on my laptop ATM
          • Currently use Python to harvest meta-data from DOAP files
              for :
                 •    Gnome
                 •    Apache
                 •    Pypi.python.org
                 •    Debian
                 •    (add yours ?)
          • Stored in a Virtuoso triple store : > 2.5 M triples ATM
          • A python app to perform queries
          • May be published as a public service some day
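
A harvesting step of this kind could look roughly like the sketch below. It is not the actual harvester: it uses only the standard library, handles just the flat RDF/XML layout commonly seen in DOAP files, and the sample document and the `harvest_doap` function are hypothetical. A real harvester would more likely rely on a full RDF parser.

```python
import xml.etree.ElementTree as ET

DOAP = "http://usefulinc.com/ns/doap#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical DOAP file, in a flat RDF/XML layout.
SAMPLE_DOAP = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#">
  <Project rdf:about="http://example.org/demo#project">
    <name>demo</name>
    <homepage rdf:resource="http://example.org/demo/"/>
  </Project>
</rdf:RDF>
"""


def harvest_doap(xml_text):
    """Yield (subject, predicate, object) triples for name and homepage."""
    root = ET.fromstring(xml_text)
    for project in root.iter(f"{{{DOAP}}}Project"):
        subject = project.get(f"{{{RDF}}}about")
        name = project.find(f"{{{DOAP}}}name")
        if name is not None and name.text:
            yield (subject, DOAP + "name", name.text.strip())
        homepage = project.find(f"{{{DOAP}}}homepage")
        if homepage is not None:
            yield (subject, DOAP + "homepage", homepage.get(f"{{{RDF}}}resource"))


triples = list(harvest_doap(SAMPLE_DOAP))
for triple in triples:
    print(triple)
```

The resulting triples are what would then be loaded into the triple store, one named graph per source.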



      Adding RDF to the Debian Package Tracking System
      http://packages.qa.debian.org/



                  Adding ADMS.SW for FusionForge



          • Adding ADMS.SW support in FusionForge
              • Projects
              • Releases
              • Trove categories
          • Expected deployment on :
              • Cenatic
              • Adullact
              • Debian’s Alioth



                        Model : Graph of RDF resources
      Reference Linked Data resources with canonical URI like
      <http://packages.qa.debian.org/PACKAGE#RESOURCE_ID>




      The greyed resources correspond to upstream components
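
The canonical URI pattern above can be handled with a couple of small helpers. This is a minimal sketch: the function names `resource_uri` and `split_resource_uri` are illustrative, and only the `PACKAGE#RESOURCE_ID` pattern stated on this slide is assumed.

```python
from urllib.parse import urldefrag

PTS_BASE = "http://packages.qa.debian.org/"


def resource_uri(package, resource_id):
    """Build the canonical URI of one resource of a source package."""
    return f"{PTS_BASE}{package}#{resource_id}"


def split_resource_uri(uri):
    """Return (package, resource_id) for a URI built by resource_uri()."""
    document, fragment = urldefrag(uri)
    if not document.startswith(PTS_BASE):
        raise ValueError(f"not a PTS resource URI: {uri}")
    return document[len(PTS_BASE):], fragment


uri = resource_uri("apache2", "project")
print(uri)
print(split_resource_uri(uri))
```

Keeping the package in the document part and the resource in the fragment means that one document per package can describe all of its resources.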



                                           Forget about RDF/XML

      Yes, RDF can be expressed :
           • as XML
           • as Turtle (PREFERRED) : text (close to YAML / RFC 822)
           • as JSON

      Turtle example :
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix owl: <http://www.w3.org/2002/07/owl#> .

      <http://people.debian.org/~obergix/foaf.ttl#me>
              a foaf:Person ;
              foaf:name "Olivier Berger" ;
              foaf:nick "obergix" ;
              foaf:mbox "mailto:obergix@debian.org" ;
              foaf:homepage <http://people.debian.org/~obergix/> ;
              owl:sameAs <http://www-public.telecom-sudparis.eu/~berger_o/foaf.rdf#me> .


      See also RDF Primer — Turtle version



                             Apache2 Debian packaging as RDF

      http://packages.qa.debian.org/a/apache2.ttl
      @prefix doap: <http://usefulinc.com/ns/doap#> .
      @prefix admssw: <http://purl.org/adms/sw/> .

      <http://p.qa.d.o/apache2#project>
              a admssw:SoftwareProject ;
              doap:name "apache2" ;
              doap:description "Debian apache2 source packaging" ;
              doap:homepage <http://packages.d.o/src:apache2> ;
              doap:homepage <http://p.qa.d.o/apache2> ;
              doap:release <http://p.qa.d.o/apache2#apache2_2.2.22-11> ;
              schema:contributor [
                  a foaf:OnlineAccount ;
                  foaf:accountName "Debian Apache Maintainers" ;
                  foaf:accountServiceHomepage <http://qa.d.o/developer.php?login=debian-apache@lists.d.o>
              ] .

      <http://p.qa.d.o/apache2#apache2_2.2.22-11>
              a admssw:SoftwareRelease ;
              rdfs:label "apache2 2.2.22-11" ;
              doap:revision "2.2.22-11" ;
              admssw:package <http://p.qa.d.o/apache2#apache2_2.2.22-11.dsc> ;
              admssw:includedAsset <http://p.qa.d.o/apache2#upstreamsrc_2.2.22> ;
              admssw:includedAsset <http://p.qa.d.o/apache2#debiansrc_2.2.22-11> .
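
The fragment identifiers in this example follow a visible naming pattern, which the sketch below reproduces. The pattern is inferred from the apache2 listing only; the function `release_resource_ids` and its dictionary keys are hypothetical, and versions with an epoch or without a Debian revision are deliberately not handled.

```python
def release_resource_ids(package, version):
    """Derive the fragment identifiers seen in the apache2 example.

    Assumes a Debian version of the form UPSTREAM-REVISION with no
    epoch; other forms are not covered by the example.
    """
    if "-" not in version or ":" in version:
        raise ValueError(f"unsupported version format: {version}")
    # The upstream version is everything before the last hyphen.
    upstream, _, _revision = version.rpartition("-")
    return {
        "release": f"{package}_{version}",
        "package": f"{package}_{version}.dsc",
        "upstream_source": f"upstreamsrc_{upstream}",
        "debian_source": f"debiansrc_{version}",
    }


ids = release_resource_ids("apache2", "2.2.22-11")
for role, fragment in ids.items():
    print(role, fragment)
```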



                              The ADMS.SW ontology



      Asset Description Metadata Schema for Software (ADMS.SW)

       • Pilot : EC / Interoperability Solutions for
          European Public Administrations (ISA) -
          cf. Joinup site
       • Exchanging project / packages / releases
          descriptions across development platforms
          and directories



                                Open specifications


          • Not too much NIH syndrome : reuses :
              • ADMS / RADion (generic meta-data for semantic assets
                indexing)
              • DOAP (Description of a project)
              • SPDX™ ( Software Package Data Exchange ®)
              • W3C Government Linked Data (GLD) Working Group
              Version 1.0 issued 2012/06/29
          • RDF Validator available
          • RDF is extensible : ADMS.SW / DOAP core + distro-specific
              extensions



                        ADMS.SW main concepts
      Project (= Program), Release, Package




      Modeling Debian will require other complements



                                Related initiatives




      Related initiatives about package meta-data :
          • AppStream (Software Center, DEP11, etc.)
          • Umegaya (upstream meta-data, links with research
              publications, etc.)
          • DistroMatch (match package names across distributions)



                                 Recommendations




          • Upstream authors : please create DOAP descriptions for your
              projects
              https://github.com/edumbill/doap/wiki
          • Distributions : join the ADMS.SW bandwagon to document
              package releases
          • Followup-to : <distributions@lists.freedesktop.org> ?



                                                      Fin
      More details at :
          • http://wiki.debian.org/qa.debian.org/pts/RdfInterface
          • Linked Data descriptions of Debian source packages using
              ADMS.SW
          • Authoritative Linked Data descriptions of Debian source
              packages using ADMS.SW, to appear at OSS 2013 (pre-print
              available on demand)

      Contact :
      Micro-blogging : @oberger http://identi.ca/oberger/
      Email : mailto:obergix@debian.org
      Blog : http://www-public.telecom-sudparis.eu/~berger_o/weblog/
                              Copyright 2013 Institut Mines Telecom + Olivier Berger
         License of this presentation : Creative Commons Share Alike (except illustrations which are under
                                         copyright of their respective owners)

Bugs tracking at a large scale in the FLOSS ecosystem
 
Coclico project - Forges Interoperability (OWF 2010)
Coclico project - Forges Interoperability (OWF 2010)Coclico project - Forges Interoperability (OWF 2010)
Coclico project - Forges Interoperability (OWF 2010)
 
Introduction aux logiciels libres
Introduction aux logiciels libresIntroduction aux logiciels libres
Introduction aux logiciels libres
 
Bugtracking on the Web 2.5
Bugtracking on the Web 2.5Bugtracking on the Web 2.5
Bugtracking on the Web 2.5
 
Introduction aux logiciels libres
Introduction aux logiciels libresIntroduction aux logiciels libres
Introduction aux logiciels libres
 
Weaving a Semantic Web across OSS repositories - a spotlight on bts-link, UDD...
Weaving a Semantic Web across OSS repositories - a spotlight on bts-link, UDD...Weaving a Semantic Web across OSS repositories - a spotlight on bts-link, UDD...
Weaving a Semantic Web across OSS repositories - a spotlight on bts-link, UDD...
 
Introduction to bts-link
Introduction to bts-linkIntroduction to bts-link
Introduction to bts-link
 
Visualizing contributions in a forge -Case study on PicoForge
Visualizing contributions in a forge -Case study on PicoForgeVisualizing contributions in a forge -Case study on PicoForge
Visualizing contributions in a forge -Case study on PicoForge
 
Plate-formes pour le développement collaboratif des logiciels libres
Plate-formes pour le développement collaboratif des logiciels libresPlate-formes pour le développement collaboratif des logiciels libres
Plate-formes pour le développement collaboratif des logiciels libres
 
Retour d'expérience sur la conduite d'un projet libre
Retour d'expérience sur la conduite d'un projet libreRetour d'expérience sur la conduite d'un projet libre
Retour d'expérience sur la conduite d'un projet libre
 
Olpc France Presentation Sl2008
Olpc France Presentation Sl2008Olpc France Presentation Sl2008
Olpc France Presentation Sl2008
 
Collaboration avec des projets libres - enjeux, difficultés et bonnes pratiques
Collaboration avec des projets libres - enjeux, difficultés et bonnes pratiquesCollaboration avec des projets libres - enjeux, difficultés et bonnes pratiques
Collaboration avec des projets libres - enjeux, difficultés et bonnes pratiques
 

Presentation distro recipes-2013

  • 1. Outline: Introduction, Use case, My efforts, Package / Project meta-data, Fin
      Using RDF metadata for traceability among projects and distributions
      Olivier Berger <mailto:obergix@debian.org>, Debian + Télécom SudParis
      Thursday 04/04/2013, Distro Recipes - Paris
  • 2. Quick Introduction: Short bio
      Olivier BERGER <mailto:olivier.berger@telecom-sudparis.eu> <mailto:obergix@debian.org>
      Research Engineer at TELECOM SudParis, expert on software development forges and on interoperability in Libre Software development projects. Contributor to FusionForge, Debian, etc.
  • 3. How much duplication?
      • Think about all the duplicate bug reports, and the number of people involved... usually not the ones who can help, by the way
      • Can you navigate between all of these?
      • Does Google search help?
  • 4. Some issues
      • Universal Free Software project description format; (nay)
      • Universal distribution package description format; (nay)
      • Common semantics? (probably)
      • How to inter-link documents about the same program:
          • Bug reports (upstream and in all distros)
              • Launchpad
          • Security advisories
          • Debian's Homepage: control field
      • Are we maintaining Free Software distributions in closed silos?
      • External directories (FSF, Freshmeat/Freecode, JoinUp, etc.)
      • What's usually wrong with [ XML | JSON | YAML | RFC822 ] format?
  • 5. An approach: Linked Open Development Meta-Data
      • Let's try to make as many distro facts as possible available to humans + machines?
      • Adopting the 5 Open Data principles, using RDF for distro meta-data:
          1. make your stuff available on the web
          2. make it available as structured data
          3. use a non-proprietary format
          4. use URLs to identify things, so that people can point at your stuff (RDF)
          5. link your data to other people's data to provide context (Linked RDF)
  • 6. Project/program/package traceability over the FLOSS ecosystem
      • Assembling a graph of descriptions of packages/projects published as Linked Data (DOAP or ADMS.SW) on their forges / project portals. For instance:
          • For Debian, from the Debian PTS (already Linked Data proof)
          • For Apache, Gnome, Pypi, from DOAP files (not yet all Linked Data, but close)
          • ... Add your preferred upstream ...
      • Consumed by developer/maintainer/packager tools: following links between packages (and their bugs, security alerts), all in semantic interoperable meta-data formats (RDF)!
  • 7. Matching project/package descriptions
      Example (SPARQL query to match packages by their homepages):
          PREFIX doap: <http://usefulinc.com/ns/doap#>
          SELECT * WHERE {
            GRAPH <http://packages.qa.debian.org/> { ?dp doap:homepage ?h }
            GRAPH <http://projects.apache.org/> { ?ap doap:homepage ?h }
          }
      "Semantic query": match Debian source packages whose upstream project homepage is the same as a homepage in the Apache projects' DOAP descriptors.
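The join expressed by the SPARQL query on slide 7 can be sketched in plain Python. This is only an illustration of the matching logic, not the author's tooling: the function name `match_by_homepage` and all sample data (the `example.org` URLs and project names) are hypothetical.

```python
from collections import defaultdict


def match_by_homepage(debian, apache):
    """Join two lists of (resource, homepage) pairs on identical homepage URLs.

    Mirrors the two GRAPH patterns of the query: both sides must share
    the same value for the homepage (?h).
    """
    # Index the Apache side by homepage so each lookup is O(1).
    by_homepage = defaultdict(list)
    for project, homepage in apache:
        by_homepage[homepage].append(project)

    rows = []
    for package, homepage in debian:
        for project in by_homepage.get(homepage, []):
            rows.append((package, homepage, project))  # (?dp, ?h, ?ap)
    return rows


# Hypothetical sample data, for illustration only.
debian_packages = [
    ("ivy", "http://ant.example.org/ivy/"),
    ("apr", "http://apr.example.org/"),
    ("apr-util", "http://apr.example.org/"),
    ("hello", "http://hello.example.org/"),
]
apache_projects = [
    ("ivy-project", "http://ant.example.org/ivy/"),
    ("apr-project", "http://apr.example.org/"),
]

matches = match_by_homepage(debian_packages, apache_projects)
for row in matches:
    print(row)
```

Two Debian packages sharing one homepage (here `apr` and `apr-util`) both match the same upstream project, as in the results on slide 8.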
  • 8. Matching packages
      Results: 62 matching Apache projects packaged in Debian (those for which maintainers set the Homepage control field consistently).
      Example (matching upstream Apache project homepages with those of Debian source packages):
          dp                            h                         ap
          ivy                           ant.a.o/ivy/              ant.a.o/ivy/
          apr                           apr.a.o/                  apr.a.o/
          apr-util                      apr.a.o/                  apr.a.o/
          libcommons-cli-java           commons.a.o/cli/          commons.a.o/cli/
          libcommons-codec-java         commons.a.o/codec/        commons.a.o/codec/
          libcommons-collections3-java  commons.a.o/collections/  commons.a.o/collections/
          libcommons-collections-java   commons.a.o/collections/  commons.a.o/collections/
          commons-daemon                commons.a.o/daemon/       commons.a.o/daemon/
          libcommons-discovery-java     commons.a.o/discovery/    commons.a.o/discovery/
          libcommons-el-java            commons.a.o/el/           commons.a.o/el/
          libcommons-fileupload-java    commons.a.o/fileupload/   commons.a.o/fileupload/
          commons-io                    commons.a.o/io/           commons.a.o/io/
          commons-jci                   commons.a.o/jci/          commons.a.o/jci/
          libcommons-launcher-java      commons.a.o/launcher/     commons.a.o/launcher/
          ...                           ...                       ...
      Matching program names gives more results but is ambiguous.
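Exact homepage matching only works when both sides write the URL identically. One possible way to tolerate small differences is to compare a normalized key instead of the raw string. The sketch below is an assumption, not something the slides describe: the helper `normalize_homepage` is hypothetical, and its rules (ignore the scheme, a leading `www.` and a trailing slash) are heuristics that could conflate distinct sites.

```python
from urllib.parse import urlsplit


def normalize_homepage(url):
    """Return a comparison key for a homepage URL (heuristic sketch)."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    # Treat "www.example.org" and "example.org" as the same site (assumption).
    if host.startswith("www."):
        host = host[4:]
    # Ignore a trailing slash on the path.
    path = parts.path.rstrip("/")
    return host + path


# Hypothetical URLs, for illustration only.
key_a = normalize_homepage("http://www.commons.example.org/cli/")
key_b = normalize_homepage("https://commons.example.org/cli")
print(key_a == key_b)
```

Such keys could be used in place of the raw homepage in the join, at the cost of occasional false matches.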
  • 9. My current experiment mining project descriptions
      • Running on my laptop at the moment
      • Currently uses Python to harvest meta-data from DOAP files for:
          • Gnome
          • Apache
          • Pypi.python.org
          • Debian
          • (add you?)
      • Stored in a Virtuoso triple store: > 2.5 M triples at the moment
      • A Python app to perform queries
      • May be published as a public service some day
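As a rough idea of what harvesting a DOAP file in Python can look like, here is a minimal sketch using only the standard library. It is not the author's harvester: the function `harvest_doap` and the sample document are hypothetical, and it handles only one simple RDF/XML shape. A real harvester would more likely rely on a full RDF parser.

```python
import xml.etree.ElementTree as ET

DOAP = "http://usefulinc.com/ns/doap#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical DOAP document, for illustration only.
SAMPLE_DOAP = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#">
  <Project rdf:about="http://example.org/demo#project">
    <name>demo</name>
    <homepage rdf:resource="http://example.org/demo/"/>
  </Project>
</rdf:RDF>
"""


def harvest_doap(xml_text):
    """Extract (subject, predicate, object) triples for name and homepage."""
    root = ET.fromstring(xml_text)
    triples = []
    for project in root.iter("{%s}Project" % DOAP):
        subject = project.get("{%s}about" % RDF)
        name = project.findtext("{%s}name" % DOAP)
        if name is not None:
            triples.append((subject, DOAP + "name", name))
        for homepage in project.findall("{%s}homepage" % DOAP):
            target = homepage.get("{%s}resource" % RDF)
            triples.append((subject, DOAP + "homepage", target))
    return triples


triples = harvest_doap(SAMPLE_DOAP)
for triple in triples:
    print(triple)
```

The resulting triples are the kind of data that would then be loaded into a triple store and queried by homepage.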
  • 10. Adding RDF to the Debian Package Tracking System: http://packages.qa.debian.org/
  • 11. Adding ADMS.SW for FusionForge
      • Adding ADMS.SW support in FusionForge:
          • Projects
          • Releases
          • Trove categories
      • Expected deployment on:
          • Cenatic
          • Adullact
          • Debian's Alioth
  • 12. Model: Graph of RDF resources
      Reference Linked Data resources with canonical URIs like <http://packages.qa.debian.org/PACKAGE#RESOURCE_ID>
      The greyed resources (in the slide's diagram) correspond to upstream components
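The canonical URI pattern on slide 12 can be expressed as a small helper. This is a sketch under assumptions: the function name `pts_resource_uri` is hypothetical, and the percent-encoding choices are mine rather than documented PTS behaviour.

```python
from urllib.parse import quote

PTS_BASE = "http://packages.qa.debian.org/"


def pts_resource_uri(package, resource_id):
    """Build <http://packages.qa.debian.org/PACKAGE#RESOURCE_ID>."""
    # Keep "+" readable, since it is common in Debian package names
    # (assumption: the PTS does not require it to be percent-encoded).
    return "%s%s#%s" % (
        PTS_BASE,
        quote(package, safe="+"),
        quote(resource_id, safe="+"),
    )


print(pts_resource_uri("apache2", "project"))
print(pts_resource_uri("apache2", "apache2_2.2.22-11"))
```

The two calls reproduce, in unabbreviated form, the project and release identifiers used in the apache2 example on slide 14.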
  • 13. Forget about RDF/XML
      Yes, RDF can be expressed:
      • as XML
      • as Turtle (PREFERRED): text (close to YAML / RFC 822)
      • as JSON
      Turtle example:
          @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
          @prefix owl: <http://www.w3.org/2002/07/owl#> .

          <http://people.debian.org/~obergix/foaf.ttl#me> a foaf:Person ;
              foaf:name "Olivier Berger" ;
              foaf:nick "obergix" ;
              foaf:mbox "mailto:obergix@debian.org" ;
              foaf:homepage <http://people.debian.org/~obergix/> ;
              owl:sameAs <http://www-public.telecom-sudparis.eu/~berger_o/foaf.rdf#me> .
      See also: RDF Primer, Turtle version
  • 14. Apache2 Debian packaging as RDF
      http://packages.qa.debian.org/a/apache2.ttl
          @prefix doap: <http://usefulinc.com/ns/doap#> .
          @prefix admssw: <http://purl.org/adms/sw/> .

          <http://p.qa.d.o/apache2#project> a admssw:SoftwareProject ;
              doap:name "apache2" ;
              doap:description "Debian apache2 source packaging" ;
              doap:homepage <http://packages.d.o/src:apache2> ;
              doap:homepage <http://p.qa.d.o/apache2> ;
              doap:release <http://p.qa.d.o/apache2#apache2_2.2.22-11> ;
              schema:contributor [
                  a foaf:OnlineAccount ;
                  foaf:accountName "Debian Apache Maintainers" ;
                  foaf:accountServiceHomepage <http://qa.d.o/developer.php?login=debian-apache@lists.d.o>
              ] .

          <http://p.qa.d.o/apache2#apache2_2.2.22-11> a admssw:SoftwareRelease ;
              rdfs:label "apache2 2.2.22-11" ;
              doap:revision "2.2.22-11" ;
              admssw:package <http://p.qa.d.o/apache2#apache2_2.2.22-11.dsc> ;
              admssw:includedAsset <http://p.qa.d.o/apache2#upstreamsrc_2.2.22> ;
              admssw:includedAsset <http://p.qa.d.o/apache2#debiansrc_2.2.22-11> .
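To show how a tool might follow the links in the apache2 description on slide 14, here is a minimal Python sketch that walks from the project to its release and then to the included assets. The triples are copied from the slide, including its abbreviated `p.qa.d.o` URIs and prefixed predicate names. The helper functions `objects` and `assets_of_project` are hypothetical, and a plain list of tuples stands in for a real RDF graph.

```python
# Abbreviated base URI, written as on the slide.
P = "http://p.qa.d.o/apache2#"

# A subset of the slide's triples, as (subject, predicate, object) tuples.
TRIPLES = [
    (P + "project", "doap:release", P + "apache2_2.2.22-11"),
    (P + "apache2_2.2.22-11", "admssw:package", P + "apache2_2.2.22-11.dsc"),
    (P + "apache2_2.2.22-11", "admssw:includedAsset", P + "upstreamsrc_2.2.22"),
    (P + "apache2_2.2.22-11", "admssw:includedAsset", P + "debiansrc_2.2.22-11"),
]


def objects(triples, subject, predicate):
    """Return every object linked from subject through predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def assets_of_project(triples, project):
    """Follow project -> release -> included assets."""
    assets = []
    for release in objects(triples, project, "doap:release"):
        assets.extend(objects(triples, release, "admssw:includedAsset"))
    return assets


assets = assets_of_project(TRIPLES, P + "project")
for asset in assets:
    print(asset)
```

The walk returns both the upstream source and the Debian source assets of the release, which is the kind of traceability the RDF description is meant to enable.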
  • 15. The ADMS.SW ontology
      Asset Description Metadata Schema for Software (ADMS.SW)
      • Pilot: EC / Interoperability Solutions for European Public Administrations (ISA), cf. Joinup site
      • Exchanging project / package / release descriptions across development platforms and directories
  • 16. Open specifications
      • Not too much NIH syndrome; reuses:
      • ADMS / RADion (generic meta-data for semantic assets indexing)
      • DOAP (Description of a project)
      • SPDX™ (Software Package Data Exchange®)
      • W3C Government Linked Data (GLD) Working Group Version 1.0 issued 2012/06/29
      • RDF Validator available
      • RDF is extensible: ADMS.SW / DOAP core + distro-specific extensions
  • 17. ADMS.SW main concepts
      Project (= Program), Release, Package
      Modeling Debian will require other complements
  • 18. Related initiatives
      Related initiatives about package meta-data:
      • AppStream (Software Center, DEP11, etc.)
      • Umegaya (upstream meta-data, links with research publications, etc.)
      • DistroMatch (match package names across distributions)
  • 19. Recommendations
      • Upstream authors: please create DOAP descriptions for your projects: https://github.com/edumbill/doap/wiki
      • Distributions: join the ADMS.SW bandwagon to document package releases
      • Followup-to: <distributions@lists.freedesktop.org>?
  • 20. Fin
      More details at:
      • http://wiki.debian.org/qa.debian.org/pts/RdfInterface
      • Linked Data descriptions of Debian source packages using ADMS.SW
      • Authoritative Linked Data descriptions of Debian source packages using ADMS.SW, to appear at OSS 2013 (pre-print available on demand)
      Contact:
      • Micro-blogging: @oberger http://identi.ca/oberger/
      • Email: mailto:obergix@debian.org
      • Blog: http://www-public.telecom-sudparis.eu/~berger_o/weblog/
      Copyright 2013 Institut Mines Telecom + Olivier Berger
      License of this presentation: Creative Commons Share Alike (except illustrations which are under copyright of their respective owners)