This document discusses how Sparqlycode knowledge bases can provide additional information about software packages beyond just code and documentation. Sparqlycode knowledge bases contain structured data about the code, build process, source control history, usage statistics, and more. This extra information allows asking questions like what are the dependencies of a package, who contributed to the code, what changes were made in a new release, and whether any known security vulnerabilities exist. The knowledge bases can be automatically generated and queried to enable more informed decisions about software.
Demonstrations
Distributing Software Knowledge for DevOps
Adding Value to the Package Distribution
When your deployment team receive a deployment package it is just a black box to them. You may provide them with:
An MD5 or PGP signature to verify that it is from the correct source and that the package has not been tampered with
The source code documentation
The source code
However, these have the following limitations:
MD5 and PGP signatures tell you nothing about the changes that were made before the package was signed
The source code documentation is meant for developers, not for all the other roles in the business
The source code is intractable for both developers and everyone else: there is no means to comprehend all the code other than investing hours studying it
Sparqlycode completely changes this situation. In addition to the current artefacts as shown you can now also provide a Sparqlycode
Knowledge Base. Once a Sparqlycode KB is available all kinds of questions can be asked of the software that have never been possible
before without a lot of manual investigation.
Package Distributions Accompanied by Sparqlycode Knowledge Bases
For example, a typical software distribution delivered in an open source software project might look like one of those from Apache's ActiveMQ project:
A Sparqlycode Knowledge Base (SC KB) would be another resource offered with the binary distribution above. The SC KB would also be a zip file, containing a number of RDF/OWL formatted resources. It would itself be accompanied by a PGP signature or MD5 checksum so that tampering can be detected. Below is an example Sparqlycode KB for the AMQP module of the Apache ActiveMQ project above:
0.0.4.org.apache.activemq.activemq-amqp.5.10.0.zip
0.0.4.org.apache.activemq.activemq-amqp.5.10.0.md5sum.txt
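Checking the zip against its checksum file can be scripted. The following is a minimal sketch rather than part of Sparqlycode itself. It assumes the .md5sum.txt file uses the common md5sum layout (hex digest first, then the file name), which this page does not confirm, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_kb(zip_path, md5sum_path):
    """Compare the zip's MD5 with the first token of the checksum file.

    Assumption: the checksum file looks like "<hex digest>  <file name>".
    """
    expected = Path(md5sum_path).read_text().split()[0].lower()
    return md5_of_file(zip_path) == expected
```

A call such as verify_kb("0.0.4.org.apache.activemq.activemq-amqp.5.10.0.zip", "0.0.4.org.apache.activemq.activemq-amqp.5.10.0.md5sum.txt") returns True only when the digests match. An MD5 match only detects accidental or casual modification; a PGP signature is the stronger check of origin.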
Contents of the Sparqlycode Knowledge Bases
The zip file in this example is produced by Sparqlycode Engine 0.0.4 and contains:
File | Description
0.0.4.org.apache.activemq.activemq-amqp.5.10.0.code.ttl | This is the CODE KB. It contains knowledge about the software code.
0.0.4.org.apache.activemq.activemq-amqp.5.10.0.pom.ttl | This is the BUILD KB. In this case it contains knowledge about the Maven Project Object Model.
0.0.4.org.apache.activemq.activemq-amqp.5.10.0.sccs.ttl | This is the SCCS KB. In this case it represents a Git repository. It supports both the W3C PROV-O and our own Git Domain Specific Ontology.
0.0.4.org.apache.activemq.activemq-amqp.5.10.0.statistics.txt | These are a few human readable statistics about the concepts that exist in the Knowledge Model.
LICENSE.HTML | This is the license for the KB if acquired from www.sparqlycode.com. If the KBs are generated by yourself you may also have a policy file of some kind.
Common Questions To Ask of the Knowledge Bases
What are the direct dependencies of this package?
You can run a SPARQL query to list information about all the dependencies. The example below uses Jena ARQ.
The SPARQL query to report a project's direct dependencies is:
PREFIX java: <http://ontology.interition.net/java/ref/>
PREFIX maven: <http://ontology.interition.net/sparqlycode/maven/vocabulary/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX isc: <http://ontology.interition.net/sparqlycode/vocabulary/>
SELECT ?GroupId ?ArtifactId ?Version {
  ?project a isc:MavenProject ;
           maven:dependency ?dependency .
  ?dependency maven:groupId ?GroupId ;
              maven:artifactId ?ArtifactId ;
              maven:version ?Version .
} LIMIT 200
If you are using Jena ARQ to perform the query and placed the query in a file called dependencies.rq, the command would look like:
arq --query=dependencies.rq --data=0.0.4.org.apache.activemq.activemq-amqp.5.10.0.pom.ttl
The result in this demonstration would look like:
-------------------------------------------------------------------------------------
| GroupId                          | ArtifactId                 | Version           |
=====================================================================================
| "org.apache.activemq"            | "activemq-leveldb-store"   | "5.10.0"          |
| "org.apache.activemq"            | "activemq-kahadb-store"    | "5.10.0"          |
| "junit"                          | "junit"                    | "4.11"            |
| "org.apache.activemq"            | "activemq-jaas"            | "5.10.0"          |
| "org.eclipse.jetty.aggregate"    | "jetty-all-server"         | "7.6.9.v20130131" |
| "org.apache.activemq"            | "activemq-broker"          | "5.10.0"          |
| "org.slf4j"                      | "slf4j-log4j12"            | "1.7.5"           |
| "org.fusesource.joram-jms-tests" | "joram-jms-tests"          | "1.0"             |
| "org.slf4j"                      | "slf4j-api"                | "1.7.5"           |
| "org.apache.activemq"            | "activemq-broker"          | "5.10.0"          |
| "org.apache.qpid"                | "proton-jms"               | "0.7"             |
| "org.apache.activemq"            | "activemq-spring"          | "5.10.0"          |
| "org.fusesource.hawtbuf"         | "hawtbuf"                  | "1.10"            |
| "org.apache.qpid"                | "qpid-amqp-1-0-client-jms" | "0.26"            |
| "org.springframework"            | "spring-context"           | "3.2.8.RELEASE"   |
-------------------------------------------------------------------------------------
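For scripted checks, a result table like the one above can be turned into records. The sketch below is one possible approach, not a Sparqlycode or Jena API. It assumes the plain-text layout shown on this page (rows delimited by "|", rules made of "-" or "="), and the function name is hypothetical.

```python
def parse_arq_table(text):
    """Parse a plain-text result table into a list of dicts.

    Assumption: cell values never contain the "|" character.
    """
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip the "-" and "=" rules and anything that is not a table row.
        if not line.startswith("|"):
            continue
        # Drop the outer pipes, split into cells, trim spaces and quotes.
        cells = [cell.strip().strip('"') for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first row carries the variable names
        else:
            rows.append(dict(zip(header, cells)))
    return rows
```

Each returned dict is keyed by the query's variable names (GroupId, ArtifactId, Version here), so a script can, for example, collect the set of distinct dependency coordinates.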
A Sparqlycode Enterprise Server Feature
This does not list the transitive dependencies. To do that you need the Sparqlycode Enterprise Server (SCES). The SCES provides tools to manage large numbers of Sparqlycode Knowledge Bases and to query them collectively.
Who has contributed to the software in this package?
The SCCS KB can be queried to provide information on the people that contributed to the code changes. With SCES this can extend to all
the people that have contributed to code inside and outside your organization.
The SPARQL query to report who has worked on the software is:
PREFIX git: <http://ontology.interition.net/sccs/git/ref/>
SELECT distinct ?developer {
?commits git:author ?developer .
}
If you are using Jena ARQ to perform the query and placed the query in a file called developers.rq, the command would look like:
arq --query=developers.rq --data=0.0.4.org.apache.activemq.activemq-amqp.5.10.0.sccs.ttl
The result is a list of all the people that made changes to the code base:
--------------------------------------
| developer |
======================================
| <mailto:tabish121@gmai.com> |
| <mailto:hiram@hiramchirino.com> |
| <mailto:gary.tully@gmail.com> |
| <mailto:dejan@nighttale.net> |
| <mailto:dkulp@apache.org> |
| <mailto:jcarman@apache.org> |
| <mailto:kevin@kevinearls.com> |
| <mailto:art@artnaseef.com> |
| <mailto:claus.ibsen@gmail.com> |
| <mailto:hzbarcea@gmail.com> |
| <mailto:rajdavies@gmail.com> |
| <mailto:hadrian@apache.org> |
| <mailto:dhirajsb@yahoo.com> |
| <mailto:tabish121@gmail.com> |
| <mailto:janstey@gmail.com> |
| <mailto:christian.posta@gmail.com> |
| <mailto:torsten@fusesource.com> |
| <mailto:les@hazlewood.com> |
| <mailto:jsherman@redhat.com> |
| <mailto:jbonofre@apache.org> |
| <mailto:dbokde@redhat.com> |
| <mailto:tmielke@redhat.com> |
| <mailto:krzys.sobkowiak@gmail.com> |
--------------------------------------
A Sparqlycode Enterprise Server Feature
The demonstration SCCS KB here only has the Git Domain Specific Ontology (DSO). The SCCS KB supporting the W3C PROV-O ontology is available in the Sparqlycode Enterprise Server. This is mainly because it creates a much bigger dataset requiring more resources to manage and process.
What has changed since the last release?
You can compare Knowledge Bases between releases to report those parts of the code and its configuration that have changed. To perform the SPARQL queries necessary to answer these types of questions it is better to use one of the common triplestores such as Jena Fuseki. You can load the Sparqlycode KBs into graphs and query them as follows. This query finds all the new methods in the latest version of ActiveMQ AMQP:
PREFIX java: <http://ontology.interition.net/java/ref/>
PREFIX maven: <http://ontology.interition.net/sparqlycode/maven/vocabulary/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX isc: <http://ontology.interition.net/sparqlycode/vocabulary/>
# Methods present in the 5.10.0 graph that are absent from the 5.9.0 graph
select ?_method where {
  graph <http://www.sparqlycode.com/sparqlycode/data/org.apache.activemq.activemq-amqp.5.10.0>
    { ?_method a java:Method . }
  FILTER NOT EXISTS {
    graph <http://www.sparqlycode.com/sparqlycode/data/org.apache.activemq.activemq-amqp.5.9.0>
      { ?_method a java:Method . }
  }
}
This results in a list of identifiers (URIs) for the new methods, as shown in the following Jena Fuseki 2 UI screenshot:
SPARQL query for the total number of new methods
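The release comparison can be generated for any pair of versions. The following sketch builds such a query as a string. It assumes that every KB is loaded into a graph named after the pattern seen in the example (data/ followed by group id, artifact id and version) and that a method keeps the same URI across releases. Both are assumptions, and the function names are hypothetical.

```python
JAVA_PREFIX = "PREFIX java: <http://ontology.interition.net/java/ref/>\n"
# Assumption: graph names follow the pattern used in the example on this page.
GRAPH_BASE = "http://www.sparqlycode.com/sparqlycode/data/"


def graph_uri(group_id, artifact_id, version):
    """Build the graph name for one release of a package."""
    return f"{GRAPH_BASE}{group_id}.{artifact_id}.{version}"


def new_methods_query(old_graph, new_graph):
    """Return a query for methods found in new_graph but not in old_graph."""
    return (
        JAVA_PREFIX
        + "SELECT ?_method WHERE {\n"
        + f"  GRAPH <{new_graph}> {{ ?_method a java:Method . }}\n"
        + "  FILTER NOT EXISTS {\n"
        + f"    GRAPH <{old_graph}> {{ ?_method a java:Method . }}\n"
        + "  }\n"
        + "}\n"
    )


old = graph_uri("org.apache.activemq", "activemq-amqp", "5.9.0")
new = graph_uri("org.apache.activemq", "activemq-amqp", "5.10.0")
print(new_methods_query(old, new))
```

Swapping the two arguments yields the methods that were removed rather than added.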
Are there any security vulnerabilities associated with this package?
Interition also provides Sparqlycode for the National Vulnerability Database KB (NVD KB). Combined with the BUILD KB of the package you
can immediately see if there are any components with known vulnerabilities.
Checking the package itself is not associated with a vulnerability
The following SPARQL query searches the Sparqlycode NVD KB for all CVEs in 2014 that refer to "activemq:5.10.0".
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cve: <http://scap.nist.gov/schema/vulnerability/0.4#>
PREFIX cvss: <http://scap.nist.gov/schema/cvss-v2/0.2#>
PREFIX vuln: <http://scap.nist.gov/schema/vulnerability/0.4#>
SELECT ?CVE ?cpe WHERE {
  ?_cve vuln:product ?cpe ;
        rdfs:label ?CVE .
  FILTER regex(lcase(str(?cpe)), "activemq:5.10.0")
} limit 100
If you are using Jena ARQ to perform the query and placed the query in a file called vulnerability.rq, the command would look like:
arq --query=vulnerability.rq --data=nvdcve-2.0-2014.ttl
The result is:
-----------------------------------------------------
| CVE | cpe |
=====================================================
| "CVE-2014-8110" | "cpe:/a:apache:activemq:5.10.0" |
-----------------------------------------------------
The nvdcve-2.0-2014.ttl file is not provided as part of the example download package on this page. It forms part of the Sparqlycode Enterprise Server.
A CVE is a Common Vulnerabilities and Exposures entry, as published in the NIST National Vulnerability Database.
The queries are kept simple on purpose, but in practice this query would find the package name to search, "activemq:5.10.0", programmatically.
Checking the package dependencies are not associated with a vulnerability
A Sparqlycode Enterprise Server Feature
This is a key Enterprise feature of the additional Knowledge Bases that become available with a Sparqlycode Enterprise Server.
Automation Possibilities with Sparqlycode
The most compelling proposition with Sparqlycode is that it enables the above examples and many other queries to be automated. It is also possible to create other types of Knowledge Bases using RDF/OWL and integrate them easily with the Sparqlycode KB. It is possible to instrument the whole software development lifecycle with technical and business knowledge across an enterprise at web scale.
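As an illustration of finding the package name programmatically, the sketch below generates the vulnerability query from a product name and version. It is a hedged example, not Sparqlycode functionality: the function names are hypothetical, and the mapping from a Maven artifact to the NVD product name ("activemq" here) is assumed to be supplied by the caller. It uses the SPARQL 1.1 contains function instead of regex, a deliberate swap so that the dots in the version are matched literally.

```python
import re

PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX vuln: <http://scap.nist.gov/schema/vulnerability/0.4#>\n"
)

# Only allow characters that are safe inside a quoted SPARQL string literal.
_SAFE = re.compile(r"[a-z0-9._-]+")


def cpe_fragment(product, version):
    """Build the "product:version" fragment searched for in CPE names."""
    product, version = product.lower(), version.lower()
    for part in (product, version):
        if not _SAFE.fullmatch(part):
            raise ValueError(f"unexpected characters in {part!r}")
    return f"{product}:{version}"


def vulnerability_query(product, version, limit=100):
    """Return a SPARQL query listing CVEs whose CPE contains the fragment."""
    fragment = cpe_fragment(product, version)
    return (
        PREFIXES
        + "SELECT ?CVE ?cpe WHERE {\n"
        + "  ?_cve vuln:product ?cpe ;\n"
        + "        rdfs:label ?CVE .\n"
        + f'  FILTER contains(lcase(str(?cpe)), "{fragment}")\n'
        + f"}} LIMIT {int(limit)}\n"
    )


print(vulnerability_query("activemq", "5.10.0"))
```

The generated text can be written to a .rq file and run with ARQ in the same way as the hand-written queries. Running the same function over every dependency reported by the BUILD KB would extend the check from the package to its components, provided the NVD product name for each dependency is known.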