OPEN DATA TRAINING MATERIAL
November 2014
Page 1
Table of contents
1. Defining Open Data
2. Understanding Law and Licensing
3. Big Data vs. Open Data
4. Open data as part of your business model
5. Case studies: Open Data Business
6. Where do I find Open Data?
7. How to develop your open data business
8. Open Data training materials already available. A list
9. Slides and inspiring presentations: link-o-graphy
10. Recommended videos, audio files and books
1. Defining Open Data
“A key promise of open data is that it can be freely accessed and used. Without a clear definition of
what exactly that means (e.g. used by whom, for what purpose) there is a risk of dilution especially
as open data is attractive for data users” (Pollock, 2014).
The main goal of this material is to make sure that people who want to re-use open datasets know
what “open” really means.
The first step is to explore some guidelines available online. The Open Data Institute and Open
Knowledge regularly publish simple guides and other content for open data publishers
and reusers. Let’s start from the basics: What makes data open and The Open Definition v2.0.
What makes data open?
The original content for this material is available online at http://theodi.org/guides/what-open-data and
http://theodi.org/guides/publishers-guide-open-data-licensing .
Open data is data that is made available by organisations, businesses and individuals for anyone
to access, use and share.
Open data has to have a licence that says it is open data. Without a licence, the data can’t be
reused.
The licence might also say:
● that people who use the data must credit whoever is publishing it (this is called attribution)
● that people who mix the data with other data have to also release the results as open data
(this is called share-alike)
● that people can do whatever they want with the data, if the holder has waived the
copyright or database rights (public domain)
Example: The Department for Education publishes open data about the performance of
schools in England. The data is available as CSV under the Open Government
Licence, which only requires reusers to say that they got the data from the Department for
Education.
These principles for open data are detailed in the Open Definition in the next paragraph.
Good open data
● is rich in documentation and metadata
● can be linked to, so that it can be easily shared and talked about
● is available in a standard, structured format, so that it can be easily processed
Open Definition
The Open Definition, created in 2005, is the main international standard for open data and open
data licences, and provides principles and guidance for all things “open”.
Open Data Mark: indicates compliance with Open Definition
Definition
You can find the full, up-to-date version of the Open Definition at http://opendefinition.org/od/ . The Open
Definition is a project by Open Knowledge, which provides further details and additional content on its official
web page.
This material is licensed under a CC 4.0 Attribution https://creativecommons.org/licenses/by/4.0/.
Open data is data that can be freely used, shared and built on by anyone, anywhere, for any
purpose. The “standard” provided by the Open Definition – common requirements that must be
met if data is to be called “open” – is crucial because much of the value of open data lies in the
ease with which different sources of open data can be combined – practically every app or
insight made with data requires combining several pieces of data. For example, you need to
know the bus timetable and have a map showing bus stops to be able to reach your destination on
time.
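The bus example can be sketched in a few lines of Python. This is only an illustration: the two CSV samples, their column names and the shared `stop_id` key are assumptions made for the sketch, not a real published dataset.

```python
import csv
import io

# Hypothetical sample data: two small open datasets that share a stop_id column.
STOPS_CSV = """stop_id,name,lat,lon
S1,Central Station,51.5010,-0.1240
S2,Market Square,51.5034,-0.1196
"""

TIMETABLE_CSV = """stop_id,route,departure
S1,12,08:05
S2,12,08:11
S1,7,08:20
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_departures_with_stops(timetable_rows, stop_rows):
    """Attach the stop name and coordinates to each departure via stop_id."""
    stops_by_id = {row["stop_id"]: row for row in stop_rows}
    joined = []
    for departure in timetable_rows:
        stop = stops_by_id.get(departure["stop_id"])
        if stop is None:
            continue  # skip departures whose stop is missing from the map data
        joined.append({
            "route": departure["route"],
            "departure": departure["departure"],
            "stop_name": stop["name"],
            "lat": float(stop["lat"]),
            "lon": float(stop["lon"]),
        })
    return joined


departures = join_departures_with_stops(
    read_rows(TIMETABLE_CSV), read_rows(STOPS_CSV)
)
```

The join only works because both sources use the same identifier for a stop, which is the technical side of the compatibility discussed here; the legal side is whether the two licences allow the combined result to be published.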
Both legal and technical compatibility are vital, and the Open Definition ensures that openly-licensed
data can be combined successfully. This eliminates the risk of a “Tower of Babel” of data, with a
proliferation of licences and terms of use for open data leading to complexity and incompatibility.
The Open Definition prevents this fragmentation – and resulting destruction in value – by ensuring
a common standard for all “open” data. Evidence for the practical success of the effort can be
found in the reuse of the definition's key principles and language in other important areas, including
UK and US government policy, and in the transition in terminology from “public sector
information” to “open government data”.
Thanks to the efforts of many translators in the community, the Open Definition is available in 30+
languages.
The Open Definition explains what counts as an open work and an open license. The term work
is used to denote the item or piece of knowledge being transferred. The term license refers to the
legal conditions under which the work is made available. Where no license has been offered this
should be interpreted as referring to default legal conditions governing use of the work (for
example, copyright or public domain).
Open Works
An open work must satisfy the following requirements in its distribution:
● Open License
The work must be available under an open license (as defined in Section 2). Any additional
terms accompanying the work (such as terms of use, or patents held by the licensor) must not
contradict the terms of the license.
● Access
The work shall be available as a whole and at no more than a reasonable one-time
reproduction cost, preferably downloadable via the Internet without charge. Any additional
information necessary for license compliance (such as names of contributors required for
compliance with attribution requirements) must also accompany the work.
● Open Format
The work must be provided in a convenient and modifiable form such that there are no
unnecessary technological obstacles to the performance of the licensed rights. Specifically,
data should be machine-readable, available in bulk, and provided in an open format (i.e., a
format with a freely available published specification which places no restrictions, monetary
or otherwise, upon its use) or, at the very least, can be processed with at least one
free/libre/open-source software tool.
Open Licenses
A license is open if its terms satisfy the following conditions:
● Required Permissions: The license must irrevocably permit (or allow) the following:
1.1 Use: The license must allow free use of the licensed work.
1.2 Redistribution: The license must allow redistribution of the licensed work, including sale,
whether on its own or as part of a collection made from works from different sources.
1.3 Modification: The license must allow the creation of derivatives of the licensed work and allow
the distribution of such derivatives under the same terms of the original licensed work.
1.4 Separation: The license must allow any part of the work to be freely used, distributed, or
modified separately from any other part of the work or from any collection of works in which it was
originally distributed. All parties who receive any distribution of any part of a work within the terms
of the original license should have the same rights as those that are granted in conjunction with the
original work.
1.5 Compilation: The license must allow the licensed work to be distributed along with other
distinct works without placing restrictions on these other works.
1.6 Non-discrimination: The license must not discriminate against any person or group.
1.7 Propagation: The rights attached to the work must apply to all to whom it is redistributed
without the need to agree to any additional legal terms.
1.8 Application to Any Purpose: The license must allow use, redistribution, modification, and
compilation for any purpose. The license must not restrict anyone from making use of the work in a
specific field of endeavor.
1.9 No Charge: The license must not impose any fee arrangement, royalty, or other compensation
or monetary remuneration as part of its conditions.
● Acceptable Conditions: The license shall not limit, make uncertain, or otherwise diminish the permissions required
in the previous section except by the following allowable conditions:
Attribution: The license may require distributions of the work to include attribution of contributors,
rights holders, sponsors and creators as long as any such prescriptions are not onerous.
Integrity: The license may require that modified versions of a licensed work carry a different name
or version number from the original work or otherwise indicate what changes have been made.
Share-alike: The license may require copies or derivatives of a licensed work to remain under a
license the same as or similar to the original.
Notice: The license may require retention of copyright notices and identification of the license.
Source: The license may require modified works to be made available in a form preferred for
further modification.
Technical Restriction Prohibition: The license may prohibit distribution of the work in a manner
where technical measures impose restrictions on the exercise of otherwise allowed rights.
Non-aggression: The license may require modifiers to grant the public additional permissions (for
example, patent licenses) as required for exercise of the rights allowed by the license. The license
may also condition permissions on not aggressing against licensees with respect to exercising any
allowed right (again, for example, patent litigation).
A list of conformant licenses is available at http://opendefinition.org/licenses/ .
We explore licensing in the next section.
2. Understanding Law and Licensing
This section provides additional material on the licenses that applicants are
invited to look for. Below is an extended list of licenses that conform to the
principles laid out in the Open Definition.
Conformant Licenses
The following licenses are conformant with the principles set forth in the Open Definition.
● Domain = Domain of application, i.e. what type of material this license should/can be
applied to. Note if you are looking for an open license for software, please see Open
Source Definition conformant licenses.
● BY = requires attribution
● SA = requires share-alike
● Recommended conformant licenses
These licenses conform to the Open Definition and are:
● Reusable: Not specific to an organization or jurisdiction.
● Compatible: Must be compatible with at least one of GPL-3.0+, CC-BY-SA-4.0, and
ODbL-1.0. Permissive/attribution-only licenses must be compatible with all 3 of the
aforementioned licenses, and at least one of Apache-2.0, CC-BY-4.0, and ODC-BY-1.0.
● Current: Widely used and generally considered best practice by a broad spectrum of
projects and actors within the domains of applicability of the license.
License | Domain | BY | SA | Comments
Creative Commons CCZero (CC0) | Content, Data | N | N | Dedicate to the Public Domain (all rights waived)
Open Data Commons Public Domain Dedication and Licence (PDDL) | Data | N | N | Dedicate to the Public Domain (all rights waived)
Creative Commons Attribution 4.0 (CC-BY-4.0) | Content, Data | Y | N |
Open Data Commons Attribution License (ODC-BY) | Data | Y | N | Attribution for data(bases)
Creative Commons Attribution Share-Alike 4.0 (CC-BY-SA-4.0) | Content, Data | Y | Y |
Open Data Commons Open Database License (ODbL) | Data | Y | Y | Attribution-ShareAlike for data(bases)
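For reusers who want to filter datasets by licence terms, the recommended licences listed above can be written down as a small lookup structure. The sketch below is one possible encoding: the `License` class, the `RECOMMENDED` list and the `find_licenses` helper are names invented for this illustration, and only the Domain, BY and SA values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    code: str
    domains: tuple      # "Content" and/or "Data", as in the Domain column
    attribution: bool   # BY column
    share_alike: bool   # SA column


# The six recommended conformant licences from the table above.
RECOMMENDED = [
    License("CC0", ("Content", "Data"), False, False),
    License("PDDL", ("Data",), False, False),
    License("CC-BY-4.0", ("Content", "Data"), True, False),
    License("ODC-BY", ("Data",), True, False),
    License("CC-BY-SA-4.0", ("Content", "Data"), True, True),
    License("ODbL", ("Data",), True, True),
]


def find_licenses(domain, attribution, share_alike):
    """Return the codes of recommended licences matching the requested terms."""
    return [
        lic.code
        for lic in RECOMMENDED
        if domain in lic.domains
        and lic.attribution == attribution
        and lic.share_alike == share_alike
    ]


attribution_only_for_data = find_licenses("Data", attribution=True, share_alike=False)
```

A lookup like this is only a convenience for filtering; the licence text itself remains the authority on what is allowed.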
● Other conformant licenses
These licenses conform to the Open Definition, but do not meet reusability or compatibility
requirements for recommended licenses, or have been superseded by newer license versions or
newer licenses with similar use cases, or are little-used. These licenses may be reasonable for the
particular organization they were crafted for to use, or to use for legacy reasons. Projects outside
such contexts are strongly advised to use a recommended conformant license from the list above.
License | Domain | BY | SA | Comments
Against DRM | Content | Y | Y | Little used.
Creative Commons Attribution versions 1.0-3.0 | Content | Y | N | Includes all jurisdiction "ports"; superseded by CC-BY-4.0.
Creative Commons Attribution-ShareAlike versions 1.0-3.0 | Content | Y | Y | Includes all jurisdiction "ports"; superseded by CC-BY-SA-4.0. Additionally, CC-BY-SA-1.0 is incompatible with any other license.
Data licence Germany – attribution – version 2.0 | Data | Y | N | Non-reusable. For use by Germany government licensors. Note version 1.0 is not approved as conformant.
Data licence Germany – Zero – version 2.0 | Data | N | N | Non-reusable. For use by Germany government licensors. Note there is no previous version.
Design Science License | Content | Y | Y | Little used; incompatible with any other license.
EFF Open Audio License | Content | Y | Y | Deprecated in favor of CC-BY-SA.
Free Art License (FAL) | Content | Y | Y |
GNU Free Documentation License (GNU FDL) | Content | Y | Y | Incompatible with any other license. Only conformant if used with no cover texts and no invariant sections.
MirOS License | Code, Content | Y | N | Little used.
Open Government Licence Canada 2.0 | Content, Data | Y | N | Non-reusable. For use by Canada government licensors. Note version 1.0 is not approved as conformant.
Open Government Licence United Kingdom 2.0 and 3.0 | Content, Data | Y | N | Non-reusable. For use by UK government licensors; re-uses of OGL-UK-2.0 and OGL-UK-3.0 material may be released under CC-BY or ODC-BY. Note version 1.0 is not approved as conformant.
Talis Community License | Data | Y | Y | Draft only. Deprecated in favour of ODC licenses.
Non-Conformant Licenses
Non-conformant licenses are usually those that support some of the definition’s
principles but not all of them.
● Creative Commons No-Derivatives Licenses
Creative Commons No-Derivatives (by-nd-*) violate OD 1.1#3., “Reuse”, as they do not allow
works, in part or in whole, to be re-used in derivative works.
Creative Commons licenses with the NoDerivs stipulation include:
● Attribution-NoDerivs (by-nd)
● Attribution-NonCommercial-NoDerivs (by-nc-nd)
● Creative Commons NonCommercial
Creative Commons NonCommercial licenses (by-nc-*) do not support the OD 1.1#8., “No
Discrimination Against Fields of Endeavor”, as they exclude usage in commercial activities.
Creative Commons licenses with the non-commercial stipulation include:
● Attribution-Noncommercial (by-nc)
● Attribution-NonCommercial-ShareAlike (by-nc-sa)
● Attribution-NonCommercial-NoDerivs (by-nc-nd)
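The rule described in this subsection (a Creative Commons licence is not open if it carries a NonCommercial or NoDerivs element) is simple enough to check mechanically. The function below is a minimal sketch that assumes licence codes are written in the hyphenated short form used here, such as `by-nc-sa`; the function name is hypothetical and the check says nothing about licences outside the Creative Commons family.

```python
def is_open_cc_license(code):
    """Return True if a Creative Commons short code has no NC or ND element.

    Assumes hyphen-separated short codes such as "by", "by-sa" or "by-nc-nd".
    """
    elements = code.lower().split("-")
    return "nc" not in elements and "nd" not in elements


# Short codes taken from the lists in this section.
results = {
    code: is_open_cc_license(code)
    for code in ["by", "by-sa", "by-nd", "by-nc", "by-nc-sa", "by-nc-nd"]
}
```

Only `by` and `by-sa` pass the check, which matches the attribution and attribution share-alike licences listed as conformant earlier in this section.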
Licence Compatibility
The applicants, as reusers and publishers of open data, often need to understand whether the
licenses applied to datasets are "compatible".
We recommend that Finodex proposers have a look at this page:
https://github.com/theodi/open-data-licensing/blob/master/guides/licence-compatibility.md
The most important step towards understanding compatibility in more detail is to understand the
basic provisions of each license.
The Creative Commons Rights Expression Language defines some basic facets of licenses,
covering Permissions, Requirements and Prohibitions. As the CC licenses are already described
using these facets, which are also common to many other licenses, it is possible to put together a
matrix that identifies which facets apply to which licenses.
Table 1 summarises how a number of licenses can be classified based on these facets.
There are several things to note here:
● The Share Alike requirement requires that derived data is published under the same or
compatible terms as the original. This places limits on how remixes can be distributed, i.e.
only under compatible terms.
● The Derivative Works prohibition prevents re-users from creating any form of derivative
work at all, even if those derivatives are not distributed. However, it is still possible to
include the database in a collection in which the original is preserved.
When it comes to publishing derivatives there are, broadly, two different scenarios to consider:
publishing a simple derivative based on a single source, and
publishing a remix of several datasets.
Once a derivative has been created, it too can be the source of further derivation.
Derivation is a process that can be repeated either by the original publisher (e.g. mixing in
additional datasets) or by third parties (e.g. to create new derivatives).
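The facets discussed above can be turned into a rough first-pass check for a remix. The sketch below is an assumption-laden illustration, not legal advice and not the ODI's method: the `Source` class and `remix_obligations` function are invented names, and whether two different share-alike licences are actually compatible is deliberately left as a flag for manual review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    licence: str
    attribution: bool = False      # Attribution requirement
    share_alike: bool = False      # Share Alike requirement
    no_derivatives: bool = False   # Derivative Works prohibition


def remix_obligations(sources):
    """Summarise what the licence facets of the sources imply for a remix."""
    blocked_by = [s.name for s in sources if s.no_derivatives]
    share_alike_licences = sorted({s.licence for s in sources if s.share_alike})
    return {
        # Any source that prohibits derivatives blocks the remix outright.
        "can_publish_derivative": not blocked_by,
        "blocked_by": blocked_by,
        # Every source with an attribution requirement must be credited.
        "must_credit": [s.name for s in sources if s.attribution],
        # Share-alike sources constrain the licence of the result.
        "share_alike_licences": share_alike_licences,
        # More than one share-alike licence needs a human compatibility check.
        "needs_compatibility_review": len(share_alike_licences) > 1,
    }


# Hypothetical remix of two datasets.
summary = remix_obligations([
    Source("Bus stops", "ODbL", attribution=True, share_alike=True),
    Source("Timetable", "CC-BY-4.0", attribution=True),
])
```

Because derivation can be repeated, the same check would be run again whenever a published remix becomes the source of a further derivative.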
Questions about licence compatibility:
● Can some data published with Licence X be combined with some additional data published
under Licence Y?
● What license(s) could be applied to a derived or aggregated dataset?
● Are there provisions associated with a licence that inhibit or constrain the creation and
Set of questions for open data publishers and reusers
Author: David Tarrant
● Do you have rights or permission to publish?
● Do you have rights to use the information/data?
● Is the data derived from other sources?
Further readings:
http://www.scribd.com/doc/128356210/Business-considerations-or-privacy-and-open-data-how-not-to-get-caught-out
http://www.scribd.com/doc/125638490/Getting-to-grips-with-the-National-Pupil-Database-personal-data-in-an-open-data-world
USEFUL GUIDES for reusers and publishers released by The Open Data
Institute
The ODI Publisher's Guide to Open Data Licensing
Source: http://theodi.org/guides/publishers-guide-open-data-licensing
In Europe, there are two kinds of rights that you are automatically given over things that you have
created:
● you get copyright over works (content) that you create and which are original to you, such
as text that you write or photographs you take
● you get a database right over collections of data that you have put a substantial effort into
obtaining, verifying or presenting
Note: As far as we know the database right only arises within the European Union and in Mexico.
In some countries there may be no protection for collections of data.
Database right: 15 years since database was last updated
Database copyright: Life of author + 70 years from date database was created
Suggestion for the proposers:
If you are uncertain about what rights you may have over a
piece of content or dataset or how you can use it...
Contact the owner. Ask.
If you apply original judgement in putting together a database, for example in choosing which items
to include within the database or which information about them to include, you have a copyright
over that database, because it is a creative work.
For example, if you were to build a database about the best 100 cars, this might involve:
● choosing which cars count as the best cars
● writing a description about each car
● researching and gathering facts about them
You would have copyright over the database, because you chose which cars were “best”. You
would have copyright over the descriptions, because you wrote them. And you would probably
have the database right for the database you’ve built, because you put substantial effort into
gathering information about them. Importantly, you don’t own the facts about the cars — anyone
else can build their own database containing exactly those facts without violating your database
right — but no one else can reuse your database or your descriptions without your permission
because you own the copyright over them.
You probably do not have a database right if you create the facts in a database, as opposed to
gathering them from elsewhere, unless you put substantial effort into verifying or presenting the
database. For example, if you own a restaurant and create a database of the dishes that you offer
and when you offer them, you probably do not have a database right over that database, though
you might have copyright because of the creative judgement involved in working out which dishes
should be offered on particular days to provide a balanced menu.
Copyright and database right are types of Intellectual Property Rights (IPR). There are other kinds
of IPR that you can get, such as patents, trademarks and (some) design rights, which must be
registered (for example with the Intellectual Property Office).
Database definition
“A collection of independent works, data or materials which are
a) arranged in a systematic or methodical way and
● What About Data From Other Organisations?
You might not own all the content or data that you have and use within your organisation. In
particular, rather than creating the content or gathering the data yourself, some of the content and
data you hold and use within your organisation, and might want to publish, might:
● be completely licensed from someone else
● include an extract of content or data that you have licensed from someone else
● be derived from the content or data that you have licensed from someone else
The Reuser’s Guide to Open Data Licensing describes what you can do with content or data that
you license from someone else. If you do reuse that content or data in your own publications, you
should indicate the licence under which you are reusing that content, so that people reusing that
content or data know what they can do with it.
● What About My Brand?
Organisations who publish content or data under an open licence are often concerned that this
might enable reusers to also copy their brand.
Your brand should be protected through a trade mark. A trade mark restricts how other people use
your logo or company name. You will also have copyright on the logo.
Although your trade mark will protect you from other people using your logo directly, if your logo is
incorporated into some content that you license, you should make sure the logo is explicitly not
covered by that licence, as you will usually want to place additional restrictions on its use (such as
its adaptation).
For example, if you have written a report that includes your logo, and you want to license the
content of the report under the Creative Commons Attribution licence, you could say:
The text, figures and tables in this report are licensed under a Creative Commons Attribution 4.0
International License.
What If I Publish the Data on a Website?
You still have rights over your database and your content when you publish them on a website.
Others cannot legally extract and reuse a substantial portion of your data or content without your
permission.
You can also indicate that others should not scrape data from your website through your Terms
and Conditions and through technical mechanisms such as robots.txt.
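On the reuser's side, a well-behaved tool can read these robots.txt rules before fetching anything. The following is a minimal sketch using Python's standard `urllib.robotparser` module; the robots.txt content, the `example.org` URLs and the `my-reuse-bot` user agent are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt in which a publisher asks crawlers to stay out of /data/.
ROBOTS_TXT = """User-agent: *
Disallow: /data/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


data_allowed = may_fetch(ROBOTS_TXT, "my-reuse-bot", "https://example.org/data/schools.csv")
about_allowed = may_fetch(ROBOTS_TXT, "my-reuse-bot", "https://example.org/about")
```

Note that robots.txt only signals the publisher's wishes to automated tools; the rights described in this section apply whether or not such a file exists.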
There are two sets of open licences. You should use a licence from one of these sets rather than
creating your own licence, for three reasons:
1. it’s less work
2. it ensures that the legal language in the licence is correct
3. it makes it a lot easier for reusers to know what they can do with your data
● Open Licences for Creative Content
Creative content, such as text, photographs, slides and so on, should be licensed using a Creative
Commons Licence. There are three of these that you should consider using for open content:
Level of Licence | Creative Commons Licence
public domain | CC0
attribution | CC-by
attribution & share-alike | CC-by-sa
Make sure that you use the latest (version 4.0) Creative Commons licenses, which are
international. The links in the table above go to the correct licences.
There are other types of Creative Commons licences that are not open licences. For example, the
Creative Commons Attribution-NonCommercial licence does not allow commercial reuse of
content, and therefore is not an open licence. If you use the Creative Commons licence chooser,
only those that are described as “Free Culture” licences are open licences.
● Open Licences for Databases
We now recommend that you also use a Creative Commons 4.0 licence for data as well as for
content.
You may alternatively use a similar set of licences that was created specifically for databases from
the Open Data Commons. There are again three levels that you can choose from:
Level of Licence | Open Data Commons Licence
public domain | PDDL
attribution | ODC-by
attribution & share-alike | ODbL
The ODbL is used for OpenStreetMap.
You can find more details here: https://blog.openstreetmap.org/2014/08/06/at-the-edge-of-the-license/
Which Licence Should I Use?
The licence that you use should support your open data business model. It is unusual for
organisations to place content or data in the public domain as being given attribution for the
content or data usually helps to achieve some of the goals of opening it up.
It is possible to license content or data under more than one licence, and let reusers choose which
licence to use it under. Typically you would dual-license some content or data by making it
available under an open licence and under a paid-for licence that does not have the same
restrictions. Dual-licensing is typically used with a share-alike licence, as outlined below.
Some open data business models work best with a share-alike licence. For example:
● a share-alike licence will usually be unattractive to commercial businesses who don’t want
to open up their own data, so using a share-alike licence coupled with a charged licence
can be a good basis for a freemium business model
● when you are collaborating with others to create a shared resource, a share-alike licence
can help to ensure that you can bring back into that resource any work that others do on
their own copies
On the other hand, if you are hoping to gain other benefits for your business through the reuse of
your data, using a cross-subsidy business model, you may find that a share-alike licence prevents
people from reusing it, and therefore want to avoid having a share-alike restriction.
There are two cases where you have no choice over what licence you can use for the content or
data that you publish.
1. If you are publishing content or data that is derived from content or data that was
licensed to you using a share-alike licence, then you must publish your content or data
using that same licence.
2. With very few exceptions, if you are a government department or arm’s-length body then
the content or data that you have created or gathered is owned by the Crown. Unless
you have an exemption, granted by the Office of Public Sector Information (OPSI), you
must publish this data using the Open Government Licence.
What Attribution Should I Ask For?
If you choose a licence that includes a requirement for attribution, you need to specify what that
attribution should look like.
In choosing what attribution to ask for, you should consider the ways in which your data or content
might be reused, and the fact that it might be combined with other data or content that might
require its own attribution. If you want to encourage the reuse of your data or content, you need to
make it easy for reusers to satisfy your attribution requirements.
There are two things you should document:
1. What should the attribution include? You will usually want the name of your
organisation, and a link to either your organisation’s home page or a page about the
data or content you are licensing. Keep this as minimal as possible.
2. Where and how should the attribution be presented? Some attribution requirements
specify that the attribution must be presented directly wherever the data is used, and
may even specify the size or format of the attribution. These requirements can be
difficult to adhere to, particularly for mobile application developers who have limited
screen space to include such attributions. Allowing reusers to provide attribution on a
separate page makes this easier.
Note that under the terms of the licences listed above, when a reuser uses your data or content to
add value to or to create new data or content, they cannot relicense your work. Any onward
reusers are bound by the same attribution requirements as the direct reusers of your content or
data. It’s a good idea to explicitly document this requirement because it might not be obvious to
reusers.
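To make attribution easy for reusers, a publisher can document exactly which pieces of information are needed, and a reuser can then generate a credits page from them. The sketch below shows one way to do that; the sentence wording, the function names and the `example.org` links are assumptions for illustration and are not prescribed by any of the licences above.

```python
def attribution_line(publisher, url, licence_name):
    """Build a one-line credit from the three items a publisher asks for."""
    return f"Contains data from {publisher} ({url}), licensed under {licence_name}."


def credits_page(sources):
    """Combine the credits for several sources into one block of text.

    Each source is a (publisher, url, licence_name) tuple. Duplicates are
    dropped so that a source reused in several places is credited once.
    """
    lines = []
    for publisher, url, licence_name in sources:
        line = attribution_line(publisher, url, licence_name)
        if line not in lines:
            lines.append(line)
    return "\n".join(lines)


# Hypothetical sources used by a mobile app; URLs are placeholders.
page = credits_page([
    ("Department for Education", "https://example.org/schools", "Open Government Licence"),
    ("Example Transport Authority", "https://example.org/bus", "CC-BY-4.0"),
    ("Department for Education", "https://example.org/schools", "Open Government Licence"),
])
```

Keeping the credits on a separate page like this is the approach suggested above for applications with limited screen space.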
How Do I Indicate the Licence of Content or Data?
You should indicate the licence for content or data you make available using both a human-
readable description and computer-readable metadata. The clearer you make it which licence
applies to your content or data, the easier it is for reusers to know that they can reuse the content
or data you are licensing.
The human-readable descriptions and marks that you should use are spelled out on the Creative
Commons and Open Data Commons websites:
● Creative Commons licence chooser
● Open Data Commons licences
It is best to embed information about the licence that some content or data is available under
directly within the content or data. This ensures that the licensing information is carried around with
the content or data.
In addition to human-readable text, you should provide computer-readable metadata. The separate
Publisher’s Guide to the Open Data Rights Statement Vocabulary describes how to do this.
If you add your dataset to a catalog, such as data.gov.uk or the Data Hub, you should make sure
that you indicate the licence under which the dataset is available within that catalog. This gives
people searching the catalog a quick and easy way of seeing that they will be able to reuse the
dataset.
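As a rough illustration of computer-readable licence information, the sketch below writes a small JSON record that travels with a dataset. The field names (`title`, `licence`, `attribution`) and the dataset title are hypothetical choices for this example; they are not the Open Data Rights Statement Vocabulary and not the schema of any particular catalog, which should be checked separately.

```python
import json


def dataset_metadata(title, licence_name, licence_url, attribution_text):
    """Build a metadata record with hypothetical field names."""
    return {
        "title": title,
        "licence": {"name": licence_name, "url": licence_url},
        "attribution": attribution_text,
    }


record = dataset_metadata(
    title="School performance (sample)",
    licence_name="Creative Commons Attribution 4.0",
    licence_url="https://creativecommons.org/licenses/by/4.0/",
    attribution_text="Example Publisher",
)

# Serialise the record so that it can be stored next to the data file.
as_json = json.dumps(record, indent=2, sort_keys=True)
```

Pointing to the licence by URL rather than by name alone makes it easier for software to recognise which licence applies.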
The ODI Reuser's Guide to Open Data Licensing
Source: http://theodi.org/guides/reusers-guide-open-data-licensing
The fact that you can get hold of some information does not necessarily mean that you can do
whatever you want with it. You need to have permission from the owner of that information to do
what you want to do. A licence tells you what you can do.
But what does it mean to license data? What requirements can a licence place on you? What
different licences do publishers use? How can you find out what licence a dataset is available
under? This guide answers these questions.
Note: This guide focuses on data published by organisations based in the UK. Licensing law is
different in different countries, so some of this information might not apply to you if you are reusing
information that is published elsewhere. It does not address other potential legal considerations,
such as compliance with the Data Protection Act.
● What Do Publishers Own?
In Europe, there are two kinds of rights that publishers — organisations or individuals who make
available content or data — are given over things that they have created:
● they get copyright over works (content) that they create and which are original to them,
such as text that they write or photographs they take
● they get a database right over collections of data that they have put a substantial effort
into obtaining, verifying or presenting
Note: As far as we know the database right is unique to the European Union. In some countries
there may be no protection for collections of data.
If someone applies original judgement in putting together a database, for example in choosing
which items to include within the database or which information about them to include, they have a
copyright over that database, because it is a creative work.
For example, if someone were to build a database about the best 100 cars, this might involve:
● choosing which cars count as the best cars
● writing a description about each car
● researching and gathering facts about them
They would have copyright over the database, because they chose which cars were “best”. They
would have copyright over the descriptions, because they wrote them. And they would probably
have the database right for the database they’ve built, because they put substantial effort into
gathering information about the cars. Importantly, they don’t own the facts about the cars — you or
anyone else could build your own database containing exactly those facts without violating their
database right — but no one else can reuse their database or their descriptions without their
permission because they own the copyright over them.
Publishers probably do not have a database right if they create the facts in a database, as opposed
to gathering them from elsewhere, unless they put substantial effort into verifying or presenting the
database. For example, if someone owns a restaurant and creates a database of the dishes that
they offer, and when they offer them, they probably do not have a database right over that
database, though they might have copyright because of the creative judgement involved in working
out which dishes should be offered on particular days to provide a balanced menu.
● What About Data From Third Parties?
Publishers might not own all the content or data that they publish themselves. In particular, rather
than having been created or gathered by the publisher, some of the content and data they
publish might:
● be completely licensed by them from someone else
● include an extract of content or data that they have licensed from someone else
● be derived from the content or data that they have licensed from someone else
When they publish the data, the publisher should tell you about which content or data is owned by
another organisation, and under which licence it is being republished.
● What About Brands?
Brands are usually protected through a trade mark. A trade mark restricts how you can use an
organisation’s logo or company name. The organisation will usually also hold copyright in the logo.
Licences for content or data usually explicitly exclude logos and company names, so you cannot,
for example, adapt a logo by changing the colours used within it. You also cannot use the company
name or logo to lend weight to your product without permission to do so. However, the attribution
requirements of a licence may require you to use the company name and logo to indicate that you
have reused data owned by that company.
● What Can’t You Do?
There are a few things that you can do with content or data without a licence, but in general you
need to be given a licence by a publisher if you want to reuse their content or data. Having access
to some content or data — for example by downloading it from a publisher’s website — does not
give you the right to reuse it.
● Republishing and Adding Value
You do not automatically have the right to republish, in its entirety, content or data that someone
else owns, even if they have given you a licence to use it yourself. You need to check the terms of
the licence for the content or data to make sure that you can republish it.
The same applies if you are adding value to the content or data, for example by automatically
adding links or styling to content, or adding columns with extra information into a dataset. The new
content or data includes the entirety of someone else’s content or data, so you cannot publish it
unless you have their permission.
● Publishing Extracts
You have the right to publish extracts of content or databases that you have access to, regardless
of what the licence says, so long as the extract is not “substantial”. However, it is often hard to tell
if the extract that you have made is “substantial”.
The licence that you have been given might let you republish any amount of the content or data
(open licences do this). Otherwise, you should take legal advice about whether the extracts that
you want to publish are likely to count as substantial or not.
● Publishing Derived Content or Data
You might want to create new content or databases by adapting, deriving, or otherwise processing
some content or data. To do that, you first have to ensure you have been given a licence to use the
data in the first place. You then need to look at what the licence says about creating derived works.
For example, say you have been given a licence to use a photograph on your website. You could
create a new version of that photograph by changing it from colour to black & white, or by adding a
speech bubble to it.
In this case, the photograph is a creative work, and the person who took it owns the copyright.
Because the photograph is protected by copyright, you can only create these new images if the
licence under which you are using the photograph allows you to do so.
Copyright can exist in small pieces of content, such as phrases. For example, if you analyse some
content to create a new database, you should make sure that you have the right to reuse any
snippets of content that you might keep in the new database. If the content includes a presentation
of data from a database, you have to consider database rights as well: scraping data from the page
might equate to creating an extract.
Database rights are slightly different, because they only extend to creating extracts or re-utilising
(republishing) a database.
For example, say you analysed the data about prescriptions of each drug within each GP practice
within the UK, along with other data about the coverage of each practice, to create a new dataset
that provided the average spend per patient of each practice. So long as you had no separate
contractual obligations to the owners of the two datasets you have brought together, you might well
be free to do what you liked with the result, as it would not be possible to reconstruct the original
databases from the aggregated data.
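The aggregation described above can be sketched in Python. This is only an illustration under stated assumptions: the field names (practice_id, cost) and the sample figures are invented for the example and do not come from any real prescribing or coverage dataset.

```python
from collections import defaultdict

# Hypothetical sample rows; a real prescribing dataset would be far larger.
prescriptions = [
    {"practice_id": "A", "drug": "statin", "cost": 120.0},
    {"practice_id": "A", "drug": "aspirin", "cost": 30.0},
    {"practice_id": "B", "drug": "statin", "cost": 200.0},
]

# Hypothetical coverage data: registered patients per practice.
coverage = {"A": 50, "B": 100}


def average_spend_per_patient(prescriptions, coverage):
    """Return {practice_id: total prescription cost / registered patients}."""
    totals = defaultdict(float)
    for row in prescriptions:
        totals[row["practice_id"]] += row["cost"]
    # Only one aggregate figure per practice is kept in the result.
    return {
        practice: totals[practice] / patients
        for practice, patients in coverage.items()
        if patients > 0
    }


result = average_spend_per_patient(prescriptions, coverage)
```

The result holds a single number per practice, so the individual prescription rows cannot be rebuilt from it. Whether a particular derived dataset is free of database right is still a legal question, not something the code can decide.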
● What Do Licences Say?
Licences tell you what you can do with the content or data that you access. A licence will tell you
whether you can:
● republish the content or data on your own website
● derive new content or data from it
● make money by selling products that use it
● republish it while charging a fee for access
Many licences will let you access content or data for free, but say that you cannot republish it or
adapt it, or use it within commercial products. If you break the terms of the licence, the owner of
the content or data can take you to court.
● What Do Open Licences Say?
An open licence is one that places very few restrictions on what you can do with the content or
data that is being licensed.
According to the Open Definition, there are only two kinds of restrictions that an open licence can
place:
● that you must give attribution to the source of the content or data
● that you must publish any derived content or data under the same licence (this is called
share-alike)
An open licence might do neither or one or both of these. So, you might encounter content or data
available under one of three levels of licence:
1. a public domain licence has no restrictions at all (technically, these indicate that the rights
owner has waived their rights to the content or data)
2. an attribution licence just says that you must give attribution to the publisher
3. an attribution & share-alike licence says that you must give attribution and share any
derived content or data under the same licence
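The three levels can be modelled as two yes/no flags. The sketch below is one possible representation, not an official classification: the class and function names are hypothetical, and the flag values given for the Creative Commons licences follow the levels this material assigns to them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenceTerms:
    """The two restrictions an open licence may impose."""
    attribution: bool
    share_alike: bool


def licence_level(terms: LicenceTerms) -> str:
    """Map the two flags onto the levels listed above."""
    if terms.attribution and terms.share_alike:
        return "attribution & share-alike"
    if terms.attribution:
        return "attribution"
    if terms.share_alike:
        # Not one of the three listed levels; reported separately.
        return "share-alike only"
    return "public domain"


# Hypothetical lookup table for illustration.
EXAMPLES = {
    "CC0": LicenceTerms(attribution=False, share_alike=False),
    "CC-by": LicenceTerms(attribution=True, share_alike=False),
    "CC-by-sa": LicenceTerms(attribution=True, share_alike=True),
}

levels = {name: licence_level(terms) for name, terms in EXAMPLES.items()}
```

A table like this is only a convenience for filtering datasets; the licence text itself remains the authority on what reuse is allowed.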
● How Do You Provide Attribution?
You should provide attribution even if the licence does not require it. Giving attribution is a way of
recognising both the efforts that the publisher has made to put together the content or data you are
reusing, and their generosity in making it available for reuse.
When content or data is licensed using a licence that includes attribution, the publisher might
specify:
● what wording the attribution should include
● where and how the attribution should be presented
You should follow what the publisher asks you to do. If it is not practical, for example if you are
providing a service that does not have room for the attribution statement that they request, then get
in touch with them to ask what to do.
It is good practice to provide the name of the organisation that published the data or content, and a
link to their home page. Specifying the name of the dataset and providing a link to its location also
helps other reusers to find the data you are reusing.
If you are building a tool that reuses some content or data, you should try to include attribution on
every page or screen in which the content or data is used. If this is impractical (for example
because you are pulling together information from lots of different sources), you should provide a
clear link to a page or screen that then provides attribution information.
If you are republishing data or content, its reusers are still bound by the attribution requirements of
the original data or content. To make it easier for them to understand and fulfil those requirements,
it is good practice to include the attribution for the source data or content in the attribution that you
ask for. This might sometimes be impractical, for example because you are creating derived data
or content that includes data or content from a large number of sources. In these cases, you should
provide a full list of the sources and request an attribution which links to that list.
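As an illustration of the good practice described above, the following Python sketch assembles an attribution line from the publisher name, home page, dataset name and dataset location. The function name, its parameters and the "Source: ..." wording are assumptions made for this example; where a publisher specifies its own wording, that wording should be used instead, which the sketch models with the requested_wording parameter.

```python
def build_attribution(publisher, publisher_url, dataset=None,
                      dataset_url=None, requested_wording=None):
    """Return an attribution line for reused content or data."""
    # The publisher's own wording, if given, takes precedence.
    if requested_wording:
        return requested_wording
    text = f"Source: {publisher} ({publisher_url})"
    # Naming the dataset helps other reusers to find it.
    if dataset and dataset_url:
        text += f", dataset: {dataset} ({dataset_url})"
    return text


# Hypothetical publisher and dataset, for illustration only.
line = build_attribution(
    "Example Agency",
    "https://example.org",
    dataset="Bus stops",
    dataset_url="https://example.org/data/bus-stops",
)
```

A tool that pulls data from many sources could call such a function once per source and show the resulting lines on a dedicated attribution page.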
● How Do You Share-Alike?
A share-alike licence requires you to republish new content or data that you create using the given
content or data under the same, share-alike licence. Creating new ways of presenting data does
not count as derivation or adaptation, but combining two sets of data to create a new set probably
does.
Publishing the content and data that you create from open data, as open data, is a good thing to
do even if the licence does not require it. Opening up your content and data enables others to
reuse and build on your work, and can add value to your work.
● What Open Licences Are There?
There are two sets of open licences that you may encounter.
● Open Licences for Creative Content
Creative content, such as text, photographs, slides and so on, may be licensed using a Creative
Commons Licence. There are three of these that you might encounter:
Level of Licence            Creative Commons Licence
public domain               CC0
attribution                 CC-by
attribution & share-alike   CC-by-sa
There are different versions for each of these licences, the most recent being version 4.0. There
are also different variants which take into account differences in the law in different countries. The
links in the table above are to the version 4.0 versions, which apply internationally, but you may
find publishers using other versions. You can reuse content under these licences no matter what
country you are in.
There are other types of Creative Commons licences that are not open licences. For example, the
Creative Commons Attribution-NonCommercial licence does not allow commercial reuse of
content, and therefore is not an open licence. The human-readable summaries of the Creative
Commons licences spell out exactly what you can do under each licence.
● Open Licences for Databases
You might encounter a similar set of licences which is available for databases from the Open Data
Commons. There are again three levels:
Level of Licence            Open Data Commons Licence
public domain               PDDL
attribution                 ODC-by
attribution & share-alike   ODbL
● Other Licences
There are other licences that enable reuse and which you may encounter, particularly around
public sector information:
● Open Government Licence is an attribution licence that covers both copyright and
database right and is mainly used for information made available by UK central government
● OS Open Licence is an attribution licence that is exactly the same as the Open
Government Licence but ensures that the attribution is to the Ordnance Survey
● How is the Licence Indicated?
The licence under which information is published should be clear both in human-readable content
and as machine-readable data. If you cannot work out the licence for information that you discover
on the web, you should contact the owner of the site to ask: the lack of licensing information means
that you cannot assume the right to reuse the content or data.
Human-readable descriptions and marks that you may encounter are shown on the Creative
Commons and Open Data Commons websites:
● Creative Commons licence chooser
● Open Data Commons licences
Where possible, the publisher should have embedded information about the licence directly within
the content or data itself. Often, however, you will have to look at the page from which you access
the content or data, or the licence information for the entire website, which is often linked to from
the footer of the page.
If a publisher adds their dataset to a catalog, such as data.gov.uk or the Data Hub, they may
indicate the licence under which the dataset is available in the metadata supplied by the catalog.
You should check that this is consistent with any licence information they supply on their own site
or within the data itself: if it is not, you should ask them for clarification.
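The consistency check described above can be partly automated. The Python sketch below compares the licence label found in a catalog record with the one found on the publisher's own site. It is a simplified illustration: the function names are hypothetical, and the comparison only ignores case, spaces and punctuation, so labels that differ in any other way (for example in version number) are reported as needing clarification.

```python
def normalise(label):
    """Lower-case a licence label and drop spaces and punctuation."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def check_licence(catalog_label, site_label):
    """Compare the licence stated in a catalog with the one on the site."""
    if not catalog_label or not site_label:
        # Missing licence information: reuse rights cannot be assumed.
        return "licence unknown: contact the publisher"
    if normalise(catalog_label) == normalise(site_label):
        return "consistent"
    return "inconsistent: ask the publisher for clarification"


status = check_licence("CC-by", "cc by")
```

A check like this only flags records for a person to review; it does not replace reading the licence statement itself.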
Legal tools for open data
Open Data Commons is the home of a set of legal tools to help you provide and use Open Data
http://opendatacommons.org/
http://opendatacommons.org/faq/licenses/
3. Big Data vs. Open Data
Big Data vs Open Data - Diagram
Source: http://www.opendatanow.com/2013/11/new-big-data-vs-open-data-mapping-it-out/#.VGDCrfSG9Zt
As Joel Gurin points out: “there’s general agreement that Open Data should be free of charge or
cost just a minimal amount. Starting with some basic descriptions, the intersection of these three
concepts (big data, open data, open government) defines the six subtypes of data shown on the
Venn diagram. (There’s no separate category for the intersection of Big Data and Open
Government – anything in that category is also Open Data.) Here are characteristic examples of
each, referring to the numbers above.
1. Big Data that’s not Open Data. A lot of Big Data falls in this category, including some Big Data
that has great commercial value. All of the data that large retailers hold on customers’ buying
habits, that hospitals hold about their patients, or that banks hold about their credit-card holders,
falls here. It’s information that the data-holders own and can use for commercial advantage.
National security data, like the data collected by the NSA, is also in this category.
2. Open Government work that’s not Open Data. This is the part of Open Government that
focuses purely on citizen engagement. For instance, the White House has started a petition
website, called We the People, to open itself to citizen input. While the site makes its data
available, publishing Open Data – beyond numbers of signatures – is not its main purpose.
3. Big, Open, Non-Governmental Data. Here we find scientific data-sharing and citizen science
projects like Zooniverse. Big data from astronomical observations, from large biomedical projects
like the Human Genome Project, or from other sources realizes its greatest value through an open,
shared approach. While some of this research may be government-funded, it’s not “government
data” because it’s not generally held, maintained, or analyzed by government agencies. This
category also includes a very different kind of Open Data: the data that can be analyzed from
Twitter and other forms of social media.
4. Open Government Data that’s not Big Data. Government data doesn’t have to be Big Data to
be valuable. Modest amounts of data from states, cities, and the federal government can have a
major impact when it’s released. This kind of data fuels the participatory budgeting movement,
where cities around the world invite their residents to look at the city budget and help decide how
to spend it. It’s also the fuel for apps that help people use city services like public buses or health
clinics.
5. Open Data – not Big, not from Government. This includes the private-sector data that
companies choose to share for their own purposes – for example, to satisfy their potential investors
or to enhance their reputations. Environmental, social, and governance (ESG) metrics fall here. In
addition, reputational data, such as data from consumer complaints, is highly relevant to business
and falls in this category.
6. Big, Open, Government Data (the trifecta). These datasets may have the most impact of any
category. Government agencies have the capacity and funds to gather very large amounts of data,
and making those datasets open can have major economic benefits. National weather data and
GPS data are the most often-cited examples. U.S. Census data, and data collected by the
Securities and Exchange Commission and the Department of Health and Human Services, are
others. With the new Open Data Policy, this category will likely become larger, more robust, and
even more significant.”
4 key steps
These are in very approximate order — many of the steps can be done simultaneously.
1. Choose your dataset(s). Choose the dataset(s) you plan to make open. Keep in mind
that you can (and may need to) return to this step if you encounter problems at a later
stage.
2. Apply an open license.
○ Determine what intellectual property rights exist in the data.
○ Apply a suitable ‘open’ license that licenses all of these rights and supports the
definition of openness.
○ NB: if you can’t do this go back to step 1 and try a different dataset.
3. Make the data available — in bulk and in a useful format. You may also wish to
consider alternative ways of making it available such as via an API.
4. Make it discoverable — post on the web and perhaps organize a central catalog to list
your open datasets.
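Steps 2 and 3 can be sketched in Python as writing the data in bulk as a CSV file, with a small metadata file next to it that states the licence. This is a minimal sketch, not a standard: the function name, the file names (data.csv, metadata.json) and the metadata keys are assumptions chosen for the example.

```python
import csv
import json
from pathlib import Path


def publish_dataset(rows, fieldnames, out_dir, title, licence):
    """Write rows to data.csv and describe them in metadata.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Step 3: make the data available in bulk, in a useful format.
    data_path = out / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Step 2: state the open licence alongside the data.
    metadata = {
        "title": title,
        "licence": licence,
        "data_file": data_path.name,
        "format": "csv",
    }
    meta_path = out / "metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path
```

Calling publish_dataset(rows, ["stop", "routes"], "out", "Example stops", "CC-by") would produce the two files in an out directory. Step 4 would then consist of posting that directory on the web and listing it in a catalog.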
4. Categories and Type of Data
Open can apply to information from any source and about any topic. Anyone can release their data
under an open licence for free use by and benefit to the public. Although we may think mostly
about government and public sector bodies releasing public information such as budgets or maps,
or researchers sharing their results data and publications, any organisation can open information
(corporations, universities, NGOs, startups, charities, community groups and individuals).
There is open information in transport, science, products, education, sustainability, maps,
legislation, libraries, economics, culture, development, business, design, finance. So the
explanation of what open means applies to all of these information sources and types.
Source: http://blog.okfn.org/2013/10/03/defining-open-data/#sthash.nXnXf8Bx.dpuf
Categories
Business and Legal services
Data/Technology
Education
Energy
Environment and weather
Finance and Investment
Food and Agriculture
Geospatial/Mapping
Governance
Healthcare
Housing/ real estate
Insurance
Lifestyle and Consumer
Media
Research and Consulting
Scientific Research
Transportation
The Open Data Consumers Checklist:
Source: http://theodi.org/guides/the-open-data-consumers-checklist
The Open Data Handbook:
Source: http://opendatahandbook.org/
The handbook introduces you to the legal, social and technical aspects of open data. It can be
used by anyone but is especially useful for those working with government data. It discusses the
why, what and how of open data: why to go open, what open is, and how to do open. Read it
online or download a PDF.
4. Open data as part of your business model
Al-Debei and Avison (2010) derived a unified business model framework based on a
comprehensive review of the literature. They argue that the model provides an abstract but holistic
view and that the fundamental dimensions are value based. There are four relevant aspects to the
business model framework:
● Value proposition: the business logic for creating value for customers by offering products
and services for targeted segments,
● Value architecture: an architecture for the technological and organizational infrastructure
used in the provisioning of products and services,
● Value network: collaboration and coordination with other organizations, and
● Value finance: the costing, pricing, and revenue breakdown associated with sustaining and
improving the creation of value.
New business models and practices driven by social media and open data have hardly been
investigated. Exceptions are the analyses of companies in the United Kingdom (Hammell,
Perricos, Lewis, & Branch, 2012) and a classification of social business models based on the
revenue model (for instance, Ferro, 2012; Ferro & Osella, 2012; Ferro & Osella, 2013; Ubaldi,
2013).
Based on the analysis of a number of companies in the United Kingdom, five archetypes of
business models can be identified (Hammell et al., 2012).
These include:
(1) suppliers—public and private sector organizations—publishing the data,
(2) aggregators linking open data to produce useful insights,
(3) developers—organizations and individuals—building apps,
(4) enrichers using open data to enable their existing products and services, and
(5) enablers facilitating the supply and use of open data.
Ferro and Osella (2013) identify the following models:
1. Premium—end users are offered a service or product in exchange for payment.
2. Freemium—basic services or products are offered free of charge. Profit is made by having
end users pay for extended features.
3. Open source like—data are offered for free through cross subsidization.
4. Infrastructural Razor and Blades—data sets are stored for free and are accessible to everyone
via Application Programming Interfaces (APIs) (‘‘razors’’), while reusers are charged only
for the computing power that they employ on demand in as-a-service mode (‘‘blades’’).
5. Demand-oriented platform—the company provides developers with a one-stop shop of data
sets that are catalogued using metadata. Revenue is made in exchange for advanced services
and refined data sets or data flows.
6. Supply-oriented platform—this business model is quite similar to the previous one, but the
PSI providers are charged in lieu of developers.
7. Free as branded advertising—the company uses PSI as a tool to attract attention from
customers by providing them with useful services. The company expects that the public will
then favor its particular brand or company. Revenue is expected not to come directly from
PSI, but from other business lines that represent the company’s core business.
8. White label development—a company wants to use PSI as an attraction tool but does not
have the competencies required to do so. The company then uses an advertising factory,
which receives payment in the form of a lump sum or recurring fees in exchange for turnkey
solutions, depending upon whether the solution is in the form of a product or a service
(Ferro & Osella, 2013).
The revenue model can be payment by open data providers or users in the form of
(1) recurring fees, granting access for a specific time period, or pay per use,
(2) advertisement, or
(3) ensuring visibility for creating revenue for other activities (Ferro & Osella, 2013).
Although these eight options describe a complete array of possible business models, they are
all derived from the revenue model.
Infomediary Business Models for Connecting Open Data Providers and Users
Available here: http://ssc.sagepub.com/content/early/2014/01/30/0894439314525902.full.pdf+html
All infomediary business models can be developed and operated by either public or private
organizations. The business model might be initiated by public events (hackathons) but operated
by a private party, yet when a best practice is adopted the roles can be reversed. The following six
business models were identified.
1. Single-purpose apps provide real-time services such as information about weather, quality of
restrooms, vehicles, houses, and pollution. These apps often provide a single function, based
on one type of open data provided. The app processes the data and presents it visually for the
ease of the users.
2. Interactive apps: In addition to single-purpose apps, this type of business model provides users
the opportunity to add content. Ratings are often included, as is additional information such as
complaints.
3. Information aggregators take many published open data sources and combine and process
them for subsequent presentation to the users. An example is a transportation planner that
aggregates information from various transport modalities and companies.
Often interoperability is a challenge that requires agreements among data providers.
4. Comparison models: This type of business model aggregates open data from various sources
for the purpose of comparing the performance of entities with each other. For example, it can be
used to compare schools and other public organizations. The data can originate from official
sources (school inspection) or from users (criminal chart) and can be used by citizens (in determining a
school for their children or a place to live) and public organizations (in developing measures to
improve schools or for crime interventions).
5. Open data repositories are used by governments to publish their information. These can be
national open data portals or more specialized portals, such as websites of statistical agencies.
The essence is that these portals are relatively closed and only a limited number of public
organizations can publish open data on them. There is little to no user interaction, and the focus is
on being able to indiscriminately open data sets. Searching for open data is a key aspect, although
it is often difficult to find the right information. They can provide basic functionalities for processing
and visualizing data.
6. Service platforms: These platforms provide all kinds of features for searching, importing,
cleansing, processing, and visualizing information. Service platforms often contain open data
repositories or are connected to open data repositories that function as the data source. Service
platforms can vary in the level of openness; some are based on payment (e.g., www.junar.com)
whereas others are free of charge (www.engagedata.eu).
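The information aggregator model (item 3 above) can be illustrated with a short Python sketch that maps departures from two feeds with different field names onto one common schema. The feed layouts, field names and sample values are invented for the example; real transport feeds are richer, and agreeing on such a mapping is exactly the interoperability challenge mentioned in item 3.

```python
def from_bus_feed(record):
    """Map a hypothetical bus feed record to the common schema."""
    return {"mode": "bus", "stop": record["stop_name"], "departs": record["time"]}


def from_rail_feed(record):
    """Map a hypothetical rail feed record to the common schema."""
    return {"mode": "rail", "stop": record["station"], "departs": record["departure"]}


def aggregate(bus_records, rail_records):
    """Combine both feeds and sort by departure time ("HH:MM" strings)."""
    merged = [from_bus_feed(r) for r in bus_records]
    merged += [from_rail_feed(r) for r in rail_records]
    return sorted(merged, key=lambda item: item["departs"])


# Invented sample data, for illustration only.
departures = aggregate(
    [{"stop_name": "High St", "time": "09:15"}],
    [{"station": "Central", "departure": "09:05"}],
)
```

Each additional source needs only its own mapping function, while the rest of the planner works on the shared schema.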
Further reading:
Business models for open data applications available at:
http://www.appsforeurope.eu/article/business-models-open-data-applications
5. Case studies: Open Data Business
Success stories about the open data startups from the ODI Startup
Programme
Transport API
http://www.transportapi.com
Clients: Transport for London, Heathrow Airport, Greater London Authority, Citymapper, Elgin, Giraffe.co.uk,
Network Rail
Products: TransportAPI
Achievements: Transport API solutions have powered award winning apps, such as Citymapper
The TransportAPI story:
TransportAPI is Britain’s first comprehensive open platform for transport solutions. The company’s objective
is to enhance the travel experience through real-time information, and to enable new transportation insights
through analytics. It uses open data feeds from key industry sources such as Traveline, Network Rail and Transport
for London. The company offers nationwide timetable, departure and infrastructure information, covering
schedules, live departures and archived service running across all transport modes. The data feeds are
available for integration by web and app developers. Data components such as the ‘nearest transport’ widget
can be used in travel portals, hyperlocal sites and business analytics.
TransportAPI currently has 700 developers and organizations signed up on its platform. They are individual
taxpayers, but also public sector organizations like universities and local authorities who are getting free data.
As Jonathan Raper, Managing Director, says, “Our intervention in the market has led prices for transport data
[to] fall and previous monopoly transport data providers to relax their terms.” The company also scales data
usage and provides a new, single-source option for its customers, like Heathrow Airport, who now use
TransportAPI for all their public transport information. Jonathan further explains that “TransportAPI employs 6
people now and the tax we generate per year is nudging £75K”.
Mastodon C
http://www.mastodonc.com
Clients: Technology Strategy Board, CDEC’s Open Health Data platform, Nesta
Products: Kixi Data Platform
Achievements: Mastodon C identified £200m of potential savings to the NHS in its prescribing analytics project,
which investigated the use of branded statins over cheaper generic versions.
The Mastodon C story
Mastodon C helps businesses make sense of the proliferation of data that now exists, allowing them to make
better decisions. It does this using a cloud-based open source data processing and analytics platform, which it
customises to each client’s datasets. The team also applies data science techniques to gain insights, make
predictions and find business value from data, which is built back into client systems.
The team at Mastodon C uses open data together with the closed data that clients own. Francine Bennett, Co-
Founder and CEO at Mastodon C says: “We often find ourselves introducing clients to open data concepts
through our work, as we’ll suggest useful datasets which they can make use of to help their business.”
6. Where do I find open data?
A list of open data catalogs
http://publicdata.eu/
https://open-data.europa.eu/it/data
http://datacatalogs.org/
http://planet.openstreetmap.eu
http://wikidata.org
dbpedia.org
7. How can you develop your open data business?
This chapter has been elaborated by the Finodex team and is already included in the Finodex
Handbook.
Summary:
In this chapter we provide basic knowledge regarding how you can develop your business
using open data. We’ll show how to generate a business model, exploring the components of
the Business Model Canvas in detail. In particular, we’ll offer an overview of open data
business models. In the case of reuse of PSI (Public sector information) Osella & Ferro have
developed an interesting framework “that focuses on decision-making levers that a business
developer has at his/her fingertips for molding the overarching architecture of a business
venture hinged on public data re-use”. They combined the framework with the business model
ontology by employing the Business Model Canvas in order to visualize archetypal business
models at an enterprise level. The tool has proved very useful and could probably be
adopted in the development and assessment of any data-intensive business venture. After
exploring eight business models we introduce the importance of adopting the Lean
methodology for business development, offering a case study of open data business
development in which the Lean approach has been used. Moreover, defining and setting your
business goals requires a competitor analysis, which is also explained. Last but not least, we
describe the rights connected to using open datasets. Licensing and related issues of
compatibility between licenses are crucial when you deal with open data.
Index:
a. Business Modeling
b. Open Data Business models
c. Lean methodology
d. Competitor Analysis
e. Intellectual Property Rights
a) Business Modeling
A business model is a strategic tool that indicates how the company makes money, specifying the
sources of the company’s revenues as well as how much and how often these sources are willing
to pay. Since its publication in 2004, the book “Business Model Generation” by Osterwalder and
Pigneur has become the bible for startups and SMEs. In their book the authors explain the
so-called Business Model Canvas (Figure 1), a tool that helps you visually
capture the components of a business model and assists you in the business model generation
process.
To keep track of all the steps in creating your business model, you may want to
download the “canvas” and start writing down all the assumptions and progress that you
make!
Figure 1. Business Model Canvas
Source: “A business model describes the rationale of how an organization creates, delivers, and
captures value” in Osterwalder & Pigneur, Business Model Generation, 2004.
According to Osterwalder, in order to build an effective business model you have to identify several
blocks. In the following we briefly list them. For each of them, rather than a theoretical description,
we provide a set of practical questions for you to answer. Down to work!
1. Customer segments
First of all, you need to define which customers you aim to reach. You have to answer two
important questions:
● For whom are we creating value?
● Who are our most important customers?
2. Value Proposition
You should provide your customers with a product or a service that has added value. The “value
proposition” is a statement that summarizes why potential consumers should buy your particular
product or service and prefer it to similar offerings. In this case, you should answer the following
questions:
● What value do we deliver to the customer?
● Which one of our customer’s problems are we helping to solve?
● Which customer needs are we satisfying?
● What bundles of products and services are we offering to each Customer Segment?
Factors such as newness, performance, customization, design, brand/status, cost reduction, risk
reduction, accessibility, and convenience/usability can add value to your business. Your value
proposition may be qualitative (privileging customer experience and outcome) and/or quantitative
(price and efficiency).
3. Sales Channels
Once you have understood your value proposition and your customer segments, you need to
choose the channels that will deliver the value to your clients. You should ask yourself:
● Through which channels do our customer segments want to be reached?
● How are we reaching them now?
● How are our channels integrated? Which ones work best?
● Which ones are most cost-efficient?
● How are we integrating them with customer routines?
You can reach your clients either through your own channels (store front), your partner channels
(major distributors), or a combination of both.
4. Customer Relationships
Another important step: you have to identify the kind of relationship you establish with each of your
customer segments. These are the main questions you should answer:
● What type of relationship does each of our customer segments expect us to establish and
maintain with them?
● Which ones have we established?
● How costly are they?
● How are they integrated with the rest of our business model?
Types of customer relationships include personal assistance, automated services,
communities and so on.
5. Revenue streams
You need to plan how you are going to generate cash from each customer segment (costs must
be subtracted from revenues to create earnings). The meaningful questions are:
● For what value are our customers really willing to pay?
● For what do they currently pay?
● How are they currently paying?
● How would they prefer to pay?
● How much does each Revenue Stream contribute to overall revenues?
There are several ways to generate revenue streams, such as asset sales, usage fees,
subscription fees, lending/leasing/renting, licensing, etc.
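The last question in the list above, how much each revenue stream contributes to overall revenues, comes down to simple arithmetic. A minimal Python sketch follows; the function name `revenue_shares` and the yearly figures are hypothetical and only illustrate the calculation.

```python
from typing import Dict


def revenue_shares(streams: Dict[str, float]) -> Dict[str, float]:
    """Return each stream's share of total revenue as a fraction between 0 and 1."""
    total = sum(streams.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return {name: amount / total for name, amount in streams.items()}


# Hypothetical yearly revenues, for illustration only.
example_streams = {
    "subscription fees": 60_000.0,
    "usage fees": 30_000.0,
    "licensing": 10_000.0,
}

shares = revenue_shares(example_streams)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Tracking these shares over time shows which streams your business actually depends on, which is useful input when you revisit the canvas.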
6. Key resources & key activities
You then need to understand which assets will make your business model work. Hence,
answer the following questions:
● What Key Resources do our Value Propositions require?
● Our Distribution Channels?
● Customer Relationships?
● Revenue Streams?
Then identify the actions you must take to make your business model work:
● What Key Activities do our Value Propositions require?
● Our Distribution Channels?
● Customer Relationships?
● Revenue Streams?
7. Key partnerships
You will probably need the help of external partners and/or suppliers in order to
make your business model work properly:
● Who are our Key Partners?
● Who are our key suppliers?
● Which Key Resources are we acquiring from partners?
● Which Key Activities do partners perform?
8. Cost structure
Last but not least, consider the costs you will incur, and their consequences,
when you start applying your business model to your product:
● What are the most important costs inherent in our business model?
● Which Key Resources are most expensive?
● Which Key Activities are most expensive?
Further reading
● A. Osterwalder & Y. Pigneur, Business Model Generation, 2004
● Elements of a business plan, available online
b) Open data business models
In the case of PSI (Public Sector Information) reuse performed by private sector entrepreneurs,
many inherent roadblocks, coupled with a certain vagueness surrounding the rationale underlying
business endeavors, keep slowing the process down. The advent of the Open Data framework,
oriented towards data openness (i.e. open by default), raises new issues: access to
information is free of charge, while different forms of payment may be required where
access to derivative works is restricted.
Two Italian researchers, Michele Osella and Enrico Ferro (2012), developed a framework “that
focuses on decision-making levers that a business developer has at his/her fingertips for molding
the overarching architecture of a business venture hinged on public data re-use”.
Figure 2. Framework for PSI business model analysis by Osella & Ferro
Source: Osella & Ferro, “Business Models for PSI Re-Use: A Multidimensional Framework”, 2012
Figure 3. Framework for PSI business model analysis by Osella & Ferro
Source: Osella & Ferro, “Business Models for PSI Re-Use: A Multidimensional Framework”, 2012
While developing the framework for PSI reuse, they realized that it was not sufficient to
grasp the business logic and the mechanisms needed to build an effective strategy. A solution
came from combining it with Osterwalder's business model ontology, employing the
Business Model Canvas (explained in the previous section) to visualize archetypal
business models at an enterprise level. The tool has proved very useful and could probably
be adopted in the development and assessment of any data-intensive business venture.
The result is the identification of eight business models currently employed by the actors present in
the Public Sector Information centric (PSI-centric) ecosystem. In particular, the choice of the
business model to adopt is a function of the position covered in the value chain and of the strategic
choices made.
Why are they useful?
From a business model viewpoint, which is one of the perspectives on the PSI realm presented by
Osella, our interest is to identify the steps needed to maximise the benefits for reusers of
open data, “a profit-driven reuse and value creation”.
You can find, in the following list, the eight business models as described by Osella and Ferro:
1. Premium Product / Service.
2. Freemium Product / Service. A classic example in this vein is represented by mobile apps
related to public transportation in urban areas.
3. Open Source Like. Examples: OpenCorporates and OpenPolis.
4. Infrastructural Razor & Blades. Example: Public Data Sets on Amazon Web Services.
5. Demand-Oriented Platform. Examples: DataMarket and Infochimps.
6. Supply-Oriented Platform. Example: Socrata.
7. Free, as Branded Advertising.
8. White-Label Development. This business model has not consolidated yet, but some
embryonic attempts seem to be particularly promising.
In this section we explore the eight business models in more detail. The main
references are two papers co-authored by Ferro and Osella: “Business Models for PSI Re-Use: A
Multidimensional Framework” (2012) and “Eight Business Model Archetypes for PSI Re-Use”
(2013).
#1 Premium Product / Service: While implementing this business model, a core re-user offers
end-users a product or a service presumably characterized by high intrinsic value in exchange for
a payment that could occur à la carte or in the guise of a recurring fee: while the former implies the
payment of an amount of money for each unit of product purchased (pay-per-use), the latter has
an "all-inclusive" nature since it grants access to certain features for a given timeframe in
accordance with contractual terms. In this business model, probably regarded as the
“mainstream” model by the majority of analysts, the high intrinsic value, coupled with the price
mechanism, calls for B2B customers (often called “high-end market”) and for long- or medium-term
relationships going beyond single transactions (Osella & Ferro, 2013).
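The difference between the two payment schemes can be made concrete with a break-even calculation: above a certain usage level, a recurring fee costs the customer no more than paying per unit. The sketch below is illustrative only; `break_even_units` and the prices are assumptions, not figures from Osella & Ferro.

```python
import math


def break_even_units(price_per_unit: float, recurring_fee: float) -> int:
    """Smallest number of units per period at which the recurring fee
    costs no more than paying for each unit separately."""
    if price_per_unit <= 0:
        raise ValueError("price_per_unit must be positive")
    if recurring_fee < 0:
        raise ValueError("recurring_fee cannot be negative")
    return math.ceil(recurring_fee / price_per_unit)


# Hypothetical prices: 2.50 per dataset download versus 100.00 per month.
units = break_even_units(price_per_unit=2.5, recurring_fee=100.0)
print(f"The monthly fee pays off from {units} downloads per month")
```

A re-user could run this kind of check when deciding where to set the recurring fee relative to the pay-per-use price.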
Figure 4. Premium Product / Service (framework view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
Figure 5. Premium Product / Service (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#2 Freemium Product / Service. Core re-users resorting to this business model offer end-users
a product or a service in accordance with freemium price logic: one of the offerings is free of
charge and entails only basic features, while customers willing to take advantage of refined
features or add-ons are charged. In the PSI realm, the implementation of this business model has
its roots in limitations deliberately imposed by the core re-user in terms of data access: as a result,
ad-hoc payments may be required to enjoy advanced features, to have recourse to additional
formats or, sometimes, to weed out advertising. In contrast with the previous model, here the
prominent target market is the consumer one (often called “low-end market”), with which the firm
establishes medium- or short-term relationships that usually do not involve customization.
Target customers are generally reached via the Web or via the mobile channel, which
promise to “hit” a considerable installed base (Osella & Ferro, 2013).
Figure 6. Freemium Product / Service (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#3 Open Source Like. This very peculiar business model takes place on top of products, services,
or simple unpackaged data that are provided for free and in an open format. In terms of
economics, a cross-subsidization occurs in the enterprise under examination since the costs
incurred for free offering of data are covered by revenues stemming from supplementary business
lines that are still PSI-based: in fact, trickles of revenue for the core re-users may stem only from
added-value services or from license variations (dual licensing). The resemblance with Open
Source software is given by the fact that in this circumstance data is provided in a totally open
format that allows free elaboration, usage and redistribution without any technical barrier (Osella &
Ferro, 2013).
Figure 7. Open Source Like (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#4 Infrastructural Razor & Blades. Entering the realm of enablers, this business model is
chosen by enterprises acting as intermediaries that facilitate access to PSI resources for
profit-oriented developers or scientists not driven by commercial intent. As in the well-known
“razor & blades” model, the value proposition hinges on an attractive, inexpensive or free initial
offer (“razor”) that encourages continuing future purchases of follow-up items or services (“blades”)
that are usually consumables characterized by an inelastic demand curve and high margins. Applying
this model in the PSI environment, datasets are stored for free on cloud computing platforms being
accessible by everyone via APIs (“razor”) while re-users are charged only for the computing power
that they employ on-demand in as-a-service mode (“blades”). This business model exhibits another
case of cross-subsidization whereby profits accrued from the provision of on-demand computing
capacity cover costs attributable to the storage and maintenance of data. Finally, it goes without
saying that application of this model is limited to contexts and domains in which the computational
costs are significant (Osella & Ferro, 2013).
Figure 8. Infrastructural Razor and Blades (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#5 Demand-Oriented Platform. Following this business model, the enabler acting as intermediary
provides developers with easier access to PSI resources that are stored on proprietary servers
having high reliability. Once collected, PSI datasets are subsequently catalogued using metadata,
harmonized in terms of formats and exposed through APIs, making it easier to dynamically retrieve
data in a meaningful way. As a result, a wide range of critical issues pertaining to original raw data
are made irrelevant by the use of platforms capable of converting datasets into data streams,
contributing significantly to the "commoditization" and "democratization" of data. In addition,
developers may reap the benefits given by the "one stop shopping" nature of such platforms: they
may resort to one supplier and access a variety of information resources through standardized
APIs - even beyond the borders of the PSI - without having to worry about interfaces connecting to
each original source. This “procurement” approach is crucial to minimize search costs and,
consequently, transaction costs. In terms of pricing, as a good that was born free and open (such
as Open Government Data) cannot be charged for in the absence of added value on top of it, enablers
adopting this business model earn revenues in exchange for advanced services and refined
datasets or data flows. To sum up, re-users are charged according to a freemium pricing model
that sets the boundary between free and premium in light of feature limitations (Osella & Ferro,
2013).
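As a rough illustration of the harmonization step described above, the following Python sketch reads two small datasets published in different formats (CSV and JSON), renames their fields to a shared vocabulary and tags each record with its origin. All names here (`harmonize`, the field maps, the sample data and source labels) are hypothetical; a real platform would add validation, richer metadata and an API layer on top.

```python
import csv
import io
import json
from typing import Dict, List

Record = Dict[str, str]


def from_csv(text: str) -> List[Record]:
    """Parse CSV text with a header row into a list of records."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_json(text: str) -> List[Record]:
    """Parse a JSON array of objects, converting every value to a string."""
    return [{key: str(value) for key, value in item.items()}
            for item in json.loads(text)]


def harmonize(records: List[Record], field_map: Dict[str, str],
              source: str) -> List[Record]:
    """Rename source-specific fields to shared names and record the origin."""
    harmonized = []
    for record in records:
        row = {common: record[original]
               for original, common in field_map.items()}
        row["source"] = source  # minimal metadata: where the record came from
        harmonized.append(row)
    return harmonized


# Hypothetical raw datasets from two different public bodies.
csv_text = "stop_name,lat,lon\nCentral,45.07,7.68\n"
json_text = '[{"name": "Harbour", "latitude": 44.4, "longitude": 8.9}]'

catalog = (
    harmonize(from_csv(csv_text),
              {"stop_name": "name", "lat": "lat", "lon": "lon"},
              source="city_a_csv")
    + harmonize(from_json(json_text),
                {"name": "name", "latitude": "lat", "longitude": "lon"},
                source="city_b_json")
)
print(json.dumps(catalog, indent=2))
```

Once every record shares the same field names, a single API can serve both sources, which is the "one stop shopping" benefit for developers.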
Figure 9. Demand-oriented platform (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#6 Supply-Oriented Platform. To conclude with enablers, this business model entails the
presence of an intermediary business actor having again an infrastructural role. However, in
contrast to the previous case, according to this logic PSI holders are charged in lieu of developers.
In fact, the enabler, following the golden rules of two-sided markets, fixes the price according to
the degree of positive externality that
each side is able to exert on the other one. Consequently, this approach is beneficial for both sides
of the resulting arena: from developers’ perspective, their barriers are wiped out (i.e., they can
retrieve data without incurring cost) while, from the governmental angle, PSI holders become
platform owners taking advantage of some handy features such as cloud storage, rapid upload of
brand-new datasets by public employees, standardization of formats, tagging with metadata and,
above all, automated external exposure of data via APIs and GUI. Public agencies that adhere to
such programs in order to dip their toes into the water of Open Data establish long-term
relationships with providers and are required to pay a periodic fee that depends on the degree of
sophistication characterizing the solutions purchased and on some technical parameters (Osella &
Ferro, 2013).
Figure 10. Supply-oriented platform (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#7 Free as Branded Advertising. Service advertising is an emerging form of communication
aimed at encouraging or persuading an audience towards a brand or a company. In contrast to
the more famous “display advertising”, where commercial messages are simply visualized, in
service advertising the advertiser strives to conquer the customer by providing him or her with
services of general usefulness. That said, in the PSI realm, services offered in this way do not
generate any direct revenue but they are supposed to bring positive return in a broad sense,
driving economic results on other business lines - unrelated to PSI - that represent the enterprise’s
core business. The rationale fuelling this “enlightened” business model is twofold. Firstly, it may be
based on a powerful advertising boost that leads the company to consider the cost as a
promotional investment in the marketing mix. Secondly, it seems to be very convenient in the presence
of zero marginal costs, a situation that occurs when the costs of distribution and usage are not
significant (Osella & Ferro, 2013).
Figure 11. Free as Branded Advertising (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#8 White-Label Development. Last but not least, if service advertisers do not have
sufficient in-house competencies to develop their business endeavors, they can knock on the door of
advertising factories. Such firms, in fact, come into play as outsourcers carrying out duties that
otherwise would be handled by service advertisers. Hence, the development of PSI-based
solutions is particularly compelling for companies willing to use PSI as an "attraction tool" but not
equipped with the competencies required to do so (e.g., data retrieval, software development, service
maintenance, marketing promotion). In order to let the service advertiser’s brand stand out,
solutions are developed in a white-label manner, i.e., shadowing the outsourcer’s brand and giving
full visibility to the sole service advertiser’s brand. Taking into account the “one stop shopping
supply” and the business-criticality of the solutions in terms of corporate image, the resulting
one-to-one relationship between provider and customer is tailor-made and “cemented”.
Concerning financials, advertising factories collect lump-sum payments or recurring fees in
exchange for the turn-key solutions so developed, depending on whether the crafted solution takes the
form of a product or a service: whilst in the former case service advertisers perceive the cost as
CAPEX, in the latter the respective cost assumes an OPEX nature (Osella & Ferro, 2013).
Figure 12. White-Label Development (“Canvas” view)
Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
Case studies
Many companies employ the business models described above.
Here we look at one example, the freemium model. A variety of web applications use
the freemium business model. The free product or service here is subsidised through a paid-for
product or service that offers some kind of added value on top of what is made available as open
data. The free product acts as marketing, establishing the provider in the marketplace and
increasing the take-up of the paid-for product (The ODI Guide, How to make a business case for
open data). One way of using a freemium model is to release your open data using a share-alike
license. This ensures that organisations who do things with your data have to either openly share
their results (which means you can benefit from what they do) or have to negotiate with you to be
able to use the data under a different (potentially charged) license.
OpenCorporates uses this business model, licensing their database with a share-alike license
while offering paid-for licenses for companies who do not want to share their data.
Another approach to a freemium model is to offer a paid-for product that:
● incorporates additional data, perhaps from third-party sources
● is provided in a different format from the open data
● is more up-to-date, complete or detailed than the open data
● is the result of an analysis or model based on the released open data
● is a dump of data that can otherwise be accessed through an API
Alternatively, you could offer a paid-for service based on the open data you are publishing that:
● provides an API over open data that can otherwise be accessed as a dump
● provides availability guarantees through a Service-Level Agreement
● removes rate limits
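To make the last point concrete, here is a minimal sketch of how a freemium API over open data might enforce rate limits on the free tier and lift them for paying customers. It uses a simple fixed-window counter; the tier names, the limit of 3 requests per window and the `RateLimiter` class are all hypothetical, and production services typically rely on an API gateway or a shared store rather than in-memory state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical tiers: requests allowed per window; None means no limit.
TIER_LIMITS: Dict[str, Optional[int]] = {"free": 3, "paid": None}


@dataclass
class RateLimiter:
    """Fixed-window request counter kept in memory, one window per client."""
    window_seconds: float = 60.0
    _windows: Dict[str, Tuple[float, int]] = field(default_factory=dict)

    def allow(self, client_id: str, tier: str, now: float) -> bool:
        limit = TIER_LIMITS[tier]
        if limit is None:
            return True  # paid tier: rate limits removed
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window expired, start a new one
        if count >= limit:
            self._windows[client_id] = (start, count)
            return False
        self._windows[client_id] = (start, count + 1)
        return True


limiter = RateLimiter()
free_results = [limiter.allow("alice", "free", now=t) for t in (0.0, 1.0, 2.0, 3.0)]
paid_results = [limiter.allow("bob", "paid", now=t) for t in (0.0, 1.0, 2.0, 3.0)]
print(free_results, paid_results)
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read the clock itself.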
Recently the U.S. Government has launched a new section of the open government data catalog,
data.gov. The new sub-domain “Impact” profiles companies that are making use of open
government data.
References and further reading
● The Open Data Institute, How to make a business case for open data, available online.
● Alex Howard, Open data economy: Eight business models for open data and insight from Deloitte UK, available online.
● Elements of open data startups, presentation available online.
● Enrico Ferro, Emerging Business models in PSI reuse, available online.
● E.Ferro & M.Osella, Business Models for PSI Re-Use: A Multidimensional Framework, 2012, available online.
● E.Ferro & M.Osella, Eight Business Model Archetypes for PSI Re-Use, 2013, available online.
c) Lean methodology
After exploring the eight business models on which PSI reuse relies, we introduce the
importance of adopting the Lean methodology for business development. You have already
identified the opportunities offered by the reuse of open data by employing the Business Model
Canvas and the framework developed by Osella and Ferro, and now you want to start developing
your own business.
Lean methodology is a method for developing businesses and products with the goal of finding
product-market fit and building a cash-flow-positive, sustainable company before it runs out of
money. “Validated learning”, experimentation, testing, measuring actual progress and learning
what customers really want are the main pillars of the methodology. The whole process, then, should
be accomplished as fast and as cheaply as possible. Pioneers of the Lean Startup
movement are Steve Blank (The Startup Owner’s Manual: The Step-by-Step Guide for Building a
Great Company, 2012; The Four Steps to the Epiphany, 2006) and Eric Ries (The Lean Startup, 2011).
The lean approach aims at being as effective as possible in achieving your final goal.
According to lean methodology you should follow a build-measure-learn feedback loop.
Ideas > build > product > measure > data > learn > ideas > and so on (circle)
Figure 13. Build-measure-learn feedback loop
Image source: Andrew Walpole, Build - Measure - Learn Feedback Loop infographic, 2013
Here we explain the loop step by step:
1) Idea:
When you work on your idea, keep in mind that the final goal is to provide benefit to your customer;
everything else is a waste of time. So, first of all, ask yourself:
➢ Can I build a sustainable business around this set of products and services?
What you want to achieve is, in fact, a compromise between your vision and what your customers
would accept.
Hence, you want to focus on an idea that answers a problem that really needs a solution. You
also want to make explicit all the implicit assumptions you are making about how you can build a business on
that idea.
Please answer the following questions before building your product:
➢ Do consumers recognize that they have the problem you are trying to solve?
➢ If there was a solution, would they buy it?
➢ Would they buy it from you?
➢ Can you build a solution for that problem?
“Success is not delivering a feature; success is learning how to solve the customer’s problem.”
(Eric Ries, The Lean Startup, 2011).
2) Build:
Develop a minimum viable product (MVP) in order to start the learning process as soon as possible.
➢ MVP
A minimum viable product is a version of a new product or feature which allows you to test the
assumptions you made. When you are building your MVP, remove any feature, process or effort
that does not contribute directly to the learning you seek. When you test your MVP, you will
learn which elements of your product or strategy are not appropriate.
3) Measure:
Once the MVP is established, measure how your customers respond, relying on metrics that let you ask
cause-and-effect questions. Each metric has to point to a clearly defined action to take once analyzed.
Examples:
➢ A/B Split-Test Results
➢ Per-customer metrics
➢ Direct customer feedback
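As an illustration of the first example, the result of an A/B split test can be checked with a standard two-proportion z-test, which asks whether the difference in conversion rates between two variants is larger than chance alone would explain. The sketch below uses only the Python standard library; the function name `ab_test` and the visitor numbers are hypothetical, and the test is one common statistical choice rather than something prescribed by the Lean methodology.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def ab_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float, float, float]:
    """Two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value) for variants A and B,
    given the number of conversions and the number of visitors per variant.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("cannot test: no variation in the observed outcomes")
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Hypothetical experiment: 1000 visitors per variant of a sign-up page.
rate_a, rate_b, z, p_value = ab_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.3f}")
```

A small p-value suggests the change in variant B had a real effect, which gives you the kind of clearly defined action (keep B, or keep testing) that a useful metric should lead to.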
4) Learn:
Analyze your product, feedback and metrics to assess your progress in an objective way.
➢ Validated learning
“Validated learning” means running experiments that you validate scientifically,
based on empirical data collected from real customers, so that you can test each element of your
vision.
Throughout the whole process you should use an investigative technique, the so-called "Five Whys": asking
yourself simple questions to study and solve problems along the way. When this process of
measuring and learning is done and you have made small changes to optimize your product, you
should be able to understand whether the drivers of your business model are appropriate or not
and decide whether to pivot or persevere.
Figure 14. Description Step by Step of the feedback loop
Image source: Andrew Walpole, Build - Measure - Learn Feedback Loop infographic, 2013
Pivot:
If you decide to pivot, you need to make a big change in direction or a structural course
correction to test new ideas/hypotheses about the product, strategy and engine of growth, and start
the cycle once again from the beginning. If your new experiment runs in a more productive way
than the experiments you were running before, it is probably a sign that you made a successful
pivot.
Persevere:
If you think that your test is going in the right direction then you should continue to test more
assumptions and build towards executing your current vision.
The lean methodology underlines the importance of experimenting in order to learn. Pivoting is just
a part of the process - “if you cannot fail, you cannot learn.” (Eric Ries, The Lean Startup, 2011).
Until a precise business model is found, it is important to keep your initial vision. This way,
adjustments can be made to the model without reassessing the entire market.
Lean approach in open data business development: a case study
Steve Blank cites the story of a startup called Tidepool as a perfect example
of the power of customer development, one of the key parts of the Lean
methodology. The Tidepool team was severely criticized for its business model. They began
by believing they were selling an open data and software platform for people with Type 1 diabetes
into a multi-sided market comprised of patients, providers, device makers, app builders and
researchers. They first reduced what they thought was a five-sided market to a simpler two-sided
one. But the big payoff came when their discussions with medical device customers revealed an
entirely new way to think about pricing - potentially tripling their revenue.
Figure 15. Screenshot of Tidepool home page
Image source: http://tidepool.org
Further reading
● Eric Ries, The Lean Startup, available online
● Steve Blank, The Four Steps to the Epiphany, available online
● Steve Blank & Bob Dorf, The Startup Owner’s Manual: The Step-by-Step Guide for Building a
Great Company, available online
● Steve Blank, When Customers Make You Smarter, available online
● Andrew Walpole, Build - Measure - Learn Feedback Loop, available online
● The Lean Startup Methodology, available online
Learning resources
● Steve Blank, How to Build a Startup, available online
● Steve Blank, Lean Customer Development - Part 1, available online
● Steve Blank, Lean Customer Development - 3 tool for startups, Part 2, available online
● Steve Blank, Lean Customer Development - Customer Development in action, Part 3 - 3
tool for startups, available online
● Steve Blank, Lean Customer Development - Closing, Part 3, available online
8. Open Data training materials already available. A list
● Useful links used by the ODI in its 3-day Open Data in Practice course
● Slides used in the business sections of ODI’s Open Data in Practice course
● ODI’s stories section: a good place to find examples of real-world impact.
● It's also worth looking at the ODI Start-Ups page for ways entrepreneurs are using open data
to build new businesses. You'll find details of business approach, short pitch videos and,
for some of the companies, case studies.
● You can explore all the materials and tutorials released by the team of School of Data.
You can find interesting guides at http://schoolofdata.org/courses/
9. Slides and inspiring presentations: link-o-graphy
http://www.slideshare.net/MicheleOsella
http://www.slideshare.net/search/slideshow?searchfrom=header&q=open+data+business
http://www.slideshare.net/OReillyStrata
http://www.slideshare.net/TheODINC
http://www.slideshare.net/MGHProfessional/leading-with-data?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=qf1&b=&from_search=1
http://www.slideshare.net/JenvanderMeer/strata-open-data-its-not-just-for-govts2112014?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=default&b=&from_search=15
http://www.slideshare.net/deirdrelee/deirdre-lee-opendata?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=qf1&b=&from_search=8
http://www.slideshare.net/WorldBankGroupFinances/world-bank-gurin?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=qf1&b=&from_search=6
http://training.theodi.org/resources/ODP_Business.pdf
http://theodi.github.io/presentations/2013-10-tsb-workshop-tom.html#/cover
http://www.slideshare.net/napo/a-dive-into-open-data
10. Videos, Audio files and books
So you want to build an open data business?
https://www.youtube.com/watch?v=jNscjJ5DetM
The value of open data to business - the Open Data 500 Study
http://theodi.org/lunchtime-lectures/friday-lunchtime-lecture-the-value-of-open-data-to-business-the-open-data-500-study
Learning from New York City’s open-data effort
http://www.mckinsey.com/insights/public_sector/learning_from_new_york_citys_open_data_effort
Some useful webinars:
http://www.socrata.com/webinars/
Opening up open data: An interview with Tim O’Reilly
http://www.mckinsey.com/insights/business_technology/opening_up_open_data_an_interview_with_tim_o_reilly
What is Open Data and how can it transform your business?
https://www.youtube.com/watch?v=hXZaf08gjfo
A very interesting list of recommended books is available here:
https://github.com/theodi/training-web/blob/gh-pages/Bibliography/index.md
November 2014
Page 61

FINODEX open data training

  • 1. OPEN DATA TRAINING MATERIAL November 2014 Page 1
  • 2. Table of contents 1. Defining Open Data 2. Understanding Law and Licensing 3. Big Data vs. Open Data 4. Open data as part of your business model 5. Case studies: Open Data Business 6. Where do I find Open Data? 7. How to develop your open data business 8. Open Data training materials already available. A list 9. Slides and inspiring presentations: link-o-graphy 10.Recommended videos, audio files and books November 2014 Page 2
  • 3. 1. Defining Open Data “A key promise of open data is that it can freely accessed and used. Without a clear definition of what exactly that means (e.g. used by whom, for what purpose) there is a risk of dilution especially as open data is attractive for data users” (Pollock, 2014). Main goal of this material is to make sure that people willing to re-use open datasets are aware of what “open” really means. First step we take is to explore some guidelines you find online. The Open Data Institute and Open Knowledge keep posting interesting simple guides and contents, ready for open data publishers and reusers. Let’s start from the basics: What makes data open and The Open definition v2.0. What makes data open? Original contents for this material is provided online at http://theodi.org/guides/what-open-data and http://theodi.org/guides/publishers-guide-open-data-licensing . Open data is data that is made available by organisations, businesses and individuals for anyone to access, use and share. Open data has to have a licence that says it is open data. Without a licence, the data can’t be reused. The licence might also say: ● that people who use the data must credit whoever is publishing it (this is called attribution) ● that people who mix the data with other data have to also release the results as open data (this is called share-alike) ● that people can do whatever they want with your work, if the holder has waived the copyright of database rights (public domain) Example: The Department for Education makes available open data about the performance of schools in England. The data is available as CSV and is available under the Open Government Licence, which only requires reusers to say that they got the data from the Department for Education. These principles for open data are detailed in the Open Definition in the next paragraph. 
November 2014 Page 3 Good open data ● are rich of documentation and metadata ● can be linked to, so that it can be easily shared and talked about ● is available in a standard, structured format, so that it can be easily processed
  • 4. Open Definition The Open Definition, created in 2005, is the main international standard for open data and open data licences, and provides principles and guidance for all things “open”. Open Data Mark: indicates compliance with Open Definition Definition You can find the entire updated version of the Open definition at http://opendefinition.org/od/ . The Open Definition is a project by Open Knowledge, that provides details and additional contents as well on its official web page. This material is licensed under a CC 4.0 Attribution https://creativecommons.org/licenses/by/4.0/. Open data is data that can be freely used, shared and built on by anyone, anywhere, for any purpose. The “standard” provided by the Open Definition – common requirements that must be met if a data is to be called “open” – is crucial because much of the value of open data lies in the ease with which different sources of open data can be combined – practically every app or insight made with data requires combining several pieces of data. For example, you need to know the bus timetable and have a map showing bus stops to be able to reach your destination on time. Both legal and technical compatibility is vital, and the Open Definition ensures that openly-licensed data can be combined successfully. This eliminates the risk of a “Tower of Babel” of data, with a proliferation of licences and terms of use for open data leading to complexity and incompatibility. The Open Definition prevents this fragmentation – and resulting destruction in value – by ensuring a common standard for all “open” data. Evidence for the practical success of the effort can be found in the reuse of the definition key principles and language in other important areas including UK and US government policy, and include the transition in terminology from “public sector information” to “open government data”. Thanks to the efforts of many translators in the community, the Open Definition is available in 30+ languages. 
The Open definition explains what can be defined as open work and open license. The term work is used to denote the item or piece of knowledge being transferred. The term license refers to the legal conditions under which the work is made available. Where no license has been offered this should be interpreted as referring to default legal conditions governing use of the work (for example, copyright or public domain). November 2014 Page 4
  • 5. November 2014 Page 5 Open Works An open work must satisfy the following requirements in its distribution: ● Open License The work must be available under an open license (as defined in Section 2). Any additional terms accompanying the work (such as terms of use, or patents held by the licensor) must not contradict the terms of the license. ● Access The work shall be available as a whole and at no more than a reasonable one-time reproduction cost, preferably downloadable via the Internet without charge. Any additional information necessary for license compliance (such as names of contributors required for compliance with attribution requirements) must also accompany the work. ● Open Format The work must be provided in a convenient and modifiable form such that there are no unnecessary technological obstacles to the performance of the licensed rights. Specifically, data should be machine-readable, available in bulk, and provided in an open format (i.e., a format with a freely available published specification which places no restrictions, monetary or otherwise, upon its use) or, at the very least, can be processed with at least one free/libre/open-source software tool.
  • 6. Open Licenses A license is open if its terms satisfy the following conditions: ● Required Permissions: The license must irrevocably permit (or allow) the following: 1.1 Use: The license must allow free use of the licensed work. 1.2 Redistribution: The license must allow redistribution of the licensed work, including sale, whether on its own or as part of a collection made from works from different sources. 1.3 Modification: The license must allow the creation of derivatives of the licensed work and allow the distribution of such derivatives under the same terms of the original licensed work. 1.4 Separation: The license must allow any part of the work to be freely used, distributed, or modified separately from any other part of the work or from any collection of works in which it was originally distributed. All parties who receive any distribution of any part of a work within the terms of the original license should have the same rights as those that are granted in conjunction with the original work. 1.5 Compilation: The license must allow the licensed work to be distributed along with other distinct works without placing restrictions on these other works. 1.6 Non-discrimination: The license must not discriminate against any person or group. 1.7 Propagation: The rights attached to the work must apply to all to whom it is redistributed without the need to agree to any additional legal terms. 1.8 Application to Any Purpose: The license must allow use, redistribution, modification, and compilation for any purpose. The license must not restrict anyone from making use of the work in a specific field of endeavor. 1.9 No Charge: The license must not impose any fee arrangement, royalty, or other compensation or monetary remuneration as part of its conditions. 
● Acceptable Conditions ● The license shall not limit, make uncertain, or otherwise diminish the permissions required in previous section except by the following allowable conditions: Attribution: The license may require distributions of the work to include attribution of contributors, rights holders, sponsors and creators as long as any such prescriptions are not onerous. Integrity: The license may require that modified versions of a licensed work carry a different name or version number from the original work or otherwise indicate what changes have been made. Share-alike: The license may require copies or derivatives of a licensed work to remain under a license the same as or similar to the original Notice: The license may require retention of copyright notices and identification of the license. Source: The license may require modified works to be made available in a form preferred for further modification. November 2014 Page 6
  • 7. Technical Restriction Prohibition: The license may prohibit distribution of the work in a manner where technical measures impose restrictions on the exercise of otherwise allowed rights. Non-aggression: The license may require modifiers to grant the public additional permissions (for example, patent licenses) as required for exercise of the rights allowed by the license. The license may also condition permissions on not aggressing against licensees with respect to exercising any allowed right (again, for example, patent litigation). A list of conformant licenses is available at http://opendefinition.org/licenses/ . We explore licensing in the next section. November 2014 Page 7
  • 8. 2. Understanding Law and Licensing
In this section, we provide additional material on the licenses that applicants are invited to consider. Below is an extended list of licenses that conform to the principles laid out in the Open Definition.
Conformant Licenses
The following licenses are conformant with the principles set forth in the Open Definition. The table columns are:
● Domain = domain of application, i.e. what type of material the license should or can be applied to. Note: if you are looking for an open license for software, please see the Open Source Definition conformant licenses.
● BY = requires attribution
● SA = requires share-alike
● Recommended conformant licenses
These licenses conform to the Open Definition and are:
● Reusable: Not specific to an organization or jurisdiction.
● Compatible: Must be compatible with at least one of GPL-3.0+, CC-BY-SA-4.0, and ODbL-1.0. Permissive/attribution-only licenses must be compatible with all three of the aforementioned licenses, and with at least one of Apache-2.0, CC-BY-4.0, and ODC-BY-1.0.
● Current: Widely used and generally considered best practice by a broad spectrum of projects and actors within the domains of applicability of the license.
License | Domain | BY | SA | Comments
Creative Commons CCZero (CC0) | Content, Data | N | N | Dedicate to the Public Domain (all rights waived)
Open Data Commons Public Domain Dedication and Licence (PDDL) | Data | N | N | Dedicate to the Public Domain (all rights waived)
Creative Commons Attribution 4.0 (CC-BY-4.0) | Content, Data | Y | N |
Open Data Commons Attribution License (ODC-BY) | Data | Y | N | Attribution for data(bases)
Creative Commons Attribution Share-Alike 4.0 (CC-BY-SA-4.0) | Content, Data | Y | Y |
Open Data Commons Open Database License (ODbL) | Data | Y | Y | Attribution-ShareAlike for data(bases)
  • 10. ● Other conformant licenses
These licenses conform to the Open Definition, but do not meet the reusability or compatibility requirements for recommended licenses, have been superseded by newer license versions or by newer licenses with similar use cases, or are little used. They may be reasonable for the particular organization they were crafted for, or for legacy reasons. Projects outside such contexts are strongly advised to use a recommended conformant license from the list above.
License | Domain | BY | SA | Comments
Against DRM | Content | Y | Y | Little used.
Creative Commons Attribution versions 1.0-3.0 | Content | Y | N | Includes all jurisdiction "ports". Superseded by CC-BY-4.0.
Creative Commons Attribution-ShareAlike versions 1.0-3.0 | Content | Y | Y | Includes all jurisdiction "ports". Superseded by CC-BY-SA-4.0. Additionally, CC-BY-SA-1.0 is incompatible with any other license.
Data licence Germany – attribution – version 2.0 | Data | Y | N | Non-reusable. For use by German government licensors. Note version 1.0 is not approved as conformant.
Data licence Germany – Zero – version 2.0 | Data | N | N | Non-reusable. For use by German government licensors. Note there is no previous version.
Design Science License | Content | Y | Y | Little used. Incompatible with any other license.
EFF Open Audio License | Content | Y | Y | Deprecated in favor of CC-BY-SA.
Free Art License (FAL) | Content | Y | Y |
GNU Free Documentation License (GNU FDL) | Content | Y | Y | Incompatible with any other license. Only conformant if used with no cover texts and no invariant sections.
MirOS License | Code, Content | Y | N | Little used.
Open Government Licence Canada 2.0 | Content, Data | Y | N | Non-reusable. For use by Canadian government licensors. Note version 1.0 is not approved as conformant.
Open Government Licence United Kingdom 2.0 and 3.0 | Content, Data | Y | N | Non-reusable. For use by UK government licensors; re-uses of OGL-UK-2.0 and OGL-UK-3.0 material may be released under CC-BY or ODC-BY. Note version 1.0 is not approved as conformant.
Talis Community License | Data | Y | Y | Draft only. Deprecated in favour of ODC licenses.
Non-Conformant Licenses
Non-conformant licenses are usually those that support some of the definition’s principles but not all of them.
● Creative Commons No-Derivatives licenses
Creative Commons No-Derivatives licenses (by-nd-*) violate OD 1.1#3, “Reuse”, as they do not allow works, in part or in whole, to be re-used in derivative works. Creative Commons licenses with the NoDerivs stipulation include:
● Attribution-NoDerivs (by-nd)
● Attribution-NonCommercial-NoDerivs (by-nc-nd)
● Creative Commons NonCommercial licenses
Creative Commons NonCommercial licenses (by-nc-*) do not support OD 1.1#8, “No Discrimination Against Fields of Endeavor”, as they exclude usage in commercial activities. Creative Commons licenses with the non-commercial stipulation include:
● Attribution-NonCommercial (by-nc)
● Attribution-NonCommercial-ShareAlike (by-nc-sa)
● Attribution-NonCommercial-NoDerivs (by-nc-nd)
  • 12. Licence Compatibility
Applicants, as reusers and publishers of open data, often need to understand whether the licenses applied to datasets are "compatible". We recommend that Finodex proposers read this page: https://github.com/theodi/open-data-licensing/blob/master/guides/licence-compatibility.md
The most important step towards understanding compatibility in more detail is to understand the basic provisions of each license. The Creative Commons Rights Expression Language defines some basic facets of licenses, covering Permissions, Requirements and Prohibitions. As the CC licenses are already described using these facets, which are also common to many other licenses, it is possible to put together a matrix that identifies which facets apply to which licenses. Table 1 summarises how a number of licenses can be classified based on these facets.
There are several things to note here:
● The Share Alike requirement requires that derived data is published under the same or compatible terms as the original. This places limits on how remixes can be distributed, i.e. only under compatible terms.
● The Derivative Works prohibition prevents re-users from creating any form of derivative work at all, even if those derivatives are not distributed. However, it is still possible to include the database in a collection in which the original is preserved.
When it comes to publishing derivatives there are, broadly, two different scenarios to consider: publishing a simple derivative based on a single source, and publishing a remix of several datasets.
Once a derivative has been created, it too can be the source of additional derivation. Derivation is a process that can be repeated either by the original publisher (e.g. by mixing in further datasets) or by third parties (e.g. to create new derivatives).
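The remix scenario can be sketched in a few lines of Python. This is a rough first-pass check under strong assumptions, not the method of the linked guide: each licence is reduced to the BY and SA flags from the recommended licences table, two different share-alike licences are treated as conflicting, and any finer compatibility rules are ignored. The names `Licence`, `Dataset` and `plan_remix`, and the sample dataset names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Licence:
    name: str
    attribution: bool  # BY: reusers must credit the publisher
    share_alike: bool  # SA: derivatives must be published under the same terms


@dataclass(frozen=True)
class Dataset:
    name: str
    licence: Licence


# BY/SA flags copied from the recommended conformant licences table.
CC0 = Licence("CC0", False, False)
PDDL = Licence("PDDL", False, False)
CC_BY = Licence("CC-BY-4.0", True, False)
ODC_BY = Licence("ODC-BY", True, False)
CC_BY_SA = Licence("CC-BY-SA-4.0", True, True)
ODBL = Licence("ODbL", True, True)


def plan_remix(sources):
    """Suggest a licence for a remix and list the sources that need credit.

    Simplifying assumption: two different share-alike licences conflict,
    because each asks for the remix to carry its own terms.
    """
    share_alike = sorted({d.licence.name for d in sources if d.licence.share_alike})
    credit = [d.name for d in sources if d.licence.attribution]
    if len(share_alike) > 1:
        return {"ok": False, "licence": None, "credit": credit}
    licence = share_alike[0] if share_alike else "any"
    return {"ok": True, "licence": licence, "credit": credit}


# Hypothetical remix of a share-alike dataset with an attribution-only one.
remix = plan_remix([Dataset("roads", ODBL), Dataset("bus-stops", CC_BY)])
print(remix)
```

A result with `"ok": False` only signals that the licence texts need a closer reading; it does not settle whether the combination is allowed.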
Questions about licence compatibility:
● Can some data published with Licence X be combined with some additional data published under Licence Y?
● What license(s) could be applied to a derived or aggregated dataset?
● Are there provisions associated with a licence that inhibit or constrain the creation and …
Set of questions for open data publishers and reusers (Author: David Tarrant):
● Do you have rights or permission to publish?
● Do you have rights to use the information/data?
● Is the data derived from other sources?
  • 13. Further readings:
http://www.scribd.com/doc/128356210/Business-considerations-or-privacy-and-open-data-how-not-to-get-caught-out
http://www.scribd.com/doc/125638490/Getting-to-grips-with-the-National-Pupil-Database-personal-data-in-an-open-data-world
USEFUL GUIDES for reusers and publishers released by The Open Data Institute
The ODI Publisher's Guide to Open Data Licensing
Source: http://theodi.org/guides/publishers-guide-open-data-licensing
In Europe, there are two kinds of rights that you are automatically given over things that you have created:
● you get copyright over works (content) that you create and which are original to you, such as text that you write or photographs you take
● you get a database right over collections of data that you have put a substantial effort into obtaining, verifying or presenting
Note: As far as we know the database right only arises within the European Union and in Mexico. In some countries there may be no protection for collections of data.
Database right: 15 years since database was last updated
Database copyright: Life of author + 70 years from date database was created
Suggestion for the proposers: If you are uncertain about what rights you may have over a piece of content or dataset, or how you can use it, contact the owner and ask.
  • 14. If you apply original judgement in putting together a database, for example in choosing which items to include within the database or which information about them to include, you have a copyright over that database, because it is a creative work. For example, if you were to build a database about the best 100 cars, this might involve: ● choosing which cars count as the best cars ● writing a description about each car ● researching and gathering facts about them You would have copyright over the database, because you chose which cars were “best”. You would have copyright over the descriptions, because you wrote them. And you would probably have the database right for the database you’ve built, because you put substantial effort into gathering information about them. Importantly, you don’t own the facts about the cars — anyone else can build their own database containing exactly those facts without violating your database right — but no one else can reuse your database or your descriptions without your permission because you own the copyright over them. You probably do not have a database right if you create the facts in a database, as opposed to gathering them from elsewhere, unless you put substantial effort into verifying or presenting the database. For example, if you own a restaurant and create a database of the dishes that you offer and when you offer them, you probably do not have a database right over that database, though you might have copyright because of the creative judgement involved in working out which dishes should be offered on particular days to provide a balanced menu. Copyright and database right are types of Intellectual Property Rights (IPR). There are other kinds of IPR that you can get, such as patents, trademarks and (some) design rights, which must be registered (for example with the Intellectual Property Office). 
Database definition “A collection of independent works, data or materials which are a) arranged in a systematic or methodical way and
  • 15. ● What About Data From Other Organisations? You might not own all the content or data that you have and use within your organisation. In particular, rather than creating the content or gathering the data yourself, some of the content and data you hold and use within your organisation, and might want to publish, might be: ● completely licensed from someone else ● include an extract of content or data that you have licensed from someone else ● be derived from the content or data that you have licensed from someone else The Reuser’s Guide to Open Data Licensing describes what you can do with content or data that you licence from someone else. If you do reuse that content or data in your own publications, you should indicate the licence under which you are reusing that content, so that people reusing that content or data know what they can do with it. ● What About My Brand? Organisations who publish content or data under an open licence are often concerned that this might enable reusers to also copy their brand. Your brand should be protected through a trade mark. A trade mark restricts how other people use your logo or company name. You will also have copyright on the logo. Although your trade mark will protect you from other people using your logo directly, if your logo is incorporated into some content that you licence, you should make sure the logo is explicitly not covered by that licence, as you will usually want to place additional restrictions on its use (such as its adaptation). For example, if you have written a report that includes your logo, and you want to licence the content of the report under the Creative Commons Attribution licence, you could say: The text, figures and tables in this report are licensed under a Creative Commons Attribution 4.0 International License. What If I Publish the Data on a Website? November 2014 Page 15
  • 16. You still have rights over your database and your content when you publish them on a website. Others cannot legally extract and reuse a substantial portion of your data or content without your permission. You can also indicate that others should not scrape data from your website through your Terms and Conditions and through technical mechanisms such as robots.txt.
There are two sets of open licences. You should use a licence from one of these sets rather than creating your own licence, for three reasons:
1. it’s less work
2. it ensures that the legal language in the licence is correct
3. it makes it a lot easier for reusers to know what they can do with your data
● Open Licences for Creative Content
Creative content, such as text, photographs, slides and so on, should be licensed using a Creative Commons Licence. There are three of these that you should consider using for open content:
Level of Licence | Creative Commons Licence
public domain | CC0
attribution | CC-by
attribution & share-alike | CC-by-sa
Make sure that you use the latest (version 4.0) Creative Commons licences, which are international. The links in the table above go to the correct licences.
There are other types of Creative Commons licences that are not open licences. For example, the Creative Commons Attribution-NonCommercial licence does not allow commercial reuse of content, and therefore is not an open licence. If you use the Creative Commons licence chooser, only those that are described as “Free Culture” licences are open licences.
● Open Licences for Databases
We now recommend that you also use a Creative Commons 4.0 licence for data as well as for content. You may alternatively use a similar set of licences that was created specifically for databases by the Open Data Commons. There are again three levels that you can choose from:
Level of Licence | Open Data Commons Licence
public domain | PDDL
attribution | ODC-by
attribution & share-alike | ODbL
The ODbL licence is used for OpenStreetMap. You can find more details here: https://blog.openstreetmap.org/2014/08/06/at-the-edge-of-the-license/
Which Licence Should I Use?
The licence that you use should support your open data business model. It is unusual for organisations to place content or data in the public domain, as being given attribution for the content or data usually helps to achieve some of the goals of opening it up.
It is possible to license content or data under more than one licence, and let reusers choose which licence to use it under. Typically you would dual-license some content or data by making it available under an open licence and under a paid-for licence that does not have the same restrictions. Dual-licensing is typically used with a share-alike licence, as outlined below.
  • 18. Some open data business models work best with a share-alike licence. For example: ● a share-alike licence will usually be unattractive to commercial businesses who don’t want to open up their own data, so using a share-alike licence coupled with a charged licence can be a good basis for a freemium business model ● when you are collaborating with others to create a shared resource, a share-alike licence can help to ensure that you can bring back into that resource any work that others do on their own copies On the other hand, if you are hoping to gain other benefits for your business through the reuse of your data, using a cross-subsidy business model, you may find that a share-alike licence prevents people from reusing it, and therefore want to avoid having a share-alike restriction. There are two cases where you have no choice over what licence you can use for the content or data that you publish. 1. If you are publishing content or data that is derived from content or data that was licensed to you using a share-alike licence, then you must publish your content or data using that same licence. 2. With very few exceptions, if you are a government department or arms-length body then the content or data that you have created or gathered is owned by the Crown. Unless you have an exemption, granted by the Office of Public Sector Information (OPSI), you must publish this data using the Open Government Licence. What Attribution Should I Ask For? If you choose a licence that includes a requirement for attribution, you need to specify what that attribution should look like. In choosing what attribution to ask for, you should consider the ways in which your data or content might be reused, and the fact that it might be combined with other data or content that might require its own attribution. If you want to encourage the reuse of your data or content, you need to make it easy for reusers to satisfy your attribution requirements. 
There are two things you should document:
  • 19. 1. What should the attribution include? You will usually want the name of your organisation, and a link to either your organisation’s home page or a page about the data or content you are licensing. Keep this as minimal as possible. 2. Where and how should the attribution be presented? Some attribution requirements specify that the attribution must be presented directly wherever the data is used, and may even specify the size or format of the attribution. These requirements can be difficult to adhere to, particularly for mobile application developers who have limited screen space to include such attributions. Allowing reusers to provide attribution on a separate page makes this easier. Note that under the terms of the licences listed above, when a reuser uses your data or content to add value to or to create new data or content, they cannot relicense your work. Any onward reusers are bound by the same attribution requirements as the direct reusers of your content or data. It’s a good idea to explicitly document this requirement because it might not be obvious to reusers. How Do I Indicate the Licence of Content or Data? You should indicate the licence for content or data you make available using both a human- readable description and computer-readable metadata. The clearer you make it which licence applies to your content or data, the easier it is for reusers to know that they can reuse the content or data you are licensing. The human-readable descriptions and marks that you should use are spelled out on the Creative Commons and Open Data Commons websites: ● Creative Commons licence chooser ● Open Data Commons licences It is best to embed information about the licence that some content or data is available under directly within the content or data. This ensures that the licensing information is carried around with the content or data. In addition to human-readable text, you should provide computer-readable metadata. 
The separate Publisher’s Guide to the Open Data Rights Statement Vocabulary describes how to do this.
If you add your dataset to a catalog, such as data.gov.uk or the Data Hub, you should make sure that you indicate the licence under which the dataset is available within that catalog. This gives people searching the catalog a quick and easy way of seeing that they will be able to reuse the dataset.
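As an illustration of pairing a human-readable statement with computer-readable metadata, here is a minimal Python sketch. It assumes a made-up dataset and publisher, and the JSON key names are hypothetical choices for this example; they are not taken from the Open Data Rights Statement Vocabulary, which the separate guide describes. Only the Creative Commons licence URL refers to a real resource.

```python
import json


def licence_notice(title, publisher, publisher_url, licence_name, licence_url):
    """Return a (human-readable, machine-readable) pair describing the licence."""
    human = (
        f"{title} by {publisher} ({publisher_url}) is licensed under "
        f"{licence_name} ({licence_url})."
    )
    # Hypothetical key names, chosen for this sketch only.
    machine = json.dumps(
        {
            "title": title,
            "publisher": {"name": publisher, "url": publisher_url},
            "licence": {"name": licence_name, "url": licence_url},
        },
        indent=2,
    )
    return human, machine


human, machine = licence_notice(
    "Bus stop locations",           # hypothetical dataset
    "Example Transport Authority",  # hypothetical publisher
    "https://example.org/",
    "Creative Commons Attribution 4.0 International",
    "https://creativecommons.org/licenses/by/4.0/",
)
print(human)
print(machine)
```

The text can be placed next to the download link, while the JSON can travel with the data file or be copied into a catalog entry, so that both people and programs see the same licence.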
  • 21. The ODI Reuser's Guide to Open Data Licensing
Source: http://theodi.org/guides/reusers-guide-open-data-licensing
The fact that you can get hold of some information does not necessarily mean that you can do whatever you want with it. You need to have permission from the owner of that information to do what you want to do. A licence tells you what you can do. But what does it mean to license data? What requirements can a licence place on you? What different licences do publishers use? How can you find out what licence a dataset is available under? This guide answers these questions.
Note: This guide focuses on data published by organisations based in the UK. Licensing law is different in different countries, so some of this information might not apply to you if you are reusing information that is published elsewhere. It does not address other potential legal considerations, such as compliance with the Data Protection Act.
● What Do Publishers Own?
In Europe, there are two kinds of rights that publishers (organisations or individuals who make available content or data) are given over things that they have created:
● they get copyright over works (content) that they create and which are original to them, such as text that they write or photographs they take
● they get a database right over collections of data that they have put a substantial effort into obtaining, verifying or presenting
Note: As far as we know the database right is unique to the European Union. In some countries there may be no protection for collections of data.
If someone applies original judgement in putting together a database, for example in choosing which items to include within the database or which information about them to include, they have a copyright over that database, because it is a creative work.
For example, if someone were to build a database about the best 100 cars, this might involve:
● choosing which cars count as the best cars
● writing a description about each car
● researching and gathering facts about them
  • 22. They would have copyright over the database, because they chose which cars were “best”. They would have copyright over the descriptions, because they wrote them. And they would probably have the database right for the database they’ve built, because they put substantial effort into gathering information about the cars. Importantly, they don’t own the facts about the cars — you or anyone else could build your own database containing exactly those facts without violating their database right — but no one else can reuse their database or their descriptions without their permission because they own the copyright over them. Publishers probably do not have a database right if they create the facts in a database, as opposed to gathering them from elsewhere, unless they put substantial effort into verifying or presenting the database. For example, if someone owns a restaurant and creates a database of the dishes that they offer, and when they offer them, they probably do not have a database right over that database, though they might have copyright because of the creative judgement involved in working out which dishes should be offered on particular days to provide a balanced menu. ● What About Data From Third Parties? Publishers might not own all the content or data that they publish themselves. In particular, rather than creating the content or gathering the data themselves, some of the content and data they publish might be: ● completely licensed by them from someone else ● include an extract of content or data that they have licensed from someone else ● be derived from the content or data that they have licensed from someone else When they publish the data, the publisher should tell you about which content or data is owned by another organisation, and under which licence it is being republished. ● What About Brands? Brands are usually protected through a trade mark. A trade mark restricts how you can use an organisation’s logo or company name. 
They will also have copyright on the logo. Licences for content or data usually explicitly exclude logos and company names, so you cannot, for example, adapt a logo by changing the colours used within it. You also cannot use the company name or logo to lend weight to your product without permission to do so. However, the attribution requirements of a licence may require you to use the company name and logo to indicate that you have reused data owned by that company. ● What Can’t You Do? There are a few things that you can do with content or data without a licence, but in general you need to be given a licence by a publisher if you want to reuse their content or data. Having access to some content or data — for example by downloading it from a publisher’s website — does not give you the right to reuse it. November 2014 Page 22
  • 23. ● Republishing and Adding Value You do not automatically have the right to republish, in its entirety, content or data that someone else owns, even if they have given you a licence to use it yourself. You need to check the terms of the licence for the content or data to make sure that you can republish it. The same applies if you are adding value to the content or data, for example by automatically adding links or styling to content, or adding columns with extra information into a dataset. The new content or data includes the entirety of someone else’s content or data, so you cannot publish it unless you have their permission. ● Publishing Extracts You have the right to publish extracts of content or databases that you have access to, regardless of what the licence says, so long as the extract is not “substantial”. However, it is often hard to tell if the extract that you have made is “substantial”. The licence that you have been given might let you republish any amount of the content or data (open licences do this). Otherwise, you should take legal advice about whether the extracts that you want to publish are likely to count as substantial or not. ● Publishing Derived Content or Data You might want to create new content or databases by adapting, deriving, or otherwise processing some content or data. To do that, you first have to ensure you have been given a licence to use the data in the first place. You then need to look at what the licence says about creating derived works. For example, say you have been given a licence to use a photograph on your website. You could create a new version of that photograph by changing it from colour to black & white, or by adding a speech bubble to it. In this case, the photograph is a creative work, and the person who took it owns the copyright. Because the photograph is protected by copyright, you can only create these new images if the licence under which you are using the photograph allows you to do so. 
Copyright can exist in small pieces of content, such as phrases. For example, if you analyse some content to create a new database, you should make sure that you have the right to reuse any snippets of content that you might keep in the new database. If the content includes a presentation of data from a database, you have to consider database rights as well: scraping data from the page might equate to creating an extract. Database rights are slightly different, because they only extend to creating extracts or re-utilising (republishing) a database. For example, say you analysed the data about prescriptions of each drug within each GP practice within the UK, along with other data about the coverage of each practice, to create a new dataset that provided the average spend per patient of each practice. So long as you had no separate contractual obligations to the owners of the two datasets you have brought together, you might well be free to do what you liked with the result, as it would not be possible to reconstruct the original databases from the aggregated data. November 2014 Page 23
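The prescription example can be made concrete with a small Python sketch. All figures, practice identifiers and drug names below are invented for illustration, and `average_spend_per_patient` is a hypothetical helper. The point it shows is that the per-drug rows collapse into one number per practice, so the source rows cannot be rebuilt from the result.

```python
from collections import defaultdict

# Hypothetical prescription rows: (practice_id, drug, spend in pounds).
prescriptions = [
    ("P1", "drug-a", 1200.0),
    ("P1", "drug-b", 300.0),
    ("P2", "drug-a", 1200.0),
]

# Hypothetical coverage data: registered patients per practice.
patients = {"P1": 500, "P2": 300}


def average_spend_per_patient(prescriptions, patients):
    """Aggregate spend per practice and divide by its registered patients."""
    totals = defaultdict(float)
    for practice, _drug, spend in prescriptions:
        totals[practice] += spend
    # Skip practices with no (or zero) patient count to avoid dividing by zero.
    return {
        practice: round(total / patients[practice], 2)
        for practice, total in totals.items()
        if patients.get(practice)
    }


averages = average_spend_per_patient(prescriptions, patients)
print(averages)  # {'P1': 3.0, 'P2': 4.0}
```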
  • 24. ● What Do Licences Say? Licences tell you what you can do with the content or data that you access. A licence will tell you whether you can: ● republish the content or data on your own website ● derive new content or data from it ● make money by selling products that use it ● republish it while charging a fee for access Many licences will let you access content or data for free, but say that you cannot republish it or adapt it, or use it within commercial products. If you break the terms of the licence, the owner of the content or data can take you to court. ● What Do Open Licences Say? An open licence is one that places very few restrictions on what you can do with the content or data that is being licensed. According to the Open Definition, there are only two kinds of restrictions that an open licence can place: ● that you must give attribution to the source of the content or data ● that you must publish any derived content or data under the same licence (this is called share-alike) An open licence might do neither or one or both of these. So, you might encounter content or data available under one of three levels of licence: 1. a public domain licence has no restrictions at all (technically, these indicate that the rights owner has waived their rights to the content or data) 2. an attribution licence just says that you must give attribution to the publisher 3. an attribution & share-alike licence says that you must give attribution and share any derived content or data under the same licence November 2014 Page 24
  • 25. ● How Do You Provide Attribution?
You should provide attribution even if the licence does not require it. Giving attribution is a way of recognising both the efforts that the publisher has made to put together the content or data you are reusing, and their generosity in making it available for reuse.
When content or data is licensed using a licence that includes attribution, the publisher might specify:
● what wording the attribution should include
● where and how the attribution should be presented
You should follow what the publisher asks you to do. If it is not practical, for example if you are providing a service that does not have room for the attribution statement that they request, then get in touch with them to ask what to do.
It is good practice to provide the name of the organisation that published the data or content, and a link to their home page. Specifying the name of the dataset and providing a link to its location also helps other reusers to find the data you are reusing.
If you are building a tool that reuses some content or data, you should try to include attribution on every page or screen in which the content or data is used. If this is impractical (for example because you are pulling together information from lots of different sources), you should provide a clear link to a page or screen that then provides attribution information.
If you are republishing data or content, its reusers are still bound by the attribution requirements of the original data or content. To make it easier for them to understand and fulfil those requirements, it is good practice to include the attribution for the source data or content in the attribution that you ask for. This might sometimes be impractical, for example because you are creating derived data or content that includes data or content from a large number of sources. In these cases, you should provide a full list of the sources and request an attribution which links to that list.
● How Do You Share-Alike?
A share-alike licence requires you to republish new content or data that you create using the given content or data under the same share-alike licence. Creating new ways of presenting data does not count as derivation or adaptation, but combining two sets of data to create a new set probably does.
Publishing the content and data that you create from open data, as open data, is a good thing to do even if the licence does not require it. Opening up your content and data enables others to reuse and build on your work, and can add value to your work.
● What Open Licences Are There?
There are two sets of open licences that you may encounter.
  • 26. ● Open Licences for Creative Content
Creative content, such as text, photographs, slides and so on, may be licensed using a Creative Commons Licence. There are three of these that you might encounter:
Level of Licence | Creative Commons Licence
public domain | CC0
attribution | CC-by
attribution & share-alike | CC-by-sa
There are different versions of each of these licences, the most recent being version 4.0. There are also different variants which take into account differences in the law in different countries. The links in the table above are to the 4.0 versions, which apply internationally, but you may find publishers using other versions. You can reuse content under these licences no matter what country you are in.
There are other types of Creative Commons licences that are not open licences. For example, the Creative Commons Attribution-NonCommercial licence does not allow commercial reuse of content, and therefore is not an open licence.
The human-readable summaries of the Creative Commons licences spell out exactly what you can do under each licence.
● Open Licences for Databases
You might encounter a similar set of licences which is available for databases from the Open Data Commons. There are again three levels:
Level of Licence | Open Data Commons Licence
public domain | PDDL
attribution | ODC-by
attribution & share-alike | ODbL
● Other Licences
There are other licences that enable reuse and which you may encounter, particularly around public sector information:
  • 27. ● Open Government Licence is an attribution licence that covers both copyright and database right and is mainly used for information made available by UK central government ● OS Open Licence is an attribution licence that is exactly the same as the Open Government Licence but ensures that the attribution is to the Ordnance Survey ● How is the Licence Indicated? The licence under which information is published should be clear both in human-readable content and as machine-readable data. If you cannot work out the licence for information that you discover on the web, you should contact the owner of the site to ask: the lack of licensing information means that you cannot assume the right to reuse the content or data. Human-readable descriptions and marks that you may encounter are shown on the Creative Commons and Open Data Commons websites: ● Creative Commons licence chooser ● Open Data Commons licences Where possible, the publisher should have embedded information about the licence directly within the content or data itself. Often, however, you will have to look at the page from which you access the content or data, or the licence information for the entire website, which is often linked to from the footer of the page. If a publisher adds their dataset to a catalog, such as data.gov.uk or the Data Hub, they may indicate the licence under which the dataset is available in the metadata supplied by the catalog. You should check that this is consistent with any licence information they supply on their own site or within the data itself: if it is not, you should ask them for clarification. Legal tools for open data Open Data Commons is the home of a set of legal tools to help you provide and use Open Data http://opendatacommons.org/ http://opendatacommons.org/faq/licenses/ 3. Big Data vs. Open Data November 2014 Page 27
  • 28. Big Data vs Open Data - Diagram Source: http://www.opendatanow.com/2013/11/new-big-data-vs-open-data-mapping-it-out/#.VGDCrfSG9Zt As Joel Gurin points out: “there’s general agreement that Open Data should be free of charge or cost just a minimal amount. Starting with some basic descriptions, the intersection of these three concepts (big data, open data, open government) defines the six subtypes of data shown on the Venn diagram. (There’s no separate category for the intersection of Big Data and Open Government – anything in that category is also Open Data.) Here are characteristic examples of each, referring to the numbers above. 1. Big Data that’s not Open Data. A lot of Big Data falls in this category, including some Big Data that has great commercial value. All of the data that large retailers hold on customers’ buying habits, that hospitals hold about their patients, or that banks hold about their credit-card holders, falls here. It’s information that the data-holders own and can use for commercial advantage. National security data, like the data collected by the NSA, is also in this category. 2. Open Government work that’s not Open Data. This is the part of Open Government that focuses purely on citizen engagement. For instance, the White House has started a petition website, called We the People, to open itself to citizen input. While the site makes its data available, publishing Open Data – beyond numbers of signatures – is not its main purpose. 3. Big, Open, Non-Governmental Data. Here we find scientific data-sharing and citizen science projects like Zooniverse. Big data from astronomical observations, from large biomedical projects like the Human Genome Project, or from other sources realizes its greatest value through an open, shared approach. While some of this research may be government-funded, it’s not “government data” because it’s not generally held, maintained, or analyzed by government agencies. 
This category also includes a very different kind of Open Data: the data that can be analyzed from Twitter and other forms of social media. 4. Open Government Data that’s not Big Data. Government data doesn’t have to be Big Data to be valuable. Modest amounts of data from states, cities, and the federal government can have a major impact when it’s released. This kind of data fuels the participatory budgeting movement, where cities around the world invite their residents to look at the city budget and help decide how to spend it. It’s also the fuel for apps that help people use city services like public buses or health clinics. November 2014 Page 28
  • 29. 5. Open Data – not Big, not from Government. This includes the private-sector data that companies choose to share for their own purposes – for example, to satisfy their potential investors or to enhance their reputations. Environmental, social, and governance (ESG) metrics fall here. In addition, reputational data, such as data from consumer complaints, is highly relevant to business and falls in this category. 6. Big, Open, Government Data (the trifecta). These datasets may have the most impact of any category. Government agencies have the capacity and funds to gather very large amounts of data, and making those datasets open can have major economic benefits. National weather data and GPS data are the most often-cited examples. U.S. Census data, and data collected by the Securities and Exchange Commission and the Department of Health and Human Services, are others. With the new Open Data Policy, this category will likely become larger, more robust, and even more significant. November 2014 Page 29
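The six subtypes above follow mechanically from three yes/no questions: is the data big, is it open, and is it part of Open Government? The Python sketch below encodes that reading of Gurin's diagram. The function name and the decision to raise an error for the "big and Open Government but not open" combination are assumptions of this example; the text only says that no such region exists in the diagram.

```python
# Labels follow the numbering used in the list above.
CATEGORY_LABELS = {
    1: "Big Data that's not Open Data",
    2: "Open Government work that's not Open Data",
    3: "Big, Open, Non-Governmental Data",
    4: "Open Government Data that's not Big Data",
    5: "Open Data, not Big, not from Government",
    6: "Big, Open, Government Data",
}


def classify(big, open_data, open_government):
    """Return the category number (1-6) for a dataset, or None if no circle applies."""
    if big and open_government and not open_data:
        # The diagram has no separate region here: anything that is both
        # Big Data and Open Government is also Open Data.
        raise ValueError("combination not present in the diagram")
    if big and open_data and open_government:
        return 6
    if open_data and open_government:
        return 4
    if big and open_data:
        return 3
    if open_data:
        return 5
    if big:
        return 1
    if open_government:
        return 2
    return None
```

Under this sketch, retailers' customer data (big, not open, not Open Government) falls in category 1, while national weather data (big, open, government) falls in category 6.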
  • 30. 4 key steps These are in very approximate order; many of the steps can be done simultaneously.
1. Choose your dataset(s). Choose the dataset(s) you plan to make open. Keep in mind that you can (and may need to) return to this step if you encounter problems at a later stage.
2. Apply an open license.
○ Determine what intellectual property rights exist in the data.
○ Apply a suitable ‘open’ license that licenses all of these rights and supports the definition of openness.
○ NB: if you can’t do this go back to step 1 and try a different dataset.
3. Make the data available, in bulk and in a useful format. You may also wish to consider alternative ways of making it available such as via an API.
4. Make it discoverable: post on the web and perhaps organize a central catalog to list your open datasets.
  • 31. 4. Categories and Type of Data Open can apply to information from any source and about any topic. Anyone can release their data under an open licence for free use by and benefit to the public. Although we may think mostly about government and public sector bodies releasing public information such as budgets or maps, or researchers sharing their results data and publications, any organisation can open information (corporations, universities, NGOs, startups, charities, community groups and individuals). There is open information in transport, science, products, education, sustainability, maps, legislation, libraries, economics, culture, development, business, design, finance. So the explanation of what open means applies to all of these information sources and types. Source: http://blog.okfn.org/2013/10/03/defining-open-data/#sthash.nXnXf8Bx.dpuf November 2014 Page 31
  • 32. Categories
● Business and Legal services
● Data/Technology
● Education
● Energy
● Environment and weather
● Finance and Investment
● Food and Agriculture
● Geospatial/Mapping
● Governance
● Healthcare
● Housing/real estate
● Insurance
● Lifestyle and Consumer
● Media
● Research and Consulting
● Scientific Research
● Transportation
  • 33. The Open Data Consumers Checklist: Source: http://theodi.org/guides/the-open-data-consumers-checklist The Open Data Handbook: Source: http://opendatahandbook.org/ The handbook introduces you to the legal, social and technical aspects of open data. It can be used by anyone but is especially useful for those working with government data. It discusses the why, what and how of open data: why to go open, what open is, and how to do open. Read it online or download a PDF.
  • 34. 4. Open data as part of your business model Al-Debei and Avison (2010) derived a unified business model framework based on a comprehensive review of the literature. They argue that the model provides an abstract but holistic view and that the fundamental dimensions are value based. There are four relevant aspects to the business model framework: ● Value proposition: the business logic for creating value for customers by offering products and services for targeted segments, ● Value architecture: an architecture for the technological and organizational infrastructure used in the provisioning of products and services, ● Value network: collaboration and coordination with other organizations, and ● Value finance: the costing, pricing, and revenue breakdown associated with sustaining and improving the creation of value. New business models and practices driven by social media and open data have hardly been investigated. Exceptions are the analyses of companies in the United Kingdom (Hammell, Perricos, Lewis, & Branch, 2012) and a classification of social business models based on the revenue model (for instance, Ferro, 2012; Ferro & Osella, 2012; Ferro & Osella, 2013; Ubaldi, 2013). Based on the analysis of a number of companies in the United Kingdom, five archetypes of business models can be identified (Hammell et al., 2012). These include: (1) suppliers (public and private sector organizations) publishing the data, (2) aggregators linking open data to produce useful insights, (3) developers (organizations and individuals) building apps, (4) enrichers using open data to enhance their existing products and services, and (5) enablers facilitating the supply and use of open data. Ferro and Osella (2013) identify the following models: 1. Premium: end users are offered a service or product in exchange for payment. 2. Freemium: basic services or products are offered free of charge. Profit is made by having end users pay for extended features. 3. 
Open source like—data are offered for free through cross subsidization. 4. Infrastructural Razor and Blades—data sets are stored for free and are accessible to everyone via Application Programming Interfaces (APIs) (‘‘razors’’), while reusers are charged only for the computing power that they employ on demand in as-a-service mode (‘‘blades’’). 5. Demand-oriented platform—the company provides developers with a one-stop shop of data sets that are catalogued using metadata. Revenue is made in exchange for advanced services and refined data sets or data flows. 6. Supply-oriented platform—this business model is quite similar to the previous one, but the PSI providers are charged in lieu of developers. 7. Free as branded advertising—the company uses PSI as a tool to attract attention from November 2014 Page 34
  • 35. customers by providing them with useful services. The company expects that the public will then favor its particular brand or company. Revenue is expected not to come directly from PSI, but from other business lines that represent the company’s core business. 8. White label development—a company wants to use PSI as an attraction tool but does not have the competencies required to do so. The company then uses an advertising factory, which receives payment in the form of a lump sum or recurring fees in exchange for turnkey solutions, depending upon whether the solution is in the form of a product or a service (Ferro & Osella, 2013). The revenue model can be payment by open data providers or users in the form of (1) recurring fees, granting access for a specific time period, or pay per use, (2) advertisement, or (3) ensuring visibility for creating revenue for other activities (Ferro & Osella, 2013). Although these eight options describe a complete array of possible business models, they are derived from the revenue. Infomediary Business Models for Connecting Open Data Providers and Users Available here: http://ssc.sagepub.com/content/early/2014/01/30/0894439314525902.full.pdf+html All infomediary business models can be developed and operated by either public or private organizations. The business model might be initiated by public events (hackathons) but operated by private party, yet when a best practice is adopted the roles can be reversed. The following six business models were identified. 1. Single-purpose apps provide real-time services such as information about weather, quality of restrooms, vehicles, houses, and pollution. These apps often provide a single function, based on one type of open data provided. The app processes the data and presents it visually for the ease of the users. 2. Interactive apps: In addition to single-purpose apps, this type of business model provides users the opportunity to add content. 
Ratings are often included, as is additional information such as complaints. 3. Information aggregators take many published open data sources and combine and process them for subsequent presentation to the users. An example is a transportation planner that aggregates information from various transport modalities and companies. Often interoperability is a challenge that requires agreements among data providers. 4. Comparison models: This type of business model aggregates open data from various sources for the purpose of comparing the performance of entities with each other. For example, it can be used to compare schools and other public organizations. The data can originate from official sources (school inspection) or from users (criminal chart) and used by citizens (in determining a school for their children or a place to live) and public organizations (in developing measures to improve schools or for crime interventions). 5. Open data repositories are used by governments to publish their information. These can be national open data portals or more specialized portals, such as websites of statistical agencies. The essence is that these portals are relatively closed and only a limited number of public organizations can publish open data on them. There is little to no user interaction, and the focus is on being able to indiscriminately open data sets. Searching for open data is a key aspect, although it is often difficult to find the right information. They can provide basic functionalities for processing and visualizing data. November 2014 Page 35
  • 36. 6. Service platforms: These platforms provide all kinds of features for searching, importing, cleansing, processing, and visualizing information. Service platforms often contain open data repositories or are connected to open data repositories that function as the data source. Service platforms can vary in the level of openness; some are based on payment (e.g., www.junar.com) whereas others are free of charge (www.engagedata.eu). Further reading: Business models for open data applications available at: http://www.appsforeurope.eu/article/business-models-open-data-applications
  • 37. 5. Case Studies: Open Data Business Success stories about the open data startups from the ODI Startup Programme
Transport API http://www.transportapi.com
Clients: Transport for London, Heathrow Airport, Greater London Authority, Citymapper, Elgin, Giraffe.co.uk, Network Rail
Products: TransportAPI
Achievements: TransportAPI solutions have powered award-winning apps, such as Citymapper.
The TransportAPI story: TransportAPI is Britain’s first comprehensive open platform for transport solutions. The company’s objective is to enhance the travel experience through real-time information, and to enable new transportation insights through analytics. It uses open data feeds from key industry sources such as Traveline, Network Rail and Transport for London. The company offers nationwide timetables, departure and infrastructure information for schedules, live departures and archived service running across all transport modes. The data feeds are available for integration by web and app developers. Data components such as the ‘nearest transport’ widget can be used in travel portals, hyperlocal sites and business analytics. TransportAPI currently has 700 developers and organizations signed up on its platform. They are individual taxpayers, but also public sector organizations like universities and local authorities who are getting free data. As Jonathan Raper, Managing Director, says, “Our intervention in the market has led prices for transport data [to] fall and previous monopoly transport data providers to relax their terms.” The company also scales data usage and provides a new, single-source option for its customers, like Heathrow Airport, who now use TransportAPI for all their public transport information. Jonathan further explains that “TransportAPI employs 6 people now and the tax we generate per year is nudging £75K”.
  • 38. Mastodon C http://www.mastodonc.com
Clients: Technology Strategy Board, CDEC’s Open Health Data platform, Nesta
Products: Kixi Data Platform
Achievements: Mastodon C identified £200m of potential savings to the NHS in its prescribing analytics project, which investigated the use of branded statins over cheaper generic versions.
The Mastodon C story: Mastodon C helps businesses make sense of the proliferation of data that now exists, allowing them to make better decisions. It does this using a cloud-based open source data processing and analytics platform, which it customises to each client’s datasets. The team also applies data science techniques to gain insights, make predictions and find business value from data, which is built back into client systems. The team at Mastodon C uses open data together with the closed data that clients own. Francine Bennett, Co-Founder and CEO at Mastodon C, says: “We often find ourselves introducing clients to open data concepts through our work, as we’ll suggest useful datasets which they can make use of to help their business.”
  • 39. 6. Where do I find open data? A list of open data catalogs:
http://publicdata.eu/
https://open-data.europa.eu/it/data
http://datacatalogs.org/
http://planet.openstreetmap.eu
http://wikidata.org
dbpedia.org
  • 40. 7. How can you develop your open data business? This chapter was prepared by the Finodex team and is already included in the Finodex Handbook. Summary: In this chapter we provide basic knowledge regarding how you can develop your business using open data. We’ll show how to generate a business model, exploring the components of the Business Model Canvas in detail. In particular, we’ll offer an overview of open data business models. In the case of reuse of PSI (Public Sector Information), Osella & Ferro have developed an interesting framework “that focuses on decision-making levers that a business developer has at his/her fingertips for molding the overarching architecture of a business venture hinged on public data re-use”. They combined the framework with the business model ontology by employing the Business Model Canvas in order to visualize archetypal business models at an enterprise level. The tool has proved very useful and could probably be adopted in the development and assessment of any data-intensive business venture. After exploring eight business models we introduce the importance of adopting the Lean methodology for business development, offering a case study of open data business development in which the Lean approach has been used. Moreover, defining and setting your business goals requires a competitor analysis, which is also explained. Last but not least, we describe the rights connected to using open datasets. Licensing and related issues of compatibility between licenses are crucial when you deal with open data. Index: a. Business Modeling b. Open Data Business models c. Lean methodology d. Competitor Analysis e. Intellectual Property Rights
  • 41. a) Business Modeling A business model is a strategic tool that indicates how the company makes money, specifying the sources of the company’s revenues as well as how much and how often these sources are willing to pay. Since its publication in 2004, the book “Business Model Generation” by Osterwalder and Pigneur has become the bible for startups and SMEs. In their book the authors explain the so-called Business Model Canvas (Figure 1), a tool that helps you visualize and capture the components of a business model and assists you in the business model generation process. In order to keep track of all of your steps in creating your business model, you may want to download the “canvas” and start to write down all the assumptions and progress that you make! Figure 1. Business Model Canvas Source: “A business model describes the rationale of how an organization creates, delivers, and captures value” in Osterwalder & Pigneur, Business Model Generation, 2004. According to Osterwalder, in order to build an effective business model you have to identify several blocks. We briefly list them below. For each of them, rather than a theoretical description, we provide a set of practical questions for you to answer. Down to work! 1. Customer segments First of all, you need to define which customers you aim to reach. You have to answer two important questions: ● For whom are we creating value? ● Who are our most important customers?
  • 42. 2. Value Proposition You should provide your customers with a product or a service that has added value. The “value proposition” is a statement that summarizes why potential consumers should buy your particular product or service, and prefer it to similar offerings. In this case, you should answer the following questions: ● What value do we deliver to the customer? ● Which one of our customer’s problems are we helping to solve? ● Which customer needs are we satisfying? ● What bundles of products and services are we offering to each Customer Segment? Factors such as newness, performance, customization, design, brand/status, cost reduction, risk reduction, accessibility, and convenience/usability can add value to your business. Your value proposition may be qualitative (privileging customer experience and outcome) and/or quantitative (price and efficiency). 3. Sales Channels Once you have understood your value proposition and your customer segment, you need to take care of the channels that deliver the value to your clients. You should ask yourself: ● Through which channels do our customer segments want to be reached? ● How are we reaching them now? ● How are our channels integrated? Which ones work best? ● Which ones are most cost-efficient? ● How are we integrating them with customer routines? You can reach your clients either through your own channels (store front), your partner channels (major distributors), or a combination of both. 4. Customer Relationships Another important step: you have to identify the kind of relationship you establish with each of your customer segments. These are the main questions you should answer: ● What type of relationship does each of our customer segments expect us to establish and maintain with them? ● Which ones have we established? ● How costly are they? ● How are they integrated with the rest of our business model? 
The different types of customer relationships are: personal assistance, automated service, communities and so on. 5. Revenue streams You need to plan how you are going to generate cash through the customer segment (costs must be subtracted from revenues to create earnings). The meaningful questions are: ● For what value are our customers really willing to pay? ● For what do they currently pay? ● How are they currently paying? November 2014 Page 42
  • 43. ● How would they prefer to pay? ● How much does each Revenue Stream contribute to overall revenues? There are several possibilities for generating revenue streams, such as asset sales, usage fees, subscription fees, lending/leasing/renting, licensing, etc. 6. Key resources & key activities You then need to understand which assets will make your business model work. Answer the following questions: ● What Key Resources do our Value Propositions require? ● Our Distribution Channels? ● Customer Relationships? ● Revenue Streams? Then consider which actions you can take to make your business model work: ● What Key Activities do our Value Propositions require? ● Our Distribution Channels? ● Customer Relationships? ● Revenue Streams? 7. Key partnerships You will probably need the help of external partners and/or suppliers to make your business model work properly: ● Who are our Key Partners? ● Who are our key suppliers? ● Which Key Resources are we acquiring from partners? ● Which Key Activities do partners perform? 8. Cost structure Last but not least, consider which costs you will incur, and their consequences, when you start applying your business model to your product: ● What are the most important costs inherent in our business model? ● Which Key Resources are most expensive? ● Which Key Activities are most expensive? Further reading ● A. Osterwalder & Y. Pigneur, Business Model Generation, 2004 ● Elements of a business plan, available online b) Open data business models In the case of PSI (Public Sector Information) reuse performed by private sector entrepreneurs, many inherent roadblocks, coupled with a certain vagueness surrounding the rationale underlying business endeavors, keep slowing the process down. The advent of the Open Data framework, oriented towards data openness (i.e. open by default), poses new issues regarding the access to
  • 44. information which occurs free of charge and different forms of payment may be required for restricting the access to derivative works. Two Italian researchers Michele Osella and Enrico Ferro (2012) developed a framework “that focuses on decision-making levers that a business developer has at his/her fingertips for molding the overarching architecture of a business venture hinged on public data re-use”. Figure 2. Framework for PSI business model analysis by Osella & Ferro Source: Osella & Ferro, “Business Models for PSI Re-Use: A Multidimensional Framework”, 2012 Figure 3. Framework for PSI business model analysis by Osella & Ferro Source: Osella & Ferro, “Business Models for PSI Re-Use: A Multidimensional Framework”, 2012 November 2014 Page 44
  • 45. While developing the framework surrounding PSI reuse, they realized that it was not sufficient to grasp the business logic and the mechanisms needed to build an effective strategy. A solution came from the combination with Osterwalder's business model ontology, by employing the Business Model Canvas (explained in the previous paragraphs) in order to visualize archetypal business models at an enterprise level. The tool has proved very useful and could probably be adopted in the development and assessment of any data-intensive business venture. The result is the identification of eight business models currently employed by the actors present in the Public Sector Information centric (PSI-centric) ecosystem. In particular, the choice of the business model to adopt is a function of the position covered in the value chain and of the strategic choices made. Why are they useful? From a business model viewpoint, which is one of the perspectives on the PSI realm shown by Osella, our interest is to identify the steps needed to maximise the benefits for reusers of open data, “a profit-driven reuse and value creation”. The following list gives the eight business models as described by Osella and Ferro: 1. Premium Product / Service. 2. Freemium Product / Service. A classic example in this vein is represented by mobile apps related to public transportation in urban areas. 3. Open Source Like. Examples: OpenCorporates and OpenPolis. 4. Infrastructural Razor & Blades. Example: Public Data Sets on Amazon Web Services. 5. Demand-Oriented Platform. Examples: DataMarket and Infochimps. 6. Supply-Oriented Platform. Example: Socrata. 7. Free, as Branded Advertising. 8. White-Label Development. This business model has not consolidated yet, but some embryonic attempts seem to be particularly promising. In this section we explore the eight identified business models in more detail. 
The main references are two papers co-authored by Ferro and Osella: “Business Models for PSI Re-Use: A Multidimensional Framework” (2012) and “Eight Business Model Archetypes for PSI Re-Use” (2013). #1 Premium Product / Service: While implementing this business model, a core re-user offers to end-users a product or a service presumably characterized by high intrinsic value in exchange for a payment that could occur à la carte or in the guise of a recurring fee: while the former implies the November 2014 Page 45
  • 46. payment of an amount of money for each unit of product purchased (pay-per-use), the latter has an "all-inclusive" nature since it grants for a given timeframe the access to certain features in accordance with contractual terms. In this business model, probably associated to the “mainstream” model by the majority of analysts, the high intrinsic value, coupled with the price mechanism, calls for B2B customers (often called “high-end market”) and for long or medium terms relationships going beyond single transactions (Osella & Ferro, 2013). Figure 4. Premium Product / Service (framework view) Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013 November 2014 Page 46
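The difference between the two payment forms of the Premium model, pay-per-use and a recurring "all-inclusive" fee, comes down to simple arithmetic. The Python sketch below shows how a customer might compare them. It is only an illustration: the function names and the prices used in the usage note are hypothetical and do not come from Osella & Ferro. Amounts are kept in integer cents to avoid rounding surprises.

```python
def pay_per_use_total(units, unit_price_cents):
    """Total paid under pay-per-use: one payment for each unit purchased."""
    return units * unit_price_cents


def break_even_units(recurring_fee_cents, unit_price_cents):
    """Smallest number of units at which the recurring fee is no more
    expensive than pay-per-use for the same period."""
    if unit_price_cents <= 0:
        raise ValueError("unit price must be positive")
    # Integer ceiling division: -(-a // b) rounds a / b upwards.
    return -(-recurring_fee_cents // unit_price_cents)


def cheaper_option(units, recurring_fee_cents, unit_price_cents):
    """Name the cheaper payment form for a given expected usage.
    Ties go to the recurring fee, matching break_even_units."""
    if pay_per_use_total(units, unit_price_cents) >= recurring_fee_cents:
        return "recurring fee"
    return "pay-per-use"
```

With a hypothetical recurring fee of 100.00 and a unit price of 2.50, `break_even_units(10000, 250)` gives 40: a customer expecting fewer than 40 purchases per period pays less à la carte, while heavier users are better served by the recurring fee.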
  • 47. Figure 5. Premium Product / Service (“Canvas” view) Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013 #2 Freemium Product / Service. Core re-users resorting to this business model offer to end-users a product or a service in accordance with freemium price logic: one of the offerings is free of charge and entails only basic features, while customers willing to take advantage of refined features or add-ons are charged. In the PSI realm, the implementation of this business model has its roots in limitations deliberately imposed by the core re-user in terms of data access: as a result, ad-hoc payments may be required to enjoy advanced features, to have recourse to additional formats or, sometimes, to weed out advertising. In contrast with the previous model, here the prominent target market is the consumer one (often called “low-end market”), with which the firm establishes medium- or short-term relationships that usually do not involve customization. Target customers are generally reached via the Web or via the mobile channel, which promise to “hit” a considerable installed base (Osella & Ferro, 2013). Figure 6. Freemium Product / Service (“Canvas” view) Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013 #3 Open Source Like. This very peculiar business model takes place on top of products, services, or simple unpackaged data that are provided for free and in an open format. In terms of economics, a cross-subsidization occurs in the enterprise under examination since the costs incurred for free offering of data are covered by revenues stemming from supplementary business lines that are still PSI-based: in fact, trickles of revenue for the core re-users may stem only from added-value services or from license variations (dual licensing). 
The resemblance with Open Source software is given by the fact that in this circumstance data is provided in a totally open format that allows free elaboration, usage and redistribution without any technical barrier (Osella & Ferro, 2013). November 2014 Page 47
  • 48. Figure 7. Open Source Like (“Canvas” view) Source: M.Osella & E.Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013 #4 Infrastructural Razor & Blades. Entering the realm of enablers, this business model is chosen by enterprises acting as intermediaries that facilitate access to PSI resources by profit-oriented developers or scientists not driven by commercial intent. As in the well-known “razor & blades” model, the value proposition hinges on an attractive, inexpensive or free initial offer (“razor”) that encourages continuing future purchases of follow-up items or services (“blades”) that are usually consumables characterized by an inelastic demand curve and high margins. Applying this model in the PSI environment, datasets are stored for free on cloud computing platforms and are accessible by everyone via APIs (“razor”), while re-users are charged only for the computing power that they employ on-demand in as-a-service mode (“blades”). This business model exhibits another case of cross-subsidization whereby profits accrued from the provision of on-demand computing capacity cover costs attributable to the storage and maintenance of data. Finally, it goes without saying that application of this model is limited to contexts and domains in which the computational costs are significant (Osella & Ferro, 2013).
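The cross-subsidization at the heart of the razor and blades model can be written as a one-line margin formula: profit on paid computing hours minus the cost of hosting the free data. The sketch below is a simplified illustration with hypothetical parameter names and figures; it ignores every other cost a real enabler would have.

```python
def platform_margin_cents(compute_hours, price_per_hour_cents,
                          cost_per_hour_cents, data_hosting_cost_cents):
    """Margin for one period: profit on paid computing power ("blades")
    minus the cost of storing and maintaining the free datasets ("razor")."""
    compute_profit = compute_hours * (price_per_hour_cents - cost_per_hour_cents)
    return compute_profit - data_hosting_cost_cents


def is_cross_subsidised(compute_hours, price_per_hour_cents,
                        cost_per_hour_cents, data_hosting_cost_cents):
    """True when compute profits cover the cost of the free data offering."""
    return platform_margin_cents(compute_hours, price_per_hour_cents,
                                 cost_per_hour_cents,
                                 data_hosting_cost_cents) >= 0
```

With hypothetical figures of 0.12 charged and 0.07 spent per computing hour and 40.00 of data hosting cost, 1000 billed hours leave a margin of 10.00, while 500 hours leave a shortfall of 15.00. This is why the model only works in domains where re-users consume significant computing power.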
Figure 8. Infrastructural Razor and Blades (“Canvas” view)
Source: M. Osella & E. Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#5 Demand-Oriented Platform. Following this business model, the enabler acting as intermediary provides developers with easier access to PSI resources that are stored on highly reliable proprietary servers. Once collected, PSI datasets are catalogued using metadata, harmonized in terms of formats and exposed through APIs, making it easier to retrieve data dynamically in a meaningful way. As a result, a wide range of critical issues pertaining to the original raw data are made irrelevant by platforms capable of converting datasets into data streams, contributing significantly to the "commoditization" and "democratization" of data. In addition, developers may reap the benefits of the "one stop shopping" nature of such platforms: they may resort to one supplier and access a variety of information resources through standardized APIs - even beyond the borders of PSI - without having to worry about interfaces connecting to each original source. This “procurement” approach is crucial to minimize search costs and, as a consequence, transaction costs. In terms of pricing, since a good that was born free and open (such as Open Government Data) cannot be charged for in the absence of added value on top of it, enablers adopting this business model earn revenues in exchange for advanced services and refined datasets or data flows. To sum up, re-users are charged according to a freemium pricing model that sets the boundary between free and premium in light of feature limitations (Osella & Ferro, 2013).
Figure 9. Demand-Oriented Platform (“Canvas” view)
Source: M. Osella & E. Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#6 Supply-Oriented Platform. To conclude with enablers, this business model entails the presence of an intermediary business actor that again has an infrastructural role. However, contrary to the previous case, according to this logic PSI holders are charged instead of developers. In fact, the enabler, following the golden rules of two-sided markets, fixes the price according to the degree of positive externality that each side is able to exert on the other. Consequently, this approach is beneficial for both sides of the resulting arena: from the developers’ perspective, barriers are wiped out (i.e., they can retrieve data without incurring any cost), while, from the governmental angle, PSI holders become platform owners taking advantage of handy features such as cloud storage, rapid upload of brand-new datasets by public employees, standardization of formats, tagging with metadata and, above all, automated external exposure of data via APIs and GUI. Public agencies that adhere to such programs in order to dip their toes into the water of Open Data establish long-term relationships with providers and are required to pay a periodic fee that depends on the degree of sophistication of the solutions purchased and on some technical parameters (Osella & Ferro, 2013).
Figure 10. Supply-Oriented Platform (“Canvas” view)
Source: M. Osella & E. Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#7 Free as Branded Advertising. Service advertising is an emerging form of communication aimed at encouraging or persuading an audience towards a brand or a company. In contrast to the more famous “display advertising”, where commercial messages are simply displayed, in service advertising the advertiser strives to win over the customer by providing him or her with services of general usefulness. That said, in the PSI realm, services offered in this way do not generate any direct revenue, but they are supposed to bring a positive return in a broad sense, driving economic results on other business lines - unrelated to PSI - that represent the enterprise’s core business. The rationale fuelling this “enlightened” business model is twofold. Firstly, it may be based on a powerful advertising boost that leads the company to consider the cost as a promotional investment in the marketing mix. Secondly, it is very convenient in the presence of zero marginal costs, a situation that occurs when the costs of distribution and usage are not significant (Osella & Ferro, 2013).
Figure 11. Free as Branded Advertising (“Canvas” view)
Source: M. Osella & E. Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
#8 White-Label Development. Last but not least, if service advertisers do not have in-house the competencies required to develop their business endeavors, they can knock on the door of advertising factories. Such firms come into play as outsourcers carrying out duties that would otherwise be handled by service advertisers. Hence, the outsourced development of PSI-based solutions is particularly compelling for companies willing to use PSI as an "attraction tool" but not equipped with the competencies required to do so (e.g., data retrieval, software development, service maintenance, marketing promotion). In order to let the service advertiser’s brand stand out, solutions are developed in a white-label manner, i.e., shadowing the outsourcer’s brand and giving full visibility to the service advertiser’s brand alone. Taking into account the “one stop shopping supply” and the business-criticality of the solutions in terms of corporate image, the resulting one-to-one relationship between provider and customer is tailor-made and “cemented”. Concerning financials, advertising factories collect lump-sum payments or recurring fees in exchange for the turn-key solutions so developed, depending on whether the crafted solution takes the form of a product or a service: whilst in the former case service advertisers perceive the cost as CAPEX, in the latter the cost assumes an OPEX nature (Osella & Ferro, 2013).
Figure 12. White-Label Development (“Canvas” view)
Source: M. Osella & E. Ferro, Eight Business Model Archetypes for PSI Re-Use, 2013
Case studies
You can find many examples of companies that employ the business models described above here. Below we describe one example of the freemium model.
A variety of web applications use the freemium business model. The free product or service is subsidised through a paid-for product or service that offers some kind of added value on top of what is made available as open data. The free product acts as marketing, establishing the provider in the marketplace and increasing the take-up of the paid-for product (The ODI Guide, How to make a business case for open data).
One way of using a freemium model is to release your open data under a share-alike license. This ensures that organisations that do things with your data have to either openly share their results (which means you can benefit from what they do) or negotiate with you to be able to use the data under a different (potentially charged) license. OpenCorporates uses this business model, licensing their database with a share-alike license while offering paid-for licenses for companies who do not want to share their data.
Another approach to a freemium model is to offer a paid-for product that:
● incorporates additional data, perhaps from third-party sources
● is provided in a different format from the open data
● is more up-to-date, complete or detailed than the open data
● is the result of an analysis or model based on the released open data
● is a dump of data that can otherwise be accessed through an API
Alternatively, you could offer a paid-for service based on the open data you are publishing that:
● provides an API over open data that can otherwise be accessed as a dump
● provides availability guarantees through a Service-Level Agreement
● removes rate limits
Recently the U.S. Government has launched a new section of the open government data catalog, data.gov. The new sub-domain “Impact” profiles companies that are making use of open government data.
References and further reading
● The Open Data Institute, How to make a business case for open data, available online.
● Alex Howard, Open data economy: Eight business models for open data and insight from Deloitte UK, available here.
● Elements of open data startups, presentation available here.
● Enrico Ferro, Emerging Business models in PSI reuse, available here.
● E. Ferro & M. Osella, Business Models for PSI Re-Use: A Multidimensional Framework, 2012, available online.
● E. Ferro & M. Osella, Eight Business Model Archetypes for PSI Re-Use, 2013, available online.
c) Lean methodology
After exploring the eight business models on which PSI reuse relies, we introduce the Lean methodology and its importance for business development. You have already identified the opportunities offered by the reuse of open data by employing the Business Model Canvas and the framework developed by Osella and Ferro, and now you want to start developing your own business.
Lean methodology is a method for developing businesses and products with the goal of finding product-market fit and building a cash-flow-positive, sustainable company before it runs out of money. “Validated learning”, experimentation, testing, measuring actual progress and learning what customers really want are the main pillars of the methodology. The whole process should be completed as quickly and as cheaply as possible.
Pioneers of the Lean Startup movement are Steve Blank (The Startup Owner’s Manual: The Step-by-Step Guide for Building a Great Company, 2012; The Four Steps to the Epiphany, 2006) and Eric Ries (The Lean Startup, 2011). The lean approach aims at being as effective as possible in achieving your final goal.
According to the lean methodology you should follow a build-measure-learn feedback loop:
Ideas > build > product > measure > data > learn > ideas > and so on (circle)
Figure 13. Build-measure-learn feedback loop
Image source: Andrew Walpole, Build - Measure - Learn Feedback Loop infographic, 2013
Here we explain the loop step by step:
1) Idea:
When you work on your idea, keep in mind that the final goal is to provide a benefit to your customer; the rest is just a waste of time. So, first of all, ask yourself:
➢ Can I build a sustainable business around this set of products and services?
What you want to achieve is, in fact, a compromise between your vision and what your customers would accept. Hence, you want to focus on an idea that answers a problem that really needs a solution. You also want to make explicit all the implicit assumptions you are making about how you can create a business from that idea. Please answer the following questions before building your product:
➢ Do consumers recognize that they have the problem you are trying to solve?
➢ If there was a solution, would they buy it?
➢ Would they buy it from you?
➢ Can you build a solution for that problem?
“Success is not delivering a feature; success is learning how to solve the customer’s problem.” (Eric Ries, The Lean Startup, 2011).
2) Build:
Develop a minimum viable product (MVP) in order to start the learning process as soon as possible.
➢ MVP
A minimum viable product is a version of a new product or feature that allows you to test the assumptions you made. When you are building your MVP, remove any feature, process or effort that does not contribute directly to the learning you seek. When you test your MVP, you will learn which elements of your product or strategy are not appropriate.
3) Measure:
Once the MVP is established, measure how your customers respond, building on metrics that can answer cause-and-effect questions. Each metric should point to a clearly defined action to take once it has been analyzed. Examples:
➢ A/B Split-Test Results
➢ Per-customer metrics
➢ Direct customer feedback
4) Learn:
Analyze your product, feedback and metrics to assess your progress in an objective way.
➢ Validated learning
“Validated learning” means running experiments that test each element of your vision and validating them scientifically, based on empirical data collected from real customers.
Throughout the whole process you should use an investigative technique, the so-called "Five Whys": asking yourself simple questions to study and solve problems along the way. When this process of measuring and learning is done and you have made small changes to optimize your product, you should be able to understand whether the drivers of your business model are appropriate or not, and decide whether to pivot or persevere.
Figure 14. Step-by-step description of the feedback loop
Image source: Andrew Walpole, Build - Measure - Learn Feedback Loop infographic, 2013
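The A/B split test listed under the Measure step can be evaluated with a standard two-proportion z-test. The sketch below is a minimal illustration using only the Python standard library; the function name and the visitor and conversion counts are hypothetical, and the choice of this particular test is an assumption made for illustration, not something prescribed by the Lean Startup literature.

```python
import math


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for an A/B split test.

    conv_a, conv_b: number of converting customers in variants A and B
    n_a, n_b:       number of customers exposed to each variant
    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the hypothesis "A and B perform the same".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical MVP experiment: 1000 visitors per variant.
z, p = ab_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference between the variants is unlikely to be due to chance alone, which gives the metric the clearly defined follow-up action that the Measure step asks for: keep the better variant, or keep testing.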
Pivot:
If you decide to pivot, you need to make a big change in direction, or a structural course correction, to test new ideas/hypotheses about the product, strategy and engine of growth, and start the cycle again from the beginning. If your new experiment runs in a more productive way than the experiments you were running before, it is probably a sign that you made a successful pivot.
Persevere:
If you think that your test is going in the right direction, then you should continue to test more assumptions and build towards executing your current vision.
The lean methodology underlines the importance of experimenting in order to learn. Pivoting is just a part of the process - “if you cannot fail, you cannot learn.” (Eric Ries, The Lean Startup, 2011). Until a precise business model is found, it is important to keep your initial vision. This way, adjustments can be made to the model without reassessing the entire market.
Lean approach in open data business development: a case study
Steve Blank tells the story of a startup called Tidepool as a perfect example of the power of customer development, one of the key parts of the Lean methodology. The Tidepool team were severely criticized about their business model. They began by believing they were selling an open data and software platform for people with Type 1 diabetes into a multi-sided market comprised of patients, providers, device makers, app builders and researchers. They first reduced what they thought was a five-sided market to a simpler two-sided one. But the big payoff came when their discussions with medical device customers revealed an entirely new way to think about pricing - potentially tripling their revenue.
Figure 15. Screenshot of Tidepool home page
Image source: http://tidepool.org
Further reading
● Eric Ries, The Lean Startup, available online
● Steve Blank, The Four Steps to the Epiphany, available online
● Steve Blank & Bob Dorf, The Startup Owner’s Manual: The Step-by-Step Guide for Building a Great Company, available online
● Steve Blank, When Customers Make You Smarter, available online
● Andrew Walpole, Build - Measure - Learn Feedback Loop, available online
● The Lean Startup Methodology, available online
Learning resources
● Steve Blank, How to Build a Startup, available online
● Steve Blank, Lean Customer Development - Part 1, available online
● Steve Blank, Lean Customer Development - 3 tool for startups, Part 2, available online
● Steve Blank, Lean Customer Development - Customer Development in action, Part 3 - 3 tool for startups, available online
● Steve Blank, Lean Customer Development - Closing, Part 3, available online
8. Open Data training materials already available. A list
● Useful links that the ODI uses on its 3-day Open Data in Practice course, available here
● Slides used in the business sections of the ODI’s Open Data in Practice course, available here
● The ODI’s stories section: a good place to find examples of real-world impact.
● It's also worth looking at the ODI Start-Ups page for ways entrepreneurs are using open data to build new businesses. You'll find details of each business approach, short pitch videos and, for some of the companies, case studies.
● You can explore all the materials and tutorials released by the School of Data team. You can find interesting guides at http://schoolofdata.org/courses/
9. Slides and inspiring presentations: link-o-graphy
http://www.slideshare.net/MicheleOsella
http://www.slideshare.net/search/slideshow?searchfrom=header&q=open+data+business
http://www.slideshare.net/OReillyStrata
http://www.slideshare.net/TheODINC
http://www.slideshare.net/MGHProfessional/leading-with-data?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=qf1&b=&from_search=1
http://www.slideshare.net/JenvanderMeer/strata-open-data-its-not-just-for-govts2112014?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=default&b=&from_search=15
http://www.slideshare.net/deirdrelee/deirdre-lee-opendata?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=qf1&b=&from_search=8
http://www.slideshare.net/WorldBankGroupFinances/world-bank-gurin?qid=9626d5fe-9a72-4e37-9bcf-579ef5d75c88&v=qf1&b=&from_search=6
http://training.theodi.org/resources/ODP_Business.pdf
http://theodi.github.io/presentations/2013-10-tsb-workshop-tom.html#/cover
http://www.slideshare.net/napo/a-dive-into-open-data
10. Videos, audio files and books
So you want to build an open data business?
https://www.youtube.com/watch?v=jNscjJ5DetM
The value of open data to business - the Open Data 500 Study
http://theodi.org/lunchtime-lectures/friday-lunchtime-lecture-the-value-of-open-data-to-business-the-open-data-500-study
Learning from New York City’s open-data effort
http://www.mckinsey.com/insights/public_sector/learning_from_new_york_citys_open_data_effort
Some useful webinars:
http://www.socrata.com/webinars/
Opening up open data: An interview with Tim O’Reilly
http://www.mckinsey.com/insights/business_technology/opening_up_open_data_an_interview_with_tim_o_reilly
What is Open Data and how can it transform your business?
https://www.youtube.com/watch?v=hXZaf08gjfo
A very interesting list of recommended books is available here:
https://github.com/theodi/training-web/blob/gh-pages/Bibliography/index.md