This presentation will be a live exchange of ideas and arguments between a representative of a start-up working on agricultural information management and discovery and a representative of academia who has recently completed his PhD and is now leading a young and promising research team.
The two presenters will focus on the case of a recommendation service that is going to be part of a web portal for organic agriculture researchers and educators (called Organic.Edunet), which will help users find relevant educational material and bibliography. They are currently developing it as part of an EU-funded initiative, but both would like to find a way to sustain this work: the start-up by including it in the bundle of services that it offers to the users of its information discovery packages, and the research team by attracting more funding to further explore recommendation technologies.
The start-up representative will describe his never-ending, helpless and aimless efforts to include a research activity on recommender systems within the R&D strategy of the company, for the sake of the good old PhD times, and will explain why this failed.
The academia representative will describe the great things that his research can do to boost the performance of recommendation services in such portals, and why this does not yet work operationally: he cannot find real usage data that can prove his amazing algorithm beyond what can be shown in offline lab experiments using datasets from other domains (like MovieLens and CiteULike).
Both will explain how they started working together to design, experimentally test, and deploy the Organic.Edunet recommendation service, and will describe their expectations from this academic-industry collaboration. Then they will reflect on the challenges they see in such partnerships and how (or whether) they plan to overcome them.
Je t’aime… moi non plus: reporting on the opportunities, expectations and challenges of a real academic-industrial collaboration
1. Je t'aime... moi non plus
Nikos Manouselis, Agro-Know Technologies (Greece)
Christoph Trattner, Know-Center (Austria)
2. Nikos Manouselis, Agro-Know Technologies (Greece)
Christoph Trattner, Know-Center (Austria)
reporting on the opportunities, expectations and challenges of a real academic-industrial collaboration …blah blah
6. [Diagram: Agricultural Data Platform, knowledge aggregation and sharing solutions]
• Inputs: unorganized content in local and remote sites; organized and structured content in local and remote DBs
• Data types: educational, geographical, bibliographic
• Platform components: Harvesting, Ingestion, Enrichment, Translate, Publishing (labelled from Cultivation to Blossom)
• Services: widgets, authoring services, data discovery services, analytics services
• Aggregate data from diverse sources
• Work with different types of data
• Prepare data for meaningful services
7. Christoph
Born: 24.2.1980, Graz, Austria
•Research Field: Social Computing (my research centers on Social Network Analysis, Social (Semantic) System design and Social information access)
•Education: Ph.D. („Dr. techn.“) in Computer Science and an MSc & BSc in Telematics from Graz University of Technology
•Publications (since 2009): 5 journal articles, 24 conference papers, 2 book chapters, in venues such as WWW, HT, ICWE, WikiSym, SocialCom, ASONAM, etc.
•Currently, I am working as Head of the Social Semantic Research Group and Deputy Division Manager at the Know-Center in Graz, Austria
Contact:
Email: trattner.christoph@gmai.com
Web: http://christophtrattner.info
Twitter: @ctrattner
8. Christoph’s team
• 1 Post Doc, 5 Pre Docs (1 more will join in Sept.)
• 2 MSc students
• 1 BSc student
DI. Dieter Theiler
DI. Dominik Kowald
Mag. Peter Kraker
Mag. Sebastian Dennerlein
Dr. Elisabeth Lex
Mag. Matthias Rella
10. Organic.Edunet
• outcome of EU project “Organic.Edunet” of
eContent+ programme (2007-2010)
• based on network of >10 content providers
• portal maintained & updated by Agro-Know
and an academic partner (anAP)
• evolved through EU project “Organic.Lingua”
(2011-2014) in collaboration with K-C and
anAP
12. Organic.Edunet Recommendation
• social navigation module exposed through API
– content-based recommendation using tags on
resources
– user-based collaborative filtering using multi-
criteria ratings
• recommendation of relevant resources within
user’s profile
– well-hidden, never used
– module API developed & supported by Agro-Know
– UI & features developed & supported by anAP
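As an illustration of the two techniques named on this slide, here is a minimal sketch, not the actual Organic.Edunet module. All data, criterion names and function names are hypothetical. It assumes that a multi-criteria rating is collapsed into one overall score by a plain mean, that users are compared with cosine similarity over co-rated resources, and that tagged resources are compared with Jaccard overlap.

```python
from math import sqrt

# Hypothetical multi-criteria ratings: user -> resource -> {criterion: score 1-5}.
ratings = {
    "alice": {"r1": {"quality": 5, "relevance": 4},
              "r2": {"quality": 2, "relevance": 1}},
    "bob":   {"r1": {"quality": 4, "relevance": 5},
              "r2": {"quality": 1, "relevance": 2},
              "r3": {"quality": 5, "relevance": 5}},
    "carol": {"r1": {"quality": 1, "relevance": 2},
              "r2": {"quality": 5, "relevance": 4},
              "r3": {"quality": 2, "relevance": 1}},
}

# Hypothetical tags attached to each resource.
tags = {
    "r1": {"soil", "compost"},
    "r2": {"soil", "pests"},
    "r3": {"soil", "compost", "organic"},
}


def overall(criteria):
    """Collapse one multi-criteria rating into a single score (plain mean)."""
    return sum(criteria.values()) / len(criteria)


def user_similarity(u, v):
    """Cosine similarity between two users over the resources both rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    a = [overall(ratings[u][i]) for i in common]
    b = [overall(ratings[v][i]) for i in common]
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict(user, item):
    """User-based CF: similarity-weighted mean of other users' overall scores."""
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = user_similarity(user, other)
        num += w * overall(ratings[other][item])
        den += w
    return num / den if den else None


def tag_similarity(a, b):
    """Content-based: Jaccard overlap between the tag sets of two resources."""
    union = tags[a] | tags[b]
    return len(tags[a] & tags[b]) / len(union) if union else 0.0
```

With this toy data, `predict("alice", "r3")` leans towards bob's score because bob's ratings agree with alice's more closely than carol's do, and `tag_similarity("r1", "r3")` is higher than `tag_similarity("r1", "r2")`.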
13. desired: Organic.Edunet “Suggest”
• a real content discovery service suggesting
resources to users
– interactions used as input to train system
– personalised vs. non-personalised version
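The difference between the two versions can be sketched as follows. This is an assumption-laden toy, not the planned service: the interaction log, the function names and the co-occurrence scoring are all hypothetical. The non-personalised version ranks resources by how often anyone interacted with them. The personalised version scores unseen resources by how much their users overlap with the current user, and falls back to the popular list for users with no history.

```python
from collections import Counter

# Hypothetical interaction log: (user, resource) pairs, e.g. views or bookmarks.
log = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "c"),
    ("u3", "b"), ("u3", "c"), ("u3", "d"),
    ("u4", "c"),
]


def by_user(log):
    """Group the log into user -> set of resources interacted with."""
    seen = {}
    for user, item in log:
        seen.setdefault(user, set()).add(item)
    return seen


def ranked(scores, k):
    """Top-k keys by score, ties broken alphabetically for determinism."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ordered[:k]]


def popular(log, k):
    """Non-personalised: the k resources with the most interactions overall."""
    return ranked(Counter(item for _, item in log), k)


def personalised(log, user, k):
    """Personalised: unseen resources weighted by overlap with similar users."""
    seen = by_user(log)
    mine = seen.get(user, set())
    if not mine:
        # Cold start: no interactions yet, so use the non-personalised list.
        return popular(log, k)
    scores = Counter()
    for other, items in seen.items():
        if other == user:
            continue
        overlap = len(mine & items)
        if overlap == 0:
            continue
        for item in items - mine:
            scores[item] += overlap
    return ranked(scores, k)
```

For example, `personalised(log, "u1", 2)` suggests only resources that u1 has not touched yet, while a user absent from the log gets the same list as `popular(log, 2)`.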
15. Agro-Know’s perspective
• a service that can become a plug-and-play product
– working on top of recommendation API
– reusable in all agDiscovery services (sites, portals, apps)
• a service that works, well
– tested performance, correct parameters for algorithms
in each context
– tested & adaptable UI, to be reused in several
deployments
• a service bundle that we can sell to our clients
16. Nikos’ perspective
• experiment with multi-criteria recommendation
– continue work that started in PhD
– visualisation & UI challenges
– find someone to try-that-interesting-idea
• take advantage of large user base & lots of data
– Organic.Edunet dataset: ratings & tags already
collected
– expand to federated data sources of social data
• keep publishing, but without having to keep running research
experiments himself
17. Christoph’s & KC’s perspective
• Why is this cooperation valuable for us/me?
– Typically, it is not easy to get access to real user data
• Test algorithms not only “offline” but also online
– Currently, we are just playing around with offline experiments
• Test interfaces not only in lab studies
– Currently, we are evaluating our interfaces just with expert
interviews or with lab studies
• Work towards a second doctoral thesis in the context of
recommending “things” (people, resources, annotations) in social
semantic networks
19. bringing it all together
• major activities to take place in next 9
months
–offline experiments using existing dataset &
exploring various algorithmic options
[summer’13]
–online experiments exploring various
service options [autumn’13]
–final service deployment [winter’13-’14]
20. evaluation experiments (1/2)
• evaluating algorithms
–offline experiment running different
algorithms over offline data that have user
preferences
–online experiment with single interface with
back end recommendation engine
interchanging between algorithm variations
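The two experiment types can be sketched as below, under stated assumptions. The metric (precision@k), the function names and the hash-based assignment are illustrative choices, not the presenters' confirmed setup. The offline part scores each algorithm against held-out user preferences. The online part deterministically assigns each user to one algorithm variation, so a single interface can switch the back-end engine per user.

```python
import hashlib


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items found in the held-out set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def evaluate_offline(algorithms, held_out, k):
    """Mean precision@k per algorithm.

    algorithms: name -> function(user) returning a ranked list of items
    held_out:   user -> set of items the user is known to prefer
    """
    results = {}
    for name, recommend in algorithms.items():
        scores = [precision_at_k(recommend(user), relevant, k)
                  for user, relevant in held_out.items()]
        results[name] = sum(scores) / len(scores) if scores else 0.0
    return results


def assign_variant(user_id, variants):
    """Online experiment: stable per-user choice of algorithm variation."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Hashing the user id, rather than picking a variation at random per request, keeps each user on the same variation across sessions, which makes the per-variation usage logs comparable.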
21. evaluation experiments (2/2)
• evaluating different visualisations
– simple suggested list of resources
– simple tag-cloud based faceted browsing
– cluster-based bubble interface for browsing based
on themes
• evaluating data availability/coverage
– one interface with selected algorithm with
backend selector that will interchange item
catalogue dataset
22. research outcomes
• conference publications to make K-C
happy
– ACM RecSys’14
– ACM HT’14
• journal publication to make all happy
– ACM TIST Special Issue on Recommender System
Benchmarking
24. Nikos’ perspective
• productizing & selling
– bundle of services together with K-C or Agro-
Know’s product?
– business & costing model?
• time
– research mentoring is a luxury for a start-up CEO
– should eventually lead to an added-value product
– creates bias in product development process
(what if this idea should simply die?)
• trust: what if they are yet-another-anAP?
25. Christoph’s perspective
• Time: Tight timeline
– according to Giannis (our project coordinator)
services should be done by Sept.
– Not much room for failure
27. Christoph’s perspective
• Data: Sparse data...
– Although the portal attracts a lot of people every
day (several thousand), we currently do not
have the data we need to do “real” cool
personalized recommender stuff...
29. Christoph’s perspective
• Multilinguality:
– Currently the portal provides documents in 42
different languages... how do we handle that?
– Well, lucky us, most articles are in English,
so we might handle this by providing our
services just to users of the English-language content?
• Speed: Although our recommendations are
pretty fast (almost real-time), how do we
handle network delays? Maybe it is better to
set up a virtual machine?
30. Christoph’s perspective
• Scalability: What happens if the portal really
takes off? Currently, we have almost everything
in memory
– OK, we have a big server with 256GB of RAM
...and we are using Apache Mahout for some
algorithms (e.g. CF), but how about the other
“cool” algorithms we have developed and that we
want to test?
31. Christoph’s perspective
• Sustainability & Trust: Currently, we are pretty
fine with Nikos, and he likes our ideas, but
what if we want to test new stuff?
– Will he allow us to change our services?
– Or even worse, he does not allow us to change
anything!
From data cultivation to data blossom, the Agricultural Data Platform is an end-to-end modular solution that can transform data into meaningful services. The agricultural data are harvested from diverse sources and, after their enrichment, are published through a set of web services to external systems. The enrichment of data includes: improvement of data descriptions; annotation of data with ontologies; and translation of data descriptions. The enrichment of the data allows the development of high-quality services for specific agricultural communities. Publishing is responsible for exposing agricultural data in a form that can be used for a) the development of data discovery services, b) authoring services, and c) analytics dashboards to track and study how the agricultural data are used.