1. Exploiting classical bibliometrics of CSCW: classification, evaluation, limitations, and the odds of semantic analytics
2. Existing mechanisms are inefficient for a single human to classify a large number of publications and transform their data into knowledge patterns.
Augmenting human intelligence through computationally intelligent mechanisms
230+ million knowledge workers in 2012
3. Motivation
• The current impacts of the financial crisis on societal and scientific frameworks have raised the
need to harvest and evaluate vast volumes of data, encompassing:
– Time-consuming, computationally difficult problem-solving tasks.
– Cognitive effort in mining and analyzing scientific corpora with inefficient techniques.
– Socially mediated acquisition and use of knowledge to reduce gaps and accelerate innovation at a global scale. A
challenge lies in how to combine the feedback of different researchers and members of the general public who might disagree.
• Keeping track of advancements, revealed assumptions, disciplinary boundaries, research
fields lacking reexamination, “ghost theories”, and social-technical requirements.
– Up-to-date literature reviews on a given topic are labor-intensive by nature.
– Developing strategies to help cope with major challenges (e.g., emergency response and painless ageing).
– Cooperative work endeavors among researchers have given rise to complex structures.
– Knowledge representations differ from field to field.
5. CSCW-related publication venues
Scientific data production in the field of CSCW has grown steadily for
much of the past quarter century, disseminated through both dedicated and
non-specialized publication venues that reflect its interdisciplinary scope.
Grudin & Poltrock (2012)
6. Research approach
Bibliometrics is applied as a formative instrument to map the structure
and impact of the field of CSCW through content and citation analysis.
Nicolaisen (2010)
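The citation analysis named above can be illustrated with a minimal sketch: given a toy corpus mapping papers to the works they cite, count how often each work is cited and how often pairs of works are co-cited by the same paper. The corpus records and reference keys below are hypothetical, not data from the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each paper maps to the works it cites.
corpus = {
    "paper_A": ["grudin_1994", "schmidt_1992", "ackerman_2000"],
    "paper_B": ["grudin_1994", "ackerman_2000"],
    "paper_C": ["schmidt_1992", "grudin_1994"],
}

def citation_counts(corpus):
    """Count how often each referenced work is cited across the corpus."""
    counts = Counter()
    for refs in corpus.values():
        counts.update(refs)
    return counts

def co_citation_counts(corpus):
    """Count how often two works are cited together by the same paper."""
    pairs = Counter()
    for refs in corpus.values():
        for a, b in combinations(sorted(refs), 2):
            pairs[(a, b)] += 1
    return pairs

print(citation_counts(corpus).most_common(1))  # → [('grudin_1994', 3)]
print(co_citation_counts(corpus)[("grudin_1994", "schmidt_1992")])  # → 2
```

Co-citation counts of this kind are a standard starting point for mapping a field's intellectual structure, which is what the bibliometric approach on this slide aims at.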
12. Application-level classification scheme
Mittleman et al. (2008)
Jointly authored pages
Technologies that provide a shared window within which multiple users may contribute, usually
simultaneously.
- Conversation tools (e.g., instant messaging)
- Shared editors (e.g., Google Docs)
- Group dynamics tools (e.g., idea generation, clarification, evaluation, and consensus-building)
- Polling tools (e.g., BlogPoll)
Streaming technologies
Technologies that provide a continuous feed of dynamic data.
- Desktop/application sharing (e.g., IBM Lotus Sametime, and TeamViewer)
- Audio conferencing
- Video conferencing
Information access tools
Technologies that provide group members with ways to store, share, find, and classify data objects.
- Shared file repositories (e.g., Dropbox)
- Social tagging systems (e.g., Delicious)
- Search engines (e.g., digital libraries)
- Syndication tools (e.g., social networks)
Aggregated systems
Technologies that combine other technologies and tailor them to support a specific kind of task (e.g.,
Microsoft SharePoint).
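The application-level scheme above can be encoded as a simple lookup table, one entry per category, so a tool name resolves to its category. The category keys mirror the slide; the tool strings are the slide's own examples, and the lookup function is an illustrative assumption, not part of Mittleman et al. (2008).

```python
# Sketch of the Mittleman et al. (2008) application-level scheme as a lookup
# table; tool names are the examples given on the slide.
CLASSIFICATION = {
    "jointly_authored_pages": ["instant messaging", "Google Docs", "BlogPoll"],
    "streaming_technologies": ["TeamViewer", "audio conferencing", "video conferencing"],
    "information_access_tools": ["Dropbox", "Delicious", "digital libraries"],
    "aggregated_systems": ["Microsoft SharePoint"],
}

def classify(tool):
    """Return the application-level category of a known tool, or None."""
    for category, tools in CLASSIFICATION.items():
        if tool in tools:
            return category
    return None

print(classify("Dropbox"))  # → information_access_tools
```

A flat table like this is enough for single-label classification; a real scheme would also need rules for aggregated systems, which by definition combine tools from several categories.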
13. Comparison scheme: attributes to evaluate technical functions
Mittleman et al. (2008)
Core functionality(ies)
Identifies primary capability(ies) provided by a tool.
- Text-sharing (e.g., chat)
- Hyperlinks (e.g., Wikipedia)
- Conferencing (e.g., Skype)
Content
Describes the kinds of data structures that may be used in a particular collaboration (e.g., text message,
picture, URL, sound, video, and hypermedia)
Relationships
Associations established by users among contributions (e.g., before/after, bigger/smaller)
Supported actions
Actions that users can take on structures or relations (e.g., add a new item to a blog, and delete text from
a session)
Synchronicity
Expected delay between the time that a user executes an action and the instant at which other users
respond to that action (i.e., synchronous, asynchronous, or synchronous/asynchronous settings)
Identifiability
Degree to which users can determine who executed an action (i.e., full anonymity, subgroup, pseudonym,
or full identification).
14. Comparison scheme: attributes to evaluate technical functions (cont.)
Mittleman et al. (2008)
Access controls
Configuration of users' rights and privileges with respect to entering a session and
executing supported actions.
Session persistence
The degree to which contributions are ephemeral (disappearing as soon as they are
made) or permanent (e.g., an e-mail sent to another person).
Alert mechanisms
Interruptions or notifications used to attract immediate attention (e.g., RSS feed).
Awareness indicators
The means by which a user may know which members have access to a session, the
nature of their roles, and their current status.
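The comparison attributes on slides 13 and 14 can be sketched as a record type, so that two tools are compared attribute by attribute. The field names follow the scheme; the tool profiles and their values below are illustrative assumptions, not ratings from the study.

```python
from dataclasses import dataclass, field

# Sketch: one record per tool, with fields mirroring the comparison scheme.
@dataclass
class ToolProfile:
    name: str
    core_functionality: list
    content: list
    synchronicity: str        # "synchronous", "asynchronous", or "both"
    identifiability: str      # "anonymous", "subgroup", "pseudonym", or "full"
    session_persistence: str  # "ephemeral" or "permanent"
    alert_mechanisms: list = field(default_factory=list)

def differing_attributes(a, b):
    """List the comparison attributes on which two tool profiles differ."""
    return [f for f in ("synchronicity", "identifiability", "session_persistence")
            if getattr(a, f) != getattr(b, f)]

# Hypothetical profiles for two of the slide's example tools.
chat = ToolProfile("chat", ["text-sharing"], ["text"],
                   "synchronous", "pseudonym", "ephemeral")
email = ToolProfile("e-mail", ["text-sharing"], ["text", "hypermedia"],
                    "asynchronous", "full", "permanent")

print(differing_attributes(chat, email))
```

Pairwise attribute diffs of this kind are one way to turn the scheme from a descriptive checklist into a mechanical comparison of technical functions.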
18. Limitations on scientific literature evaluation
• Size and scale of analytical corpus;
• Laborious and time-consuming processes of data seeking, gathering,
cataloguing, and analysis;
• Lack of bibliometric perspectives (e.g., downloads, author’s production,
and affiliation indicators);
• Subjective and inconclusive results at a semantic level for several units;
• Static granularity of classification models;
• Cognitive load (i.e., mental rigor needed to perform tasks);
• Reading level (i.e., complexity of words and sentences);
• Absence of human-centered studies and results concerning cognitive
aspects in meta-knowledge research practices with a massive number
of humans interacting on bibliographic data.
from a human-centered perspective
19. Growing indicators in the human computation
and crowdsourcing literature
Quinn & Bederson (2011)
20. Towards a crowd-enabled model for
bibliographic data analytics
Considering the “tedious and lengthy task” of finding all journal papers,
conference proceedings, posters, tutorials, book chapters, images, and videos,
and analyzing and classifying them manually under different analytical perspectives.
Jacovi et al. (2006)
Human labor with computational intelligence
22. Future directions
• Reformulate results by broadening the sample.
• Correlate different perspectives of analysis (e.g., co-authorship networks,
affiliations, and semantic evidence).
– Text mining, bibliometric tools, and visual analytics.
• Understand collective and social behaviors through larger datasets.
– Human Intelligence Tasks (HITs) such as annotation, classification, and evaluation.
• Design experiments to validate a crowd-powered model for bibliographic
data analytics involving humans working cooperatively and massively.
– A prototype of an observatory for CSCW publications is under development and will serve to
identify requirements and validate a human-centered research ecosystem based on the
aggregation of massive collaboration efforts around bibliographic data.
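One building block of the crowd-powered model above is aggregating redundant HIT answers. A minimal sketch, assuming majority voting over worker labels: the publication IDs, labels, and three-workers-per-item setup below are hypothetical, not part of the prototype described.

```python
from collections import Counter

def majority_label(labels):
    """Return the most frequent label and its agreement ratio (0..1)."""
    winner, votes = Counter(labels).most_common(1)[0]
    return winner, votes / len(labels)

# Hypothetical HIT results: each publication classified by three workers.
hits = {
    "pub_001": ["groupware", "groupware", "awareness"],
    "pub_002": ["awareness", "awareness", "awareness"],
}

for pub, labels in hits.items():
    label, agreement = majority_label(labels)
    print(pub, label, round(agreement, 2))
```

Majority voting is the simplest aggregation rule; low agreement ratios flag items that disagreeing researchers or crowd workers would need to reconcile, which is exactly the combination challenge raised in the motivation slide.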