15. Table 1 from Grigson’s ebook usage study indicates peak-use periods of high demand for portions of the collections
16. Evaluating Ebook Usage Data. Kinman (2008). Putting the Trees Back in the Forest: E-Resource Usage Statistics and Library Assessment. ER&L, March 18-21, 2008, Atlanta, GA. https://smartech.gatech.edu/bitstream/1853/20665/1/forest_trees_kinman.pdf A description of a five-year study on library services and resource usage, including a novel application of Tufte’s sparklines: http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR
19. Thank you! Comments: John McDonald, Libraries, Claremont University Consortium, [email_address]
Editor’s Notes
Good afternoon everyone, or good morning to those of us on the West Coast. Thanks to Peter and Peter for great presentations, which makes my job that much harder. I’d like to start by giving everyone a little background on me: I’m the library director at the Claremont Colleges in California and have spent most of my professional research career trying to understand academic information usage behaviors by examining statistical and quantitative information. I’ve done research on online journal usage by studying OpenURL logs and am currently working on understanding print and ebook usage in an academic setting. Today I’m not presenting research that I’ve done, since I’ve already presented elsewhere on my most recent projects. Instead, I’d like to talk about the importance of doing usage data research and the applicability of research in practice.
As evidenced by the participation here today, there’s currently intense interest in studying usage data. We now have the ability to measure information usage actions, including downloads of articles and citations of articles, and to measure the actions between resources, for example between a database search and the objects used in the result sets. In addition, the average librarian now has the computing power to analyze these datasets, and, thanks to efforts like COUNTER, there are mechanisms for delivering usage data to us and standards for how that information is provided and distributed. And finally, we as librarians are now very interested in our return on investment as we seek to rationalize our decisions and justify our value to our organizations.
I feel that while we’re still at a nascent stage of usage data research, it’s an exciting time because there are endless possibilities for what data is collected and how to craft a research project around specific data sets. We’ve had ISI citation data for a long time, but now we can match that with things like COUNTER reports of journal, database, or ebook usage. Many publishers provide robust and interesting reports that may not be COUNTER compliant but are useful data for analysis. We also have web server logs, proxy logs, and OpenURL logs at our disposal, and even third-party non-library software, like Google Analytics, that we can use to understand our users.
There are basically two types of usage data research that are important for librarians to understand and/or perform: the more theoretical, bibliometric-based studies, and the practitioner-based studies that help us put theory into practice and address specific issues within our institutions. Three theoretical studies I’d like to highlight today include Bollen’s study of journal centrality measures, Rosvall & Bergstrom’s journal maps based on citation data, and Phil Davis’ studies on open access citation rates.
Bollen and his colleagues have been working on some advanced statistical studies of citation and usage. This study tested 39 journal measures of ‘impact’, some very standard, like the ISI Impact Factor, and others self-developed by combing through OpenURL server logs at their institution. The key finding for the practitioner is that usage-based measures do not correlate with citation-based measures, indicating that they capture different kinds of ‘usage’ events.
This figure is from that study, and you’ll see that the usage-based measures are clustered closely together but far away from the citation-based measures, which are not clustered closely even among themselves. This study is also interesting for the statistical test used: Spearman rank-order correlation, chosen to address the non-linear datasets involved.
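To make that test concrete, here’s a minimal sketch of computing a Spearman rank-order correlation between a citation-based and a usage-based measure. The journal scores below are invented for illustration, not data from Bollen’s study, and the tie-free formula is used for simplicity.

```python
def ranks(values):
    # Rank values from 1..n (assumes no ties; ties would need average ranks)
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, i in enumerate(order, start=1):
        r[i] = position
    return r

def spearman_rho(x, y):
    # Tie-free formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    # where d is the difference between the ranks of paired observations.
    # Because it correlates ranks, it captures any monotonic relationship
    # without assuming linearity or normality.
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical per-journal scores (not from the study):
impact_factor = [4.2, 1.1, 2.8, 0.6, 3.5, 1.9]  # citation-based measure
downloads = [120, 900, 340, 1500, 700, 200]     # usage-based measure
print(spearman_rho(impact_factor, downloads))   # strong inverse rank relationship
```

A rho near +1 or -1 indicates a strong monotonic relationship between the two measures; values near 0, as Bollen found between usage and citation measures, suggest they reflect different behaviors.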
Now let’s turn to another study, by Rosvall & Bergstrom, that used citation data to draw maps of scientific relationships. Their study found that most basic science fields have bidirectional citation relationships with each other, but many applied fields cite journals in the basic sciences without the same reciprocal relationship.
Here’s a map from their article illustrating this relationship. The fields of Medicine and Molecular & Cell Biology cite each other heavily and equally, while the applied field of Ecology & Evolution cites Molecular & Cell Biology heavily but is not cited by that field to the same degree. I find this fascinating and would like to see this study repeated with usage data, either at the global or the local level.
Davis studied 11 journals that randomly assigned open access status to articles. He found no evidence that OA articles accumulated more citations than paid articles.
And again, this study is interesting because we could use usage data to repeat it and find out whether citers and readers exhibit different behaviors based on the cost of an article. The linear regression model he developed is also interesting; you can see the complexity of the relationships among the independent variables that were studied.
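As a sketch of what one piece of such a replication might look like, here is a simple least-squares regression of a response (citations, or downloads in a usage-based rerun) on an open-access indicator. The numbers are made up for illustration, and this single-predictor model is far simpler than Davis’ multivariate one.

```python
# Hypothetical article-level data (not Davis' dataset):
oa = [1, 0, 1, 0, 1, 0, 1, 0]        # 1 = open access, 0 = paid
citations = [5, 4, 7, 6, 3, 5, 6, 4]  # response variable

n = len(oa)
mean_x = sum(oa) / n
mean_y = sum(citations) / n

# Ordinary least squares for one predictor:
# slope = covariance(x, y) / variance(x); intercept follows from the means.
# With a binary predictor, the slope is the difference in group means.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(oa, citations)) \
        / sum((x - mean_x) ** 2 for x in oa)
intercept = mean_y - slope * mean_x
print(f"citations = {intercept:.2f} + {slope:.2f} * OA")
```

A slope near zero would echo Davis’ finding of no citation advantage; rerunning the same model with download counts as the response would test whether readers behave differently from citers.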
Now onto the practitioner articles. The following articles all come from the most recent issue of the Journal of Electronic Resources Librarianship that I edited. Betty used Google Analytics to study online tutorials, Grigson studied usage data from different ebook vendors to determine the best business model for their institution, and Kinman used a novel application of Edward Tufte’s sparklines to graphically display usage data.
Betty’s study involved installing Google Analytics to track the usage of his department’s web-based tutorials for library instruction. The study is fascinating both because there isn’t yet a large corpus of research on the use of Google Analytics in the library environment and because of the results he found.
Those results showed that for one tutorial, a significant portion of the hits came from an unintended audience, giving him the opportunity to look into ways to mitigate that in the future. He also found that there were high hit rates