lib-ir Archive
Date: Tue Apr 11 14:27:02 2006

lib-ir: Fwd: Use of Navigational Tools in a Repository



Since the new version of DSpace (up in Irtest thanks to Corey's
work) includes a subject browse and some form of authority control,
this study seems particularly timely.

Up to this point, we haven't had any easy way of checking the keywords
(mapped to Dublin Core "subject") that are assigned to items in
Scholars' Bank. And, since the model of the archive was originally
for author self-submission, I haven't worried about it. Given also
that textual documents are full-text searchable (in most cases,
when the Media Filter software works as expected), I have more and
more often opted not to supply keywords to items in the IR. And,
when we have supplied keywords, the source vocabularies have varied
- if there was any source vocabulary at all.

Read the following and see what you think. And if you want to take
a look at the Subject Browse in test, go to:
https://irtest.uoregon.edu/dspace/

Carol


Date:         Thu, 9 Mar 2006 00:37:44 +0000
From: Leslie Carr <lac@ECS.SOTON.AC.UK>
Subject: Use of Navigational Tools in a Repository
To: JISC-REPOSITORIES@JISCMAIL.AC.UK

A recent discussion between some colleagues on the utility (or
otherwise) of subject classification in repositories prompted me to
undertake a brief investigation whose results I present here. (I'll
also send this to AMSCI, so apologies for any duplicate copies that
you see.) The discussion has broadly been between computer scientists
and librarians over whether subject classification schemes offer
advantages over Google-style text retrieval; the study below looks at
the evidence as demonstrated in the usage of one particular
repository. As such it doesn't address the intrinsic value of
classification, but it does offer some insight into the effectiveness
of navigational tools (including subject classification) in the
context of a repository.

----------------
The University of Southampton Institutional Repository has been in
operation for a number of years and has been an official (rather than
experimental or pilot) part of the University's infrastructure for just
over a year. As part of its capabilities, it includes lists of most
recently deposited material, various kinds of searches, a subject tree
based on the upper levels of the Library of Congress Classification
scheme, an organisational tree listing the various Faculties, Schools
and Research Groups in the University, and a list of articles broken
down by year of publication. These all provide what we hope are useful
facilities for helping researchers find papers (ie by time, subject,
affiliation or content).

Over a period of some 29.5 hours from 0400 GMT on March 7th 2006,
1978 "abstract" pages (ie eprints records) were downloaded from the
repository (ignoring all crawlers, bots and spiders).

Of the 1978 downloaded pages, the following URL sources (referrers,
in web log speak) were responsible:
   439  - (direct URL, perhaps cut and paste into a browser or
          clicked on from an email client)
   225  EPRINTS SOTON pages
    25  OTHER SOTON WEB pages
  1264  EXTERNAL SEARCH ENGINES
    21  EXTERNAL WEB PAGES

ie the local repository facilities, including subject views and
searches, led to only 225/1978 = 11% of all downloads.
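
A breakdown like the one above can be produced by sorting each request's
referrer into a category. The following is only a minimal sketch of that
idea, not the script used for the study: the host name, the institution
suffix and the list of search engine markers are all assumptions made
for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed values for illustration only; none of these come from the study.
REPOSITORY_HOST = "eprints.soton.ac.uk"
INSTITUTION_SUFFIX = ".soton.ac.uk"
SEARCH_ENGINE_MARKERS = ("google.", "yahoo.", "msn.")


def classify_referrer(referrer: str) -> str:
    """Map a raw Referer value onto one of five source categories."""
    # Web logs conventionally record a missing referrer as "-".
    if referrer in ("", "-"):
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host == REPOSITORY_HOST:
        return "repository"
    if host.endswith(INSTITUTION_SUFFIX):
        return "other_institution"
    if any(marker in host for marker in SEARCH_ENGINE_MARKERS):
        return "external_search"
    return "external_web"


def tally(referrers):
    """Count how many requests fall into each category."""
    return Counter(classify_referrer(r) for r in referrers)
```

Checking the repository host before the institution suffix matters here,
because the repository itself sits inside the institution's domain.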





From that we can tell that the repository navigation and search
facilities affect little of the ultimate repository usage. (This may
be a depressing message for a repository administrator such as
myself, because it highlights how little control I have over my
repository's users either to help or manipulate them!)

Of the 225 local repository links, the following breakdown applies:
   13  Latest Deposits page
  103  Searches (both simple and advanced)
   57  Browse by Schools and Groups Hierarchy
   17  Browse by Subjects Hierarchy
    0  Browse by Year of Publication
   33  Directly linked from other abstracts (or reloads)
   12  Misc infrastructure

ie 11% of the downloaded records are accounted for by use of the
local repository. 8% of that usage is caused by the subjects tree (ie
0.86% of all eprint downloads are caused by the subject tree). For
what it's worth, a breakdown of papers by school and research group
is three times more popular than the subjects list, but it is still
only involved in 3% of the downloads. Local search accounts for 5%,
but it still isn't very significant! The result is even more gloomy
for the breakdown by "Year of Publication", which didn't lead to any
eprint downloads whatsoever!
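
For anyone wanting to check the arithmetic, the percentages quoted above
follow from the counts already given, using the stated totals of 1978
downloads and 225 local referrals as denominators. The variable and
function names below are just illustrative.

```python
TOTAL_DOWNLOADS = 1978  # all abstract-page downloads in the sample
LOCAL_REFERRALS = 225   # downloads referred by the repository's own pages

# Counts taken from the breakdown of local repository links.
local_sources = {
    "Searches": 103,
    "Browse by Schools and Groups": 57,
    "Browse by Subjects": 17,
    "Browse by Year of Publication": 0,
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


print(f"Local share of all downloads: "
      f"{percent(LOCAL_REFERRALS, TOTAL_DOWNLOADS):.1f}%")
for name, count in local_sources.items():
    print(f"{name}: {percent(count, LOCAL_REFERRALS):.1f}% of local referrals, "
          f"{percent(count, TOTAL_DOWNLOADS):.2f}% of all downloads")
```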





The majority of repository use, if I can equate eprint downloads with
repository use, is due to external web search engines (64%).

This may be due to the fact that of the 1978 downloads, only 131 (or
7%) came from Southampton University IP addresses. In other words,
behaviour of external traffic dominates the repository usage.
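
Separating local from external traffic comes down to testing each
client address against the institution's address ranges. A rough sketch
of that test follows; the network shown is a documentation placeholder
and not the University's real address range, and the function names are
hypothetical.

```python
import ipaddress

# Placeholder range reserved for documentation; substitute the real
# campus networks when applying this to actual logs.
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_campus_address(address: str) -> bool:
    """Return True if the client address falls inside a campus network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in CAMPUS_NETWORKS)


def campus_share(addresses) -> float:
    """Percentage of requests that came from campus addresses."""
    addresses = list(addresses)
    if not addresses:
        return 0.0
    local = sum(1 for a in addresses if is_campus_address(a))
    return 100.0 * local / len(addresses)
```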

If you look only at the local users from the above data (the
downloads that came from Southampton IP addresses), then the
breakdown is as follows.
   39  (direct URL, perhaps cut and paste into a browser or clicked on
       from an email client)
    1  Directly linked from other abstracts (or reloads)
   10  Latest Deposits page
   71  Local Repository Searches
    1  Browse by Schools and Groups Hierarchy
   10  External Search Engines

These numbers are quite low, and a longer period is needed to be
confident, but it appears that local repository searches are much
more popular than external search engines among local users. The
browse views by year, subject and school, however, are all largely
ignored.

Taking a different approach and looking at all of the page requests
for the repository that came from University of Southampton users
(not just eprint downloads but also the home page and all search
requests and browsing pages, ignoring icons, stylesheets and
javascript), in the same period there were 1025 requests coming from
52 uniquely identifiable users.
   72  Home Page
   52  Latest Deposits
  122  Search
    2  List of Browse Choices
   25  Browse by Group
    6  Browse by Subjects
    2  Browse by Year
  132  Download Eprint Records (abstracts page)
   26  Download EPrints Files (full texts)
  544  User Login, Deposit and Admin
   14  OAI-PMH

Once again we can see that local search overwhelms the use of local
browse categories (whether by subject, group or year).

Conclusions
==========
External users dominate repository usage.
External search engines (including OAI search engines) are the
primary mechanism for finding papers.
Local users show a somewhat greater tendency to use local search
facilities.
Neither external nor local users appear much influenced by subject
listings or other browse categories.

This study seems fairly conclusive but its results may not be
typical. Further study is being undertaken to compare these results
with other types of repository and to determine the repository
features (if indeed there are any) that can best help readers in the
task of finding relevant material (resource discovery).
---
Les Carr


Attachment: pastedGraphic.tif
Description: TIFF image

Attachment: pastedGraphic1.tif
Description: TIFF image