In my last post I talked a bit about catalogues and how I should write a post in which I test-drive a few. So here we are. I've given five Sheffield-based library catalogues a spin: two from each of the city's two universities, and one from the public library service. Let's take a look at tonight's contestants:
Sheffield Hallam University's Innovative WebPAC Pro "Classic" catalogue:
The out-of-the-box OPAC for the Millennium LMS. This is a simple little piece of kit with a drop-down choice between searches for Keyword, Author, Title, and a few other bits and bobs (plus an advanced search option). But for the challenges we're about to put its way, we shall only be using the default Keyword option.
Sheffield Hallam University's ProQuest/SerialsSolutions Summon "Library Search" catalogue:
After dabbling with Innovative's "Encore" for a year, SHU have recently (last Autumn) opted instead for the Summon federated search catalogue system. While there's an advanced search option, and opportunities to limit searches to particular criteria, the default approach is a Keyword search of everything except newspaper articles and book reviews (which were taken out of the default search just a week before I conducted this test), and it is upon this default that we hurl our challenges.
The University of Sheffield's Talis Prism "Star" catalogue:
Upgraded from a previous Talis system in 2003, Star definitely looks its age. There's a choice of Keyword, Author and Title search, and again we shall be taking the Keyword option for purposes of parity.
The University of Sheffield's ExLibris Primo "StarPlus" catalogue:
They thought it would never happen, but at long last Star has been updated. The new catalogue (in a perhaps perpetual "Beta" stage) came on-line in November 2011 (not long after SHU's Summon). The default search is by Keyword and limited to university collections (including journals and etext subscriptions), with a secondary option for literature cross-searching across multiple databases. There is also an advanced search, but we shall be using the default set-up.
Sheffield Public Libraries' SirsiDynix iBistro library catalogue:
The aesthetics of Star alongside the drop-down functionality of the classic SHU. As with the other contenders, we shall be using the Keyword ("words or phrase") option for our search experiments.
Our contestants met, it's time to start...
Round One: Vital Statistics
"Clicks to search" is the number of clicks it takes to reach the catalogue search box from the institution's front page. "Results per page" is the number of search results per page. "Average loading time" is based on a search for pride and prejudice and is the mean number of seconds calculated from five successive attempts on my home connection of a Sunday afternoon. The results of that search are converted to MHTML to give the page size (in megabytes).
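For anyone wanting to repeat the timing, here is a minimal Python sketch of how the "average loading time" figure could be produced: run the search five times and take the mean. The function name and the `fetch` callable are my own inventions for illustration; in practice `fetch` would be whatever loads the results page, and a stopwatch does the job just as well.

```python
import time
from statistics import mean


def average_loading_time(fetch, attempts=5):
    """Call fetch() `attempts` times and return the mean duration in seconds."""
    timings = []
    for _ in range(attempts):
        start = time.perf_counter()
        fetch()  # hypothetical: load the results page for the search
        timings.append(time.perf_counter() - start)
    return mean(timings)


if __name__ == "__main__":
    # Stand-in for a real page load, so the sketch runs without a network.
    print(average_loading_time(lambda: time.sleep(0.01)))
```

Five attempts is a small sample, so a single slow response from the server will drag the mean up noticeably.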
Both SHU catalogues take quite some finding from the University's homepage, although access is more direct from the student homespace and VLE. The slowest loading time was achieved by the public library, despite having the second smallest page size, but such is the nature of the server. Heaviest on the bandwidth is Summon: its results exceed the 1 MB mark, making for a loading time twice that of its classic companion (the fastest of the five on the day) despite the fact that the classic search packs in twice the number of results per page. Unquestionably, then, SHU Classic is the most efficient in terms of time and bandwidth.
Round Two: Error Correction
Just a quickie, to test the catalogue's error-correction capacity. SHU Classic and Star don't do correction. What you type is what you get, and what I've typed is pride and predujice. The public library offers a list of "closest matches" which includes what we're after (the novel by Jane Austen), so that's a pass. StarPlus correctly deduces that we're after pride and prejudice and offers that as an alternative search, so that's a pass too. SHU Summon, however, thinks we want pride and predjudice which gets us 28 (irrelevant) results. Only then does it offer pride and prejudice.
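I have no idea how any of these systems actually generates its suggestions, but the general idea of a "did you mean" feature can be sketched in a few lines of Python using the standard library's `difflib`. The vocabulary list and the `suggest` function are made up for illustration; a real catalogue would presumably draw on the words in its own index.

```python
import difflib

# Hypothetical index vocabulary; a real catalogue would use its own records.
VOCABULARY = ["pride", "prejudice", "persuasion", "emma", "and"]


def suggest(query, vocabulary=VOCABULARY):
    """Replace each unrecognised word with its closest match in the vocabulary."""
    corrected = []
    for word in query.casefold().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # get_close_matches returns the best candidates above a similarity cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


if __name__ == "__main__":
    print(suggest("pride and predujice"))
```

A word with no sufficiently close match is left as typed, which is roughly the "what you type is what you get" behaviour of the catalogues that don't do correction.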
So having finally got some results for pride and prejudice, how do they look?
Round Three: The Austen Test
Here are three formulations of the title:
pride and prejudice,
pride & prejudice, and
"pride and prejudice". The first is a basic test, the second will challenge the catalogue's ability to cope with ampersands, and the third will demonstrate the effectiveness of quotes.
Star does not do ampersands (not in a Keyword search at any rate). It cries if you try.
In the above table, the second column for each catalogue is the number of matches retrieved, and the first column indicates how far down the list we have to go before we find our first example of the Austen novel. With the exception of the public library, our book appears on the first page of results, and in the case of StarPlus, it is right at the top. Curiously, although it limits the number of results, adding quotes to the search also drops the book lower down the page for both SHU catalogues.
The lower number of returns for StarPlus compared to Star is down to the way that StarPlus groups like titles within a combined record. The higher returns on Summon compared to SHU Classic are on account of the breadth of external material being included within the search.
Round Four: The Proust Test
In this round we test the way the catalogue (and perhaps to a larger extent the cataloguer) handles the questions of translation and the episodic novel. In all seven of the above searches we're trying to find for ourselves a copy of Proust's "Swann's Way" (in either French or English; we're not fussy). The diacritics in the French formulations garner no results from the public library and the classic SHU catalogues, which is unfortunate. In the case of the SHU catalogue, at least, this is due to the diacritics being absent within the catalogue records. The University of Sheffield has superior catalogue records regarding its use of alternate titles, but the search also seems to strip diacriticals from a character, thereby giving the same hits for Du côté de chez Swann as it does for Du cote de chez Swann. On the first three searches, Summon fails to find our book within its first 100 records (an arbitrary cut-off on my part) though does manage to find later volumes. The Star catalogues on the other hand are immensely successful in giving us what we're after, with StarPlus being spot-on every time.
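The diacritic-stripping behaviour that the Star catalogues seem to exhibit is easy enough to imitate. The following Python sketch is my own guess at the general technique, not a description of how Talis or ExLibris do it: it decomposes each accented character into a base letter plus combining marks, then throws the marks away. The function names are invented for illustration.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks, so that 'ô' becomes 'o' and 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalise(text):
    """Normalise a query or record title for accent- and case-blind matching."""
    return strip_diacritics(text).casefold()


if __name__ == "__main__":
    print(normalise("Du côté de chez Swann") == normalise("Du cote de chez Swann"))
```

If the same normalisation is applied both to the query and to the indexed records, it no longer matters whether the searcher or the cataloguer bothered with the accents.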
Round Five: The Godard Test
Here's a similar test, but one which is broadly passed. This time we're looking for a film by Jean-Luc Godard. The only failure is the missing diacritics in the public library, but the grave has made it to the SHU catalogue record this time round. SHU has a problematic lack of overlap, though, unlike Star's results.
Round Six: The Taylor Test
Now we shall see how the catalogues cope with a popular initialism. Ideally, the results for ajp taylor should be a subset of those for alan john percivale taylor, and this appears as if it may be the case for all the catalogues except Summon, where the noise from various articular references to AJP creeps in. Understandably, the a j p taylor returns are comparatively high, but in all cases bar Summon we still get a top rank result (we don't get a work by AJP Taylor till page 2 on Summon).
Round Seven: The Orwell Test
This round is another test of the ability to locate a work given the author's name. We're after any individually credited work by George Orwell (letters excluded). Unsurprisingly for a Keyword search, the search for george orwell gives the same results as the search for orwell, george, and likewise with g orwell and orwell, g. The problem with searching for the author by Keyword is that there are an awful lot of books about the author, with the author's name in the title, and these tend to rise to the top of the rankings. Summon is a particular victim of this, and it takes a set of inverted commas to bring any works by Orwell into the top 100 returns. Conversely, adding the quotes to StarPlus drops our first hit beyond the first page. In fact it is the old version of Star that is the most successful in this round (failures on the initialised formulation notwithstanding), while only SHU Classic gives us a front page hit for every case.
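Why the order of the names makes no difference, and why books about Orwell crowd in, can be shown with a toy Keyword matcher in Python. This is a simplified sketch of keyword matching in general, assuming the query is reduced to an unordered set of words; it is not the workings of any of the five catalogues, and the record title is made up.

```python
import re


def keyword_terms(query):
    """Reduce text to its set of lower-cased words, ignoring order and punctuation."""
    return frozenset(re.findall(r"\w+", query.casefold()))


def matches(record_text, query):
    """True if every query word appears somewhere in the record text."""
    return keyword_terms(query) <= keyword_terms(record_text)


if __name__ == "__main__":
    print(keyword_terms("george orwell") == keyword_terms("orwell, george"))
    # An invented title about the author matches just as well as a work by him.
    print(matches("A made-up study of George Orwell", "orwell, george"))
```

Because the matcher never asks which field a word came from, a title containing the author's name is as good a match as an author entry, which is where an Author search would earn its keep.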
Round Eight: The Plural Test
I want to find a book about search engines. Or should I find a book about search engine stuff? A wildcard would go well here, but is unnecessary in Star which automatically knocks off any terminal "s" and adds those findings to the pool. So it is that a search for foxes in Star brings up a page of works by John Foxe. A similar phenomenon appears to be occurring in SHU Summon, although it seems the results are more relevantly ranked without the plural than with it.
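Star's plural trick, as far as I can infer it from the results, looks something like the Python sketch below. This is an assumption on my part about the behaviour, not Talis's actual code, and the function names are invented: each search word is looked up both as typed and with any final "s" removed.

```python
def strip_terminal_s(term):
    """Knock off a single trailing 's', leaving one-letter terms alone."""
    if len(term) > 1 and term.endswith("s"):
        return term[:-1]
    return term


def expand_term(term):
    """Return the set of forms to search for: the word as typed plus its trimmed form."""
    term = term.casefold()
    return {term, strip_terminal_s(term)}


if __name__ == "__main__":
    print(sorted(expand_term("engines")))
    print(sorted(expand_term("foxes")))
```

The crude rule explains the John Foxe results: trimming "foxes" gives "foxe" rather than "fox", so the pool fills up with the martyrologist instead of the animal.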
Round Nine: The Amey Test
Our contestants have made it to the final round, and the toughest of them all. For this is the specialist round. Their task is to find "The Collapse of the Dale Dyke Dam, 1864" by Geoffrey Amey. It's a book about the Great Sheffield Flood of 1864, and it's a book that all three collections contain. To find it we shall use the simple but likely search terms sheffield flood and sheffield floods. As can be seen from the above table, it was not a task well-met. Again, the problem here is one of cataloguing more than it is of the catalogue itself. Even the one catalogue to give a direct hit (thanks to the MARC 650 entry "Floods|xEngland|xSheffield.") does so in a rather hit and miss way (three of the ten hits for the sheffield flood search are relevant texts, with only one of them appearing in the results for sheffield floods). Star uses its plural trick to give us all the books it can in both cases, but Amey is not listed among them (despite being in the collection). In Summon it's the last book we deign to look at, right at the bottom of page 4, and Sheffield Public Libraries aren't much more forthcoming. But at least it is there in both cases.
Final Results
Summon and StarPlus consistently returned results irrespective of the rigour of our tests, though Summon gave an average return of 58,203 results which is three orders of magnitude out from the rest of the catalogues. Such high returns were not stacked in our favour, however, and the average relevance ranking of 23 is the lowest of the five, though it is at least on the first page of results. Most successful in retrieving our desired items (and also in ranking them prominently) is StarPlus, though Star and SHU Classic don't do badly in these regards either.
The above tests are all simple searches made upon the default search settings of each catalogue. The noisy irrelevance we encounter in Summon could be cut back were we to refine our search, either through the advanced options or via the limiters in the side menu. Unfortunately, the bulkiness of Summon makes it the slowest of the four Higher Education systems here, and we know from Google how important speed is in our searches. Having to make secondary moves, such as limiting our searches to books, makes the search process all the slower. Those behind Summon would point out that I am using a screwdriver to hammer a nail here; that Summon is geared towards the literature searching needs of the modern student, and is more than a book-locating tool. But I do feel that book location is an essential feature of a library catalogue, and that Summon isn't really all that good at it. Again, I throw in the caveat that we've been using Keyword searches where more specific advanced field searching might serve us better, but the same caveat applies across the board. When StarPlus is capable of being near-consistently on the money for us, while also offering a range of federated search options beyond the main holdings, the bulkiness and noisiness of Summon appears unhelpful.
But it isn't all about the catalogue system. As a number of the examples demonstrate, the catalogue record itself is of immense importance.
Charles A. Cutter wrote (in 1876) that the purpose of the catalogue is "To enable a person to find a book of which either (a) the author / (b) the title / (c) the subject is known... To show what the library has... [and] to assist in the choice of a book". The most difficult index to achieve is that of the subject, which is why we have things like AACR2. Attempts to Google-ize the catalogue experience have all-too-often fallen at this hurdle: it's hard to build a foolproof relevance ranking from MARC fields, as some of the above experiments show, and things can get particularly messy if you begin to introduce some full texts into the mix. This is what seems to be happening in Summon: full-text articles are getting more matches for the search-term and are rising to the top of the results at the expense of the terser records. In the long term, when everything in our collection is fully searchable, such Googly games might actually work, but we're not there yet. That's not to say that one couldn't come up with a sensible (and probably quite complicated) algorithm to deal with both types of material, but Summon has not done this, and StarPlus has gone with a less integrated model in an effort to avoid such pitfalls. On the strength of the above tests, it's an approach that seems to have worked: as a catalogue of the university's holdings it is undoubtedly an effective tool. By not attempting to be a jack of all trades, StarPlus succeeds in mastering the core role of the catalogue. Whether it operates as successfully when employed as a wider literature-searching tool remains to be seen. That's one for another day's testing.
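To illustrate the full-text problem, here is a deliberately naive Python sketch of ranking by raw term counts. It is emphatically not Summon's algorithm, which I haven't seen; it only shows how counting matches favours long texts over terse records. Both sample records and the function name are invented.

```python
import re
from collections import Counter


def term_frequency_score(text, query):
    """Score a text by how many times the query's words occur in it."""
    counts = Counter(re.findall(r"\w+", text.casefold()))
    query_terms = set(re.findall(r"\w+", query.casefold()))
    return sum(counts[term] for term in query_terms)


# Both records are invented for illustration.
terse_record = "Pride and prejudice / Jane Austen"
full_text_article = (
    "Pride and prejudice in the workplace: how pride shapes prejudice, "
    "why prejudice persists, and what pride costs."
)

ranked = sorted(
    [terse_record, full_text_article],
    key=lambda text: term_frequency_score(text, "pride and prejudice"),
    reverse=True,
)

if __name__ == "__main__":
    for text in ranked:
        print(term_frequency_score(text, "pride and prejudice"), text)
```

The catalogue record for the novel can only ever mention each word once or twice, so any scheme built on counting alone will bury it beneath wordier material unless it weights fields such as title and author, or allows for document length.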