Open Access Harvesting: Service Providers Review
My task this week is to review the lists of service providers at http://www.openarchives.org/service/listproviders.html and http://gita.grainger.uiuc.edu/registry/services/ and to examine three service providers. I am primarily interested in which features could be used for a good “federated collection.” Two of the service providers I reviewed, OAIster and BASE, are very large, while the third is a subject-based service provider in the Library and Information Science field.
OAIster’s website states that OAIster “includes more than 30 million records representing digital resources from more than 1,500 contributors.” The records are harvested from collections worldwide, and OAIster is accessible through WorldCat. A list of contributing repositories is not available on the OAIster website, and in many cases the links for the items listed go to the individual repository or collection website.
A keyword search for metadata harvesting brought up 1558 records. The search results can be refined by format (including archival material), author, year, language and topic. The main advanced search page also limits searches by format and author as well as a number of other items, such as the unusual audience term, which can be juvenile or non-juvenile. Some items that I clicked on had dead links, but this was not the norm. The user can also add reviews and tags for an individual item.
DLIST http://arizona.openrepository.com/arizona/handle/10150/105067 The Digital Library of Information Science and Technology (DLIST), according to its website, “is a cross-institutional, subject-based, open access digital archive for the Information Sciences, including Archives and Records Management, Library and Information Science, Information Systems, Digital Curation, Museum Informatics, records management and other critical information infrastructures.” It is, however, currently closed to new submissions.
DLIST is part of the University of Arizona Campus Repository and is among a number of different collections in a DSpace repository. It has a comprehensive Dublin Core metadata record. See an individual record at http://arizona.openrepository.com/arizona/handle/10150/105232
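To give a sense of what a harvester actually receives from a Dublin Core record, here is a minimal sketch in Python. The sample record is hypothetical (the values are made up for illustration and not copied from DLIST); only the two XML namespaces are the standard oai_dc and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical oai_dc record; values are illustrative, not taken from DLIST.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example paper on metadata harvesting</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:date>2005-12-01</dc:date>
  <dc:identifier>http://example.org/handle/123/456</dc:identifier>
</oai_dc:dc>"""

# ElementTree exposes namespaced tags as "{namespace-uri}localname".
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def parse_dc(xml_text):
    """Collect Dublin Core elements into a dict of lists (elements may repeat)."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        if child.tag.startswith(DC_NS):
            name = child.tag[len(DC_NS):]
            record.setdefault(name, []).append((child.text or "").strip())
    return record


record = parse_dc(SAMPLE)
print(record["title"], record["creator"])
```

Every field is kept as a list because Dublin Core allows any element to repeat, which is one reason different providers end up displaying the same record differently.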
Advanced search is not as comprehensive as the BASE advanced search, and the item record does not identify the individual repository. A number of filters are available, such as Date Issued and Journal. A search for metadata harvesters using the Description filter returned 375 records, which can be sorted by title, issue date, submit date and relevance.
No list of repository providers is currently available on the website, but an earlier version of the repository did contain such a list (see http://www.dlib.org/dlib/december05/coleman/12coleman.html#9 ). The bundling of the DLIST repository with the other Arizona repositories is confusing and may be a deterrent to the user. Every item can be exported and shared via services such as Facebook and Twitter.
One of the most innovative service providers is probably the BASE search engine. According to a Wikipedia article, it is based on search technology developed by Fast Search and Transfer (FAST), a Norwegian company. See http://en.wikipedia.org/wiki/Bielefeld_Academic_Search_Engine
BASE is one of the “world’s most voluminous search engines especially for academic open access web resources” and is operated by Bielefeld University Library in Bielefeld, Germany. It provides more than 50 million documents and as such is larger than OAIster. See http://www.base-search.net/about/en/
According to the Registered Services Provider website, “BASE integrates scientific OAI-resources as one information type among others into the local digital library environment, together with catalogues, article databases, digitized collections. The search interface features many characteristics of internet search engines, thus offers a new type of search interface for a local digital library. BASE uses the search technology of FAST Search & Transfer. To learn more about the project see
Documents can be browsed using the Dewey Decimal Classification number and the document type. Document types include videos, audio, and software. It has a number of services for users including a website for mobile devices. See http://www.base-search.net/about/en/about_develop.php?menu=2
The Advanced Search limits results by country or region, such as Europe or North America, and by publication date. Full-text searches of documents are also available. Search statistics are available: my search for “plasma” in the Subject field resulted in 54,022 hits over 52 million documents in 0.62 seconds. Only 5403 documents were returned for the subject search. The name of the collection where the paper or other document is stored is clearly visible at the bottom of the search result.
Features available to users include checking in Google Scholar, adding to Favorites, correcting the Dewey Classification number, and emailing and exporting records. Services are also provided for database and repository managers, such as the integration of the BASE interface into their own local system. BASE has an excellent browsing tool that uses the Dewey Decimal Classification and would be very useful in narrowing the number of records for each subject, for instance in Physics.
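The narrowing effect of browsing by Dewey number can be sketched in a few lines of Python. This is only an illustration of the idea, not how BASE implements it: the records, their `ddc` values and the `browse_by_ddc` helper are all hypothetical.

```python
# Hypothetical harvested records, each carrying a Dewey Decimal number.
records = [
    {"title": "Plasma confinement in tokamaks", "ddc": "530.44"},
    {"title": "Cataloguing rules for archives", "ddc": "025.3"},
    {"title": "Introductory quantum mechanics", "ddc": "530.12"},
    {"title": "Organic synthesis methods", "ddc": "547.2"},
]


def browse_by_ddc(items, prefix):
    """Keep only records whose Dewey number starts with the given prefix."""
    return [r for r in items if r["ddc"].startswith(prefix)]


# The prefix "53" covers the 530-539 range, where Dewey places Physics.
physics = browse_by_ddc(records, "53")
for r in physics:
    print(r["ddc"], r["title"])
```

A longer prefix narrows the set further, which is what makes a hierarchical classification useful for browsing a very large harvested collection.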
OAIster and BASE are very large service providers but have different search interfaces. The interfaces use many similar terms, but I fail to see how limiting a search to an audience that is juvenile or non-juvenile is helpful to researchers, faculty and students in particular. BASE’s Browse tool is excellent, as is its search interface. It is also fast and has a number of features such as Save to Favorites. BASE is also listed as a top ten search website for researchers on JISC’s website. DLIST’s search features are not as extensive, and the user could easily be confused into searching the other repositories in the University of Arizona system instead of DLIST.
Indrani and Thulasi (2009) provide a checklist of search features for service providers that is similar to the advanced search features in both BASE and OAIster, but they also point out that, because each archive follows “their own rules in rendering information related to various metadata fields, users face difficulty in performing efficient search and retrieval from individual Service Providers.” They conclude that “Standardization in rendering information for all metadata elements is also very essential.”
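As a small illustration of what standardization in rendering could mean in practice, the following Python sketch normalizes date values that different archives might render in different ways. The list of input formats and the `normalize_date` helper are my own assumptions for the example, not something taken from Indrani and Thulasi's paper or from any of the three service providers. Month names are parsed with `%B`, which assumes an English locale.

```python
from datetime import datetime

# (input format, output format) pairs, tried in order. The output keeps
# only the precision that the source value actually had.
DATE_FORMATS = [
    ("%Y-%m-%d", "%Y-%m-%d"),
    ("%d %B %Y", "%Y-%m-%d"),
    ("%Y-%m", "%Y-%m"),
    ("%B %Y", "%Y-%m"),
    ("%Y", "%Y"),
]


def normalize_date(raw):
    """Return an ISO-style date string, or None if the value is not recognized."""
    text = raw.strip()
    for in_fmt, out_fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, in_fmt)
        except ValueError:
            continue  # this format did not match; try the next one
        return parsed.strftime(out_fmt)
    return None


for value in ["2007-02-21", "21 February 2007", "February 2007", "2007", "n.d."]:
    print(repr(value), "->", normalize_date(value))
```

Unrecognized values come back as None rather than being guessed at, so a service provider could flag them for review instead of silently mis-sorting records.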
CORE Repository Blog http://core-project.kmi.open.ac.uk/blog
Top Ten Resources for Researchers
Indrani, V., & Thulasi, K. (n.d.). A comparative study of the search and retrieval features of OAI harvesting services. (International Conference on Semantic Web and Digital Libraries (ICSD-2007). ARD Prasad & Devika P. Madalli (Eds.): ICSD-2007.) DRTC. Retrieved from http://core.kmi.open.ac.uk/display/11874812