IRLS 675 Unit 9: Subject Listings, Keywords, Tags, Categories and Facets.

My collection consists of Physics and Astronomy papers and research data that students and faculty in the Physics department at my college will use in their work. They will both upload and download files. Comparing how Drupal, DSpace and EPrints handle subject listings, keywords, tags, categories and facets will help me design a repository that best suits my users’ needs.

Subject Headings
It is difficult to know how experts with doctorates in Physics or Astronomy will use keywords for their own or other researchers’ work. For this reason, I decided to combine tagging by the users of my collection with some broad subject headings that I choose from the material submitted. As I have a background in Physics, I understand many of the terms used and their significance. However, the repositories I am working with (Drupal, DSpace and EPrints) offer different options for implementing controlled vocabularies, which could be used as subject headings.

Too Broad?
Keywords or subject headings that are too broad can return too many results. For instance, seven of my ten digital items can be found using the subject term Physics in EPrints. However, I have also used uncontrolled keywords in EPrints, which give a narrower set of results: the keyword “plasma” returns only two items. Uncontrolled keywords can also be submitted in Drupal and DSpace.
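
The difference between a broad subject heading and a narrow keyword can be sketched in a few lines of Python. This is only an illustration, not how EPrints, Drupal or DSpace search internally; the records, field names and the functions `by_subject` and `by_keyword` are all made up for the example.

```python
# Hypothetical records: titles, subjects and keywords are invented for illustration.
ITEMS = [
    {"title": "Item A", "subject": "Physics", "keywords": ["plasma", "fusion"]},
    {"title": "Item B", "subject": "Physics", "keywords": ["optics"]},
    {"title": "Item C", "subject": "Physics", "keywords": ["plasma"]},
    {"title": "Item D", "subject": "Astronomy", "keywords": ["exoplanets"]},
]


def by_subject(items, subject):
    """Return every item filed under one broad, controlled subject heading."""
    return [item for item in items if item["subject"] == subject]


def by_keyword(items, keyword):
    """Return items carrying an uncontrolled keyword (case-insensitive match)."""
    wanted = keyword.strip().lower()
    return [
        item for item in items
        if wanted in (k.lower() for k in item["keywords"])
    ]


broad = by_subject(ITEMS, "Physics")   # most of the collection
narrow = by_keyword(ITEMS, "plasma")   # a much smaller set
print(len(broad), len(narrow))
```

The broad heading matches most of the sample records, while the keyword picks out only the few that mention plasma, which is the same pattern I see in my own collection.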

My EPrints Subject Headings


Controlled Vocabulary and Metadata Fields
As Heather Hedden (2010) suggests, “not every metadata field needs to have a controlled vocabulary.” According to Hedden, fields such as title, size and date do not need one. However, I did put digital item authors into a controlled vocabulary field in Drupal to minimize spelling mistakes, as many of the authors are employees or students in the Physics Dept. where I work. This is the only reason to have a controlled vocabulary for authors, and it follows Hedden’s suggestion (pg. 280). It is much easier to create such a controlled vocabulary for authors in Drupal than in EPrints or DSpace. The one argument against such a vocabulary is that external authors, whose work I will also include in my collection, would not be on the list.
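
To show how a controlled author list can catch spelling mistakes, here is a minimal Python sketch. It is not Drupal code; the author names, the function `resolve_author` and the 0.8 similarity cutoff are my own assumptions, and the matching comes from Python's standard `difflib` module.

```python
from difflib import get_close_matches

# Hypothetical controlled list of department authors.
AUTHORS = ["Garcia, Maria", "Nguyen, Thanh", "Smith, John"]


def resolve_author(typed, vocabulary=AUTHORS, cutoff=0.8):
    """Return the controlled form of a typed author name, or None.

    None means nothing in the vocabulary is close enough, which is what
    we would expect for an external author who is not on the list.
    """
    lookup = {name.lower(): name for name in vocabulary}
    matches = get_close_matches(
        typed.strip().lower(), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


print(resolve_author("Garcia, Mria"))      # misspelling of a listed author
print(resolve_author("Hawking, Stephen"))  # external author, not in the list
```

A name that comes back as None could then be accepted as free text, so external authors are not blocked by the vocabulary.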

Most experts agree that the problem of consistency, that is, labeling information in a consistent way, can be overcome by controlled vocabularies. For instance, different users will create different terms for the same digital object; if they pick terms from a controlled vocabulary instead, the object is more likely to be labeled consistently.

Non-Preferred Terms
Hedden (2010) discusses why it is important to include some non-preferred terms in controlled vocabularies. Non-preferred terms, according to Hedden (2010), “may be near-synonyms, alternate spellings, grammatical / lexical variants, slang or technical versions, phrase inversions, acronyms and so on.” Since some of my users will be students, it would be good to have some non-preferred terms in my controlled vocabulary. For instance, “exoplanets” may be a term that some students misunderstand, so I could include the phrase “planets external to our solar system” as a non-preferred term that points to it. This would be easier to implement in Drupal than in EPrints or DSpace.
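
The idea of non-preferred terms pointing to a preferred term can be sketched as a simple lookup table in Python. This is an illustration only, not the way any of the three repositories store their vocabularies. The names `PREFERRED`, `build_lookup` and `to_preferred` are hypothetical, and apart from the exoplanets phrase above, the variants are ones I added as examples.

```python
# Preferred term -> non-preferred variants (synonyms, phrases, spellings).
# Only the long phrase comes from my own vocabulary; the rest are examples.
PREFERRED = {
    "exoplanets": [
        "exoplanet",
        "extrasolar planets",
        "planets external to our solar system",
    ],
}


def build_lookup(preferred):
    """Map every preferred and non-preferred form to its preferred term."""
    lookup = {}
    for term, variants in preferred.items():
        lookup[term.lower()] = term
        for variant in variants:
            lookup[variant.lower()] = term
    return lookup


LOOKUP = build_lookup(PREFERRED)


def to_preferred(entry):
    """Return the preferred term for what a user typed, or None if unknown."""
    return LOOKUP.get(entry.strip().lower())


print(to_preferred("Planets external to our solar system"))
```

A student who types the longer phrase is led to the same items as a researcher who types the preferred term.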

Categories and Facets
A number of DSpace repositories enable searching by Subject, Title, Type and Authors (see, for example, the DSpace at Cambridge search page), and users can browse these categories in DSpace. However, both Drupal and EPrints rely on advanced search for items such as format or type and for most of the other categories. With the Views module, Drupal can create a number of different categories and facets that would be useful to users of the system, but EPrints does not have such a module. EPrints does have a number of plug-ins, and more could be developed to facilitate browsing by categories and facets.
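
What a facet does can be shown with a short Python sketch: count how many items fall under each value of a field, then narrow the list when the user picks values. This is not how DSpace, the Drupal Views module or EPrints implement faceting; the records and the functions `facet_counts` and `narrow` are assumptions made for the illustration.

```python
from collections import Counter

# Hypothetical records standing in for items in the repository.
RECORDS = [
    {"title": "Item A", "type": "Article", "subject": "Physics"},
    {"title": "Item B", "type": "Dataset", "subject": "Physics"},
    {"title": "Item C", "type": "Article", "subject": "Astronomy"},
    {"title": "Item D", "type": "Thesis", "subject": "Physics"},
]


def facet_counts(records, field):
    """Count the records under each value of one facet field."""
    return Counter(r[field] for r in records if field in r)


def narrow(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(field) == value for field, value in selected.items())
    ]


print(facet_counts(RECORDS, "type"))
print([r["title"] for r in narrow(RECORDS, subject="Physics", type="Article")])
```

The counts are what a user would see beside each category when browsing, and each selection narrows the result list further.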

Tagging by Users
Hedden (2010) suggests in “The Accidental Taxonomist” that “the wording that is most likely to be looked up by the intended users/audience - in other words the preferred language of the taxonomy’s target population - should take precedence over other criteria” (pg. 79) in choosing preferred terms for a controlled vocabulary. This is the primary reason why I think users need to tag or add uncontrolled keywords as my collection is built. The perspective of a researcher with a PhD in Physics or Astronomy is quite different from that of a student researcher or of the creator of the collection, and may lead to very different sets of keywords. If users participate in the selection of keywords and terms, they will find the digital collection much easier to search and browse. As the creator of the system, I will gather invaluable information on how users search the digital collection.


Hedden, H. (2010). The Accidental Taxonomist. Medford, NJ: Information Today Inc.

Hedden, H. (2010). Taxonomies and controlled vocabularies best practices for metadata. Journal of Digital Asset Management, 6, 279-284. doi: 10.1057/dam.2010.29


2013 Open Repositories Conference

Search User Interface proposal for Subject Repositories: DSpace implementation for Retrieved from

Presentation on DSpace implementation for

DSpace Discovery: Unifying DSpace Search and Browse with Solr

Earlier Version of DSpace
