During October 3-7, 2011, I ran a series of blog posts on expert catalog search skills. I’ve gathered together all the posts here so they can be read on one page.
This is the first in a series of posts this week about searching the catalog like a pro. I know, you are yawning already. Bear with me. This is one of those skills that will change your life once you learn it. No more lame catalog searches for you; you’ll know how to find stuff.
Some of you who grew up with card catalogs probably think you have this down pat. You might be right. But there might be a few tricks I can show you to take your search skills up a notch.
So let’s get to it.
When you go to the library’s home page, you are presented with two catalog search options:
The first is the search box that says “Enter keywords”. This is a new kind of catalog search option. It has some interesting features, but it doesn’t let you do the kind of power searching I am going to explain here.
The second option is to click on the link that says “HELIN Catalog Advanced Search”. That’s for power searching.
Now make sure your seats are upright and your trays are in the locked position, and hang on for the ride.
When you go to the advanced search page, you’ll have several options at the top. You’ll want to choose the one that says “LC Subject”:
“LC” stands for “Library of Congress”. Like most US academic libraries (and a majority of US public libraries), we use the Library of Congress Subject Headings.
Let me explain.
If you’ve ever used Wikipedia, you know that sometimes the name of an article can refer to more than one thing. For instance, if I look up “bash”, there is a page called “Bash (disambiguation)”. On that page, Wikipedia points to the many different pages that it has for the term “bash”. Wikipedia doesn’t hit you over the head with this, but here’s what’s really happening: they are showing you their official titles for the articles that you might mean by “bash”. They separate them out, so that if you mean “Bash (Unix shell)”, it’s not the same page as “The bash (band)”.
Wikipedia calls this disambiguation.
The Library of Congress calls it authority control. (That sounds really awesome. Or maybe it sounds like a post-punk revival band. [Like Interpol.])
Tomorrow, I will explain why authority control is awesome, and how you can use it to your advantage.
In yesterday’s post, I introduced the idea of authority control and showed how it is similar to the idea of disambiguation in Wikipedia. Today, I want to go into authority control in more depth.
The purpose of authority control is to have one term – and one term only – to represent a concept, name, or place. (This post will focus on concepts, but you’ll be able to extrapolate this idea for names and places.) This official term is the authorized heading. The Library of Congress publishes lists of these authorized headings.
Why is this important?
If you want to get the best search results, then you’ll want to search with the authorized subject headings.
So, for instance, if you want to search for works about feminist philosophy, you should know that the official term is not “feminist philosophy”, nor is it “women and philosophy”. Instead, the official subject heading is “feminist theory”.
As an aside, the Library of Congress has been accused on more than a few occasions of using racist, sexist, homophobic, demeaning, or just downright stupid subject headings. If you ever want to discuss it, contact me. We’ll go grab a cup of coffee, and I’ll tell you about Sandynistas.
For now, we will leave aside the issue of whether or not you like the subject heading. The point is that there is one and only one subject heading for a given concept, and if you know it, your catalog searches will be awesome, because you will be able to find everything in the catalog that has that subject heading.
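If it helps to see the idea in code, here is a minimal Python sketch of what an authority file does: it maps the variant terms people might type to the one authorized heading. The dictionary and the `authorized_heading` function are names I made up for illustration; this is not how the Library of Congress or our catalog actually stores anything. Only the feminist theory example comes from the discussion above.

```python
# Hypothetical, tiny "authority file": variant terms -> the one authorized heading.
# Only the "feminist theory" example comes from the post; the structure is an assumption.
AUTHORITY_FILE = {
    "feminist philosophy": "Feminist theory",
    "women and philosophy": "Feminist theory",
    "feminist theory": "Feminist theory",
}


def authorized_heading(term):
    """Return the authorized heading for a term, or None if it is not in the file."""
    key = " ".join(term.lower().split())  # ignore case and extra spaces
    return AUTHORITY_FILE.get(key)


if __name__ == "__main__":
    for term in ["Feminist philosophy", "women and  philosophy", "girl power"]:
        print(term, "->", authorized_heading(term))
```

However many variants go in, only one heading ever comes out, and that one heading is what gets recorded on every catalog record about the concept.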
Before I get into the complicated part, let’s think about the power of this for a second. When you search Google or you search an article database, you are searching the full text of bazillions of documents. You can find lots of stuff using keywords. But you have probably had this experience when using Google: you enter some keywords, and then the results aren’t what you want. Then you change the keywords and see if you can get different results. You might do this a few times until you find something that looks useful to you.
The truth is, that’s a pretty ineffective search technique. There is no way for you to know if you got the best sites in your results, the most useful sites, or even a significant number of the sites on the topic you want. For instance, let’s say you hit the lottery and you want to hit the fresh powder at Vail. So first you Google ‘Vail fresh powder’. You’ll find some stuff. Then you try ‘skiing Vail’. That looks better. Then you think about what you’re really looking for, and you try ‘Vail ski vacation package’. Now you’ve got some really helpful results. But what you can never know is if you’ve gotten all the results on ski vacation packages at Vail.
Imagine you could look up an official term, like “Skiing–Vacation packages–Colorado–Vail”, and if you put it in Google, it would bring you only the appropriate results and all of the appropriate results.
Google doesn’t do that, but the catalog does.
Why? Because Google bases its searches on searching the full text of the website. There is so much stuff on the web, and language is a messy thing. There is no standard way of talking about anything. (Did you ever read Ernest Hemingway’s “Hills like white elephants”? It’s a short story about abortion that never once mentions abortion. It doesn’t even mention pregnancy. If you tried to do a keyword search on abortion, it would never be in the results, since the word ‘abortion’ is not in it.)
The catalog is different. When you search the catalog, you are not searching the full text of the items in the catalog. Instead, you are searching catalog records that have been created by professionally-trained catalogers.
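Here is a rough Python sketch of the difference. The "full text" is a one-line placeholder I wrote, not a quotation, and the subject heading on the record is one I assigned purely for illustration (I have not checked what a real catalog record for this story says). The function names are made up too. The only point is that a full-text keyword search cannot find a word that is not there, while a search of the catalog record can.

```python
# Placeholder for the full text: like the real story, it never uses the word "abortion".
full_text = "A couple waits for a train and talks around an operation."

# Hypothetical catalog record; the subject heading is assumed for illustration.
# "--" stands in for the dash between the parts of a heading.
record = {
    "title": "Hills like white elephants",
    "author": "Hemingway, Ernest, 1899-1961",
    "subjects": ["Abortion--Fiction"],
}


def keyword_in_full_text(word, text):
    """Full-text search: is the word literally in the text?"""
    return word.lower() in text.lower()


def subject_search(heading, rec):
    """Record search: does the record carry this heading or a subdivision of it?"""
    wanted = heading.lower()
    return any(
        s.lower() == wanted or s.lower().startswith(wanted + "--")
        for s in rec["subjects"]
    )


if __name__ == "__main__":
    print("keyword search finds it:", keyword_in_full_text("abortion", full_text))
    print("subject search finds it:", subject_search("Abortion", record))
```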
I’ll pick up tomorrow by traipsing through a catalog record in detail, to show you where the expert searching power lives in the catalog.
In yesterday’s post, I explained why keyword searches aren’t very effective in the catalog, and why authorized headings make catalog searches superior to, say, Google keyword searches. Before I continue with that line of thinking, let’s go off on a tangent here and look at a catalog record. (If the print is too fine, you can look at the record here.)
Let me sum up what is included in records, so I can explain the really important parts:
- Publication Info (labeled “Publ Info”)
- Permanent link to the record (you can use this URL to send people to this exact catalog page)
- A box that tells you what libraries have the item, the call numbers at those libraries, and the status of the items
- Physical description (labeled “Descript”)
- LC Subjects
All of these parts are helpful in telling you something about the item. But the really interesting parts are the parts that have hyperlinks built in them. (In our catalog, those are the underlined blue terms.)
In this example, we have an author, a series, and LC Subjects that are hyperlinked. (Just for clarification, I want to mention that the call numbers are hyperlinked too, but if you click on them, they will take you to a page where you can browse nearby call numbers. That’s really helpful, since call numbers are arranged by subject, so you can see what ‘sits’ nearby in the collection. But the call number is not an authorized heading, which is my focus here.)
In tomorrow’s post, I’ll explain why the author, series, and LC subject fields are a Big Deal. Stay tuned.
These are all authorized headings. (Hoo boy! We are now getting into the heart of the matter…)
The Library of Congress maintains an authorized list of names, called the Library of Congress Name Authority File. (Catalogers call it the LCNAF for short.)
The names of series are also maintained in the Name Authority File. (I know that sounds weird, because it’s more like a title than a name. Just trust me on this one; series are in the Name Authority File.)
The Library of Congress also maintains an authorized list of subjects, called the Library of Congress Subject Headings, or LCSH.
The LCSH has about 265,000 subjects in it. The LCNAF has about 5.3 million name headings and about 350,000 series titles in it. In the LCNAF, those 5.3 million records include about 3.8 million personal names, about 900,000 corporate names, about 120,000 names of meetings, and about 90,000 geographic names. [More info here.]
What does all this mean?
When your friendly neighborhood cataloger (moi) adds a new item to the catalog, I have to record all that information you saw in the catalog record: the author, the title, the subjects, etc. There are rules on how to do this. (These rules sometimes seem more complicated than the tax code – but I digress).
The rules state that I have to use the authorized form of names, subjects, and series.
This is really important because it brings all of those names, subjects, and series together in the catalog. In our sample record, if you click on the author’s name, it will show you his name in a list of author names, and it will show how many items we have where he is the author. (In this case, there is just one.)
How about if we click on the subject “Feminism in literature”? It will show us an alphabetical list of subjects, and this one will be in the middle of the page. It will also show us how many works we have with this subject. (There are 26 as I write this.)
Similarly, I can click on the series, and it will show me a list of series titles that are nearby alphabetically, and it will also show me how many other works we have in this series. (In this case, we have just one.)
You may not realize it, but I just showed you a sneaky way of searching the catalog. Say you are researching a topic, and you found one book that is on the right topic – or is close to it – and you want to find more. Look the book up in the catalog and start clicking! It’s especially useful to click on the subjects, because that will take you to an alphabetical page of subjects.
This is important, because some subjects are more specific than others, but they should sit near each other in the list.
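The browse list behaves roughly like the Python sketch below: the headings are kept in alphabetical order, and you land in the list at the heading you clicked, with its neighbors around it. The headings are ones that appear in this series; the `browse` helper and the simple case-insensitive sort are my own assumptions, and a real catalog's filing rules are more involved. I write "--" for the dash between the parts of a heading.

```python
import bisect

# Headings mentioned in this series, sorted with a simple case-insensitive sort.
# (An assumption: real catalogs use more elaborate filing rules.)
HEADINGS = sorted(
    [
        "Basic needs",
        "Begging",
        "Feminism in literature",
        "Feminist theory",
        "Homelessness",
        "Poor",
        "Poverty",
        "Poverty literature",
        "Poverty--Developing countries--Prevention--Congresses",
        "Subsistence economy",
    ],
    key=str.lower,
)
KEYS = [h.lower() for h in HEADINGS]


def browse(term, before=2, after=2):
    """Return the slice of the alphabetical list around where `term` files."""
    i = bisect.bisect_left(KEYS, term.lower())
    return HEADINGS[max(0, i - before): i + after + 1]


if __name__ == "__main__":
    for heading in browse("Poverty", before=1, after=2):
        print(heading)
```

Because the list is alphabetical, the broad heading and the more specific headings built on it end up on the same screen.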
Let’s use the example of the subject poverty.
Note that the first thing on the list says Poverty-5 related subjects. That’s a link to 5 other terms you might want to try. (Score!) These terms are Basic needs, Begging, Homelessness, Poor, and Subsistence economy. This can be really helpful. You might have started by searching Poverty, but maybe what you really need are works about the Poor. The catalog can get you there.
(I mentioned there are 265,000 subject headings in the LCSH. What is also in the LCSH is an advanced hierarchy of subject headings. Some are broad, some are narrow, and there are links built between them to get you from one to the other. I’ll explain how it works in more detail tomorrow.)
Then you start with the list of subjects that start with Poverty.
Quick digression. When a cataloger records what a work is about, s/he is supposed to record the most precise subject. If it is a book about poverty in general, across many times, places, and approaches, then the subject will be “Poverty”. But what if the work is really about an international meeting (also called a congress) on preventing poverty in developing countries? Then the cataloger is supposed to use this subject: Poverty–Developing countries–Prevention–Congresses.
This subject is not included under the subject Poverty. If you want to look at works about both the very general topic and the other specific topics, you will have to click through each subject you want to investigate in order to see what we have on those various subjects.
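In code terms, each subject is stored as its own complete string, and clicking a heading matches the whole string. Here is a minimal Python sketch with three invented records (the titles are made up; the headings are ones from this series, with "--" standing in for the dash). Both function names are mine, for illustration only.

```python
# Invented records for illustration only.
RECORDS = [
    {"title": "Book A", "subjects": ["Poverty"]},
    {
        "title": "Book B",
        "subjects": ["Poverty--Developing countries--Prevention--Congresses"],
    },
    {"title": "Book C", "subjects": ["Homelessness"]},
]


def exact_subject(heading):
    """Titles whose records carry exactly this heading."""
    return [r["title"] for r in RECORDS if heading in r["subjects"]]


def with_subdivisions(heading):
    """Titles under the heading or under any more specific heading built on it."""
    prefix = heading + "--"
    return [
        r["title"]
        for r in RECORDS
        if any(s == heading or s.startswith(prefix) for s in r["subjects"])
    ]


if __name__ == "__main__":
    print(exact_subject("Poverty"))
    print(with_subdivisions("Poverty"))
```

Clicking a single heading in the catalog behaves like `exact_subject`. To get the effect of `with_subdivisions`, you click through each of the more specific headings in turn.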
So, the list of subjects you’ll see after Poverty includes more specific subjects, such as:
And then there’s this strange entry:
Poverty–Bibliography — See also Poverty literature
When a subject like this shows up in the list, what it’s doing is showing you the authorized form. You might think to use “Poverty–Bibliography”, but the authorized form is Poverty literature.
Go sleep on it. Tomorrow I’ll wrap this up by showing you how to find the authorized name and subject headings so you can start mining the catalog like a pro.
In yesterday’s post, I finally got to the point of this series of posts: namely, that there are authorized subjects and names, and that they can be used to do the most precise catalog searches.
I have two more points to make about this kind of search. The first is on precision and recall. These terms are used in libraries and in other information retrieval circles to talk about how well a search works. When we talk about precision, we are talking about the search results being relevant. So, for instance, if you search for information on the Ugandan civil war, you are not going to be happy if most of your results have to do with the American civil war.
When we talk about recall, we are talking about how many of the relevant items in the collection actually show up in your results. Let’s say you search for books about the American civil war, and you get three results. In a university library catalog, that’s ridiculous. You should get hundreds, if not thousands, of results.
The point of using authorized names and subjects is for good precision and recall. If you find out that the authorized subject heading for the American civil war is “United States–History–Civil War, 1861-1865”, then using that subject heading should only bring up works about the American civil war AND it should bring up all the works on the American civil war. (Well, to be more precise, it should bring up works about the civil war along with a list of more specific subject headings, such as “United States–History–Civil War, 1861-1865–African American troops”.)
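If you like numbers, precision and recall are usually worked out as two simple fractions: precision is the share of your results that are relevant, and recall is the share of all the relevant items that made it into your results. Here is a small Python sketch. The counts are invented for illustration (200 relevant books, a sloppy keyword search that returns 10 items), and the item numbers are just stand-ins for catalog records.

```python
def precision_and_recall(retrieved, relevant):
    """Return (precision, recall) for a set of results and a set of relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # results that really are relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: pretend the catalog holds 200 books on the American civil war.
relevant_books = set(range(200))

# A sloppy keyword search: 3 relevant books plus 7 about other civil wars.
keyword_results = {0, 1, 2} | set(range(1000, 1007))

# A search on the authorized subject heading: every relevant book, nothing else.
subject_results = set(range(200))

if __name__ == "__main__":
    print("keyword:", precision_and_recall(keyword_results, relevant_books))
    print("subject:", precision_and_recall(subject_results, relevant_books))
```

With these made-up numbers the keyword search scores 0.3 on precision and 0.015 on recall, while the subject heading search scores 1.0 on both, which is the ideal the authorized headings are aiming for.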
That’s the point of all of this. Authorized subjects and names are not designed to drive you crazy. (That’s just a bonus!) No, they are designed so that you can make the most of your searches.
I haven’t really gone into this in detail, but authorized names work much like authorized subjects. For instance, if you want to find the works by Bach, you don’t just want to search by “Bach”. (That’s a common name.) You don’t even want to search by “Bach, Johann”. (There are ten of those in our catalog.) If you want to make sure you’ve got the Bach, you should use the authorized name: Bach, Johann Sebastian, 1685-1750.
As an aside, if you want to bring up all the works by Bach, search by author and use the authorized form of his name. If you want to search for works about Bach, search by subject and use the authorized form of his name.
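The same authorized name does double duty, depending on which field you search. A minimal Python sketch, using three invented records (the titles, the second author, and the topical subjects are placeholders I made up, and so are the function names):

```python
BACH = "Bach, Johann Sebastian, 1685-1750"  # the authorized name from this post

# Invented records for illustration only.
RECORDS = [
    {"title": "Score 1", "author": BACH, "subjects": ["Cantatas"]},
    {"title": "Biography 1", "author": "Example, Author", "subjects": [BACH]},
    {
        "title": "Score 2",
        "author": "Bach, Johann Christian, 1735-1782",
        "subjects": ["Symphonies"],
    },
]


def works_by(name):
    """Author search: records where the name is the author."""
    return [r["title"] for r in RECORDS if r["author"] == name]


def works_about(name):
    """Subject search: records where the name appears as a subject."""
    return [r["title"] for r in RECORDS if name in r["subjects"]]


if __name__ == "__main__":
    print("by Bach:", works_by(BACH))
    print("about Bach:", works_about(BACH))
```

Notice that the other Bach never sneaks into either list, because the match is on the full authorized name and not on the word “Bach”.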
By this point, you should be chomping at the bit, ready to search the catalog for all kinds of stuff. Wait a minute! I’ve got to answer just one more question for you: where, exactly, are these lists of authorized names and subjects?
If you want to wrestle with the print version (which is actually really helpful), the Library of Congress Subject Headings are available in print in our Reference area. (These are the big Red Books I mentioned a few days ago.)
There is no print equivalent for the authorized names. The list is simply too long and it changes too rapidly.
So where do you find those? Both the names and the subjects are built into the advanced search catalog. If you go to the advanced search option and search for the author “Aquinas”, for example, it will point you to the right name: Thomas, Aquinas, Saint, 1225?-1274. (Another aside: Pre-15th century names are often written this way. We call him “Thomas Aquinas”, but “Aquinas” is not his last name. It means he’s from Aquino. His name is really just Thomas, so his name starts with Thomas. Most authors during and after the 15th century are in inverted order: Lastname, Firstname.)
Similarly, if you search for subjects in the advanced search, it often will point you to the right one.
What other options do you have? As I mentioned yesterday, you can take the approach of finding something in the catalog on a related topic, and seeing what subjects are in there.
Best of all? Consult a librarian. One of the reasons we are here is to help you find the best authorized terms for your searches. (In fact, I get downright giddy when people ask me about subject headings. Finding subject headings is like playing detective, only without the trauma.)
If you are an upper-level undergraduate or a graduate student, or a researcher in a particular area, it is a good idea for you to come to grips with the subject headings in your area early on. Don’t do a bunch of sloppy searches and then wait until the last second to start thinking about the authorized terms. Librarians are here to help you do searches with good precision and recall.
This concludes my series on expert searching. There is so much more I could cover; my focus here was on introducing the concept of authorized names and subjects, and explaining why they are important for the best searches. I really hope you found this information useful.
(This is the first time I’ve done a series like this on an advanced topic. If you have any suggestions or comments, I’d be glad to hear them. You can either use the comment box below, or email me. See my “About me” page for my contact information.)