Monday, September 19, 2011

Reference Management Software

"Reference Management Software." Just listen to that noun; it's dull and dry as a bone. But, for those of you who don't know it's a hugely important tool for researchers. "Reference management" is all about being able to find research papers, to organize and sort them, and to create references for your own papers. Here's an interesting comparison of four RM systems.

For those that don't know, though, what are they and how do you use them?

Imagine, if you will, a research paper. This one, for instance, about the distribution of city sizes (see note #1 below). Now imagine you're working on a project on cities and stumble onto it. It may come in useful, so you save it for later. You drop it into your reference manager (RM) software; it saves the PDF file, and records the name of the paper, the authors, the date and place of publication, when you added it, and perhaps also keywords, what webpage you got it from and other relevant information.

It's some time later and your project is picking up steam. You vaguely remember some paper about city size you saved earlier; or you're looking for all papers by Ethan Decker; or looking for any paper that mentions Zipf's law (all of which would match this paper). So you turn to your RM and use its search function to find the paper again. You could search for "Decker", or "Zipf", or "PLoS city" and get all your saved papers that match. You find the paper and look through it, and as you read you add tags and notes about the paper right in the RM itself.
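To make the search idea concrete, here is a minimal sketch in Python of what such a lookup amounts to. It is not how any real reference manager is implemented; the `Reference` class, the `search` function and the two records are all made up for illustration, and the titles are placeholders rather than real citations.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reference:
    """One saved paper, holding the kind of metadata an RM records."""
    title: str
    authors: List[str]
    venue: str
    tags: List[str] = field(default_factory=list)
    notes: str = ""

    def searchable_text(self) -> str:
        # Lump every field into one lowercase string to search in.
        parts = [self.title, self.venue, self.notes] + self.authors + self.tags
        return " ".join(parts).lower()


def search(library, query):
    """Return the references that contain every word of the query."""
    words = query.lower().split()
    return [ref for ref in library
            if all(word in ref.searchable_text() for word in words)]


# Placeholder records, invented for this example only.
library = [
    Reference("Placeholder title about city size distributions",
              ["Decker, Ethan"], "PLoS", tags=["Zipf's law", "city"]),
    Reference("Placeholder title about something else",
              ["Doe, Jane"], "Some Journal", tags=["unrelated"]),
]

for query in ("Decker", "Zipf", "PLoS city"):
    print(query, "->", [ref.title for ref in search(library, query)])
```

All three queries find the first record and skip the second. A real manager would also index the full text of the PDF, which this sketch leaves out.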

It's later still and you've read a lot, done your own research and you're well on your way to writing your own paper. You need to add citations to all your sources in the text, and make a bibliography at the end. How you do that varies depending on how you write. If you use Word or OpenOffice you probably use a plugin that helps you search and insert citations from your RM. If you use LaTeX (and if you write research papers you should. Really.) you can simply export all your citations into a file that LaTeX will be able to read. In either case you're saved from having to type in and format all the citations and the reference list manually. If you've never done that by hand, I can tell you it's long, dreary work that is hard to get right and a pain to keep up to date.

OK, but why can't you just keep the papers in a folder and get them from there? You could, if you have just a few dozen papers. You could be more methodical and save each paper by author and title, then store it in a separate folder for each journal and year. But eventually it becomes unmanageable. I have several hundred papers saved just for my current project, and there are people out there with tens of thousands of papers stored. There is no way you can remember all the relevant papers you have, never mind any particular details about each paper.

A reference manager does more than help you store your papers. A good manager helps you get them and adds all the metadata it can without your assistance. It helps you search for and find relevant papers when you need them, and lets you add comments, notes and tags to them so you can find them again and remember what they were about without having to reread them. And it helps you generate and format your citations and reference list in a consistent, correct fashion.

For what it's worth I use Zotero as my manager. It's cross-platform — it works anywhere Firefox does — and free and open to use. The review above does a pretty good job of showing its strengths and weaknesses.

What I like most about it is the ease of importing and exporting data. When you find a paper online you just click on the import icon in the browser bar and the paper and citation data are automagically downloaded and indexed. It doesn't always work; for a few sites it only collects the citation data and you have to add the PDF file yourself, but it's not a major hassle. Exporting is also really easy. I just add papers to a specific collection for the paper I'm writing, then export the entire collection as a BibTeX file that LaTeX can read and use. It really can't be simpler than that.
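If you have never seen a BibTeX file, here is a rough Python sketch of what the export boils down to. This is not Zotero's own exporter, which handles many more fields and entry types; the `to_bibtex` and `export_collection` functions, the citation key and the field values are all hypothetical placeholders.

```python
def to_bibtex(key, fields, entry_type="article"):
    """Format one reference as the text of a BibTeX entry."""
    lines = ["@%s{%s," % (entry_type, key)]
    for name, value in fields.items():
        lines.append("  %s = {%s}," % (name, value))
    lines.append("}")
    return "\n".join(lines)


def export_collection(collection):
    """Join all entries of a collection into the text of one .bib file."""
    return "\n\n".join(to_bibtex(key, fields)
                       for key, fields in collection.items())


# Placeholder data, invented for illustration; not a real citation.
collection = {
    "decker_placeholder": {
        "author": "Decker, Ethan and Coauthor, Some",
        "title": "Placeholder title about city size distributions",
        "journal": "Placeholder Journal",
    },
}

bib_text = export_collection(collection)
print(bib_text)
```

Saved to a file such as `myrefs.bib` (a made-up name), the entry can be cited from the LaTeX source with `\cite{decker_placeholder}`, and `\bibliography{myrefs}` tells LaTeX where to find it.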

If you're a budding researcher and don't yet use a reference manager you really owe it to yourself to start today. Pick any one you like, of course, but I think Zotero is a good choice if you don't know which to pick.

#1 It's a neat paper, and I've been meaning to write something about it. Very briefly, they show that while larger city sizes are described well by a power law, regional and smaller communities are better described by a lognormal distribution. They show how a very simple model of human migration and reproduction can generate the real distribution of smaller communities.

It could mean that network effects (that we want to live in a city because other people do so already) only kick in over a certain city size. Below that size the number and distribution of communities depend mostly on local random migration and reproduction patterns. Smaller communities do not inherently attract people the way larger cities do. Is the paper correct? I don't know, but it's very interesting.


sajith said...

I've been using Mendeley with some success. I like that they have a Debian/Ubuntu version of the desktop software. I'd have liked it to be real libre/free software, but I'm not complaining too much about that.

George said...

That has been an interesting read.
I just began a project about economic development, and I have until December to finish it.

I am doing a lot of internet research and I've been saving web pages and copying text and references, all in folders I've set up. It's still not overwhelming but it should become rather unmanageable soon.

I will try Zotero. I've gotten used to Chrome, but FF has interesting features for research. Restoring a session, which lets me pick up where I closed, is useful. But I've gotten used to the Chrome speed.

I will give it a better read tomorrow, as it's time for me to sleep. The advice is very much appreciated, although my project is much lighter than post-university work like yours.

Janne Morén said...

Sajith, George, the thing Zotero really shines at is the integration with the web: you can get the paper, or web page, or quotation or whatever right then and there, with just a click on the icon in the address bar.

With other systems you need to do it separately from the browser, and I end up forgetting one or two in a long list of reference material I looked up.

I've used Chrome too (I have it on my machine right now) but I just don't see the large speed difference other people talk about. The main problem with FF for me is the memory consumption; though if I'm running Zotero in it, it's perhaps not surprising it should be pretty high.

Allerwelt said...
George said...

Well, I can write a bit after a few days using Zotero.

Really nice tool. I haven't exploited all that it might offer, but I like being able to see all the info for a particular source.

I also want to bring Firefox back into my daily use. As for the memory consumption, yes. Although Chrome managed to freeze my laptop for a minute, numerous times, when opening a new tab.
IT is a complex thing, whatever.

Thanks for the article! Zotero should ease the reference and bibliography part of my project a bit.