Friday, August 22, 2008

An Experiment in Coffee Storage

Coffee is perhaps the most popular drug on Earth. There is broad agreement on the different methods to brew coffee, but there is little agreement on how to properly store roasted coffee beans. While everybody agrees that coffee should be ground and used within two weeks or so, this is not always possible. We conducted an informal test and came to the conclusion that chilled, but not frozen, storage offers the best tasting coffee.


We drink coffee; so do many people. You get good coffee by buying whole beans and grinding them only as you use them. Normally you should use coffee beans within two weeks of roasting. For the disorganized caffeine addict - such as myself - this is not practical as you constantly risk running out of coffee. And having me go without my morning coffee may be entertaining for onlookers, but it is rather unpleasant - not to mention dangerous - for myself. So it is prudent to buy and store coffee in larger batches.
But how to store it? It turns out that there is no consensus on this. Some argue for storage in the freezer; that would preserve volatile oils in the bean that otherwise evaporate. Others argue for room temperature as it keeps the beans dry and away from other foods that can impart flavour. Both groups generally dismiss refrigeration, arguing that it is neither as dry nor as free of smell as room temperature, nor as effective as the freezer.

As this is a question of vital importance to humanity - or the subset that depends on coffee to wake up in the morning - we resolved to find out, so in June we set up a small experiment. Since this is a real experiment of sorts I thought it could be a nice illustration of how to go about an experiment like this and how to present it. Science is a method or a way of thinking about things, after all, and it doesn't have to be about complex or esoteric things. As a further illustration I write this post in the approximate format - if not the language - of an academic paper (see Note 1).

In early June we put fresh coffee beans into three sealed plastic bags. One bag went into the freezer, one into the refrigerator, and one into a cupboard. During the Obon holiday last week we took the now two-month-old coffee beans, and - making sure we did this double-blind (see Note 2) - brewed three cups. We tasted all three batches at the same time and ranked them in order of preference. We only identified the coffee after we had ranked the batches.

Coffee testing
The testing setup. We brewed coffee straight into the three glasses at the top and, while it was still hot, poured it into the small glasses for testing.

We were in complete agreement on the rank order from the beginning; the difference between the batches was surprisingly large. Finishing dead last was the room temperature coffee. It felt thin and sour compared to the other two; there really was no contest. The refrigerator and freezer coffees were closer to each other in flavour and smell, but we both agreed that the refrigerated coffee did taste better and more balanced than the freezer coffee.

The result, in short, was:

1. Refrigerator
2. Freezer
3. Room temperature


In early June we took three small freezer bags such as you can find (not at all coincidentally) by the register in Life supermarkets, and filled each with two spoonfuls of whole roasted Hanshin Special Blend beans - our normal coffee. The amount is approximately equal to that needed for half a pot of coffee. The bags were unused and had been checked for holes. All three bags were compressed to remove as much air as possible, then tied up tightly. One bag was put in the kitchen cupboard where we normally store coffee and tea; one was placed on the top shelf of the refrigerator door together with bottles of salad dressing and the like; and one was placed in the freezer.

We waited two months. While this is an unusually long time - we normally use a bag within three to four weeks or so - we wanted to give the coffee sufficient time to develop any differences due to storage type. Also, the Obon holiday was the first opportunity we had to conduct the experiment at our leisure.

For the testing we recruited two experimental subjects. The subjects were two healthy adult humans, one male, one female, who fully understood the nature and purpose of the experiment, and agreed to all conditions. This is not surprising, since the subjects are me and Ritsuko ourselves.

The experiment was rendered double blind by randomizing the labeling, twice, as you can see in the illustration to the left. Without Ritsuko in the kitchen I took out the bags and waited until they all were at room temperature. I opened them and poured the beans into bowls labeled 1, 2 and 3, chosen randomly ("Randomize #1" to the left). Then I left and Ritsuko entered. She took the three bowls, and randomly (we used lots) poured them into mugs labeled A, B and C ("Randomize #2"). As a result, she did not know which bag of coffee was 1, 2 or 3. I knew, but did not know which of the bowls had become A, B or C. Neither of us knew which batch of coffee was in which mug, in other words.
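For the programming-minded, the two-step relabeling can be sketched in a few lines of Python. This is only an illustration of the idea, not what we actually did (we drew lots by hand); the function names `double_blind_assignment` and `unblind` are made up for this sketch, and the random number generator stands in for the lots.

```python
import random


def double_blind_assignment(conditions, rng=None):
    """Relabel the batches twice, once per experimenter.

    Returns two separate keys. Each experimenter keeps only their own,
    so neither can tell which storage condition is in which mug.
    """
    rng = rng or random.Random()

    # Randomize #1 (first experimenter): storage bag -> numbered bowl.
    bowls = ["1", "2", "3"]
    bowl_key = dict(zip(bowls, rng.sample(conditions, len(conditions))))

    # Randomize #2 (second experimenter): numbered bowl -> lettered mug.
    mugs = ["A", "B", "C"]
    mug_key = dict(zip(mugs, rng.sample(bowls, len(bowls))))

    return bowl_key, mug_key


def unblind(bowl_key, mug_key):
    """Combine both keys after the ranking is done: mug -> storage condition."""
    return {mug: bowl_key[bowl] for mug, bowl in mug_key.items()}


if __name__ == "__main__":
    conditions = ["freezer", "refrigerator", "cupboard"]
    bowl_key, mug_key = double_blind_assignment(conditions)
    # Only now, after tasting and ranking, do we put the two keys together.
    print(unblind(bowl_key, mug_key))
```

The point of keeping two separate keys is that the full mapping from mug to storage condition only exists once both people share what they know, which is exactly what we did after the ranking.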

We ground each batch using a common mill (not knife) type grinder, with the grinder thoroughly cleaned of coffee grounds before and between each batch (what fun that was). Unfortunately we don't have three coffee pots and filter holders at home, so instead we took three ordinary glasses (as seen in the picture) and loosely fitted a paper filter with a batch of ground beans in each of them. We poured the water in two steps, first a little just to soak the grounds, then enough hot water to fill each glass with coffee. The resulting coffee was a bit thicker and stronger than normal for us.
To test the coffee we used small schnapps glasses, one glass per batch per person (the small glasses in the image). Tasting was free-form, and we were slow enough that we got to taste each batch both hot and cool. We ranked each batch individually, then compared notes.


The results were completely unambiguous: the room-temperature stored coffee was ranked last by both me and Ritsuko. We were in complete agreement that it was significantly worse in every respect than the other two; thin and with a somewhat unpleasant sour aftertaste. When compared to the other two I was almost hesitant to finish the glass. Almost - I mean, it's still coffee after all.

The two chilled batches were significantly better, and closer in taste and smell to each other, than the room temperature batch. But here too, there was a clear difference and we were both in agreement. The frozen coffee was more bitter and thicker, while the refrigerated coffee was well balanced with somewhat less bitterness and richer flavours. Nothing changed as the coffee cooled: the refrigerated beans tasted somewhat better than the frozen ones, and a lot better than the room-temperature beans.


Our aim has been to find out how we should store our coffee - not how people in general should do so. There are a number of factors that could change our results, and we don't really have the time or inclination to control for all of them. The choice of coffee may influence the results, as will the storage time. We chose our coffee and storage time as representative of what we normally drink and the maximum storage time we use, respectively. A different brewing method than drip brew, or a different coarseness of grind, may also affect the result.

Our brewing method leaves something to be desired, and could have influenced the results. The basics are sound - mill grinder, pre-wet the beans - but the lack of proper equipment meant that we got stronger coffee than normal, and the three batches were not brewed for precisely the same length of time. The results did not follow the order of brewing time, however, indicating that the effect is modest at best.

So, why this result? First, the poor performance of room-temperature storage seems to indicate that flavours do evaporate over time, even when sealed in an airtight container. It is worth noting, however, that "room temperature" in Osaka in the summer is normally around 30°C even with running air conditioning, so this result may not hold in winter or in temperate regions.

We do not know why we have a difference between freezing and refrigeration; a different kind of experiment would be needed to determine it. But we can speculate. First, it seems reasonable to assume - as room-temperature advocates do - that freezing does in fact damage the beans to some extent when they freeze. This damage occurs for other foods as well after all, and is one reason you should not refreeze foods. The result could be that flavours, including bitter ones you don't want, seep out faster into the water than with fresh beans. The frozen, ground coffee would effectively act as if it had been ground more finely than was the case.

As I said in the introduction, refrigeration is generally dismissed with the arguments that it is less effective than freezing, and that condensation dampens the beans, damaging them. But if freezing causes damage to the beans then the best storage temperature ought to be just above freezing. Condensation can only happen if there are appreciable amounts of water vapour in the surrounding air to begin with. But we made sure to empty each bag of air as best we could, and seal it tightly. There was little air, and thus little humidity around the beans to condense. It may be that refrigeration is damaging if done in a large jar with much empty space, or if the coffee is frequently taken out then returned, so that humid air has a chance to enter and condense. Yet another experiment would be needed to determine whether humidity really does harm the beans and whether frequent use increases humidity enough for it to have an effect.

From this informal experiment it does seem that for storage of coffee beans, refrigeration is preferable to either freezing or keeping them at room-temperature. While we can't generalize this result based on one experiment, it seems that refrigeration may be a viable alternative to investigate in the general case as well.

I would like to extend a special thank you to Ritsuko for gracefully putting up with me even when I turn a simple thing like having coffee into an hour-long exercise in science geekiness.


Note 1: This is effectively a psychological or psychophysical experiment, so I'm vaguely following the paper structure of that field. While this standard is quite formalistic it's similar in spirit to many other experimental fields:

First a short Abstract that tells you what we did and what we found; just enough that people can decide if the paper is of any interest at all (many aren't after all). Then an Introduction that tells you the background and gives you an overview of the experiment and the results. This section, and the Discussion at the end, are often the only things people read.

The Methods and Results sections dive into all the nitty-gritty details of the experiment and data analysis. This is important information if you want to replicate the experiment or are interested in some specific method information. Most readers are not, and never do more than glance at these sections.

In the Discussion we talk about the meaning of our results. Here we can be a little speculative; while the Results section is all about the facts, here we try to interpret them. Why did we get this result? How does it fit in with other studies? Could this be due to experimental errors? What new questions do we have and how could we resolve them (the answer to this is often "give us funding for a follow-up experiment")? This section, together with the Introduction, usually has all the important results of the paper, and most people never read more than that.

At the end of the text we thank everybody we think contributed to the work (it's polite - and politic - to thank the agency funding the research for instance). This sometimes gets its own, tiny, section. At the very end come the References. This is actually as important for many readers as the paper itself. Any one paper is usually just one nugget of information, and you need to read and correlate a number of related papers to get the whole picture of a particular research area (or read a textbook where someone has done the work for you). References are even more useful the other way around: search for papers referring to the current paper and you can follow the results and consequences of the research. In our case I skip the reference list; on the web it's more natural to embed links directly into the text.

Note 2: Many experiments are about having people compare things - the taste of different coffee batches in our case, or the results of two different medical procedures, or the effect of a new drug compared with an older drug or with no drug at all. But we can't just try it and see what happens - we are only human, and our expectations and biases can completely change our perceptions and distort the results.

If a patient in a drug trial knows that they are getting the new drug, for instance, they will show more improvement (and more side-effects) than those not getting it even if the drug really has no effect at all. In the same way, since we expected that the freezer coffee would taste best, we probably would have thought it really did had we known which coffee was which. I noticed a similar effect when I owned a motorcycle: when I had washed my bike, polished it and generally cleaned it up it felt distinctly smoother, faster and more responsive even though its looks have absolutely no effect on performance.

So you want your experimental subjects to be blind to what they are getting. You give some subjects the real drug, others an identical-looking pill with no effect. But again, we are human and we are extremely good at reading people. So if the researcher knows which drug a patient is getting and expects a certain result, the patient may unconsciously read that expectation from the tone and body language of the researcher and react accordingly - a form of the Clever Hans effect.

The only way to make sure expectations don't affect the results is to make sure nobody involved knows who is getting what. Make both the subjects and the researchers blind to the test; make the test double blind. We only reveal which subjects got the real drug and which did not after the experiment is over. In our case, of course, we are both subjects and researchers, so blind and double blind are effectively the same.
