Wednesday, December 8, 2010

We Have a Communications Problem

Test Tubes

We all recently got a nice example of how communication can go wrong, and how the research community suffers as a result.

A few weeks ago, NASA sent out a cryptic press release about an upcoming research finding "that will impact the search for evidence of extraterrestrial life." The net exploded with speculation, some wildly unfounded, some informed and down-to-earth.

The big day arrived, and it turned out to be an entirely terrestrial bit of research: Researchers had taken a type of arsenic-tolerant bacteria that lives in an arsenic-rich lake, and coaxed them to actually incorporate the arsenic into their own body chemistry. Kind of cool, but nothing extraterrestrial - no "new life" or different kind of life or anything. The first round of news articles did their best to play it up as a major breakthrough, while the first reactions from actual researchers were cautious and not overly enthusiastic.

Good thing they were cautious. People have now studied the paper in more detail and the results don't look nearly as good as the authors first claimed. The results are tenuous and not well supported, and the bacteria may in fact not have incorporated arsenic as a functional part of their body chemistry at all.

So, we go from immense hype and speculation, to uncritical reporting of a less-exciting result, to serious doubt that there is any positive result to report on at all. Enormous excitement to complete letdown in a few weeks. What went wrong?

First, take a look at who finally popped the whole media balloon: working researchers in the field, who read and reflected on the paper, then wrote up their comments on their blogs or on news sites. Would it not have been good for everyone if they'd been able to chime in right from the start, rather than weeks after the media frenzy? Why weren't they? Why did the whole thing crash so spectacularly?

One reason is spelled "embargo". Many news outlets refuse to cover events like published research unless they can publish their articles right when it's announced. A research project may take five years, and writing and publishing the paper another six months or a year, but if a newspaper has to wait two days while its reporter reads the paper, it refuses to mention it at all. Also, the impact of publication is greater for the journal if it's accompanied by a flurry of press coverage at the same time.

What high-profile research journals do is embargo interesting papers: They forbid anybody involved from speaking about the research until the publication date, and give some science journalists advance access to the paper and to the researchers. That gives them time to prepare their articles and lets them all publish at the same time, with higher public impact for everyone.

There are a few problems with that, of course. Since nobody else knows about the research, the journalist can't ask other researchers for a different perspective. The "journalist" is reduced to a PR flack rewriting the press release1 in their own words. That gives us all those completely uncritical, overly positive science articles, like the ones accompanying this arsenic paper.

And the people who really are well-placed to give a solid opinion on the paper - other researchers - didn't have advance access, and couldn't give their opinion right at publication. We had to wait a week for that, and by that time the damage - to the researchers, to NASA's reputation and to the science journalists - had already been done.

This was made worse in this case by a misleading and sensationalist press release from NASA well before publication. It was designed to fan the flames of media hype from the start, and it succeeded admirably. The people who could normally have poured some cold water of reason on those flames could not: the paper was embargoed, so they could offer no solid opinion on it.

But research is peer-reviewed; why weren't the problems caught well before publication in the first place? We can't know for sure of course, but people are speculating that the problems were caught by reviewers, but were overruled by the journal editors.

The highest-profile journals like Science and Nature are different from the normal journals most papers get published in. They aim for a wide audience and tend to go for the ground-breaking and surprising stuff - the kind of research that leads to prizes and fame (they are called glamour journals for a reason).

Now, they normally publish very high-quality research, don't get me wrong, and having a paper in either of those can make a career. I'd give a body part to have a paper in either journal2. But the reality is that while the research is often very high quality, the actual papers are not. Their allowed page count is too low to give a lot of details or a good bibliography, and in the scramble to be first they can sometimes be rushed and badly edited. I rarely cite a paper in Science or Nature - it's usually better to look for a longer, more thorough paper from the same group published in a normal research journal.

Newsworthiness can trump thoroughness, and people speculate that this is what happened here. From what I understand (this is not my own field) the group would have needed to conduct another series of control experiments to rule out plausible error sources, and that would have added six months or another year to the publication time. The editors may well have felt it was more important to get it out now, rather than wait another year and get scooped by a different group and different journal.

So, here are the problems, in turn: A paper gets substandard peer review, or the journal overrides the review in the interest of speed; the paper gets embargoed - kept in the dark from anybody with the competence to evaluate it - leaving journalists to interpret the results themselves, with no input from specialists; a besieged and attention-starved research organization publishes a factually wrong, hype-inducing press release that triggers a frenzy of speculation and media attention.

The one thing that's not a problem in this mess is the paper itself. Wrong papers are published all the time; that's part of how science works. We don't have peer review to catch wrong papers. It's there to catch papers that are uninteresting, or just replicating earlier results, or that have methodological or experimental problems.

You usually don't know if a paper is wrong until later - years or decades later, sometimes - when it's pitted against other results and analysed by other research groups. Einstein's theory of general relativity took four years to reach its first tentative tests and more than fifty years to get definite confirmation. The idea of an ether was around for centuries before being disproved, and nobody yet has a clue whether some version of string theory is the right description of the subatomic universe, more than forty years after it first appeared.

Now, if we'd not had an embargo, this would never have become such a big problem. The paper would have been published, people would have chimed in on the science, and there would have been no media debacle. Most journalists would probably have refrained from covering it once they'd realized the paper wasn't all that amazing, and quite possibly wrong. The only reason to embargo results is to fan media attention, and as we see, this can backfire spectacularly. If a newspaper refuses to cover a result unless it gets advance, exclusive access, then tough - don't cover it. Embargoes are a bad idea.

But if you must have an embargo, make sure that 1) everybody respects it - no advance press releases; and 2) a selection of other researchers, not just journalists, is included among the people getting advance access. That would cut the damaging hype, and it would give journalists a better basis on which to write their articles - and perhaps to decide it's not worth covering after all.

#1 And the paper, theoretically, though you'd be surprised how many science journalists have no background in science and couldn't read a research paper if their life depended on it. Rewritten press releases are too often all you get.

#2 Well... One that grows back.


Jonas said...

I wanted to think of something clever to add to this discussion, but I'm too busy at work finishing up the next over-hyped paper (not! - it's for a rather obscure Japanese journal).

But yes, I can only agree that it's sometimes sad the way the science/journalism interaction goes - and extra regrettable the way NASA has made a fool of itself this time.

I should add, though, that in my field (quantum optics/information) we do cite Nature and Science papers very often. We know that the research is usually great, and when citing them it is more often to put our own work in perspective than to refer to specific methods.

It does create a great deal of pressure to try to publish in those journals, though...

Janne Morén said...

I'm writing a paper too, which is why I've been so quiet here lately.

I guess citation frequency differs by field. It certainly seems areas like yours are publishing in NatureScience rather more often than mine.

It might have to do with page counts, actually - short papers are fine for reporting on discoveries ("we found a new planet | a new particle | life on Mars | Sarah Palin's brain!"), but inadequate for complex experiments with voluminous results. Just the procedure section of many neuroscience papers can be longer than the allowable page count in those journals. This may account for my experience that other papers are better references; neuroscience papers tend to get squeezed too hard in NatureScience and the text suffers as a result.

Jonas said...

LOL to your last discovery example :)

The page limit is definitely a factor, and sometimes it is frustrating to read a NatureScience paper that leaves a lot of possible questions unanswered because of the four page limit, incl. large figures and lots of bla-bla.

For something like the procedures, though, wouldn't it be possible to just put that in the Supporting Methods that goes online only, and concentrate the 4 pages on the general outline and main findings?

Janne Morén said...

First, let me clarify my previous comment: I do cite Nature and Science papers too, and there are some excellent papers in my field published there. But not nearly to the extent you would expect for such high-profile outlets.

Yes, people do place methods in the Supplemental Materials. But that's become a problem - when the paper is 4 pages and the Supplemental Materials are 20, and the paper is useless without the supplements, you can fairly ask whether the paper itself really is a valid publication any more.

I have heard a growing number of reviewers chafe against this trend; they signed up to review a paper, not a paper and a disordered data dump. Some (so I have heard) refuse to consider the supplemental material at all, reasoning that if it's important to understanding the paper it should be in the paper itself. I sympathise with that.

Journal of Neuroscience - the one publication I cite the most - is taking the drastic step of abolishing supplemental material altogether. It's either in the paper or not at all. On the other hand, they have no strictly enforced word limit on the main sections, so if you really do need a lot of text you can have it.

Me, I think that's taking it a bit too far. I do think all relevant bits - the methods, all experiments and so on - should be in the main text. If you can dump something into the supplemental data, then you can skip it altogether. A reviewer demanding some extra experiment, then stating that it can go into the supplemental section, should be treated by the editors as a non-request: if it really was that important, why should it not be in the main text?

But supplemental sections are good for _data_. Measurements and data dumps, model and analysis source code, raw video and images - that sort of thing. We have a net to keep stuff in, and there's no technical reason any more why we can't access the original data along with the paper describing it.

Jonas said...

Those are interesting thoughts. I understand why you would want to keep all meaningful information in the main text, but on the other hand it is also kind of unfair to those types of research that cannot easily be squeezed into 4 pages, as you said.

In the Nature and Science papers I read, there is often lots of useful information in the supplement, but mostly aimed at researchers in the same specific research field. I find that a fairly reasonable approach - it also indicates to the reader which parts of the paper are more important to read, if you don't have time for it all. Regarding the burden on the referees - if I remember correctly, they are not supposed to take the supplementary material into account in their review. That would at least make sense.

Dumping data and putting source code and such in the supplements sounds very useful, but that might be quite field-specific. I very rarely see that kind of raw data, and I have never considered providing it from my own experiments. Maybe it is less relevant for physicists - or are we just a bit behind on Open Science stuff?

Our equivalent of Journal of Neuroscience is probably Physical Review Letters, and that is still a very nice place to publish :)

Janne Morén said...

I think I've heard of it; is that the one with all the letters - Physical Review Letters A, B and so on? Yes, I would certainly not complain if I managed to get my next paper into Journal of Neuroscience either :)

I agree that supplemental material is not black and white. I do think the pendulum has swung a bit too far in that direction, though. A better approach - and the one we'll end up with, eventually - is to abandon the tight 4-page limit. There are no page limits online, and that's where all of this is heading.

I suspect dumping data and code will become commonplace sooner or later. The reality in more and more fields is that data volumes and tool complexity are increasing rapidly. You can forget about printing even the highlights of your data set if it is large (whole-genome sets, for instance), and you may need such esoteric equipment that others will have little chance to regenerate it.

And models are also becoming more and more complex, to the point where a description of a few pages will no longer be sufficient to describe the model and data analysis in enough detail that anybody can recreate it. In the end, the best description of a model, or of a data analysis, is the code itself, and when systems become complex enough it may be the only viable description.

Fun discussion - if you find yourself in Osaka or Kyoto some day, we can continue it over a beer :)

Jonas said...

Yes, as in many other areas of media and publishing, things also seem to be changing rapidly for research publication. Who knows what form it will have 20 years from now? Both in terms of format, as we discussed here, and how it will be distributed - will it still be the big publishing houses, or will we have moved to some kind of open, social network-inspired system?

I'm sure it would be an even more interesting discussion after a few beers :)
I'll let you know if I get to Kansai (it does not happen often, though). You're also more than welcome to send me a shout if you come by Tokyo with time to spare.

PS. There is Physical Review A-E, and then there is Phys. Rev. Lett. which carries higher impact results. Letters are strictly limited to 4 pages as well, while papers in the normal journals can be significantly longer. It's not uncommon to follow up a PRL paper with a longer one in one of the A-E journals with more details.