Mar 16 2013

In today’s Guardian newspaper geneticist Steve Jones has a short column replying to a 7 year old child who had asked “Will humans evolve into a new species?”. Jones is known in the UK as the media’s favourite geneticist and evolutionary biologist; he is a frequent guest on media shows and contributor to print media. Unfortunately, although very polished, and far from incompetent, he really isn’t very good with the details. He seems to be a self-confident man and often promotes his personal (not very mainstream) views at the expense of what evolutionary geneticists in general think. I don’t like this much, especially when the places he does it are looking for science information as currently understood rather than any one person’s views.

Replying to the 7 year old today he first talked about how the speciation process is driven primarily by natural selection (I’m not going to address that in this post, though many would be uncomfortable with that idea too). In the second part of the column he goes on to trot out his view that evolution has stopped for humans. I’m not going to pick apart this silly idea here, though many others have; I really just want to encourage him to publish as soon as possible. I haven’t found any academic paper in which he puts forward this view, though he has been talking about it in the media for approximately 20 years. If this idea were true it would be important, very important, and very interesting. I would love to read that paper. He should gather his evidence and publish it as soon as possible in a peer reviewed open access scientific journal. Or else shut up.

Some other scientists’ views on Steve Jones’ ideas:

Human evolution stopping? Wrong, wrong, wrong
No Virginia, evolution isn’t ending
Evolution, why it still happens (in pictures)
Steven Jones is being silly
Not the end of evolution again!
Some comments on Steve Jones and human evolution

Mar 10 2013

I’ve been thinking about sustainable and accessible archiving of bioinformatics software. I’m pretty scandalized at the current state of affairs, and have complained about it before. I thought I’d post some links to other people’s ideas and talk a bit about the situation and the action that is needed right now.

Casey Bergman wrote an excellent blog post (read the comments too) and created the BioinformaticsArchive on GitHub. There is a Storify of tweets on this topic.

Hilmar Lapp posted on G+ on the similarity of bioinformatics software persistence to the DataDryad archiving policy implemented by a collection of evolutionary biology journals. That policy change is described in a DataDryad blog post, and the policies are available with links to the journal editorials.

The journal Computers & Geosciences has a code archiving policy and provides author instructions (PDF) for uploading code when the paper is accepted.

So this is all very nice, many people seem to agree it’s important, but what is actually happening? What can be done? Well Casey has led the way with action rather than just words by forking public GitHub repositories mentioned in article abstracts to BioinformaticsArchive. I really support this but we can’t rely on Casey to manage all this indefinitely, he has (aspirations to have) a life too!

What I would like to see

My thoughts aren’t very novel, others have put forward many of these ideas:

1. A publisher driven version of the Bioinformatics Archive

I would like to see bioinformatics journals taking a lead on this. Not just recommending but actually enforcing software archiving just as they enforce submission of sequence data to GenBank. A snapshot at time of publication is the minimum required. Even in cases where the code is not submitted (bad), an archive of the program binary is needed so it can actually be found and used later. Hosting on authors’ websites just isn’t good enough. There are good studies of how frequently URLs cited in the biomed literature decay with time (PMID 17238638) and the same is certainly true for links to software. Use of the standard code repositories is what we should expect from authors, just as we expect submission of sequence data to a standard repository, not hosting on the authors’ website.

I think there is great merit to using a GitHub public repository owned by a consortium of publishers and maybe also academic community representatives. Discuss. An advantage of using a version control hosting service like GitHub is that it would apply not-too-subtle pressure to host code rather than just the binary.

2. Redundancy to ensure persistence in the worst case scenario

Archive persistence and preventing deletion is a topic that needs careful consideration. Casey discusses this extensively; authors must be prevented from deleting the archive either intentionally or accidentally. If the public repository was owned by the journals’ “Bioinformatics Software Archiving Consortium” (I just made up this consortium, unfortunately it doesn’t exist) then authors could not delete the repository. Sure they could delete their own repository, but the fork at the community GitHub would remain. It is the permanent community fork that must be referenced in the manuscript, though a link to the authors’ perhaps more up to date code repository could be included in the archived publication snapshot via a wiki page, or README document.

Perhaps this archive could be mirrored to BitBucket or similar for added redundancy? FigShare and DataDryad could also be used for archiving, although that would be a suboptimal reinvention of the wheel for code. I would like to see the FigShare and DataDryad guys enter the discussion and offer advice since they are experts at data archiving.
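To make the mirroring idea a bit more concrete, here is a minimal sketch of what a publication-time snapshot could look like with plain git. It only builds the commands and prints them rather than running anything. The repository URLs, the function name and the working directory name are all hypothetical, and it uses a mirror clone rather than GitHub's own fork button, which is a different mechanism from what BioinformaticsArchive does.

```python
import shlex
from typing import List


def mirror_commands(source_url: str,
                    archive_urls: List[str],
                    workdir: str = "snapshot.git") -> List[List[str]]:
    """Build the git commands needed to copy one repository to several archives.

    Nothing is executed here; the caller decides whether to run the commands.
    All URLs passed in are supplied by the caller (hypothetical in this sketch).
    """
    # A mirror clone copies every branch and tag, not just the default branch.
    commands = [["git", "clone", "--mirror", source_url, workdir]]
    for url in archive_urls:
        # Push all refs from the local mirror to each archive location.
        commands.append(["git", "-C", workdir, "push", "--mirror", url])
    return commands


if __name__ == "__main__":
    # Hypothetical locations: the authors' repository, a community archive
    # on GitHub, and a second copy on BitBucket for redundancy.
    source = "https://github.com/example-author/example-tool.git"
    archives = [
        "https://github.com/example-archive/example-tool.git",
        "https://bitbucket.org/example-archive/example-tool.git",
    ]
    for command in mirror_commands(source, archives):
        print(" ".join(shlex.quote(part) for part in command))
```

The destination repositories would need to exist already and be owned by the archive rather than the authors, which is the whole point: the authors can delete their own copy but not these.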

3. The community to initiate actual action

A conversation with the publishers of bioinformatics software needs to be started right now. Even just PLOS, BMC, and Oxford Journals adopting a joint policy would establish a critical mass for bioinformatics software publishing. I think maybe an open letter signed by as many people as possible might convince these publishers. Pressure on Twitter and Google+ would help too, as it always does. Who can think of a cool hashtag? Though if anyone knows journal editors, an exploratory email conversation might be very productive too. Technically this is not challenging; Casey did a version himself at BioinformaticsArchive. There is very little if any monetary cost to implementing this. It wouldn’t take long.

But can competing journals really be organised like this? Yes, absolutely for sure, there is clear precedent in the 2011 action of >30 ecology and evolutionary biology journals. Also, forward-looking journals will realize it is in their interests to make this happen. By implementing this they will seem more modern and professional by comparison to journals not thinking along these lines. Researchers will see strict archiving policy as a reason to trust publications in those journals as more than just ephemeral vague descriptions. These will become the prestige journals, because ultimately we researchers determine what the good journals are.

So what next? Well I think gathering solid advice on good practice is important, but we also need action. I’d like to see discussions with the relevant journals ASAP. I’m really not sure if I’m the best person to do this, and there may be better ways of doing it than just blurting it all out in a blog like this, but we do need action soon. It feels like the days before GenBank, and I think we should be ashamed of maintaining this status quo.


Sep 29 2011

I just noticed that an old publication of mine has a typo in the title on the Nature website “Animal mitochondrial DMA recombination”, anyone want to guess what the error is? Web of Science has the correct title, Pubmed has the correct title, and it used to be correct on the Nature website, but somehow they have gone there and broken it. It’s an easy typo to make, M and N are adjacent keys. Easy if you are retyping stuff- but who on earth retypes journal article titles? You are professionals, copy and paste it from a reliable source. Jesus.

What makes it worse is that when this paper (the paper that got me a job) first came out early on the Nature website they had spelled my name wrong! Adjacent keys typo again. They fixed it eventually, but it took a while. When you are a postdoc applying for jobs it only adds to the stress.

So I decided to email them to let them know. After about 10 minutes looking for an appropriate way to contact them on the website I’ve just given up. Hey, I’ve got a job now, and it makes Nature look stupid not me.

Lunt, D. H., and B. C. Hyman (1997) Nature 387:247-247. PDF Animal mitochondrial DNA recombination (no doi in my citations anymore as it points to a webpage with a stupid unprofessional typo).

Jul 18 2011

There have been several obituaries for Horace Judson recently [1][2], and today Larry Moran, in an excellent Sandwalk blog post, talked about molecular biologists’ lack of knowledge of the history of their field:

modern researchers are completely unaware of the history of their field. That’s partly because the work on bacteria and bacteriophage—where the basic concepts were often discovered—is no longer taught in biochemistry and molecular biology courses. This leads to the false idea, as expressed in the press release, that all new discoveries in eukaryotes are truly new concepts that nobody ever thought of before. The solution to this problem is to make all students read The Eighth Day of Creation.

I liked the quote from John Hawks too:

I suppose we could rephrase Santayana: Those who ignore history feel privileged to reinvent it.

Judson wrote the truly epic book “The Eighth Day of Creation: Makers of the Revolution in Biology” which describes in detail the development of molecular biology from extensive interviews with its early pioneers. It’s a great read, his writing style is easy and absorbing, and the content fascinating. Despite not having yet finished the book, I can recommend it very highly indeed. What? Wait, you haven’t finished the book yet? How good can it really be? It’s a great book, but one that suffers from poor publishing by Cold Spring Harbor Press. Let me get my excuses out of the way now; I’m really busy, have little time for reading things that aren’t journal articles, and have a big backlog of other books to read. Yet these aren’t the real reasons. The real reasons are that it is enormous and only comes as a paper copy. The book, at 714 pages, is very weighty and thick even as a paperback. It is about as thick as a single volume of this size can be, and of course the pages themselves don’t open out very flat. It is pretty heavy and I have decided not to take it on holiday with me based on this alone. That is a shame, as holidays are when I catch up on reading.

There is a simple solution however – release it as an eBook. I would love to read this as a Kindle book on my iPad and be able to take it anywhere and just dip into it. It wouldn’t matter then how long it was. What is more I would be able to look stuff up when sitting in seminars and journal clubs, just quickly checking the history of a topic. Lastly I would like to be able to highlight and comment on sections. I have an absolute phobia of writing in books, I just can’t do it. Somehow (almost religiously) I know it is just plain wrong, even though I can’t think of a single reason why. I have no such qualms about marking up an eBook however, highlighting sections and adding notes. These notes and highlighted sections are searchable and easily found again- very useful indeed.

Although I really agree with Larry Moran’s concluding sentence “The solution to this problem is to make all students read The Eighth Day of Creation” I think that the chances are remote without good modern publishers helping the process along. Do something useful today, go to the Amazon webpage of Eighth Day of Creation and click on the link (usually just under the picture) to request a Kindle version from the publisher.

Jun 20 2011

I hope I’m going to submit my PhD student’s first comparative genomics paper very soon. Three of us have written the manuscript collaboratively using Google Docs. GDocs is an online word processor and although I’ve used it quite a bit before, this is the first time I’ve used it to write a manuscript with colleagues. It’s been (almost) excellent; here’s the review.

I’ve had a number of manuscript experiences where I’ve spent long hours trying to collate different authors’ contributions into the same Word document. The idea of using GDocs is that multiple authors can have the same document open at the same time making changes without any conflicts or the whole thing crashing. You never have to ask “which is the live copy?” since there is only one copy.

Good parts

  • Nobody has had to collate mutually incompatible versions into one document and circulate (again) for people to check.
  • It is a clean GUI and a pleasure to use.
  • There is a good comment system and these are supplemented with a realtime discussion panel, just like Skype or other IM client.

Less good parts

  • It doesn’t work well in some older browsers. Tell collaborators to use Chrome, otherwise they may complain that it’s a bit rubbish and doesn’t work properly. Using Chrome there are few to no problems (i.e. better than Word).
  • You cannot use any sensible reference software. Mendeley, Zotero or any of the other reference managers you know will not allow you to insert references and format a bibliography the way you would in Word.
  • Track changes is not as good as in Word.

[Figure: Google Docs revision history]

Overall it’s been great I think. There are a few things I really wish were different. Track changes could be easily improved to identify who has done what. Yes, versions of the document can be compared, and rolled back to previous versions, both of which are useful, but none of it is quite as obvious and easy to use as in Word. More than anything I really wish that reference management was better. We have been typing in placeholders (Smith 2000) and then exporting the document as a Word file and introducing the citations using a reference manager before submission. This sounds bad. Why not just do everything in Word? Well, even today, two of us were making some last minute changes on the Word version, each copy with someone’s initials appended, and somebody tomorrow has to reconcile it all.
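If you go the placeholder route, it helps to have a list of every placeholder before you start swapping in real citations. A minimal sketch follows, assuming the manuscript has been saved as plain text and that placeholders follow a simple author-year form like (Smith 2000). The regular expression and the function name are my own assumptions for illustration, not something Google Docs or any reference manager provides.

```python
import re
from typing import List

# Matches simple author-year placeholders such as (Smith 2000),
# (Lunt and Hyman 1997) or (Smith et al. 2000a). This pattern is an
# assumption for illustration; adjust it to your own placeholder style.
PLACEHOLDER = re.compile(
    r"\(([A-Z][A-Za-z'\-]+(?: (?:and|&) [A-Z][A-Za-z'\-]+| et al\.?)? \d{4}[a-z]?)\)"
)


def find_placeholders(text: str) -> List[str]:
    """Return placeholder citations in order of first appearance, without duplicates."""
    seen: List[str] = []
    for match in PLACEHOLDER.finditer(text):
        citation = match.group(1)
        if citation not in seen:
            seen.append(citation)
    return seen


if __name__ == "__main__":
    # Example text; in practice this would be read from the exported manuscript.
    draft = (
        "Recombination was first reported (Lunt and Hyman 1997) and later "
        "reviewed (Smith 2000). The pattern is clear (Figure 2) and has been "
        "confirmed since (Smith 2000)."
    )
    for citation in find_placeholders(draft):
        print(citation)
```

It deliberately ignores things like (Figure 2), and it will miss placeholders that pack several citations into one set of brackets, so treat the output as a checklist rather than a guarantee.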

I think Google Docs if adopted widely would have a great impact on writing multi-authored manuscripts. I don’t think it will be very widely used in science though unless reference managers can integrate with it properly. Despite this I have really enjoyed writing a manuscript with it, and, even though it has to be passed through MS Word at the end, on the whole I’ve much preferred it.

Apr 12 2008

I’ve heard a number of people saying anonymous peer review is broken and we need a different publishing model. One where reviewers cannot hide behind their anonymity. I don’t think it is broken. Actually I think anonymity is essential, one of the few things that protects science from politics and human nature. Sure there can be other ways of reviewing, and variety is not so bad. I am following the progress of journals that have other models with interest. The tone of much on the web, however, is that anonymous peer review is bad. I think this is an important topic, and I outline below why I think we need to protect it and even celebrate it.

I would have serious concerns about reviewing if it was not anonymous. I think what I say would be different, and in some ways less honest. Not in the way sometimes implied, that reviewers “hiding behind” their anonymity can unfairly trash papers with no comeback. Good editors can guard against this to some extent. Instead I would feel more exposed; the anonymity is my protection, the thing that allows me to be honest. I cannot imagine saying in a signed review to someone who knows me (as I sometimes should) that their work is trivial, that the importance they claim for their work is not supported by evidence, and that they are not well-read enough in their apparent area of expertise. It would probably not be possible to collaborate after that. Yet it sometimes needs to be said. I have had some of these things said to me and I have had to take a long hard look at my work, and make sure I could rationally justify the importance and quality of the work. I am a better scientist for having had to carefully face those questions. I have also said these things (politely I hope) to others in reviews. How people react to blunt comments is, as we all know from everyday life, very variable. Some people hold grudges, some people are aggressive, some people look for revenge for supposed slights, others misinterpret your motives, others resort to criticism of your weaknesses rather than addressing their own. This isn’t good for science. Anonymous review reduces the effects of all this.

I’m not a weak or nervous person. I’m not easily scared or intimidated by anyone. Rightly or wrongly I do not feel particularly worried about my career, nor feel much need for accolades or approval. Yet I still really need the protection provided by anonymous review. I think it is an integral part of my subconscious honesty as a reviewer.

Different fields of biology are very different in the personalities they collect. Some disciplines within biology take constructive criticism constructively, with healthy debate, and evidence-based progress. In other areas the standards are not the same and people respond very differently. I have worked in quite a number of different research and taxonomic areas with some very open, with a healthy reaction to debate and alternative views. Others are not like this, as I have learned to my cost. Alternative views are treated aggressively like religious heresy. Incorporation of rigorous experimental design and hypothesis testing are rejected because the answer is “obvious” or apparently proven, and leading people have usually staked a lot on the current ideas. I could speculate on the psychology I think underlies this, but this isn’t the place. There is one area I work in occasionally only because I know I can abandon it at any point and get on with other things if the huge egos and paranoia start to drown out my ability to do stuff. Some colleagues are specialists and not so fortunate in their ability to resist these pressures since their entire research careers will be spent in the company of the same people. Surely that would influence open identity reviews. It is dangerous to extrapolate from one’s own confidence and security in a positive discipline to what other scientists experience day to day in other areas of science.

It seems to me that we must compare the risks of two things
(1) Reviewers hiding behind their anonymity to maliciously reject manuscripts for reasons of personal prejudice or gain
(2) Reviewers failing to review to the best of their abilities due to either conscious or subconscious concerns of the consequences of their review

It’s hard to quantify and assign significance to the relative risks. Gut feelings and anecdotal evidence are largely what we have. My gut and my experience tell me (2) is much more frequent than (1). My concern that provoked writing this is that I have seen lots of comments on this topic but none has strayed far outside considering situation (1). This does not represent the breadth of the necessary debate. I am actually proud of the concept of anonymous review. I think it is something that sets science apart from other areas of life. It could be improved with reciprocally blind review, but it is a fine thing, it is saying ‘tell the truth, and you need not worry about the consequences of that truth’. I think it is something important to defend.

Feb 12 2008

OK, so I was going to write about bioinformatics and phylogenetics in this blog, and here is my second post already that ignores both!
I just read a paper testing the relationship between beer consumption and publication output of ecologists in the Czech Republic. Apparently the higher your consumption the lower your total number of publications, total number of citations and citations per paper. Damn.
One of my colleagues just said she was very surprised that there wasn’t a positive relationship. I wouldn’t have been entirely surprised if the relationship had been the other way either- I’ve always thought most “thought science” was actually done in pubs and coffee rooms. But it seems that might not promote actually doing the experiment or writing the stuff up.

Grim, T (2008) “A possible role of social activity to explain differences in publication output among ecologists” Oikos, doi: 10.1111/j.2008.0030-1299.16551.x
“Publication output is the standard by which scientific productivity is evaluated. Despite a plethora of papers on the issue of publication and citation biases, no study has so far considered a possible effect of social activities on publication output. One of the most frequent social activities in the world is drinking alcohol. In Europe, most alcohol is consumed as beer and, based on well known negative effects of alcohol consumption on cognitive performance, I predicted negative correlations between beer consumption and several measures of scientific performance. Using a survey from the Czech Republic, that has the highest per capita beer consumption rate in the world, I show that increasing per capita beer consumption is associated with lower numbers of papers, total citations, and citations per paper (a surrogate measure of paper quality). In addition I found the same predicted trends in comparison of two separate geographic areas within the Czech Republic that are also known to differ in beer consumption rates. These correlations are consistent with the possibility that leisure time social activities might influence the quality and quantity of scientific work and may be potential sources of publication and citation biases.”

Feb 03 2008

Well no, I guess they’re not. But I’ve been thinking about whether some types of book publishing are even worthwhile. So, what’s my complaint? Well, in many cases I don’t see why either publishers or paper books are of any use. I think researchers can easily self-publish PDFs that are open access, free and may even have higher production values than many books I’ve seen recently.

Multi-author edited books can be great. They can take a long time to get to print though, are not picked up by Google searches and are usually VERY expensive. Is this even worthwhile? So, you have a chapter in an edited volume. It probably took a while but it’s published. The book costs maybe 100 pounds and isn’t publicized heavily by the publisher (well they aren’t going to sell many whatever they do). It’s bought by a few people in the area and a number of university libraries, then it’s out of date and forgotten. Sad.

Alternatively, you write up your chapter as before, save it as PDF then upload it to a public repository where anyone can download it for free. Total time involved in publishing: minutes. Readership will be much higher because it’s a free PDF, fully visible to Google. Of course people are going to download it!

“But what about peer review?” Well frankly books are not often peer reviewed to the same standards as journal articles. Research in peer-reviewed journals is different to that in books. In self-publishing though you could always ask a colleague to send it out for anonymous review. Books should have a statement describing their peer review process (if any) but they don’t. If you’re self-publishing you should do this.

“What about how nice it is to have a beautiful book sitting on my shelves, or in the library?” OK I sympathize. Books do look nice, and they keep people making bookshelves in employment. But once you have your work finished you can upload it to an online publisher and get any number of “proper books” sent to you, much much cheaper than typical publishing. If anyone else needs a hard copy to take on holiday they can just download your published book, upload to the online publisher and get a single (cheap) copy mailed to them. Have a look at Lulu or Blurb. Upload a PDF to Lulu and get a single 200 page book for about £20 (US$35). Sell your book in their online bookstore if you want.

“But how could I cite it?” If you upload a PDF to say Nature Precedings you will get a permanent doi number. Lulu offers an ISBN too.

So I’ve been burned in the past. It took 5 years from our chapter being reviewed and accepted (“in press”) to it appearing in print. Exceptionally long? Yes. It took so long because of editorial ineptitude along with publisher sloth, change of publisher, more sloth. But what if it was just 9 months, would that 9 month dead time be OK? No it wouldn’t, you can self-publish in a few minutes. Why delay things for 9 months? We made the naive mistake of putting new stuff into the chapter, but after so many years in press it wasn’t new or even a good review of the literature any more. Gómez A, Lunt DH (2007) Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. In: Phylogeography of Southern European Refugia (eds. Weiss S, Ferrand N). Springer. It sells for £92 (US$182) by the way, save your money.

I like books, no really I do. But I’m not sure many publishers serve research scientists as well as we want. Why should they? They’re companies with shareholders! There are parallels here both to the world of open-access publishing and to bands who avoid signing to a record label and just load their music onto iTunes. I will never publish a chapter in an edited book again. The revolution is coming….