Dec 04 2012

Today I got an email from David E. Schindel, who is the Executive Secretary of the Consortium for the Barcode of Life, announcing Google funding for DNA barcoding. The project aims to create a reference library of endangered species COI sequences so that DNA barcoding can be used as a tool against wildlife trafficking. Good for them, this is a good use of money.

However, I was shocked to read later in the email:

DNA barcoding is a technique developed at a Canadian university for identifying species using a short, standardized gene sequence

What? Either this was typed in a bad moment and not checked, or we have entered the world of barcoding political spin. I assume that ‘at a Canadian university’ refers to Guelph, where the Canadian Centre for DNA Barcoding is based, led by Paul Hebert.

The problem is that this Canadian group didn’t invent barcoding, neither the name nor the discipline. I can’t really go into a detailed history of DNA barcoding in this post, but the statement in this email makes me squirm, just like when I hear politicians take credit for natural events or someone else’s work. But the meme is out there; the Consortium for the Barcode of Life begins:

In 2003, researchers at the University of Guelph in Ontario, Canada, proposed ‘DNA barcoding’ as a way to identify species.

I don’t want to deny Paul Hebert’s contribution, nor that of the barcoding organisations. They have together popularised, formalised, extended and refined DNA barcoding. DNA barcoding is a force for good in the world and they have explained it beautifully to many diverse biologists, gained funding for several large studies, and refined the methodologies. Good for them.

I would like someone unconnected to the international barcoding groups to write a history of the discipline in a broad context, not just the projects labelling themselves ‘DNA barcoding’. The origins of the methodology and approach probably lie with the bacterial 16S sequencers like Norm Pace. They used short standardised gene segments to identify species. Although some bacterial projects were undoubtedly environmental surveys, assigning taxa into molecular clusters with little extra biological information, many others incorporated well-characterised reference strains, which is exactly what most people would describe as DNA barcoding. Jonathan Eisen has an article (“Barcoding” researchers keep ignoring microbes) of relevance here; make sure to read the comments. The first use of the exact term “DNA barcoding” is unclear to me, and may possibly be in the classic Hebert paper (12614582), although Blaxter used something essentially the same in the title of his 2002 paper “Molecular barcodes for soil nematode identification”, which also employed a short standardised segment of 18S rRNA (11972769). Although there are some who dismiss these sorts of similarity-based groupings as ‘environmental surveys’ like those used for bacteria, Floyd et al also use a phylogenetic approach to link their environmental sequence clusters (MOTUs) to known, classically described species that have been identified through morphology and vouchers lodged in museums (see Fig 4 in Floyd et al 2002). This is DNA barcoding and differs from typical studies only in the reference locus used. Ritz and Trudgill cited Blaxter as talking about a ‘molecular bar-code’ a few years earlier, in 1999 (Ritz K and Trudgill DL 1999 Plant and Soil 212: 1–11).

Baker and Palumbi (1999) tree identifying whale meat samples by comparison to whale voucher specimen sequences.

So what about mtDNA studies? Well, I haven’t done real research, I’m just trying to remember stuff, and I would be delighted to hear of examples in the comments. It wouldn’t surprise me at all to find that John Avise’s group (pioneers of mtDNA analysis) had used mtDNA to match unknown samples to voucher specimens. They tended to use whole mtDNA and RFLPs rather than sequencing, though. Would that still count? What do you think? Certainly Silberman and Walsh (1364049) were identifying lobster larvae by RFLPs of PCR-amplified rRNA early on. Does that count? Alan Wilson’s lab developed some of the first ‘universal’ mtDNA primers used in ecology and evolution (2762322), and again I wouldn’t be surprised to learn that they had assigned unknown specimens to type by DNA barcoding. But they usually chose cytochrome b or 12S rRNA, so would that still count?

A classic DNA barcoding study was published in Science in 1994 (17801528). The authors took ‘whale’ meat samples from Japanese markets and tried to identify which species they really belonged to. This is almost identical to many classic DNA barcoding studies (10.1016/j.foodres.2008.07.005), except that they used a standardised section of the mitochondrial control region rather than COI. I could also mention Hoelzel (2001) “Shark fishing in a fin soup”, who identified the species present in shark fin soup by comparing cytb and NADH2 sequences to the database.

So what about COI? Folmer et al designed some of the earliest (and best) COI universal primers (7881515). These are great primers and still the most commonly used for DNA barcoding. I was unaware of the Folmer primers when I designed my own universal primers (Lunt 1994 PhD thesis) (8799733), and several labs were doing this. In Godfrey Hewitt’s lab at UEA we had up to that point been using conserved mtDNA primers from Richard Harrison’s lab at Cornell (they were in pairs named after US presidents and their wives). We weren’t barcoding; the primers were being used for phylogeography, phylogeny and molecular evolution studies. This background just illustrates that COI primers had been around and used widely in all types of evolutionary biology for over a decade before the famous Hebert et al 2003 paper. So had anyone used DNA sequencing of COI with universal primers to match unknown specimens to described vouchered species? Had anyone used this approach to discover and describe cryptic species (another important aspect of DNA barcoding)? Definitely, probably lots of people! A study I designed with Africa Gomez and published in 2002 did exactly this (12206243). We had known rotifer isolates characterised by morphology, mating, ecology etc. We had lots of unknown eggs and identified them using a phylogenetic analysis of COI with the standard barcoding primers. Were we the first? Definitely not. We never thought for a minute that we were the first to do this, but I couldn’t tell you who was. Let me just repeat that: we were NOT the first, we did NOT invent DNA barcoding, not even in animals. I just wish people would stop claiming to have ‘invented’ DNA barcoding and instead understand the context in which their work stands. I doubt very much that DNA barcoding in any meaningful sense had a single origin. It was not a moment of inspiration, it was incremental change, as almost all scientific advance is.

If you know any good science journalists please buy them beers and persuade them to write the history of ‘DNA barcoding’ in the wide sense, and especially of the work of the bacterial 16S pioneers, I’d like to read that.


Sep 29 2011

I just noticed that an old publication of mine has a typo in the title on the Nature website: “Animal mitochondrial DMA recombination”. Anyone want to guess what the error is? Web of Science has the correct title, PubMed has the correct title, and it used to be correct on the Nature website, but somehow they have gone there and broken it. It’s an easy typo to make, M and N are adjacent keys. Easy if you are retyping stuff, but who on earth retypes journal article titles? You are professionals, copy and paste it from a reliable source. Jesus.

What makes it worse is that when this paper (the paper that got me a job) first came out early on the Nature website they had spelled my name wrong! Adjacent keys typo again. They fixed it eventually, but it took a while. When you are a postdoc applying for jobs it only adds to the stress.

So I decided to email them to let them know. After about 10 minutes looking for an appropriate way to contact them on the website I’ve just given up. Hey, I’ve got a job now, and it makes Nature look stupid, not me.

Lunt, D. H., and B. C. Hyman (1997) Nature 387:247-247. PDF Animal mitochondrial DNA recombination (no doi in my citations anymore as it points to a webpage with a stupid unprofessional typo).

Feb 03 2008

Well no, I guess they’re not. But I’ve been thinking about whether some types of book publishing are even worthwhile. So, what’s my complaint? Well, in many cases I don’t see why either publishers or paper books are of any use. I think researchers can easily self-publish PDFs that are open access, free and may even have higher production values than many books I’ve seen recently.

Multi-author edited books can be great. They can take a long time to get to print though, are not picked up by Google searches and are usually VERY expensive. Is this even worthwhile? So, you have a chapter in an edited volume. It probably took a while but it’s published. The book costs maybe 100 pounds and isn’t publicized heavily by the publisher (well, they aren’t going to sell many whatever they do). It’s bought by a few people in the area and a number of university libraries, then it’s out of date and forgotten. Sad.

Alternatively, you write up your chapter as before, save it as a PDF, then upload it to a public repository where anyone can download it for free. Total time involved in publishing: minutes. Readership will be much higher because it’s a free PDF, fully visible to Google. Of course people are going to download it!

“But what about peer review?” Well, frankly, books are not often peer reviewed to the same standards as journal articles. Research in peer-reviewed journals is different to that in books. In self-publishing, though, you could always ask a colleague to send it out for anonymous review. Books should have a statement describing their peer review process (if any) but they don’t. If you’re self-publishing you should do this.

“What about how nice it is to have a beautiful book sitting on my shelves, or in the library?” OK, I sympathize. Books do look nice, and they keep people making bookshelves in employment. But once you have your work finished you can upload it to an online publisher and get any number of “proper books” sent to you, much, much cheaper than typical publishing. If anyone else needs a hard copy to take on holiday they can just download your published book, upload it to the online publisher and get a single (cheap) copy mailed to them. Have a look at Lulu or Blurb. Upload a PDF to Lulu and get a single 200 page book for about £20 (US$35). Sell your book in their online bookstore if you want.

“But how could I cite it?” If you upload a PDF to, say, Nature Precedings you will get a permanent DOI. Lulu offers an ISBN too.

So I’ve been burned in the past. It took 5 years after our chapter was reviewed and accepted (“in press”) for it to appear in print. Exceptionally long? Yes. It took so long because of editorial ineptitude along with publisher sloth, change of publisher, more sloth. But what if it was just 9 months, would that 9 month dead time be OK? No it wouldn’t, you can self-publish in a few minutes. Why delay things for 9 months? We made the naive mistake of putting new stuff into the chapter, but after so many years in press it wasn’t new or even a good review of the literature any more. Gómez A, Lunt DH (2007) Refugia within refugia: patterns of phylogeographic concordance in the Iberian Peninsula. In: Phylogeography of Southern European Refugia (eds. Weiss S, Ferrand N). Springer. It sells for £92 (US$182) by the way, save your money.

I like books, no really I do. But I’m not sure many publishers serve research scientists as well as we want. Why should they? They’re companies with shareholders! There are parallels here both to the world of open-access publishing and to bands who avoid signing to a record label and just load their music onto iTunes. I will never publish a chapter in an edited book again. The revolution is coming….