DNA sequences are the bedrock of molecular taxonomy and phylogenetics. Alarmingly, we find that 20% of the reptilian sequences in GenBank's DNA database cannot be mapped to actual species or subspecies.

Using the Reptile Database (see go.nature.com/2pgkdbw), we investigated how many taxa in the taxonomic database of the US National Center for Biotechnology Information (NCBI) could be mapped to accepted species. We found that just 60% of the 10,510 species in the Reptile Database are represented by a taxon in the NCBI database. Moreover, 1,704 reptile names in NCBI (19.5% of all reptile NCBI names) did not match any currently accepted species name in the Reptile Database.
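As a rough illustration of the kind of comparison behind these two percentages, the following Python sketch matches one list of names against another by exact string and reports coverage in each direction. It is a minimal sketch under stated assumptions, not the procedure actually used: the function name `compare_name_sets` and the toy name lists are hypothetical, and a real comparison would also need to resolve synonyms, subspecies and spelling variants.

```python
def compare_name_sets(accepted_names, ncbi_names):
    """Compare two collections of taxon names by exact string match.

    Hypothetical helper for illustration; it does no synonym resolution.
    """
    accepted = {name.strip() for name in accepted_names}
    ncbi = {name.strip() for name in ncbi_names}

    matched = accepted & ncbi          # names present in both sources
    unmatched_ncbi = ncbi - accepted   # NCBI names with no accepted counterpart

    return {
        "accepted_covered_pct": 100 * len(matched) / len(accepted),
        "ncbi_unmatched_pct": 100 * len(unmatched_ncbi) / len(ncbi),
        "unmatched_ncbi": sorted(unmatched_ncbi),
    }


# Toy inputs for illustration only; these are not real database extracts.
accepted_names = [
    "Eremias grammica",
    "Lacerta agilis",
    "Anolis carolinensis",
    "Chelonia mydas",
    "Python regius",
]
ncbi_names = [
    "Eremias sp. TMT-2004a",
    "Lacerta agilis",
    "Anolis carolinensis",
    "Chelonia mydas",
]

report = compare_name_sets(accepted_names, ncbi_names)
print(report)
```

In this toy case, three of the five accepted names are found in the second list, and one of its four names (the provisional designation Eremias sp. TMT-2004a) has no accepted counterpart.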

Sloppy practice by authors and journals contributes to these discrepancies. We found that 1,037 species names in published papers either were not fully spelled out (for example, the lizard Eremias grammica has been designated as Eremias sp. TMT-2004a) or had not been updated in either GenBank or the NCBI taxonomic database after publication of their DNA sequence.

The problem is likely to get worse as metagenomic and DNA-barcoding studies proliferate. For the sake of the international biodiversity community, we urge authors of taxonomic papers to lodge and update their DNA sequences and the associated taxonomic information in the relevant databases.