Poster #RP118
The Devil is in the Annotation Details: Examples from Secreted Rat Ly-6 Proteins
Christopher Southan*
*AstraZeneca R&D, Mölndal, Sweden
Proteomic data from rat urine (PMID 11840564) resulted in Swiss-Prot entries for secreted Ly-6 proteins (P81827, P81828, P83121) that delineated new protein families in the mouse and rat genomes. However, bioinformatic analysis uncovered a range of problematic annotations and protein database entries. Searches with P81827 revealed three chimeras, i.e. mRNAs that each map to two rat chromosomes, and a pre-mRNA. One of the chimeras, L07806, resulted in a protein chimera (gi204682) that was erroneously chosen as an NCBI reference sequence (gi6978563) but later corrected (NP_037047.2). Fortunately, the Swiss-Prot entry (Q03344) highlighted the problem as a conflict feature. The pre-mRNA AF368860 was translated by the submitter as an extended first exon of only 28 aa that is now enshrined in TrEMBL entry Q91XP0, gi14388973 and NP_001030122. Because it maps uniquely, this artefact has not only propagated to Ensembl, the Rat Genome Database and Entrez Gene but is also misnamed Rup2, a name that should have been given to the 110 aa protein of P81828. An additional quirk of these proteins is that their Ly-6 domain, easily recognised manually from cysteine alignments, falls below the recognition thresholds for InterPro entry IPR001526. Details of these, and additional unforeseen consequences of automated pipelines that highlight the necessity for Swiss-Prot-style expert annotation, will be presented.
