Poster #RP106
Unifying UniProt: tissue vocabulary using eVOC
Oliver Hofmann*, Adele Kruger*, Minna Lehvaslaiho*, Serenella Ferro Rojas**, Eric Jain**, Amos Bairoch**, Win Hide*
*South African Bioinformatics Institute, Bellville, South Africa; **Swiss Institute of Bioinformatics, Geneva, Switzerland
Tissue information within the UniProt Knowledgebase is presently an eclectic set of compound concepts that includes diseases, developmental terms and anatomical structures. Using the SAEL (SOFG Anatomy Entry List) core concepts, we mapped the UniProt mammalian tissue vocabulary to the eVOC ontology system.
The eVOC system consists of four core orthogonal controlled vocabularies that unify gene expression data by facilitating a link between the genome sequence and expression phenotype information. Expression terms are linked to libraries and transcripts, enabling the creation of profiles of gene expression states across the genome. The system is being used by data repositories such as ENSEMBL, FANTOM3 and the AltSplice database.
By applying the eVOC terminology to the nearly 700 terms describing mammalian tissue sources in the UniProt Knowledgebase, we were able to map more than 80% of the terms unambiguously to single or compound eVOC concepts.
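The distinction between single and compound mappings can be illustrated with a minimal sketch. The vocabulary entries, concept identifiers, synonym table and the `map_term` and `coverage` helpers below are all hypothetical and are not taken from eVOC or UniProt. The sketch only shows how a free-text tissue term might resolve to one concept, to a combination of concepts, or remain unmapped.

```python
from typing import Dict, List, Tuple

# Hypothetical controlled vocabulary: lower-cased label -> concept identifier.
# The identifiers are placeholders, not real eVOC accessions.
CONCEPTS: Dict[str, str] = {
    "liver": "EX:0001",
    "brain": "EX:0002",
    "fetus": "EX:0003",
    "tumor": "EX:0004",
}

# Hypothetical synonym table used to normalise spelling variants.
SYNONYMS: Dict[str, str] = {
    "fetal": "fetus",
    "foetal": "fetus",
    "hepatic": "liver",
    "tumour": "tumor",
}


def map_term(term: str) -> Tuple[str, List[str]]:
    """Return (status, concept ids); status is 'single', 'compound' or 'unmapped'."""
    words = [SYNONYMS.get(w, w) for w in term.lower().split()]
    # Every word must resolve to a concept; otherwise the term stays unmapped.
    if not words or any(w not in CONCEPTS for w in words):
        return ("unmapped", [])
    ids = sorted({CONCEPTS[w] for w in words})
    status = "single" if len(ids) == 1 else "compound"
    return (status, ids)


def coverage(terms: List[str]) -> float:
    """Fraction of terms that map to a single or compound concept."""
    if not terms:
        return 0.0
    mapped = sum(1 for t in terms if map_term(t)[0] != "unmapped")
    return mapped / len(terms)
```

In this toy setting, a term such as "Fetal brain" resolves to a compound of an anatomical concept and a developmental concept, while a term with no matching vocabulary entry is left for manual curation.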
We present here the mapping process and COVE, an ontology management framework developed to maintain and distribute the eVOC ontology system and mappings in RDF (Resource Description Framework) format. The framework was built specifically to aid annotators and curators by keeping track of the changes, notes and definitions required during ontology alignment.
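As a rough illustration of what tracking changes and notes on a mapping and exporting it as RDF could look like, the sketch below keeps a small mapping record and serialises it as N-Triples lines. It does not reflect COVE's actual data model: the `MappingRecord` class, the predicate names and the `http://example.org/cove-sketch/` namespace are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Placeholder namespace; not a real COVE or eVOC URI.
EX = "http://example.org/cove-sketch/"


def _literal(text: str) -> str:
    """Escape a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


@dataclass
class MappingRecord:
    """Hypothetical record of one source term mapped to one target concept."""
    source_term: str      # URI-safe identifier of the source term
    target_concept: str   # URI-safe identifier of the mapped concept
    notes: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)

    def retarget(self, new_concept: str, note: str) -> None:
        """Change the mapping, keeping the old target and a curator note."""
        self.history.append(self.target_concept)
        self.target_concept = new_concept
        self.notes.append(note)

    def to_ntriples(self) -> List[str]:
        """Serialise the current mapping, earlier targets and notes."""
        subject = f"<{EX}term/{self.source_term}>"
        lines = [f"{subject} <{EX}mapsTo> <{EX}concept/{self.target_concept}> ."]
        lines += [
            f"{subject} <{EX}previouslyMappedTo> <{EX}concept/{old}> ."
            for old in self.history
        ]
        lines += [f"{subject} <{EX}note> {_literal(n)} ." for n in self.notes]
        return lines
```

Keeping earlier targets alongside the current one is one simple way to preserve the trail of curator decisions. A real framework would likely also record who made each change and when.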
