
In-Silico Analysis of Proteins

Celebrating the 20th anniversary of Swiss-Prot

July 30 - August 04, 2006 : Fortaleza, Brazil

Poster #RP107

Indexing text with protein names under their UniProt PANs

Sylvain Gaudan*, Vivian Lee*, Miguel Arregui*, Harald Kirsch*, Dietrich Rebholz-Schuhmann*

*European Bioinformatics Institute

The common way to retrieve publications related to a protein is to query a search engine with the name of the protein. However, current search engines suffer from several shortcomings. Retrieving all the publications related to a protein, and only those, can be difficult. Omitting synonyms means that some publications of interest are missed, whereas ambiguous abbreviations and protein names produce unwanted results. Finally, proteins are studied in several species, and it is difficult for researchers to restrict a search to publications on their model organism.

We present the creation of a search engine index that has been expanded to include Protein Accession Numbers (PANs) from UniProtKB. To this end, the protein names and synonyms listed in UniProtKB are automatically identified in the text, disambiguated and linked to the organism mentioned in the text. Finally, the PAN is integrated into the Medline index to allow retrieval of publications based on PANs instead of protein names. For instance, the query "(six3 OR 'sine oculis') AND (Human OR Homo sapiens) AND development" becomes "O95343 AND development". The resulting index allows users to retrieve, easily and accurately, the publications related to a protein in a specific organism.
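
The idea of adding PANs to the index can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon, the organism synonyms, the sample texts and every function name are hypothetical, and plain dictionary lookup stands in for the identification and disambiguation steps. Only the accession O95343 for human six3 is taken from the abstract.

```python
import re

# Hypothetical lexicon: (protein name or synonym, organism) -> accession number.
# Only O95343 for human six3 comes from the abstract; a real lexicon
# would be derived from UniProtKB.
NAME_TO_PAN = {
    ("six3", "human"): "O95343",
    ("sine oculis", "human"): "O95343",
}

# Hypothetical organism synonyms, normalised to a single key.
ORGANISM_SYNONYMS = {
    "human": "human",
    "homo sapiens": "human",
}

# Made-up sample texts standing in for Medline abstracts.
DOCS = {
    "doc1": "SIX3 is required for forebrain development in human embryos.",
    "doc2": "The sine oculis gene controls eye development in Drosophila.",
    "doc3": "Homo sapiens sine oculis homeobox homolog 3 and eye development.",
}


def find_terms(text, terms):
    """Return the terms that occur in text as whole words, ignoring case."""
    lowered = text.lower()
    return {t for t in terms if re.search(r"\b" + re.escape(t) + r"\b", lowered)}


def pans_for(text):
    """Link each protein name found in text to the organisms found in text."""
    organisms = {ORGANISM_SYNONYMS[o] for o in find_terms(text, ORGANISM_SYNONYMS)}
    names = find_terms(text, {name for name, _ in NAME_TO_PAN})
    return {
        NAME_TO_PAN[(n, o)]
        for n in names
        for o in organisms
        if (n, o) in NAME_TO_PAN
    }


def build_index(docs):
    """Build an inverted index over word tokens plus the PANs assigned to each text."""
    index = {}
    for doc_id, text in docs.items():
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        tokens |= {pan.lower() for pan in pans_for(text)}
        for token in tokens:
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, *terms):
    """Return the documents that contain every term (a simple AND query)."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


if __name__ == "__main__":
    index = build_index(DOCS)
    print(sorted(search(index, "O95343", "development")))
```

In this sketch the query for "O95343" and "development" matches doc1 and doc3 but not doc2, which mentions "sine oculis" without a human organism mention and therefore receives no PAN.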