iProLINK

iProLINK (integrated Protein Literature, INformation and Knowledge) is a resource that facilitates text mining research in the areas of literature-based database curation, named entity recognition, and protein ontology development. Computational and biological researchers can use this collection of annotated data sources to explore literature information on proteins and their features or properties (Hu et al., 2004). The data sets for bibliography mapping and feature evidence attribution include mapped citations (PubMed ID to protein entry and feature line mapping) and annotation-tagged literature corpora. The latter comprise ~800 abstracts and/or full-text articles in which text evidence was tagged for ~1200 experimentally validated post-translational modifications (PTMs) annotated in the PIR protein sequence database (PIR-PSD). The data sets for entity recognition and ontology development include protein name dictionaries, word token dictionaries, protein name-tagged literature corpora with their tagging guidelines, and a protein ontology based on PIRSF protein family names. All data sets are freely accessible and can be downloaded from http://pir.georgetown.edu/iprolink/.

Webpage:

http://pir.georgetown.edu/iprolink/

Tags:

protein sequence databases
metasource: Nucleic Acids Research database catalogue
version: extracted_at: 2022-11-04T11:16:25.468657

protein properties
metasource: Nucleic Acids Research database catalogue
version: extracted_at: 2022-11-04T11:16:25.468657

More to explore:

PIR - Protein Information Resource

BioThesaurus

PIR SuperFamily

The Protein Database

PRotein Ontology

PINT

RESID Database of Protein Modifications

Named Entity Recognition Ontology

UniProt Knowledgebase

dbPTM

EKPD

Manually Curated Database of Rice Proteins

Biological General Repository for Interaction Datasets

NPD - Nuclear Protein Database

Termini-Oriented Protein Function INferred Database

Integrated resource of protein families, domains and functional sites

DroID - Drosophila Interactions Database

Protein InFormation Resource Format

Saccharomyces Genome Database

MegaMotifbase