API
gopher.read_encyclopedia(proteins_txt)
Read results from EncyclopeDIA.
PARAMETER | DESCRIPTION |
---|---|
proteins_txt | The EncyclopeDIA protein output. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | The EncyclopeDIA results in a format for gopher. |
gopher.read_metamorpheus(proteins_txt)
Read results from MetaMorpheus.
PARAMETER | DESCRIPTION |
---|---|
proteins_txt | The MetaMorpheus protein output file. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | The MetaMorpheus results in a format for gopher. |
gopher.test_enrichment(proteins, desc=True, aspect='all', species='human', release='current', go_subset=None, contaminants_filter=None, fetch=False, progress=False, annotations=None, mapping=None, aggregate_terms=True, alternative='greater')
Test for the enrichment of Gene Ontology terms from protein abundance.
A Mann-Whitney U test is applied to each column of the proteins dataframe for each Gene Ontology (GO) term. The p-values are then corrected for multiple hypothesis testing across all of the columns using the Benjamini-Hochberg procedure.
PARAMETER | DESCRIPTION |
---|---|
proteins | A dataframe where the indices are UniProt accessions and each column is an experiment to test. The values should be some measure of protein abundance: the raw measurement if originating from a single sample, or a fold-change or p-value if looking at the difference between two conditions. |
desc | Rank proteins in descending order? |
aspect | The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. |
species | The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. |
release | The Gene Ontology release version. Using “current” will look up the most current version. |
go_subset | The GO terms of interest. Should consist of GO term names such as ‘nucleus’ or ‘cytoplasm’. |
contaminants_filter | A list of UniProt accessions for common contaminants, such as keratin, to filter out. |
fetch | Download the GO annotations even if they have been downloaded before? |
progress | Show a progress bar during enrichment tests? |
annotations | A custom annotations dataframe. |
mapping | A custom mapping of the GO term relationships. |
aggregate_terms | Aggregate the terms and do the graph search. |
alternative | The type of test to run: “greater”, “less”, or “two-sided”. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | The adjusted p-value for each tested GO term in each sample. |
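As a rough illustration of the statistics described for `gopher.test_enrichment`, the following standard-library sketch runs a one-sided Mann-Whitney U test (normal approximation, without tie or continuity correction) and then applies the Benjamini-Hochberg correction to p-values pooled across samples. The helper names `mann_whitney_greater` and `benjamini_hochberg`, and the toy numbers, are assumptions made for this example; they are not part of gopher, whose internals may differ.

```python
import math


def mann_whitney_greater(in_term, background):
    """One-sided Mann-Whitney U test that `in_term` values tend to be larger.

    Uses the normal approximation without tie or continuity correction,
    so the p-value is approximate, particularly for small groups.
    """
    n1, n2 = len(in_term), len(background)
    # U counts the (in_term, background) pairs won by the in-term value;
    # tied pairs count as half a win.
    u = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in in_term
        for y in background
    )
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return u, p_value


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical abundances: proteins in the term versus all other proteins.
raw = {
    ("sample_1", "nucleus"): mann_whitney_greater([5, 6, 7, 8], [1, 2, 3, 4])[1],
    ("sample_2", "nucleus"): mann_whitney_greater([1, 6, 3, 8], [5, 2, 7, 4])[1],
}

# Pool every (sample, term) p-value before correcting, as described above.
keys = list(raw)
adjusted = dict(zip(keys, benjamini_hochberg([raw[k] for k in keys])))
```

In this toy data only `sample_1` separates the term from the background, so only its adjusted p-value is small.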
gopher.get_data_dir()
Retrieve the current data directory for gopher.
gopher.set_data_dir(path=None)
Set the gopher data directory.
PARAMETER | DESCRIPTION |
---|---|
path | The path for gopher to use as its data directory. |
gopher.generate_annotations(proteins, aspect, go_name, go_id=None)
Generate an annotation file for a list of proteins that are correlated to a single term and aspect.
The term can be in the GO database or a new term.
PARAMETER | DESCRIPTION |
---|---|
proteins | List of proteins (UniProtKB accessions) that will be annotated to a term. |
aspect | String specifying the aspect the term is in (“C”, “F”, “P”). |
go_name | String of the GO name for the proteins. |
go_id | String of the GO ID. If the term is in the GO database, the GO ID and GO name should match the database. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | An annotations dataframe with a single GO term. |
gopher.load_annotations(species, aspect='all', release='current', fetch=False)
Load the Gene Ontology (GO) annotations for a species.
PARAMETER | DESCRIPTION |
---|---|
species | The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. |
aspect | The Gene Ontology aspect to use. Use “c” for “Cellular Component”, “f” for “Molecular Function”, or “p” for “Biological Process”. |
release | The Gene Ontology release version. Using “current” will look up the most current version. |
fetch | Download the file even if it already exists? |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | The annotation dataframe. |
dict | A mapping of GO terms (keys) to UniProt accessions with that annotation. |
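The second return value maps each GO term to the accessions annotated with it. A minimal sketch of how such a mapping can be built from (accession, term) pairs follows; the `rows` list and the function name `build_term_mapping` are hypothetical and only stand in for whatever annotation table gopher actually loads.

```python
from collections import defaultdict


def build_term_mapping(rows):
    """Group UniProt accessions by GO term name."""
    mapping = defaultdict(set)
    for accession, term in rows:
        mapping[term].add(accession)
    # Sort the accessions so the result is stable and easy to read.
    return {term: sorted(accessions) for term, accessions in mapping.items()}


# Hypothetical (accession, GO term name) pairs, for illustration only.
rows = [
    ("P04637", "nucleus"),
    ("P68431", "nucleus"),
    ("P04637", "cytoplasm"),
]
term_to_proteins = build_term_mapping(rows)
```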
gopher.get_annotations(proteins, aspect='all', species='human', release='current', fetch=False, go_subset=None)
Get the annotations for the proteins in a dataset.
PARAMETER | DESCRIPTION |
---|---|
proteins | Dataframe of proteins and quantifications. |
aspect | The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. |
species | The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. |
release | The Gene Ontology release version. Using “current” will look up the most current version. |
fetch | Download the GO annotations even if they have been downloaded before? |
go_subset | The GO terms of interest. Should consist of GO term names such as ‘nucleus’ or ‘cytoplasm’. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | Dataframe with protein annotations. |
gopher.map_proteins(protein_list, aspect='all', species='human', release='current', fetch=False)
Map the proteins to the GO terms.
PARAMETER | DESCRIPTION |
---|---|
protein_list | A list of UniProt accessions. |
aspect | The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. |
species | The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. |
release | The Gene Ontology release version. Using “current” will look up the most current version. |
fetch | Download the GO annotations even if they have been downloaded before? |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | Dataframe with protein accessions and GO terms. |
gopher.normalize_values(proteins, fasta)
Normalize intensity values.
Normalize using the proteomic ruler approach outlined by Wiśniewski et al. (doi: https://doi.org/10.1074/mcp.M113.037309).
PARAMETER | DESCRIPTION |
---|---|
proteins | A dataframe where the indices are UniProt accessions and each column is an experiment to test. The values in this dataframe should be raw protein abundances. |
fasta | The FASTA file used to generate molecular weights for normalization. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | The normalized intensities for every protein in each sample. |
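For orientation, the proteomic ruler of Wiśniewski et al. scales each protein's signal by the summed histone signal, using the DNA mass per cell as the ruler. The sketch below follows that published relationship; it is not taken from gopher's source, and the function name `copies_per_cell`, the 6.5 pg DNA mass (a commonly quoted figure for a diploid human cell), and the input values are all assumptions for illustration. Molecular weights are passed in directly here rather than computed from a FASTA file.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
DNA_MASS_PER_CELL_G = 6.5e-12  # assumed: roughly 6.5 pg for a diploid human cell


def copies_per_cell(intensity, histone_intensity_sum, molecular_weight_da):
    """Estimate protein copies per cell from MS intensities.

    protein mass per cell = DNA mass * (protein signal / total histone signal)
    copies per cell       = protein mass per cell * N_A / molecular weight
    """
    mass_per_cell_g = DNA_MASS_PER_CELL_G * intensity / histone_intensity_sum
    return mass_per_cell_g * AVOGADRO / molecular_weight_da


# Hypothetical numbers: a 50 kDa protein carrying 1% of the total histone signal.
estimate = copies_per_cell(1e8, 1e10, 50_000)
```

The estimate is linear in the protein's intensity and inversely proportional to its molecular weight, which is why molecular weights are needed for this normalization.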
gopher.get_rankings(proteins, go_term, aspect='all', species='human', release='current', fetch=False)
Rank the proteins and show whether each protein is in a specified term.
PARAMETER | DESCRIPTION |
---|---|
proteins | Dataframe of protein quantification data. |
go_term | The specified GO term name, as a string. |
aspect | The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. |
species | The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. |
release | The Gene Ontology release version. Using “current” will look up the most current version. |
fetch | Download the GO annotations even if they have been downloaded before? |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | Dataframe with protein rankings and whether or not each protein is in the specified term. |
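To make the idea behind `gopher.get_rankings` concrete, here is a small stand-alone sketch that ranks proteins by abundance within one sample and flags the ones annotated to a term. The function `rank_with_membership`, the placeholder accessions, and the values are hypothetical; gopher itself returns a pandas.DataFrame and may rank differently, for example in how it handles ties.

```python
def rank_with_membership(abundances, term_members, descending=True):
    """Return (accession, rank, in_term) tuples, with rank 1 as the top protein."""
    ordered = sorted(abundances, key=abundances.get, reverse=descending)
    return [
        (accession, rank, accession in term_members)
        for rank, accession in enumerate(ordered, start=1)
    ]


# Made-up abundances for one sample and a made-up set of term members.
abundances = {"PROT_A": 12.5, "PROT_B": 30.1, "PROT_C": 7.8}
nucleus_members = {"PROT_B", "PROT_C"}
rankings = rank_with_membership(abundances, nucleus_members)
```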
gopher.in_term(proteins, go_term, annot)
See if proteins are associated with a specific term.
PARAMETER | DESCRIPTION |
---|---|
proteins | Dataframe of proteins and quantifications. |
go_term | The specified GO term name, as a string. |
annot | Annotation file for the dataset. |
RETURNS | DESCRIPTION |
---|---|
pandas.DataFrame | Dataframe with protein quantifications and whether each protein is in the given term. |
gopher.roc(proteins, go_term, aspect='all', species='human', release='current', fetch=False)
Plot the ROC curve for a GO term in each sample.
PARAMETER | DESCRIPTION |
---|---|
proteins | Dataframe of proteins and quantifications. |
go_term | The specified GO term name, as a string. |
aspect | The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. |
species | The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. |
release | The Gene Ontology release version. Using “current” will look up the most current version. |
fetch | Download the GO annotations even if they have been downloaded before? |
RETURNS | DESCRIPTION |
---|---|
matplotlib.pyplot | Plot of the ROC curve for a GO term. |
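The kind of curve that `gopher.roc` plots can be understood from a short, plotting-free sketch: walk down the ranked proteins, stepping up for each protein in the term and right for each one outside it. The function names and toy data below are assumptions for illustration and are not gopher's implementation. The sketch requires at least one protein inside and one outside the term, and it breaks tied scores arbitrarily.

```python
def roc_points(scores, in_term):
    """Return (false positive rate, true positive rate) points, highest score first."""
    n_pos = sum(in_term)
    n_neg = len(in_term) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    ranked = sorted(zip(scores, in_term), key=lambda pair: pair[0], reverse=True)
    for _, member in ranked:
        if member:
            tp += 1  # a protein in the term: step up
        else:
            fp += 1  # a protein outside the term: step right
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy data: four proteins, two of which belong to the term.
scores = [4.0, 3.0, 2.0, 1.0]
in_term = [True, False, True, False]
area = auc(roc_points(scores, in_term))
```

With untied scores, this area equals the Mann-Whitney U statistic of the in-term group divided by the product of the two group sizes, which ties the ROC view to the enrichment test.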