API

gopher.read_encyclopedia(proteins_txt)

Read results from EncyclopeDIA.

PARAMETER DESCRIPTION
proteins_txt

The EncyclopeDIA protein output.

TYPE: str

RETURNS DESCRIPTION
pandas.DataFrame

The EncyclopeDIA results in a format for gopher.

gopher.read_metamorpheus(proteins_txt)

Read results from Metamorpheus.

PARAMETER DESCRIPTION
proteins_txt

The Metamorpheus protein output file.

TYPE: str

RETURNS DESCRIPTION
pandas.DataFrame

The Metamorpheus results in a format for gopher.

gopher.test_enrichment(proteins, desc=True, aspect='all', species='human', release='current', go_subset=None, contaminants_filter=None, fetch=False, progress=False, annotations=None, mapping=None, aggregate_terms=True, alternative='greater')

Test for the enrichment of Gene Ontology terms from protein abundance.

The Mann-Whitney U test is applied to each column of the proteins dataframe for each Gene Ontology (GO) term. The p-values are then corrected for multiple hypothesis testing across all of the columns using the Benjamini-Hochberg procedure.

PARAMETER DESCRIPTION
proteins

A dataframe where the indices are UniProt accessions and each column is an experiment to test. The values in this dataframe should be some measure of protein abundance: these could be the raw measurement if originating from a single sample or a fold-change/p-value if looking at the difference between two conditions.

TYPE: pandas.DataFrame

desc

Rank proteins in descending order?

TYPE: bool, optional DEFAULT: True

aspect

The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three.

TYPE: str DEFAULT: 'all'

species

The species for which to retrieve GO annotations. If not “human” or “yeast”, see here.

TYPE: str DEFAULT: 'human'

release

The Gene Ontology release version. Using “current” will look up the most current version.

TYPE: str, optional DEFAULT: 'current'

go_subset

The GO terms of interest. Should consist of GO term names such as ‘nucleus’ or ‘cytoplasm’.

TYPE: list DEFAULT: None

contaminants_filter

A list of UniProt accessions for common contaminants, such as keratin, to filter out.

TYPE: list DEFAULT: None

fetch

Download the GO annotations even if they have been downloaded before?

TYPE: bool, optional DEFAULT: False

progress

Show a progress bar during enrichment tests?

TYPE: bool, optional DEFAULT: False

annotations

A custom annotations dataframe.

TYPE: pd.DataFrame DEFAULT: None

mapping

A custom mapping of the GO term relationships.

TYPE: dict DEFAULT: None

aggregate_terms

Aggregate the terms and do the graph search.

TYPE: bool, optional DEFAULT: True

alternative

Type of test that should be run. Could be “greater”, “less”, or “two-sided”.

TYPE: str DEFAULT: 'greater'

RETURNS DESCRIPTION
pandas.DataFrame

The adjusted p-value for each tested GO term in each sample.
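The two statistical steps described for gopher.test_enrichment can be illustrated with a small pure-Python sketch. This is not gopher's implementation: the function names below are hypothetical, and the step that turns the U statistic into a p-value is left out (in practice a routine such as scipy.stats.mannwhitneyu provides it, including the "greater", "less", and "two-sided" alternatives). The sketch only shows how the U statistic is formed from ranks and how a pooled list of p-values is adjusted with the Benjamini-Hochberg procedure.

```python
def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(in_term, background):
    """U statistic for the in-term group: its rank sum minus the smallest possible rank sum."""
    ranks = average_ranks(list(in_term) + list(background))
    n1 = len(in_term)
    rank_sum = sum(ranks[:n1])
    return rank_sum - n1 * (n1 + 1) / 2


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Because the correction is applied across all columns, the p-values from every sample and every GO term would be pooled into one list before calling benjamini_hochberg in this sketch.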

gopher.get_data_dir()

Retrieve the current data directory for gopher.

gopher.set_data_dir(path=None)

Set the gopher data directory.

PARAMETER DESCRIPTION
path

The path for gopher to use as its data directory.

TYPE: str or pathlib.Path object, optional DEFAULT: None

gopher.generate_annotations(proteins, aspect, go_name, go_id=None)

Generate an annotation file for a list of proteins that are correlated to a single term and aspect.

The term can be in the GO database or a new term.

PARAMETER DESCRIPTION
proteins

List of proteins (UniProtKB accessions) that will be annotated to a term.

TYPE: List[str]

aspect

String specifying the aspect the term is in (“C”, “F”, “P”).

TYPE: str

go_name

String of the GO name for the proteins.

TYPE: str

go_id

String of the GO ID. If the term is in the GO database, the GO ID and GO name should match the database.

TYPE: str, optional DEFAULT: None

RETURNS DESCRIPTION
pandas.DataFrame

An annotations dataframe with a single GO term.

gopher.load_annotations(species, aspect='all', release='current', fetch=False)

Load the Gene Ontology (GO) annotations for a species.

PARAMETER DESCRIPTION
species

The species for which to retrieve GO annotations. If not “human” or “yeast”, see here.

TYPE: str

aspect

The Gene Ontology aspect to use. Use “c” for “Cellular Component”, “f” for “Molecular Function”, or “p” for “Biological Process”. The default, “all”, uses all of them.

TYPE: str DEFAULT: 'all'

release

The Gene Ontology release version. Using “current” will look up the most current version.

TYPE: str DEFAULT: 'current'

fetch

Download the file even if it already exists?

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
pandas.DataFrame

The annotation dataframe.

dict

A mapping of GO terms (keys) to UniProt accessions with that annotation.

gopher.get_annotations(proteins, aspect='all', species='human', release='current', fetch=False, go_subset=None)

Get the annotations for the proteins in a dataset.

PARAMETER DESCRIPTION
proteins

Dataframe of proteins and quantifications.

TYPE: pandas.DataFrame

aspect

The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three.

TYPE: str DEFAULT: 'all'

species

The species for which to retrieve GO annotations. If not “human” or “yeast”, see here.

TYPE: str DEFAULT: 'human'

release

The Gene Ontology release version. Using “current” will look up the most current version.

TYPE: str, optional DEFAULT: 'current'

fetch

Download the GO annotations even if they have been downloaded before?

TYPE: bool, optional DEFAULT: False

go_subset

The GO terms of interest. Should consist of GO term names such as ‘nucleus’ or ‘cytoplasm’.

TYPE: list DEFAULT: None

RETURNS DESCRIPTION
pandas.DataFrame

Dataframe with protein annotations.

gopher.map_proteins(protein_list, aspect='all', species='human', release='current', fetch=False)

Map the proteins to the GO terms.

PARAMETER DESCRIPTION
protein_list

A list of UniProt accessions.

TYPE: List[str]

aspect

The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three.

TYPE: str DEFAULT: 'all'

species

The species for which to retrieve GO annotations. If not “human” or “yeast”, see here.

TYPE: str DEFAULT: 'human'

release

The Gene Ontology release version. Using “current” will look up the most current version.

TYPE: str, optional DEFAULT: 'current'

fetch

Download the GO annotations even if they have been downloaded before?

TYPE: bool, optional DEFAULT: False

RETURNS DESCRIPTION
pandas.DataFrame

Dataframe with protein accessions and GO terms.

gopher.normalize_values(proteins, fasta)

Normalize intensity values.

Normalize using the proteomic ruler approach outlined by Wiśniewski et al. (doi: https://doi.org/10.1074/mcp.M113.037309).

PARAMETER DESCRIPTION
proteins

A dataframe where the indices are UniProt accessions and each column is an experiment to test. The values in this dataframe should be raw protein abundances.

TYPE: pandas.DataFrame

fasta

The FASTA file used to generate molecular weights for normalization.

TYPE: Path

RETURNS DESCRIPTION
pandas.DataFrame

The normalized intensities for every protein in each sample.
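The idea behind the proteomic ruler can be sketched in a few lines: the summed histone signal is used as an internal standard that scales with the DNA mass per cell, so each protein's share of the signal can be converted to a mass and then to a copy number through its molecular weight. The sketch below is an illustration of that published idea under stated assumptions, not gopher's implementation, and gopher.normalize_values may differ in detail. The function name, the dictionary-based inputs, the caller-supplied histone set, and the default DNA mass of 6.5 pg per cell (a figure for diploid human cells) are all assumptions made for this example.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def proteomic_ruler(intensities, mol_weights, histones, dna_mass_g=6.5e-12):
    """Estimate protein copies per cell for one sample (hypothetical helper).

    intensities: dict mapping accession -> raw MS intensity
    mol_weights: dict mapping accession -> molecular weight in g/mol
    histones:    set of accessions treated as histones (supplied by the caller)
    dna_mass_g:  assumed DNA mass per cell in grams
    """
    histone_signal = sum(v for acc, v in intensities.items() if acc in histones)
    if histone_signal <= 0:
        raise ValueError("no histone signal to anchor the ruler")

    copies = {}
    for acc, signal in intensities.items():
        # Protein mass per cell scales with its signal relative to the histones.
        mass_g = signal / histone_signal * dna_mass_g
        # Convert mass to a number of molecules using the molecular weight.
        copies[acc] = mass_g * AVOGADRO / mol_weights[acc]
    return copies


# Placeholder accessions, for illustration only.
example = proteomic_ruler(
    intensities={"HIST1": 100.0, "PROT1": 50.0},
    mol_weights={"HIST1": 15000.0, "PROT1": 50000.0},
    histones={"HIST1"},
)
```

In this sketch the molecular weights are passed in directly; in gopher they are derived from the FASTA file.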

gopher.get_rankings(proteins, go_term, aspect='all', species='human', release='current', fetch=False)

Rank the proteins and show whether each protein is in a specified term.

PARAMETER DESCRIPTION
proteins

Dataframe of protein quantification data.

TYPE: pandas.DataFrame

go_term

The name of the GO term of interest.

TYPE: str

aspect

The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three.

TYPE: str DEFAULT: 'all'

species

The species for which to retrieve GO annotations. If not “human” or “yeast”, see here.

TYPE: str DEFAULT: 'human'

release

The Gene Ontology release version. Using “current” will look up the most current version.

TYPE: str, optional DEFAULT: 'current'

fetch

Download the GO annotations even if they have been downloaded before?

TYPE: bool, optional DEFAULT: False

RETURNS DESCRIPTION
pandas.DataFrame

Dataframe with protein rankings and whether or not each protein is in the specified term.
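The kind of output described for gopher.get_rankings can be pictured with a minimal sketch for a single sample. It uses plain dictionaries and sets rather than dataframes, and the function name and return shape are hypothetical; it only shows the idea of ordering proteins by abundance and flagging term membership.

```python
def rank_with_membership(abundances, term_members, desc=True):
    """Rank proteins by abundance and flag those annotated with a term.

    abundances:   dict mapping accession -> abundance in one sample
    term_members: set of accessions annotated with the GO term
    Returns a list of (accession, rank, in_term) tuples, rank 1 first.
    Ties keep their input order and receive consecutive ranks.
    """
    ordered = sorted(abundances, key=abundances.get, reverse=desc)
    return [
        (acc, rank, acc in term_members)
        for rank, acc in enumerate(ordered, start=1)
    ]


# Placeholder accessions, for illustration only.
ranked = rank_with_membership(
    {"A": 3.0, "B": 10.0, "C": 1.0},
    term_members={"B", "C"},
)
```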

gopher.in_term(proteins, go_term, annot)

Check whether proteins are associated with a specific term.

PARAMETER DESCRIPTION
proteins

Dataframe of proteins and quantifications.

TYPE: pandas.DataFrame

go_term

The name of the GO term of interest.

TYPE: str

annot

The annotations dataframe for the dataset.

TYPE: pandas.DataFrame

RETURNS DESCRIPTION
pandas.DataFrame

Dataframe with protein quantifications and whether each protein is in the given term.

gopher.roc(proteins, go_term, aspect='all', species='human', release='current', fetch=False)

Plot the ROC curve for a GO term in each sample.

PARAMETER DESCRIPTION
proteins

Dataframe of proteins and quantifications.

TYPE: pandas.DataFrame

go_term

The name of the GO term of interest.

TYPE: str

aspect

The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three.

TYPE: str DEFAULT: 'all'

species

The species for which to retrieve GO annotations. If not “human” or “yeast”, see here.

TYPE: str DEFAULT: 'human'

release

The Gene Ontology release version. Using “current” will look up the most current version.

TYPE: str, optional DEFAULT: 'current'

fetch

Download the GO annotations even if they have been downloaded before?

TYPE: bool, optional DEFAULT: False

RETURNS DESCRIPTION
matplotlib.pyplot

Plot of the ROC curve for a GO term.
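The points that make up such a ROC curve can be computed without any plotting library. The sketch below is a generic illustration, not gopher's code: the function names are hypothetical, the scores stand for protein abundances in one sample, and the labels mark whether each protein is annotated with the GO term. Proteins are swept from highest to lowest score, and tied scores are processed together.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points for a ROC curve."""
    n_pos = sum(1 for flag in labels if flag)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one protein inside and outside the term")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Count every protein sharing this score before emitting a point.
        while i < len(order) and scores[order[i]] == threshold:
            if labels[order[i]]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


curve = roc_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
auc = area_under_curve(curve)
```

The area under this curve equals the fraction of (in-term, out-of-term) protein pairs in which the in-term protein has the higher score, which ties it to the Mann-Whitney U statistic used for the enrichment test.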