API

`gopher.read_encyclopedia(proteins_txt)`

Read results from EncyclopeDIA.

PARAMETER	DESCRIPTION
`proteins_txt`	The EncyclopeDIA protein output. TYPE: `str`

RETURNS	DESCRIPTION
`DataFrame`	The EncyclopeDIA results in a format for gopher.

`gopher.read_metamorpheus(proteins_txt)`

Read results from Metamorpheus.

PARAMETER	DESCRIPTION
`proteins_txt`	The Metamorpheus protein output file. TYPE: `str`

RETURNS	DESCRIPTION
`DataFrame`	The Metamorpheus results in a format for gopher.

`gopher.test_enrichment(proteins, desc=True, aspect='all', species='human', release='current', go_subset=None, contaminants_filter=None, fetch=False, progress=False, annotations=None, mapping=None, aggregate_terms=True, alternative='greater')`

Test for the enrichment of Gene Ontology terms from protein abundance.

The Mann-Whitney U test is applied to each column of the `proteins` dataframe for each Gene Ontology (GO) term. The p-values are then corrected for multiple hypothesis testing across all of the columns using the Benjamini-Hochberg procedure.

PARAMETER	DESCRIPTION
`proteins`	A dataframe where the indices are UniProt accessions and each column is an experiment to test. The values in this dataframe should be some measure of protein abundance: these could be the raw measurement if originating from a single sample or a fold-change/p-value if looking at the difference between two conditions. TYPE: `DataFrame`
`desc`	Rank proteins in descending order? TYPE: `bool` DEFAULT: `True`
`aspect`	The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. TYPE: `(str, {cc, mf, bp, all})` DEFAULT: `'all'`
`species`	The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. TYPE: `str, {"human", "yeast", ...}, optional.` DEFAULT: `'human'`
`release`	The Gene Ontology release version. Using “current” will look up the most current version. TYPE: `str` DEFAULT: `'current'`
`go_subset`	The GO terms of interest. Should consist of GO term names such as ‘nucleus’ or ‘cytoplasm’. TYPE: `list` DEFAULT: `None`
`contaminants_filter`	A list of UniProt accessions for common contaminants, such as keratin, to filter out. TYPE: `list` DEFAULT: `None`
`fetch`	Download the GO annotations even if they have been downloaded before? TYPE: `bool` DEFAULT: `False`
`progress`	Show a progress bar during enrichment tests? TYPE: `bool` DEFAULT: `False`
`annotations`	A custom annotations dataframe. TYPE: `DataFrame` DEFAULT: `None`
`mapping`	A custom mapping of the GO term relationships. TYPE: `dict` DEFAULT: `None`
`aggregate_terms`	Aggregate the terms and do the graph search. TYPE: `bool` DEFAULT: `True`
`alternative`	Type of test that should be run. Could be “greater”, “less”, or “two-sided”. TYPE: `str, {"greater", "less", "two-sided"} optional` DEFAULT: `'greater'`

RETURNS	DESCRIPTION
`DataFrame`	The adjusted p-value for each tested GO term in each sample.
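
The statistical procedure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not gopher's implementation: the helper names are hypothetical, the p-value uses a normal approximation without tie correction, and only the "greater" alternative is shown.

```python
import math


def average_ranks(values):
    """Rank values in ascending order (1-based); ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_greater(in_term, not_in_term):
    """One-sided p-value that in_term values tend to be larger.

    Assumption: normal approximation, no tie or continuity correction.
    """
    n1, n2 = len(in_term), len(not_in_term)
    ranks = average_ranks(list(in_term) + list(not_in_term))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy abundances for proteins inside and outside one GO term.
p_enriched = mann_whitney_greater([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice one p-value would be computed per GO term and per column, and all of them would be passed to the correction step together.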

`gopher.get_data_dir()`

Retrieve the current data directory for gopher.

`gopher.set_data_dir(path=None)`

Set the gopher data directory.

PARAMETER	DESCRIPTION
`path`	The path for gopher to use as its data directory. TYPE: `str or pathlib.Path object` DEFAULT: `None`

`gopher.generate_annotations(proteins, aspect, go_name, go_id=None)`

Generate an annotation file for a list of proteins that are associated with a single term and aspect.

The term can be in the GO database or a new term.

PARAMETER	DESCRIPTION
`proteins`	List of proteins (UniProtKB accessions) that will be annotated to a term. TYPE: `List[str]`
`aspect`	String specifying the aspect the term is in (“C”, “F”, “P”). TYPE: `str`
`go_name`	String of the GO name for the proteins. TYPE: `str`
`go_id`	String of the GO ID. If the term is in the GO database, the GO ID and GO name should match the database. TYPE: `str` DEFAULT: `None`

RETURNS	DESCRIPTION
`DataFrame`	An annotations dataframe with a single GO term.

`gopher.load_annotations(species, aspect='all', release='current', fetch=False)`

Load the Gene Ontology (GO) annotations for a species.

PARAMETER	DESCRIPTION
`species`	The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. TYPE: `(str, {human, yeast, ...})`
`aspect`	The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. TYPE: `(str, {cc, mf, bp, all})` DEFAULT: `'all'`
`release`	The Gene Ontology release version. Using “current” will look up the most current version. TYPE: `str` DEFAULT: `'current'`
`fetch`	Download the file even if it already exists? TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`DataFrame`	The annotation dataframe.
`dict`	A mapping of GO terms (keys) to UniProt accessions with that annotation.
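
To make the shape of the returned mapping concrete, here is a small sketch that builds a term-to-accessions dictionary from annotation rows. The row layout, the placeholder accessions, and the helper name are assumptions for illustration; the actual columns of gopher's annotation dataframe are not specified here.

```python
from collections import defaultdict

# Hypothetical annotation rows: (accession, GO term name).
# "ACC_A" etc. are placeholders, not real UniProt accessions.
rows = [
    ("ACC_A", "nucleus"),
    ("ACC_A", "cytoplasm"),
    ("ACC_B", "cytoplasm"),
]


def term_to_accessions(annotation_rows):
    """Group accessions by GO term: {term: set of accessions}."""
    mapping = defaultdict(set)
    for accession, term in annotation_rows:
        mapping[term].add(accession)
    return dict(mapping)


mapping = term_to_accessions(rows)
```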

`gopher.get_annotations(proteins, aspect='all', species='human', release='current', fetch=False, go_subset=None)`

Get the annotations for the proteins in a dataset.

PARAMETER	DESCRIPTION
`proteins`	Dataframe of proteins and quantifications. TYPE: `DataFrame`
`aspect`	The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. TYPE: `(str, {cc, mf, bp, all})` DEFAULT: `'all'`
`species`	The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. TYPE: `str, {"human", "yeast", ...}, optional.` DEFAULT: `'human'`
`release`	The Gene Ontology release version. Using “current” will look up the most current version. TYPE: `str` DEFAULT: `'current'`
`fetch`	Download the GO annotations even if they have been downloaded before? TYPE: `bool` DEFAULT: `False`
`go_subset`	The GO terms of interest. Should consist of GO term names such as ‘nucleus’ or ‘cytoplasm’. TYPE: `list` DEFAULT: `None`

RETURNS	DESCRIPTION
`DataFrame`	Dataframe with protein annotations.

`gopher.map_proteins(protein_list, aspect='all', species='human', release='current', fetch=False)`

Map the proteins to the GO terms.

PARAMETER	DESCRIPTION
`protein_list`	A list of UniProt accessions. TYPE: `List[str]`
`aspect`	The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. TYPE: `(str, {cc, mf, bp, all})` DEFAULT: `'all'`
`species`	The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. TYPE: `str, {"human", "yeast", ...}, optional.` DEFAULT: `'human'`
`release`	The Gene Ontology release version. Using “current” will look up the most current version. TYPE: `str` DEFAULT: `'current'`
`fetch`	Download the GO annotations even if they have been downloaded before? TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`DataFrame`	Dataframe with protein accessions and GO terms.

`gopher.normalize_values(proteins, fasta)`

Normalize intensity values.

Normalize using the proteomic ruler approach outlined by Wiśniewski et al. (doi: https://doi.org/10.1074/mcp.M113.037309).

PARAMETER	DESCRIPTION
`proteins`	A dataframe where the indices are UniProt accessions and each column is an experiment to test. The values in this dataframe should be raw protein abundances. TYPE: `DataFrame`
`fasta`	The FASTA file used to compute molecular weights for normalization. TYPE: `Path`

RETURNS	DESCRIPTION
`DataFrame`	The normalized intensities for every protein in each sample.
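
The idea behind the proteomic ruler can be sketched as follows: the summed histone signal is treated as proportional to the DNA mass per cell, which turns each protein's intensity into an estimated copy number. This is a rough sketch of the published approach, not gopher's implementation, which may differ. The function name, the placeholder accessions, the histone set, and the DNA mass value are all assumptions for illustration.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def proteomic_ruler(intensities, mol_weights, histone_ids, dna_mass_g):
    """Estimate copies per cell for each protein in one sample.

    intensities: dict accession -> raw MS intensity
    mol_weights: dict accession -> molecular weight in g/mol
    histone_ids: accessions treated as histones (the "ruler")
    dna_mass_g:  assumed DNA mass per cell, in grams
    """
    histone_signal = sum(
        intensities[acc] for acc in histone_ids if acc in intensities
    )
    if histone_signal <= 0:
        raise ValueError("no histone signal to use as the ruler")
    # Protein mass per cell = (intensity / histone signal) * DNA mass;
    # dividing by molecular weight and scaling by Avogadro gives copies.
    scale = dna_mass_g * AVOGADRO / histone_signal
    return {
        acc: value * scale / mol_weights[acc]
        for acc, value in intensities.items()
    }


# Toy sample with placeholder accessions and made-up values.
sample = {"HIST_1": 80.0, "HIST_2": 20.0, "ACC_A": 50.0}
weights = {"HIST_1": 15000.0, "HIST_2": 14000.0, "ACC_A": 50000.0}
copies = proteomic_ruler(
    sample, weights, {"HIST_1", "HIST_2"}, dna_mass_g=6.5e-12  # assumed value
)
```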

`gopher.get_rankings(proteins, go_term, aspect='all', species='human', release='current', fetch=False)`

Rank the proteins and show whether each protein is in a specified term.

PARAMETER	DESCRIPTION
`proteins`	Dataframe of protein quantification data. TYPE: `DataFrame`
`go_term`	The name of the GO term of interest. TYPE: `str`
`aspect`	The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. TYPE: `(str, {cc, mf, bp, all})` DEFAULT: `'all'`
`species`	The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. TYPE: `str, {"human", "yeast", ...}, optional.` DEFAULT: `'human'`
`release`	The Gene Ontology release version. Using “current” will look up the most current version. TYPE: `str` DEFAULT: `'current'`
`fetch`	Download the GO annotations even if they have been downloaded before? TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`DataFrame`	Dataframe with protein rankings and whether or not each protein is in the specified term.
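
As a rough picture of what such a ranking looks like for a single sample, the sketch below orders proteins by abundance and flags term membership. The helper name, its `desc` argument, and the placeholder accessions are hypothetical; the real function works on a dataframe and looks up the term's proteins from the GO annotations.

```python
def rank_with_membership(abundances, term_members, desc=True):
    """Return (accession, rank, in_term) tuples for one sample.

    abundances:   dict accession -> abundance value
    term_members: set of accessions annotated with the term
    desc:         rank the highest abundance first when True
    """
    ordered = sorted(abundances, key=abundances.get, reverse=desc)
    return [
        (acc, rank, acc in term_members)
        for rank, acc in enumerate(ordered, start=1)
    ]


# Toy data with placeholder accessions.
abundances = {"ACC_A": 10.0, "ACC_B": 30.0, "ACC_C": 20.0}
ranking = rank_with_membership(abundances, term_members={"ACC_B"})
```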

`gopher.in_term(proteins, go_term, annot)`

Check whether proteins are associated with a specific term.

PARAMETER	DESCRIPTION
`proteins`	Dataframe of proteins and quantifications. TYPE: `DataFrame`
`go_term`	The name of the GO term of interest. TYPE: `str`
`annot`	The annotations dataframe for the dataset. TYPE: `DataFrame`

RETURNS	DESCRIPTION
`DataFrame`	Dataframe with protein quantifications and whether each protein is in the given term.

`gopher.roc(proteins, go_term, aspect='all', species='human', release='current', fetch=False)`

Plot the ROC curve for a GO term in each sample.

PARAMETER	DESCRIPTION
`proteins`	Dataframe of proteins and quantifications. TYPE: `DataFrame`
`go_term`	The name of the GO term of interest. TYPE: `str`
`aspect`	The Gene Ontology aspect to use. Use “cc” for “Cellular Component”, “mf” for “Molecular Function”, “bp” for “Biological Process”, or “all” for all three. TYPE: `(str, {cc, mf, bp, all})` DEFAULT: `'all'`
`species`	The species for which to retrieve GO annotations. If not “human” or “yeast”, see here. TYPE: `str, {"human", "yeast", ...}, optional.` DEFAULT: `'human'`
`release`	The Gene Ontology release version. Using “current” will look up the most current version. TYPE: `str` DEFAULT: `'current'`
`fetch`	Download the GO annotations even if they have been downloaded before? TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`pyplot`	A plot of the ROC curve for the GO term.
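
For readers who want to see what an ROC curve over ranked proteins means, the following sketch computes the curve's points and its area for one sample, treating term membership as the positive label. It is an illustration only: the helper names are hypothetical, tied scores are not grouped, and no plotting is done.

```python
def roc_points(scores, in_term):
    """Return (false positive rate, true positive rate) points.

    scores:  abundance values, one per protein
    in_term: booleans, True if the protein belongs to the GO term
    """
    positives = sum(1 for flag in in_term if flag)
    negatives = len(in_term) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need proteins both inside and outside the term")
    # Sweep the threshold from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i in order:
        if in_term[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points sorted by x."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


points = roc_points([0.9, 0.8, 0.7, 0.1], [True, False, True, False])
auc = area_under_curve(points)
```

An area near 1 means proteins in the term sit at the top of the ranking; an area near 0.5 means their positions look random.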