pocketutils.biochem.uniprot_go

Module Contents

class pocketutils.biochem.uniprot_go.FlatGoTerm

A flat Gene Ontology term. Not to be confused with goatools.obo_parser.GOTerm from goatools.

- identifier

(str); ex: GO:0005737

- kind

(str); ‘P’ == process, ‘C’ == component, ‘F’ == function

- description

(str)

- sourceId

(str); ex: IDA

- sourceName

(str); ex: UniProtKB

classmethod parse(stwing: str)

Builds a GO term from a string from uniprot_obj[‘go’].

Raises

ValueError – if the syntax is wrong.
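The attributes listed above can be read off a single UniProt GO cross-reference string. The following is a minimal sketch, assuming the input looks like `GO:0005737; C:cytoplasm; IDA:UniProtKB` (the layout of GO lines in UniProt flat files). The exact format that parse accepts is not stated here, and `SketchGoTerm` and its regex are hypothetical stand-ins, not the library's implementation.

```python
import re
from dataclasses import dataclass

# Assumed input layout: "GO:0005737; C:cytoplasm; IDA:UniProtKB"
# (an optional trailing period is tolerated).
_GO_PATTERN = re.compile(r"^(GO:\d+); ([PCF]):(.+?); ([A-Za-z]+):(.+?)\.?$")


@dataclass(frozen=True)
class SketchGoTerm:
    """Hypothetical stand-in mirroring the fields of FlatGoTerm."""

    identifier: str
    kind: str
    description: str
    sourceId: str
    sourceName: str

    @classmethod
    def parse(cls, text: str) -> "SketchGoTerm":
        match = _GO_PATTERN.match(text.strip())
        if match is None:
            # Mirror the documented behavior: bad syntax is a ValueError.
            raise ValueError(f"Bad GO term format: {text!r}")
        return cls(*match.groups())


term = SketchGoTerm.parse("GO:0005737; C:cytoplasm; IDA:UniProtKB")
```

Here `term.kind` is the single-letter code and `term.sourceId` is the evidence code, matching the attribute examples above.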

class pocketutils.biochem.uniprot_go.GoTermsAtLevel

Gene ontology terms organized by level.

Example

go_term_ancestors_for_uniprot_id_as_df('P42681', 2)

query_obo_term(term_id: str) → goatools.obo_parser.GOTerm

Queries a term through the global obo. The underlying lookup only logs a warning when the term is not found; this function wraps that call and raises a ValueError instead.
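The wrapping pattern described here can be sketched as follows. `SketchDag` is a hypothetical dict-based stand-in for the loaded OBO graph, and `query_term_strict` is an illustrative name. Neither is part of pocketutils or goatools.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class SketchDag(dict):
    """Hypothetical stand-in for a loaded OBO graph: maps term IDs to names."""

    def query_term(self, term_id: str) -> Optional[str]:
        # Lenient lookup: log a warning and return None when the term is missing.
        if term_id not in self:
            logger.warning("Term %s not found", term_id)
            return None
        return self[term_id]


def query_term_strict(dag: SketchDag, term_id: str) -> str:
    """Wrap the lenient lookup so that a missing term raises ValueError."""
    term = dag.query_term(term_id)
    if term is None:
        raise ValueError(f"Term {term_id} not found")
    return term


dag = SketchDag({"GO:0005737": "cytoplasm"})
found = query_term_strict(dag, "GO:0005737")
```

Raising early keeps a missing term from propagating as `None` into later traversal code.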

get_ancestors_of_go_term(term_id: str, level: int) → Iterable[goatools.obo_parser.GOTerm]

From a GO term in the form ‘GO:0007344’, returns a set of ancestor GOTerm objects at the specified level. The traversal is restricted to is-a relationships. Note that the level is the minimum number of steps to the root.

Parameters
  • term_id – the GO term ID, e.g. ‘GO:0007344’

  • level – starting at 0 (root)
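To illustrate what "ancestors at a level" means when the level is the minimum number of is-a steps to the root, here is a small sketch on a hand-built graph. `SketchTerm` and `ancestors_at_level` are hypothetical and do not use goatools. Whether the starting term itself is returned when it sits at the requested level is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SketchTerm:
    """Hypothetical stand-in for a GO term with is-a parents."""

    id: str
    parents: List["SketchTerm"] = field(default_factory=list)

    @property
    def level(self) -> int:
        # Minimum number of is-a steps to the root; the root has level 0.
        if not self.parents:
            return 0
        return 1 + min(parent.level for parent in self.parents)


def ancestors_at_level(term: SketchTerm, level: int) -> Set[str]:
    """Collect IDs of the term and its is-a ancestors that sit at `level`."""
    found: Set[str] = set()
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current.id in seen:
            continue
        seen.add(current.id)
        if current.level == level:
            found.add(current.id)
        stack.extend(current.parents)  # walk upward along is-a edges only
    return found


root = SketchTerm("root")
a = SketchTerm("a", [root])
b = SketchTerm("b", [root])
c = SketchTerm("c", [a])
d = SketchTerm("d", [c, b])  # shortest path to the root goes through b

level_one = ancestors_at_level(d, 1)
level_two = ancestors_at_level(d, 2)
```

In this graph `d` has level 2 because its shortest path to the root has two steps, even though the path through `c` has three.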

go_term_ancestors_for_uniprot_id(uniprot_id: str, level: int, kinds_allowed: Optional[Collection[str]] = None) → Iterable[goatools.obo_parser.GOTerm]

Gets the GO terms associated with a UniProt ID and returns a set of their ancestors at the specified level. The traversal is restricted to is-a relationships. Note that the level is the minimum number of steps to the root.

Parameters
  • level – starting at 0 (root)

  • uniprot_id – the UniProt ID, e.g. ‘P42681’

  • kinds_allowed – a set containing any combination of ‘P’, ‘F’, or ‘C’
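The kinds_allowed filter can be pictured as a step applied to the flat terms before any ancestor lookup. This is a minimal sketch, assuming that None means all three kinds are allowed (the default behavior is not stated here). `filter_by_kind` and the sample `(identifier, kind)` pairs are hypothetical.

```python
from typing import Collection, List, Optional, Tuple

# (identifier, kind) pairs standing in for parsed flat GO terms.
FlatTerm = Tuple[str, str]


def filter_by_kind(
    terms: Collection[FlatTerm],
    kinds_allowed: Optional[Collection[str]] = None,
) -> List[FlatTerm]:
    # Assumption: None means "allow every kind".
    allowed = {"P", "F", "C"} if kinds_allowed is None else set(kinds_allowed)
    return [term for term in terms if term[1] in allowed]


sample = [("GO:0005737", "C"), ("GO:0005524", "F"), ("GO:0006468", "P")]
only_function = filter_by_kind(sample, ["F"])
everything = filter_by_kind(sample)
```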

go_term_ancestors_for_uniprot_id_as_df(uniprot_id: str, level: int, kinds_allowed: Optional[Collection[str]] = None) → pandas.DataFrame

See go_term_ancestors_for_uniprot_id.

Parameters
  • uniprot_id – the UniProt ID, e.g. ‘P42681’

  • level – starting at 0 (root)

  • kinds_allowed – Can include ‘P’, ‘F’, and/or ‘C’

Returns

Pandas DataFrame with columns ID and name.