orthomap.of2orthomap module

Author: Kristian K Ullrich
Date: April 2023
Email: ullrich@evolbio.mpg.de
License: GPL-3

orthomap.of2orthomap.add_argparse_args(parser: ArgumentParser)

This function attaches individual argument specifications to the parser.

Parameters:

parser (argparse.ArgumentParser) – The parser to which the arguments are added.

orthomap.of2orthomap.define_parser()

A helper function for using of2orthomap.py via the terminal.

Returns:

An argparse.ArgumentParser.

Return type:

argparse.ArgumentParser
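The split between define_parser() and add_argparse_args() follows a common argparse pattern: one function builds the parser, another attaches the arguments, and main() parses the command line. The sketch below illustrates that pattern only. The function names ending in _sketch and every flag are hypothetical, chosen to mirror the parameter names of get_orthomap; the flags of the real of2orthomap command line may differ.

```python
import argparse


def add_argparse_args_sketch(parser):
    # Hypothetical flags named after get_orthomap's parameters;
    # they are assumptions, not the real of2orthomap options.
    parser.add_argument("--seqname", required=True, help="query sequence name")
    parser.add_argument("--qt", required=True, help="query species taxID")
    parser.add_argument("--out", default=None, help="path to output file")


def define_parser_sketch():
    # Build the parser first, then delegate the argument definitions.
    parser = argparse.ArgumentParser(prog="of2orthomap_sketch")
    add_argparse_args_sketch(parser)
    return parser


def main_sketch(argv=None):
    # argv=None makes argparse read sys.argv[1:], as a terminal call would.
    parser = define_parser_sketch()
    return parser.parse_args(argv)


args = main_sketch(["--seqname", "Danio_rerio.GRCz11.cds.longest", "--qt", "7955"])
print(args.seqname, args.qt, args.out)
```

Keeping the argument definitions in their own function lets another tool reuse them, for example by attaching them to a subparser.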

orthomap.of2orthomap.get_continuity_score(og_name, youngest_common_counts_df)

This function calculates a continuity score for a given orthologous group and its corresponding LCA counts.

Parameters:
  • og_name (str) – Orthologous group name.

  • youngest_common_counts_df (pandas.DataFrame) – DataFrame with LCA counts.

Returns:

Continuity score.

Return type:

float

Example

>>>

orthomap.of2orthomap.get_counts_per_ps(omap_df, psnum_col='PSnum', pstaxid_col='PStaxID', psname_col='PSname')

This function returns counts per phylostratum.

Parameters:
  • omap_df (pandas.DataFrame) – DataFrame with orthomap results.

  • psnum_col (str) – Specify PSnum column name.

  • pstaxid_col (str) – Specify PStaxID column name.

  • psname_col (str) – Specify PSname column name.

Returns:

DataFrame with counts per phylostratum.

Return type:

pandas.DataFrame
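To make "counts per phylostratum" concrete, here is a minimal sketch of one way such a table could be computed with pandas. It is not the library's implementation: the helper name counts_per_ps_sketch, the toy data, the seqID column, and the name of the output column (counts) are all assumptions made for illustration.

```python
import pandas as pd


def counts_per_ps_sketch(omap_df, psnum_col="PSnum",
                         pstaxid_col="PStaxID", psname_col="PSname"):
    # Count rows (genes) for every phylostratum, identified by the
    # combination of its number, taxID, and name.
    return (
        omap_df.groupby([psnum_col, pstaxid_col, psname_col])
        .size()
        .reset_index(name="counts")  # "counts" is an assumed column name
    )


# Toy orthomap-like table; the seqID column and all values are hypothetical.
toy_omap = pd.DataFrame({
    "seqID": ["g1", "g2", "g3", "g4"],
    "PSnum": [0, 0, 2, 5],
    "PStaxID": ["131567", "131567", "33154", "7742"],
    "PSname": ["cellular organisms", "cellular organisms",
               "Opisthokonta", "Vertebrata"],
})

toy_counts = counts_per_ps_sketch(toy_omap)
print(toy_counts)
```

Grouping on all three columns keeps the phylostratum number, taxID, and name together in the result, so the table can be read or plotted without a further join.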

Example

>>> from orthomap import datasets, of2orthomap, qlin
>>> datasets.ensembl105(datapath='.')
>>> query_orthomap = of2orthomap.get_orthomap(
...     seqname='Danio_rerio.GRCz11.cds.longest',
...     qt='7955',
...     sl='ensembl_105_orthofinder_species_list.tsv',
...     oc='ensembl_105_orthofinder_Orthogroups.GeneCount.tsv',
...     og='ensembl_105_orthofinder_Orthogroups.tsv',
...     out=None,
...     quiet=False,
...     continuity=True,
...     overwrite=True)
>>> of2orthomap.get_counts_per_ps(
...     omap_df=query_orthomap[0],
...     psnum_col='PSnum',
...     pstaxid_col='PStaxID',
...     psname_col='PSname')

orthomap.of2orthomap.get_orthomap(seqname, qt, sl, oc, og, out=None, quiet=False, continuity=True, overwrite=True)

This function returns an orthomap for a given query species and OrthoFinder input data.

Parameters:
  • seqname (str) – Sequence name of the query species used for OrthoFinder comparison.

  • qt (str) – Query species taxID.

  • sl (str) – Path to species list file containing <OrthoFinder name><tab><species taxID>.

  • oc (str) – Path to OrthoFinder result <Orthogroups.GeneCount.tsv> file.

  • og (str) – Path to OrthoFinder result <Orthogroups.tsv> file.

  • out (str) – Path to output file.

  • quiet (bool) – Specify if output should be quiet.

  • continuity (bool) – Specify if continuity score should be calculated.

  • overwrite (bool) – Specify if output should be overwritten.
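The species list passed as sl is described above as a tab-separated file with one <OrthoFinder name><tab><species taxID> pair per line. The sketch below writes and reads such a file using only the standard library. It assumes the file has no header row; the helper names, the file name, and the second species entry are hypothetical.

```python
import csv
import os
import tempfile

# One (OrthoFinder name, species taxID) pair per species.
rows = [
    ("Danio_rerio.GRCz11.cds.longest", "7955"),
    ("Homo_sapiens.GRCh38.cds.longest", "9606"),  # hypothetical entry
]


def write_species_list(path, pairs):
    # Tab-separated, no header row (an assumption about the expected format).
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(pairs)


def read_species_list(path):
    # Map each OrthoFinder name to its taxID, kept as a string.
    with open(path, newline="") as handle:
        return {name: taxid for name, taxid in csv.reader(handle, delimiter="\t")}


species_list_path = os.path.join(tempfile.mkdtemp(), "species_list.tsv")
write_species_list(species_list_path, rows)
species = read_species_list(species_list_path)
print(species)
```

The taxIDs stay strings here, matching the qt='7955' form used in the examples in this module.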

Returns:

A list of three results, in this order: orthomap, species_list, youngest_common_counts.

Return type:

list

Example

>>> from orthomap import datasets, of2orthomap, qlin
>>> datasets.ensembl105(datapath='.')
>>> query_orthomap, orthofinder_species_list, of_species_abundance = of2orthomap.get_orthomap(
...     seqname='Danio_rerio.GRCz11.cds.longest',
...     qt='7955',
...     sl='ensembl_105_orthofinder_species_list.tsv',
...     oc='ensembl_105_orthofinder_Orthogroups.GeneCount.tsv',
...     og='ensembl_105_orthofinder_Orthogroups.tsv',
...     out=None,
...     quiet=False,
...     continuity=True,
...     overwrite=True)
>>> query_orthomap

orthomap.of2orthomap.get_youngest_common_counts(qlineage, species_list)

This function returns LCA counts for a given query species lineage.

Parameters:
  • qlineage (list) – Query lineage information.

  • species_list (pandas.DataFrame) – Species list.

Returns:

DataFrame with LCA counts.

Return type:

pandas.DataFrame

Example

>>>

orthomap.of2orthomap.main()

The main function, called when of2orthomap is run from the terminal.