CEGMA: Core Eukaryotic Genes Mapping Approach

CEGMA (Core Eukaryotic Genes Mapping Approach) is a "method [...] for building a highly reliable set of gene annotations in the absence of experimental data [using] a set of 458 core proteins that are present in a wide range of taxa."

NOTE: Development of CEGMA has been discontinued. 

See this notice. The Discovery Environment will continue to host CEGMA for a little while longer, but the app may not work properly. We recommend that users switch to an alternative tool, such as BUSCO, which is also available in the Discovery Environment.

Quick Start

  • To use CEGMA, load your genome in FASTA format.
  • Resources: Understanding CEGMA output
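
Before uploading a genome, it can help to confirm that the file really is well-formed FASTA. The following is a minimal sketch of such a check, not part of CEGMA or the Discovery Environment. The function name `check_fasta` and the choice to accept only IUPAC nucleotide codes are assumptions made for illustration.

```python
import io


def check_fasta(handle):
    """Return (n_records, problems) for a FASTA text stream.

    Hypothetical helper: it only checks basic structure and that sequence
    lines contain IUPAC nucleotide codes (an assumption; gap characters
    such as '-' are reported as unexpected).
    """
    allowed = set("ACGTUNRYSWKMBDHV")
    n_records = 0
    problems = []
    seq_len = 0  # number of residues seen for the current record
    for lineno, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header: the previous record must have had some sequence.
            if n_records and seq_len == 0:
                problems.append(f"record before line {lineno} has no sequence")
            n_records += 1
            seq_len = 0
        else:
            if n_records == 0:
                problems.append(f"line {lineno}: sequence before first header")
            bad = set(line.upper()) - allowed
            if bad:
                problems.append(f"line {lineno}: unexpected characters {sorted(bad)}")
            seq_len += len(line)
    if n_records and seq_len == 0:
        problems.append("last record has no sequence")
    if n_records == 0:
        problems.append("no FASTA header lines found")
    return n_records, problems


if __name__ == "__main__":
    # In-memory example; for a real file, pass open("genome.fa") instead
    # (the file name is a placeholder).
    count, issues = check_fasta(io.StringIO(">contig1\nACGTNN\n>contig2\nGGCC\n"))
    print(count, issues)
```

An empty problem list only means the file looks structurally like nucleotide FASTA. It says nothing about whether CEGMA will run successfully on it.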


    Test Data

    Test data for this app is available in the Discovery Environment, in the Data window under Community Data -> iplantcollaborative -> example_data -> directory.

    Input File(s)

    Use TerriblyIncomprehensible.txt and HorriblyWritten.txt from the directory above as test input.

    Parameters Used in App

    When running the app in the Discovery Environment, use the following parameters with the input file(s) above to reproduce the output described in the next section.


    • Default parameters only, no further configuration needed.

    Output File(s)

    Expect a text file named after the input files as output. For the test case, the output file in the example_data directory is named BeautifulProse.txt.

Tool Source for App