Publishing Data through the Data Commons

The Data Commons publishes data to our own repository at datacommons.cyverse.org as well as to external repositories. All data published to CyVerse Curated Data receive a permanent identifier (PID) in the form of a DOI (Digital Object Identifier) or ARK (Archival Resource Key) and are expected to be stable and permanent. Data published to the Community Released folder do not have PIDs and may be changed or removed at any time. All data published to the Data Commons are expected to have at least minimal metadata. The sections below provide more information on each type of data publication available through CyVerse. For more details on the range of data sharing options in CyVerse, see the CyVerse Data Policy and Data Commons User Agreement.

Publishing CyVerse Curated Data

Data publication to CyVerse Curated Data is a service offered for datasets that are intended to be stable and permanent. For CyVerse Curated Data, the Data Commons provides landing pages and permanent DOIs or ARKs, and requires an open data license. Permanent identifiers allow data to have a stable location on the web so that other users can always find them, along with the information that makes them understandable, citable, and reusable. An open data license is important to allow others to reuse your data, but it does not exempt users from the obligation to cite your data correctly.

For more information about whether CyVerse Curated Data is right for your dataset, the difference between DOIs and ARKs, and other questions, see the Permanent Identifier FAQs page and the Data Commons Policy.

When you are ready to publish, see the quickstart on how to request a DOI.

Publishing Community Released Data

Community Released Data folders are available for evolving datasets that individuals or communities want to make available as quickly as possible for research and reuse. Community Released Data are intended for datasets that are growing or changing frequently or that may not need long-term preservation. Data can transition from Community Released Data to CyVerse Curated Data by requesting a DOI or ARK.

To prepare your community data for public release, see Preparing Community Released Data Folders.

Publishing to external repositories

Currently, CyVerse users can publish data directly to the NCBI Sequence Read Archive (SRA) and the NCBI Whole Genome Shotgun (WGS) archive. To suggest additional repositories to publish to, contact us.

To submit your files directly from CyVerse to the SRA, see the NCBI Sequence Read Archive (SRA) Submission (Workflow Tutorial).

To submit your files directly from CyVerse to the WGS archive, see the NCBI Whole Genome Shotgun (WGS) Submission Tutorial.