Preparing Community Released Data Folders

Overview

Community Released Data folders are available for evolving datasets that individuals or communities want to make available as quickly as possible for research and reuse, especially within CyVerse analysis platforms. They are intended for datasets that are growing or changing frequently, or that may not need long-term preservation. For more information on the policies that apply to Community Released Data, see the CyVerse Data Policy.

To request a public folder in Community Released Data, use this form, then prepare your data as described below.

Data can transition from Community Released Data to CyVerse Curated Data by Requesting a Permanent Identifier in the Data Commons.

All Community Released data folders are located in /iplant/home/shared and are visible to anyone at http://datacommons.cyverse.org/browse/iplant/home/shared or to registered CyVerse users in the Discovery Environment under "Community Data".

Step 1: Request the data folder

Use this form to submit a request for a Community Released Data folder. You will need to supply a folder name (no spaces or special characters), a description of the data, the estimated size of the dataset, and a list of collaborators. If your dataset is larger than 1TB, you must provide a sustainability plan that describes what you will do with the data should CyVerse no longer be able to host it. After your request is approved, the folder will be created in /iplant/home/shared and you will have ownership permission. The folder will not be made public until metadata is applied.
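Before filling in the form, it can help to check a draft request against the requirements listed above. The following is only a sketch, not part of any CyVerse tool: the function name check_folder_request is hypothetical, "no spaces or special characters" is assumed to mean letters, digits, underscores, and hyphens, and 1TB is assumed to be a decimal terabyte.

```python
import re

# Assumption: "no spaces or special characters" is interpreted here as
# letters, digits, underscores, and hyphens only.
FOLDER_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")

# Assumption: 1TB is treated as a decimal terabyte (10**12 bytes).
ONE_TB_IN_BYTES = 10**12


def check_folder_request(name, description, size_bytes, collaborators,
                         sustainability_plan=""):
    """Return a list of problems found in a draft request (empty if none).

    `collaborators` is accepted as-is; whether the list may be empty
    is not stated, so it is not checked here.
    """
    problems = []
    if not FOLDER_NAME_PATTERN.fullmatch(name):
        problems.append("folder name contains spaces or special characters")
    if not description.strip():
        problems.append("description is missing")
    if size_bytes > ONE_TB_IN_BYTES and not sustainability_plan.strip():
        problems.append("datasets larger than 1TB need a sustainability plan")
    return problems
```

For example, a request for a 2TB dataset without a sustainability plan would be flagged by this check, while the same request with a plan would pass.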

Step 2: Apply metadata to your folder

Community Released Data folders must have the minimal metadata needed for attribution. We strongly encourage the use of additional metadata that describes the content of your dataset.

Apply metadata through the DE, using the Dublin Core metadata template. See Using Metadata in the DE for instructions on how to apply a metadata template or apply metadata in bulk. The Dublin Core template contains 22 fields, but only 7 of them are required: Title, Subject, Description, Rights, Creator (may be a consortium), Publisher (automatically set to CyVerse Data Commons), and Date. In the Rights field, we recommend an open access license such as CC0 or ODC-PDL. See the Permanent Identifier FAQs page for more information on rights. You can enter fields such as Creator, Contributor, or Subject more than once.
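If you draft your metadata outside the DE first, a quick check for the seven required fields can catch omissions early. This is a minimal sketch under stated assumptions: the dictionary layout (field name mapped to a list of values, so that fields can repeat) and the function name missing_required_fields are illustrative only and do not reflect how the DE stores metadata.

```python
# The seven required Dublin Core fields named in this step.
REQUIRED_FIELDS = (
    "Title", "Subject", "Description", "Rights",
    "Creator", "Publisher", "Date",
)


def missing_required_fields(metadata):
    """Return the required fields that have no non-blank value.

    `metadata` is a hypothetical dict mapping a field name to a list of
    values, which allows a field such as Creator to appear more than once.
    """
    missing = []
    for field in REQUIRED_FIELDS:
        values = metadata.get(field, [])
        if not any(str(value).strip() for value in values):
            missing.append(field)
    return missing


# Illustrative draft; all values are made up for the example.
draft = {
    "Title": ["Example maize expression dataset"],
    "Creator": ["Example Consortium"],
    "Subject": ["maize", "RNA-seq"],
    "Rights": ["CC0"],
    "Publisher": ["CyVerse Data Commons"],
}
print(missing_required_fields(draft))
```

With the draft above, the check reports Description and Date as missing.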

Any metadata you add will be displayed on the dataset's landing page under http://datacommons.cyverse.org/, will be indexed for search, and will aid in data discovery. The required metadata will be used to generate a citation for your dataset.

Step 3: Upload and organize your data

If your data are already on the CyVerse Data Store, you can move them to your Community Released folder. Otherwise, upload them using one of the methods described in Downloading and Uploading Data. If you need help uploading or moving a dataset that contains very large files (many GB) or many thousands of files, please contact the CyVerse data curators.

Step 4: Make your data public

Once you have applied the required metadata and put some data in your folder, you can make it public. If you have experience Using iCommands, you can do this with ichmod by giving read permission to the users "public" and "anonymous". Otherwise, contact the CyVerse data curators, who will make the folder public for you.
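As a rough illustration of the ichmod step, the sketch below builds the two commands without running them, so you can review them first. The folder path /iplant/home/shared/example_folder and the helper build_ichmod_commands are hypothetical. The use of the -r (recursive) option is an assumption about what is wanted for a folder with contents; check with the data curators if you are unsure.

```python
import shlex

# The two users named in this step.
PUBLIC_USERS = ("public", "anonymous")


def build_ichmod_commands(folder_path, recursive=True):
    """Build one ichmod argument list per user, granting read permission.

    Nothing is executed here; the lists are returned for review.
    Assumption: -r is used so the permission also applies to the
    folder's contents.
    """
    commands = []
    for user in PUBLIC_USERS:
        command = ["ichmod"]
        if recursive:
            command.append("-r")
        command += ["read", user, folder_path]
        commands.append(command)
    return commands


# Hypothetical folder path, for illustration only.
for command in build_ichmod_commands("/iplant/home/shared/example_folder"):
    print(shlex.join(command))
```

The printed lines can be pasted into a terminal where iCommands is already configured for your CyVerse account.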

Need help?

See the Permanent Identifier FAQs page for answers to some common questions. Please contact the CyVerse data curators if you have questions about how to organize your data, what metadata to include, or which license to apply.