vContact 0.1.49 and vContact-PCs
vContact is a tool that performs guilt-by-contig-association automatic classification of viral contigs.
Complete documentation can be found at either the source tool's original site or its new location. This app is under active development and will
undergo a major revision in the near future. This version will remain available, and the revised version will be added as a separate app in the HPC.
Quick Start
- The quickest way to use vContact is to run vContact-PCs on a BLAST file and provide a contig info file. vContact-PCs will create the appropriate input files for use with vContact.
- Because this app runs at the Texas Advanced Computing Center (TACC), there will be a queue time before it begins running. Once it starts, the time limit is 1 hour, which should be sufficient for datasets of nearly any size.
Test Data
Test data for this app is available directly in the Discovery Environment, in the Data window under Community Data -> iplantcollaborative -> example_data -> vContact.
Input File(s)
There are three required input files. Each must be in either TSV (tab-separated values) or CSV (comma-separated values) format. Formats can be mixed *between files*, but not *within files*.
Protein Clusters Info file: Contains protein cluster information. Must contain the headers "id" and "size".
- If using the test data, this is the file pc_info.tsv
Contig Info file: Contains contig information. At the very least, it must have the contig name and the number of proteins associated with the contig. Must contain the headers "id", "proteins", and "size".
- If using the test data, this is the file contigs.tsv
Protein Clusters Profiles file: Contains the association of contigs with protein clusters. Must contain the headers "contig_id" and "pc_id".
- If using the test data, this is the file pcprofiles.tsv
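Before submitting a job, it can help to confirm that each input file carries the required headers. The following is a minimal sketch, not part of vContact itself: the function names, the `REQUIRED_HEADERS` keys, and the tab-versus-comma detection rule are assumptions made for illustration, and the `contig_id` spelling for the profiles file is assumed as well.

```python
import csv
import io

# Required headers per input file, as listed above.
# The dictionary keys are labels chosen for this sketch only.
REQUIRED_HEADERS = {
    "pc_info": {"id", "size"},
    "contigs": {"id", "proteins", "size"},
    "pcprofiles": {"contig_id", "pc_id"},  # assumed spelling
}


def read_header(handle):
    """Return the column names found on the first line of a TSV or CSV file."""
    first_line = handle.readline()
    # Assumption: a tab anywhere in the header line means TSV, otherwise CSV.
    delimiter = "\t" if "\t" in first_line else ","
    row = next(csv.reader([first_line], delimiter=delimiter), [])
    return [name.strip() for name in row]


def missing_headers(handle, kind):
    """Return the sorted list of required headers that the file lacks."""
    return sorted(REQUIRED_HEADERS[kind] - set(read_header(handle)))


if __name__ == "__main__":
    # In-memory stand-in for a contig info file; replace with open("contigs.tsv").
    sample = io.StringIO("id\tproteins\tsize\ncontig_1\t12\t15000\n")
    print(missing_headers(sample, "contigs"))
```

An empty list means every required header is present; any names returned are the ones to add before running the app.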
Parameters Used in App
The app exposes a number of parameters. Change the defaults only if you understand their effect.
In general, changing them won't substantially affect the results. For a detailed guide to what each of these options does, please check the documentation.
- Significativity threshold: Significativity threshold in the contig similarity network
- Use permissive: Use permissive affiliation (Flag this option to increase the number of contigs retained in the network)
- Inflation: Inflation parameter to define contig clusters with MCL
- Module inflation: Inflation parameter to define protein modules with MCL
- Module significativity: Significativity threshold in the protein cluster similarity network
- Module shared min: Minimal number (inclusive) of contigs a PC must appear in to be taken into account when computing the modules
- Link significativity: Significativity threshold to link a cluster and a module
- Link proportion: Proportion of a module's PCs that a contig must have to be considered as displaying this module
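To give some intuition for the two inflation parameters, the sketch below shows the inflation step used by the MCL algorithm in general: each entry of a column of transition probabilities is raised to the power r, and the column is then rescaled to sum to 1. This is an illustration of the operator only, not vContact's or MCL's actual code; the function name and the example values are made up for the purpose.

```python
def inflate(column, r):
    """Raise each probability in a column to the power r, then renormalize.

    Assumes the column holds non-negative values with a positive sum.
    """
    powered = [value ** r for value in column]
    total = sum(powered)
    return [value / total for value in powered]


if __name__ == "__main__":
    # Hypothetical transition probabilities from one node to three neighbours.
    column = [0.5, 0.3, 0.2]
    for r in (1.5, 2.0, 4.0):
        print(r, [round(value, 3) for value in inflate(column, r)])
```

Larger values of r push more weight onto the strongest connection, which is why a higher inflation setting generally yields smaller, more numerous clusters.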
Output File(s)
The output directory created by vContact contains a number of files.
- cc(*) files are contig cluster files
- mod(*) are module files
- (*).clusters are TSV formatted files containing the clusters, 1 cluster per line
- (*).ntw are TSV formatted *edge-list* files, with source, target and edge weight. These files can be used as input for a variety of graph visualization tools.
- (*).pandas are pandas-formatted tables, generated by the pandas Python package
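The (*).clusters and (*).ntw files are plain text and can be read with a few lines of Python. The sketch below assumes the layout described above: tab-separated fields, no header line, one cluster per line, and three columns per edge. The function names and the sample contig names are hypothetical.

```python
import io


def read_edge_list(handle):
    """Parse an edge-list file into (source, target, weight) tuples."""
    edges = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Assumption: exactly three tab-separated fields per line.
        source, target, weight = line.split("\t")
        edges.append((source, target, float(weight)))
    return edges


def read_clusters(handle):
    """Parse a clusters file into a list of member lists, one per line."""
    clusters = []
    for line in handle:
        members = [name for name in line.rstrip("\r\n").split("\t") if name]
        if members:
            clusters.append(members)
    return clusters


if __name__ == "__main__":
    # In-memory stand-ins for the output files; replace with open(path).
    ntw = io.StringIO("contig_1\tcontig_2\t12.5\ncontig_2\tcontig_3\t3.0\n")
    clu = io.StringIO("contig_1\tcontig_2\ncontig_3\n")
    print(read_edge_list(ntw))
    print(read_clusters(clu))
```

The resulting tuples can be passed on to a graph library or written out in whatever format a visualization tool expects.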
Tool Source for App
This app was created from the project's original source and is now forked at its new location.