About Log Files
About the Logs directory
The logs directory is created as a subdirectory of every analysis results directory. It contains several log files that can help with troubleshooting when problems occur. They are also very useful to CyVerse Support and to the developers of the new app interface. This page describes each log file. Note that not all log files are returned for every analysis.
The Job Execution Framework (JEX) treats a single-app analysis as a workflow that contains only one app, so the files are numbered even when a single tool is executed. In fact, every analysis is a workflow, because the JEX adds executions of administrative tools to every analysis. The numbers in the log file names (as in condor-0-input-0-stderr) correspond to the number of the step in your analysis; in a single-app analysis, the number is always 0.
Several log files have err or stdout in their names, and one or more of them may contain useful error and warning messages. One effective troubleshooting technique is to examine every err and stdout file that has a nonzero size.
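The technique above can be sketched in a few lines of Python. This is only an illustration, not part of any CyVerse tooling: the function name nonempty_logs is hypothetical, and it assumes the logs directory has already been downloaded to a local path.

```python
from pathlib import Path


def nonempty_logs(logs_dir):
    """Return (file name, size in bytes) pairs for non-empty log files.

    Only regular files whose names contain "err" or "stdout" are
    considered, following the troubleshooting tip above. The result
    is sorted by file name.
    """
    results = []
    for path in sorted(Path(logs_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        if "err" in path.name or "stdout" in path.name:
            size = path.stat().st_size
            if size > 0:
                results.append((path.name, size))
    return results
```

For example, calling nonempty_logs("logs") from inside a downloaded analysis results directory returns the files worth opening first.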
Renamed Condor files
The files that followed the naming convention condor-0-input-0-stderr and condor-0-input-0-stdout have been renamed to follow this convention: logs-stderr-input-0 and logs-stdout-input-0. This was done because those log files are no longer generated by Condor, and to make the filenames easier to sort. The format of the content may have changed, but the files still contain the output of the file transfer steps.
The files that follow the naming convention of condor-stderr-0 and condor-stdout-0 have not been changed.
logs-stderr-input-0 (renamed from condor-0-input-0-stderr): stderr output from the first input file transfer of the first app in the workflow.
logs-stdout-input-0 (renamed from condor-0-input-0-stdout): stdout output from the first input file transfer of the first app in the workflow.
condor-stderr-0: stderr output from the execution of the analysis for the first app in the workflow.
condor-stdout-0: stdout output from the execution of the analysis for the first app in the workflow.
Removed files
- imkdir log files: The imkdir log files, imkdir-stderr and imkdir-stdout, are no longer returned. The output directory is now created by the tool that uploads the output files, so these logs are no longer created.
- iPlant log files: The iPlant log files, iplant.cmd and iplant.sh, are no longer returned. We no longer generate an iplant.sh file for each job, and the iplant.cmd file is a templated file that barely changes between jobs. The logic for running the applications is now handled by a tool written specifically for that job, so we no longer have to generate bash scripts. The new JobSummary.csv file (see below) roughly corresponds to the useful information that was contained in the iplant.cmd file before it was eliminated.
- Output log files: The output log files, output-last-stderr and output-last-stdout, are no longer returned. There was a nasty race condition in which the tool that performs file transfers would sometimes try to upload these files while that same tool was still writing to them, resulting in a crash and job failures.
- Script log files: The script log files, script-condor-log, script-error.log, and script-output.log, are no longer returned. They were created by HTCondor in the submission directory on our submission node, in an NFS mount shared across all of our compute nodes. The NFS mount is a single point of failure that could cause all jobs running in the cluster to fail at the same time, so we have been moving everything off of it. Unfortunately, this means that the script log files are now available only on the submission node (since that is where HTCondor creates them), and we have not yet found a good way of transferring the files into iRODS without turning the submission node into a significant performance bottleneck.
New files
- JobSummary.csv contains a summary of the job as a CSV file. This roughly corresponds to the useful information that was contained in the iplant.cmd file before it was eliminated.
- JobParameters.csv contains the command-line flags used to invoke the tools used in the app. This can be used to figure out what flags were passed to a tool, what order they were passed in, and how they were parsed by the system.
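Either CSV file can be inspected with a short script once it has been downloaded. The sketch below uses Python's standard csv module and makes no assumption about the column names or about whether the file has a header row, since neither is documented here; the function name read_rows is hypothetical.

```python
import csv


def read_rows(csv_path):
    """Return the non-empty rows of a CSV file as lists of strings.

    No column layout is assumed: each row is returned exactly as the
    csv module parses it, so quoted values containing commas stay
    together as one field.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle) if row]
```

Printing each row returned by read_rows("JobParameters.csv") in order shows the recorded flags in the order they appear in the file.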