Workshop Report - Big Data + Bioimaging Informatics
Participants: 38 participants whose research relies on large-scale image data described their project workflows and their computational and data management challenges, and discussed strategies for overcoming the bottlenecks. The projects were highly diverse (e.g., 3D reconstruction of brain electron micrographs, 3D modeling of plants, breast cancer imaging, high-throughput phenotyping of plants in greenhouse and field, patterns of wildebeest migration, species diversity in marine trenches, metal alloy grain composition and stress tolerance).
Challenges to getting science done: The challenges included moving, storing, and managing up to tens of TB of data per experiment; image registration, feature segmentation, and classification; feeding the outputs of image analysis into downstream workflow steps; working both on local machines and in the cloud; scaling analyses; the need for large, diverse, ground-truthed datasets for algorithm development and testing; and the need for an environment in which a variety of algorithms are readily available to try on one's data.
Strategies for overcoming challenges: Participants discussed how iPlant and Bisque, as well as other projects, approach these issues. The Bisque team walked attendees through accessing Bisque services from a MATLAB script and integrating analysis modules into Bisque. One session on the second day was devoted to discussing the attributes of a responsive, innovative cyberinfrastructure for image analysis.
Deliverables: A workshop report that cross-references aspects of the Wisconsin image data workshop will be submitted to NSF. A group of the PIs will write a white paper, a perspective piece on the attributes of a responsive, innovative computational infrastructure to support image-dependent science. The third deliverable is a document describing the next priorities for supporting image data in Bisque and iPlant.