2012.10.11 Range Maps
Range Models
October 11, 2012
Meeting Objectives
- Discuss how the various data files will be used.
- List output files needed for the BIEN wg meeting. (PIs, John, Brad)
- List output files the community is likely to want to access fairly often. (PIs, Mark, John)
- List output files that can be archived. (PIs, Mark, John)
- List applications needed to analyze and view the relevant output files. (PIs, Mark, John, Brad)
- Develop a (preliminary) data management plan for the BIEN species range model data.
- Identify the approximate size of each of those groups of files. (John)
- Discuss management of data that will be actively used versus data to be archived. (Nirav, Mark, PIs)
- Identify where and how the various files can/will be stored for use and for archiving. (Nirav, Mark, PIs)
- Briefly discuss the longer term goals for range modeling as they impact computing and data management needs. (PIs, Mark, Nirav)
- Discuss the estimated useful lifespan of these data (the various files). (PIs, Mark)
- Plan of action
- What needs to be done? (Martha)
- Assign tasks (Martha)
Participants
John Donoghue, Brad Boyle, Brian Enquist, Nirav Merchant, Edwin Skidmore, Mark Schildhauer, Jim Regetz, Martha Narro
Current location of range model files
- Edwin to contact Paul re leaving files on Longhorn for a while longer.
- John to get rough timeframe for completing computing (1 week).
Preliminary list of what’s needed for the BIEN wg meeting in November
- Products
- Summary file for geographic ranges (Completed by JCD)
- tables of the species modeled, outputs and sample sizes (Completed by JCD)
- table of sources of raw data for acknowledgements (To be Done by Brad)
- table of range areas and basic statistics (Completed by JCD)
- map products: best probability map and thresholded map (Completed by JCD)
- Diversity map for the New World: raster map displaying the number of species per cell.
- We have 88,000+ range maps to stack. What is the best way to accomplish this?
- Nirav has suggested gridding the maps and computing the diversity of each cell. (To be done by iPlant - Nirav; or to be done by John with iPlant's assistance)
- Gridded maps (want to be able to access the same climate data – bioclim)
- JPEGs for pubs and raw shape files for simple visual display of range maps on the BIEN website (To be Done by John)
- Descriptive statistics of range-specific climate (and climate variability) for each species. (In Process by JCD)
- Need new R scripts to generate many/most of these products
- Analyses
- Doing any sort of mapping, e.g. creating diversity maps (superimposing all range maps to create overall biodiversity map)
- Diversity of different groups, different habits.
- correlations on range size, conservatism
- GIS work mapping out diversity
- Doing spatial joins on maps
- overlay range predictions, threshold to presence/absence
- Ability to look at the maps
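The gridding approach suggested for the diversity map (threshold each range prediction to presence/absence, then count species per cell) can be sketched roughly as below. This is a minimal illustration, not the group's actual workflow: it assumes every map has already been gridded to the same rows and columns, and the cutoff value, function names, and toy grids are all hypothetical. Maps are summed one at a time so that only a single map needs to be in memory, which matters with 88,000+ ranges.

```python
from typing import Iterable, List

Grid = List[List[float]]


def threshold(prob: Grid, cutoff: float) -> List[List[int]]:
    """Convert a probability grid to presence (1) / absence (0)."""
    return [[1 if p >= cutoff else 0 for p in row] for row in prob]


def richness(maps: Iterable[Grid], cutoff: float,
             nrows: int, ncols: int) -> List[List[int]]:
    """Count species per cell by adding thresholded maps one at a time."""
    total = [[0] * ncols for _ in range(nrows)]
    for prob in maps:
        # All maps must share one grid, otherwise cells do not line up.
        if len(prob) != nrows or any(len(row) != ncols for row in prob):
            raise ValueError("all maps must share the same grid")
        for r, row in enumerate(threshold(prob, cutoff)):
            for c, present in enumerate(row):
                total[r][c] += present
    return total


# Toy 2x3 probability maps for three hypothetical species.
species_a = [[0.9, 0.2, 0.0], [0.7, 0.6, 0.1]]
species_b = [[0.8, 0.8, 0.0], [0.1, 0.9, 0.1]]
species_c = [[0.0, 0.6, 0.0], [0.0, 0.5, 0.4]]

diversity = richness([species_a, species_b, species_c],
                     cutoff=0.5, nrows=2, ncols=3)
print(diversity)
```

Because `richness` accepts any iterable, the maps could be read lazily from disk by a generator rather than held in a list; how the real files would be read is left open here.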
Discussion of formats for the models
- grid the ranges
- store raster and vector
- should be able to grid the vector versions
- Then they will be tiled
- 88,000 range models
- for each tile, link to identifiers of the data
- the shape files have the projection information
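One way to picture "for each tile, link to identifiers of the data" is an index from tile to the species whose ranges touch it. The sketch below is only an assumption about how that might look: it uses each range's bounding box in decimal degrees rather than the true polygon, a hypothetical 10-degree tile size, and made-up species identifiers.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
TileId = Tuple[int, int]                  # (column, row) of a square tile


def tiles_for_bbox(bbox: BBox, tile_deg: float) -> List[TileId]:
    """List every tile that a bounding box overlaps or touches."""
    min_lon, min_lat, max_lon, max_lat = bbox
    col0, col1 = math.floor(min_lon / tile_deg), math.floor(max_lon / tile_deg)
    row0, row1 = math.floor(min_lat / tile_deg), math.floor(max_lat / tile_deg)
    return [(col, row)
            for col in range(col0, col1 + 1)
            for row in range(row0, row1 + 1)]


def build_tile_index(ranges: Dict[str, BBox],
                     tile_deg: float) -> Dict[TileId, Set[str]]:
    """Map each tile to the identifiers of the ranges that fall in it."""
    index: Dict[TileId, Set[str]] = defaultdict(set)
    for species_id, bbox in ranges.items():
        for tile in tiles_for_bbox(bbox, tile_deg):
            index[tile].add(species_id)
    return dict(index)


# Hypothetical identifiers and bounding boxes, for illustration only.
ranges = {
    "sp_0001": (-75.0, -5.0, -65.0, 5.0),
    "sp_0002": (-68.0, 2.0, -62.0, 8.0),
}
index = build_tile_index(ranges, tile_deg=10.0)
print(sorted(index[(-7, 0)]))
```

A bounding box over-counts tiles for irregular ranges, so a real index would more likely test the gridded or vector range itself against each tile.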
Discussion of how to make the range models accessible for analysis and viewing
- iPlant can provide access to 15 TB of storage now.
- Additional 300TB of new storage coming online at UA.
- On average, 2-4 TB of new data come into the iPlant Data Store from all over the country every day. Moving this amount of data is not a problem. Moving 15 TB is doable, though we would not want to move that much data around often.
- Could have the subset you’ll compute on spinning at TACC
- For now, keep all 15 TB of data on disk and don’t worry about it (i.e., in the iPlant Data Store which is replicated at UA and TACC).
- Revisit annually to see what can be archived.
- Currently data are stored organized by model
- Needs to be organized by species
- iRODS has features that let us create different views into the files using symbolic links, so we can create different directory structures.
- Can create metadata about the files.
- There’s a tutorial on how to do that using iRODS.
- iRODS has clients.
- MS: suggesting BIEN group should learn to use the iRODS clients.
- Want scientists to learn how to access the data while at BIEN mtg.
- Mark and Jim will pave the way by learning to use iRODS to see if it will meet the group's needs in terms of features and ease of use.
- Metadata is attribute/value pairs, so not sophisticated, but a start.
- Send getting started with iPlant link.
- Move data to iPlant Data Store at TACC then let iRODS sync it to iPlant Data Store at Tucson.
- There’s a flag that can be set to move the data this way.
- Suggestion to provide the output files for 10 species to iPlant immediately so everyone can see how things will work (be accessed) and can start playing around.
- Will help people decide if the solutions available will meet the group’s needs and help everyone make informed recommendations on what else is needed in terms of file formats and access.
- John to send Nirav some shape files and geotiffs
- Want to have jpegs of range maps available from BIEN website by Nov.
- How to interface with Map of Life
- They can pull the range maps from either the iPlant Data Store (any/all of them, any/all formats) or from iPlant’s geoserver (the maps of interest as shapefiles and possibly a few other formats).
- ESA data publication – so people can access maps.
- What format and can ESA handle this size of data?
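The by-species view of files currently organized by model, together with simple attribute/value metadata, could be planned along the following lines. This is a sketch under stated assumptions: it does not call iRODS at all, and the directory layout (`<model>/<species>.<ext>`), the model names, the root paths, and the attribute names are all hypothetical. It only computes which link paths would point at which existing files and which pairs would be attached.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def by_species_view(model_paths: List[str], root: str) -> Dict[str, str]:
    """Map each link path in a by-species tree to its existing by-model file."""
    links: Dict[str, str] = {}
    for path in model_paths:
        p = PurePosixPath(path)
        model = p.parent.name   # assumed layout: .../<model>/<species>.<ext>
        species = p.stem
        link = PurePosixPath(root) / species / f"{model}{p.suffix}"
        links[str(link)] = path
    return links


def metadata_pairs(path: str) -> List[Tuple[str, str]]:
    """Derive simple attribute/value pairs from a file's path."""
    p = PurePosixPath(path)
    return [
        ("species", p.stem),
        ("model", p.parent.name),
        ("format", p.suffix.lstrip(".")),
    ]


# Hypothetical paths, for illustration only.
paths = [
    "/bien/ranges/by_model/maxent/Quercus_alba.tif",
    "/bien/ranges/by_model/convex_hull/Quercus_alba.shp",
]
links = by_species_view(paths, "/bien/ranges/by_species")
for link, target in links.items():
    print(link, "->", target)
print(metadata_pairs(paths[0]))
```

Keeping the plan as plain data means it can be reviewed before anything is created in the Data Store, whichever iRODS client ends up doing the work.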
Returning to discussion of data products needed for the next meeting
- Don’t make 88,000 jpegs (yet).
- Climatic variability for each species. (computation will be completed in about a week)
- Large geotiffs are not that big. Could put QGIS on a machine at NCEAS and let people look at maps that way.
- Others suggested participants would prefer to access them from their own laptops
Decisions
- Archive all the files on Ranch at TACC
- For analyses, for now, keep all 15 TB of data on disk and don’t worry about it (i.e., in the iPlant Data Store which is replicated at UA and TACC).
- Annually revisit to determine which data are not being used and can be archived.
Action Items
- Edwin: Contact Paul re leaving files on Longhorn for a week longer (done).
- John: Compute climatic variability for each species. (In Process)
- John: Compute a jpeg for each of the 88,000 species ranges. (Correction: not on hold.)
- John: Send Nirav range model output files for 10 species (Completed).
- Nirav, Martha: Work with Smaran to load shape files into geoserver. (week of Oct. 15)
- John: Use sync to copy all files on Longhorn to Ranch for archiving. (Halted during maintenance of Longhorn. Will restart soon.)
- Mark and Jim: Learn to use iRODS.
- Martha: Send Mark and Jim the getting started with iPlant link. (done)
- Brad: Create table of sources of raw data for acknowledgements.
- John or iPlant (owner to be decided): Diversity map of all New World species. (To be done by iPlant - Nirav; or to be done by John with iPlant's assistance)
- John: Schedule meeting in about 2 weeks.
Actions on hold
- John with Edwin’s guidance: Move data to iPlant Data Store at TACC then let iRODS sync it to iPlant Tucson data store. (When computations are complete.)