AGU Tuesday

AGU Tuesday 12/13/2016

0800-1000

First Speaker: Chris Lynnes (NASA ESDIS)

Heuristics for relevancy ranking of earth dataset search results

  • Earth Observing System Data and Information System

  • variety in EOSDIS data: 3700 science products.

  • https://search.earthdata.nasa.gov/ 

  • Variety: instrument evolution, satellites, processing, algorithm, temporal

  • Return relevant results 

  • Community uses heuristics to find the most useful datasets

  • Applications users, students, climate modellers.

Second Speaker: Mark Reese (NASA)

Earthdata Search: scaling, assessing, and improving relevancy.

  • search, discover, and visualize earth science data

  • sitting on NASA's client

  • 32,960 data products

  • Facets to compartmentalize search results 
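A minimal sketch of how facets can compartmentalize search results, in plain Python. This is not Earthdata Search code: the records, the field names (`platform`, `level`), and the helper functions are all hypothetical, made up only to illustrate the idea.

```python
from collections import Counter

# Hypothetical product records; titles and field values are illustrative only.
PRODUCTS = [
    {"title": "Sea surface temperature L3", "platform": "Aqua", "level": "L3"},
    {"title": "Aerosol optical depth L2", "platform": "Terra", "level": "L2"},
    {"title": "Land surface temperature L3", "platform": "Terra", "level": "L3"},
]


def search(products, text):
    """Return products whose title contains the text (case-insensitive)."""
    needle = text.lower()
    return [p for p in products if needle in p["title"].lower()]


def facet_counts(results, field):
    """Count how many results fall under each value of a facet field."""
    return Counter(p[field] for p in results)


def apply_facet(results, field, value):
    """Narrow the results to those matching one facet value."""
    return [p for p in results if p[field] == value]


hits = search(PRODUCTS, "temperature")
platform_facets = facet_counts(hits, "platform")
terra_hits = apply_facet(hits, "platform", "Terra")
```

The counts are what a search page would show next to each facet value; choosing a value filters the current result set rather than running a new text search.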

Third speaker: Stephen Richard (SDSC)

Integrating semantic information in metadata descriptions for a geoscience-wide resource inventory

  • Community Inventory of EarthCube Resources for Geoscience Interoperability (CINERGI)

  • Using standard metadata formats 

  • MongoDB, JSON format

  • http://cinergi.sdsc.edu

  • provenance recording: W3C PROV and Neo4j

  • spatial enhancer (bounding box)

  • keyword enhancer (faceted search)

  • organization enhancer (associate with virtual authority identifiers)

  • harvest from a number of data sources (600k processed / 1.5 mil total)
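A minimal sketch of what a spatial enhancer step could look like, assuming a harvested record carries a list of (lon, lat) points. This is not CINERGI code: the record layout and the names `bounding_box` and `enhance_spatial` are hypothetical.

```python
def bounding_box(points):
    """Return (west, south, east, north) for a list of (lon, lat) pairs."""
    if not points:
        raise ValueError("need at least one point")
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


def enhance_spatial(record):
    """Return a copy of the record with a 'bbox' entry added."""
    enhanced = dict(record)
    enhanced["bbox"] = bounding_box(record["points"])
    return enhanced


# Hypothetical harvested record; the coordinates are illustrative only.
record = {
    "title": "Stream gauge sites",
    "points": [(-105.3, 39.9), (-104.8, 40.2), (-105.0, 39.6)],
}
enhanced = enhance_spatial(record)
```

The sketch takes a simple min/max over the coordinates, so it does not handle extents that cross the antimeridian.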

Fourth speaker: Anu Devaraju (CSIRO)

Data recommender system for research data discovery

Fifth speaker: Mo Wang (Chinese Academy of Sciences)

A hybrid personalized data recommendation approach for geoscience data sharing

  • http://www.geodata.cn/

  • spatial similarity models for returning search results.

  • Showed a lot of equations for computing spatio-temporal metrics.
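The speaker's equations were not captured in these notes. As a stand-in, here is one common way to score spatio-temporal similarity: the Jaccard overlap (intersection over union) of bounding boxes and of time ranges, combined with a weight. The function names and the `weight` parameter are assumptions, not the method presented in the talk.

```python
def interval_overlap(a_min, a_max, b_min, b_max):
    """Length of the overlap between two 1-D intervals (0 if disjoint)."""
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))


def spatial_similarity(a, b):
    """Jaccard overlap of two (west, south, east, north) boxes."""
    inter = (interval_overlap(a[0], a[2], b[0], b[2])
             * interval_overlap(a[1], a[3], b[1], b[3]))
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def temporal_similarity(a, b):
    """Jaccard overlap of two (start, end) time ranges."""
    inter = interval_overlap(a[0], a[1], b[0], b[1])
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def combined_similarity(query, dataset, weight=0.5):
    """Weighted mix of spatial and temporal similarity (weight is an assumption)."""
    s = spatial_similarity(query["bbox"], dataset["bbox"])
    t = temporal_similarity(query["years"], dataset["years"])
    return weight * s + (1.0 - weight) * t


# Illustrative values only.
query = {"bbox": (0.0, 0.0, 2.0, 2.0), "years": (2000, 2010)}
dataset = {"bbox": (1.0, 1.0, 3.0, 3.0), "years": (2005, 2015)}
score = combined_similarity(query, dataset)
```

Each score runs from 0 (no overlap) to 1 (identical extent), so candidate datasets can be ranked by the combined value.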

Sixth speaker: Chao Yang (George Mason)

Advancing geoscience resource discovery with cutting edge cyberinfrastructure technology

Seventh speaker: Tristan Wellman (USGS)

Cross linkage of multiple criteria to enhance selection applicability and use of biogeographic occurrence data.

Eighth speaker: Sky Bristol (USGS)

Measuring the impact of an API-first mentality with ScienceBase after 4.5 years

  • API use exceeds portal traffic with 70+ API-driven apps

  • https://nccwsc.usgs.gov/tools

  • www.sciencebase.gov

  • IPython notebook

1020-1200 Dropped into Data session

geoplatform.gov

CZO Townhall

OpenSource software & data

Computational infrastructure for geodynamics

NASA DEVELOP

  • software carpentry

Agile Software management for Successful open source software projects

DARPA Memex

  • Apache Nutch, Sparkler, Tika, Solr

  • Lucene Geo-Gazetteer

    • GeoTopicParser, GeoReverse

scalable parallel 'tile streaming' architecture

  • www.rasdaman.org, www.jacobs-university.de/lsis

  • OSGeoLive, planetserver.eu

  • ISO array SQL candidate standard: Multi-dimensional Arrays (SQL/MDA)

OpenSource Architecture

datasources --> normalize and extract --> Data Lake (CDR)

apache nutch --> apache tika, deepdive --> lucene, apache solr, elastic
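A toy sketch of the same three stages (sources, normalize and extract, searchable index) in plain Python. It calls none of the tools named above; the regex cleanup and the in-memory inverted index are only stand-ins for the extraction and indexing layers, and all names and sample documents are hypothetical.

```python
import re
from collections import defaultdict


def normalize(raw):
    """Strip markup-like tags and collapse whitespace (stand-in for extraction)."""
    text = re.sub(r"<[^>]+>", " ", raw)
    return " ".join(text.split())


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index mapping token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical crawled pages standing in for the data sources.
raw_sources = {
    "a": "<p>Sea  ice extent</p>",
    "b": "<h1>Sea surface temperature</h1>",
}
data_lake = {doc_id: normalize(raw) for doc_id, raw in raw_sources.items()}
index = build_index(data_lake)
```

Keeping the normalized text in its own store (`data_lake` here) means the index can be rebuilt or swapped without crawling the sources again.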

Data engineer at the architecture level, data scientist at the analysis level, web developer at the interface level.