AGU Tuesday
AGU Tuesday 12/13/2016
0800-1000
First Speaker: Chris Lynnes (NASA ESDIS)
Heuristics for relevancy ranking of earth dataset search results
Earth Observing System Data and Information System
variety in EOSDIS data: 3700 science products.
Variety: instrument evolution, satellites, processing, algorithm, temporal
Return relevant results
Community uses heuristics to find the most useful results
Applications users, students, climate modellers.
Second Speaker: Mark Reese (NASA)
Earthdata Search: scaling, assessing, and improving relevancy.
search, discover, and visualize earth science data
sitting on NASA's client
32,960 data products
Facets to compartmentalize search results
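A minimal sketch of how facets can compartmentalize search results, assuming each result is a plain dict of metadata. The field names (platform, instrument) and the sample records are hypothetical, chosen for illustration only; this is not the Earthdata Search implementation.

```python
from collections import Counter, defaultdict


def _as_list(value):
    """Treat a single string as a one-item list so fields can be single or multi-valued."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def facet_counts(records, facet_fields):
    """Count how many records carry each value of each facet field."""
    counts = defaultdict(Counter)
    for record in records:
        for field in facet_fields:
            # set() so a record with a repeated value is only counted once
            for value in set(_as_list(record.get(field))):
                counts[field][value] += 1
    return {field: dict(counter) for field, counter in counts.items()}


def filter_by_facet(records, field, value):
    """Keep only the records that carry the selected facet value."""
    return [r for r in records if value in _as_list(r.get(field))]


# Hypothetical sample records, for illustration only.
SAMPLE_RECORDS = [
    {"title": "Product A", "platform": "Terra", "instrument": ["MODIS"]},
    {"title": "Product B", "platform": "Aqua", "instrument": ["MODIS", "AIRS"]},
    {"title": "Product C", "platform": "Aqua", "instrument": ["AIRS"]},
]

if __name__ == "__main__":
    print(facet_counts(SAMPLE_RECORDS, ["platform", "instrument"]))
    print([r["title"] for r in filter_by_facet(SAMPLE_RECORDS, "platform", "Aqua")])
```

The counts are what a search interface would show next to each facet value; selecting a value narrows the result list with filter_by_facet.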
Third speaker: Stephen Richard (SDSC)
Integrating semantic information in metadata descriptions for a geoscience-wide resource inventory
Community Inventory of EarthCube Resources for Geoscience Interoperability (CINERGI)
Using standard metadata formats
MongoDB, JSON format
provenance recording: W3C PROV and Neo4j
spatial enhancer (bounding box)
keyword enhancer (faceted search)
organization enhancer (associate with virtual authority identifiers)
harvest from a number of data sources (600k processed / 1.5 mil total)
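A rough sketch of what a spatial enhancer might do: derive a bounding box from the coordinates found in a harvested record and attach it to the metadata. The function names and the record layout are assumptions made for illustration; the notes do not record how CINERGI implements this step.

```python
def bounding_box(points):
    """Return (west, south, east, north) for a list of (lon, lat) pairs.

    Simplification: does not handle extents that cross the antimeridian.
    """
    if not points:
        raise ValueError("at least one point is required")
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


def enhance_record(record):
    """Return a copy of a metadata record with a 'bbox' entry added.

    'points' and 'bbox' are hypothetical field names for this sketch.
    Records without coordinates are returned unchanged.
    """
    enhanced = dict(record)
    points = record.get("points")
    if points:
        enhanced["bbox"] = bounding_box(points)
    return enhanced


if __name__ == "__main__":
    sample = {"title": "Example record", "points": [(-105.0, 40.0), (-104.5, 39.5), (-106.0, 41.0)]}
    print(enhance_record(sample))
```

Returning a copy rather than editing in place keeps the harvested original available, which fits with recording provenance for each enhancement.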
Fourth speaker: Anu Devaraju (CSIRO)
Data recommender system for research data discovery
recommender system, data access portal (dap)
DAP Server logs -->Google Analytics, SQL database-->HDF
Fifth speaker: Mo Wang (Chinese Academy of Sciences)
A hybrid personalized data recommendation approach for geoscience data sharing
spatial similarity models for returning search results.
Showed a lot of equations for computing spatio-temporal metrics.
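The talk's actual equations were not captured in these notes. As a generic stand-in, a common way to score spatio-temporal similarity between two datasets is the overlap ratio (intersection over union) of their bounding boxes and of their time ranges. The sketch below shows that technique only; the function names and the equal weighting are assumptions, not the speaker's model.

```python
def _overlap(a_min, a_max, b_min, b_max):
    """Length of the overlap between two 1-D intervals (0 if they are disjoint)."""
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))


def spatial_similarity(a, b):
    """Intersection over union of two boxes given as (west, south, east, north)."""
    inter = _overlap(a[0], a[2], b[0], b[2]) * _overlap(a[1], a[3], b[1], b[3])
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def temporal_similarity(a, b):
    """Intersection over union of two time ranges given as (start, end) numbers."""
    inter = _overlap(a[0], a[1], b[0], b[1])
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def spatiotemporal_similarity(box_a, time_a, box_b, time_b, spatial_weight=0.5):
    """Weighted mix of the two scores; the 0.5 default weight is an arbitrary assumption."""
    s = spatial_similarity(box_a, box_b)
    t = temporal_similarity(time_a, time_b)
    return spatial_weight * s + (1.0 - spatial_weight) * t


if __name__ == "__main__":
    print(spatial_similarity((0, 0, 2, 2), (1, 1, 3, 3)))
    print(temporal_similarity((2000, 2010), (2005, 2015)))
```

Both scores fall between 0 (no overlap) and 1 (identical extent), so they can be combined or used directly to rank candidate datasets against a query.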
Sixth speaker: Chao Yang (George Mason)
Advancing geoscience resource discovery with cutting edge cyberinfrastructure technology
Seventh speaker: Tristan Wellman (USGS)
Cross linkage of multiple criteria to enhance selection, applicability, and use of biogeographic occurrence data.
optimize data quality
data assessment module (dam) - regulate data across an interface (like water over a dam)
Eighth speaker: Sky Bristol (USGS)
Measuring the impact of an API-first mentality with ScienceBase after 4.5 years
API use exceeds portal traffic with 70+ API-driven apps
https://nccwsc.usgs.gov/tools
www.sciencebase.gov
iPython notebook
1020-1200 Dropped into Data session
CZO Townhall
OpenSource software & data
Computational infrastructure for geodynamics
workshops, tutorials, webinars, hackathons
NASA DEVELOP
software carpentry
Agile software management for successful open source software projects
Open Geospatial Consortium
Testbed 13 coming
Github, TestNG, Maven, CTL, XSLT
DARPA Memex
Apache Nutch, Sparkler, Tika, Solr
Lucene Geo-Gazetteer
GeoTopicParser, GeoReverse
scalable parallel 'tile streaming' architecture
www.rasdaman.org, www.jacobs-university.de/lsis
osgeolive
planetserver.eu
ISO array SQL candidate standard Multi-dimensional Arrays (SQL/MDA)
OpenSource Architecture
datasources --> normalize and extract --> Data Lake (CDR)
apache nutch --> apache tika, deepdive --> lucene, apache solr, elastic
Data engineer at the architecture level, data scientist at the analysis level, web developer at the interface.