Schedule

Date

Class Number

Topic (2021)

Learning Outcomes

Slides

Homework

8/24/2021

1 (Lecture)

Intro to course and objectives. Introduction to Data Science and ML workloads

What is Data Science?
What is CyberInfrastructure?
Why this Course?
How is the course structured and organized (and graded)?
Course introduction and orientation
Note: Review past projects, show diversity of project leads

Link

Homework 1 -- Intro and Data Scientists

8/26/2021

2 (Activity)

How to collaborate as a team and how we will use these rules for the course

How to build productive teams
Strengths of teams, weaknesses of teams
Problems and solutions (when everyone is busy)

Class presentations from Homework 1; Rest of slides from previous lecture

https://cyverse.atlassian.net/wiki/spaces/acic2021/pages/1545896018

8/31/2021

3 (Lecture)

Good software and data hygiene and automation are central, especially for ML workloads

Data Science is a Team Activity
How to take care of code, project planning, etc.
Organization: GitHub, Documentation, Project Planning

Link

 

9/2/2021

4 (Activity)

Hands-on with ML Workloads

Learn what a typical ML lifecycle looks like

Link

https://cyverse.atlassian.net/wiki/spaces/acic2021/pages/1556905985

9/7/2021

5 (Lecture)

ML Workloads / Code management with Notebooks and Git

How to organize code and experiments. Emphasis on open science

Link

 

9/9/2021

6 (Activity)

Hands on with Notebooks/Git (through CyVerse)

Practical considerations for creating reproducible analyses with notebooks and version control

Hands-on Link

 

9/14/2021

7 (Lecture)

Introduction to cloud

How to cost, budget, and utilize the cloud

Link

 

9/16/2021

8

More depth with CyVerse platform

Using data management platforms, metadata management

Hands-on Link

 

9/21/2021

9

Terminal and CLI

Data Science on the command line

Link

 

9/23/2021

10

Hands on CLI

How to use webshell, conda, etc.

Link

 

9/28/2021

11

Intro to HPC

 

Key features of an HPC system and the kinds of problems it is used to solve

 

Link

 

9/30/2021

12

Hands on UA HPC exercise

Log into UA HPC

Link

 

10/5/2021

13

Hands on HPC part two

Hands on examples of how to run HPC jobs including interactive jobs

Link

 

10/7/2021

14

Intro to Machine Learning

Midterm discussion

Types of Machine Learning and which to use for different datasets/problems

Midterm discussion for: https://mlcas2021.github.io/

Link

Teams work on midterm projects and presentations

Work on your midterm during class, swap ideas and solutions

10/12/2021

15

Midterm Team formation; Ten simple rules to cultivate transdisciplinary collaboration in data science

Midterm competition is due Oct. 18th. You need an interdisciplinary team to win. What skills and expertise will you bring together, and how will you ensure the success of your team?

Link

 

10/14/2021

16

Stand up reporting from each team; work on midterm in class

Dedicated time where everyone can meet together often leads to rapid progress. Prove this to be true in class as each team gets their project ready for submission on 10/18

 

 

10/19/2021

17

Containers and virtualization

Introduction to software containers: capabilities and applications for reproducible analysis

 

 

10/21/2021

18

Making and managing containers with DockerHub

Sharing containers and the role of Git in managing container versions

 

 

10/26/2021

19

Teams go to IEEE Data Viz https://azvis2021.github.io/

Learn the importance of information visualization

 

 

10/28/2021

20

Team presentations: each team reports on one cool data viz technology, showing how they got it to work and how it may be applied

Understand how to integrate novel advances in data viz into the current problem space, and the challenges in doing so

 

 

11/2/2021

21

Organizing data and containers

Sharing reproducible analysis and workflows

Link

 

11/4/2021

22

Docker Containers

Foundations of Docker

 

 

11/9/2021

23

Building and using Docker Containers

Creating your own containers and running them, attaching data

Link to PLoS 10 Simple rules
Link to VIB Docker Tutorial

 

11/11/2021

24

Using web-tools for 2D/3D image/point cloud labeling

Data preparation for ML training

 

 

11/16/2021

25

Relationship between Cyberinfrastructure & ML Workloads (MLOps)

What is MLOps and why is it important for production use?

Final overview

 

11/18/2021

26

Backfill -- special topic as needed

 

 

Homework 11 -- Final Project Pitch Presentation

11/23/2021

27

Workflow Management Systems (WMS) lecture

Role of WMS for managing ML workloads

 

 

11/25/2021

X

Thanksgiving

 

 

 

11/30/2021

28

Workflows: Hands on with Snakemake (or similar)

Hands on exercise with select workflow managers (cloud and HPC)

 

 

12/2/2021

29

Check in on Final projects

Prepare for final project, stand ups

 

 

12/7/2021

Last Class

Work on Final projects

Prepare for final project, stand ups

 

 

12/9/2021

Reading Day

 

 

 

 

12/14/2021

Scheduled Final

 

8-10am

 

 

12/16/2021

All Work Due