Commit

Merge branch 'pangeo-data:main' into main
pl-marasco authored Nov 3, 2023
2 parents 5c130b4 + f4d6e84 commit 5d44dc7
Showing 6 changed files with 87 additions and 72 deletions.
90 changes: 50 additions & 40 deletions README.md
@@ -1,4 +1,4 @@
# Scaling Big Data Analysis with Pangeo and OpenEO: Unlocking the Power of Space Data
# Unlocking the Power of Space Data with Pangeo & OpenEO

<!-- ALL-CONTRIBUTORS-BADGE:START - Do not remove or modify this section -->
[![All Contributors](https://img.shields.io/badge/all_contributors-10-orange.svg?style=flat-square)](#contributors-)
@@ -11,47 +11,57 @@ This repository contains the documentation and jupyter notebooks used for delive

The content of this repository (folder `tutorial`) is rendered as an online document using [Jupyter Book](https://jupyterbook.org/en/stable/intro.html). **You can access it [here](https://pangeo-data.github.io/pangeo-openeo-BiDS-2023)**.

## Agenda
# Timeline of the workshops

The programmes for each workshop are given below for your information. Each workshop is held separately.

## Introduction to Pangeo

| Time | Activity |
| ---- | -------- |
| 9:00 | 👋 Welcome (5 minutes) |
| 9:05 | Introduction and Motivation (15 minutes) |
| 9:20 | Overview of the Pangeo ecosystem (20 minutes) |
| 9:40 | Understanding Xarray to avoid common pitfalls (30 minutes) |
| 10:10 | Interactive Visualization with Hvplot (15 minutes) |
| 10:30 | Break (30 minutes) |

## Introduction to OpenEO

| Time | Activity |
| ---- | -------- |
| 11:00 | 👋 Introduction and motivation (5 minutes) |
| 11:05 | Getting started with OpenEO (10 minutes) |
| 11:15 | Accessing and processing data with OpenEO (30 minutes) |
| 11:45 | Integrate custom code into your workflow using User Defined Functions (30 minutes) |
| 12:15 | Q&A session - feedback (20 minutes) |
| 12:30 | 🍽️ Lunch |

## Unlocking the Power of Space Data with Pangeo & OpenEO

Please note that this workshop assumes some prior knowledge of Pangeo and OpenEO. If you are not familiar with either of these technologies, we suggest checking the content of the two other workshops (taught in the morning).

| Time | Activity |
| ---- | -------- |
| 14:00 | 👋 Introduction and motivation (5 minutes) |
| 14:05 | Data discoverability and searchability (25 minutes) |
| | An overview of STAC and the different available sources|
| 14:30 | Data and pre-processing general knowledge (60 minutes) |
| | Introduction to chunking with netCDF, ZARR and Kerchunk |
| | Parallelization with Dask |
| 15:30 | ☕️ Break |
| 16:00 | Different data exploitation approaches (60 minutes) |
| | How to exploit data with OpenEO: snow coverage example |
| | How to exploit data on Pangeo: pure Xarray version |
| 17:00 | How to go beyond (45 minutes)|
| | Custom algorithms: UDF (OpenEO) and ufunc (Xarray) |
| | Scaling with OpenEO (how it works underneath) |
| 17:45 | Wrap-up and feedback survey (15 minutes) |

These timelines are approximate and given for indication only. We will adjust them depending on the audience.
There will be regular additional breaks (5 minutes) and time for questions during the workshops.

**Part-1: Pangeo**

- 9:00 Welcome (5 minutes)
- 9:05 Introduction and Motivation (15 minutes)
- 9:20 Overview of the Pangeo ecosystem (20 minutes)
- 9:40 Understanding Xarray to avoid common pitfalls (30 minutes)
- 10:10 Interactive Visualization with Hvplot (20 minutes)
- 10:30 Break (30 minutes)

**Part-2: OpenEO**

- 11:00 Getting started with OpenEO
- 11:15 Finding Data, Running first graphs, difference to client-side processing
- 11:45 Integrate custom code into your workflow using User Defined Functions
- 12:15 Feedback Block
- 12:30 Lunch

**Part-3: Unlocking the Power of Space Data with Pangeo & OpenEO**

- 14:00 Introduction to the afternoon session

- 14:05 Data discoverability and searchability
- An overview of STAC and different sources and platforms (openeo.cloud, STAC browser, STAC Index ...)

- 14:30 Data and pre-processing general knowledge
- Introduction to chunking examples with netcdf, zarr and Kerchunk
- Parallelization with Dask

- 16:00 Different data exploitation approaches
- How to exploit data on OpenEO (30 minutes)
- Snow coverage example
- How to exploit data on Pangeo (30 minutes)
- Snow coverage pure xarray version

- 17:00 How to go beyond
- Understanding how to create a custom algorithm: UDF (OpenEO) and ufunc (Xarray) (20 minutes)
- Scaling with OpenEO (how it works underneath) (30 minutes)

- 17:45 Wrap-up and feedback survey (15 minutes)

## Contributors ✨

22 changes: 12 additions & 10 deletions tutorial/_toc.yml
@@ -29,8 +29,12 @@ parts:

- caption: Part 2 - OpenEO
chapters:
- file: part2/agenda_and_links
title: Agenda
- file: part2/openEO - boa_sentinel_2
title: Sentinel-2 Bottom Of Atmosphere (BOA)
- file: part2/openEO - Corine Land Cover Alps
title: Corine Land Cover Alps
- file: part2/openEO - Client Side Land Cover Alps
title: Client Side Land Cover Alps
- file: part2/advanced_workflows
title: Advanced workflows
- file: part2/stac_metadata
@@ -40,18 +44,16 @@ parts:
chapters:
- file: part3/data_discovery
title: Data discoverability and searchability
- file: part3/chunking_introduction
title: Introduction to chunking with netCDF, ZARR and Kerchunk
- file: part3/scaling_dask
title: Parallelization with Dask
- file: part3/data_exploitability_openEO
title: How to exploit data on openEO
title: How to exploit data with openEO
- file: part3/data_exploitability_pangeo
title: How to exploit data on Pangeo
- file: part3/chunking_introduction
title: Data chunking with zarr
- file: part3/scaling_dask
title: Scaling with Dask
- file: part3/advanced_udf
title: Advanced OpenEO (UDF)
- file: part3/advanced_ufunc
title: Advanced OpenEO (Ufunc)
title: "Custom algorithms: UDF (OpenEO) and ufunc (Xarray)"
- file: part3/scaling_openeo
title: Scaling with OpenEO

35 changes: 19 additions & 16 deletions tutorial/about/timeline.md
@@ -11,36 +11,39 @@ The programmes for each workshop are given below for your information. Each work
| 9:20 | Overview of the Pangeo ecosystem (20 minutes) |
| 9:40 | Understanding Xarray to avoid common pitfalls (30 minutes) |
| 10:10 | Interactive Visualization with Hvplot (15 minutes) |
| 10:25 | Wrap-up and feedback survey (5 minutes) |
| 10:30 | Break (30 minutes) |

## Introduction to OpenEO

| Time | Activity |
| ---- | -------- |
| 11:00 | 👋 Welcome (5 minutes) |
| 11:00 | 👋 Introduction and motivation (5 minutes) |
| 11:05 | Getting started with OpenEO (10 minutes) |
| 11:15 | Accessing data with OpenEO (25 minutes) |
| 11:40 | Processing data with OpenEO (30 minutes) |
| 12:10 | Working with data cubes with OpenEO (20 minutes) |
| 11:15 | Accessing and processing data with OpenEO (30 minutes) |
| 11:45 | Integrate custom code into your workflow using User Defined Functions (30 minutes) |
| 12:15 | Q&A session - feedback (20 minutes) |
| 12:30 | 🍽️ Lunch |

## Scaling Big Data Analysis with Pangeo & OpenEO: Unlocking the Power of Space Data
## Unlocking the Power of Space Data with Pangeo & OpenEO

Please note that this workshop assumes some prior knowledge of Pangeo and OpenEO. If you are not familiar with either of these technologies, we suggest checking the content of the two other workshops (taught in the morning).

| Time | Activity |
| ---- | -------- |
| 14:00 | 👋 Welcome (5 minutes) |
| 14:05 | Understanding what OpenEO does best and how to exploit it to easily streamline your data analysis (20 minutes) |
| 14:25 | Scaling with OpenEO (25 minutes) |
| 14:50 | Understanding when and how to exploit Pangeo to customise your algorithm and analyse multiple data sources (20 minutes) |
| 15:10 | Introduction to chunking (20 minutes) |
| 14:00 | 👋 Introduction and motivation (5 minutes) |
| 14:05 | Data discoverability and searchability (25 minutes) |
| | An overview of STAC and the different available sources|
| 14:30 | Data and pre-processing general knowledge (60 minutes) |
| | Introduction to chunking with netCDF, ZARR and Kerchunk |
| | Parallelization with Dask |
| 15:30 | ☕️ Break |
| 16:00 | Scaling with Dask (30 minutes) |
| 16:30 | Cloud-friendly access to archival data with kerchunk (25 minutes) |
| 16:55 | Create Analysis Ready Cloud Optimised (ARCO) data (25 minutes) |
| 17:20 | Common workflow that combines the best of the two “worlds” (30 minutes) |
| 17:50 | Wrap-up and feedback survey (10 minutes) |
| 16:00 | Different data exploitation approaches (60 minutes) |
| | How to exploit data with OpenEO: snow coverage example |
| | How to exploit data on Pangeo: pure Xarray version |
| 17:00 | How to go beyond (45 minutes)|
| | Understanding how to create a custom algorithm: UDF (OpenEO) and ufunc (Xarray) |
| | Scaling with OpenEO (how it works underneath) |
| 17:45 | Wrap-up and feedback survey (15 minutes) |

These timelines are approximate and given for indication only. We will adjust them depending on the audience.
There will be regular additional breaks (5 minutes) and time for questions during the workshops.
6 changes: 3 additions & 3 deletions tutorial/intro.md
@@ -14,9 +14,9 @@ More information can be found on the [BiDS'23](https://www.bigdatafromspace2023.
### Pangeo & OpenEO tutorial

The tutorials are divided into 3 parts:
1. Introduction to the Pangeo ecosystem
2. Introduction to the OpenEO platform
3. Scaling Big Data Analysis with Pangeo and OpenEO: Unlocking the Power of Space Data
1. Introduction to Pangeo
2. Introduction to OpenEO
3. Unlocking the Power of Space Data with Pangeo & OpenEO

The workshop timelines, setup and content are accessible via the left menu of this webpage.

2 changes: 1 addition & 1 deletion tutorial/part2/agenda_and_links.md
@@ -1,4 +1,4 @@
# Part-2: OpenEO (Comment: Intro Session)
# Part-2: Introduction to OpenEO

This part introduces participants to openEO Platform. Attendees will learn what openEO Platform is, how to find data and run their first process graphs.
We will also show attendees how to integrate custom code into their workflows using User Defined Functions.
4 changes: 2 additions & 2 deletions tutorial/part3/data_discovery.md
@@ -1,5 +1,5 @@
# Data discoverability and searchability

## An overview of STAC
## An overview of STAC and the different available sources

## Overview of different sources and platforms
The presentation is available here: https://docs.google.com/presentation/d/1bUTlmvrDMm1affvr8tq5gePqBigCTY9YVN3mtcscpJk/edit?usp=sharing
