deploy: b74c367
jdries committed Nov 5, 2023
1 parent a93cb98 commit 04261b8
Showing 11 changed files with 586 additions and 362 deletions.
Binary file added _images/openeo_networkio.png
25 changes: 25 additions & 0 deletions _sources/part3/scaling_openeo.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,31 @@ An example is shown below.

![Spark graph](../figures/openeo_spark.png)

### Performance: why does it take so long for openEO to retrieve a simple timeseries?

When working with an openEO backend, you may notice that it can take up to a few minutes to retrieve a 'simple'
timeseries of, for instance, Sentinel-2 data, whereas loading that same timeseries from a netCDF file on your laptop is
practically instantaneous. For many, this is annoying when trying to work interactively, and openEO advises using the 'local processing'
feature. So what is going on here?

Warning: this answer is about a specific, but fairly common, backend setup. It does not reflect a general limitation in
the openEO design!

The problem is that large EO archives, like Copernicus Sentinel-2 and Sentinel-1, but also Landsat, are stored per 'product'
on large-scale storage systems that are accessed over a network. The consequence is that in most EO workflows, loading
the data (IO) remains the big bottleneck. So while many algorithm writers focus on processing performance, it is often
reading data from thousands of files (e.g. 10 bands x 100 observations) over a network that takes the most time.

When your multiband timeseries is stored as a single netCDF file on the SSD of your laptop, most of the heavy lifting has
in fact already been done: you then have something that can be read into memory at once, in a second or less.
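
To make the difference concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a very simple cost model (a fixed latency per file plus transfer time), and all numbers in it (50 kB read per file, 100 ms request latency, 100 MB/s network throughput, 500 MB/s SSD throughput) are hypothetical values chosen for illustration, not measurements of any backend.

```python
def estimated_read_seconds(n_files, bytes_per_file, latency_s,
                           throughput_bytes_per_s, parallel_readers=1):
    """Rough model: every file read costs a fixed latency plus transfer time."""
    per_file = latency_s + bytes_per_file / throughput_bytes_per_s
    # Spread the files evenly over the readers (ceiling division).
    files_per_reader = -(-n_files // parallel_readers)
    return files_per_reader * per_file


# 10 bands x 100 observations, as in the text above.
n_files = 10 * 100

# Hypothetical: small reads from many remote files, 100 ms latency each.
remote = estimated_read_seconds(
    n_files, bytes_per_file=50_000, latency_s=0.1,
    throughput_bytes_per_s=100e6,
)

# Hypothetical: the same bytes stored in one local netCDF file on an SSD.
local = estimated_read_seconds(
    1, bytes_per_file=n_files * 50_000, latency_s=0.001,
    throughput_bytes_per_s=500e6,
)

print(f"remote, sequential: {remote:.1f} s")
print(f"local, single file: {local:.3f} s")
```

Under these assumptions the per-file latency dominates the remote case, which is why the number of files matters more than the number of bytes. The `parallel_readers` parameter hints at how a backend can reduce the wall-clock time by reading many files concurrently.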

So does this mean that you are better off downloading everything locally and processing on your own resources?
In fact not: the graph below shows the reading speed of an openEO cluster that is processing a number of batch jobs.
As you can see, in this case it was able to read EO data at speeds between 4 GB/s and 10 GB/s, which will be hard
to achieve when going over the internet.

![openEO IO](../figures/openeo_networkio.png)
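
As a rough illustration of what such read speeds mean, the sketch below compares the time needed to move a given data volume at 4 GB/s and 10 GB/s with a download over a consumer internet connection. The 1 TB volume and the 100 Mbit/s connection are assumptions made for this example only.

```python
def transfer_seconds(volume_bytes, rate_bytes_per_s):
    """Time to move a volume at a constant rate, ignoring all overhead."""
    return volume_bytes / rate_bytes_per_s


volume = 1e12  # hypothetical: 1 TB of EO data

cluster_low = transfer_seconds(volume, 4e9)    # 4 GB/s, lower bound in the text
cluster_high = transfer_seconds(volume, 10e9)  # 10 GB/s, upper bound in the text
home = transfer_seconds(volume, 100e6 / 8)     # assumed 100 Mbit/s = 12.5 MB/s

print(f"cluster: {cluster_high:.0f} s to {cluster_low:.0f} s")
print(f"home connection: {home / 3600:.1f} hours")
```

With these assumed numbers, the cluster reads the volume in a few minutes, while the download alone would take most of a day before any processing can start.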


### Example architecture

Expand Down
56 changes: 28 additions & 28 deletions part1/visualization.html

Large diffs are not rendered by default.

170 changes: 85 additions & 85 deletions part1/xarray_pitfalls.html

Large diffs are not rendered by default.

186 changes: 93 additions & 93 deletions part3/chunking_introduction.html

Large diffs are not rendered by default.

7 changes: 2 additions & 5 deletions part3/data_exploitability_openEO.html
Original file line number Diff line number Diff line change
Expand Up @@ -500,10 +500,7 @@ <h2>Snow and Cloud Cover Example<a class="headerlink" href="#snow-and-cloud-cove
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
</pre></div>
</div>
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span> 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
</pre></div>
</div>
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>100 1065k 100 1065k 0 0 2507k 0 --:--:-- --:--:-- --:--:-- 2507k
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>100 1065k 100 1065k 0 0 8282k 0 --:--:-- --:--:-- --:--:-- 8326k
</pre></div>
</div>
</div>
Expand All @@ -521,7 +518,7 @@ <h2>Snow and Cloud Cover Example<a class="headerlink" href="#snow-and-cloud-cove
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
</pre></div>
</div>
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>100 530k 100 530k 0 0 1506k 0 --:--:-- --:--:-- --:--:-- 1510k
<div class="output stream highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>100 530k 100 530k 0 0 1588k 0 --:--:-- --:--:-- --:--:-- 1592k
</pre></div>
</div>
</div>
Expand Down
6 changes: 3 additions & 3 deletions part3/data_exploitability_pangeo.html
Original file line number Diff line number Diff line change
Expand Up @@ -1116,11 +1116,11 @@ <h2>Import libraries<a class="headerlink" href="#import-libraries" title="Permal
background-color: transparent !important;
}
</style></div><div class="output text_html"><div id='p1002'>
<div id="b9bb1c88-2295-4e1c-b51f-552d2515c497" data-root-id="p1002" style="display: contents;"></div>
<div id="e2f3b828-ef1c-4749-991e-9cc71ee2ae5a" data-root-id="p1002" style="display: contents;"></div>
</div>
<script type="application/javascript">(function(root) {
var docs_json = {"f0aebc9a-c117-4dbd-b4ac-6b7f499652ab":{"version":"3.3.0","title":"Bokeh Application","roots":[{"type":"object","name":"panel.models.browser.BrowserInfo","id":"p1002"},{"type":"object","name":"panel.models.comm_manager.CommManager","id":"p1003","attributes":{"plot_id":"p1002","comm_id":"192622c3a2aa4ddb82a464e8ba535228","client_comm_id":"be4b249ccb614f99829881d1a44dccb9"}}],"defs":[{"type":"model","name":"ReactiveHTML1"},{"type":"model","name":"FlexBox1","properties":[{"name":"align_content","kind":"Any","default":"flex-start"},{"name":"align_items","kind":"Any","default":"flex-start"},{"name":"flex_direction","kind":"Any","default":"row"},{"name":"flex_wrap","kind":"Any","default":"wrap"},{"name":"justify_content","kind":"Any","default":"flex-start"}]},{"type":"model","name":"FloatPanel1","properties":[{"name":"config","kind":"Any","default":{"type":"map"}},{"name":"contained","kind":"Any","default":true},{"name":"position","kind":"Any","default":"right-top"},{"name":"offsetx","kind":"Any","default":null},{"name":"offsety","kind":"Any","default":null},{"name":"theme","kind":"Any","default":"primary"},{"name":"status","kind":"Any","default":"normalized"}]},{"type":"model","name":"GridStack1","properties":[{"name":"mode","kind":"Any","default":"warn"},{"name":"ncols","kind":"Any","default":null},{"name":"nrows","kind":"Any","default":null},{"name":"allow_resize","kind":"Any","default":true},{"name":"allow_drag","kind":"Any","default":true},{"name":"state","kind":"Any","default":[]}]},{"type":"model","name":"drag1","properties":[{"name":"slider_width","kind":"Any","default":5},{"name":"slider_color","kind":"Any","default":"black"},{"name":"value","kind":"Any","default":50}]},{"type":"model","name":"click1","properties":[{"name":"terminal_output","kind":"Any","default":""},{"name":"debug_name","kind":"Any","default":""},{"name":"clears","kind":"Any","default":0}]},{"type":"model","name":"toggle_value1","properties":[{"name":"active_icons","kind":"Any",
"default":{"type":"map"}},{"name":"options","kind":"Any","default":{"type":"map","entries":[["favorite","heart"]]}},{"name":"value","kind":"Any","default":[]},{"name":"_reactions","kind":"Any","default":[]},{"name":"_base_url","kind":"Any","default":"https://tabler-icons.io/static/tabler-icons/icons/"}]},{"type":"model","name":"copy_to_clipboard1","properties":[{"name":"value","kind":"Any","default":null},{"name":"fill","kind":"Any","default":"none"}]},{"type":"model","name":"FastWrapper1","properties":[{"name":"object","kind":"Any","default":null},{"name":"style","kind":"Any","default":null}]},{"type":"model","name":"NotificationAreaBase1","properties":[{"name":"js_events","kind":"Any","default":{"type":"map"}},{"name":"position","kind":"Any","default":"bottom-right"},{"name":"_clear","kind":"Any","default":0}]},{"type":"model","name":"NotificationArea1","properties":[{"name":"js_events","kind":"Any","default":{"type":"map"}},{"name":"notifications","kind":"Any","default":[]},{"name":"position","kind":"Any","default":"bottom-right"},{"name":"_clear","kind":"Any","default":0},{"name":"types","kind":"Any","default":[{"type":"map","entries":[["type","warning"],["background","#ffc107"],["icon",{"type":"map","entries":[["className","fas fa-exclamation-triangle"],["tagName","i"],["color","white"]]}]]},{"type":"map","entries":[["type","info"],["background","#007bff"],["icon",{"type":"map","entries":[["className","fas 
fa-info-circle"],["tagName","i"],["color","white"]]}]]}]}]},{"type":"model","name":"Notification","properties":[{"name":"background","kind":"Any","default":null},{"name":"duration","kind":"Any","default":3000},{"name":"icon","kind":"Any","default":null},{"name":"message","kind":"Any","default":""},{"name":"notification_type","kind":"Any","default":null},{"name":"_destroyed","kind":"Any","default":false}]},{"type":"model","name":"TemplateActions1","properties":[{"name":"open_modal","kind":"Any","default":0},{"name":"close_modal","kind":"Any","default":0}]},{"type":"model","name":"BootstrapTemplateActions1","properties":[{"name":"open_modal","kind":"Any","default":0},{"name":"close_modal","kind":"Any","default":0}]},{"type":"model","name":"MaterialTemplateActions1","properties":[{"name":"open_modal","kind":"Any","default":0},{"name":"close_modal","kind":"Any","default":0}]}]}};
var render_items = [{"docid":"f0aebc9a-c117-4dbd-b4ac-6b7f499652ab","roots":{"p1002":"b9bb1c88-2295-4e1c-b51f-552d2515c497"},"root_ids":["p1002"]}];
var docs_json = {"57ed64f8-5eca-41a4-8ae0-a80110b22a52":{"version":"3.3.0","title":"Bokeh Application","roots":[{"type":"object","name":"panel.models.browser.BrowserInfo","id":"p1002"},{"type":"object","name":"panel.models.comm_manager.CommManager","id":"p1003","attributes":{"plot_id":"p1002","comm_id":"ba1d084fa73c4ea1a67e4073d39536a7","client_comm_id":"8db187b9e3c44bbb97ea722c7458f8f1"}}],"defs":[{"type":"model","name":"ReactiveHTML1"},{"type":"model","name":"FlexBox1","properties":[{"name":"align_content","kind":"Any","default":"flex-start"},{"name":"align_items","kind":"Any","default":"flex-start"},{"name":"flex_direction","kind":"Any","default":"row"},{"name":"flex_wrap","kind":"Any","default":"wrap"},{"name":"justify_content","kind":"Any","default":"flex-start"}]},{"type":"model","name":"FloatPanel1","properties":[{"name":"config","kind":"Any","default":{"type":"map"}},{"name":"contained","kind":"Any","default":true},{"name":"position","kind":"Any","default":"right-top"},{"name":"offsetx","kind":"Any","default":null},{"name":"offsety","kind":"Any","default":null},{"name":"theme","kind":"Any","default":"primary"},{"name":"status","kind":"Any","default":"normalized"}]},{"type":"model","name":"GridStack1","properties":[{"name":"mode","kind":"Any","default":"warn"},{"name":"ncols","kind":"Any","default":null},{"name":"nrows","kind":"Any","default":null},{"name":"allow_resize","kind":"Any","default":true},{"name":"allow_drag","kind":"Any","default":true},{"name":"state","kind":"Any","default":[]}]},{"type":"model","name":"drag1","properties":[{"name":"slider_width","kind":"Any","default":5},{"name":"slider_color","kind":"Any","default":"black"},{"name":"value","kind":"Any","default":50}]},{"type":"model","name":"click1","properties":[{"name":"terminal_output","kind":"Any","default":""},{"name":"debug_name","kind":"Any","default":""},{"name":"clears","kind":"Any","default":0}]},{"type":"model","name":"toggle_value1","properties":[{"name":"active_icons","kind":"Any",
"default":{"type":"map"}},{"name":"options","kind":"Any","default":{"type":"map","entries":[["favorite","heart"]]}},{"name":"value","kind":"Any","default":[]},{"name":"_reactions","kind":"Any","default":[]},{"name":"_base_url","kind":"Any","default":"https://tabler-icons.io/static/tabler-icons/icons/"}]},{"type":"model","name":"copy_to_clipboard1","properties":[{"name":"value","kind":"Any","default":null},{"name":"fill","kind":"Any","default":"none"}]},{"type":"model","name":"FastWrapper1","properties":[{"name":"object","kind":"Any","default":null},{"name":"style","kind":"Any","default":null}]},{"type":"model","name":"NotificationAreaBase1","properties":[{"name":"js_events","kind":"Any","default":{"type":"map"}},{"name":"position","kind":"Any","default":"bottom-right"},{"name":"_clear","kind":"Any","default":0}]},{"type":"model","name":"NotificationArea1","properties":[{"name":"js_events","kind":"Any","default":{"type":"map"}},{"name":"notifications","kind":"Any","default":[]},{"name":"position","kind":"Any","default":"bottom-right"},{"name":"_clear","kind":"Any","default":0},{"name":"types","kind":"Any","default":[{"type":"map","entries":[["type","warning"],["background","#ffc107"],["icon",{"type":"map","entries":[["className","fas fa-exclamation-triangle"],["tagName","i"],["color","white"]]}]]},{"type":"map","entries":[["type","info"],["background","#007bff"],["icon",{"type":"map","entries":[["className","fas 
fa-info-circle"],["tagName","i"],["color","white"]]}]]}]}]},{"type":"model","name":"Notification","properties":[{"name":"background","kind":"Any","default":null},{"name":"duration","kind":"Any","default":3000},{"name":"icon","kind":"Any","default":null},{"name":"message","kind":"Any","default":""},{"name":"notification_type","kind":"Any","default":null},{"name":"_destroyed","kind":"Any","default":false}]},{"type":"model","name":"TemplateActions1","properties":[{"name":"open_modal","kind":"Any","default":0},{"name":"close_modal","kind":"Any","default":0}]},{"type":"model","name":"BootstrapTemplateActions1","properties":[{"name":"open_modal","kind":"Any","default":0},{"name":"close_modal","kind":"Any","default":0}]},{"type":"model","name":"MaterialTemplateActions1","properties":[{"name":"open_modal","kind":"Any","default":0},{"name":"close_modal","kind":"Any","default":0}]}]}};
var render_items = [{"docid":"57ed64f8-5eca-41a4-8ae0-a80110b22a52","roots":{"p1002":"e2f3b828-ef1c-4749-991e-9cc71ee2ae5a"},"root_ids":["p1002"]}];
var docs = Object.values(docs_json)
if (!docs) {
return
Expand Down
2 changes: 1 addition & 1 deletion part3/peak_valley.html
Original file line number Diff line number Diff line change
Expand Up @@ -525,7 +525,7 @@ <h2>Downloading the S2 time series<a class="headerlink" href="#downloading-the-s
</div>
</div>
<div class="cell_output docutils container">
<div class="output text_html">Visit <a href="https://aai.egi.eu/device?user_code=XMNS-BQVK" title="Authenticate at https://aai.egi.eu/device?user_code=XMNS-BQVK" target="_blank" rel="noopener noreferrer">https://aai.egi.eu/device?user_code=XMNS-BQVK</a> <a href="#" onclick="navigator.clipboard.writeText('https://aai.egi.eu/device?user_code=XMNS-BQVK');return false;" title="Copy authentication URL to clipboard">&#128203;</a> to authenticate.</div><div class="output text_html"><code>[-------------------------------------]</code> ❌ Timed out</div><div class="output traceback highlight-ipythontb notranslate"><div class="highlight"><pre><span></span><span class="gt">---------------------------------------------------------------------------</span>
<div class="output text_html">Visit <a href="https://aai.egi.eu/device?user_code=INXK-SFKU" title="Authenticate at https://aai.egi.eu/device?user_code=INXK-SFKU" target="_blank" rel="noopener noreferrer">https://aai.egi.eu/device?user_code=INXK-SFKU</a> <a href="#" onclick="navigator.clipboard.writeText('https://aai.egi.eu/device?user_code=INXK-SFKU');return false;" title="Copy authentication URL to clipboard">&#128203;</a> to authenticate.</div><div class="output text_html"><code>[-------------------------------------]</code> ❌ Timed out</div><div class="output traceback highlight-ipythontb notranslate"><div class="highlight"><pre><span></span><span class="gt">---------------------------------------------------------------------------</span>
<span class="ne">OidcDeviceCodePollTimeout</span><span class="g g-Whitespace"> </span>Traceback (most recent call last)
<span class="n">Cell</span> <span class="n">In</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="n">line</span> <span class="mi">1</span>
<span class="ne">----&gt; </span><span class="mi">1</span> <span class="n">connection</span> <span class="o">=</span> <span class="n">openeo</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s2">&quot;openeo.cloud&quot;</span><span class="p">)</span><span class="o">.</span><span class="n">authenticate_oidc</span><span class="p">()</span>
Expand Down
472 changes: 326 additions & 146 deletions part3/scaling_dask.html

Large diffs are not rendered by default.

22 changes: 22 additions & 0 deletions part3/scaling_openeo.html
Original file line number Diff line number Diff line change
Expand Up @@ -412,6 +412,7 @@ <h2> Contents </h2>
<nav aria-label="Page">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#under-the-hood">Under the hood</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#performance-why-does-it-take-so-long-for-openeo-to-retrieve-a-simple-timeseries">Performance: why does it take so long for openEO to retrieve a simple timeseries?</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#example-architecture">Example architecture</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#understanding-logs">Understanding logs</a></li>
</ul>
Expand Down Expand Up @@ -455,6 +456,26 @@ <h2>Under the hood<a class="headerlink" href="#under-the-hood" title="Permalink
<p>In a real world backend, these concepts result in a graph of processing steps (‘stages’ in Spark), and a number of tasks per step that can be executed in parallel.
An example is shown below.</p>
<p><img alt="Spark graph" src="../_images/openeo_spark.png" /></p>
<section id="performance-why-does-it-take-so-long-for-openeo-to-retrieve-a-simple-timeseries">
<h3>Performance: why does it take so long for openEO to retrieve a simple timeseries?<a class="headerlink" href="#performance-why-does-it-take-so-long-for-openeo-to-retrieve-a-simple-timeseries" title="Permalink to this heading">#</a></h3>
<p>When working with an openEO backend, you may notice that it can take up to a few minutes to retrieve a ‘simple’
timeseries of, for instance, Sentinel-2 data, whereas loading that same timeseries from a netCDF file on your laptop is
practically instantaneous. For many, this is annoying when trying to work interactively, and openEO advises using the ‘local processing’
feature. So what is going on here?</p>
<p>Warning: this answer is about a specific, but fairly common, backend setup. It does not reflect a general limitation in
the openEO design!</p>
<p>The problem is that large EO archives, like Copernicus Sentinel-2 and Sentinel-1, but also Landsat, are stored per ‘product’
on large-scale storage systems that are accessed over a network. The consequence is that in most EO workflows, loading
the data (IO) remains the big bottleneck. So while many algorithm writers focus on processing performance, it is often
reading data from thousands of files (e.g. 10 bands x 100 observations) over a network that takes the most time.</p>
<p>When your multiband timeseries is stored as a single netCDF file on the SSD of your laptop, most of the heavy lifting has
in fact already been done: you then have something that can be read into memory at once, in a second or less.</p>
<p>So does this mean that you are better off downloading everything locally and processing on your own resources?
In fact not: the graph below shows the reading speed of an openEO cluster that is processing a number of batch jobs.
As you can see, in this case it was able to read EO data at speeds between 4 GB/s and 10 GB/s, which will be hard
to achieve when going over the internet.</p>
<p><img alt="openEO IO" src="../_images/openeo_networkio.png" /></p>
</section>
<section id="example-architecture">
<h3>Example architecture<a class="headerlink" href="#example-architecture" title="Permalink to this heading">#</a></h3>
<p>There is no single openEO architecture, as openEO is just an API that can be implemented on top of many technologies. That is also
Expand Down Expand Up @@ -571,6 +592,7 @@ <h2>Continental scale processing<a class="headerlink" href="#continental-scale-p
<nav class="bd-toc-nav page-toc">
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#under-the-hood">Under the hood</a><ul class="nav section-nav flex-column">
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#performance-why-does-it-take-so-long-for-openeo-to-retrieve-a-simple-timeseries">Performance: why does it take so long for openEO to retrieve a simple timeseries?</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#example-architecture">Example architecture</a></li>
<li class="toc-h3 nav-item toc-entry"><a class="reference internal nav-link" href="#understanding-logs">Understanding logs</a></li>
</ul>
Expand Down
2 changes: 1 addition & 1 deletion searchindex.js

Large diffs are not rendered by default.
