Commit

Merge pull request #27 from tinaok/dask_update
update dask
tinaok authored Oct 25, 2023
2 parents eb84b9a + 61ab51a commit e2db9ee
Showing 1 changed file with 14 additions and 8 deletions.
22 changes: 14 additions & 8 deletions tutorial/part3/scaling_dask.ipynb
@@ -317,14 +317,16 @@
"import xarray as xr\n",
"\n",
"catalogue=\"https://object-store.cloud.muni.cz/swift/v1/foss4g-catalogue/c_gls_NDVI-LTS_1999-2019.json\"\n",
"LTS = xr.open_mfdataset(\n",
"#catalogue=\"test.json\"\n",
"\n",
"LTS = xr.open_dataset(\n",
" \"reference://\", engine=\"zarr\",\n",
" backend_kwargs={\n",
" \"storage_options\": {\n",
" \"fo\":catalogue\n",
" },\n",
" \"consolidated\": False\n",
" }\n",
" },chunks={}\n",
")\n",
"LTS"
]
@@ -347,7 +349,7 @@
"outputs": [],
"source": [
"save = LTS.sel(lat=45.50, lon=9.36, method='nearest')['min'].mean()\n",
"save.data"
"save"
]
},
{
@@ -359,7 +361,9 @@
"\n",
"We didn't 'compute' anything. We just built a Dask task graph with it's size indicated as count above, but did not ask Dask to return a result.\n",
"\n",
"But the 'task Count' we see above is more than 6000 for just computing a mean on 36 temporal steps. This is too much. If you have such case, to avoid unecessary operations, you can optimize the task using `dask.optimize`. \n",
"Here, you can check 'Dask graph' with how many layers of graph you have, to estimate the complexity of your computation.\n",
"\n",
"It is indicated that you have '7 graph'. this can be optimised with following step \n",
"\n",
"Lets try to plot the dask graph before computation and understand what dask workers will do to compute the value we asked for. "
]
@@ -375,8 +379,10 @@
{
"cell_type": "code",
"execution_count": null,
"id": "22c6888b-de87-4989-8975-50a0d2a1fcbe",
"metadata": {},
"id": "c0a8c5ab-eda3-4d1c-a2dd-2e616c0d9ade",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import dask\n",
@@ -389,7 +395,7 @@
"id": "537cd461-8f9d-4651-9190-73d5eb6a40ef",
"metadata": {},
"source": [
"Now our task is reduced to about 100. Lets try to visualise it:"
"Now our graph is reduced 1. Lets try to visualise it:"
]
},
{
@@ -976,7 +982,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.11.6"
}
},
"nbformat": 4,
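For readers without access to the remote catalogue, the effect of `dask.optimize` described in the notebook text can be sketched on a small synthetic array. This is only an illustration under stated assumptions: the array `cube`, its shape and chunking, and the pixel indices are hypothetical stand-ins for the NDVI dataset, and the exact task counts depend on the installed Dask version.

```python
import dask
import dask.array as da

# Hypothetical stand-in for the NDVI cube: 36 time steps on a 100x100 grid,
# split into 4 spatial chunks per time step (not the real dataset).
cube = da.ones((36, 100, 100), chunks=(1, 50, 50))

# Select a single pixel and average over time. This only builds a task graph;
# nothing is computed yet.
lazy_mean = (cube[:, 45, 9] + 1).mean()

# Number of tasks in the graph before optimization.
tasks_before = len(lazy_mean.__dask_graph__())

# dask.optimize returns a tuple with one optimized collection per input.
(optimized_mean,) = dask.optimize(lazy_mean)

# Number of tasks after culling unused chunks and fusing operations.
tasks_after = len(optimized_mean.__dask_graph__())

print(f"tasks before: {tasks_before}, tasks after: {tasks_after}")

# Only now do we ask Dask to return a result.
result = optimized_mean.compute()
print(result)
```

The optimized collection gives the same value as the original one; only the graph that Dask has to execute is smaller, because chunks that do not contain the selected pixel are dropped.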
