
Commit

More formatting fixes
jacobtomlinson committed Nov 23, 2023
1 parent 9a299ea commit af1459b
Showing 1 changed file with 6 additions and 7 deletions.
@@ -177,10 +177,7 @@
 "security_group = \"rapidsaiclouddeploymenttest-nsg\"\n",
 "vm_size = \"Standard_NC12s_v3\" # or choose a different GPU enabled VM type\n",
 "\n",
-"docker_image = (\n",
-" \"rapidsai/base:23.08-cuda12.0-py3.10\"\n",
-" # nvcr.io/nvidia/rapidsai/base:23.08-cuda12.0-py3.10\n",
-")\n",
+"docker_image = \"{{rapids_container}}\"\n",
 "docker_args = \"--shm-size=256m\"\n",
 "worker_class = \"dask_cuda.CUDAWorker\""
@@ -349,7 +346,7 @@
 "source": [
 "#### a. Install `packer`\n",
 "\n",
-"Follow the guidelines in https://learn.hashicorp.com/tutorials/packer/get-started-install-cli?in=packer/azure-get-started to download the necessary binary according to your platform and install it. "
+"Follow the [getting started guide](https://learn.hashicorp.com/tutorials/packer/get-started-install-cli?in=packer/azure-get-started) to download the necessary binary according to your platform and install it. "
 ]
 },
 {
@@ -617,7 +614,7 @@
 "```\n",
 "\n",
 "```{note}\n",
-"**NOTE on $n\\_workers$ as a parameter:** The number of actual workers that our cluster would have is not always equal to the number of VMs spawned, i.e. the value of $n\\_workers$ passed in. If the number of GPUs in the chosen `vm_size` is $G$ and the number of VMs spawned is $n\\_workers$, then we have the number of actual workers $W = n\\_workers \\times G$. For example, for Standard_NC12s_v3 VMs that have 2 V100 GPUs per VM, for $n\\_workers=2$, we have $W = 2 \\times 2=4$.\n",
+"The number of actual workers that our cluster would have is not always equal to the number of VMs spawned, i.e. the value of $n\\_workers$ passed in. If the number of GPUs in the chosen `vm_size` is $G$ and the number of VMs spawned is $n\\_workers$, then we have the number of actual workers $W = n\\_workers \\times G$. For example, for `Standard_NC12s_v3` VMs that have 2 V100 GPUs per VM, for $n\\_workers=2$, we have $W = 2 \\times 2=4$.\n",
 "```"
 ]
 },
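The worker-count relationship in the note above is simple arithmetic; here is a minimal sketch (the helper name `total_workers` is ours for illustration, not part of the notebook):

```python
def total_workers(n_vms: int, gpus_per_vm: int) -> int:
    """Dask-CUDA starts one worker per GPU, across every spawned VM."""
    return n_vms * gpus_per_vm

# Standard_NC12s_v3 provides 2 V100 GPUs per VM, so n_workers=2 yields:
print(total_workers(2, 2))  # → 4
```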
@@ -1446,7 +1443,9 @@
 "\n",
 "The `add_features` function combines the two to produce a new dataframe that has the added features.\n",
 "\n",
-"**NOTE:** In the function `persist_train_infer_split`, we will also persist the test dataset on the workers. If `X_infer`, i.e. the test dataset, is small enough, we can call `compute()` on it to bring the test dataset to the local machine and then perform predict on it. But in general, if `X_infer` is large, it may not fit in the GPU(s) of the local machine. Moreover, moving around a large amount of data will also add to the prediction latency. Therefore it is better to persist the test dataset on the dask workers, and then call the predict functionality on the individual workers. Finally we collect the prediction results from the dask workers. "
+"```{note}\n",
+"In the function `persist_train_infer_split`, we will also persist the test dataset on the workers. If `X_infer`, i.e. the test dataset, is small enough, we can call `compute()` on it to bring the test dataset to the local machine and then perform predict on it. But in general, if `X_infer` is large, it may not fit in the GPU(s) of the local machine. Moreover, moving around a large amount of data will also add to the prediction latency. Therefore it is better to persist the test dataset on the dask workers, and then call the predict functionality on the individual workers. Finally we collect the prediction results from the dask workers. \n",
+"```"
 ]
 },
 {
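The size-based reasoning in the note above can be summarized as a small decision rule; this is an illustrative sketch only (the function name and the idea of a fixed byte threshold are hypothetical, not from the notebook):

```python
def inference_placement(infer_nbytes: int, local_gpu_mem_nbytes: int) -> str:
    """Decide where the test dataset should live before calling predict.

    Small data can be gathered locally with compute(); large data should
    stay persisted on the Dask workers to avoid transfer latency and
    local GPU out-of-memory errors.
    """
    if infer_nbytes < local_gpu_mem_nbytes:
        return "compute() locally"
    return "persist() on workers"

# A 2 GiB test set fits in a 16 GiB local GPU; a 64 GiB one does not.
print(inference_placement(2 * 1024**3, 16 * 1024**3))   # → compute() locally
print(inference_placement(64 * 1024**3, 16 * 1024**3))  # → persist() on workers
```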
