forked from rapidsai/notebooks
Merge pull request rapidsai#3 from mt-jones/cudf-refactor
[REVIEW] adding notebooks and utilities for the mortgage workflow
Showing 8 changed files with 979 additions and 2 deletions.
@@ -1,2 +1,4 @@
-# notebooks
-RAPIDS Sample Notebooks
+# RAPIDS Notebooks and Utilities
+
+* `mortgage`: contains the notebook which runs ETL + ML on the Mortgage Dataset derived from [Fannie Mae’s Single-Family Loan Performance Data](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html) ... download the mortgage dataset for use with the notebook [here](https://rapidsai.github.io/datasets/)
+* `utils`: contains a set of useful scripts for interacting with RAPIDS
@@ -0,0 +1,95 @@
# Utility Scripts

## Summary

* `start-jupyter.sh`: starts a JupyterLab environment for interacting with, and running, notebooks
* `stop-jupyter.sh`: identifies all process IDs associated with Jupyter and kills them
* `dask-cluster.py`: launches a configured Dask cluster (a set of nodes) for use within a notebook
* `dask-setup.sh`: a low-level script for constructing a set of Dask workers on a single node

## start-jupyter

Typical output for `start-jupyter.sh` will be of the following form:

```bash
jupyter-lab --allow-root --ip=0.0.0.0 --no-browser --NotebookApp.token=''

[I 09:58:01.481 LabApp] Writing notebook server cookie secret to /run/user/10060/jupyter/notebook_cookie_secret
[W 09:58:01.928 LabApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
[I 09:58:01.945 LabApp] JupyterLab extension loaded from /conda/envs/cudf/lib/python3.6/site-packages/jupyterlab
[I 09:58:01.945 LabApp] JupyterLab application directory is /conda/envs/cudf/share/jupyter/lab
[W 09:58:01.946 LabApp] JupyterLab server extension not enabled, manually loading...
[I 09:58:01.949 LabApp] JupyterLab extension loaded from /conda/envs/cudf/lib/python3.6/site-packages/jupyterlab
[I 09:58:01.949 LabApp] JupyterLab application directory is /conda/envs/cudf/share/jupyter/lab
[I 09:58:01.950 LabApp] Serving notebooks from local directory: /workspace/notebooks/notebooks
[I 09:58:01.950 LabApp] The Jupyter Notebook is running at:
[I 09:58:01.950 LabApp] http://(dgx15 or 127.0.0.1):8888/
[I 09:58:01.950 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
```

`jupyter-lab` exposes a JupyterLab server on port `8888`. Opening a web browser and navigating to `http://YOUR.IP.ADDRESS:8888` provides a GUI which can be used to edit and run code. Because the server is started with `--NotebookApp.token=''`, authentication is disabled (see the warning in the log above), so anyone who can reach the port can run code.

## stop-jupyter

Sometimes a server needs to be forcibly shut down. Running

```bash
notebooks$ bash utils/stop-jupyter.sh
```

will send `kill -9` to every process owned by the current user whose `ps` entry mentions `jupyter`, which shuts down all of that user's JupyterLab servers on the machine.
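
As a rough illustration of what the script looks for, the sketch below pulls the PIDs of one user's Jupyter processes out of `ps aux`-style text. The function name `jupyter_pids` and the sample lines are assumptions made for this example; the script itself uses a `grep` pipeline followed by `kill -9`, and this only approximates its matching.

```python
def jupyter_pids(ps_output, user):
    """Return PIDs of lines that mention 'jupyter' and belong to `user`.

    Assumes the `ps aux` layout: USER is the first column, PID the second.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or "jupyter" not in line:
            continue
        if fields[0] == user and fields[1].isdigit():
            pids.append(int(fields[1]))
    return pids


# Hypothetical sample input, not captured from a real machine.
sample = """\
USER       PID %CPU %MEM COMMAND
alice     4321  0.5  1.2 /conda/envs/cudf/bin/python jupyter-lab --no-browser
alice     4400  0.0  0.1 bash
bob       5001  0.3  0.9 jupyter-lab --no-browser
"""
print(jupyter_pids(sample, "alice"))
```

Listing the PIDs first, before killing anything, is a cheap way to check that the match is not broader than intended.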

## dask-cluster

This is a Python script used to launch a Dask cluster. A configuration file is provided at `/path/to/notebooks/utils/dask.conf`.

```bash
notebooks$ cat utils/dask.conf

ENVNAME cudf

NWORKERS 8

12.34.567.890 MASTER

DASK_SCHED_PORT 8786
DASK_SCHED_BOKEH_PORT 8787
DASK_WORKER_BOKEH_PORT 8790

DEBUG
```

* `ENVNAME cudf`: a keyword to tell `dask-cluster.py` the name of the virtual environment where `cudf` is installed
* `NWORKERS 8`: a keyword to tell `dask-cluster.py` how many workers to instantiate on the node which called `dask-cluster.py`
* `12.34.567.890 MASTER`: a map of `IP.ADDRESS {WORKER/MASTER}`
* `DASK_SCHED_PORT 8786`: a keyword to tell `dask-cluster.py` which port is assigned to the Dask scheduler
* `DASK_SCHED_BOKEH_PORT 8787`: a keyword to tell `dask-cluster.py` which port is assigned to the scheduler's visual front-end
* `DASK_WORKER_BOKEH_PORT 8790`: a keyword to tell `dask-cluster.py` which port is assigned to the worker's visual front-end
* `DEBUG`: a keyword to tell `dask-cluster.py` to launch all Dask workers with log-level set to DEBUG
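
The keywords above could be read into a dictionary roughly as follows. This is a minimal sketch, not the parser used by `dask-cluster.py`: the function name `parse_dask_conf`, the `NODES` key, and the strict error on unknown lines are all assumptions made for illustration.

```python
KEYWORDS = {
    "ENVNAME",
    "NWORKERS",
    "DASK_SCHED_PORT",
    "DASK_SCHED_BOKEH_PORT",
    "DASK_WORKER_BOKEH_PORT",
}


def parse_dask_conf(text):
    """Parse dask.conf-style text into a dict of settings (values stay strings)."""
    conf = {"DEBUG": False, "NODES": {}}
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue  # skip blank lines
        if tokens == ["DEBUG"]:
            conf["DEBUG"] = True
        elif len(tokens) == 2 and tokens[0] in KEYWORDS:
            conf[tokens[0]] = tokens[1]
        elif len(tokens) == 2 and tokens[1] in ("MASTER", "WORKER"):
            # "IP.ADDRESS ROLE" lines map a node address to its role.
            conf["NODES"][tokens[0]] = tokens[1]
        else:
            raise ValueError(f"unrecognized line: {raw!r}")
    return conf


example = """
ENVNAME cudf

NWORKERS 8

12.34.567.890 MASTER

DASK_SCHED_PORT 8786
DASK_SCHED_BOKEH_PORT 8787
DASK_WORKER_BOKEH_PORT 8790

DEBUG
"""
print(parse_dask_conf(example))
```

Handling the single-token `DEBUG` line before any two-token checks avoids indexing a field that is not there.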

## dask-setup

`dask-setup.sh` takes positional arguments, so order matters:

* `ENVNAME`: name of the virtual environment where `cudf` is installed
* `NWORKERS`: number of workers to create
* `DASK_SCHED_PORT`: port to assign the scheduler
* `DASK_SCHED_BOKEH_PORT`: port to assign the scheduler's front-end
* `DASK_WORKER_BOKEH_PORT`: port to assign the worker's front-end
* `IP.ADDRESS`: IP address of the master node, where the scheduler runs (the script reads this as `MASTER_IPADDR`)
* `{WORKER/MASTER}`: the node's title
* `DEBUG`: log-level (optional, case-sensitive); in the script, this runs each worker under `cuda-memcheck` and writes a per-worker log file

The script is called as follows:

```bash
notebooks$ bash utils/dask-setup.sh cudf 8 8786 8787 8790 12.34.567.890 MASTER DEBUG
```

Note: `DEBUG` is optional. This script was designed to be called by `dask-cluster.py`. It is not meant to be called directly by a user other than to kill all present Dask workers:

```bash
notebooks$ bash utils/dask-setup.sh 0
```
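
Since the argument order is easy to get wrong, one option is to assemble the command in a small helper. The sketch below is an illustration only: `build_setup_command` and its parameter names are hypothetical, and it simply mirrors the order in which `dask-setup.sh` reads `$1` through `$8`.

```python
def build_setup_command(envname, nworkers, sched_port, sched_bokeh_port,
                        worker_bokeh_port, master_ip, role, debug=False,
                        script="utils/dask-setup.sh"):
    """Return the argument list for dask-setup.sh in positional order."""
    if role not in ("MASTER", "WORKER"):
        raise ValueError(f"role must be MASTER or WORKER, got {role!r}")
    cmd = [
        "bash", script,
        str(envname),            # $1 ENVNAME
        str(nworkers),           # $2 NWORKERS
        str(sched_port),         # $3 DASK_SCHED_PORT
        str(sched_bokeh_port),   # $4 DASK_SCHED_BOKEH_PORT
        str(worker_bokeh_port),  # $5 DASK_WORKER_BOKEH_PORT
        str(master_ip),          # $6 MASTER_IPADDR
        role,                    # $7 WHOAMI
    ]
    if debug:
        cmd.append("DEBUG")      # $8 DEBUG (optional, case-sensitive)
    return cmd


cmd = build_setup_command("cudf", 8, 8786, 8787, 8790,
                          "12.34.567.890", "MASTER", debug=True)
print(" ".join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids splitting a command string on whitespace.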
@@ -0,0 +1,72 @@
import subprocess

# Read the cluster configuration, keeping non-empty lines as token lists.
dask_conf_path = "../utils/dask.conf"
with open(dask_conf_path, "r") as file:
    dask_conf = [line.split() for line in file if line.split()]

# Shut down any Dask cluster that is already running on this node.
cmd = "bash ../utils/dask-setup.sh 0"

print(cmd)

process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()

# Use the first address reported by `hostname` as this node's IP.
cmd = "hostname --all-ip-addresses"
process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()
IPADDR = str(output.decode()).split()[0]

ENVNAME = None
NWORKERS = None
DASK_SCHED_PORT = None
DASK_SCHED_BOKEH_PORT = None
DASK_WORKER_BOKEH_PORT = None
MASTER_IPADDR = None
WHOAMI = None
DEBUG = None

for line in dask_conf:
    key = line[0]
    # Single-token lines such as DEBUG have no second field.
    value = line[1] if len(line) > 1 else None
    if key == "ENVNAME":
        ENVNAME = value
    if key == "NWORKERS":
        NWORKERS = value
    if key == "DASK_SCHED_PORT":
        DASK_SCHED_PORT = value
    if key == "DASK_SCHED_BOKEH_PORT":
        DASK_SCHED_BOKEH_PORT = value
    if key == "DASK_WORKER_BOKEH_PORT":
        DASK_WORKER_BOKEH_PORT = value
    if value == "MASTER":
        MASTER_IPADDR = key
    if key == IPADDR:
        WHOAMI = value
    if key == "DEBUG":
        DEBUG = "DEBUG"

# Assemble the positional arguments in the order dask-setup.sh expects.
cmd = "bash ../utils/dask-setup.sh " + str(ENVNAME)
cmd = cmd + " " + str(NWORKERS)
cmd = cmd + " " + str(DASK_SCHED_PORT)
cmd = cmd + " " + str(DASK_SCHED_BOKEH_PORT)
cmd = cmd + " " + str(DASK_WORKER_BOKEH_PORT)
cmd = cmd + " " + str(MASTER_IPADDR)
cmd = cmd + " " + str(WHOAMI)
if DEBUG is not None:
    cmd = cmd + " " + str(DEBUG)

print(cmd)

process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()

# List the screen sessions so the user can verify what was started.
cmd = "screen -list"

process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
output, error = process.communicate()
print(output.decode())
@@ -0,0 +1,112 @@
#!/bin/bash
export NCCL_P2P_DISABLE=1
# export NCCL_SOCKET_IFNAME=ib

export DASK_DISTRIBUTED__SCHEDULER__WORK_STEALING=False
export DASK_DISTRIBUTED__SCHEDULER__BANDWIDTH=1

# Positional arguments (order matters).
ENVNAME=$1
NWORKERS=$2
DASK_SCHED_PORT=$3
DASK_SCHED_BOKEH_PORT=$4
DASK_WORKER_BOKEH_PORT=$5
MASTER_IPADDR=$6
WHOAMI=$7
DEBUG=$8

DASK_LOCAL_DIR=./.dask
NUM_GPUS=$(nvidia-smi --list-gpus | wc --lines)
MY_IPADDR=($(hostname --all-ip-addresses))

mkdir -p $DASK_LOCAL_DIR

echo -e "\n"

echo "shutting down current dask cluster if it exists..."
NUM_SCREENS=$(screen -list | grep --only-matching --extended-regexp '[0-9]\ Socket|[0-9]{1,10}\ Sockets' | grep --only-matching --extended-regexp '[0-9]{1,10}')
SCREENS=($(screen -list | grep --only-matching --extended-regexp '[0-9]{1,10}\.dask|[0-9]{1,10}\.gpu' | grep --only-matching --extended-regexp '[0-9]{1,10}'))
# Compare numerically; `>` inside [[ ]] compares strings.
if [[ "${NUM_SCREENS:-0}" -gt 0 ]]; then
    screen -wipe
    # Quit only the dask/gpu sessions collected in SCREENS.
    for screen_id in "${SCREENS[@]}"; do
        echo "$screen_id"
        screen -S "$screen_id" -X quit
    done
fi
echo "... cluster shut down"

echo -e "\n"

if [[ "0" -lt "$NWORKERS" ]] && [[ "$NWORKERS" -le "$NUM_GPUS" ]]; then

    if [[ "$WHOAMI" = "MASTER" ]]; then
        echo "initializing dask scheduler..."
        screen -dmS dask_scheduler bash -c "source activate $ENVNAME && dask-scheduler"
        sleep 5
        echo "... scheduler started"
    fi

    echo -e "\n"

    echo "starting $NWORKERS worker(s)..."
    declare -a WIDS
    for worker_id in $(seq 1 $NWORKERS); do
        # Rotate the device list so worker N sees GPU N-1 first.
        start=$(( worker_id - 1 ))
        end=$(( NWORKERS - 1 ))
        other=$(( start - 1 ))
        devs=$(seq --separator=, $start $end)
        second=$(seq --separator=, 0 $other)
        if [ "$second" != "" ]; then
            devs="$devs,$second"
        fi
        echo "... starting gpu worker $worker_id"

        # The original referenced an undefined $name; it is assumed here
        # to be the worker's name.
        name="${MY_IPADDR[0]}_gpu_$worker_id"

        if [[ "$DEBUG" = "DEBUG" ]]; then
            export create_worker="source activate $ENVNAME && \
                cuda-memcheck dask-worker $MASTER_IPADDR:$DASK_SCHED_PORT \
                --host=${MY_IPADDR[0]} --no-nanny \
                --nprocs=1 --nthreads=1 \
                --memory-limit=0 --name $name \
                --local-directory $DASK_LOCAL_DIR/$name"
            export logfile="${DASK_LOCAL_DIR}/gpu_worker_${worker_id}_log.txt"
            env CUDA_VISIBLE_DEVICES=$devs screen -dmS gpu_worker_$worker_id \
                bash -c 'script -c "$create_worker" "$logfile"'
        else
            export create_worker="source activate $ENVNAME && \
                dask-worker $MASTER_IPADDR:$DASK_SCHED_PORT \
                --host=${MY_IPADDR[0]} --no-nanny \
                --nprocs=1 --nthreads=1 \
                --memory-limit=0 --name $name \
                --local-directory $DASK_LOCAL_DIR/$name"
            env CUDA_VISIBLE_DEVICES=$devs screen -dmS gpu_worker_$worker_id \
                bash -c "$create_worker"
        fi

        # Nothing is backgrounded with `&`, so $! is empty here;
        # WIDS is not used afterwards.
        WIDS[$worker_id]=$!
    done
    sleep 5

    echo -e "\n"

    echo "... $NWORKERS worker(s) successfully started"

    echo -e "\n"
fi

if [[ "$NWORKERS" -eq "0" ]]; then
    NUM_SCREENS=$(screen -list | grep --only-matching --extended-regexp '[0-9]\ Socket|[0-9]{1,10}\ Sockets' | grep --only-matching --extended-regexp '[0-9]{1,10}')
    if [[ $NUM_SCREENS == "" ]]; then
        echo "cluster shut down successfully"
        echo "verifying status:"
        screen -list
    fi
fi

if [[ "0" -lt "$NWORKERS" ]]; then
    echo "printing status ..."
    echo -e "\n"
    screen -list
    echo -e "\n"
fi
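
The device ordering that `dask-setup.sh` passes to each worker through `CUDA_VISIBLE_DEVICES` is a rotation of `0..NWORKERS-1`, so that each worker sees a different GPU first. The Python sketch below reproduces that arithmetic for illustration; the function name `visible_devices` is hypothetical and is not part of the repository.

```python
def visible_devices(worker_id, nworkers):
    """Return the device list string for a 1-based worker_id.

    Mirrors the shell logic: devices worker_id-1 .. nworkers-1 first,
    followed by 0 .. worker_id-2.
    """
    if not 1 <= worker_id <= nworkers:
        raise ValueError("worker_id must be between 1 and nworkers")
    start = worker_id - 1
    order = list(range(start, nworkers)) + list(range(0, start))
    return ",".join(str(d) for d in order)


for wid in range(1, 5):
    print(wid, visible_devices(wid, 4))
```

With four workers, worker 1 gets `0,1,2,3` and worker 2 gets `1,2,3,0`, so the first visible device differs per worker while every worker can still address all of the listed GPUs.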
@@ -0,0 +1,11 @@
ENVNAME cudf

NWORKERS 8

12.34.567.890 MASTER

DASK_SCHED_PORT 8786
DASK_SCHED_BOKEH_PORT 8787
DASK_WORKER_BOKEH_PORT 8790

DEBUG
@@ -0,0 +1,5 @@
#!/bin/bash
echo -e "\n"
echo "jupyter-lab --allow-root --ip=0.0.0.0 --no-browser --NotebookApp.token=''"
echo -e "\n"
jupyter-lab --allow-root --ip=0.0.0.0 --no-browser --NotebookApp.token=''
@@ -0,0 +1,7 @@
#!/bin/bash
ps aux | grep jupyter | \
    grep --extended-regexp "$USER[\ ]{1,10}[0-9]{1,10}" | \
    grep --only-matching --extended-regexp "$USER[\ ]{1,10}[0-9]{1,10}" | \
    grep --only-matching --extended-regexp "[\ ]{1,10}[0-9]{1,10}" | \
    xargs kill -9
sleep 2