Investigate OWP Inundation Mapping Repo #101

Closed
rajadain opened this issue Oct 5, 2022 · 1 comment

rajadain commented Oct 5, 2022

Investigate this repo: https://github.com/NOAA-OWP/inundation-mapping

Try to execute one of their workflows, see where the NWM fits in, and summarize how the NWM is used in that workflow.

Add discovery notes throughout the investigation.

Connects #37

vlulla commented Oct 7, 2022

Fernando shared some test data with us that we can use to try the flood inundation mapping (FIM) workflow from the https://github.com/NOAA-OWP/inundation-mapping repo, and he emailed us instructions for generating the FIM maps. Here are the steps, with the minor modifications I had to make, to generate the FIM TIFFs.

  1. https://github.com/NOAA-OWP/inundation-mapping/wiki/FIM-Hydrofabric is an excellent place to learn about the FIM hydrofabric. Its Input Data Dictionary page is quite useful: in addition to describing the files/folders that make up the hydrofabric, it links to the Original Data Source for each of them!

  2. Create an EC2 instance (I used r5a.xlarge) with the following programs installed: docker, conda, python3, p7zip-full, aws-cli. Make sure the instance has at least 400 GB of storage space.
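The prerequisites in step 2 can be checked before any data is copied. Below is a minimal preflight sketch, not part of the original instructions. The function names `missing_tools` and `has_free_gib` are hypothetical, and the binary names `7z` (from p7zip-full) and `aws` (from aws-cli) are assumptions about what those packages install.

```shell
# Preflight sketch: report missing tools and check free disk space.

# Print the name of every command in "$@" that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Succeed if the filesystem holding $1 has at least $2 GiB available.
has_free_gib() {
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 {print $4}')
    [ "$avail_kib" -ge $(( $2 * 1024 * 1024 )) ]
}

# Assumed binary names: p7zip-full provides `7z`, aws-cli provides `aws`.
missing=$(missing_tools docker conda python3 7z aws)
[ -z "$missing" ] || echo "Missing tools:" $missing
has_free_gib "${HOME:-/}" 400 || echo "Less than 400 GiB free under ${HOME:-/}"
```

The script only prints warnings, so it is safe to run on a fresh instance before installing anything.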

  3. From the EC2 instance, copy the test data that Fernando shared with us. The data are stored in s3://azavea-noaa-hydro-data-fim/FIM-example-data-from-Fernando/ for the us-west-2 (Oregon) region and in s3://azavea-noaa-hydro-data/FIM-example-data-from-Fernando/ for the us-east-1 (N. Virginia) region. You can confirm each bucket's region on the main S3 page. Below are the steps to copy the data to your EC2 instance:

    ubuntu@ec2:~ $ mkdir -p ${HOME}/data/{temp,outputs}/ && cd ${HOME}/data/outputs
    ubuntu@ec2:~/data/outputs $ aws s3 cp s3://azavea-noaa-hydro-data-fim/FIM-example-data-from-Fernando/3dep_test_1202_10m_FR.7z .
    ubuntu@ec2:~/data/outputs $ aws s3 cp s3://azavea-noaa-hydro-data-fim/FIM-example-data-from-Fernando/3dep_test_1202_10m_GMS.7z .
    ubuntu@ec2:~/data/outputs $ aws s3 cp s3://azavea-noaa-hydro-data-fim/FIM-example-data-from-Fernando/ .. --recursive --exclude '*' --include '*.csv'
    ubuntu@ec2:~/data/outputs $ 7z x 3dep_test_1202_10m_FR.7z
    ubuntu@ec2:~/data/outputs $ 7z x 3dep_test_1202_10m_GMS.7z # takes 2.25 hrs!

    NOTE: Make sure you use the S3 bucket that is in the same region as your instance. Transferring between S3 and EC2 costs nothing when the bucket and the instance are in the same region; if they are in different regions, region-to-region transfer costs apply.
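To avoid picking the wrong bucket by hand, the region-to-bucket mapping from step 3 can be written down once. This is a sketch only: `bucket_for_region` is a hypothetical helper, and reading the region from `AWS_DEFAULT_REGION` is an assumption about how the instance is configured. The script prints the `aws s3 cp` command rather than running it.

```shell
# Map an AWS region to the matching test-data bucket named in step 3.
bucket_for_region() {
    case "$1" in
        us-west-2) echo "azavea-noaa-hydro-data-fim" ;;
        us-east-1) echo "azavea-noaa-hydro-data" ;;
        *) echo "no test-data bucket known for region: $1" >&2; return 1 ;;
    esac
}

# Assumption: the instance region is available in AWS_DEFAULT_REGION;
# otherwise set REGION by hand.
REGION="${AWS_DEFAULT_REGION:-us-west-2}"
if BUCKET=$(bucket_for_region "$REGION"); then
    # Print the copy command instead of executing it.
    echo "aws s3 cp s3://${BUCKET}/FIM-example-data-from-Fernando/3dep_test_1202_10m_FR.7z ."
fi
```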

  4. Build the docker image. This takes about 10 minutes. Here are the steps that I followed.

    ubuntu@ec2:~ $ cd ${HOME} && git clone https://github.com/NOAA-OWP/inundation-mapping
    ubuntu@ec2:~ $ sudo groupadd -g $(awk -v FS=$'=' '/ARG GroupID/{print $2}' ${HOME}/inundation-mapping/Dockerfile ) fim ## slightly modified from README.md; the Dockerfile lives in the cloned repo
    ubuntu@ec2:~ $ sudo usermod -a -G fim ${USER} ## can only chgrp if I belong to that group! Ensure that fim shows up in the output of the `id` command. If not, log out and log back in.
    ubuntu@ec2:~ $ chgrp -R fim ${HOME}/inundation-mapping ${HOME}/data/
    ubuntu@ec2:~ $ ## cd inundation-mapping && git checkout dev-lidar ## Docker fails in dev-lidar!
    ubuntu@ec2:~ $ cd inundation-mapping
    ubuntu@ec2:~/inundation-mapping $ docker buildx build -t owp-fim:$(git rev-parse --short HEAD) -f ./Dockerfile --progress=plain . > docker-build-$(\date +"%Y%m%d").log 2>&1
    ubuntu@ec2:~/inundation-mapping $ docker rmi $(docker images --filter dangling=true -q) 2>/dev/null

    Now run the container:

    ubuntu@ec2:~ $ docker run -ti --name fim --hostname vl-fim --rm \
                              -v ${HOME}/inundation-mapping:/foss_fim -v ${HOME}/data:/data \
                              $(docker image ls --filter=reference="owp-fim" --format='{{.Repository}}:{{.Tag}}')

    Ensure that the /foss_fim/ and /data/outputs/ directories in the container contain the expected files; ls -lh /foss_fim/ and ls -lh /data/ ought to list them.
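The `groupadd` line in step 4 relies on awk to pull the group ID out of the Dockerfile. The sketch below shows that extraction on its own against a made-up Dockerfile fragment. The value 1234 and the other lines in the fragment are hypothetical; the real value comes from the repo's Dockerfile.

```shell
# Hypothetical Dockerfile fragment; the real GroupID value lives in the repo's Dockerfile.
dockerfile=$(mktemp)
cat > "$dockerfile" <<'EOF'
FROM ubuntu:20.04
ARG GroupID=1234
ARG GroupName=example
EOF

# Split each line on '=' and print the second field of the line matching "ARG GroupID".
gid=$(awk -F= '/ARG GroupID/ {print $2}' "$dockerfile")

# Show the command that step 4 would end up running.
echo "sudo groupadd -g ${gid} fim"
rm -f "$dockerfile"
```

Creating the host group with the same numeric ID as the one baked into the image is what lets files in the mounted volumes be shared between the host user and the container.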

  5. From within the container, run the following commands. The parentheses just start a subshell so that these environment variables are set only temporarily.

    root@vl-fim # (
        export HUC=12020001 
        export DAT=${outputDataDir%/*}
        export FR=${outputDataDir}/3dep_test_1202_10m_FR 
        /foss_fim/tools/inundation.py \
            -r ${FR}/${HUC}/rem_zeroed_masked.tif \
            -c ${FR}/${HUC}/gw_catchments_reaches_filtered_addedAttributes.tif \
            -b ${FR}/${HUC}/gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg \
            -t ${FR}/${HUC}/hydroTable.csv \
            -f ${DAT}/ble_huc_${HUC}_flows_500yr.csv \
            -i ${DAT}/temp/testing_inundation_fr_500yr_${HUC}.tif
      )

    The script above is for the FR dataset, the smaller of the two test datasets that Fernando shared with us.
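Two shell features carry the setup in the script above: the `${outputDataDir%/*}` expansion and the subshell. The sketch below isolates both. It assumes that `outputDataDir` is `/data/outputs` inside the container, which matches the volume mount in step 4 but is set here by hand.

```shell
# Start from a clean slate so the result does not depend on the caller's environment.
unset DAT FR

# Assumption: inside the container outputDataDir points at /data/outputs.
outputDataDir=/data/outputs

(
    # %/* strips the shortest trailing "/..." match, i.e. the last path element.
    export DAT=${outputDataDir%/*}
    export FR=${outputDataDir}/3dep_test_1202_10m_FR
    echo "inside subshell:  DAT=${DAT} FR=${FR}"
)

# The subshell has exited, so its exports are gone from the parent shell.
echo "outside subshell: DAT=${DAT:-<unset>}"
```

Under that assumption `DAT` resolves to `/data`, which is where step 3 placed the flow CSV files.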

    Generating the FIM for the GMS dataset is a two-step process! NOTE: Ensure that the /data/temp directory already exists (step 3 creates it on the host as ${HOME}/data/temp); otherwise you'll get an IndexError from the /foss_fim/tools/gms_tools/inundate_gms.py script!

    ## Step 1
    root@vl-fim # (
        export HUC=12020001 
        export DAT=${outputDataDir%/*}
        export GMS=${outputDataDir}/3dep_test_1202_10m_GMS 
        ## If outputDataDir is not set in the container, hard-code the two paths instead:
        ## export DAT=/data
        ## export GMS=/data/outputs/3dep_test_1202_10m_GMS
        /foss_fim/tools/gms_tools/inundate_gms.py \
           -y ${GMS}/ \
           -u ${HUC} \
           -f ${DAT}/ble_huc_${HUC}_flows_500yr.csv \
           -i ${DAT}/temp/testing_inundation_gms_500yr_${HUC}.tif \
           -o ${DAT}/temp/testing_inundation_gms_500yr_${HUC}_map_file.csv \
           -w $(( $(nproc) - 2 )) \
           -v
        )
    
    ## Step 2
    root@vl-fim # (
        export HUC=12020001 
        export DAT=${outputDataDir%/*}
        export GMS=${outputDataDir}/3dep_test_1202_10m_GMS 
        /foss_fim/tools/gms_tools/mosaic_inundation.py \
           -i ${DAT}/temp/testing_inundation_gms_500yr_${HUC}_map_file.csv \
           -m ${DAT}/temp/testing_inundation_gms_500yr_${HUC}.tif \
           -t inundation_rasters \
           -w 0 \
           -v
        )

    NOTE: Something appears to be wrong with the /foss_fim/tools/gms_tools/mosaic_inundation.py script. The threaded version (i.e., without the -w 0 option) generates a TIFF with the correct dimensions/shape but with NaN for all the values. The two code branches in that script appear to perform different steps. This took quite a bit of time to figure out! (2022.10.06)
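Step 1 passes `-w $(( $(nproc) - 2 ))`, which evaluates to zero or a negative number on a machine with two or fewer cores. A small guard can clamp the value. This is a sketch: `worker_count` is a hypothetical helper, and how inundate_gms.py treats each worker value has not been checked here.

```shell
# Return the worker count for a machine with $1 cores: cores - 2, but never below 1.
worker_count() {
    n=$(( $1 - 2 ))
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# nproc is part of GNU coreutils; fall back to 1 core if it is unavailable.
cores=$(nproc 2>/dev/null || echo 1)
echo "-w $(worker_count "$cores")"
```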

    To generate the maps for all 7 HUCs, I just ran a for loop over an array of the 7 HUC codes!

    ## Combined steps...
    root@vl-fim # (
          export HUCS=( 12020001 12020002 12020003 12020004 12020005 12020006 12020007 )
          for HUC in "${HUCS[@]}"; do
            export DAT=${outputDataDir%/*}
            export GMS=${outputDataDir}/3dep_test_1202_10m_GMS 
            /foss_fim/tools/gms_tools/inundate_gms.py \
               -y ${GMS}/ \
               -u ${HUC} \
               -f ${DAT}/ble_huc_${HUC}_flows_500yr.csv \
               -i ${DAT}/temp/testing_inundation_gms_500yr_${HUC}.tif \
               -o ${DAT}/temp/testing_inundation_gms_500yr_${HUC}_map_file.csv \
               -w $(( $(nproc) - 2 )) \
               -v
    
            /foss_fim/tools/gms_tools/mosaic_inundation.py \
               -i ${DAT}/temp/testing_inundation_gms_500yr_${HUC}_map_file.csv \
               -m ${DAT}/temp/testing_inundation_gms_500yr_${HUC}.tif \
               -t inundation_rasters \
               -w 0 \
               -v
            echo "Done ${HUC}..."
          done
        )
    root@vl-fim # ls -lh /data/temp/testing_inundation_gms_500yr_1202000{1..7}.tif
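The final `ls` confirms by eye that all seven mosaics were written. The same check can be scripted so that a failed HUC stands out. This is a sketch: `missing_outputs` is a hypothetical helper, and the /data/temp location and file-name pattern are taken from the loop above.

```shell
# Print the expected mosaic path for every HUC whose output file is missing or empty.
# $1 is the output directory; the remaining arguments are HUC codes.
missing_outputs() {
    dir=$1
    shift
    for huc in "$@"; do
        tif="${dir}/testing_inundation_gms_500yr_${huc}.tif"
        # -s is true only for files that exist and have a size greater than zero.
        [ -s "$tif" ] || printf '%s\n' "$tif"
    done
}

# Assumption: outputs live in /data/temp, as in the loop above.
missing_outputs /data/temp 12020001 12020002 12020003 12020004 12020005 12020006 12020007
```

This only tests that each file exists and is non-empty. It would not catch the all-NaN rasters described in the note about mosaic_inundation.py, which needs a look at the pixel values.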
