diff --git a/doc/sphinx/dev_cheatsheet.rst b/doc/sphinx/dev_cheatsheet.rst
index 0bc32852d..20f2918d8 100644
--- a/doc/sphinx/dev_cheatsheet.rst
+++ b/doc/sphinx/dev_cheatsheet.rst
@@ -8,31 +8,31 @@ Creating and submitting a POD
    - Run the unmodified MDTF-diagnostics package to make sure that your conda installation, directory structure, etc. are set up properly
    - Modify the conda environment to work for your POD by adding a configuration file ``MDTF_diagnostics/src/conda/env_[YOUR POD NAME].yml`` with any new required modules. Be sure to re-run ``MDTF-diagnostics/src/conda/conda_env_setup.sh`` to install your POD's environment if it requires a separate YAML file with additional modules.
    - Name your POD, make a directory for it in ``MDTF-diagnostics/diagnostics``, and move your code to your POD directory
-   - ``cp`` your observational data to ``MDTF_diagnostics/../inputdata/obs_data/[YOUR POD NAME]``
-   - If necessary, ``cp`` your model data to ``MDTF_diagnostics/../inputdata/model/[model dataset name]``
-2. Link your POD code into the framework
-
+   - ``cp`` your observational data to ``MDTF_diagnostics/../inputdata/obs_data/[YOUR POD NAME]``
+2. Link your POD code into the framework
    - Modify your POD's driver script (e.g., ``driver.py``) to interface with your code
    - Modify your POD's ``settings.jsonc`` to specify the variables that will be passed to the framework
    - Modify your code to use the environment variables provided by the framework (see the *Notes* for descriptions of the available environment variables)
    - Input files:
-     - model input data: ``MDTF-diagnostics/../inputdata/model/[dataset name]/[output frequency]``
+     - model input data: specified in an ESM-intake catalog
      - observational input data: ``MDTF-diagnostics/../inputdata/obs_data/[POD name]``
-     - Sample datasets should be submitted using the default directory structure
-     - You may re-define input data locations in the ``MODEL_DATA_ROOT`` and ``OBS_DATA_ROOT`` definitions in the ``default_tests.jsonc`` file (or whatever the name of your runtime settings jsonc file is).
+     - You may re-define the observational input data location in the ``OBS_DATA_ROOT`` setting in your runtime configuration file
    - Working files:
-     - ``${WK_DIR}`` is a framework environment variable defining the working directory. It is set to ``MDTF-diagnostics/../wkdir`` by default.
-     - ``${WK_DIR}`` contains temporary files and logs.
-     - You can modify ``${WK_DIR}`` by changing "WORKING_DIR" to the desired location in ``default_tests.jsonc``
+     - ``${WORK_DIR}`` is a framework environment variable defining the working directory. It is set to ``MDTF-diagnostics/../wkdir`` by default.
+     - ``${WORK_DIR}`` contains temporary files and logs.
+     - You can modify ``${WORK_DIR}`` by setting "WORK_DIR" to the desired location in ``templates/runtime_config.[jsonc|yml]``
    - Output files:
      - POD output files are written to the following locations by the framework:
-       - Postscript files: ``${WK_DIR}/[POD NAME]/[model,obs]/PS``
-       - Other files, including PNG plots: ``${WK_DIR}/[POD NAME]/[model,obs]``
-       - Set the "OUTPUT_DIR" option in default_tests.jsonc to write output files to a different location; "OUTPUT_DIR" defaults to "WORKING_DIR" if it is not defined.
+       - Postscript files: ``${WORK_DIR}/[POD NAME]/[model,obs]/PS``
+       - Other files, including PNG plots: ``${WORK_DIR}/[POD NAME]/[model,obs]``
+       - Set the "OUTPUT_DIR" option in your runtime configuration file to write output files to a different location; "OUTPUT_DIR" defaults to "WORK_DIR" if it is not defined.
    - Output figure locations:
-     - PNG files should be placed directly in ``$WK_DIR/obs/`` and ``$WK_DIR/model/``
-     - If a POD chooses to save vector-format figures, they should be written into the ``$WK_DIR/obs/PS`` and ``$WK_DIR/model/PS`` directories. Files in these locations will be converted by the framework to PNG, so use those names in the html file.
-     - If a POD uses matplotlib, it is recommended to write as figures as EPS instead of PS because of potential bugs
+     - PNG files should be placed directly in ``$WORK_DIR/obs/`` and ``$WORK_DIR/model/``
+     - If a POD chooses to save vector-format figures, they should be written into the ``$WORK_DIR/obs/PS`` and ``$WORK_DIR/model/PS`` directories. Files in these locations will be converted by the framework to PNG, so use those file names in the html template.
+     - If a POD uses matplotlib, it is recommended to write figures as EPS instead of PS because of potential bugs (see the sketch below)
    - Modify html files to point to the figure names
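+   - A minimal, hypothetical matplotlib sketch of this convention (the figure content and file names are illustrative only):
+
+     .. code-block:: python
+
+        # Hypothetical sketch: write POD figures where the framework expects them.
+        import os
+        import matplotlib
+        matplotlib.use("Agg")  # render without a display
+        import matplotlib.pyplot as plt
+
+        work_dir = os.environ["WORK_DIR"]
+        fig, ax = plt.subplots()
+        ax.plot(range(10), range(10))
+        ax.set_title("Example diagnostic")
+        # PNG files go directly in the model/ (or obs/) directory
+        fig.savefig(os.path.join(work_dir, "model", "example_plot.png"))
+        # vector copies go in the PS/ subdirectory; the framework converts them to PNG
+        fig.savefig(os.path.join(work_dir, "model", "PS", "example_plot.eps"))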
@@ -42,25 +42,39 @@ Creating and submitting a POD
 Notes:
 ------
 
--- **Make sure that WORKING_DIR and OUTPUT_DIR have enough space to hold datafor your POD(s) AND any PODs included in the package.**
+- **Make sure that WORK_DIR and OUTPUT_DIR have enough space to hold data for your POD(s) AND any PODs included in the package.**
 - Defining POD variables
   - Add variables to the ``varlist`` block in ``MDTF-diagnostics/diagnostics/[POD name]/settings.jsonc`` and define the following:
     - the variable name: the short name that will generate the corresponding ``${ENV_VAR}`` (e.g., "zg500" generates the ``${ENV_VAR}`` "zg500_var")
     - the standard name, with a corresponding entry in the appropriate fieldlist file(s)
     - variable units
     - variable dimensions (e.g., [time, lat, lon])
+    - variable realm (e.g., atmos, ocean, seaIce, land)
     - scalar coordinates for variables defined on a specific atmospheric pressure level (e.g., ``{"lev": 250}`` for a field on the 250-hPa pressure level)
     - If your variable is not in the necessary fieldlist file(s), add it to the file(s), or open an issue on GitHub requesting that the framework team add it. Once the files are updated, merge the changes from the main branch into your POD branch.
   - Note that the variable name and the standard name must be unique fieldlist entries
 - Environment variables
   - To define an environment variable specific to your POD, add a ``"pod_env_vars"`` block to the ``"settings"`` block in your POD's ``settings.jsonc`` file and define the desired variables
-  - Reference an environment variable in Python by calling ``os.environ["VARIABLE NAME"]``
+  - Reference the environment variables associated with a specific case in Python by reading the YAML file at ``os.environ["case_env_file"]`` into a Python dictionary and taking the value associated with the first case (assuming variable names and coordinates are identical for each case), e.g., ``tas_var = [case['tas_var'] for case in case_list.values()][0]``. See ``example_multicase.py`` for more information, and the sketch below.
   - NCL code can reference environment variables by calling ``getenv("VARIABLE NAME")``
   - Framework-specific environment variables include:
-    - OBS_DATA : path to the top-level directory containing any observational or reference data for your POD
-    - POD_HOME : Path to the top-level directory containing your POD's source code
-    - WK_DIR : path to the POD working directory
-    - DATADIR : Path to directory containing input data files for one case/experiment
-    - CASENAME : User-provided label describing the run of model data being analyzed
-    - FIRSTYR: Four-digit year describing the first year of the analysis period
-    - LASTYR: Four-digit year describing the last year of the analysis period
+    - case_env_file: path to a YAML file with case-specific environment variables:
+      - DATA_CATALOG: path to the ESM-intake catalog with model input files and metadata
+      - CASELIST: list of case identifiers corresponding to each model simulation
+      - startdate: string in yyyymmdd or yyyymmddHHMMSS format specifying the start date of the analysis period
+      - enddate: string in yyyymmdd or yyyymmddHHMMSS format specifying the end date of the analysis period
+      - [variable id]_var: environment variable name assigned to each variable
+      - time_coord: name of the time coordinate
+      - lat_coord: name of the latitude coordinate
+      - lon_coord: name of the longitude coordinate
+    - OBS_DATA: path to the top-level directory containing any observational or reference data for your POD
+    - WORK_DIR: path to the POD working directory
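+  - A minimal sketch of this pattern (key names follow the descriptions above; the exact structure of the case environment file is documented in ``example_multicase.py``, which should be treated as authoritative):
+
+    .. code-block:: python
+
+       import os
+
+       import intake  # intake-esm provides the open_esm_datastore driver
+       import yaml
+
+       # read the case environment file provided by the framework
+       with open(os.environ["case_env_file"], "r") as f:
+           case_info = yaml.safe_load(f)
+
+       # mapping of case names to per-case settings; key name follows the
+       # CASELIST entry described above (see example_multicase.py)
+       case_list = case_info["CASELIST"]
+
+       # assuming variable names and coordinates are identical for each case,
+       # take the values from the first case
+       tas_var = [case["tas_var"] for case in case_list.values()][0]
+       time_coord = [case["time_coord"] for case in case_list.values()][0]
+
+       # open the ESM-intake catalog that describes the model input files
+       cat = intake.open_esm_datastore(case_info["DATA_CATALOG"])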
diff --git a/doc/sphinx/dev_checklist.rst b/doc/sphinx/dev_checklist.rst
deleted file mode 100644
index 8eae4ffdd..000000000
--- a/doc/sphinx/dev_checklist.rst
+++ /dev/null
@@ -1,138 +0,0 @@
-.. _ref-dev-checklist:
-
-POD development checklist
-=========================
-
-This section lists all the steps that need to be taken in order to submit a POD for inclusion in the MDTF framework.
-
-Code and documentation submission
----------------------------------
-
-The material in this section must be submitted though a `pull request `__ to the `NOAA-GFDL GitHub repo `__. This is described in :doc:`dev_git_intro`.
-
-The `example POD `__ should be used as a reference for how each component of the submission should be structured.
-
-The POD feature must be up-to-date with the NOAA-GFDL main branch, with no outstanding merge conflicts. See :doc:`dev_git_intro` for instructions on syncing your fork with NOAA-GFDL, and pulling updates from the NOAA-GFDL main branch into your feature branch.
-
-POD source code
-^^^^^^^^^^^^^^^
-
-All scripts should be placed in a subdirectory of ``diagnostics/``. Among the scripts, there should be 1) a main driver script, 2) a template html, and 3) a ``settings.jsonc`` file. The POD directory and html template should be named after your POD's short name.
-
-  - For instance, ``diagnostics/convective_transition_diag/`` contains its driver script ``convective_transition_diag.py``, ``convective_transition_diag.html``, and ``settings.jsonc``, etc.
-
-  - The framework will call the driver script, which calls the other scripts in the same POD directory.
-
-  - If you need a new Conda environment, add a new .yml file to ``src/conda/``, and install the environment using the ``conda_env_setup.sh`` script as described in the :doc:`Getting Started `.
-
-
-POD settings file
-^^^^^^^^^^^^^^^^^
-
-The format of this file is described in :doc:`dev_settings_quick` and in more detail in :doc:`ref_settings`.
-
-POD html template for output
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-- The html template will be copied by the framework into the output directory to display the figures generated by the POD. You should be able to create a new html template by simply copying and modifying the example templates from existing PODs even without prior knowledge about html syntax.
-
-Preprocessing scripts for digested data
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The "digested" supporting data policy is described in :numref:`ref-pod-digested-data`.
-
-For maintainability and provenance purposes, we request that you include the code used to generate your POD's "digested" data from raw data sources (any source of data that's permanently hosted). This code will not be called by the framework and will not be used by end users, so the restrictions and guidelines concerning the POD code don't apply.
-
-
-POD documentation
-^^^^^^^^^^^^^^^^^
-
-- The documentation for the framework is automatically generated using `sphinx `__, which works with files in `reStructured text `__ (reST, ``.rst``) format. In order to include :doc:`documentation for your POD `, we require that it be in this format.
-
-  + Use the `example POD documentation `__ as a template for the information required for your POD, by modifying its .rst `source code `__. This should include a one-paragraph synopsis of the POD, developers’ contact information, required programming language and libraries, and model output variables, a brief summary of the presented diagnostics as well as references in which more in-depth discussions can be found.
-  + The .rst files and all linked figures should be placed in a ``doc`` subdirectory under your POD directory (e.g., ``diagnostics/convective_transition_diag/doc/``) and put the .rst file and figures inside.
-  + The most convenient way to write and debug reST documentation is with an online editor. We recommend `https://livesphinx.herokuapp.com/ `__ because it recognizes sphinx-specific commands as well.
-  + For reference, see the reStructured text `introduction `__, `quick reference `__ and `in-depth guide `__.
-  + Also see a reST `syntax comparison `__ to other text formats you may be familiar with.
-
-- For maintainability, all scripts should be self-documenting by including in-line comments. The main driver script (e.g., ``convective_transition_diag.py``) should contain a comprehensive header providing information that contains the same items as in the POD documentation, except for the "More about this diagnostic" section.
-
-- The one-paragraph POD synopsis (in the POD documentation) as well as a link to the full documentation should be placed at the top of the html template (e.g., ``convective_transition_diag.html``).
-
-Preprocessing script documentation
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The "digested" supporting data policy is described in :numref:`ref-pod-digested-data`.
-
-For maintainability purposes, include all information needed for a third party to reproduce your POD's digested data from its raw sources in the ``doc`` directory. This information is not published on the documentation website and can be in any format. In particular, please document the raw data sources used (DOIs/versioned references preferred) and the dependencies/build instructions (eg. conda environment) for your preprocessing script.
-
-
-Sample and supporting data submission
--------------------------------------
-
-Data hosting for the MDTF framework is currently managed manually. The data
-is hosted via anonymous FTP on UCAR's servers.
-
-
-Digested observational or supporting data
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Create a directory under ``inputdata/obs_data/`` named after the short name
-of your POD, and put all your *digested* observation data in (or more
-generally, any quantities that are independent of the model being
-analyzed). The "digested" data policy is described in :numref:`ref-pod-digested-data`.
-
-- Requirements
-  - Digested data should be in the form of numerical data, not figures.
-  - The data files should be small (preferably a few MB) and just enough for producing figures for model comparison. If you really cannot reduce the data size and your POD requires more than 1GB of space, consult with the lead team.
-  - Include in the directory a “README.txt” description file with original source info.
-  - Include in the directory any necessary licensing information, files, etc. (if applicable)
-
-- Create a tar file of your obs_data directory:
-  - Use the --hard_dereference flag so that all users can read your file.
-  - Naming convention: $pod_name.yyyymmdd.tar, where yyyymmdd is the file creation date. Alternatively, you may use some other version tag to allow the framework to check compatibiity between the POD code and data provided.
-  - Create the tar file from the inputdata directory so the file paths start with obs_data.
-  - Example (c-shell):
-
-  .. code-block:: console
-
-     set pod_name = MJO_suite
-     set tartail = `date +'%Y%m%d'`
-     cd inputdata/obs_data
-     tar cfh $pod_name.$tartail.tar --hard-dereference $pod_name
-
-  - To check:
-
-  .. code-block:: console
-
-     % tar tf $pod_name.$tartail.tar
-     MJO_suite/
-     MJO_suite/ERA.v200.EOF.summer-0.png
-     MJO_suite/ERA.u200.EOF.summer-1.png
-
-After following the above instructions, please refer to
-`the GitHub Discussion on transfering obs_data `__
-or email Dani Coleman at bundy at ucar dot edu or contact your liason on the
-MDTF Leads Team.
-
-Files will be posted for Guest/anonymous access :
-ftp://ftp.cgd.ucar.edu/archive/mdtf/obs_data_latest/{$pod_name}.latest.tar
-with 'latest' pointing to the date-or-version-tagged tar file
-
-
-Note that prior to version 3, obs_data from all PODs was consolidated in one
-tar file. To assist in usability as the number of PODs grow, they will now
-be available individually, with the responsiblity for creating the tar
-files on the developer.
-
-
-
-
-
-Sample model data
-^^^^^^^^^^^^^^^^^
-
-For PODs dealing with atmospheric phenomena, we recommend that you use sample data from the following sources, if applicable:
-
-- A timeslice run of `NCAR CAM5 `__
-- A timeslice run of `GFDL AM4 `__ (contact the leads for password).
diff --git a/doc/sphinx/dev_overview.rst b/doc/sphinx/dev_overview.rst
deleted file mode 100644
index 9e850a06a..000000000
--- a/doc/sphinx/dev_overview.rst
+++ /dev/null
@@ -1,41 +0,0 @@
-Introduction for POD developers
-===============================
-
-This walkthrough contains information for developers wanting to contribute a process-oriented diagnostic (POD) module to the MDTF framework. There are two tracks through the material: one for developers who have an existing analysis script they want to adapt for use in the framework, and one for developers who are writing a POD from scratch.
-
-:numref:`ref-dev-start` provides instructions for setting up POD development, in particular managing language and library dependencies through conda. For developers already familiar with version 2.0 of the framework, :numref:`ref-dev-migration` summarizes changes from v2.0 to facilitate migration to v3.0. New developers can skip this section, as the rest of this walkthrough is self-contained.
-
-:numref:`ref-dev-checklist` Provides a list of instructions for submitting a POD for inclusion in the framework. We require developers to submit PODs through `GitHub `__. See :doc:`dev_git_intro` for how to manage code through the GitHub website.
-
-:numref:`ref-dev-guidelines` provides overall guidelines for POD development. :numref:`ref-dev-settings-quick` is a reference for the POD's settings file format. In :numref:`ref-dev-walkthrough`, we walk the developers through the workflow of the framework, focusing on aspects that are relevant for the operation of individual PODs, and using the `Example Diagnostic POD `__ as a concrete example to illustrate how a POD works under the framework. :numref:`ref-dev-coding-tips` provides coding best practices to address common issues encountered in submitted PODs.
-
-
-
-Scope of a process-oriented diagnostic
---------------------------------------
-
-The MDTF framework imposes requirements on the types of data your POD outputs and takes as input. In addition to the scientific scope of process-oriented diagnostics, the analysis that you intend to do needs to fit the following model:
-
-Your POD should accept model data as input and express the results of its analysis in a series of figures, which are presented to the user in a web page. Input model data will be in the form of one NetCDF file (with accompanying dimension information) per variable, as requested in your POD’s :doc:`settings file `. Because your POD may be run on the output of any model, you should be careful about the assumptions your code makes about the layout of these files (eg, the range of longitude or the `positive `__ convention for vertical coordinates). Supporting data may be in any format and will not be modified by the framework (see next section).
-
-The above data sources are your POD’s only input: your POD should not access the internet or other networked resources. You may provide options in the settings file for the user to configure when the POD is installed, but these cannot be changed each time the POD is run.
-
-To achieve portability, the MDTF cannot accept PODs written in closed-source languages (eg, MATLAB or IDL).
-We also cannot accept PODs written in compiled languages (eg, C or Fortran): installation would rapidly become impractical if users had to check compilation options for each POD.
-
-The output of your POD should be a series of figures in vector format (.eps or .ps). Optionally, we encourage POD developers to also save relevant output data (e.g., the output data being plotted) as netcdf files, to give users the ability to take the POD’s output and perform further analysis on it.
-
-.. _ref-pod-digested-data:
-
-POD code organization and supporting data
------------------------------------------
-
-.. figure:: ../img/dev_obs_data.jpg
-   :align: center
-   :width: 100 %
-
-In order to make your code run faster for the users, we request that you separate any calculations that don’t depend on the model data (e.g., pre-processing of observational data), and instead save the end result of these calculations in data files for your POD to read when it is run. We refer to this as “digested observational data,” but it refers to any quantities that are independent of the model being analyzed. For purposes of data provenance, reproducibility, and code maintenance, we request that you include all the pre-processing/data reduction scripts used to create the digested data in your POD’s code base, along with references to the sources of raw data these scripts take as input (yellow box in the figure).
-
-Digested data should be in the form of numerical data, not figures, even if the only thing the POD does with the data is produce an unchanging reference plot. We encourage developers to separate their “number-crunching code” and plotting code in order to give end users the ability to customize output plots if needed. In order to keep the amount of supporting data needed by the framework manageable, we request that you limit the total amount of digested data you supply to no more than a few gigabytes.
-
-In collaboration with PCMDI, a framework is being advanced that can help systematize the provenance of observational data used for POD development. This section will be updated when this data source is ready for public use.
-
diff --git a/doc/sphinx/dev_toc.rst b/doc/sphinx/dev_toc.rst
deleted file mode 100644
index 052eb1b28..000000000
--- a/doc/sphinx/dev_toc.rst
+++ /dev/null
@@ -1,15 +0,0 @@
-Developer information
----------------------
-
-.. toctree::
-   :maxdepth: 1
-   :numbered: 2
-
-   dev_overview
-   dev_checklist
-   dev_start
-   dev_guidelines
-   dev_settings_quick
-   dev_walkthrough
-   dev_coding_tips
-   dev_git_intro
diff --git a/doc/sphinx/pod_requirements.rst b/doc/sphinx/pod_requirements.rst
new file mode 100644
index 000000000..c91f66e50
--- /dev/null
+++ b/doc/sphinx/pod_requirements.rst
@@ -0,0 +1,180 @@
+.. _ref-pod-requirements:
+
+POD requirements
+================
+
+This section lists all the steps that need to be taken in order to submit a POD for inclusion in the MDTF framework.
+
+Code and documentation submission
+---------------------------------
+
+The material in this section must be submitted through a
+`pull request `__
+to the `NOAA-GFDL GitHub repo <https://github.com/NOAA-GFDL/MDTF-diagnostics>`__.
+This is described in :doc:`dev_git_intro`.
+
+Use the `example_multicase POD <https://github.com/NOAA-GFDL/MDTF-diagnostics/tree/main/diagnostics/example_multicase>`__
+as a reference for how each component of the submission should be structured.
+
+The POD feature branch must be up-to-date with the NOAA-GFDL main branch, with no outstanding merge conflicts.
+See :doc:`dev_git_intro` for instructions on syncing your fork with NOAA-GFDL, and pulling updates from
+the NOAA-GFDL main branch into your feature branch.
+
+POD source code
+^^^^^^^^^^^^^^^
+
+All scripts should be placed in a subdirectory of ``diagnostics/``. Among the scripts, there should be 1) a main
+driver script, 2) a template html, and 3) a ``settings.jsonc`` file. The POD directory and html template should be
+named after your POD's short name.
+
+For instance, ``diagnostics/convective_transition_diag/`` contains its driver script
+``convective_transition_diag.py``, ``convective_transition_diag.html``, and ``settings.jsonc``, etc.
+
+The framework will call the driver script, which calls the other scripts in the same POD directory.
+
+If you need a new Conda environment, add a new .yml file to ``src/conda/``, and install the environment using the
+``conda_env_setup.sh`` or ``micromamba_env_setup.sh`` scripts as described in the :doc:`Getting Started `.
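+
+The structure of the driver script itself is up to you. A purely illustrative sketch of the overall shape
+(the names below are hypothetical, not a required interface):
+
+.. code-block:: python
+
+   # example_pod.py -- hypothetical driver script skeleton.
+   #
+   # The comprehensive header described under "POD documentation" below goes
+   # here: synopsis, developer contact information, required language and
+   # libraries, and the model variables the POD uses.
+   import os
+
+   def main():
+       # locations provided by the framework as environment variables
+       work_dir = os.environ["WORK_DIR"]
+       obs_data = os.environ["OBS_DATA"]
+       # ... read the model and digested observational data, compute the
+       # diagnostics, and write figures under work_dir ...
+
+   if __name__ == "__main__":
+       main()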
+
+POD settings file
+^^^^^^^^^^^^^^^^^
+
+The format of this file is described in :doc:`dev_settings_quick` and in more detail in :doc:`ref_settings`.
+
+POD html template for output
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The html template will be copied by the framework into the output directory to display the figures generated by the
+POD. You should be able to create a new html template by simply copying and modifying the example templates from
+existing PODs, even without prior knowledge of html syntax.
+
+Preprocessing scripts for digested data
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The "digested" supporting data policy is described in :numref:`ref-pod-digested-data`.
+
+For maintainability and provenance purposes, we request that you include the code used to generate your POD's
+"digested" data from raw data sources (any source of data that's permanently hosted).
+This code will not be called by the framework and will not be used by end users, so the restrictions
+and guidelines concerning the POD code don't apply.
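+
+For example, such a preprocessing script might be a short, self-contained sketch like the following
+(the file names, variable, and reduction performed are all hypothetical; any code that produces small,
+numerical, model-independent files is fine):
+
+.. code-block:: python
+
+   # Hypothetical sketch: reduce raw observational data to a small "digested"
+   # file that the POD reads at runtime. This script is kept for provenance
+   # only; it is never called by the framework.
+   import xarray as xr
+
+   # raw source: e.g., a large, permanently hosted reanalysis file
+   ds = xr.open_dataset("era5_t2m_1979-2020.nc")
+
+   # reduce the raw data to the quantity the POD actually plots,
+   # e.g., a monthly climatology
+   clim = ds["t2m"].groupby("time.month").mean("time")
+
+   # save the small digested result under inputdata/obs_data/[POD name]/
+   clim.to_netcdf("inputdata/obs_data/example_pod/t2m_climatology.nc")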
+
+POD documentation
+^^^^^^^^^^^^^^^^^
+
+The documentation for the framework is automatically generated using
+`sphinx <https://www.sphinx-doc.org/>`__, which works with files in
+`reStructured text `__ (reST, ``.rst``) format.
+In order to include :doc:`documentation for your POD `, we require that it be in this format.
+
+Use the `example_multicase POD documentation `__
+as a template for the information required for your POD, by modifying its .rst
+`source code `__.
+The documentation should include the following information:
+
+- a one-paragraph synopsis of the POD
+- the developers’ contact information
+- the required programming language and libraries
+- a brief summary of the presented diagnostics
+- references in which more in-depth discussions can be found
+
+The .rst files and all linked figures should be placed in a ``doc`` subdirectory under your POD directory
+(e.g., ``diagnostics/example_multicase/doc/``).
+
+The most convenient way to write and debug reST documentation is with an online editor.
+We recommend `https://livesphinx.herokuapp.com/ <https://livesphinx.herokuapp.com/>`__
+because it recognizes sphinx-specific commands as well.
+
+For reference, see the reStructured text
+`introduction `__,
+`quick reference `__ and
+`in-depth guide `__.
+
+Also see a reST `syntax comparison `__
+to other text formats you may be familiar with.
+
+- For maintainability, all scripts should be self-documenting by including in-line comments.
+  The main driver script (e.g., ``example_multicase.py``) should contain a comprehensive header providing
+  the same items as in the POD documentation, except for the "More about this diagnostic" section.
+
+- The one-paragraph POD synopsis (in the POD documentation), as well as a link to the full documentation, should be
+  placed at the top of the html template (e.g., ``example_multicase.html``).
+
+Preprocessing script documentation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The "digested" supporting data policy is described in :numref:`ref-pod-digested-data`.
+
+For maintainability purposes, include all information needed for a third party to reproduce your POD's digested data
+from its raw sources in the ``doc`` directory. This information is not published on the documentation website
+and can be in any format. In particular, please document the raw data sources used (DOIs/versioned references
+preferred) and the dependencies/build instructions (e.g., conda environment) for your preprocessing script.
+
+
+Sample and supporting data submission
+-------------------------------------
+
+Data hosting for the MDTF framework is currently managed manually. The data
+is hosted via anonymous FTP on UCAR's servers.
+
+
+Digested observational or supporting data
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Create a directory under ``inputdata/obs_data/`` named after the short name
+of your POD, and put all your *digested* observational data there (or, more
+generally, any quantities that are independent of the model being
+analyzed). The "digested" data policy is described in :numref:`ref-pod-digested-data`.
+
+- Requirements:
+
+  - Digested data should be in the form of numerical data, not figures.
+  - The data files should be small (preferably a few MB) and just enough for producing figures for model comparison.
+    If you really cannot reduce the data size and your POD requires more than 1 GB of space, consult with the lead team.
+  - Include in the directory a “README.txt” description file with original source info.
+  - Include in the directory any necessary licensing information, files, etc. (if applicable).
+
+- Create a tar file of your obs_data directory:
+
+  - Use the ``--hard-dereference`` flag so that all users can read your file.
+  - Naming convention: ``$pod_name.yyyymmdd.tar``, where yyyymmdd is the file creation date.
+    Alternatively, you may use some other version tag to allow the framework to check compatibility between the POD
+    code and the data provided.
+  - Create the tar file from the inputdata directory so the file paths start with obs_data.
+  - Example (c-shell):
+
+    .. code-block:: console
+
+       set pod_name = MJO_suite
+       set tartail = `date +'%Y%m%d'`
+       cd inputdata/obs_data
+       tar cfh $pod_name.$tartail.tar --hard-dereference $pod_name
+
+  - To check:
+
+    .. code-block:: console
+
+       % tar tf $pod_name.$tartail.tar
+       MJO_suite/
+       MJO_suite/ERA.v200.EOF.summer-0.png
+       MJO_suite/ERA.u200.EOF.summer-1.png
+
+After following the above instructions, please refer to
+`the GitHub Discussion on transferring obs_data `__
+or email Dani Coleman at bundy at ucar dot edu, or contact your liaison on the
+MDTF Leads Team.
+
+Files will be posted for Guest/anonymous access at
+ftp://ftp.cgd.ucar.edu/archive/mdtf/obs_data_latest/{$pod_name}.latest.tar,
+with 'latest' pointing to the date-or-version-tagged tar file.
+
+Note that prior to version 3, obs_data from all PODs was consolidated in one
+tar file. To assist usability as the number of PODs grows, tar files will now
+be available individually, with the responsibility for creating them
+resting on the developer.
+
+Sample model data
+^^^^^^^^^^^^^^^^^
+
+For PODs dealing with atmospheric phenomena, we recommend that you use sample data from the following sources,
+if applicable:
+
+- A timeslice run of `NCAR CAM5 `__
+- A timeslice run of `GFDL AM4 `__ (contact the leads for password).
diff --git a/templates/runtime_config.jsonc b/templates/runtime_config.jsonc
index 1a593b905..902d3346a 100755
--- a/templates/runtime_config.jsonc
+++ b/templates/runtime_config.jsonc
@@ -52,7 +52,9 @@
 // code directory. Environment variables (eg, $HOME) can be referenced with a
 // "$" and will be expanded to their current values when the framework runs.
 // Full or relative path to model data ESM-intake catalog header file
+"DATA_CATALOG": "./diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.json",
+// Backwards compatibility
 "MODEL_DATA_ROOT": "../inputdata/mdtf_test_data",
 
 // Parent directory containing observational data used by individual PODs.
diff --git a/templates/runtime_config.yml b/templates/runtime_config.yml
index cfeec9280..e5c6496cb 100755
--- a/templates/runtime_config.yml
+++ b/templates/runtime_config.yml
@@ -64,6 +64,7 @@ overwrite: False
 # Set to True to run the preprocessor
 run_pp: True
 # Additional custom preprocessing scripts to run on data
-# place these scripts in the user_scripts directory of your copy of the MDTF-diagnostics package
+# place these scripts in the MDTF-diagnostics/user_scripts directory
+# The framework will run the specified scripts whether run_pp is set to True or False
 user_pp_scripts:
   - ""