From 3f364f9a90e7c956f273075fdaa205d926c71800 Mon Sep 17 00:00:00 2001 From: vivekyadav26 Date: Wed, 13 Sep 2023 00:46:24 +0000 Subject: [PATCH] deploy: f90f2a17307c19eaa7c7bd4aef3d8249dda2bc83 --- development.html | 4 +--- development.md | 4 +--- search/search_index.json | 2 +- sitemap.xml | 12 ++++++------ sitemap.xml.gz | Bin 258 -> 258 bytes userguide.html | 4 ++++ userguide.md | 2 ++ 7 files changed, 15 insertions(+), 13 deletions(-) diff --git a/development.html b/development.html index cd1aa18..6f3216c 100644 --- a/development.html +++ b/development.html @@ -897,10 +897,8 @@


diff --git a/development.md b/development.md index b70d16c..49ec94c 100644 --- a/development.md +++ b/development.md @@ -90,9 +90,7 @@ The sampler function follows these primary steps: 3. **Households and Persons Selection**: The function selects households based on the calculated sampling rates. It also selects persons associated with the sampled households. -4. **Output**: - - The selected households and persons are written to output CSV files in the specified output directory. - - The function also computes and logs the total sampling rate, representing the proportion of selected households relative to the total number of households. +4. **Output**: The selected households and persons are written to output CSV files in the specified output directory. The function also computes and logs the total sampling rate, representing the proportion of selected households relative to the total number of households. Note that in the current RSM deployment, the sampler is set to use a 25% default sampling rate. The intelligent sampler needs further testing before it can be used to sample households based on accessibility changes. diff --git a/search/search_index.json b/search/search_index.json index bec6c72..bd56daf 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"index.html","title":"SANDAG Rapid Strategic Model","text":"

Welcome to the SANDAG Rapid Strategic Model documentation site!

"},{"location":"index.html#introduction","title":"Introduction","text":"

The travel demand model SANDAG used for the 2021 regional plan, referred to as ABM2+, is one of the most sophisticated modeling tools used anywhere in the world. Its activity-based approach to representing travel is behaviorally rich, and land development and transportation infrastructure are represented in high-fidelity spatial detail. An operational shortcoming of ABM2+ is that it requires significant computational resources to carry out a simulation. A typical forecast-year simulation of ABM2+ takes over 40 hours to complete on a high-end workstation (e.g., 48 physical computing cores and 256 gigabytes of RAM). The components of this runtime include:

The computational time of ABM2+, and the likely computational time of the successor to ABM2+ (ABM3), hinders SANDAG\u2019s ability to carry out certain analyses in a timely manner. For example, if an analyst wants to explore 10 different roadway pricing schemes for a select corridor, roughly a month of computation time would be required (10 scenarios at 40-plus hours each is over 400 hours of computing).

SANDAG requires a tool capable of quickly approximating the outcomes of ABM2+. A tool was therefore built for this purpose, referred to henceforth as the Rapid Strategic Model (RSM). The primary objective of the RSM was to enhance the speed of the resident passenger component within the broader modeling system while producing results that closely align with ABM2+ for policy planning requirements.

"},{"location":"index.html#use-cases-and-key-limitations","title":"Use Cases and Key Limitations","text":"

Based on a set of tests done as part of this project, RSM performs well for regional-scale roadway projects (e.g., auto operating costs, mileage fees, TNC costs and wait times) and regional-scale transit projects (e.g., transit fares, headway changes). RSM also performed well for land-use change policies. Lastly, RSM was tested for local roadway changes (e.g., managed-lanes conversion) and local transit changes (e.g., a new BRT line), and the results indicate that those policies are reasonably represented by RSM as well.

Here are some of the current limitations of RSM:

"},{"location":"api.html","title":"Application Programming Interface","text":""},{"location":"api.html#rsm.zone_agg.aggregate_zones","title":"aggregate_zones(mgra_gdf, method='kmeans', n_zones=2000, random_state=0, cluster_factors=None, cluster_factors_onehot=None, use_xy=True, explicit_agg=(), explicit_col='mgra', agg_instruction=None, start_cluster_ids=13)","text":"

Aggregate zones.

"},{"location":"api.html#rsm.zone_agg.aggregate_zones--parameters","title":"Parameters","text":"

mgra_gdf : mgra_gdf (GeoDataFrame) Geometry and attributes of MGRAs method : method (array) default {\u2018kmeans\u2019, \u2018agglom\u2019, \u2018agglom_adj\u2019} n_zones : n_zones (int) random_state : random_state (RandomState or int) cluster_factors : cluster_factors (dict) cluster_factors_onehot : cluster_factors_onehot (dict) use_xy : use_xy (bool or float) Use X and Y coordinates as a cluster factor; use a float to scale the x-y coordinates from the CRS if needed. explicit_agg : explicit_agg (list[int or list]) A list containing integers (individual MGRAs that should not be aggregated) or lists of integers (groups of MGRAs that should be aggregated exactly as given, with no less and no more) explicit_col : explicit_col (str) The name of the column containing the ID\u2019s from explicit_agg, usually \u2018taz\u2019 agg_instruction : agg_instruction (dict) Dictionary passed to pandas agg that says how to aggregate data columns. start_cluster_ids : start_cluster_ids (int, default 13) Cluster id\u2019s start at this value. Can be 1, but SANDAG typically reserves the smallest id\u2019s for external zones, so starting at a greater value is usual.

"},{"location":"api.html#rsm.zone_agg.aggregate_zones--returns","title":"Returns","text":"

GeoDataFrame

Source code in rsm/zone_agg.py
def aggregate_zones(\n    mgra_gdf,\n    method=\"kmeans\",\n    n_zones=2000,\n    random_state=0,\n    cluster_factors=None,\n    cluster_factors_onehot=None,\n    use_xy=True,\n    explicit_agg=(),\n    explicit_col=\"mgra\",\n    agg_instruction=None,\n    start_cluster_ids=13,\n):\n\"\"\"\n    Aggregate zones.\n\n    Parameters\n    ----------\n    mgra_gdf : mgra_gdf (GeoDataFrame)\n        Geometry and attibutes of MGRAs\n    method : method (array)\n        default {'kmeans', 'agglom', 'agglom_adj'}\n    n_zones : n_zones (int)\n    random_state : random_state (RandomState or int)\n    cluster_factors : cluster_factors (dict)\n    cluster_factors_onehot : cluster_factors_onehot (dict)\n    use_xy : use_xy (bool or float)\n        Use X and Y coordinates as a cluster factor, use a float to scale the\n        x-y coordinates from the CRS if needed.\n    explicit_agg : explicit_agg (list[int or list])\n        A list containing integers (individual MGRAs that should not be aggregated)\n        or lists of integers (groups of MGRAs that should be aggregated exactly as\n        given, with no less and no more)\n    explicit_col : explicit_col (str)\n        The name of the column containing the ID's from `explicit_agg`, usually 'taz'\n    agg_instruction : agg_instruction (dict)\n        Dictionary passed to pandas `agg` that says how to aggregate data columns.\n    start_cluster_ids : start_cluster_ids (int, default 13)\n        Cluster id's start at this value.  Can be 1, but typically SANDAG has the\n        smallest id's reserved for external zones, so starting at a greater value\n        is typical.\n\n    Returns\n    -------\n    GeoDataFrame\n    \"\"\"\n\n    if cluster_factors is None:\n        cluster_factors = {}\n\n    n = start_cluster_ids\n    if explicit_agg:\n        explicit_agg_ids = {}\n        for i in explicit_agg:\n            if isinstance(i, Number):\n                explicit_agg_ids[i] = n\n            else:\n                for j in i:\n                    explicit_agg_ids[j] = n\n            n += 1\n        if explicit_col == mgra_gdf.index.name:\n            mgra_gdf = mgra_gdf.reset_index()\n            mgra_gdf.index = mgra_gdf[explicit_col]\n        in_explicit = mgra_gdf[explicit_col].isin(explicit_agg_ids)\n        mgra_gdf_algo = mgra_gdf.loc[~in_explicit].copy()\n        mgra_gdf_explicit = mgra_gdf.loc[in_explicit].copy()\n        mgra_gdf_explicit[\"cluster_id\"] = mgra_gdf_explicit[explicit_col].map(\n            explicit_agg_ids\n        )\n        n_zones_algorithm = n_zones - len(\n            mgra_gdf_explicit[\"cluster_id\"].value_counts()\n        )\n    else:\n        mgra_gdf_algo = mgra_gdf.copy()\n        mgra_gdf_explicit = None\n        n_zones_algorithm = n_zones\n\n    if use_xy:\n        geometry = mgra_gdf_algo.centroid\n        X = list(geometry.apply(lambda p: p.x))\n        Y = list(geometry.apply(lambda p: p.y))\n        factors = [np.asarray(X) * use_xy, np.asarray(Y) * use_xy]\n    else:\n        factors = []\n    for cf, cf_wgt in cluster_factors.items():\n        factors.append(cf_wgt * mgra_gdf_algo[cf].values.astype(np.float32))\n    if cluster_factors_onehot:\n        for cf, cf_wgt in cluster_factors_onehot.items():\n            factors.append(cf_wgt * OneHotEncoder().fit_transform(mgra_gdf_algo[[cf]]))\n        from scipy.sparse import hstack\n\n        factors2d = []\n        for j in factors:\n            if j.ndim < 2:\n                factors2d.append(np.expand_dims(j, -1))\n            else:\n                
factors2d.append(j)\n        data = hstack(factors2d).toarray()\n    else:\n        data = np.array(factors).T\n\n    if method == \"kmeans\":\n        kmeans = KMeans(n_clusters=n_zones_algorithm, random_state=random_state)\n        kmeans.fit(data)\n        cluster_id = kmeans.labels_\n    elif method == \"agglom\":\n        agglom = AgglomerativeClustering(\n            n_clusters=n_zones_algorithm, affinity=\"euclidean\", linkage=\"ward\"\n        )\n        agglom.fit_predict(data)\n        cluster_id = agglom.labels_\n    elif method == \"agglom_adj\":\n        from libpysal.weights import Rook\n\n        w_rook = Rook.from_dataframe(mgra_gdf_algo)\n        adj_mat = nx.adjacency_matrix(w_rook.to_networkx())\n        agglom = AgglomerativeClustering(\n            n_clusters=n_zones_algorithm,\n            affinity=\"euclidean\",\n            linkage=\"ward\",\n            connectivity=adj_mat,\n        )\n        agglom.fit_predict(data)\n        cluster_id = agglom.labels_\n    else:\n        raise NotImplementedError(method)\n    mgra_gdf_algo[\"cluster_id\"] = cluster_id\n\n    if mgra_gdf_explicit is None or len(mgra_gdf_explicit) == 0:\n        combined = merge_zone_data(\n            mgra_gdf_algo,\n            agg_instruction,\n            cluster_id=\"cluster_id\",\n        )\n        combined[\"cluster_id\"] = list(range(n, n + n_zones_algorithm))\n    else:\n        pending = []\n        for df in [mgra_gdf_algo, mgra_gdf_explicit]:\n            logger.info(f\"... merging {len(df)}\")\n            pending.append(\n                merge_zone_data(\n                    df,\n                    agg_instruction,\n                    cluster_id=\"cluster_id\",\n                ).reset_index()\n            )\n\n        pending[0][\"cluster_id\"] = list(range(n, n + n_zones_algorithm))\n\n        pending[0] = pending[0][\n            [c for c in pending[1].columns if c in pending[0].columns]\n        ]\n        pending[1] = pending[1][\n            [c for c in pending[0].columns if c in pending[1].columns]\n        ]\n        combined = pd.concat(pending, ignore_index=False)\n    combined = combined.reset_index(drop=True)\n\n    return combined\n
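For orientation, here is a minimal usage sketch of aggregate_zones. The shapefile path, the popden/empden/pop/emp columns, and the explicit MGRA ids are illustrative assumptions, not names taken from the RSM inputs.

```python
# Hypothetical usage sketch of rsm.zone_agg.aggregate_zones; the file path
# and the popden/empden/pop/emp columns are illustrative assumptions.
import geopandas as gpd
from rsm.zone_agg import aggregate_zones

mgra = gpd.read_file("mgra.shp").set_index("mgra")  # assumed MGRA layer

clustered = aggregate_zones(
    mgra,
    method="kmeans",
    n_zones=2000,
    cluster_factors={"popden": 1.0, "empden": 1.0},  # assumed density columns
    use_xy=0.001,                    # scale CRS x-y so they don't dominate
    explicit_agg=[101, [102, 103]],  # keep 101 alone; merge 102+103 exactly
    explicit_col="mgra",
    agg_instruction={"pop": "sum", "emp": "sum"},  # how to combine data columns
    start_cluster_ids=13,            # low ids are reserved for external zones
)
print(clustered["cluster_id"].nunique())  # expect 2000 aggregated zones
```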
"},{"location":"api.html#rsm.input_agg.agg_input_files","title":"agg_input_files(model_dir='.', rsm_dir='.', taz_cwk_file='taz_crosswalk.csv', mgra_cwk_file='mgra_crosswalk.csv', agg_zones=2000, ext_zones=12, input_files=['microMgraEquivMinutes.csv', 'microMgraTapEquivMinutes.csv', 'walkMgraTapEquivMinutes.csv', 'walkMgraEquivMinutes.csv', 'bikeTazLogsum.csv', 'bikeMgraLogsum.csv', 'zone.term', 'zones.park', 'tap.ptype', 'accessam.csv', 'ParkLocationAlts.csv', 'CrossBorderDestinationChoiceSoaAlternatives.csv', 'TourDcSoaDistanceAlts.csv', 'DestinationChoiceAlternatives.csv', 'SoaTazDistAlts.csv', 'TripMatrices.csv', 'transponderModelAccessibilities.csv', 'crossBorderTours.csv', 'internalExternalTrips.csv', 'visitorTours.csv', 'visitorTrips.csv', 'householdAVTrips.csv', 'crossBorderTrips.csv', 'TNCTrips.csv', 'airport_out.SAN.csv', 'airport_out.CBX.csv', 'TNCtrips.csv'])","text":""},{"location":"api.html#rsm.input_agg.agg_input_files--parameters","title":"Parameters","text":"

model_dir : model_dir (path_like) path to full model run, default \u201c.\u201d rsm_dir : rsm_dir (path_like) path to RSM, default \u201c.\u201d taz_cwk_file : taz_cwk_file (csv file) default taz_crosswalk.csv taz to aggregated zones file. Should be located in RSM input folder mgra_cwk_file : mgra_cwk_file (csv file) default mgra_crosswalk.csv mgra to aggregated zones file. Should be located in RSM input folder input_files : input_files (csv + other files) list of input files to be aggregated. Should include the following files \u201cmicroMgraEquivMinutes.csv\u201d, \u201cmicroMgraTapEquivMinutes.csv\u201d, \u201cwalkMgraTapEquivMinutes.csv\u201d, \u201cwalkMgraEquivMinutes.csv\u201d, \u201cbikeTazLogsum.csv\u201d, \u201cbikeMgraLogsum.csv\u201d, \u201czone.term\u201d, \u201czones.park\u201d, \u201ctap.ptype\u201d, \u201caccessam.csv\u201d, \u201cParkLocationAlts.csv\u201d, \u201cCrossBorderDestinationChoiceSoaAlternatives.csv\u201d, \u201cTourDcSoaDistanceAlts.csv\u201d, \u201cDestinationChoiceAlternatives.csv\u201d, \u201cSoaTazDistAlts.csv\u201d, \u201cTripMatrices.csv\u201d, \u201ctransponderModelAccessibilities.csv\u201d, \u201ccrossBorderTours.csv\u201d, \u201cinternalExternalTrips.csv\u201d, \u201cvisitorTours.csv\u201d, \u201cvisitorTrips.csv\u201d, \u201chouseholdAVTrips.csv\u201d, \u201ccrossBorderTrips.csv\u201d, \u201cTNCTrips.csv\u201d, \u201cairport_out.SAN.csv\u201d, \u201cairport_out.CBX.csv\u201d, \u201cTNCtrips.csv\u201d

"},{"location":"api.html#rsm.input_agg.agg_input_files--returns","title":"Returns","text":"

Aggregated files in the RSM input/output/uec directory

Source code in rsm/input_agg.py
def agg_input_files(\n    model_dir = \".\", \n    rsm_dir = \".\",\n    taz_cwk_file = \"taz_crosswalk.csv\",\n    mgra_cwk_file = \"mgra_crosswalk.csv\",\n    agg_zones=2000,\n    ext_zones=12,\n    input_files = [\"microMgraEquivMinutes.csv\", \"microMgraTapEquivMinutes.csv\", \n    \"walkMgraTapEquivMinutes.csv\", \"walkMgraEquivMinutes.csv\", \"bikeTazLogsum.csv\",\n    \"bikeMgraLogsum.csv\", \"zone.term\", \"zones.park\", \"tap.ptype\", \"accessam.csv\",\n    \"ParkLocationAlts.csv\", \"CrossBorderDestinationChoiceSoaAlternatives.csv\", \n    \"TourDcSoaDistanceAlts.csv\", \"DestinationChoiceAlternatives.csv\", \"SoaTazDistAlts.csv\",\n    \"TripMatrices.csv\", \"transponderModelAccessibilities.csv\", \"crossBorderTours.csv\", \n    \"internalExternalTrips.csv\", \"visitorTours.csv\", \"visitorTrips.csv\", \"householdAVTrips.csv\", \n    \"crossBorderTrips.csv\", \"TNCTrips.csv\", \"airport_out.SAN.csv\", \"airport_out.CBX.csv\", \n    \"TNCtrips.csv\"]\n    ):\n\n\"\"\"\n        Parameters\n        ----------\n        model_dir : model_dir (path_like)\n            path to full model run, default \".\"\n        rsm_dir : rsm_dir (path_like)\n            path to RSM, default \".\"\n        taz_cwk_file : taz_cwk_file (csv file)\n            default taz_crosswalk.csv\n            taz to aggregated zones file. Should be located in RSM input folder\n        mgra_cwk_file : mgra_cwk_file (csv file)\n            default mgra_crosswalk.csv\n            mgra to aggregated zones file. Should be located in RSM input folder\n        input_files : input_files (csv + other files)\n            list of input files to be aggregated. \n            Should include the following files\n                \"microMgraEquivMinutes.csv\", \"microMgraTapEquivMinutes.csv\", \n                \"walkMgraTapEquivMinutes.csv\", \"walkMgraEquivMinutes.csv\", \"bikeTazLogsum.csv\",\n                \"bikeMgraLogsum.csv\", \"zone.term\", \"zones.park\", \"tap.ptype\", \"accessam.csv\",\n                \"ParkLocationAlts.csv\", \"CrossBorderDestinationChoiceSoaAlternatives.csv\",\n                \"TourDcSoaDistanceAlts.csv\", \"DestinationChoiceAlternatives.csv\", \"SoaTazDistAlts.csv\",\n                \"TripMatrices.csv\", \"transponderModelAccessibilities.csv\", \"crossBorderTours.csv\",\n                \"internalExternalTrips.csv\", \"visitorTours.csv\", \"visitorTrips.csv\", \"householdAVTrips.csv\",\n                \"crossBorderTrips.csv\", \"TNCTrips.csv\", \"airport_out.SAN.csv\", \"airport_out.CBX.csv\",\n                \"TNCtrips.csv\"\n\n        Returns\n        -------\n        Aggregated files in the RSM input/output/uec directory\n    \"\"\"\n\n    df_clusters = pd.read_csv(os.path.join(rsm_dir, \"input\", taz_cwk_file))\n    df_clusters.columns= df_clusters.columns.str.strip().str.lower()\n    dict_clusters = dict(zip(df_clusters['taz'], df_clusters['cluster_id']))\n\n    mgra_cwk = pd.read_csv(os.path.join(rsm_dir, \"input\", mgra_cwk_file))\n    mgra_cwk.columns= mgra_cwk.columns.str.strip().str.lower()\n    mgra_cwk = dict(zip(mgra_cwk['mgra'], mgra_cwk['cluster_id']))\n\n    taz_zones = int(agg_zones) + int(ext_zones)\n    mgra_zones = int(agg_zones)\n\n    # aggregating microMgraEquivMinutes.csv\n    if \"microMgraEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - microMgraEquivMinutes.csv\")\n        df_mm_eqmin = pd.read_csv(os.path.join(model_dir, \"output\", \"microMgraEquivMinutes.csv\"))\n        df_mm_eqmin['i_new'] = df_mm_eqmin['i'].map(mgra_cwk)\n        
df_mm_eqmin['j_new'] = df_mm_eqmin['j'].map(mgra_cwk)\n\n        df_mm_eqmin_agg = df_mm_eqmin.groupby(['i_new', 'j_new'])['walkTime', 'dist', 'mmTime', 'mmCost', 'mtTime', 'mtCost',\n       'mmGenTime', 'mtGenTime', 'minTime'].mean().reset_index()\n\n        df_mm_eqmin_agg = df_mm_eqmin_agg.rename(columns = {'i_new' : 'i', 'j_new' : 'j'})\n        df_mm_eqmin_agg.to_csv(os.path.join(rsm_dir, \"input\", \"microMgraEquivMinutes.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"microMgraEquivMinutes.csv\")\n\n\n    # aggregating microMgraTapEquivMinutes.csv\"   \n    if \"microMgraTapEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - microMgraTapEquivMinutes.csv\")\n        df_mm_tap = pd.read_csv(os.path.join(model_dir, \"output\", \"microMgraTapEquivMinutes.csv\"))\n        df_mm_tap['mgra'] = df_mm_tap['mgra'].map(mgra_cwk)\n\n        df_mm_tap_agg = df_mm_tap.groupby(['mgra', 'tap'])['walkTime', 'dist', 'mmTime', 'mmCost', 'mtTime',\n       'mtCost', 'mmGenTime', 'mtGenTime', 'minTime'].mean().reset_index()\n\n        df_mm_tap_agg.to_csv(os.path.join(rsm_dir, \"input\", \"microMgraTapEquivMinutes.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"microMgraTapEquivMinutes.csv\")\n\n    # aggregating walkMgraTapEquivMinutes.csv\n    if \"walkMgraTapEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - walkMgraTapEquivMinutes.csv\")\n        df_wlk_mgra_tap = pd.read_csv(os.path.join(model_dir, \"output\", \"walkMgraTapEquivMinutes.csv\"))\n        df_wlk_mgra_tap[\"mgra\"] = df_wlk_mgra_tap[\"mgra\"].map(mgra_cwk)\n\n        df_wlk_mgra_agg = df_wlk_mgra_tap.groupby([\"mgra\", \"tap\"])[\"boardingPerceived\", \"boardingActual\",\"alightingPerceived\",\"alightingActual\",\"boardingGain\",\"alightingGain\"].mean().reset_index()\n        df_wlk_mgra_agg.to_csv(os.path.join(rsm_dir, \"input\", \"walkMgraTapEquivMinutes.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"walkMgraTapEquivMinutes.csv\")\n\n    # aggregating walkMgraEquivMinutes.csv\n    if \"walkMgraEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - walkMgraEquivMinutes.csv\")\n        df_wlk_min = pd.read_csv(os.path.join(model_dir, \"output\", \"walkMgraEquivMinutes.csv\"))\n        df_wlk_min[\"i\"] = df_wlk_min[\"i\"].map(mgra_cwk)\n        df_wlk_min[\"j\"] = df_wlk_min[\"j\"].map(mgra_cwk)\n\n        df_wlk_min_agg = df_wlk_min.groupby([\"i\", \"j\"])[\"percieved\",\"actual\", \"gain\"].mean().reset_index()\n\n        df_wlk_min_agg.to_csv(os.path.join(rsm_dir, \"input\", \"walkMgraEquivMinutes.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"walkMgraEquivMinutes.csv\")\n\n    # aggregating biketazlogsum\n    if \"bikeTazLogsum.csv\" in input_files:\n        logging.info(\"Aggregating - bikeTazLogsum.csv\")\n        bike_taz = pd.read_csv(os.path.join(model_dir, \"output\", \"bikeTazLogsum.csv\"))\n\n        bike_taz[\"i\"] = bike_taz[\"i\"].map(dict_clusters)\n        bike_taz[\"j\"] = bike_taz[\"j\"].map(dict_clusters)\n\n        bike_taz_agg = bike_taz.groupby([\"i\", \"j\"])[\"logsum\", \"time\"].mean().reset_index()\n        bike_taz_agg.to_csv(os.path.join(rsm_dir, \"input\", \"bikeTazLogsum.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"bikeTazLogsum.csv\")\n\n    # aggregating bikeMgraLogsum.csv\n    if \"bikeMgraLogsum.csv\" in input_files:\n        logging.info(\"Aggregating - bikeMgraLogsum.csv\")\n        bike_mgra = pd.read_csv(os.path.join(model_dir, 
\"output\", \"bikeMgraLogsum.csv\"))\n        bike_mgra[\"i\"] = bike_mgra[\"i\"].map(mgra_cwk)\n        bike_mgra[\"j\"] = bike_mgra[\"j\"].map(mgra_cwk)\n\n        bike_mgra_agg = bike_mgra.groupby([\"i\", \"j\"])[\"logsum\", \"time\"].mean().reset_index()\n        bike_mgra_agg.to_csv(os.path.join(rsm_dir, \"input\", \"bikeMgraLogsum.csv\"), index = False)\n    else:\n        raise FileNotFoundError(\"bikeMgraLogsum.csv\")\n\n    # aggregating zone.term\n    if \"zone.term\" in input_files:\n        logging.info(\"Aggregating - zone.term\")\n        df_zone_term = pd.read_fwf(os.path.join(model_dir, \"input\", \"zone.term\"), header = None)\n        df_zone_term.columns = [\"taz\", \"terminal_time\"]\n\n        df_agg = pd.merge(df_zone_term, df_clusters, on = \"taz\", how = 'left')\n        df_zones_agg = df_agg.groupby([\"cluster_id\"])['terminal_time'].max().reset_index()\n\n        df_zones_agg.columns = [\"taz\", \"terminal_time\"]\n        df_zones_agg.to_fwf(os.path.join(rsm_dir, \"input\", \"zone.term\"))\n\n    else:\n        raise FileNotFoundError(\"zone.term\")\n\n    # aggregating zones.park\n    if \"zones.park\" in input_files:\n        logging.info(\"Aggregating - zone.park\")\n        df_zones_park = pd.read_fwf(os.path.join(model_dir, \"input\", \"zone.park\"), header = None)\n        df_zones_park.columns = [\"taz\", \"park_zones\"]\n\n        df_zones_park_agg = pd.merge(df_zones_park, df_clusters, on = \"taz\", how = 'left')\n        df_zones_park_agg = df_zones_park_agg.groupby([\"cluster_id\"])['park_zones'].max().reset_index()\n        df_zones_park_agg.columns = [\"taz\", \"park_zones\"]\n        df_zones_park_agg.to_fwf(os.path.join(rsm_dir, \"input\", \"zone.park\"))\n\n    else:\n        raise FileNotFoundError(\"zone.park\")\n\n\n    # aggregating tap.ptype \n    if \"tap.ptype\" in input_files:\n        logging.info(\"Aggregating - tap.ptype\")\n        df_tap_ptype = pd.read_fwf(os.path.join(model_dir, \"input\", \"tap.ptype\"), header = None)\n        df_tap_ptype.columns = [\"tap\", \"lot id\", \"parking type\", \"taz\", \"capacity\", \"distance\", \"transit mode\"]\n\n        df_tap_ptype = pd.merge(df_tap_ptype, df_clusters, on = \"taz\", how = 'left')\n\n        df_tap_ptype = df_tap_ptype[[\"tap\", \"lot id\", \"parking type\", \"cluster_id\", \"capacity\", \"distance\", \"transit mode\"]]\n        df_tap_ptype = df_tap_ptype.rename(columns = {\"cluster_id\": \"taz\"})\n        #df_tap_ptype.to_fwf(os.path.join(rsm_dir, \"input\", \"tap.ptype\"))\n\n        widths = [5, 6, 6, 5, 5, 5, 3]\n\n        with open(os.path.join(rsm_dir, \"input\", \"tap.ptype\"), 'w') as f:\n            for index, row in df_tap_ptype.iterrows():\n                field1 = str(row[0]).rjust(widths[0])\n                field2 = str(row[1]).rjust(widths[1])\n                field3 = str(row[2]).rjust(widths[2])\n                field4 = str(row[3]).rjust(widths[3])\n                field5 = str(row[4]).rjust(widths[4])\n                field6 = str(row[5]).rjust(widths[5])\n                field7 = str(row[6]).rjust(widths[6])\n                f.write(f'{field1}{field2}{field3}{field4}{field5}{field6}{field7}\\n')\n\n    else:\n        raise FileNotFoundError(\"tap.ptype\")\n\n    #aggregating accessam.csv\n    if \"accessam.csv\" in input_files:\n        logging.info(\"Aggregating - accessam.csv\")\n        df_acc = pd.read_csv(os.path.join(model_dir, \"input\", \"accessam.csv\"), header = None)\n        df_acc.columns = ['TAZ', 'TAP', 'TIME', 'DISTANCE', 'MODE']\n\n        
df_acc['TAZ'] = df_acc['TAZ'].map(dict_clusters)\n        df_acc_agg = df_acc.groupby(['TAZ', 'TAP', 'MODE'])['TIME', 'DISTANCE'].mean().reset_index()\n        df_acc_agg = df_acc_agg[[\"TAZ\", \"TAP\", \"TIME\", \"DISTANCE\", \"MODE\"]]\n\n        df_acc_agg.to_csv(os.path.join(rsm_dir, \"input\", \"accessam.csv\"), index = False, header =False)\n    else:\n        raise FileNotFoundError(\"accessam.csv\")\n\n    # aggregating ParkLocationAlts.csv\n    if \"ParkLocationAlts.csv\" in input_files:\n        logging.info(\"Aggregating - ParkLocationAlts.csv\")\n        df_park = pd.read_csv(os.path.join(model_dir, \"uec\", \"ParkLocationAlts.csv\"))\n        df_park['mgra_new'] = df_park[\"mgra\"].map(mgra_cwk)\n        df_park_agg = df_park.groupby([\"mgra_new\"])[\"parkarea\"].min().reset_index() # assuming 1 is \"parking\" and 2 is \"no parking\"\n        df_park_agg['a'] = [i+1 for i in range(len(df_park_agg))]\n\n        df_park_agg.columns = [\"a\", \"mgra\", \"parkarea\"]\n        df_park_agg.to_csv(os.path.join(rsm_dir, \"uec\", \"ParkLocationAlts.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"ParkLocationAlts.csv\")\n\n    # aggregating CrossBorderDestinationChoiceSoaAlternatives.csv\n    if \"CrossBorderDestinationChoiceSoaAlternatives.csv\" in input_files:\n        logging.info(\"Aggregating - CrossBorderDestinationChoiceSoaAlternatives.csv\")\n        df_cb = pd.read_csv(os.path.join(model_dir, \"uec\",\"CrossBorderDestinationChoiceSoaAlternatives.csv\"))\n\n        df_cb[\"mgra_entry\"] = df_cb[\"mgra_entry\"].map(mgra_cwk)\n        df_cb[\"mgra_return\"] = df_cb[\"mgra_return\"].map(mgra_cwk)\n        df_cb[\"a\"] = df_cb[\"a\"].map(mgra_cwk)\n\n        df_cb = pd.merge(df_cb, df_clusters, left_on = \"dest\", right_on = \"taz\", how = 'left')\n        df_cb = df_cb.drop(columns = [\"dest\", \"taz\"])\n        df_cb = df_cb.rename(columns = {'cluster_id' : 'dest'})\n\n        df_cb_final  = df_cb.drop_duplicates()\n\n        df_cb_final = df_cb_final[[\"a\", \"dest\", \"poe\", \"mgra_entry\", \"mgra_return\", \"poe_taz\"]]\n        df_cb_final.to_csv(os.path.join(rsm_dir, \"uec\", \"CrossBorderDestinationChoiceSoaAlternatives.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"CrossBorderDestinationChoiceSoaAlternatives.csv\")\n\n    # aggregating households.csv\n    if \"households.csv\" in input_files:\n        logging.info(\"Aggregating - households.csv\")\n        df_hh = pd.read_csv(os.path.join(model_dir, \"input\", \"households.csv\"))\n        df_hh[\"mgra\"] = df_hh[\"mgra\"].map(mgra_cwk)\n        df_hh[\"taz\"] = df_hh[\"taz\"].map(dict_clusters)\n\n        df_hh.to_csv(os.path.join(rsm_dir, \"input\", \"households.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"households.csv\")\n\n    # aggregating ShadowPricingOutput_school_9.csv\n    if \"ShadowPricingOutput_school_9.csv\" in input_files:\n        logging.info(\"Aggregating - ShadowPricingOutput_school_9.csv\")\n        df_sp_sch = pd.read_csv(os.path.join(model_dir, \"input\", \"ShadowPricingOutput_school_9.csv\"))\n\n        agg_instructions = {}\n        for col in df_sp_sch.columns:\n            if \"size\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"shadowPrices\" in col:\n                agg_instructions.update({col: \"max\"})\n\n            if \"_origins\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"_modeledDests\" in col:\n                agg_instructions.update({col: 
\"sum\"})\n\n        df_sp_sch['mgra'] = df_sp_sch['mgra'].map(mgra_cwk)\n        df_sp_sch_agg = df_sp_sch.groupby(['mgra']).agg(agg_instructions).reset_index()\n\n        alt = list(df_sp_sch_agg['mgra'])\n        df_sp_sch_agg.insert(loc=0, column=\"alt\", value=alt)\n        df_sp_sch_agg.loc[len(df_sp_sch_agg.index)] = 0\n\n        df_sp_sch_agg.to_csv(os.path.join(rsm_dir, \"input\", \"ShadowPricingOutput_school_9.csv\"), index=False)\n\n    else:\n        FileNotFoundError(\"ShadowPricingOutput_school_9.csv\")\n\n    # aggregating ShadowPricingOutput_work_9.csv\n    if \"ShadowPricingOutput_work_9.csv\" in input_files:\n        logging.info(\"Aggregating - ShadowPricingOutput_work_9.csv\")\n        df_sp_wrk = pd.read_csv(os.path.join(model_dir, \"input\", \"ShadowPricingOutput_work_9.csv\"))\n\n        agg_instructions = {}\n        for col in df_sp_wrk.columns:\n            if \"size\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"shadowPrices\" in col:\n                agg_instructions.update({col: \"max\"})\n\n            if \"_origins\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"_modeledDests\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n        df_sp_wrk['mgra'] = df_sp_wrk['mgra'].map(mgra_cwk)\n\n        df_sp_wrk_agg = df_sp_wrk.groupby(['mgra']).agg(agg_instructions).reset_index()\n\n        alt = list(df_sp_wrk_agg['mgra'])\n        df_sp_wrk_agg.insert(loc=0, column=\"alt\", value=alt)\n\n        df_sp_wrk_agg.loc[len(df_sp_wrk_agg.index)] = 0\n\n        df_sp_wrk_agg.to_csv(os.path.join(rsm_dir, \"input\", \"ShadowPricingOutput_work_9.csv\"), index=False)\n\n    else:\n        FileNotFoundError(\"ShadowPricingOutput_work_9.csv\")\n\n    if \"TourDcSoaDistanceAlts.csv\" in input_files:\n        logging.info(\"Aggregating - TourDcSoaDistanceAlts.csv\")\n        df_TourDcSoaDistanceAlts = pd.DataFrame({\"a\" : range(1,taz_zones+1), \"dest\" : range(1, taz_zones+1)})\n        df_TourDcSoaDistanceAlts.to_csv(os.path.join(rsm_dir, \"uec\", \"TourDcSoaDistanceAlts.csv\"), index=False)\n\n    if \"DestinationChoiceAlternatives.csv\" in input_files:\n        logging.info(\"Aggregating - DestinationChoiceAlternatives.csv\")\n        df_DestinationChoiceAlternatives = pd.DataFrame({\"a\" : range(1,mgra_zones+1), \"mgra\" : range(1, mgra_zones+1)})\n        df_DestinationChoiceAlternatives.to_csv(os.path.join(rsm_dir, \"uec\", \"DestinationChoiceAlternatives.csv\"), index=False)\n\n    if \"SoaTazDistAlts.csv\" in input_files:\n        logging.info(\"Aggregating - SoaTazDistAlts.csv\")\n        df_SoaTazDistAlts = pd.DataFrame({\"a\" : range(1,taz_zones+1), \"dest\" : range(1, taz_zones+1)})\n        df_SoaTazDistAlts.to_csv(os.path.join(rsm_dir, \"uec\", \"SoaTazDistAlts.csv\"), index=False)\n\n    if \"TripMatrices.csv\" in input_files:\n        logging.info(\"Aggregating - TripMatrices.csv\")\n        trips = pd.read_csv(os.path.join(model_dir,\"output\", \"TripMatrices.csv\"))\n        trips['i'] = trips['i'].map(dict_clusters)\n        trips['j'] = trips['j'].map(dict_clusters)\n\n        cols = list(trips.columns)\n        cols.remove(\"i\")\n        cols.remove(\"j\")\n\n        trips_df = trips.groupby(['i', 'j'])[cols].sum().reset_index()\n        trips_df.to_csv(os.path.join(rsm_dir, \"output\", \"TripMatrices.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"TripMatrices.csv\")\n\n    if \"transponderModelAccessibilities.csv\" in input_files:\n        
logging.info(\"Aggregating - transponderModelAccessibilities.csv\")\n        tran_access = pd.read_csv(os.path.join(model_dir, \"output\", \"transponderModelAccessibilities.csv\"))\n        tran_access['TAZ'] = tran_access['TAZ'].map(dict_clusters)\n\n        tran_access_agg = tran_access.groupby(['TAZ'])['DIST','AVGTTS','PCTDETOUR'].mean().reset_index()\n        tran_access_agg.to_csv(os.path.join(rsm_dir, \"output\",\"transponderModelAccessibilities.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"transponderModelAccessibilities.csv\")\n\n    if \"crossBorderTours.csv\" in input_files:\n        logging.info(\"Aggregating - crossBorderTours.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"crossBorderTours.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"crossBorderTours.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"crossBorderTours.csv\")\n\n    if \"crossBorderTrips.csv\" in input_files:\n        logging.info(\"Aggregating - crossBorderTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"crossBorderTrips.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"crossBorderTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"crossBorderTrips.csv\")\n\n    if \"internalExternalTrips.csv\" in input_files:\n        logging.info(\"Aggregating - internalExternalTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"internalExternalTrips.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"internalExternalTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"internalExternalTrips.csv\")\n\n    if \"visitorTours.csv\" in input_files:\n        logging.info(\"Aggregating - visitorTours.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"visitorTours.csv\"))\n\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"visitorTours.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"visitorTours.csv\")\n\n    if \"visitorTrips.csv\" in input_files:\n        logging.info(\"Aggregating - visitorTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"visitorTrips.csv\"))\n\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"visitorTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"visitorTrips.csv\")\n\n    if \"householdAVTrips.csv\" in input_files:\n        logging.info(\"Aggregating - 
householdAVTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"householdAVTrips.csv\"))\n        #print(os.path.join(model_dir, \"output\", \"householdAVTrips.csv\"))\n        df['orig_mgra'] = df['orig_mgra'].map(mgra_cwk)\n        df['dest_gra'] = df['dest_gra'].map(mgra_cwk)\n\n        df['trip_orig_mgra'] = df['trip_orig_mgra'].map(mgra_cwk)\n        df['trip_dest_mgra'] = df['trip_dest_mgra'].map(mgra_cwk)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"householdAVTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"householdAVTrips.csv\")\n\n    if \"airport_out.CBX.csv\" in input_files:\n        logging.info(\"Aggregating - airport_out.CBX.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"airport_out.CBX.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"airport_out.CBX.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"airport_out.CBX.csv\")\n\n    if \"airport_out.SAN.csv\" in input_files:\n        logging.info(\"Aggregating - airport_out.SAN.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"airport_out.SAN.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"airport_out.SAN.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"airport_out.SAN.csv\")\n\n    if \"TNCtrips.csv\" in input_files:\n        logging.info(\"Aggregating - TNCtrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"TNCtrips.csv\"))\n        df['originMgra'] = df['originMgra'].map(mgra_cwk)\n        df['destinationMgra'] = df['destinationMgra'].map(mgra_cwk)\n\n        df['originTaz'] = df['originTaz'].map(dict_clusters)\n        df['destinationTaz'] = df['destinationTaz'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"TNCtrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"TNCtrips.csv\")\n\n    files = [\"Trip\" + \"_\" + i + \"_\" + j + \".csv\" for i, j in\n                itertools.product([\"FA\", \"GO\", \"IN\", \"RE\", \"SV\", \"TH\", \"WH\"],\n                                   [\"OE\", \"AM\", \"MD\", \"PM\", \"OL\"])]\n\n    for file in files:\n        logging.info(f\"Aggregating - {file}\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", file))\n        df['I'] = df['I'].map(dict_clusters)\n        df['J'] = df['J'].map(dict_clusters)\n        df['HomeZone'] = df['HomeZone'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\",file), index = False)\n
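A hypothetical invocation is sketched below; both run directories are placeholders, and the default input_files list is used. The function reads each listed file from the full model's input/output/uec folders, maps zone ids through the crosswalks, and writes the aggregated versions into the corresponding RSM folders.

```python
# Hypothetical invocation of rsm.input_agg.agg_input_files; both run
# directories are placeholders for real full-model and RSM setups.
from rsm.input_agg import agg_input_files

agg_input_files(
    model_dir=r"C:\abm_runs\2016_base",  # completed full-zone model run (assumed path)
    rsm_dir=r"C:\rsm_runs\2016_base",    # RSM setup with input/, output/, uec/ (assumed path)
    taz_cwk_file="taz_crosswalk.csv",    # located in the RSM input folder
    mgra_cwk_file="mgra_crosswalk.csv",  # located in the RSM input folder
    agg_zones=2000,
    ext_zones=12,
)  # the default input_files list covers the standard ABM outputs
```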
"},{"location":"api.html#rsm.translate.copy_transit_demand","title":"copy_transit_demand(matrix_names, input_dir='.', output_dir='.')","text":"

copies the omx transit demand matrices to the rsm directory

"},{"location":"api.html#rsm.translate.copy_transit_demand--parameters","title":"Parameters","text":"

matrix_names : matrix_names (list) omx matrix filenames to copy input_dir : input_dir (Path-like) default \u201c.\u201d output_dir : output_dir (Path-like) default \u201c.\u201d

"},{"location":"api.html#rsm.translate.copy_transit_demand--returns","title":"Returns","text":"Source code in rsm/translate.py
def copy_transit_demand(\n    matrix_names,\n    input_dir=\".\",\n    output_dir=\".\"\n):\n\"\"\"\n    copies the omx transit demand matrix to rsm directory\n\n    Parameters\n    ----------\n    matrix_names : matrix_names (list)\n        omx matrix filenames to aggregate\n    input_dir : input_dir (Path-like) \n        default \".\"\n    output_dir : output_dir (Path-like)\n        default \".\"\n\n    Returns\n    -------\n\n    \"\"\"\n\n\n    for mat_name in matrix_names:\n        if '.omx' not in mat_name:\n            mat_name = mat_name + \".omx\"\n\n        input_file_dir = os.path.join(input_dir, mat_name)\n        output_file_dir = os.path.join(output_dir, mat_name)\n\n        shutil.copy(input_file_dir, output_file_dir)\n
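A short usage sketch follows; the matrix names and directories are illustrative assumptions.

```python
# Sketch: copy transit demand matrices from a donor run into the RSM run;
# the matrix names and directories below are illustrative assumptions.
from rsm.translate import copy_transit_demand

copy_transit_demand(
    matrix_names=["transit_AM", "transit_PM.omx"],  # ".omx" is appended if missing
    input_dir=r"C:\abm_runs\2016_base\output",
    output_dir=r"C:\rsm_runs\2016_base\output",
)
```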
"},{"location":"api.html#rsm.translate.translate_emmebank_demand","title":"translate_emmebank_demand(input_databank, output_databank, cores_to_aggregate, agg_zone_mapping)","text":"

aggregates the demand matrix cores from one emme databank and loads them into another databank

"},{"location":"api.html#rsm.translate.translate_emmebank_demand--parameters","title":"Parameters","text":"

input_databank : input_databank (Emme databank) Emme databank output_databank : output_databank (Emme databank) Emme databank cores_to_aggregate : cores_to_aggregate (list) matrix corenames to aggregate agg_zone_mapping: agg_zone_mapping (Path-like or pandas.DataFrame) zone number mapping between original and aggregated zones. columns: original zones as \u2018taz\u2019 and aggregated zones as \u2018cluster_id\u2019

"},{"location":"api.html#rsm.translate.translate_emmebank_demand--returns","title":"Returns","text":"

None. Loads the trip matrices into emmebank.

Source code in rsm/translate.py
def translate_emmebank_demand(\n    input_databank,\n    output_databank,\n    cores_to_aggregate,\n    agg_zone_mapping,\n): \n\"\"\"\n    aggregates the demand matrix cores from one emme databank and loads them into another databank\n\n    Parameters\n    ----------\n    input_databank : input_databank (Emme databank)\n        Emme databank\n    output_databank : output_databank (Emme databank)\n        Emme databank\n    cores_to_aggregate : cores_to_aggregate (list)\n        matrix corenames to aggregate\n    agg_zone_mapping: agg_zone_mapping (Path-like or pandas.DataFrame)\n        zone number mapping between original and aggregated zones. \n        columns: original zones as 'taz' and aggregated zones as 'cluster_id'\n\n    Returns\n    -------\n    None. Loads the trip matrices into emmebank.\n\n    \"\"\"\n\n    agg_zone_mapping_df = pd.read_csv(os.path.join(agg_zone_mapping))\n    agg_zone_mapping_df = agg_zone_mapping_df.sort_values('taz')\n\n    agg_zone_mapping_df.columns= agg_zone_mapping_df.columns.str.strip().str.lower()\n    zone_mapping = dict(zip(agg_zone_mapping_df['taz'], agg_zone_mapping_df['cluster_id']))\n\n    for core in cores_to_aggregate: \n        matrix = input_databank.matrix(core).get_data()\n        matrix_array = matrix.to_numpy()\n\n        matrix_agg = _aggregate_matrix(matrix_array, zone_mapping)\n\n        output_matrix = output_databank.matrix(core)\n        output_matrix.set_numpy_data(matrix_agg)\n
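A sketch of driving this function from an Emme Python environment is below. The databank paths and matrix core names are assumptions, the inro.emme API is only available inside an Emme installation, and the target cores must already exist in the RSM databank.

```python
# Sketch (assumes an Emme Python environment): aggregate demand cores from a
# full-zone databank into the RSM databank. Paths and core names are assumed.
import inro.emme.database.emmebank as _eb
from rsm.translate import translate_emmebank_demand

full_bank = _eb.Emmebank(r"C:\abm_runs\2016_base\emme_project\Database\emmebank")
rsm_bank = _eb.Emmebank(r"C:\rsm_runs\2016_base\emme_project\Database\emmebank")

translate_emmebank_demand(
    input_databank=full_bank,
    output_databank=rsm_bank,
    cores_to_aggregate=["mfSOV_AM", "mfHOV2_AM"],  # assumed core names
    agg_zone_mapping=r"C:\rsm_runs\2016_base\input\taz_crosswalk.csv",
)
```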
"},{"location":"api.html#rsm.translate.translate_omx_demand","title":"translate_omx_demand(matrix_names, agg_zone_mapping, input_dir='.', output_dir='.')","text":"

aggregates the omx demand matrices to the aggregated zone system

"},{"location":"api.html#rsm.translate.translate_omx_demand--parameters","title":"Parameters","text":"

matrix_names : matrix_names (list) omx matrix filenames to aggregate agg_zone_mapping: agg_zone_mapping (path_like or pandas.DataFrame) zone number mapping between original and aggregated zones. columns: original zones as \u2018taz\u2019 and aggregated zones as \u2018cluster_id\u2019 input_dir : input_dir (path_like) default \u201c.\u201d output_dir : output_dir (path_like) default \u201c.\u201d

"},{"location":"api.html#rsm.translate.translate_omx_demand--returns","title":"Returns","text":"Source code in rsm/translate.py
def translate_omx_demand(\n    matrix_names,\n    agg_zone_mapping,\n    input_dir=\".\",\n    output_dir=\".\"\n): \n\"\"\"\n    aggregates the omx demand matrix to aggregated zone system\n\n    Parameters\n    ----------\n    matrix_names : matrix_names (list)\n        omx matrix filenames to aggregate\n    agg_zone_mapping: agg_zone_mapping (path_like or pandas.DataFrame)\n        zone number mapping between original and aggregated zones. \n        columns: original zones as 'taz' and aggregated zones as 'cluster_id'\n    input_dir : input_dir (path_like)\n        default \".\"\n    output_dir : output_dir (path_like) \n        default \".\"\n\n    Returns\n    -------\n\n    \"\"\"\n\n    agg_zone_mapping_df = pd.read_csv(os.path.join(agg_zone_mapping))\n    agg_zone_mapping_df = agg_zone_mapping_df.sort_values('taz')\n\n    agg_zone_mapping_df.columns= agg_zone_mapping_df.columns.str.strip().str.lower()\n    zone_mapping = dict(zip(agg_zone_mapping_df['taz'], agg_zone_mapping_df['cluster_id']))\n    agg_zones = sorted(agg_zone_mapping_df['cluster_id'].unique())\n\n    for mat_name in matrix_names:\n        if '.omx' not in mat_name:\n            mat_name = mat_name + \".omx\"\n\n        #logger.info(\"Aggregating Matrix: \" + mat_name + \" ...\")\n\n        input_skim_file = os.path.join(input_dir, mat_name)\n        print(input_skim_file)\n        output_skim_file = os.path.join(output_dir, mat_name)\n\n        assert os.path.isfile(input_skim_file)\n\n        input_matrix = omx.open_file(input_skim_file, mode=\"r\") \n        input_mapping_name = input_matrix.list_mappings()[0]\n        input_cores = input_matrix.list_matrices()\n\n        output_matrix = omx.open_file(output_skim_file, mode=\"w\")\n\n        for core in input_cores:\n            matrix = input_matrix[core]\n            matrix_array = matrix.read()\n            matrix_agg = _aggregate_matrix(matrix_array, zone_mapping)\n            output_matrix[core] = matrix_agg\n\n        output_matrix.create_mapping(title=input_mapping_name, entries=agg_zones)\n\n        input_matrix.close()\n        output_matrix.close()\n
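Both translate functions delegate the actual zone aggregation to a private helper, _aggregate_matrix, whose source is not shown in these docs. A plausible sketch of what such a helper does, assuming original zones are numbered 1..N in matrix row/column order, is:

```python
# Plausible sketch of the unshown _aggregate_matrix helper: sum an
# origin-destination matrix into the clustered zone system. Assumes the
# original zones are numbered 1..N in matrix row/column order.
import numpy as np

def aggregate_matrix_sketch(matrix: np.ndarray, zone_mapping: dict) -> np.ndarray:
    clusters = sorted(set(zone_mapping.values()))
    pos = {c: i for i, c in enumerate(clusters)}  # cluster id -> aggregated index
    idx = np.array([pos[zone_mapping[z]] for z in range(1, matrix.shape[0] + 1)])
    out = np.zeros((len(clusters), len(clusters)), dtype=float)
    # accumulate each original cell into its aggregated (row, col) cell
    np.add.at(out, (idx[:, None], idx[None, :]), matrix)
    return out
```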
"},{"location":"api.html#rsm.sampler.rsm_household_sampler","title":"rsm_household_sampler(input_dir='.', output_dir='.', prev_iter_access=None, curr_iter_access=None, study_area=None, input_household='households.csv', input_person='persons.csv', taz_crosswalk='taz_crosswalk.csv', mgra_crosswalk='mgra_crosswalk.csv', compare_access_columns=('NONMAN_AUTO', 'NONMAN_TRANSIT', 'NONMAN_NONMOTOR', 'NONMAN_SOV_0'), default_sampling_rate=0.25, lower_bound_sampling_rate=0.15, upper_bound_sampling_rate=1.0, random_seed=42, output_household='sampled_households.csv', output_person='sampled_person.csv')","text":"

Take an intelligent sampling of households.

"},{"location":"api.html#rsm.sampler.rsm_household_sampler--parameters","title":"Parameters","text":"

input_dir : input_dir (path_like) default \u201c.\u201d output_dir : output_dir (path_like) default \u201c.\u201d prev_iter_access : prev_iter_access (Path-like or pandas.DataFrame) Accessibility in an old (default, no treatment, etc.) run is given (preloaded) or read in from here. Give as a relative path (from input_dir) or an absolute path. curr_iter_access : curr_iter_access (Path-like or pandas.DataFrame) Accessibility in the latest run is given (preloaded) or read in from here. Give as a relative path (from input_dir) or an absolute path. study_area : study_area (array-like) Array of RSM zones (numbered 1 to N in the RSM) in the study area. These zones are sampled at 100% if differential sampling is also turned on. input_household : input_household (Path-like or pandas.DataFrame) Complete synthetic household file. This data will be filtered to match the sampling of households and written out to a new CSV file. input_person : input_person (Path-like or pandas.DataFrame) Complete synthetic persons file. This data will be filtered to match the sampling of households and written out to a new CSV file. compare_access_columns : compare_access_columns (Collection[str]) Column names in the accessibility file to use for comparing accessibility. Only changes in the values in these columns will be evaluated. default_sampling_rate : default_sampling_rate (float) The default sampling rate, in the range (0,1]. lower_bound_sampling_rate : lower_bound_sampling_rate (float) Sampling rates by zone will be truncated so they are never lower than this. upper_bound_sampling_rate : upper_bound_sampling_rate (float) Sampling rates by zone will be truncated so they are never higher than this.

"},{"location":"api.html#rsm.sampler.rsm_household_sampler--returns","title":"Returns","text":"

sample_households_df, sample_persons_df : sample_households_df, sample_persons_df (pandas.DataFrame) These are the sampled population to resimulate. They are also written to the output_dir

Source code in rsm/sampler.py
def rsm_household_sampler(\n    input_dir=\".\",\n    output_dir=\".\",\n    prev_iter_access=None,\n    curr_iter_access=None,\n    study_area=None,\n    input_household=\"households.csv\",\n    input_person=\"persons.csv\",\n    taz_crosswalk=\"taz_crosswalk.csv\",\n    mgra_crosswalk=\"mgra_crosswalk.csv\",\n    compare_access_columns=(\n        \"NONMAN_AUTO\",\n        \"NONMAN_TRANSIT\",\n        \"NONMAN_NONMOTOR\",\n        \"NONMAN_SOV_0\",\n    ),\n    default_sampling_rate=0.25,  # fix the values of this after some testing\n    lower_bound_sampling_rate=0.15,  # fix the values of this after some testing\n    upper_bound_sampling_rate=1.0,  # fix the values of this after some testing\n    random_seed=42,\n    output_household=\"sampled_households.csv\",\n    output_person=\"sampled_person.csv\",\n):\n\"\"\"\n    Take an intelligent sampling of households.\n\n    Parameters\n    ----------\n    input_dir : input_dir (path_like)\n        default \".\"\n    output_dir : output_dir (path_like)\n        default \".\"\n    prev_iter_access : prev_iter_access (Path-like or pandas.DataFrame)\n        Accessibility in an old (default, no treatment, etc) run is given (preloaded)\n        or read in from here. Give as a relative path (from `input_dir`) or an\n        absolute path.\n    curr_iter_access : curr_iter_access (Path-like or pandas.DataFrame)\n        Accessibility in the latest run is given (preloaded) or read in from here.\n        Give as a relative path (from `input_dir`) or an absolute path.\n    study_area : study_area (array-like)\n        Array of RSM zone (these are numbered 1 to N in the RSM) in the study area.\n        These zones are sampled at 100% if differential sampling is also turned on.\n    input_household : input_household (Path-like or pandas.DataFrame)\n        Complete synthetic household file.  This data will be filtered to match the\n        sampling of households and written out to a new CSV file.\n    input_person : input_person (Path-like or pandas.DataFrame)\n        Complete synthetic persons file.  This data will be filtered to match the\n        sampling of households and written out to a new CSV file.\n    compare_access_columns : compare_access_columns (Collection[str])\n        Column names in the accessibility file to use for comparing accessibility.\n        Only changes in the values in these columns will be evaluated.\n    default_sampling_rate : default_sampling_rate (float)\n        The default sampling rate, in the range (0,1]\n    lower_bound_sampling_rate : lower_bound_sampling_rate (float)\n        Sampling rates by zone will be truncated so they are never lower than this.\n    upper_bound_sampling_rate : upper_bound_sampling_rate (float)\n        Sampling rates by zone will be truncated so they are never higher than this.\n\n    Returns\n    -------\n    sample_households_df, sample_persons_df : sample_households_df, sample_persons_df (pandas.DataFrame)\n        These are the sampled population to resimulate.  
They are also written to\n        the output_dir\n    \"\"\"\n\n    input_dir = Path(input_dir or \".\")\n    output_dir = Path(output_dir or \".\")\n\n    logger.debug(\"CALL rsm_household_sampler\")\n    logger.debug(f\"  {input_dir=}\")\n    logger.debug(f\"  {output_dir=}\")\n\n    def _resolve_df(x, directory, make_index=None):\n        if isinstance(x, (str, Path)):\n            # read in the file to a pandas DataFrame\n            x = Path(x).expanduser()\n            if not x.is_absolute():\n                x = Path(directory or \".\").expanduser().joinpath(x)\n            try:\n                result = pd.read_csv(x)\n            except FileNotFoundError:\n                raise\n        elif isinstance(x, pd.DataFrame):\n            result = x\n        elif x is None:\n            result = None\n        else:\n            raise TypeError(\"must be path-like or DataFrame\")\n        if (\n            result is not None\n            and make_index is not None\n            and make_index in result.columns\n        ):\n            result = result.set_index(make_index)\n        return result\n\n    def _resolve_out_filename(x):\n        x = Path(x).expanduser()\n        if not x.is_absolute():\n            x = Path(output_dir).expanduser().joinpath(x)\n        x.parent.mkdir(parents=True, exist_ok=True)\n        return x\n\n    prev_iter_access_df = _resolve_df(\n        prev_iter_access, input_dir, make_index=\"MGRA\"\n    )\n    curr_iter_access_df = _resolve_df(\n        curr_iter_access, input_dir, make_index=\"MGRA\"\n    )\n    rsm_zones = _resolve_df(taz_crosswalk, input_dir)\n    dict_clusters = dict(zip(rsm_zones[\"taz\"], rsm_zones[\"cluster_id\"]))\n\n    rsm_mgra_zones = _resolve_df(mgra_crosswalk, input_dir)\n    rsm_mgra_zones.columns = rsm_mgra_zones.columns.str.strip().str.lower()\n    dict_clusters_mgra = dict(zip(rsm_mgra_zones[\"mgra\"], rsm_mgra_zones[\"cluster_id\"]))\n\n    # changing the taz and mgra to new cluster ids\n    input_household_df = _resolve_df(input_household, input_dir)\n    input_household_df[\"taz\"] = input_household_df[\"taz\"].map(dict_clusters)\n    input_household_df[\"mgra\"] = input_household_df[\"mgra\"].map(dict_clusters_mgra)\n    input_household_df[\"count\"] = 1\n\n    mgra_hh = input_household_df.groupby([\"mgra\"]).size().rename(\"n_hh\").to_frame()\n\n    if curr_iter_access_df is None or prev_iter_access_df is None:\n\n        if curr_iter_access_df is None:\n            logger.warning(f\"missing curr_iter_access_df from {curr_iter_access}\")\n        if prev_iter_access_df is None:\n            logger.warning(f\"missing prev_iter_access_df from {prev_iter_access}\")\n        # true when sampler is turned off. 
default_sampling_rate should be set to 1\n\n        mgra_hh[\"sampling_rate\"] = default_sampling_rate\n        if study_area is not None:\n            mgra_hh.loc[mgra_hh.index.isin(study_area), \"sampling_rate\"] = 1\n\n        sample_households = []\n\n        for mgra_id, row in mgra_hh.iterrows():\n            df = input_household_df.loc[input_household_df[\"mgra\"] == mgra_id]\n            sampling_rate = row[\"sampling_rate\"]\n            logger.info(f\"Sampling rate of RSM zone {mgra_id}: {sampling_rate}\")\n            df = df.sample(frac=sampling_rate, random_state=mgra_id + random_seed)\n            sample_households.append(df)\n\n        # combine study are and non-study area households into single dataframe\n        sample_households_df = pd.concat(sample_households)\n\n    else:\n        # restrict to rows only where TAZs have households\n        prev_iter_access_df = prev_iter_access_df[\n            prev_iter_access_df.index.isin(mgra_hh.index)\n        ].copy()\n        curr_iter_access_df = curr_iter_access_df[\n            curr_iter_access_df.index.isin(mgra_hh.index)\n        ].copy()\n\n        # compare accessibility columns\n        compare_results = pd.DataFrame()\n\n        for column in compare_access_columns:\n            compare_results[column] = (\n                curr_iter_access_df[column] - prev_iter_access_df[column]\n            ).abs()  # take absolute difference\n        compare_results[\"MGRA\"] = prev_iter_access_df.index\n\n        compare_results = compare_results.set_index(\"MGRA\")\n\n        # Take row sums of all difference\n        compare_results[\"Total\"] = compare_results[list(compare_access_columns)].sum(\n            axis=1\n        )\n\n        # TODO: potentially adjust this later after we figure out a better approach\n        wgts = compare_results[\"Total\"] + 0.01\n        wgts /= wgts.mean() / default_sampling_rate\n        compare_results[\"sampling_rate\"] = np.clip(\n            wgts, lower_bound_sampling_rate, upper_bound_sampling_rate\n        )\n\n        sample_households = []\n        sample_rate_df = compare_results[[\"sampling_rate\"]].copy()\n        if study_area is not None:\n            sample_rate_df.loc[\n                sample_rate_df.index.isin(study_area), \"sampling_rate\"\n            ] = 1\n\n        for mgra_id, row in sample_rate_df.iterrows():\n            df = input_household_df.loc[input_household_df[\"mgra\"] == mgra_id]\n            sampling_rate = row[\"sampling_rate\"]\n            logger.info(f\"Sampling rate of RSM zone {mgra_id}: {sampling_rate}\")\n            df = df.sample(frac=sampling_rate, random_state=mgra_id + random_seed)\n            sample_households.append(df)\n\n        # combine study are and non-study area households into single dataframe\n        sample_households_df = pd.concat(sample_households)\n\n    sample_households_df = sample_households_df.sort_values(by=[\"hhid\"])\n    sample_households_df.to_csv(_resolve_out_filename(output_household), index=False)\n\n    # select persons belonging to sampled households\n    sample_hhids = sample_households_df[\"hhid\"].to_numpy()\n\n    persons_df = _resolve_df(input_person, input_dir)\n    sample_persons_df = persons_df.loc[persons_df[\"hhid\"].isin(sample_hhids)]\n    sample_persons_df.to_csv(_resolve_out_filename(output_person), index=False)\n\n    global_sample_rate = round(len(sample_households_df) / len(input_household_df),2)\n    logger.info(f\"Total Sampling Rate : {global_sample_rate}\")\n\n    return sample_households_df, 
sample_persons_df\n
"},{"location":"api.html#rsm.assembler.rsm_assemble","title":"rsm_assemble(orig_indiv, orig_joint, rsm_indiv, rsm_joint, households, mgra_crosswalk=None, taz_crosswalk=None, sample_rate=0.25, study_area_taz=None, run_assembler=1)","text":"

Assemble and evaluate RSM trip making.

"},{"location":"api.html#rsm.assembler.rsm_assemble--parameters","title":"Parameters","text":"

orig_indiv : orig_indiv (path_like) Trips table from "original" model run; should be a comprehensive simulation of all individual trips for all synthetic households. orig_joint : orig_joint (path_like) Joint trips table from "original" model run; should be a comprehensive simulation of all joint trips for all synthetic households. rsm_indiv : rsm_indiv (path_like) Trips table from RSM model run; should be a simulation of all individual trips for potentially only a subset of all synthetic households. rsm_joint : rsm_joint (path_like) Trips table from RSM model run; should be a simulation of all joint trips for potentially only a subset of all synthetic households (the same sampled households as in rsm_indiv). households : households (path_like) Synthetic household file, used to get home zones for households. mgra_crosswalk : mgra_crosswalk (path_like, optional) Crosswalk from original MGRA to clustered zone ids. Provide this crosswalk if the orig_indiv and orig_joint files reference the original MGRA system and those id's need to be converted to aggregated values before merging. taz_crosswalk : taz_crosswalk (path_like, optional) Crosswalk from original TAZ to clustered zone ids. sample_rate : sample_rate (float) Default/fixed sample rate if the sampler was turned off; this is used to scale the trips if run_assembler is 0. run_assembler : run_assembler (boolean) Flag to indicate whether to run the RSM assembler. 1 runs the assembler; 0 turns it off. Setting this to 0 is only an option if the sampler is turned off. study_area_taz : study_area_taz (list, optional) List of study area RSM zones.

"},{"location":"api.html#rsm.assembler.rsm_assemble--returns","title":"Returns","text":"

final_ind_trips : pd.DataFrame Assembled individual trip table for the RSM run, filling in archived trip values for non-resimulated households. final_jnt_trips : pd.DataFrame Assembled joint trip table for the RSM run, assembled the same way. A summary table of changes in trips by mode, by household home zone, is also computed internally to check whether undersampled zones have stable travel behavior.

Separate tables are returned for individual and joint trips, as required by the Java components.
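A hypothetical invocation is sketched below; all paths, and the joint-trip and household file names, are illustrative assumptions rather than the packaged defaults:

```python
# Hypothetical call to rsm_assemble; paths and file names below are illustrative.
from rsm.assembler import rsm_assemble

final_ind_trips, final_jnt_trips = rsm_assemble(
    orig_indiv="donor_model/output/indivTripData_3.csv",
    orig_joint="donor_model/output/jointTripData_3.csv",  # assumed file name
    rsm_indiv="rsm_run/output/indivTripData_3.csv",
    rsm_joint="rsm_run/output/jointTripData_3.csv",
    households="rsm_run/input/households.csv",            # assumed file name
    mgra_crosswalk="rsm_run/input/mgra_crosswalk.csv",
    taz_crosswalk="rsm_run/input/taz_crosswalk.csv",
    sample_rate=0.25,
    run_assembler=1,
)
```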

Source code in rsm/assembler.py
def rsm_assemble(\n    orig_indiv,\n    orig_joint,\n    rsm_indiv,\n    rsm_joint,\n    households,\n    mgra_crosswalk=None,\n    taz_crosswalk=None,\n    sample_rate=0.25,\n    study_area_taz=None,\n    run_assembler=1,\n):\n\"\"\"\n    Assemble and evaluate RSM trip making.\n\n    Parameters\n    ----------\n    orig_indiv : orig_indiv (path_like)\n        Trips table from \"original\" model run, should be comprehensive simulation\n        of all individual trips for all synthetic households.\n    orig_joint : orig_joint (path_like)\n        Joint trips table from \"original\" model run, should be comprehensive simulation\n        of all joint trips for all synthetic households.\n    rsm_indiv : rsm_indiv (path_like)\n        Trips table from RSM model run, should be a simulation of all individual\n        trips for potentially only a subset of all synthetic households.\n    rsm_joint : rsm_joint (path_like)\n        Trips table from RSM model run, should be a simulation of all joint\n        trips for potentially only a subset of all synthetic households (the\n        same sampled households as in `rsm_indiv`).\n    households : households (path_like)\n        Synthetic household file, used to get home zones for households.\n    mgra_crosswalk : mgra_crosswalk (path_like, optional)\n        Crosswalk from original MGRA to clustered zone ids.  Provide this crosswalk\n        if the `orig_indiv` and `orig_joint` files reference the original MGRA system\n        and those id's need to be converted to aggregated values before merging.\n    sample_rate : sample_rate (float)\n        Default/fixed sample rate if sampler was turned off\n        this is used to scale the trips if run_assembler is 0\n    run_assembler : run_assembler (boolean)\n        Flag to indicate whether to run RSM assembler or not. 
\n        1 is to run assembler, 0 is to turn if off\n        setting this to 0 is only an option if sampler is turned off       \n    sample_rate : float\n        default/fixed sample rate if sampler was turned off\n        this is used to scale the trips if run_assembler is 0\n    study_area_rsm_zones :  list\n        it is list of study area RSM zones\n\n    Returns\n    -------\n    final_trips_rsm : final_ind_trips (pd.DataFrame)\n        Assembled trip table for RSM run, filling in archived trip values for\n        non-resimulated households.\n    combined_trips_by_zone : final_jnt_trips (pd.DataFrame)\n        Summary table of changes in trips by mode, by household home zone.\n        Used to check whether undersampled zones have stable travel behavior.\n\n    Separate tables for individual and joint trips, as required by java.\n\n\n    \"\"\"\n    orig_indiv = Path(orig_indiv).expanduser()\n    orig_joint = Path(orig_joint).expanduser()\n    rsm_indiv = Path(rsm_indiv).expanduser()\n    rsm_joint = Path(rsm_joint).expanduser()\n    households = Path(households).expanduser()\n\n    assert os.path.isfile(orig_indiv)\n    assert os.path.isfile(orig_joint)\n    assert os.path.isfile(rsm_indiv)\n    assert os.path.isfile(rsm_joint)\n    assert os.path.isfile(households)\n\n    if mgra_crosswalk is not None:\n        mgra_crosswalk = Path(mgra_crosswalk).expanduser()\n        assert os.path.isfile(mgra_crosswalk)\n\n    if taz_crosswalk is not None:\n        taz_crosswalk = Path(taz_crosswalk).expanduser()\n        assert os.path.isfile(taz_crosswalk)\n\n    # load trip data - partial simulation of RSM model\n    logger.info(\"reading ind_trips_rsm\")\n    ind_trips_rsm = pd.read_csv(rsm_indiv)\n    logger.info(\"reading jnt_trips_rsm\")\n    jnt_trips_rsm = pd.read_csv(rsm_joint)\n\n    scale_factor = int(1.0/sample_rate)\n\n    if run_assembler == 1:\n        # load trip data - full simulation of residual/source model\n        logger.info(\"reading ind_trips_full\")\n        ind_trips_full = pd.read_csv(orig_indiv)\n        logger.info(\"reading jnt_trips_full\")\n        jnt_trips_full = pd.read_csv(orig_joint)\n\n        if mgra_crosswalk is not None:\n            logger.info(\"applying mgra_crosswalk to original data\")\n            mgra_crosswalk = pd.read_csv(mgra_crosswalk).set_index(\"MGRA\")[\"cluster_id\"]\n            mgra_crosswalk[-1] = -1\n            mgra_crosswalk[0] = 0\n            for col in [c for c in ind_trips_full.columns if c.lower().endswith(\"_mgra\")]:\n                ind_trips_full[col] = ind_trips_full[col].map(mgra_crosswalk)\n            for col in [c for c in jnt_trips_full.columns if c.lower().endswith(\"_mgra\")]:\n                jnt_trips_full[col] = jnt_trips_full[col].map(mgra_crosswalk)\n\n        # convert to rsm trips\n        logger.info(\"convert to common table platform\")\n        rsm_trips = _merge_joint_and_indiv_trips(ind_trips_rsm, jnt_trips_rsm)\n        original_trips = _merge_joint_and_indiv_trips(ind_trips_full, jnt_trips_full)\n\n        logger.info(\"get all hhids in trips produced by RSM\")\n        hh_ids_rsm = rsm_trips[\"hh_id\"].unique()\n\n        logger.info(\"remove orginal model trips made by households chosen in RSM trips\")\n        original_trips_not_resimulated = original_trips.loc[\n            ~original_trips[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n        original_ind_trips_not_resimulated = ind_trips_full[\n            ~ind_trips_full[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n        
original_jnt_trips_not_resimulated = jnt_trips_full[\n            ~jnt_trips_full[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n\n        logger.info(\"concatenate trips from rsm and original model\")\n        final_trips_rsm = pd.concat(\n            [rsm_trips, original_trips_not_resimulated], ignore_index=True\n        ).reset_index(drop=True)\n        final_ind_trips = pd.concat(\n            [ind_trips_rsm, original_ind_trips_not_resimulated], ignore_index=True\n        ).reset_index(drop=True)\n        final_jnt_trips = pd.concat(\n            [jnt_trips_rsm, original_jnt_trips_not_resimulated], ignore_index=True\n        ).reset_index(drop=True)\n\n        # Get percentage change in total trips by mode for each home zone\n\n        # extract trips made by households in RSM and Original model\n        original_trips_that_were_resimulated = original_trips.loc[\n            original_trips[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n\n        def _agg_by_hhid_and_tripmode(df, name):\n            return df.groupby([\"hh_id\", \"trip_mode\"]).size().rename(name).reset_index()\n\n        # combining trips by hhid and trip mode\n        combined_trips = pd.merge(\n            _agg_by_hhid_and_tripmode(original_trips_that_were_resimulated, \"n_trips_orig\"),\n            _agg_by_hhid_and_tripmode(rsm_trips, \"n_trips_rsm\"),\n            on=[\"hh_id\", \"trip_mode\"],\n            how=\"outer\",\n            sort=True,\n        ).fillna(0)\n\n        # aggregating by Home zone\n        hh_rsm = pd.read_csv(households)\n        hh_id_col_names = [\"hhid\", \"hh_id\", \"household_id\"]\n        for hhid in hh_id_col_names:\n            if hhid in hh_rsm.columns:\n                break\n        else:\n            raise KeyError(f\"none of {hh_id_col_names!r} in household file\")\n        homezone_col_names = [\"mgra\", \"home_mgra\"]\n        for zoneid in homezone_col_names:\n            if zoneid in hh_rsm.columns:\n                break\n        else:\n            raise KeyError(f\"none of {homezone_col_names!r} in household file\")\n        hh_rsm = hh_rsm[[hhid, zoneid]]\n\n        # attach home zone id\n        combined_trips = pd.merge(\n            combined_trips, hh_rsm, left_on=\"hh_id\", right_on=hhid, how=\"left\"\n        )\n\n        combined_trips_by_zone = (\n            combined_trips.groupby([zoneid, \"trip_mode\"])[[\"n_trips_orig\", \"n_trips_rsm\"]]\n            .sum()\n            .reset_index()\n        )\n\n        combined_trips_by_zone = combined_trips_by_zone.eval(\n            \"net_change = (n_trips_rsm - n_trips_orig)\"\n        )\n\n        combined_trips_by_zone[\"max_trips\"] = np.fmax(\n            combined_trips_by_zone.n_trips_rsm, combined_trips_by_zone.n_trips_orig\n        )\n        combined_trips_by_zone = combined_trips_by_zone.eval(\n            \"pct_change = net_change / max_trips * 100\"\n        )\n        combined_trips_by_zone = combined_trips_by_zone.drop(columns=\"max_trips\")\n    else:\n        # if assembler is set to be turned off\n        # then scale the trips in the trip list using the fixed sample rate \n        # trips in the final trip lists will be 100%\n        scale_factor = int(1.0/sample_rate)\n\n        if study_area_taz:\n            sa_rsm = study_area_taz\n        else:\n            sa_rsm = None\n\n        # concat is slow\n        # https://stackoverflow.com/questions/50788508/how-can-i-replicate-rows-of-a-pandas-dataframe\n        #final_ind_trips = pd.concat([ind_trips_rsm]*scale_factor, ignore_index=True)\n        #final_jnt_trips 
= pd.concat([jnt_trips_rsm]*scale_factor, ignore_index=True)\n\n\n        final_ind_trips = scaleup_to_rsm_samplingrate(ind_trips_rsm, \n                                                      households, \n                                                      taz_crosswalk, \n                                                      scale_factor, \n                                                      study_area_tazs=sa_rsm)\n\n        final_jnt_trips = scaleup_to_rsm_samplingrate(jnt_trips_rsm, \n                                                      households, \n                                                      taz_crosswalk, \n                                                      scale_factor,\n                                                      study_area_tazs=sa_rsm) \n\n    return final_ind_trips, final_jnt_trips\n
"},{"location":"assessment.html","title":"Assessment","text":""},{"location":"assessment.html#rsm-configuration","title":"RSM Configuration","text":"

The team conducted tests using different combinations of the RSM parameters, including the number of RSM zones (1000, 2000), default sampling rates (15%, 25%, 100%), enabling or disabling the intelligent sampler, and choosing the number of global iterations (2 or 3), among other factors. The number of RSM zones had its most significant influence on the runtime of the highway assignment process. Since the highway assignment runtime was already low with 1000 RSM zones, there was no motivation to explore lower RSM zone counts. Altering the sampling rate had a greater impact on the runtime of the demand model (CT-RAMP) than changing the number of RSM zones. These test runs exhibited varying runtimes depending on the specific configuration. Key metrics at the regional level were analyzed across these different test runs to understand the trade-off between improved RSM runtime and RSM results that closely match ABM. Based on this, the team collectively determined that for the MVP (Minimum Viable Product) version of the RSM, the "optimal" configuration would be 2000 RSM zones, a 25% default sampling rate, the intelligent sampler turned off, and 2 global iterations. This RSM configuration was used to move forward with the overall assessment of the RSM.

"},{"location":"assessment.html#calibration","title":"Calibration","text":"

Aggregating the ABM zones to RSM zones distorts the walk trip share coming out of the model. With the model configuration (Rapid Zones, Global Iterations, Sample Rate, etc.) for RSM as identified above, tour mode choice calibration was performed to match the RSM mode share to the ABM2+ mode share, primarily to match the walk trips. A calibration constant was applied to the tour mode choice UEC for the School, Maintenance, and Discretionary tour purposes. The mode shares for the Work and University purposes were reasonable, so no calibration was applied to those purposes.

RSM-specific constants were added to the Tour Mode Choice UEC (TourModeChoice.xls) for some of the tour purposes. The walk mode share for the Maintenance and Discretionary purposes was first adjusted by calibrating and applying an RSM-specific constant row to the UEC. Furthermore, in cases where the tour involved escorting for Maintenance or Discretionary purposes, an additional calibration constant was introduced to further adjust the walk mode share for such escort tours. Similarly, a different set of constants was added to calibrate the School tour purpose. There was no need to calibrate mode choice for any other tour purpose, as those were reasonable from RSM.

Note that a minor recalibration will be required for RSM when the number of rapid zones is changed.

Here is how the mode share and VMT compare before and after the calibration for RSM. Donor model in the charts below refers to the ABM2+ run.

"},{"location":"assessment.html#base-year-validation","title":"Base Year Validation","text":"

Here is a table comparing ABM2+ and RSM outcomes after the RSM calibration. The metrics used are some of the key regional-level metrics. Volume comparisons for roadway segments on I-5 and I-8 were chosen at random.

"},{"location":"assessment.html#runtime-comparison","title":"Runtime Comparison","text":"

For the base year 2016 simulation, below is the runtime comparison of ABM2+ vs. RSM.

"},{"location":"assessment.html#sensitivity-testing","title":"Sensitivity Testing","text":"

After validating the RSM for the base year with the chosen design configuration, RSM was used to carry out hypothetical planning studies related to some broader use cases. Model results from both RSM and ABM2+ were compared for each of the sensitivity tests to assess the performance of RSM and evaluate whether RSM could be a viable tool for such policy planning.

For each test, a few key metrics from ABM2+ No Action, ABM2+ Action, RSM No Action and RSM Action scenario runs were compared. The goal was to have RSM and ABM2+ show similar sensitivities for action vs no-action.

"},{"location":"assessment.html#regional-highway-changes","title":"Regional Highway Changes","text":""},{"location":"assessment.html#auto-operating-cost-50-increase","title":"Auto Operating Cost - 50% Increase","text":""},{"location":"assessment.html#auto-operating-cost-50-decrease","title":"Auto Operating Cost - 50% Decrease","text":""},{"location":"assessment.html#ride-hailing-cost-50-decrease","title":"Ride Hailing Cost - 50% decrease","text":""},{"location":"assessment.html#automated-vehicles-100-adoption","title":"Automated Vehicles - 100% Adoption","text":"

In the SANDAG model, AV adoption is analyzed by capturing the zero-occupancy vehicle movement as simulated in the Household AV Allocation module. For RSM, this AV allocation module is skipped, which is why RSM is not a viable tool for evaluating policies related to automated vehicles.

"},{"location":"assessment.html#land-use-changes","title":"Land Use Changes","text":"

RSM and ABM2+ show similar sensitivities for the two tested scenarios with land use changes.

"},{"location":"assessment.html#change-in-land-use-job-housing-balance","title":"Change in land use - Job Housing Balance","text":""},{"location":"assessment.html#change-in-land-use-mixed-land-use","title":"Change in land use - Mixed Land Use","text":""},{"location":"assessment.html#regional-transit-changes","title":"Regional Transit Changes","text":""},{"location":"assessment.html#transit-frequency","title":"Transit Frequency","text":"

The RSM and ABM generally match on changes in regional metrics when the transit frequency is globally doubled.

"},{"location":"assessment.html#local-highway-changes","title":"Local Highway Changes","text":""},{"location":"assessment.html#toll-removal","title":"Toll Removal","text":"

The removal of the toll on SR-125 (The South Bay Expressway) was tested in both ABM and RSM. In both models, volumes on SR-125 increased and volumes on I-805 at the same point decreased.

"},{"location":"assessment.html#local-transit-changes","title":"Local Transit Changes","text":""},{"location":"assessment.html#rapid-637-brt","title":"Rapid 637 BRT","text":"

Tests were conducted that added the planned Rapid 637 line from North Park to the Naval Facilities to the base year network. Without the study area definition there were around 3,000 boardings from the RSM, but the addition of a study area resulted in a value much closer to the one produced by ABM2+.

"},{"location":"development.html","title":"Development","text":""},{"location":"development.html#needs","title":"Needs","text":"

The time needed to configure, run, and summarize results from ABM2+ is too slow to support a nimble, challenging, and engagement-oriented planning process. SANDAG needed a tool that quickly approximates the outcomes of ABM2+. The rapid strategic model, or RSM, was built for this purpose.

The ABM2+ schematic is shown below.

"},{"location":"development.html#design-considerations","title":"Design Considerations","text":"

Reducing the number of zones reduces model runtime.

Reducing the number of model components reduces runtime.

Reducing the number of global iterations reduces runtime.

Reducing sample rate reduces runtime.

"},{"location":"development.html#architecture","title":"Architecture","text":"

The RSM is developed as a Python package, and the required modules are launched when running the existing SANDAG travel model as a Rapid Model. It takes as input a complete ABM2+ model run and has the following modules:

"},{"location":"development.html#zone-aggregator","title":"Zone Aggregator","text":"

The RSM zone creator/aggregator creates a set of RSM analysis zones (Rapid Zones) and a set of RSM input files compatible with the zone system, using a donor model run (ABM2+/ABM3) as input. The inputs include the MGRA shapefile (MGRASHAPE.zip), MGRA socioeconomic file (example: mgra13_based_input2016.csv), and individual trips (indivTripData_3.csv) from the donor model. It produces a new MGRA socioeconomic file with new RSM zones and crosswalk files between the original TAZ/MGRA zones and the rapid zones. Along with the inputs, the user can specify other parameters in the model properties file (sandag_abm.properties), such as the number of RSM zones, the donor model run directory, the number of external zones, the MGRA socioeconomic file, the names of the crosswalk files generated by the zone aggregator module, an optional study area file (to study localized changes in the region), and the RSM zone centroid CSV files.

At the core of the RSM zone aggregator, the module performs several steps. The MGRA geographies are loaded from shapefiles, MGRA data is loaded from the MGRA socioeconomic file, and trip data is extracted from the individual trip file. Additional computations, like intersection counts and density variables, are performed on the MGRA data. The script aggregates the MGRA attributes to create new zone data based on TAZ (Traffic Analysis Zone). The individual trips file is used to calculate the mode shares for each TAZ. Additional travel times from TAZs to points of interest (defaults include San Diego city hall, outside Pendleton gate, Escondido city hall, Viejas casino, and San Ysidro trolley) are also added to the aggregated data by TAZ. The TAZs are further clustered into a user-defined number of RSM zones using several cluster factors (default factors and their weights are as follows: "popden": 1, "empden": 1, "modeshare_NM": 100, "modeshare_WT": 100) and a clustering algorithm; the current scripts support the KMeans and agglomerative clustering algorithms, as sketched below. If the user has specified a study area, the function handles those zones separately and aggregates them into their clusters based on the specification provided in the study area file. The remaining TAZs are aggregated based on the aggregation algorithm.
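A minimal sketch of the clustering step using the default factors and KMeans, assuming `taz_gdf` is a GeoDataFrame of TAZ geometries that already carries the density and mode-share columns; the variable names and the 2000-zone target are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# default cluster factors and weights from the zone aggregator
cluster_factors = {"popden": 1, "empden": 1, "modeshare_NM": 100, "modeshare_WT": 100}

# x-y centroid coordinates keep the resulting clusters spatially compact
centroids = taz_gdf.geometry.centroid
factors = [centroids.x.to_numpy(), centroids.y.to_numpy()]
for col, weight in cluster_factors.items():
    factors.append(weight * taz_gdf[col].to_numpy(dtype="float32"))

# one row of weighted features per TAZ, then cluster into RSM zones
data = np.array(factors).T
kmeans = KMeans(n_clusters=2000, random_state=0).fit(data)
taz_gdf["cluster_id"] = kmeans.labels_  # RSM zone assignment for each TAZ
```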

After the clustering, the aggregator produces the TAZ/MGRA crosswalks from the old TAZs/MGRAs to the new RSM zones. The elementary and high school enrollments are further checked and adjusted in the new RSM zone socioeconomic file to prevent zero values.

The user can also control the execution of the zone aggregator from the properties file. Once a baseline RSM run is established, other project-related RSM runs can be set up to skip the zone aggregator and reuse the zone system from the RSM baseline. Please note that MGRAs and TAZs cover essentially the same geography in the RSM model run; only their numbering differs.

"},{"location":"development.html#input-aggregator","title":"Input Aggregator","text":"

The input aggregator module of RSM aggregates several input files, UEC (SOA) files, and non-ABM model outputs of the donor model based on the new RSM zones. The main inputs to this module include the location of the donor model, the RSM socioeconomic file, and the TAZ and MGRA crosswalks. The module reads the original socioeconomic file and adds intersection counts and several density variables that were originally generated by the 4D module of the current ABM2+ model. This is done here in RSM because the 4D module is skipped when running RSM. The module then uses the crosswalks between MGRAs and RSM zones to aggregate the original socioeconomic data to the new RSM zones, creating a new RSM-specific socioeconomic file. Next, the module aggregates the following input files:

| File Name | Aggregation Columns | Aggregation Methodology |
|---|---|---|
| microMgraEquivMinutes.csv | walkTime, dist, mmTime, mmCost, mtTime, mtCost, mmGenTime, mtGenTime, minTime | Mapped MGRA to RSM zones and aggregated the columns by taking the mean. |
| microMgraTapEquivMinutes.csv | walkTime, dist, mmTime, mmCost, mtTime, mtCost, mmGenTime, mtGenTime, minTime | Mapped MGRA to RSM zones and aggregated the columns by taking the mean. |
| walkMgraTapEquivMinutes.csv | boardingPerceived, boardingActual, alightingPerceived, alightingActual, boardingGain, alightingGain | Mapped MGRA to RSM zones and aggregated the columns by taking the mean. |
| walkMgraEquivMinutes.csv | percieved, actual, gain | Mapped MGRA to RSM zones and aggregated the columns by taking the mean. |
| bikeTazLogsum.csv | logsum, time | Mapped TAZ to RSM zones and aggregated the columns by taking the mean. |
| bikeMgraLogsum.csv | logsum, time | Mapped MGRA to RSM zones and aggregated the columns by taking the mean. |
| zone.term | terminal_time | Mapped TAZ to RSM zones and took the maximum. |
| zones.park | park_zones | Mapped TAZ to RSM zones and took the maximum. |
| tap.ptype | | Mapping RSM zones to TAZs |
| accessam.csv | TIME, DISTANCE | |
| ParkLocationAlts.csv | parkarea | Mapped MGRA to RSM zones and took the minimum. |
| CrossBorderDestinationChoiceSoaAlternatives.csv | | Mapping MGRA to RSM zones |
| TourDcSoaDistanceAlts.csv | a, mgra | Recreated with RSM zones |
| DestinationChoiceAlternatives.csv | a, mgra | Recreated with RSM zones |
| SoaTazDistAlts.csv | a, dest | Recreated with RSM zones |
| TripMatrices.csv | CVM_XX:LT, CVM_XX:IT, CVM_XX:MT, CVM_XX:HT, CVM_XX:LNT, CVM_XX:INT, CVM_XX:MNT, CVM_XX:HNT, where XX = EA, AM, MD, PM, EV | Mapped TAZ to RSM zones and aggregated the columns by taking the sum. |
| transponderModelAccessibilities.csv | DIST, AVGTTS, PCTDETOUR | Mapped TAZ to RSM zones and aggregated the columns by taking the mean. |
| crossBorderTours.csv | | Mapped MGRA/TAZs to RSM zones |
| internalExternalTrips.csv | | Mapped MGRA/TAZs to RSM zones |
| visitorTours.csv | | Mapped MGRA to RSM zones |
| visitorTrips.csv | | Mapped MGRA to RSM zones |
| householdAVTrips.csv | | Mapped MGRA to RSM zones |
| airport_out.SAN.csv | | Mapped MGRA/TAZ to RSM zones |
| airport_out.CBX.csv | | Mapped MGRA/TAZ to RSM zones |
| TNCtrips.csv | | Mapped MGRA/TAZ to RSM zones |
| TRIP_ST_XX.CSV, where ST (Sector Type) = FA, GO, IN, RE, SV, TH, WH; XX (Time Period) = OE, AM, MD, PM, OL | | Mapped TAZ to RSM zones |

More details on the above files can be found here.
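As a sketch of the map-then-groupby pattern used for most of the files above, taking walkMgraEquivMinutes.csv as the example; the directory paths are illustrative:

```python
import pandas as pd

# crosswalk from original MGRA ids to RSM zone ids
mgra_cwk = pd.read_csv("rsm_run/input/mgra_crosswalk.csv")
mgra_map = dict(zip(mgra_cwk["mgra"], mgra_cwk["cluster_id"]))

df = pd.read_csv("donor_model/output/walkMgraEquivMinutes.csv")
df["i"] = df["i"].map(mgra_map)  # relabel both ends of each zone pair
df["j"] = df["j"].map(mgra_map)

# average the walk-time columns within each aggregated zone pair
# ("percieved" is the column's spelling in the source file)
agg = df.groupby(["i", "j"])[["percieved", "actual", "gain"]].mean().reset_index()
agg.to_csv("rsm_run/input/walkMgraEquivMinutes.csv", index=False)
```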

"},{"location":"development.html#translate-demand","title":"Translate Demand","text":"

The translate demand module of the RSM aggregates the non-resident demand matrices and trip tables based on the new RSM zone structure. The inputs of this module include the path to the RSM model directory, the donor model directory, and the crosswalks. In particular, the module aggregates the auto, transit, non-motorized, and other demand from the airport, cross-border, internal-external, and visitor models. It also aggregates TNC vehicle trips and empty AV trips.
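The matrix side of this translation amounts to summing each original OD cell into its aggregated RSM OD cell. A toy sketch of that accumulation (the arrays are illustrative stand-ins; the real module reads the donor model's demand matrices):

```python
import numpy as np

demand = np.arange(16, dtype=float).reshape(4, 4)  # TAZ-level OD demand (toy data)
cluster_of = np.array([0, 0, 1, 1])                # original TAZ index -> RSM zone index

n_rsm = cluster_of.max() + 1
rsm_demand = np.zeros((n_rsm, n_rsm))

# accumulate every original OD pair into its aggregated RSM OD pair
np.add.at(rsm_demand, (cluster_of[:, None], cluster_of[None, :]), demand)
```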

"},{"location":"development.html#intelligent-sampler","title":"Intelligent Sampler","text":"

The intelligent sampler module is designed to intelligently sample households and persons from the synthetic household and person data, considering accessibility metrics and other parameters. The main inputs to this module are the households file, persons file, and TAZ/MGRA crosswalks, and the outputs are the sampled household and person files. In the model properties file (sandag_abm.properties), the user can choose to run the RSM sampler and specify the default sampling rate and minimum sampling rate for the RSM model run. The user also has the ability to sample specific zones at 100% by specifying them in the study area file and turning on the differential sampling indicator (use.differential.sampling set to 1).

The sampler function follows these primary steps:

  1. Zone Mapping: The function maps zones from the synthetic household/person data to their corresponding RSM zones using crosswalk data.

  2. Household Sampling:

    - If accessibility data is missing (first iteration) or if the RSM sampler is turned off, a default sampling rate is applied to all RSM zones, with optional 100% sampling in the study area.
    - If accessibility data is available and the RSM sampler is turned on, the function calculates differences in accessibility metrics between the current and previous iterations. The sampling rates are determined from these differences (see the sketch below) and are adjusted to stay within specified bounds. The RSM zones of the study area are sampled at a 100% sampling rate if the differential sampling indicator is turned on.

  3. Households and Persons Selection: The function selects households based on the calculated sampling rates. It also selects persons associated with the sampled households.

  4. Output: The selected households and persons are written to output CSV files in the specified output directory. The function also computes and logs the total sampling rate, representing the proportion of selected households relative to the total number of households.

Note that in the current RSM deployment, the sampler is set to use a 25% default sampling rate. The intelligent sampler needs further testing before it is used to sample households based on the accessibility change.
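The rate calculation referenced in step 2 can be sketched as follows, mirroring the weighting logic in rsm_household_sampler; the per-zone accessibility differences and the bounds are illustrative values:

```python
import numpy as np
import pandas as pd

# summed absolute accessibility change per RSM zone between iterations (toy data)
access_diff = pd.Series([0.0, 0.5, 2.0], index=[13, 14, 15], name="Total")

default_rate, lower_bound, upper_bound = 0.25, 0.15, 1.0  # illustrative bounds
wgts = access_diff + 0.01            # small offset so unchanged zones still get sampled
wgts /= wgts.mean() / default_rate   # rescale so the mean rate equals the default
sampling_rate = np.clip(wgts, lower_bound, upper_bound)  # keep each zone within bounds
```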

"},{"location":"development.html#intelligent-assembler","title":"Intelligent Assembler","text":"

The intelligent assembler module assembles the trips of an RSM model run and scales them appropriately based on the sampling rates of the RSM zones. The main inputs to this module are the joint and individual trips from the donor and RSM models, the households file, crosswalks for mapping zones, an optional study area file, and a flag for running the assembler.

The assembler function follows these primary steps:

  1. Load Trip Files: The function reads the individual and joint trip data for the RSM run. If the assembler is set to run (flag run_assembler equals 1), the function also loads the corresponding trip data from the donor model run.

  2. Assemble Trips: It converts individual and joint trip data from both the RSM run and the original model run into a common table format using a merging process. It separates the original-model trips made by households that were resimulated in the RSM from those that were not. Then, it combines the non-resimulated original trips with the RSM trips to create the final assembled trip data, including individual and joint trips.

  3. Evaluation of Trip Changes: The function calculates and evaluates the percentage change in total trips by mode for each home zone. It aggregates trips made by households in the RSM and original model runs and compares their trip counts by mode. This information is used to assess the stability of travel behavior in different zones.

  4. Alternative Behavior (If Assembler is Off): If the assembler is turned off (flag run_assembler equals 0), the function scales the RSM individual and joint trips based on the specified default sampling rate, as sketched after this list. This alternative behavior is intended to represent all trips as if they were selected, eliminating the need for the assembler. If the study area file is present and differential sampling is turned on (use.differential.sampling equals 1), then the trips made by residents of the study area are not scaled by the RSM default sampling rate.

  5. Outputs: The function returns two outputs: individual trips containing the assembled individual trip data, and joint trips containing the assembled joint trip data. These data files are structured to align with the format required for further analysis or use by Java components.

In summary, the RSM assembler module takes multiple trip datasets and assembles them into a unified dataset for further analysis, accommodating cases where only a subset of households was resimulated. The function also evaluates changes in trip behavior across different zones.
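In the simplest case (fixed sample rate, no study area), the scaling fallback in step 4 reduces to replicating each sampled trip. A minimal sketch with an illustrative path; the actual module uses a scale-up helper that also handles study-area trips separately:

```python
import pandas as pd

sample_rate = 0.25
scale_factor = int(1.0 / sample_rate)  # each sampled trip stands in for 4 trips

ind_trips_rsm = pd.read_csv("rsm_run/output/indivTripData_3.csv")
final_ind_trips = pd.concat([ind_trips_rsm] * scale_factor, ignore_index=True)
```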

"},{"location":"development.html#user-experience","title":"User Experience","text":"

The RSM repurposes the ABM2+ Emme-based GUI. The options will be updated to reflect the RSM options, as will the input file locations and other parameters. The RSM user experience will, therefore, be nearly the same as the ABM2+ user experience.

"},{"location":"userguide.html","title":"User Guide","text":""},{"location":"userguide.html#rsm-setup","title":"RSM Setup","text":"

Below are the steps to setup an RSM scenario run:

  1. Set up an ABM run on the server\u2019s C drive* by using the ABM2+ release 14.2.2 scenario creation GUI located at T:\\ABM\\release\\ABM\\version_14_2_2\\dist\\createStudyAndScenario.exe.

    *Running the model on the T drive while setting it to run on the local drive causes an error. An issue has been created on GitHub.

  2. Open Anaconda Prompt and type the following command:

    python T:\\projects\\RSM\\setup\\setup_rsm.py [MODEL_RUN_DIRECTORY]

    Specifying the model run directory in the command line is optional. If it is not specified, a dialog box will open asking the user to specify the model run.

  3. Change the inputs and properties as needed. Be sure to check the following:

    1. If running a new network, make sure the network files are correct
    2. Check that the RSM properties were appended to the property file and make sure the RSM properties are correct
    3. Check that the updated Tour Mode Choice UEC was copied over
  4. After opening Emme using start_emme_with_virtual_env.bat and opening the SANDAG toolbox in Modeller as usual, set the steps to skip all of the special market models and to run only 2 iterations. Most of these should be set automatically, though you may need to set it to skip the EE model manually.

    Figure 1: Steps to run in SANDAG model GUI for RSM run

"},{"location":"userguide.html#debugging","title":"Debugging","text":"

For crashes encountered in CT-RAMP, review the event log as usual. However, if it occurs during an RSM step, a new logfile called rsm-logging.log is created in the LogFiles folder.

"},{"location":"userguide.html#rsm-specific-changes","title":"RSM Specific Changes","text":""},{"location":"userguide.html#application","title":"Application","text":""},{"location":"userguide.html#bin","title":"Bin","text":""},{"location":"userguide.html#emme_project","title":"Emme_project","text":""},{"location":"userguide.html#input","title":"Input","text":""},{"location":"userguide.html#pythonemmetoolbox","title":"Python\\emme\\toolbox","text":""},{"location":"userguide.html#new-properties","title":"New Properties","text":""},{"location":"userguide.html#new-files","title":"New Files","text":"
  1. study_area.csv:

    This optional file specifies an explicit definition of how to aggregate certain zones, and consequently, which zones not to aggregate. This is useful for project-level analysis, as a modeler may want higher resolution close to a project but not need that resolution farther away. The file has two columns, taz and group. The taz column is the zone ID in the ABM zone system, and the group column indicates which RSM zone the ABM zone will be a part of. The group ID becomes the RSM MGRA ID, and the RSM TAZ ID is the group ID plus the number of external zones. If a user doesn't want to aggregate any zones within the study area, the group ID should be distinct for all of them. Presently, all RSM zones defined in the study area are sampled at 100%, and the remaining zones are sampled at the sampling rate set in the property file.

    Any zones not within the study area will be aggregated using the standard RSM zone aggregating algorithm.

    An example of how the study area file works is shown below (assuming 12 external zones):

    Figure 2: ABM Zones

    Table 1: study_area.csv

    | taz | group |
    |-----|-------|
    | 1 | 1 |
    | 2 | 2 |
    | 3 | 3 |
    | 4 | 4 |
    | 5 | 5 |
    | 6 | 6 |

    Figure 3: Resulting RSM Zones

    For a practical example, see Figure 4, where a study area was defined as every zone within a half mile of a project. Note that within the study area, no zones were aggregated (as it was defined), but outside of the study area, aggregation occurred.

    Figure 4: Example Study Area

"},{"location":"visualizer.html","title":"Visualizer","text":""},{"location":"visualizer.html#introduction","title":"Introduction","text":"

The team developed an RSM visualizer tool to allow users to summarize and compare metrics from multiple RSM model runs. It is a dashboard-style tool built using SimWrapper (an open-source web-based data visualization tool for disaggregate transportation simulations) and also leverages SANDAG's Data Pipeline Tool. SimWrapper works by creating a mini file server to host reduced data summaries of the travel model. The dashboard is created via YAML files, which can be customized to automate interactive report summaries, such as charts, summary tables, and spatial maps.

"},{"location":"visualizer.html#design","title":"Design","text":"

Visualizer has three main components:

"},{"location":"visualizer.html#data-pipeline","title":"Data Pipeline","text":"

The SANDAG Data Pipeline Tool aims to aid in building data pipelines that ingest, transform, and summarize data by taking advantage of the parameterization of data pipelines. Rather than coding from scratch, the user configures a few files and the tool figures out the rest. Using the pipeline helps produce the desired model summaries in CSV format. See here to learn how the tool works. Note that the RSM visualizer currently supports a fixed set of summaries from the model; additional summaries can be easily incorporated into the pipeline by modifying the settings, processor, and expression files.

"},{"location":"visualizer.html#post-processing","title":"Post Processing","text":"

Next, there is a post-processing script that performs all the data manipulations done outside of the data pipeline tool to prepare the data in the format required by SimWrapper. As with the data pipeline, the user can modify this post-processing script to add any new summaries and bring them into the SimWrapper dashboard.

"},{"location":"visualizer.html#simwrapper","title":"SimWrapper","text":"

Lastly, the created summary files are consumed by SimWrapper to generate the dashboard. SimWrapper is a web platform that can display either individual full-page data visualizations or collections of visualizations in "dashboard" format. It expects your simulation outputs to be local files on your filesystem; there is no need to upload the summary files to a centralized database or cloud server to create the dashboard.

For setting up the visualization in SimWrapper, configuration files (in YAML format) are created that provide all the configuration details needed to get it up and running, such as which data to load, how to lay out the dashboard, what type of chart to create, etc. Refer to the SimWrapper documentation here to get more familiar with it.

"},{"location":"visualizer.html#setup","title":"Setup","text":"

The visualizer is currently deployed to compare 3 scenario runs at once. Running the data pipeline and post-processing for each of those scenarios is controlled through the process_scenarios Python script, and the configuration for scenarios is specified using the scenarios.yaml file. The user will need to modify this YAML file to specify the scenarios they would like to compare using the visualizer. There are two categories of scenarios to be specified - RSM and ABM (Donor Model) runs. For each scenario run, specify the directories of the input and report folders in this configuration file. Files from the input and report folders for the scenarios are then used in the data pipeline tool and the post-processing step to create summaries in the processed folder of the SimWrapper directory. Note that additional scenarios can be compared by extending the configuration in this YAML file.

"},{"location":"visualizer.html#visualization","title":"Visualization","text":"

Currently there are five default visualization summaries in the visualizer:

"},{"location":"visualizer.html#bar-charts","title":"Bar Charts","text":"

These charts are for comparing VMT, mode shares, transit boardings, and trip purpose by time-of-day distribution. Here is a snapshot of a sample YAML configuration file for a bar chart:

Users can add as many charts as they want to the layout. For each chart, specify a CSV file for the summaries; the columns should match the CSV file's column names. There are also other specifications for the bar charts, which you can learn more about here.

Here is how the visual looks in the dashboard:

"},{"location":"visualizer.html#network-flows","title":"Network Flows","text":"

These charts are for comparing flows and VMT on the network. You can compare any two scenarios on one network. Here is a snapshot of the configuration file:

For each network, you need the csv files for two scenario summaries and an underlying network file, which should be in geojson format. The supporting script creates the geojson files from the model outputs for SimWrapper. For more info on the network visualization specification, see here.

Here is how the visual looks in the dashboard:

"},{"location":"visualizer.html#sample-rate-map","title":"Sample Rate Map","text":"

This visual is a map showing the RSM sample rates for each zone. Here is a snapshot of the configuration file:

For each map, you need a csv file of sample rates and the map of zones in .shp format. For more info on the visualization specification, see here.

Here is how the visual looks in the dashboard:

"},{"location":"visualizer.html#zero-car-map","title":"Zero Car Map","text":"

This visual is a map showing the zero-car household distribution. Here is a snapshot of the configuration file:

For each map, you need a csv file of household rates and the map of zones in .shp format. For more info on the visualization specification, see here.

Here is how the visual looks in the dashboard:

"},{"location":"visualizer.html#od-flows","title":"OD Flows","text":"

This chart shows OD trip flows. Here is a snapshot of the configuration file:

For each map, you need a csv file of OD trip flows and the map of zones in .shp format. For more info on the visualization specification, see here.

Here is how the visual looks in the dashboard:

You can also modify the data and configuration of each visual on the SimWrapper server. For each visual, there is a configuration button (see below) where you can add data and modify all the map configurations. You can also export these configurations into a YAML file for future use.

"},{"location":"visualizer.html#how-to-run","title":"How to Run","text":"

The first step to run the visualizer is to bring in the scenario files. Currently the visualizer is set up to compare three scenarios: donor_model, rsm_base, and rsm_scen. donor_model is the ABM run, rsm_base is the baseline (no-action) RSM run, and rsm_scen is the project (action) RSM run.

As mentioned earlier, if you wish to add more RSM scenarios for comparison, you can do so by modifying the scenarios.yaml file. Simply add the scenario configuration by copying the rsm_scen section, pasting it underneath, and changing "rsm_scen" to the new scenario name. Note that you will also need to add that new scenario config to the Data Pipeline and Post-Processing steps.

Once you have copied the required scenario files and set up the configuration, you are ready to run the visualizer.

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"index.html","title":"SANDAG Rapid Strategic Model","text":"

Welcome to the SANDAG Rapid Strategic Model documentation site!

"},{"location":"index.html#introduction","title":"Introduction","text":"

The travel demand model SANDAG used for the 2021 regional plan, referred to as ABM2+, is one of the most sophisticated modeling tools used anywhere in the world. Its activity-based approach to representing travel is behaviorally rich; the representations of land development and transportation infrastructure are represented in high fidelity spatial detail. An operational shortcoming of ABM2+ is it requires significant computational resources to carry out a simulation. A typical forecast year simulation of ABM2+ takes over 40 hours to complete on a high end workstation (e.g., 48 physical computing cores and 256 gigabytes of RAM). The components of this runtime include:

The computational time of ABM2+, and the likely computational time of the successor to ABM2+ (ABM3), hinders SANDAG\u2019s ability to carry out certain analyses in a timely manner. For example, if an analyst wants to explore 10 different roadway pricing schemes for a select corridor, a month of computation time would be required.

SANDAG requires a tool capable of quickly approximating the outcomes of ABM2+. Therefore, a tool was built for this purpose, referred to henceforth as the Rapid Strategic Model (RSM). The primary objective of the RSM was to enhance the speed of the resident passenger component within the broader modeling system and produce results that closely aligned with ABM2+ for policy planning requirements.

"},{"location":"index.html#use-cases-and-key-limitations","title":"Use Cases and Key Limitations","text":"

Based on the set of tests done as part of this project, RSM performs well for regional-scale roadway projects (e.g., auto operating costs, mileage fees, TNC costs and wait times, etc.) and regional-scale transit projects (transit fares, headway changes, etc.). RSM also performed well for land-use change policies. Lastly, RSM was also tested for local roadway changes (e.g., managed lanes conversion) and local transit changes (e.g., a new BRT line), and the results indicate that those policies are reasonably represented by RSM as well.

Here are some of the current limitations of RSM:

"},{"location":"api.html","title":"Application Programming Interface","text":""},{"location":"api.html#rsm.zone_agg.aggregate_zones","title":"aggregate_zones(mgra_gdf, method='kmeans', n_zones=2000, random_state=0, cluster_factors=None, cluster_factors_onehot=None, use_xy=True, explicit_agg=(), explicit_col='mgra', agg_instruction=None, start_cluster_ids=13)","text":"

Aggregate zones.

"},{"location":"api.html#rsm.zone_agg.aggregate_zones--parameters","title":"Parameters","text":"

mgra_gdf : mgra_gdf (GeoDataFrame) Geometry and attributes of MGRAs method : method (str) One of {'kmeans', 'agglom', 'agglom_adj'}, default 'kmeans' n_zones : n_zones (int) random_state : random_state (RandomState or int) cluster_factors : cluster_factors (dict) cluster_factors_onehot : cluster_factors_onehot (dict) use_xy : use_xy (bool or float) Use X and Y coordinates as a cluster factor, use a float to scale the x-y coordinates from the CRS if needed. explicit_agg : explicit_agg (list[int or list]) A list containing integers (individual MGRAs that should not be aggregated) or lists of integers (groups of MGRAs that should be aggregated exactly as given, with no less and no more) explicit_col : explicit_col (str) The name of the column containing the ID's from explicit_agg, usually 'taz' agg_instruction : agg_instruction (dict) Dictionary passed to pandas agg that says how to aggregate data columns. start_cluster_ids : start_cluster_ids (int, default 13) Cluster id's start at this value. Can be 1, but typically SANDAG has the smallest id's reserved for external zones, so starting at a greater value is typical.

"},{"location":"api.html#rsm.zone_agg.aggregate_zones--returns","title":"Returns","text":"

GeoDataFrame
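A hypothetical call using the default cluster factors; `mgra_gdf` and the use_xy scaling value are illustrative assumptions:

```python
# Sketch of invoking the zone aggregator; mgra_gdf is assumed to be a
# GeoDataFrame of MGRA geometry and attributes loaded elsewhere.
from rsm.zone_agg import aggregate_zones

rsm_zones = aggregate_zones(
    mgra_gdf,
    method="kmeans",
    n_zones=2000,
    cluster_factors={"popden": 1, "empden": 1, "modeshare_NM": 100, "modeshare_WT": 100},
    use_xy=1e-4,            # illustrative scaling of CRS coordinates
    start_cluster_ids=13,   # ids below 13 reserved for external zones
)
```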

Source code in rsm/zone_agg.py
def aggregate_zones(\n    mgra_gdf,\n    method=\"kmeans\",\n    n_zones=2000,\n    random_state=0,\n    cluster_factors=None,\n    cluster_factors_onehot=None,\n    use_xy=True,\n    explicit_agg=(),\n    explicit_col=\"mgra\",\n    agg_instruction=None,\n    start_cluster_ids=13,\n):\n\"\"\"\n    Aggregate zones.\n\n    Parameters\n    ----------\n    mgra_gdf : mgra_gdf (GeoDataFrame)\n        Geometry and attibutes of MGRAs\n    method : method (array)\n        default {'kmeans', 'agglom', 'agglom_adj'}\n    n_zones : n_zones (int)\n    random_state : random_state (RandomState or int)\n    cluster_factors : cluster_factors (dict)\n    cluster_factors_onehot : cluster_factors_onehot (dict)\n    use_xy : use_xy (bool or float)\n        Use X and Y coordinates as a cluster factor, use a float to scale the\n        x-y coordinates from the CRS if needed.\n    explicit_agg : explicit_agg (list[int or list])\n        A list containing integers (individual MGRAs that should not be aggregated)\n        or lists of integers (groups of MGRAs that should be aggregated exactly as\n        given, with no less and no more)\n    explicit_col : explicit_col (str)\n        The name of the column containing the ID's from `explicit_agg`, usually 'taz'\n    agg_instruction : agg_instruction (dict)\n        Dictionary passed to pandas `agg` that says how to aggregate data columns.\n    start_cluster_ids : start_cluster_ids (int, default 13)\n        Cluster id's start at this value.  Can be 1, but typically SANDAG has the\n        smallest id's reserved for external zones, so starting at a greater value\n        is typical.\n\n    Returns\n    -------\n    GeoDataFrame\n    \"\"\"\n\n    if cluster_factors is None:\n        cluster_factors = {}\n\n    n = start_cluster_ids\n    if explicit_agg:\n        explicit_agg_ids = {}\n        for i in explicit_agg:\n            if isinstance(i, Number):\n                explicit_agg_ids[i] = n\n            else:\n                for j in i:\n                    explicit_agg_ids[j] = n\n            n += 1\n        if explicit_col == mgra_gdf.index.name:\n            mgra_gdf = mgra_gdf.reset_index()\n            mgra_gdf.index = mgra_gdf[explicit_col]\n        in_explicit = mgra_gdf[explicit_col].isin(explicit_agg_ids)\n        mgra_gdf_algo = mgra_gdf.loc[~in_explicit].copy()\n        mgra_gdf_explicit = mgra_gdf.loc[in_explicit].copy()\n        mgra_gdf_explicit[\"cluster_id\"] = mgra_gdf_explicit[explicit_col].map(\n            explicit_agg_ids\n        )\n        n_zones_algorithm = n_zones - len(\n            mgra_gdf_explicit[\"cluster_id\"].value_counts()\n        )\n    else:\n        mgra_gdf_algo = mgra_gdf.copy()\n        mgra_gdf_explicit = None\n        n_zones_algorithm = n_zones\n\n    if use_xy:\n        geometry = mgra_gdf_algo.centroid\n        X = list(geometry.apply(lambda p: p.x))\n        Y = list(geometry.apply(lambda p: p.y))\n        factors = [np.asarray(X) * use_xy, np.asarray(Y) * use_xy]\n    else:\n        factors = []\n    for cf, cf_wgt in cluster_factors.items():\n        factors.append(cf_wgt * mgra_gdf_algo[cf].values.astype(np.float32))\n    if cluster_factors_onehot:\n        for cf, cf_wgt in cluster_factors_onehot.items():\n            factors.append(cf_wgt * OneHotEncoder().fit_transform(mgra_gdf_algo[[cf]]))\n        from scipy.sparse import hstack\n\n        factors2d = []\n        for j in factors:\n            if j.ndim < 2:\n                factors2d.append(np.expand_dims(j, -1))\n            else:\n                
factors2d.append(j)\n        data = hstack(factors2d).toarray()\n    else:\n        data = np.array(factors).T\n\n    if method == \"kmeans\":\n        kmeans = KMeans(n_clusters=n_zones_algorithm, random_state=random_state)\n        kmeans.fit(data)\n        cluster_id = kmeans.labels_\n    elif method == \"agglom\":\n        agglom = AgglomerativeClustering(\n            n_clusters=n_zones_algorithm, affinity=\"euclidean\", linkage=\"ward\"\n        )\n        agglom.fit_predict(data)\n        cluster_id = agglom.labels_\n    elif method == \"agglom_adj\":\n        from libpysal.weights import Rook\n\n        w_rook = Rook.from_dataframe(mgra_gdf_algo)\n        adj_mat = nx.adjacency_matrix(w_rook.to_networkx())\n        agglom = AgglomerativeClustering(\n            n_clusters=n_zones_algorithm,\n            affinity=\"euclidean\",\n            linkage=\"ward\",\n            connectivity=adj_mat,\n        )\n        agglom.fit_predict(data)\n        cluster_id = agglom.labels_\n    else:\n        raise NotImplementedError(method)\n    mgra_gdf_algo[\"cluster_id\"] = cluster_id\n\n    if mgra_gdf_explicit is None or len(mgra_gdf_explicit) == 0:\n        combined = merge_zone_data(\n            mgra_gdf_algo,\n            agg_instruction,\n            cluster_id=\"cluster_id\",\n        )\n        combined[\"cluster_id\"] = list(range(n, n + n_zones_algorithm))\n    else:\n        pending = []\n        for df in [mgra_gdf_algo, mgra_gdf_explicit]:\n            logger.info(f\"... merging {len(df)}\")\n            pending.append(\n                merge_zone_data(\n                    df,\n                    agg_instruction,\n                    cluster_id=\"cluster_id\",\n                ).reset_index()\n            )\n\n        pending[0][\"cluster_id\"] = list(range(n, n + n_zones_algorithm))\n\n        pending[0] = pending[0][\n            [c for c in pending[1].columns if c in pending[0].columns]\n        ]\n        pending[1] = pending[1][\n            [c for c in pending[0].columns if c in pending[1].columns]\n        ]\n        combined = pd.concat(pending, ignore_index=False)\n    combined = combined.reset_index(drop=True)\n\n    return combined\n
"},{"location":"api.html#rsm.input_agg.agg_input_files","title":"agg_input_files(model_dir='.', rsm_dir='.', taz_cwk_file='taz_crosswalk.csv', mgra_cwk_file='mgra_crosswalk.csv', agg_zones=2000, ext_zones=12, input_files=['microMgraEquivMinutes.csv', 'microMgraTapEquivMinutes.csv', 'walkMgraTapEquivMinutes.csv', 'walkMgraEquivMinutes.csv', 'bikeTazLogsum.csv', 'bikeMgraLogsum.csv', 'zone.term', 'zones.park', 'tap.ptype', 'accessam.csv', 'ParkLocationAlts.csv', 'CrossBorderDestinationChoiceSoaAlternatives.csv', 'TourDcSoaDistanceAlts.csv', 'DestinationChoiceAlternatives.csv', 'SoaTazDistAlts.csv', 'TripMatrices.csv', 'transponderModelAccessibilities.csv', 'crossBorderTours.csv', 'internalExternalTrips.csv', 'visitorTours.csv', 'visitorTrips.csv', 'householdAVTrips.csv', 'crossBorderTrips.csv', 'TNCTrips.csv', 'airport_out.SAN.csv', 'airport_out.CBX.csv', 'TNCtrips.csv'])","text":""},{"location":"api.html#rsm.input_agg.agg_input_files--parameters","title":"Parameters","text":"

model_dir : model_dir (path_like) Path to the full model run, default "." rsm_dir : rsm_dir (path_like) Path to the RSM, default "." taz_cwk_file : taz_cwk_file (csv file) Default taz_crosswalk.csv; TAZ to aggregated zones file. Should be located in the RSM input folder. mgra_cwk_file : mgra_cwk_file (csv file) Default mgra_crosswalk.csv; MGRA to aggregated zones file. Should be located in the RSM input folder. agg_zones : agg_zones (int) Number of aggregated RSM zones, default 2000. ext_zones : ext_zones (int) Number of external zones, default 12. input_files : input_files (csv + other files) List of input files to be aggregated. Should include the following files: "microMgraEquivMinutes.csv", "microMgraTapEquivMinutes.csv", "walkMgraTapEquivMinutes.csv", "walkMgraEquivMinutes.csv", "bikeTazLogsum.csv", "bikeMgraLogsum.csv", "zone.term", "zones.park", "tap.ptype", "accessam.csv", "ParkLocationAlts.csv", "CrossBorderDestinationChoiceSoaAlternatives.csv", "TourDcSoaDistanceAlts.csv", "DestinationChoiceAlternatives.csv", "SoaTazDistAlts.csv", "TripMatrices.csv", "transponderModelAccessibilities.csv", "crossBorderTours.csv", "internalExternalTrips.csv", "visitorTours.csv", "visitorTrips.csv", "householdAVTrips.csv", "crossBorderTrips.csv", "TNCTrips.csv", "airport_out.SAN.csv", "airport_out.CBX.csv", "TNCtrips.csv"

"},{"location":"api.html#rsm.input_agg.agg_input_files--returns","title":"Returns","text":"

Aggregated files in the RSM input/output/uec directory
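A hypothetical invocation for a 2000-zone RSM with 12 external zones; the directory paths are illustrative:

```python
from rsm.input_agg import agg_input_files

agg_input_files(
    model_dir="C:/abm_runs/donor_model",  # completed donor model run
    rsm_dir="C:/abm_runs/rsm_base",       # RSM run being set up
    taz_cwk_file="taz_crosswalk.csv",
    mgra_cwk_file="mgra_crosswalk.csv",
    agg_zones=2000,
    ext_zones=12,
)
```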

Source code in rsm/input_agg.py
def agg_input_files(\n    model_dir = \".\", \n    rsm_dir = \".\",\n    taz_cwk_file = \"taz_crosswalk.csv\",\n    mgra_cwk_file = \"mgra_crosswalk.csv\",\n    agg_zones=2000,\n    ext_zones=12,\n    input_files = [\"microMgraEquivMinutes.csv\", \"microMgraTapEquivMinutes.csv\", \n    \"walkMgraTapEquivMinutes.csv\", \"walkMgraEquivMinutes.csv\", \"bikeTazLogsum.csv\",\n    \"bikeMgraLogsum.csv\", \"zone.term\", \"zones.park\", \"tap.ptype\", \"accessam.csv\",\n    \"ParkLocationAlts.csv\", \"CrossBorderDestinationChoiceSoaAlternatives.csv\", \n    \"TourDcSoaDistanceAlts.csv\", \"DestinationChoiceAlternatives.csv\", \"SoaTazDistAlts.csv\",\n    \"TripMatrices.csv\", \"transponderModelAccessibilities.csv\", \"crossBorderTours.csv\", \n    \"internalExternalTrips.csv\", \"visitorTours.csv\", \"visitorTrips.csv\", \"householdAVTrips.csv\", \n    \"crossBorderTrips.csv\", \"TNCTrips.csv\", \"airport_out.SAN.csv\", \"airport_out.CBX.csv\", \n    \"TNCtrips.csv\"]\n    ):\n\n\"\"\"\n        Parameters\n        ----------\n        model_dir : model_dir (path_like)\n            path to full model run, default \".\"\n        rsm_dir : rsm_dir (path_like)\n            path to RSM, default \".\"\n        taz_cwk_file : taz_cwk_file (csv file)\n            default taz_crosswalk.csv\n            taz to aggregated zones file. Should be located in RSM input folder\n        mgra_cwk_file : mgra_cwk_file (csv file)\n            default mgra_crosswalk.csv\n            mgra to aggregated zones file. Should be located in RSM input folder\n        input_files : input_files (csv + other files)\n            list of input files to be aggregated. \n            Should include the following files\n                \"microMgraEquivMinutes.csv\", \"microMgraTapEquivMinutes.csv\", \n                \"walkMgraTapEquivMinutes.csv\", \"walkMgraEquivMinutes.csv\", \"bikeTazLogsum.csv\",\n                \"bikeMgraLogsum.csv\", \"zone.term\", \"zones.park\", \"tap.ptype\", \"accessam.csv\",\n                \"ParkLocationAlts.csv\", \"CrossBorderDestinationChoiceSoaAlternatives.csv\",\n                \"TourDcSoaDistanceAlts.csv\", \"DestinationChoiceAlternatives.csv\", \"SoaTazDistAlts.csv\",\n                \"TripMatrices.csv\", \"transponderModelAccessibilities.csv\", \"crossBorderTours.csv\",\n                \"internalExternalTrips.csv\", \"visitorTours.csv\", \"visitorTrips.csv\", \"householdAVTrips.csv\",\n                \"crossBorderTrips.csv\", \"TNCTrips.csv\", \"airport_out.SAN.csv\", \"airport_out.CBX.csv\",\n                \"TNCtrips.csv\"\n\n        Returns\n        -------\n        Aggregated files in the RSM input/output/uec directory\n    \"\"\"\n\n    df_clusters = pd.read_csv(os.path.join(rsm_dir, \"input\", taz_cwk_file))\n    df_clusters.columns= df_clusters.columns.str.strip().str.lower()\n    dict_clusters = dict(zip(df_clusters['taz'], df_clusters['cluster_id']))\n\n    mgra_cwk = pd.read_csv(os.path.join(rsm_dir, \"input\", mgra_cwk_file))\n    mgra_cwk.columns= mgra_cwk.columns.str.strip().str.lower()\n    mgra_cwk = dict(zip(mgra_cwk['mgra'], mgra_cwk['cluster_id']))\n\n    taz_zones = int(agg_zones) + int(ext_zones)\n    mgra_zones = int(agg_zones)\n\n    # aggregating microMgraEquivMinutes.csv\n    if \"microMgraEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - microMgraEquivMinutes.csv\")\n        df_mm_eqmin = pd.read_csv(os.path.join(model_dir, \"output\", \"microMgraEquivMinutes.csv\"))\n        df_mm_eqmin['i_new'] = df_mm_eqmin['i'].map(mgra_cwk)\n        
df_mm_eqmin['j_new'] = df_mm_eqmin['j'].map(mgra_cwk)\n\n        df_mm_eqmin_agg = df_mm_eqmin.groupby(['i_new', 'j_new'])['walkTime', 'dist', 'mmTime', 'mmCost', 'mtTime', 'mtCost',\n       'mmGenTime', 'mtGenTime', 'minTime'].mean().reset_index()\n\n        df_mm_eqmin_agg = df_mm_eqmin_agg.rename(columns = {'i_new' : 'i', 'j_new' : 'j'})\n        df_mm_eqmin_agg.to_csv(os.path.join(rsm_dir, \"input\", \"microMgraEquivMinutes.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"microMgraEquivMinutes.csv\")\n\n\n    # aggregating microMgraTapEquivMinutes.csv\"   \n    if \"microMgraTapEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - microMgraTapEquivMinutes.csv\")\n        df_mm_tap = pd.read_csv(os.path.join(model_dir, \"output\", \"microMgraTapEquivMinutes.csv\"))\n        df_mm_tap['mgra'] = df_mm_tap['mgra'].map(mgra_cwk)\n\n        df_mm_tap_agg = df_mm_tap.groupby(['mgra', 'tap'])['walkTime', 'dist', 'mmTime', 'mmCost', 'mtTime',\n       'mtCost', 'mmGenTime', 'mtGenTime', 'minTime'].mean().reset_index()\n\n        df_mm_tap_agg.to_csv(os.path.join(rsm_dir, \"input\", \"microMgraTapEquivMinutes.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"microMgraTapEquivMinutes.csv\")\n\n    # aggregating walkMgraTapEquivMinutes.csv\n    if \"walkMgraTapEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - walkMgraTapEquivMinutes.csv\")\n        df_wlk_mgra_tap = pd.read_csv(os.path.join(model_dir, \"output\", \"walkMgraTapEquivMinutes.csv\"))\n        df_wlk_mgra_tap[\"mgra\"] = df_wlk_mgra_tap[\"mgra\"].map(mgra_cwk)\n\n        df_wlk_mgra_agg = df_wlk_mgra_tap.groupby([\"mgra\", \"tap\"])[\"boardingPerceived\", \"boardingActual\",\"alightingPerceived\",\"alightingActual\",\"boardingGain\",\"alightingGain\"].mean().reset_index()\n        df_wlk_mgra_agg.to_csv(os.path.join(rsm_dir, \"input\", \"walkMgraTapEquivMinutes.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"walkMgraTapEquivMinutes.csv\")\n\n    # aggregating walkMgraEquivMinutes.csv\n    if \"walkMgraEquivMinutes.csv\" in input_files:\n        logging.info(\"Aggregating - walkMgraEquivMinutes.csv\")\n        df_wlk_min = pd.read_csv(os.path.join(model_dir, \"output\", \"walkMgraEquivMinutes.csv\"))\n        df_wlk_min[\"i\"] = df_wlk_min[\"i\"].map(mgra_cwk)\n        df_wlk_min[\"j\"] = df_wlk_min[\"j\"].map(mgra_cwk)\n\n        df_wlk_min_agg = df_wlk_min.groupby([\"i\", \"j\"])[\"percieved\",\"actual\", \"gain\"].mean().reset_index()\n\n        df_wlk_min_agg.to_csv(os.path.join(rsm_dir, \"input\", \"walkMgraEquivMinutes.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"walkMgraEquivMinutes.csv\")\n\n    # aggregating biketazlogsum\n    if \"bikeTazLogsum.csv\" in input_files:\n        logging.info(\"Aggregating - bikeTazLogsum.csv\")\n        bike_taz = pd.read_csv(os.path.join(model_dir, \"output\", \"bikeTazLogsum.csv\"))\n\n        bike_taz[\"i\"] = bike_taz[\"i\"].map(dict_clusters)\n        bike_taz[\"j\"] = bike_taz[\"j\"].map(dict_clusters)\n\n        bike_taz_agg = bike_taz.groupby([\"i\", \"j\"])[\"logsum\", \"time\"].mean().reset_index()\n        bike_taz_agg.to_csv(os.path.join(rsm_dir, \"input\", \"bikeTazLogsum.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"bikeTazLogsum.csv\")\n\n    # aggregating bikeMgraLogsum.csv\n    if \"bikeMgraLogsum.csv\" in input_files:\n        logging.info(\"Aggregating - bikeMgraLogsum.csv\")\n        bike_mgra = pd.read_csv(os.path.join(model_dir, 
\"output\", \"bikeMgraLogsum.csv\"))\n        bike_mgra[\"i\"] = bike_mgra[\"i\"].map(mgra_cwk)\n        bike_mgra[\"j\"] = bike_mgra[\"j\"].map(mgra_cwk)\n\n        bike_mgra_agg = bike_mgra.groupby([\"i\", \"j\"])[\"logsum\", \"time\"].mean().reset_index()\n        bike_mgra_agg.to_csv(os.path.join(rsm_dir, \"input\", \"bikeMgraLogsum.csv\"), index = False)\n    else:\n        raise FileNotFoundError(\"bikeMgraLogsum.csv\")\n\n    # aggregating zone.term\n    if \"zone.term\" in input_files:\n        logging.info(\"Aggregating - zone.term\")\n        df_zone_term = pd.read_fwf(os.path.join(model_dir, \"input\", \"zone.term\"), header = None)\n        df_zone_term.columns = [\"taz\", \"terminal_time\"]\n\n        df_agg = pd.merge(df_zone_term, df_clusters, on = \"taz\", how = 'left')\n        df_zones_agg = df_agg.groupby([\"cluster_id\"])['terminal_time'].max().reset_index()\n\n        df_zones_agg.columns = [\"taz\", \"terminal_time\"]\n        df_zones_agg.to_fwf(os.path.join(rsm_dir, \"input\", \"zone.term\"))\n\n    else:\n        raise FileNotFoundError(\"zone.term\")\n\n    # aggregating zones.park\n    if \"zones.park\" in input_files:\n        logging.info(\"Aggregating - zone.park\")\n        df_zones_park = pd.read_fwf(os.path.join(model_dir, \"input\", \"zone.park\"), header = None)\n        df_zones_park.columns = [\"taz\", \"park_zones\"]\n\n        df_zones_park_agg = pd.merge(df_zones_park, df_clusters, on = \"taz\", how = 'left')\n        df_zones_park_agg = df_zones_park_agg.groupby([\"cluster_id\"])['park_zones'].max().reset_index()\n        df_zones_park_agg.columns = [\"taz\", \"park_zones\"]\n        df_zones_park_agg.to_fwf(os.path.join(rsm_dir, \"input\", \"zone.park\"))\n\n    else:\n        raise FileNotFoundError(\"zone.park\")\n\n\n    # aggregating tap.ptype \n    if \"tap.ptype\" in input_files:\n        logging.info(\"Aggregating - tap.ptype\")\n        df_tap_ptype = pd.read_fwf(os.path.join(model_dir, \"input\", \"tap.ptype\"), header = None)\n        df_tap_ptype.columns = [\"tap\", \"lot id\", \"parking type\", \"taz\", \"capacity\", \"distance\", \"transit mode\"]\n\n        df_tap_ptype = pd.merge(df_tap_ptype, df_clusters, on = \"taz\", how = 'left')\n\n        df_tap_ptype = df_tap_ptype[[\"tap\", \"lot id\", \"parking type\", \"cluster_id\", \"capacity\", \"distance\", \"transit mode\"]]\n        df_tap_ptype = df_tap_ptype.rename(columns = {\"cluster_id\": \"taz\"})\n        #df_tap_ptype.to_fwf(os.path.join(rsm_dir, \"input\", \"tap.ptype\"))\n\n        widths = [5, 6, 6, 5, 5, 5, 3]\n\n        with open(os.path.join(rsm_dir, \"input\", \"tap.ptype\"), 'w') as f:\n            for index, row in df_tap_ptype.iterrows():\n                field1 = str(row[0]).rjust(widths[0])\n                field2 = str(row[1]).rjust(widths[1])\n                field3 = str(row[2]).rjust(widths[2])\n                field4 = str(row[3]).rjust(widths[3])\n                field5 = str(row[4]).rjust(widths[4])\n                field6 = str(row[5]).rjust(widths[5])\n                field7 = str(row[6]).rjust(widths[6])\n                f.write(f'{field1}{field2}{field3}{field4}{field5}{field6}{field7}\\n')\n\n    else:\n        raise FileNotFoundError(\"tap.ptype\")\n\n    #aggregating accessam.csv\n    if \"accessam.csv\" in input_files:\n        logging.info(\"Aggregating - accessam.csv\")\n        df_acc = pd.read_csv(os.path.join(model_dir, \"input\", \"accessam.csv\"), header = None)\n        df_acc.columns = ['TAZ', 'TAP', 'TIME', 'DISTANCE', 'MODE']\n\n        
df_acc['TAZ'] = df_acc['TAZ'].map(dict_clusters)\n        df_acc_agg = df_acc.groupby(['TAZ', 'TAP', 'MODE'])['TIME', 'DISTANCE'].mean().reset_index()\n        df_acc_agg = df_acc_agg[[\"TAZ\", \"TAP\", \"TIME\", \"DISTANCE\", \"MODE\"]]\n\n        df_acc_agg.to_csv(os.path.join(rsm_dir, \"input\", \"accessam.csv\"), index = False, header =False)\n    else:\n        raise FileNotFoundError(\"accessam.csv\")\n\n    # aggregating ParkLocationAlts.csv\n    if \"ParkLocationAlts.csv\" in input_files:\n        logging.info(\"Aggregating - ParkLocationAlts.csv\")\n        df_park = pd.read_csv(os.path.join(model_dir, \"uec\", \"ParkLocationAlts.csv\"))\n        df_park['mgra_new'] = df_park[\"mgra\"].map(mgra_cwk)\n        df_park_agg = df_park.groupby([\"mgra_new\"])[\"parkarea\"].min().reset_index() # assuming 1 is \"parking\" and 2 is \"no parking\"\n        df_park_agg['a'] = [i+1 for i in range(len(df_park_agg))]\n\n        # label the existing columns (mgra_new, parkarea, a) before reordering\n        df_park_agg.columns = [\"mgra\", \"parkarea\", \"a\"]\n        df_park_agg = df_park_agg[[\"a\", \"mgra\", \"parkarea\"]]\n        df_park_agg.to_csv(os.path.join(rsm_dir, \"uec\", \"ParkLocationAlts.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"ParkLocationAlts.csv\")\n\n    # aggregating CrossBorderDestinationChoiceSoaAlternatives.csv\n    if \"CrossBorderDestinationChoiceSoaAlternatives.csv\" in input_files:\n        logging.info(\"Aggregating - CrossBorderDestinationChoiceSoaAlternatives.csv\")\n        df_cb = pd.read_csv(os.path.join(model_dir, \"uec\",\"CrossBorderDestinationChoiceSoaAlternatives.csv\"))\n\n        df_cb[\"mgra_entry\"] = df_cb[\"mgra_entry\"].map(mgra_cwk)\n        df_cb[\"mgra_return\"] = df_cb[\"mgra_return\"].map(mgra_cwk)\n        df_cb[\"a\"] = df_cb[\"a\"].map(mgra_cwk)\n\n        df_cb = pd.merge(df_cb, df_clusters, left_on = \"dest\", right_on = \"taz\", how = 'left')\n        df_cb = df_cb.drop(columns = [\"dest\", \"taz\"])\n        df_cb = df_cb.rename(columns = {'cluster_id' : 'dest'})\n\n        df_cb_final  = df_cb.drop_duplicates()\n\n        df_cb_final = df_cb_final[[\"a\", \"dest\", \"poe\", \"mgra_entry\", \"mgra_return\", \"poe_taz\"]]\n        df_cb_final.to_csv(os.path.join(rsm_dir, \"uec\", \"CrossBorderDestinationChoiceSoaAlternatives.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"CrossBorderDestinationChoiceSoaAlternatives.csv\")\n\n    # aggregating households.csv\n    if \"households.csv\" in input_files:\n        logging.info(\"Aggregating - households.csv\")\n        df_hh = pd.read_csv(os.path.join(model_dir, \"input\", \"households.csv\"))\n        df_hh[\"mgra\"] = df_hh[\"mgra\"].map(mgra_cwk)\n        df_hh[\"taz\"] = df_hh[\"taz\"].map(dict_clusters)\n\n        df_hh.to_csv(os.path.join(rsm_dir, \"input\", \"households.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"households.csv\")\n\n    # aggregating ShadowPricingOutput_school_9.csv\n    if \"ShadowPricingOutput_school_9.csv\" in input_files:\n        logging.info(\"Aggregating - ShadowPricingOutput_school_9.csv\")\n        df_sp_sch = pd.read_csv(os.path.join(model_dir, \"input\", \"ShadowPricingOutput_school_9.csv\"))\n\n        agg_instructions = {}\n        for col in df_sp_sch.columns:\n            if \"size\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"shadowPrices\" in col:\n                agg_instructions.update({col: \"max\"})\n\n            if \"_origins\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"_modeledDests\" in col:\n                agg_instructions.update({col: 
\"sum\"})\n\n        df_sp_sch['mgra'] = df_sp_sch['mgra'].map(mgra_cwk)\n        df_sp_sch_agg = df_sp_sch.groupby(['mgra']).agg(agg_instructions).reset_index()\n\n        alt = list(df_sp_sch_agg['mgra'])\n        df_sp_sch_agg.insert(loc=0, column=\"alt\", value=alt)\n        df_sp_sch_agg.loc[len(df_sp_agg.index)] = 0\n\n        df_sp_sch_agg.to_csv(os.path.join(rsm_dir, \"input\", \"ShadowPricingOutput_school_9.csv\"), index=False)\n\n    else:\n        FileNotFoundError(\"ShadowPricingOutput_school_9.csv\")\n\n    # aggregating ShadowPricingOutput_work_9.csv\n    if \"ShadowPricingOutput_work_9.csv\" in input_files:\n        logging.info(\"Aggregating - ShadowPricingOutput_work_9.csv\")\n        df_sp_wrk = pd.read_csv(os.path.join(model_dir, \"input\", \"ShadowPricingOutput_work_9.csv\"))\n\n        agg_instructions = {}\n        for col in df_sp_wrk.columns:\n            if \"size\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"shadowPrices\" in col:\n                agg_instructions.update({col: \"max\"})\n\n            if \"_origins\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n            if \"_modeledDests\" in col:\n                agg_instructions.update({col: \"sum\"})\n\n        df_sp_wrk['mgra'] = df_sp_wrk['mgra'].map(mgra_cwk)\n\n        df_sp_wrk_agg = df_sp_wrk.groupby(['mgra']).agg(agg_instructions).reset_index()\n\n        alt = list(df_sp_wrk_agg['mgra'])\n        df_sp_wrk_agg.insert(loc=0, column=\"alt\", value=alt)\n\n        df_sp_wrk_agg.loc[len(df_sp_wrk_agg.index)] = 0\n\n        df_sp_wrk_agg.to_csv(os.path.join(rsm_dir, \"input\", \"ShadowPricingOutput_work_9.csv\"), index=False)\n\n    else:\n        FileNotFoundError(\"ShadowPricingOutput_work_9.csv\")\n\n    if \"TourDcSoaDistanceAlts.csv\" in input_files:\n        logging.info(\"Aggregating - TourDcSoaDistanceAlts.csv\")\n        df_TourDcSoaDistanceAlts = pd.DataFrame({\"a\" : range(1,taz_zones+1), \"dest\" : range(1, taz_zones+1)})\n        df_TourDcSoaDistanceAlts.to_csv(os.path.join(rsm_dir, \"uec\", \"TourDcSoaDistanceAlts.csv\"), index=False)\n\n    if \"DestinationChoiceAlternatives.csv\" in input_files:\n        logging.info(\"Aggregating - DestinationChoiceAlternatives.csv\")\n        df_DestinationChoiceAlternatives = pd.DataFrame({\"a\" : range(1,mgra_zones+1), \"mgra\" : range(1, mgra_zones+1)})\n        df_DestinationChoiceAlternatives.to_csv(os.path.join(rsm_dir, \"uec\", \"DestinationChoiceAlternatives.csv\"), index=False)\n\n    if \"SoaTazDistAlts.csv\" in input_files:\n        logging.info(\"Aggregating - SoaTazDistAlts.csv\")\n        df_SoaTazDistAlts = pd.DataFrame({\"a\" : range(1,taz_zones+1), \"dest\" : range(1, taz_zones+1)})\n        df_SoaTazDistAlts.to_csv(os.path.join(rsm_dir, \"uec\", \"SoaTazDistAlts.csv\"), index=False)\n\n    if \"TripMatrices.csv\" in input_files:\n        logging.info(\"Aggregating - TripMatrices.csv\")\n        trips = pd.read_csv(os.path.join(model_dir,\"output\", \"TripMatrices.csv\"))\n        trips['i'] = trips['i'].map(dict_clusters)\n        trips['j'] = trips['j'].map(dict_clusters)\n\n        cols = list(trips.columns)\n        cols.remove(\"i\")\n        cols.remove(\"j\")\n\n        trips_df = trips.groupby(['i', 'j'])[cols].sum().reset_index()\n        trips_df.to_csv(os.path.join(rsm_dir, \"output\", \"TripMatrices.csv\"), index = False)\n\n    else:\n        FileNotFoundError(\"TripMatrices.csv\")\n\n    if \"transponderModelAccessibilities.csv\" in input_files:\n        
logging.info(\"Aggregating - transponderModelAccessibilities.csv\")\n        tran_access = pd.read_csv(os.path.join(model_dir, \"output\", \"transponderModelAccessibilities.csv\"))\n        tran_access['TAZ'] = tran_access['TAZ'].map(dict_clusters)\n\n        tran_access_agg = tran_access.groupby(['TAZ'])['DIST','AVGTTS','PCTDETOUR'].mean().reset_index()\n        tran_access_agg.to_csv(os.path.join(rsm_dir, \"output\",\"transponderModelAccessibilities.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"transponderModelAccessibilities.csv\")\n\n    if \"crossBorderTours.csv\" in input_files:\n        logging.info(\"Aggregating - crossBorderTours.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"crossBorderTours.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"crossBorderTours.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"crossBorderTours.csv\")\n\n    if \"crossBorderTrips.csv\" in input_files:\n        logging.info(\"Aggregating - crossBorderTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"crossBorderTrips.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"crossBorderTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"crossBorderTrips.csv\")\n\n    if \"internalExternalTrips.csv\" in input_files:\n        logging.info(\"Aggregating - internalExternalTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"internalExternalTrips.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"internalExternalTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"internalExternalTrips.csv\")\n\n    if \"visitorTours.csv\" in input_files:\n        logging.info(\"Aggregating - visitorTours.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"visitorTours.csv\"))\n\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"visitorTours.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"visitorTours.csv\")\n\n    if \"visitorTrips.csv\" in input_files:\n        logging.info(\"Aggregating - visitorTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"visitorTrips.csv\"))\n\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"visitorTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"visitorTrips.csv\")\n\n    if \"householdAVTrips.csv\" in input_files:\n        logging.info(\"Aggregating - 
householdAVTrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"householdAVTrips.csv\"))\n        #print(os.path.join(model_dir, \"output\", \"householdAVTrips.csv\"))\n        df['orig_mgra'] = df['orig_mgra'].map(mgra_cwk)\n        df['dest_gra'] = df['dest_gra'].map(mgra_cwk)\n\n        df['trip_orig_mgra'] = df['trip_orig_mgra'].map(mgra_cwk)\n        df['trip_dest_mgra'] = df['trip_dest_mgra'].map(mgra_cwk)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"householdAVTrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"householdAVTrips.csv\")\n\n    if \"airport_out.CBX.csv\" in input_files:\n        logging.info(\"Aggregating - airport_out.CBX.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"airport_out.CBX.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"airport_out.CBX.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"airport_out.CBX.csv\")\n\n    if \"airport_out.SAN.csv\" in input_files:\n        logging.info(\"Aggregating - airport_out.SAN.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"airport_out.SAN.csv\"))\n        df['originMGRA'] = df['originMGRA'].map(mgra_cwk)\n        df['destinationMGRA'] = df['destinationMGRA'].map(mgra_cwk)\n\n        df['originTAZ'] = df['originTAZ'].map(dict_clusters)\n        df['destinationTAZ'] = df['destinationTAZ'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"airport_out.SAN.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"airport_out.SAN.csv\")\n\n    if \"TNCtrips.csv\" in input_files:\n        logging.info(\"Aggregating - TNCtrips.csv\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", \"TNCtrips.csv\"))\n        df['originMgra'] = df['originMgra'].map(mgra_cwk)\n        df['destinationMgra'] = df['destinationMgra'].map(mgra_cwk)\n\n        df['originTaz'] = df['originTaz'].map(dict_clusters)\n        df['destinationTaz'] = df['destinationTaz'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\", \"TNCtrips.csv\"), index = False)\n\n    else:\n        raise FileNotFoundError(\"TNCtrips.csv\")\n\n    files = [\"Trip\" + \"_\" + i + \"_\" + j + \".csv\" for i, j in\n                itertools.product([\"FA\", \"GO\", \"IN\", \"RE\", \"SV\", \"TH\", \"WH\"],\n                                   [\"OE\", \"AM\", \"MD\", \"PM\", \"OL\"])]\n\n    for file in files:\n        logging.info(f\"Aggregating - {file}\")\n        df = pd.read_csv(os.path.join(model_dir, \"output\", file))\n        df['I'] = df['I'].map(dict_clusters)\n        df['J'] = df['J'].map(dict_clusters)\n        df['HomeZone'] = df['HomeZone'].map(dict_clusters)\n        df.to_csv(os.path.join(rsm_dir, \"output\",file), index = False)\n
"},{"location":"api.html#rsm.translate.copy_transit_demand","title":"copy_transit_demand(matrix_names, input_dir='.', output_dir='.')","text":"

copies the omx transit demand matrices to the rsm directory

"},{"location":"api.html#rsm.translate.copy_transit_demand--parameters","title":"Parameters","text":"

matrix_names : list
    omx matrix filenames to copy
input_dir : Path-like
    default \u201c.\u201d
output_dir : Path-like
    default \u201c.\u201d

"},{"location":"api.html#rsm.translate.copy_transit_demand--returns","title":"Returns","text":"Source code in rsm/translate.py
def copy_transit_demand(\n    matrix_names,\n    input_dir=\".\",\n    output_dir=\".\"\n):\n\"\"\"\n    copies the omx transit demand matrices to the rsm directory\n\n    Parameters\n    ----------\n    matrix_names : matrix_names (list)\n        omx matrix filenames to copy\n    input_dir : input_dir (Path-like) \n        default \".\"\n    output_dir : output_dir (Path-like)\n        default \".\"\n\n    Returns\n    -------\n    None\n\n    \"\"\"\n\n\n    for mat_name in matrix_names:\n        if '.omx' not in mat_name:\n            mat_name = mat_name + \".omx\"\n\n        input_file_dir = os.path.join(input_dir, mat_name)\n        output_file_dir = os.path.join(output_dir, mat_name)\n\n        shutil.copy(input_file_dir, output_file_dir)\n
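A short usage sketch; the matrix names and directories below are invented for illustration, and a ".omx" extension is appended automatically when missing.

```python
# Usage sketch for copy_transit_demand; names and paths are hypothetical.
from rsm.translate import copy_transit_demand

copy_transit_demand(
    matrix_names=["transit_AM", "transit_PM"],  # ".omx" is appended if missing
    input_dir="path/to/donor_model/output",
    output_dir="path/to/rsm_run/output",
)
```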
"},{"location":"api.html#rsm.translate.translate_emmebank_demand","title":"translate_emmebank_demand(input_databank, output_databank, cores_to_aggregate, agg_zone_mapping)","text":"

aggregates the demand matrix cores from one emme databank and loads them into another databank

"},{"location":"api.html#rsm.translate.translate_emmebank_demand--parameters","title":"Parameters","text":"

input_databank : Emme databank
    Emme databank
output_databank : Emme databank
    Emme databank
cores_to_aggregate : list
    matrix corenames to aggregate
agg_zone_mapping : Path-like or pandas.DataFrame
    zone number mapping between original and aggregated zones. columns: original zones as \u2018taz\u2019 and aggregated zones as \u2018cluster_id\u2019

"},{"location":"api.html#rsm.translate.translate_emmebank_demand--returns","title":"Returns","text":"

None. Loads the trip matrices into emmebank.

Source code in rsm/translate.py
def translate_emmebank_demand(\n    input_databank,\n    output_databank,\n    cores_to_aggregate,\n    agg_zone_mapping,\n): \n\"\"\"\n    aggregates the demand matrix cores from one emme databank and loads them into another databank\n\n    Parameters\n    ----------\n    input_databank : input_databank (Emme databank)\n        Emme databank\n    output_databank : output_databank (Emme databank)\n        Emme databank\n    cores_to_aggregate : cores_to_aggregate (list)\n        matrix corenames to aggregate\n    agg_zone_mapping: agg_zone_mapping (Path-like or pandas.DataFrame)\n        zone number mapping between original and aggregated zones. \n        columns: original zones as 'taz' and aggregated zones as 'cluster_id'\n\n    Returns\n    -------\n    None. Loads the trip matrices into emmebank.\n\n    \"\"\"\n\n    agg_zone_mapping_df = pd.read_csv(os.path.join(agg_zone_mapping))\n    agg_zone_mapping_df = agg_zone_mapping_df.sort_values('taz')\n\n    agg_zone_mapping_df.columns= agg_zone_mapping_df.columns.str.strip().str.lower()\n    zone_mapping = dict(zip(agg_zone_mapping_df['taz'], agg_zone_mapping_df['cluster_id']))\n\n    for core in cores_to_aggregate: \n        matrix = input_databank.matrix(core).get_data()\n        matrix_array = matrix.to_numpy()\n\n        matrix_agg = _aggregate_matrix(matrix_array, zone_mapping)\n\n        output_matrix = output_databank.matrix(core)\n        output_matrix.set_numpy_data(matrix_agg)\n
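The function expects already-open Emme databank objects whose matrices support get_data() and set_numpy_data(), as used above. The sketch below shows one way it might be driven; the inro.emme import path follows INRO's published Python API, but treat it, along with all paths and core names, as assumptions.

```python
# Sketch only: assumes Emme's Python API is available in the environment;
# databank paths and matrix core names are hypothetical.
import inro.emme.database.emmebank as _eb  # INRO Emme API (assumption)
from rsm.translate import translate_emmebank_demand

input_bank = _eb.Emmebank("path/to/full_model/Database/emmebank")   # hypothetical
output_bank = _eb.Emmebank("path/to/rsm_run/Database/emmebank")     # hypothetical
translate_emmebank_demand(
    input_databank=input_bank,
    output_databank=output_bank,
    cores_to_aggregate=["SOV_AM", "HOV_AM"],  # hypothetical core names
    agg_zone_mapping="path/to/rsm_run/input/taz_crosswalk.csv",
)
```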
"},{"location":"api.html#rsm.translate.translate_omx_demand","title":"translate_omx_demand(matrix_names, agg_zone_mapping, input_dir='.', output_dir='.')","text":"

aggregates the omx demand matrix to aggregated zone system

"},{"location":"api.html#rsm.translate.translate_omx_demand--parameters","title":"Parameters","text":"

matrix_names : list
    omx matrix filenames to aggregate
agg_zone_mapping : path_like or pandas.DataFrame
    zone number mapping between original and aggregated zones. columns: original zones as \u2018taz\u2019 and aggregated zones as \u2018cluster_id\u2019
input_dir : path_like
    default \u201c.\u201d
output_dir : path_like
    default \u201c.\u201d

"},{"location":"api.html#rsm.translate.translate_omx_demand--returns","title":"Returns","text":"Source code in rsm/translate.py
def translate_omx_demand(\n    matrix_names,\n    agg_zone_mapping,\n    input_dir=\".\",\n    output_dir=\".\"\n): \n\"\"\"\n    aggregates the omx demand matrix to aggregated zone system\n\n    Parameters\n    ----------\n    matrix_names : matrix_names (list)\n        omx matrix filenames to aggregate\n    agg_zone_mapping: agg_zone_mapping (path_like or pandas.DataFrame)\n        zone number mapping between original and aggregated zones. \n        columns: original zones as 'taz' and aggregated zones as 'cluster_id'\n    input_dir : input_dir (path_like)\n        default \".\"\n    output_dir : output_dir (path_like) \n        default \".\"\n\n    Returns\n    -------\n\n    \"\"\"\n\n    agg_zone_mapping_df = pd.read_csv(os.path.join(agg_zone_mapping))\n    agg_zone_mapping_df = agg_zone_mapping_df.sort_values('taz')\n\n    agg_zone_mapping_df.columns= agg_zone_mapping_df.columns.str.strip().str.lower()\n    zone_mapping = dict(zip(agg_zone_mapping_df['taz'], agg_zone_mapping_df['cluster_id']))\n    agg_zones = sorted(agg_zone_mapping_df['cluster_id'].unique())\n\n    for mat_name in matrix_names:\n        if '.omx' not in mat_name:\n            mat_name = mat_name + \".omx\"\n\n        #logger.info(\"Aggregating Matrix: \" + mat_name + \" ...\")\n\n        input_skim_file = os.path.join(input_dir, mat_name)\n        print(input_skim_file)\n        output_skim_file = os.path.join(output_dir, mat_name)\n\n        assert os.path.isfile(input_skim_file)\n\n        input_matrix = omx.open_file(input_skim_file, mode=\"r\") \n        input_mapping_name = input_matrix.list_mappings()[0]\n        input_cores = input_matrix.list_matrices()\n\n        output_matrix = omx.open_file(output_skim_file, mode=\"w\")\n\n        for core in input_cores:\n            matrix = input_matrix[core]\n            matrix_array = matrix.read()\n            matrix_agg = _aggregate_matrix(matrix_array, zone_mapping)\n            output_matrix[core] = matrix_agg\n\n        output_matrix.create_mapping(title=input_mapping_name, entries=agg_zones)\n\n        input_matrix.close()\n        output_matrix.close()\n
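Both translate functions delegate the actual zone aggregation to a private helper, _aggregate_matrix, whose source is not shown here. Based on how it is called above, a plausible reading is that demand is summed over the original zones mapped to each cluster; the sketch below illustrates that assumption (1-based, contiguous zone and cluster ids) and is not the package's actual implementation.

```python
import numpy as np

def _aggregate_matrix_sketch(matrix_array, zone_mapping):
    """Illustrative stand-in for rsm.translate._aggregate_matrix (assumption)."""
    # zone_mapping: {original_taz_id: cluster_id}, assumed 1-based and contiguous
    n_agg = len(set(zone_mapping.values()))
    out = np.zeros((n_agg, n_agg), dtype=float)
    n_zones = matrix_array.shape[0]
    for i in range(n_zones):
        ci = zone_mapping[i + 1] - 1           # row cluster (0-based index)
        for j in range(n_zones):
            cj = zone_mapping[j + 1] - 1       # column cluster (0-based index)
            out[ci, cj] += matrix_array[i, j]  # demand is additive across zones
    return out
```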
"},{"location":"api.html#rsm.sampler.rsm_household_sampler","title":"rsm_household_sampler(input_dir='.', output_dir='.', prev_iter_access=None, curr_iter_access=None, study_area=None, input_household='households.csv', input_person='persons.csv', taz_crosswalk='taz_crosswalk.csv', mgra_crosswalk='mgra_crosswalk.csv', compare_access_columns=('NONMAN_AUTO', 'NONMAN_TRANSIT', 'NONMAN_NONMOTOR', 'NONMAN_SOV_0'), default_sampling_rate=0.25, lower_bound_sampling_rate=0.15, upper_bound_sampling_rate=1.0, random_seed=42, output_household='sampled_households.csv', output_person='sampled_person.csv')","text":"

Take an intelligent sampling of households.

"},{"location":"api.html#rsm.sampler.rsm_household_sampler--parameters","title":"Parameters","text":"

input_dir : path_like
    default \u201c.\u201d
output_dir : path_like
    default \u201c.\u201d
prev_iter_access : Path-like or pandas.DataFrame
    Accessibility in an old (default, no treatment, etc.) run is given (preloaded) or read in from here. Give as a relative path (from input_dir) or an absolute path.
curr_iter_access : Path-like or pandas.DataFrame
    Accessibility in the latest run is given (preloaded) or read in from here. Give as a relative path (from input_dir) or an absolute path.
study_area : array-like
    Array of RSM zones (these are numbered 1 to N in the RSM) in the study area. These zones are sampled at 100% if differential sampling is also turned on.
input_household : Path-like or pandas.DataFrame
    Complete synthetic household file. This data will be filtered to match the sampling of households and written out to a new CSV file.
input_person : Path-like or pandas.DataFrame
    Complete synthetic persons file. This data will be filtered to match the sampling of households and written out to a new CSV file.
compare_access_columns : Collection[str]
    Column names in the accessibility file to use for comparing accessibility. Only changes in the values in these columns will be evaluated.
default_sampling_rate : float
    The default sampling rate, in the range (0,1]
lower_bound_sampling_rate : float
    Sampling rates by zone will be truncated so they are never lower than this.
upper_bound_sampling_rate : float
    Sampling rates by zone will be truncated so they are never higher than this.

"},{"location":"api.html#rsm.sampler.rsm_household_sampler--returns","title":"Returns","text":"

sample_households_df, sample_persons_df : pandas.DataFrame
    These are the sampled population to resimulate. They are also written to the output_dir.

Source code in rsm/sampler.py
def rsm_household_sampler(\n    input_dir=\".\",\n    output_dir=\".\",\n    prev_iter_access=None,\n    curr_iter_access=None,\n    study_area=None,\n    input_household=\"households.csv\",\n    input_person=\"persons.csv\",\n    taz_crosswalk=\"taz_crosswalk.csv\",\n    mgra_crosswalk=\"mgra_crosswalk.csv\",\n    compare_access_columns=(\n        \"NONMAN_AUTO\",\n        \"NONMAN_TRANSIT\",\n        \"NONMAN_NONMOTOR\",\n        \"NONMAN_SOV_0\",\n    ),\n    default_sampling_rate=0.25,  # fix the values of this after some testing\n    lower_bound_sampling_rate=0.15,  # fix the values of this after some testing\n    upper_bound_sampling_rate=1.0,  # fix the values of this after some testing\n    random_seed=42,\n    output_household=\"sampled_households.csv\",\n    output_person=\"sampled_person.csv\",\n):\n\"\"\"\n    Take an intelligent sampling of households.\n\n    Parameters\n    ----------\n    input_dir : input_dir (path_like)\n        default \".\"\n    output_dir : output_dir (path_like)\n        default \".\"\n    prev_iter_access : prev_iter_access (Path-like or pandas.DataFrame)\n        Accessibility in an old (default, no treatment, etc) run is given (preloaded)\n        or read in from here. Give as a relative path (from `input_dir`) or an\n        absolute path.\n    curr_iter_access : curr_iter_access (Path-like or pandas.DataFrame)\n        Accessibility in the latest run is given (preloaded) or read in from here.\n        Give as a relative path (from `input_dir`) or an absolute path.\n    study_area : study_area (array-like)\n        Array of RSM zone (these are numbered 1 to N in the RSM) in the study area.\n        These zones are sampled at 100% if differential sampling is also turned on.\n    input_household : input_household (Path-like or pandas.DataFrame)\n        Complete synthetic household file.  This data will be filtered to match the\n        sampling of households and written out to a new CSV file.\n    input_person : input_person (Path-like or pandas.DataFrame)\n        Complete synthetic persons file.  This data will be filtered to match the\n        sampling of households and written out to a new CSV file.\n    compare_access_columns : compare_access_columns (Collection[str])\n        Column names in the accessibility file to use for comparing accessibility.\n        Only changes in the values in these columns will be evaluated.\n    default_sampling_rate : default_sampling_rate (float)\n        The default sampling rate, in the range (0,1]\n    lower_bound_sampling_rate : lower_bound_sampling_rate (float)\n        Sampling rates by zone will be truncated so they are never lower than this.\n    upper_bound_sampling_rate : upper_bound_sampling_rate (float)\n        Sampling rates by zone will be truncated so they are never higher than this.\n\n    Returns\n    -------\n    sample_households_df, sample_persons_df : sample_households_df, sample_persons_df (pandas.DataFrame)\n        These are the sampled population to resimulate.  
They are also written to\n        the output_dir\n    \"\"\"\n\n    input_dir = Path(input_dir or \".\")\n    output_dir = Path(output_dir or \".\")\n\n    logger.debug(\"CALL rsm_household_sampler\")\n    logger.debug(f\"  {input_dir=}\")\n    logger.debug(f\"  {output_dir=}\")\n\n    def _resolve_df(x, directory, make_index=None):\n        if isinstance(x, (str, Path)):\n            # read in the file to a pandas DataFrame\n            x = Path(x).expanduser()\n            if not x.is_absolute():\n                x = Path(directory or \".\").expanduser().joinpath(x)\n            try:\n                result = pd.read_csv(x)\n            except FileNotFoundError:\n                raise\n        elif isinstance(x, pd.DataFrame):\n            result = x\n        elif x is None:\n            result = None\n        else:\n            raise TypeError(\"must be path-like or DataFrame\")\n        if (\n            result is not None\n            and make_index is not None\n            and make_index in result.columns\n        ):\n            result = result.set_index(make_index)\n        return result\n\n    def _resolve_out_filename(x):\n        x = Path(x).expanduser()\n        if not x.is_absolute():\n            x = Path(output_dir).expanduser().joinpath(x)\n        x.parent.mkdir(parents=True, exist_ok=True)\n        return x\n\n    prev_iter_access_df = _resolve_df(\n        prev_iter_access, input_dir, make_index=\"MGRA\"\n    )\n    curr_iter_access_df = _resolve_df(\n        curr_iter_access, input_dir, make_index=\"MGRA\"\n    )\n    rsm_zones = _resolve_df(taz_crosswalk, input_dir)\n    dict_clusters = dict(zip(rsm_zones[\"taz\"], rsm_zones[\"cluster_id\"]))\n\n    rsm_mgra_zones = _resolve_df(mgra_crosswalk, input_dir)\n    rsm_mgra_zones.columns = rsm_mgra_zones.columns.str.strip().str.lower()\n    dict_clusters_mgra = dict(zip(rsm_mgra_zones[\"mgra\"], rsm_mgra_zones[\"cluster_id\"]))\n\n    # changing the taz and mgra to new cluster ids\n    input_household_df = _resolve_df(input_household, input_dir)\n    input_household_df[\"taz\"] = input_household_df[\"taz\"].map(dict_clusters)\n    input_household_df[\"mgra\"] = input_household_df[\"mgra\"].map(dict_clusters_mgra)\n    input_household_df[\"count\"] = 1\n\n    mgra_hh = input_household_df.groupby([\"mgra\"]).size().rename(\"n_hh\").to_frame()\n\n    if curr_iter_access_df is None or prev_iter_access_df is None:\n\n        if curr_iter_access_df is None:\n            logger.warning(f\"missing curr_iter_access_df from {curr_iter_access}\")\n        if prev_iter_access_df is None:\n            logger.warning(f\"missing prev_iter_access_df from {prev_iter_access}\")\n        # true when sampler is turned off. 
default_sampling_rate should be set to 1\n\n        mgra_hh[\"sampling_rate\"] = default_sampling_rate\n        if study_area is not None:\n            mgra_hh.loc[mgra_hh.index.isin(study_area), \"sampling_rate\"] = 1\n\n        sample_households = []\n\n        for mgra_id, row in mgra_hh.iterrows():\n            df = input_household_df.loc[input_household_df[\"mgra\"] == mgra_id]\n            sampling_rate = row[\"sampling_rate\"]\n            logger.info(f\"Sampling rate of RSM zone {mgra_id}: {sampling_rate}\")\n            df = df.sample(frac=sampling_rate, random_state=mgra_id + random_seed)\n            sample_households.append(df)\n\n        # combine study area and non-study area households into single dataframe\n        sample_households_df = pd.concat(sample_households)\n\n    else:\n        # restrict to rows only where TAZs have households\n        prev_iter_access_df = prev_iter_access_df[\n            prev_iter_access_df.index.isin(mgra_hh.index)\n        ].copy()\n        curr_iter_access_df = curr_iter_access_df[\n            curr_iter_access_df.index.isin(mgra_hh.index)\n        ].copy()\n\n        # compare accessibility columns\n        compare_results = pd.DataFrame()\n\n        for column in compare_access_columns:\n            compare_results[column] = (\n                curr_iter_access_df[column] - prev_iter_access_df[column]\n            ).abs()  # take absolute difference\n        compare_results[\"MGRA\"] = prev_iter_access_df.index\n\n        compare_results = compare_results.set_index(\"MGRA\")\n\n        # Take row sums of all difference\n        compare_results[\"Total\"] = compare_results[list(compare_access_columns)].sum(\n            axis=1\n        )\n\n        # TODO: potentially adjust this later after we figure out a better approach\n        wgts = compare_results[\"Total\"] + 0.01\n        wgts /= wgts.mean() / default_sampling_rate\n        compare_results[\"sampling_rate\"] = np.clip(\n            wgts, lower_bound_sampling_rate, upper_bound_sampling_rate\n        )\n\n        sample_households = []\n        sample_rate_df = compare_results[[\"sampling_rate\"]].copy()\n        if study_area is not None:\n            sample_rate_df.loc[\n                sample_rate_df.index.isin(study_area), \"sampling_rate\"\n            ] = 1\n\n        for mgra_id, row in sample_rate_df.iterrows():\n            df = input_household_df.loc[input_household_df[\"mgra\"] == mgra_id]\n            sampling_rate = row[\"sampling_rate\"]\n            logger.info(f\"Sampling rate of RSM zone {mgra_id}: {sampling_rate}\")\n            df = df.sample(frac=sampling_rate, random_state=mgra_id + random_seed)\n            sample_households.append(df)\n\n        # combine study area and non-study area households into single dataframe\n        sample_households_df = pd.concat(sample_households)\n\n    sample_households_df = sample_households_df.sort_values(by=[\"hhid\"])\n    sample_households_df.to_csv(_resolve_out_filename(output_household), index=False)\n\n    # select persons belonging to sampled households\n    sample_hhids = sample_households_df[\"hhid\"].to_numpy()\n\n    persons_df = _resolve_df(input_person, input_dir)\n    sample_persons_df = persons_df.loc[persons_df[\"hhid\"].isin(sample_hhids)]\n    sample_persons_df.to_csv(_resolve_out_filename(output_person), index=False)\n\n    global_sample_rate = round(len(sample_households_df) / len(input_household_df),2)\n    logger.info(f\"Total Sampling Rate : {global_sample_rate}\")\n\n    return sample_households_df, 
sample_persons_df\n
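The weighting step in the accessibility-difference branch is compact, so a small worked example may help. With the default 25% rate and bounds of 0.15 and 1.0, three hypothetical zones whose absolute accessibility changes sum to 0.0, 0.5, and 2.0 would be assigned rates roughly as follows:

```python
import numpy as np
import pandas as pd

total_change = pd.Series([0.0, 0.5, 2.0])  # hypothetical |change in accessibility| row sums
wgts = total_change + 0.01                 # small offset so unchanged zones keep some weight
wgts /= wgts.mean() / 0.25                 # rescale so the average rate equals the default
rates = np.clip(wgts, 0.15, 1.0)           # enforce the lower/upper bounds
# -> roughly [0.15, 0.15, 0.60]: stable zones fall to the floor, while the
#    zone with the largest accessibility change is resimulated at a higher rate
```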
"},{"location":"api.html#rsm.assembler.rsm_assemble","title":"rsm_assemble(orig_indiv, orig_joint, rsm_indiv, rsm_joint, households, mgra_crosswalk=None, taz_crosswalk=None, sample_rate=0.25, study_area_taz=None, run_assembler=1)","text":"

Assemble and evaluate RSM trip making.

"},{"location":"api.html#rsm.assembler.rsm_assemble--parameters","title":"Parameters","text":"

orig_indiv : path_like
    Trips table from the \u201coriginal\u201d model run; should be a comprehensive simulation of all individual trips for all synthetic households.
orig_joint : path_like
    Joint trips table from the \u201coriginal\u201d model run; should be a comprehensive simulation of all joint trips for all synthetic households.
rsm_indiv : path_like
    Trips table from the RSM model run; should be a simulation of all individual trips for potentially only a subset of all synthetic households.
rsm_joint : path_like
    Trips table from the RSM model run; should be a simulation of all joint trips for potentially only a subset of all synthetic households (the same sampled households as in rsm_indiv).
households : path_like
    Synthetic household file, used to get home zones for households.
mgra_crosswalk : path_like, optional
    Crosswalk from original MGRA to clustered zone ids. Provide this crosswalk if the orig_indiv and orig_joint files reference the original MGRA system and those ids need to be converted to aggregated values before merging.
taz_crosswalk : path_like, optional
    Crosswalk from original TAZ to aggregated zone ids; used to scale trips by zone when run_assembler is 0.
sample_rate : float
    Default/fixed sample rate if the sampler was turned off; used to scale the trips when run_assembler is 0.
run_assembler : boolean
    Flag to indicate whether to run the RSM assembler: 1 runs the assembler, 0 turns it off. Setting this to 0 is only an option if the sampler is turned off.
study_area_taz : list
    List of study area RSM zones.

"},{"location":"api.html#rsm.assembler.rsm_assemble--returns","title":"Returns","text":"

final_ind_trips : pd.DataFrame
    Assembled individual trip table for the RSM run, filling in archived trip values for non-resimulated households.
final_jnt_trips : pd.DataFrame
    Assembled joint trip table for the RSM run, filling in archived trip values for non-resimulated households.

Separate tables for individual and joint trips are returned, as required by Java. A summary of changes in trips by mode and household home zone is also computed internally, to check whether undersampled zones have stable travel behavior.

Source code in rsm/assembler.py
def rsm_assemble(\n    orig_indiv,\n    orig_joint,\n    rsm_indiv,\n    rsm_joint,\n    households,\n    mgra_crosswalk=None,\n    taz_crosswalk=None,\n    sample_rate=0.25,\n    study_area_taz=None,\n    run_assembler=1,\n):\n\"\"\"\n    Assemble and evaluate RSM trip making.\n\n    Parameters\n    ----------\n    orig_indiv : orig_indiv (path_like)\n        Trips table from \"original\" model run, should be comprehensive simulation\n        of all individual trips for all synthetic households.\n    orig_joint : orig_joint (path_like)\n        Joint trips table from \"original\" model run, should be comprehensive simulation\n        of all joint trips for all synthetic households.\n    rsm_indiv : rsm_indiv (path_like)\n        Trips table from RSM model run, should be a simulation of all individual\n        trips for potentially only a subset of all synthetic households.\n    rsm_joint : rsm_joint (path_like)\n        Trips table from RSM model run, should be a simulation of all joint\n        trips for potentially only a subset of all synthetic households (the\n        same sampled households as in `rsm_indiv`).\n    households : households (path_like)\n        Synthetic household file, used to get home zones for households.\n    mgra_crosswalk : mgra_crosswalk (path_like, optional)\n        Crosswalk from original MGRA to clustered zone ids.  Provide this crosswalk\n        if the `orig_indiv` and `orig_joint` files reference the original MGRA system\n        and those id's need to be converted to aggregated values before merging.\n    sample_rate : sample_rate (float)\n        Default/fixed sample rate if sampler was turned off\n        this is used to scale the trips if run_assembler is 0\n    run_assembler : run_assembler (boolean)\n        Flag to indicate whether to run RSM assembler or not. 
\n        1 is to run assembler, 0 is to turn it off.\n        Setting this to 0 is only an option if the sampler is turned off.\n    taz_crosswalk : taz_crosswalk (path_like, optional)\n        Crosswalk from original TAZ to aggregated zone ids; used to scale trips\n        by zone when run_assembler is 0.\n    study_area_taz : study_area_taz (list)\n        List of study area RSM zones.\n\n    Returns\n    -------\n    final_ind_trips : final_ind_trips (pd.DataFrame)\n        Assembled individual trip table for the RSM run, filling in archived\n        trip values for non-resimulated households.\n    final_jnt_trips : final_jnt_trips (pd.DataFrame)\n        Assembled joint trip table for the RSM run, filling in archived\n        trip values for non-resimulated households.\n\n    Separate tables for individual and joint trips are returned, as required by Java.\n    A summary of changes in trips by mode and household home zone is also computed\n    internally, to check whether undersampled zones have stable travel behavior.\n\n\n    \"\"\"\n    orig_indiv = Path(orig_indiv).expanduser()\n    orig_joint = Path(orig_joint).expanduser()\n    rsm_indiv = Path(rsm_indiv).expanduser()\n    rsm_joint = Path(rsm_joint).expanduser()\n    households = Path(households).expanduser()\n\n    assert os.path.isfile(orig_indiv)\n    assert os.path.isfile(orig_joint)\n    assert os.path.isfile(rsm_indiv)\n    assert os.path.isfile(rsm_joint)\n    assert os.path.isfile(households)\n\n    if mgra_crosswalk is not None:\n        mgra_crosswalk = Path(mgra_crosswalk).expanduser()\n        assert os.path.isfile(mgra_crosswalk)\n\n    if taz_crosswalk is not None:\n        taz_crosswalk = Path(taz_crosswalk).expanduser()\n        assert os.path.isfile(taz_crosswalk)\n\n    # load trip data - partial simulation of RSM model\n    logger.info(\"reading ind_trips_rsm\")\n    ind_trips_rsm = pd.read_csv(rsm_indiv)\n    logger.info(\"reading jnt_trips_rsm\")\n    jnt_trips_rsm = pd.read_csv(rsm_joint)\n\n    scale_factor = int(1.0/sample_rate)\n\n    if run_assembler == 1:\n        # load trip data - full simulation of residual/source model\n        logger.info(\"reading ind_trips_full\")\n        ind_trips_full = pd.read_csv(orig_indiv)\n        logger.info(\"reading jnt_trips_full\")\n        jnt_trips_full = pd.read_csv(orig_joint)\n\n        if mgra_crosswalk is not None:\n            logger.info(\"applying mgra_crosswalk to original data\")\n            mgra_crosswalk = pd.read_csv(mgra_crosswalk).set_index(\"MGRA\")[\"cluster_id\"]\n            mgra_crosswalk[-1] = -1\n            mgra_crosswalk[0] = 0\n            for col in [c for c in ind_trips_full.columns if c.lower().endswith(\"_mgra\")]:\n                ind_trips_full[col] = ind_trips_full[col].map(mgra_crosswalk)\n            for col in [c for c in jnt_trips_full.columns if c.lower().endswith(\"_mgra\")]:\n                jnt_trips_full[col] = jnt_trips_full[col].map(mgra_crosswalk)\n\n        # convert to rsm trips\n        logger.info(\"convert to common table platform\")\n        rsm_trips = _merge_joint_and_indiv_trips(ind_trips_rsm, jnt_trips_rsm)\n        original_trips = _merge_joint_and_indiv_trips(ind_trips_full, jnt_trips_full)\n\n        logger.info(\"get all hhids in trips produced by RSM\")\n        hh_ids_rsm = rsm_trips[\"hh_id\"].unique()\n\n        logger.info(\"remove original model trips made by households chosen in RSM trips\")\n        original_trips_not_resimulated = original_trips.loc[\n            ~original_trips[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n        original_ind_trips_not_resimulated = ind_trips_full[\n            ~ind_trips_full[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n        
original_jnt_trips_not_resimulated = jnt_trips_full[\n            ~jnt_trips_full[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n\n        logger.info(\"concatenate trips from rsm and original model\")\n        final_trips_rsm = pd.concat(\n            [rsm_trips, original_trips_not_resimulated], ignore_index=True\n        ).reset_index(drop=True)\n        final_ind_trips = pd.concat(\n            [ind_trips_rsm, original_ind_trips_not_resimulated], ignore_index=True\n        ).reset_index(drop=True)\n        final_jnt_trips = pd.concat(\n            [jnt_trips_rsm, original_jnt_trips_not_resimulated], ignore_index=True\n        ).reset_index(drop=True)\n\n        # Get percentage change in total trips by mode for each home zone\n\n        # extract trips made by households in RSM and Original model\n        original_trips_that_were_resimulated = original_trips.loc[\n            original_trips[\"hh_id\"].isin(hh_ids_rsm)\n        ]\n\n        def _agg_by_hhid_and_tripmode(df, name):\n            return df.groupby([\"hh_id\", \"trip_mode\"]).size().rename(name).reset_index()\n\n        # combining trips by hhid and trip mode\n        combined_trips = pd.merge(\n            _agg_by_hhid_and_tripmode(original_trips_that_were_resimulated, \"n_trips_orig\"),\n            _agg_by_hhid_and_tripmode(rsm_trips, \"n_trips_rsm\"),\n            on=[\"hh_id\", \"trip_mode\"],\n            how=\"outer\",\n            sort=True,\n        ).fillna(0)\n\n        # aggregating by Home zone\n        hh_rsm = pd.read_csv(households)\n        hh_id_col_names = [\"hhid\", \"hh_id\", \"household_id\"]\n        for hhid in hh_id_col_names:\n            if hhid in hh_rsm.columns:\n                break\n        else:\n            raise KeyError(f\"none of {hh_id_col_names!r} in household file\")\n        homezone_col_names = [\"mgra\", \"home_mgra\"]\n        for zoneid in homezone_col_names:\n            if zoneid in hh_rsm.columns:\n                break\n        else:\n            raise KeyError(f\"none of {homezone_col_names!r} in household file\")\n        hh_rsm = hh_rsm[[hhid, zoneid]]\n\n        # attach home zone id\n        combined_trips = pd.merge(\n            combined_trips, hh_rsm, left_on=\"hh_id\", right_on=hhid, how=\"left\"\n        )\n\n        combined_trips_by_zone = (\n            combined_trips.groupby([zoneid, \"trip_mode\"])[[\"n_trips_orig\", \"n_trips_rsm\"]]\n            .sum()\n            .reset_index()\n        )\n\n        combined_trips_by_zone = combined_trips_by_zone.eval(\n            \"net_change = (n_trips_rsm - n_trips_orig)\"\n        )\n\n        combined_trips_by_zone[\"max_trips\"] = np.fmax(\n            combined_trips_by_zone.n_trips_rsm, combined_trips_by_zone.n_trips_orig\n        )\n        combined_trips_by_zone = combined_trips_by_zone.eval(\n            \"pct_change = net_change / max_trips * 100\"\n        )\n        combined_trips_by_zone = combined_trips_by_zone.drop(columns=\"max_trips\")\n    else:\n        # if assembler is set to be turned off\n        # then scale the trips in the trip list using the fixed sample rate \n        # trips in the final trip lists will be 100%\n        scale_factor = int(1.0/sample_rate)\n\n        if study_area_taz:\n            sa_rsm = study_area_taz\n        else:\n            sa_rsm = None\n\n        # concat is slow\n        # https://stackoverflow.com/questions/50788508/how-can-i-replicate-rows-of-a-pandas-dataframe\n        #final_ind_trips = pd.concat([ind_trips_rsm]*scale_factor, ignore_index=True)\n        #final_jnt_trips 
= pd.concat([jnt_trips_rsm]*scale_factor, ignore_index=True)\n\n\n        final_ind_trips = scaleup_to_rsm_samplingrate(ind_trips_rsm, \n                                                      households, \n                                                      taz_crosswalk, \n                                                      scale_factor, \n                                                      study_area_tazs=sa_rsm)\n\n        final_jnt_trips = scaleup_to_rsm_samplingrate(jnt_trips_rsm, \n                                                      households, \n                                                      taz_crosswalk, \n                                                      scale_factor,\n                                                      study_area_tazs=sa_rsm) \n\n    return final_ind_trips, final_jnt_trips\n
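A usage sketch for a run with the assembler active follows; the file paths are hypothetical, and the crosswalk arguments are only needed when the original trip tables reference the unaggregated zone systems (or, for taz_crosswalk, when run_assembler is 0).

```python
# Hedged usage sketch for rsm_assemble; all paths are hypothetical.
from rsm.assembler import rsm_assemble

final_ind_trips, final_jnt_trips = rsm_assemble(
    orig_indiv="donor_run/output/indivTripData_3.csv",  # hypothetical path
    orig_joint="donor_run/output/jointTripData_3.csv",  # hypothetical path
    rsm_indiv="rsm_run/output/indivTripData_3.csv",
    rsm_joint="rsm_run/output/jointTripData_3.csv",
    households="rsm_run/input/households.csv",
    mgra_crosswalk="rsm_run/input/mgra_crosswalk.csv",  # if originals use old MGRAs
    sample_rate=0.25,
    run_assembler=1,
)
```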
"},{"location":"assessment.html","title":"Assessment","text":""},{"location":"assessment.html#rsm-configuration","title":"RSM Configuration","text":"

The team conducted tests using different combinations of the RSM parameters, including the number of RSM zones (1000, 2000), default sampling rates (15%, 25%, 100%), enabling or disabling the intelligent sampler, and the number of global iterations (2 or 3), among other factors. The number of RSM zones had its most significant influence on the runtime of the highway assignment process; since that runtime was already low with 1000 RSM zones, there was no motivation to explore lower zone counts. Altering the sampling rate had a greater impact on the runtime of the demand model (CT-RAMP) than changing the number of RSM zones. These test runs exhibited varying runtimes depending on the specific configuration. Key regional-level metrics were analyzed across the test runs to understand the trade-off between improved RSM runtime and RSM results that closely align with ABM2+ for policy planning requirements. Based on this, the team collectively determined that the \u201coptimal\u201d configuration for the MVP (Minimum Viable Product) version of the RSM would be 2000 RSM zones, a 25% default sampling rate, the intelligent sampler turned off, and 2 global iterations; this configuration was used for the overall assessment of the RSM.

"},{"location":"assessment.html#calibration","title":"Calibration","text":"

Aggregating the ABM zones to RSM zones distorts the walk trip share coming out of the model. With the RSM model configuration identified above (rapid zones, global iterations, sample rate, etc.), tour mode choice calibration was performed to match the RSM mode share to the ABM2+ mode share, primarily to match walk trips. A calibration constant was applied to the tour mode choice UEC for the School, Maintenance, and Discretionary tour purposes. The mode shares for the Work and University purposes were reasonable, so no calibration was applied to those purposes.

RSM-specific constants were added to the Tour Mode Choice UEC (TourModeChoice.xls) for some of the tour purposes. The walk mode share for the Maintenance and Discretionary purposes was first adjusted by calibrating and applying an RSM-specific constant row in the UEC. In cases where the tour involved escorting for Maintenance or Discretionary purposes, an additional calibration constant was introduced to further adjust the walk mode share for such escort tours. Similarly, a different set of constants was added to calibrate the School tour purpose. No other tour purpose required mode choice calibration, as their mode shares from the RSM were reasonable.

Note that a minor recalibration will be required for the RSM whenever the number of rapid zones is changed.

Here is how the mode share and VMT compare before and after the calibration for the RSM. \u201cDonor model\u201d in the charts below refers to the ABM2+ run.

"},{"location":"assessment.html#base-year-validation","title":"Base Year Validation","text":"

Here is a table comparing ABM2+ and RSM outcomes after the RSM calibration. The metrics used are some of the key regional-level metrics. Volume comparisons for the roadway segments on I-5 and I-8 were chosen at random.

"},{"location":"assessment.html#runtime-comparison","title":"Runtime Comparison","text":"

For the base year 2016 simulation, below is the runtime comparison of ABM2+ vs. RSM.

"},{"location":"assessment.html#sensitivity-testing","title":"Sensitivity Testing","text":"

After validating the RSM for the base year with the chosen design configuration, the RSM was used to carry out hypothetical planning studies related to some broader use cases. Model results from both the RSM and ABM2+ were compared for each sensitivity test to assess the performance of the RSM and to evaluate whether it could be a viable tool for such policy planning.

For each test, a few key metrics from the ABM2+ No Action, ABM2+ Action, RSM No Action, and RSM Action scenario runs were compared. The goal was for the RSM and ABM2+ to show similar sensitivities for action vs. no-action.

"},{"location":"assessment.html#regional-highway-changes","title":"Regional Highway Changes","text":""},{"location":"assessment.html#auto-operating-cost-50-increase","title":"Auto Operating Cost - 50% Increase","text":""},{"location":"assessment.html#auto-operating-cost-50-decrease","title":"Auto Operating Cost - 50% Decrease","text":""},{"location":"assessment.html#ride-hailing-cost-50-decrease","title":"Ride Hailing Cost - 50% decrease","text":""},{"location":"assessment.html#automated-vehicles-100-adoption","title":"Automated Vehicles - 100% Adoption","text":"

In the SANDAG model, AV adoption is analyzed by capturing the zero-occupancy vehicle movement simulated in the Household AV Allocation module. The RSM skips this AV allocation module, which is why the RSM is not a viable tool for evaluating policies related to automated vehicles.

"},{"location":"assessment.html#land-use-changes","title":"Land Use Changes","text":"

RSM and ABM2+ shows similar sensitivities for the two tested scenarios with land use change.

"},{"location":"assessment.html#change-in-land-use-job-housing-balance","title":"Change in land use - Job Housing Balance","text":""},{"location":"assessment.html#change-in-land-use-mixed-land-use","title":"Change in land use - Mixed Land Use","text":""},{"location":"assessment.html#regional-transit-changes","title":"Regional Transit Changes","text":""},{"location":"assessment.html#transit-frequency","title":"Transit Frequency","text":"

The RSM and ABM generally match on changes in regional metrics when the transit frequency is globally doubled.

"},{"location":"assessment.html#local-highway-changes","title":"Local Highway Changes","text":""},{"location":"assessment.html#toll-removal","title":"Toll Removal","text":"

The removal of the toll on SR-125 (The South Bay Expressway) was tested in both ABM and RSM. In both models, volumes on SR-125 increased and volumes on I-805 at the same point decreased.

"},{"location":"assessment.html#local-transit-changes","title":"Local Transit Changes","text":""},{"location":"assessment.html#rapid-637-brt","title":"Rapid 637 BRT","text":"

Tests were conducted that added the planned Rapid 637 line from North Park to the Naval Facilities to the base year network. Without the study area definition there were around 3,000 boardings from the RSM, but the addition of a study area resulted in a value much closer to the one produced by ABM2+.

"},{"location":"development.html","title":"Development","text":""},{"location":"development.html#needs","title":"Needs","text":"

The time needed to configure, run, and summarize results from ABM2+ is too slow to support a nimble, challenging, and engagement-oriented planning process. SANDAG needed a tool that quickly approximates the outcomes of ABM2+. The rapid strategic model, or RSM, was built for this purpose.

ABM2+ Schematic is shown below

"},{"location":"development.html#design-considerations","title":"Design Considerations","text":"

Reducing the number of zones reduces model runtime.

Reducing the number of model components reduces runtime.

Reducing the number of global iterations reduces runtime.

Reducing sample rate reduces runtime.

"},{"location":"development.html#architecture","title":"Architecture","text":"

The RSM is developed as a Python package and the required modules are launched when running the existing SANDAG travel model as Rapid Model. It takes as input a complete ABM2+ model run and has following modules:

"},{"location":"development.html#zone-aggregator","title":"Zone Aggregator","text":"

The RSM zone creator/aggregator creates a set of RSM analysis zones (Rapid Zones) and a set of RSM input files compatible with the zone system, using a donor model run (ABM2+/ABM3) as input. The inputs include the MGRA shapefile (MGRASHAPE.zip), MGRA socioeconomic file (example: mgra13_based_input2016.csv), individual trips (indivTripData_3.csv), from the donor model. It produces a new MGRA socioeconomic file with new RSM zones and crosswalk files between original TAZ/MGRA and the rapid zones. Along with the inputs, the user can specify other parameters such as number of RSM zones, donor model run directory, number of external zones, MGRA socioeconomic file, names of crosswalk files generated by the zone aggregator module, optional study area file (to study localized changes in the region) and RSM zone centroid csv files in the model properties file (sandag_abm.properties).

At the core of the RSM zone aggregator, the module performs several steps. The MGRA geographies are loaded from shapefiles, MGRA data is loaded from the MGRA socioeconomic file, and trip data is extracted from the individual trip file. Additional computations, like intersection counts and density variables, are performed on the MGRA data. The script aggregates the MGRA\u2019s attributes to create a new zone data based on \u201cTAZ\u201d (Traffic Analysis Zone). The individual trips file is used to calculate the mode shares for each TAZ. Additional travel time between TAZs to the point of interest (default includes San Diego city hall, outside Pendleton gate, Escondido city hall, Viejas casino, and San Ysidro trolley) are also added to the aggregated data by TAZ. The TAZs are further clustered to a user-defined number of RSM zones using several cluster factors (default factors and their weights are as follows: \u201cpopden\u201d: 1, \u201cempden\u201d: 1, \u201cmodeshare_NM\u201d: 100, \u201cmodeshare_WT\u201d: 100) and clustering algorithm. The current scripts support KMeans and agglomerative clustering algorithms to cluster the TAZs. In case the user has specified a study area, the function separately handles them and aggregates them into their clusters based on the specification provided in the study area file. The remaining TAZs are aggregated based on the aggregation algorithm.

After the clustering, the aggregator produces the TAZ/MGRA crosswalks between old TAZs/MGRAs to new RSM zones. The elementary and high school enrollments are further checked and adjusted in the new RSM zone socioeconomic to prevent zero values.

The user can also control the execution of the zone aggregator from the properties file. Once a baseline RSM run is established, other project related RSM can be setup to skip running the zone aggregator and the zone system from the RSM baseline can be used. Please note that MGRA and TAZs are essentially same geographically in the RSM model run except their numbering is different.

"},{"location":"development.html#input-aggregator","title":"Input Aggregator","text":"

The input aggregator module of RSM aggregates several input files, uec (soa) files, non-abm model outputs of the donor model based on the new RSM zones. The main inputs to this module include the location of the donor model, RSM socioeconomic file, TAZ and MGRA crosswalks. The module reads the original socioeconomic file and adds intersection count and several density variables that were originally generated by the 4D module of the current ABM2+ model. This is done here in RSM because the 4D module is skipped when running RSM. The module then uses the MGRA crosswalks between MGRA and RSM zones to aggregate the original socioeconomic file data based on the new RSM zones to create a new RSM specific socioeconomic file. Next, the module aggregates the following input files:

File Name Aggregation Columns Aggregation Methodology microMgraEquivMinutes.csv walkTime, dist, mmTime, mmCost, mtTime, mtCost, mmGenTime, mtGenTime, minTime Mapped MGRA to RSM zones and aggregated the columns by taking mean. microMgraTapEquivMinutes.csv walkTime, dist, mmTime, mmCost, mtTime, mtCost, mmGenTime, mtGenTime, minTime Mapped MGRA to RSM zones and aggregated the columns by taking mean. walkMgraTapEquivMinutes.csv boardingPerceived, boardingActual, alightingPerceived, alightingActual, boardingGain, alightingGain Mapped MGRA to RSM zones and aggregated the columns by taking mean. walkMgraEquivMinutes.csv percieved, actual, gain Mapped MGRA to RSM zones and aggregated the columns by taking mean. bikeTazLogsum.csv logsum, time Mapped TAZ to RSM zones and aggregated the columns by taking the mean. bikeMgraLogsum.csv logsum, time Mapped MGRA to RSM zones and aggregated the columns by taking the mean. zone.term terminal_time Mapped TAZ to RSM zones and took the maximum. zones.park park_zones Mapped TAZ to RSM zones and took the maximum. tap.ptype Mapping RSM zones to TAZs accessam.csv TIME, DISTANCE ParkLocationAlts.csv parkarea Mapped MGRA to RSM zones and took the minimum. CrossBorderDestinationChoiceSoaAlternatives.csv Mapping MGRA to RSM Zones TourDcSoaDistanceAlts.csv a, mgra It is recreated with RSM zones DestinationChoiceAlternatives.csv a, mgra It is recreated with RSM zones SoaTazDistAlts.csv a, dest It is recreated with RSM zones TripMatrices.csv CVM_ XX:LT, CVM_ XX:IT, CVM_ XX:MT, CVM_ XX:HT,CVM_XX:LNT, CVM_XX:INT, CVM_XX:MNT, CVM_XX:HNTwhere XX = EA, AM, MD, PM, EV Mapped TAZ to RSM zones and aggregated the columns by taking the sum. transponderModelAccessibilities.csv DIST,AVGTTS,PCTDETOUR Mapped TAZ to RSM zones and aggregated the columns by taking the mean. crossBorderTours.csv Mapped MGRA/TAZs to RSM zones internalExternalTrips.csv Mapped MGRA/TAZs to RSM zones visitorTours.csv Mapped MGRA to RSM zones visitorTrips.csv Mapped MGRA to RSM zones householdAVTrips.csv Mapped MGRA to RSM zones airport_out.SAN.csv Mapped MGRA/TAZ to RSM zones airport_out.CBX.csv Mapped MGRA/TAZ to RSM zones TNCtrips.csv Mapped MGRA/TAZ to RSM zones TRIP_ST_XX.CSVwhere ST (Sector Type) = FA, GO, IN, RE, SV, TH, WH; XX (Time Period) = OE, AM, MD, PM, OL Mapped TAZ to RSM zones

More details on the the above files can be found here.

"},{"location":"development.html#translate-demand","title":"Translate Demand","text":"

The translate demand module of the RSM aggregates the non-resident demand matrices and trip tables based on the new RSM zone structure. The inputs of this module includes the path to the RSM model directory, donor model directory and crosswalks. In particular the module aggregates the demand from auto, transit, non-motorized, other trips from the airport, cross border, internal external and visitor model. It also aggregated TNC vehicle trips and empty AV trips.

"},{"location":"development.html#intelligent-sampler","title":"Intelligent Sampler","text":"

The intelligent sampler module is designed to intelligently sample households and persons from synthetic households and person data, considering accessibility metrics and other parameters. The main inputs to this module are the households file, person file, TAZ/MGRA crosswalks and the outputs are sampled households and person files. In the model properties file (sandag_abm.properties), the user can choose to run RSM sampler, specify the default sampling rate, and minimum sampling rate for the RSM model run. The user also has the ability to sample specific zones at 100% by specifying them in the study area file and turn on the differential sampling indicator (use.differential.sampling equals to 1).

The sampler function follows these primary steps:

  1. Zone Mapping: The function maps zones from the synthetic households/person data to their corresponding RSM zones using crosswalk data.

  2. Household Sampling:

  3. If accessibility data is missing (first iteration) or if the RSM sampler is turned off, a default sampling rate is applied to all RSM zones, with optional 100% sampling in the study area.
  4. If accessibility data is available and the RSM sampler is turned on, the function calculates differences in accessibility metrics between the current and previous iterations. The sampling rates are determined based on these differences and are adjusted to be within specified bounds. The RSM zones of the study area are sampled at a 100% sampling rate if the differential sampling indicator is turned on.

  5. Households and Persons Selection: The function selects households based on the calculated sampling rates. It also selects persons associated with the sampled households.

  6. Output: The selected households and persons are written to output CSV files in the specified output directory. The function also computes and logs the total sampling rate, representing the proportion of selected households relative to the total number of households.

Note that in the current RSM deployment, sampler is set to use 25% default sampling rate. The intelligent sampler needs further testing to be used to sample households using the accessibility change.

"},{"location":"development.html#intelligent-assembler","title":"Intelligent Assembler","text":"

The intelligent assembler module assembles the trips of RSM model run and scale them appropriately based on the sampling rate of the RSM zones. The main inputs to this module are joint and individual trips from the donor and RSM model, households file, crosswalks for mapping zones, optional study area file and a flag to running the assembler.

The assembler function follows these primary steps:

  1. Load Trip Files: The function reads the individual and joint trip data for the RSM run. If the assembler is set to run (flag run_assembler equals 1), the function also loads the corresponding trip data from the donor model run.

  2. Assemble Trips: It converts individual and joint trip data from both the RSM run and the original model run into a common table format using a merging process. It separates trips made by households in the RSM run and those that were not resimulated. Then, it combines these trips to create the final assembled trip data, including individual and joint trips.

  3. Evaluation of Trip Changes: The function calculates and evaluates the percentage change in total trips by mode for each home zone. It aggregates trips made by households in the RSM and original model runs and compares their trip counts by mode. This information is used to assess the stability of travel behavior in different zones.

  4. Alternative Behavior (If Assembler is Off): If the assembler is turned off (flag run_assembler equals 0), the function scales the RSM individual and joint trips based on the specified default sampling rate. This alternative behavior is intended to simulate all trips as if they were selected, eliminating the need for the assembler. If the study area file is present and the differential sampling is turned on(use.differential.sampling equals to 1), then the trips made by residents of the study area are not scaled based on the RSM deafult sampling rate.

  5. Outputs: The function returns two outputs: individual trips containing the assembled individual trip data, and joint trips containing the assembled joint trip data. These data files are structured to align with the format required for further analysis or use by Java components.

In summary, the RSM assembler module takes multiple trip datasets and assembles them to create a unified dataset for further analysis, accommodating cases where only a subset of households were resimulated. The function also evaluates changes in trip behavior across different zones.

"},{"location":"development.html#user-experience","title":"User Experience","text":"

The RSM repurposes the ABM2+ Emme-based GUI. The options will be updated to reflect the RSM options, as will the input file locations and other parameters. The RSM user experience will, therefore, be nearly the same as the ABM2+ user experience.

"},{"location":"userguide.html","title":"User Guide","text":""},{"location":"userguide.html#rsm-setup","title":"RSM Setup","text":"

Below are the steps to setup an RSM scenario run:

  1. Set up an ABM run on the server\u2019s C drive* by using the ABM2+ release 14.2.2 scenario creation GUI located at T:\\ABM\\release\\ABM\\version_14_2_2\\dist\\createStudyAndScenario.exe.

    *running the model on the T drive and setting it to run on the local drive causes an error. An issue has been created on GitHub

  2. Open Anaconda Prompt and type the following command:

    python T:\\projects\\RSM\\setup\\setup_rsm.py [MODEL_RUN_DIRECTORY]

    Specifying the model run directory in the command line is optional. If it is not specified a dialog box will open asking the user to specify the model run.

  3. Change the inputs and properties as needed. Be sure to check the following:

    1. If running a new network, make sure the network files are correct
    2. Check that the RSM properties were appended to the property file and make sure the RSM properties are correct
    3. Check that the updated Tour Mode Choice UEC was copied over
  4. After opening Emme using start_emme_with_virtual_env.bat and opening the SANDAG toolbox in Modeller as usual, set the steps to skip all of the special market models and to run only 2 iterations. Most of these should be set automatically, though you may need to set it to skip the EE model manually.

    Figure 1: Steps to run in SANDAG model GUI for RSM run

"},{"location":"userguide.html#debugging","title":"Debugging","text":"

For crashes encountered in CT-RAMP, review the event log as usual. However, if it occurs during an RSM step, a new logfile called rsm-logging.log is created in the LogFiles folder.

"},{"location":"userguide.html#rsm-specific-changes","title":"RSM Specific Changes","text":""},{"location":"userguide.html#application","title":"Application","text":""},{"location":"userguide.html#bin","title":"Bin","text":""},{"location":"userguide.html#emme_project","title":"Emme_project","text":""},{"location":"userguide.html#input","title":"Input","text":""},{"location":"userguide.html#pythonemmetoolbox","title":"Python\\emme\\toolbox","text":""},{"location":"userguide.html#new-properties","title":"New Properties","text":""},{"location":"userguide.html#new-files","title":"New Files","text":"
  1. study_area.csv:

    This optional file specifies an explicit definition of how to aggregate certain zones, and consequentially, which zones to not aggregate. This is useful for project-level analysis as a modeler may want higher resolution close to a project but not be need the resolution further away. The file has two columns, taz and group. The taz column is the zone ID in the ABM zone system, and the group column indicates what RSM zone the ABM zone will be a part of. This will be the MGRA ID, and the TAZ ID being the MGRA ID added to the number of external zones. If a user doesn\u2019t want to aggregate any zones within the study area, the group ID should be distinct for all of them. Presently, all RSM zones defined in the study area are sampled at 100%, and the remaining zones are sampled at the sampling rate set in the property file.

    Any zones not within the study area will be aggregated using the standard RSM zone aggregating algorithm.

    An example of how the study area file works is shown below (assuming 12 external zones):

    Figure 2: ABM Zones

    Table 1: study_area.csv

    taz group 1 1 2 2 3 3 4 4 5 5 6 6

    Figure 3: Resulting RSM Zones

    For a practical example, see Figure 4, where a study area was defined as every zone within a half mile of a project. Note that within the study area, no zones were aggregated (as it was defined), but outside of the study area, aggregation occurred.

    Figure 4: Example Study Area

"},{"location":"visualizer.html","title":"Visualizer","text":""},{"location":"visualizer.html#introduction","title":"Introduction","text":"

The team developed a RSM visualizer tool to allow user to summarize and compare metrics from multiple RSM model runs. It is a dashboard style tool built using SimWrapper (an open source web-based data visualization tool for building disaggregate transportation simulations) and also leverages SANDAG\u2019s Data Pipeline Tool. SimWrapper software works by creating a mini file server to host reduced data summaries of travel model. The dashboard is created via YAML files, which can be customized to automate interactive report summaries, such as charts, summary tables, and spatial maps.

"},{"location":"visualizer.html#design","title":"Design","text":"

Visualizer has three main components:

"},{"location":"visualizer.html#data-pipeline","title":"Data Pipeline","text":"

SANDAG Data Pipeline Tool aims to aid in the process of building data pipelines that ingest, transform, and summarize data by taking advantage of the parameterization of data pipelines. Rather than coding from scratch, configure a few files and the tool will figure out the rest. Using pipeline helps to get the desired model summaries in a csv format. See here to learn how the tool works. Note that RSM visualizer currently supports a fixed set of summaries from the model and additional summaries can be easily incorporated into the pipeline by modifying the settings, processor and expression files.

"},{"location":"visualizer.html#post-processing","title":"Post Processing","text":"

Next, there is a post-processing script to perform all the data manipulations which are done outside of the data pipeline tool to prepare the data in the format required by SimWrapper. Similar to data pipeline, user can also modify this post-processing script to add any new summaries in order to bring them into the SimWrapper dashboard in order to use them in Simwrapper.

"},{"location":"visualizer.html#simwrapper","title":"SimWrapper","text":"

Lastly, the created summary files are consumed by SimWrapper to generate dashboard. SimWrapper is a web platform that can display either individual full-page data visualizations, or collections of visualizations in \u201cdashboard\u201d format. It expects your simulation outputs to just be local files on your filesystem somewhere; there is no need to upload the summary files to centralized database or cloud server to create the dashboard.

For setting up the visualization in SimWrapper, configuration files (in YAML format) are created that provide all the config details to get it up and running, such as which data to load, how to lay out the dashboard, what type of chart to create etc. Refer to SimWrapper documentation here to get more familiar with it.

"},{"location":"visualizer.html#setup","title":"Setup","text":"

The visualizer is currently deployed to compare 3 scenario runs at once. Running data pipeline and post-processing for each of those scenario is controlled thorugh the process_scenarios python script and configuration for scenarios are specified using the scenarios.yaml file. User will need to modify this yaml file to specify the scenarios they would like to compare using visualizer. There are two categories of scenarios to be specified - RSM and ABM (Donor Model) runs. For each of the scenario run, specify the directory of input and report folders in this configuration file. Files from input and report folder for the scenarios are then used in the data pipeline tool and post-processing step to create summaries in the processed folder of SimWrapper directory. Note that additional number of scenarios can be compared by extending the configuration in this file yaml file.

"},{"location":"visualizer.html#visualization","title":"Visualization","text":"

Currently there are five default visualization summaries in the visualizer:

"},{"location":"visualizer.html#bar-charts","title":"Bar Charts","text":"

These charts are for comparing VMT, mode shares, transit boardings and trip purpose by time-of-day distribution. Here is a snapshot of sample YAML configuration file for bar chart:

User can add as many charts as you want to the layout. For each chart, you should specify a csv file for the summaries and columns should match the csv file column name. There are also other specifications for the bar charts which you learn more about here.

Here is how the how the visual looks in the dashboard:

"},{"location":"visualizer.html#network-flows","title":"Network Flows","text":"

These charts are for comparing flows and VMT on the network. You can compare any two scenarios on one network. Here is a snapshot of the configuration file:

For each network you need the csv files for two scenario summaries and an underlying network file which should be in geojson format. The supporting script creates the geojson files from the model outputs for the SimWrapper. For more info on network visualization specification see here.

Here is how the how the visual looks in the dashboard:

"},{"location":"visualizer.html#sample-rate-map","title":"Sample Rate Map","text":"

This visual is a map for showing the RSM sample rates for each zone. Here is a snapshot of the configuration [file]:

For each map you need a csv file of sample rates and the map of zones in .shp format. For more info on network visualization specification see here.

Here is how the how the visual looks in the dashboard:

"},{"location":"visualizer.html#zero-car-map","title":"Zero Car Map","text":"

This visual is a map for showing the zero-car household distribution. Here is a snapshot of the configuration file:

For each map you need a csv file of household rates and the map of zones in .shp format. For more info on network visualization specification see here

Here is how the how the visual looks in the dashboard:

"},{"location":"visualizer.html#od-flows","title":"OD Flows","text":"

This chart is for showing OD trip flows. Here is a snapshot of the configuration file:

For each map you need a csv file of od trip flows and the map of zones in .shp format. For more info on network visualization specification see here

Here is how the how the visual looks in the dashboard:

You can also modify the data and configuration of each visual on SimWrapper server. For each visual, there is a configuration button (see below), where you can add data, and modify all the map configurations. You can also export these configurations into a YAML file so you can use it in future.

"},{"location":"visualizer.html#how-to-run","title":"How to Run","text":"

The first step to run the visualizer is to bring in the scenario files. Currently the visualizer is setup to compare three scenarios: donor_ model, rsm_base and rsm_scen. donor_model is the ABM run, rsm_base is the baseline (no-action) RSM run and rsm_scen is the project (action) RSM run.

As mentioned earlier, if you wish to add any more RSM scenarios for comaprison, you can do it by modifying the scenarios.yaml file. Simply add the scenario configuration by copying the rsm_scen section and paste it under and change \u201crsm_scen\u201d to that new scenario name. Note that you will also need to add that another scenario config to the Data Pipeline and Post-Processing step.

Once you have copied required scenario files and the configuration setup, you are ready to runt the visualizer.

"}]} \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml index 327682d..836575b 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -2,32 +2,32 @@ https://sandag.github.io/rsm/index.html - 2023-09-08 + 2023-09-13 daily https://sandag.github.io/rsm/api.html - 2023-09-08 + 2023-09-13 daily https://sandag.github.io/rsm/assessment.html - 2023-09-08 + 2023-09-13 daily https://sandag.github.io/rsm/development.html - 2023-09-08 + 2023-09-13 daily https://sandag.github.io/rsm/userguide.html - 2023-09-08 + 2023-09-13 daily https://sandag.github.io/rsm/visualizer.html - 2023-09-08 + 2023-09-13 daily \ No newline at end of file diff --git a/sitemap.xml.gz b/sitemap.xml.gz index 82706b97821efe22434ffdc594ba226e1401960b..f10ac32eb8c7ec54e26fca0f813c6ddc4d321947 100644 GIT binary patch delta 240 zcmV#tA7ai$4&b19@xZAIM+K%5mF21ONb*lT%=ZLzgw|V?oi^7>fd~7#4Xbi0g(fr qHVmm@0vpDSQ)Yr_GHTc`New PropertiesNew Files
    diff --git a/userguide.md b/userguide.md index 8492736..3f9b1f8 100644 --- a/userguide.md +++ b/userguide.md @@ -103,6 +103,8 @@ For crashes encountered in CT-RAMP, review the event log as usual. However, if i * Maps MGRAs to aggregated zones - Cluster.zone.centroid.file * Latitude and longitude coordinates of aggregated zone centroids +- use.differential.sampling + * If set to 1, study area zones will be sampled at 100%. If set to 0, every zone will be sampled at the deafult sampling rate. #### New Files 1. study_area.csv: