Temporal Opportunity Density and Dual Accessibility #884
This draft implementation finds the first N destination points in a pointset. No consideration is given to the number of opportunities at each point. One use case for this was finding the single closest destination.

However, the linked issue refers to "dual accessibility". If "accessibility" is defined as the number of opportunities reached in a fixed amount of time, the "dual" of accessibility (in the mathematical sense of the word) is the amount of time taken to reach a fixed number of opportunities. This is not the same as the closest N points and requires a different implementation. The simplest is to retain travel times to all destinations (as if we were building a full OD travel time matrix) and sort them before finding the threshold point. This would be less space and time efficient than the current "closest N points" approach, but not prohibitively slow, so we should probably do it this way and optimize later.

We probably want to allow both kinds of output: the time to each of the closest N points, and the time to reach M opportunities. But it must be possible to disable output of the closest N points and the underlying complete set of travel times because they can be very voluminous.

One thing that makes this tricky is that for code reuse (and to avoid duplication of computational effort) we want to enable the same data structure/method to accumulate data for either full OD matrices or dual accessibility. But in some cases we want to report the full contents of that data structure and in other cases we want to summarize it as dual accessibility and/or closest N points. This means we need to separately enable accumulation of data and three different ways of summarizing those data.
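To make the difference from "closest N points" concrete, here is a minimal sketch of the sort-based approach, assuming parallel arrays of travel times and opportunity counts per destination. The class and method names are hypothetical and not taken from the R5 codebase; treating the threshold as reached when the cumulative count is greater than or equal to it, and returning -1 when it is never reached, are also assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration only; not the implementation in this PR. */
final class SortedDualAccessSketch {

    /**
     * Sorts destinations by travel time, then accumulates their opportunity counts
     * until the threshold is reached. Returns the travel time (in minutes) of the
     * destination at which the threshold is first reached, or -1 if it never is.
     */
    static int minutesToReach(int[] travelTimeMinutes, double[] opportunities, double threshold) {
        // Sort destination indexes by travel time, leaving the input arrays untouched.
        Integer[] order = new Integer[travelTimeMinutes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingInt(i -> travelTimeMinutes[i]));

        double cumulative = 0;
        for (int i : order) {
            cumulative += opportunities[i];
            if (cumulative >= threshold) {
                return travelTimeMinutes[i];
            }
        }
        return -1; // Assumed sentinel: threshold not reached at any destination.
    }
}
```

With destinations at 5, 10, and 20 minutes holding 1, 100, and 1000 opportunities, the closest single point is 5 minutes away, but reaching 50 opportunities takes 10 minutes, which is why the two outputs need separate summaries.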
Hmm, we have |
On further thought, a histogram of how many destinations are reached at each minute is just the discrete derivative of accessibility. In single-point requests where we ask the worker to report all 120 cutoffs, the marginal increase in accessibility as we increase the cutoff by 1 minute is the number of destinations in that histogram bin.

In fact we don't even need to materialize the histogram at all. Iterating over the cutoffs starting at 1, we can simply note the cutoff value where the cumulative access curve crosses the desired threshold and bail out. We currently request all 120 cutoffs in single point tasks, but only a few cutoffs in regional tasks which each become separate regional results. It might be tricky though to selectively enable all 120 cutoffs in only those regional analyses where we want to report dual accessibility, without inadvertently generating 120 sets of gridded accessibility results. It's probably simpler and more maintainable to build up that one-minute-resolution histogram separately, independent of how many cutoffs are specified. It should be very lightweight and fast to construct.

Also consider that this histogram or the dual of accessibility can only be easily derived from the cumulative accessibility curve when using the step function. When using other decay functions we might still want to know at exactly which minute the opportunities were located - this allows some interesting/informative visualizations alongside the resulting cumulative accessibility curve. This is another argument for accumulating and reporting the histogram independently from the accessibility values.

Of course maybe someone actually wants to record dual accessibility where the threshold accessibility indicator value is computed using a custom decay function. For example, how many minutes of travel to reach 100k jobs, using logistic decay to weight the jobs. In that case you would again need to compute the indicator value at every cutoff - this can be done by essentially issuing a single point request at each origin and scanning over the 120 cutoffs until the threshold value is exceeded. So it can currently be done in a script calling our API but not in a regional analysis.
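The relationship between the one-minute histogram and step-function accessibility can be sketched as follows. This is only an illustration under assumptions: travel times arrive in seconds, negative values mean unreached, and bin `m` holds opportunities reached in the interval from `m` up to (but not including) `m + 1` minutes. The names are hypothetical, not the ones used in this PR.

```java
/** Hypothetical illustration only; not the implementation in this PR. */
final class TemporalDensitySketch {

    static final int MAX_MINUTES = 120;

    /** Builds the histogram: density[m] is the opportunity count reached during minute m. */
    static double[] density(int[] travelTimeSeconds, double[] opportunities) {
        double[] density = new double[MAX_MINUTES];
        for (int i = 0; i < travelTimeSeconds.length; i++) {
            int seconds = travelTimeSeconds[i];
            int minute = seconds / 60;
            // Skip unreached destinations (assumed negative) and those beyond the last bin.
            if (seconds >= 0 && minute < MAX_MINUTES) {
                density[minute] += opportunities[i];
            }
        }
        return density;
    }

    /** Running sum of the histogram: access[c] is step-function accessibility at cutoff c minutes. */
    static double[] cumulative(double[] density) {
        double[] access = new double[density.length + 1];
        for (int m = 0; m < density.length; m++) {
            access[m + 1] = access[m] + density[m];
        }
        return access;
    }
}
```

Because the histogram is accumulated directly from travel times, it does not depend on how many cutoffs the request specifies, which matches the argument above for building it separately.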
still no way to collate and export results for a regional analysis
- method to compute dual accessibility from temporal opportunity density
- maintain nearby opportunities list in sorted order while constructing

Still needs:
- task parameters to enable these outputs
- collation of results in regional analyses

- Enable opportunity temporal density, nearest n opportunities, and dual accessibility in AnalysisRequest and AnalysisWorkerTask
- CSVResultWriter recording opportunity density and dual accessibility
Simpler and more maintainable, given that there's no immediate demand for this information among users.
Looks good. I added a few minor clarifications/suggestions in-line, and these can be addressed later. To test, we'll check that with freeform origins, `includeTemporalDensity: true`, and `dualAccessibilityThreshold` set to a positive integer, regional results yield a CSV with the expected travel time density and dual accessibility result.
* The data retained here feed into three different kinds of results: "Dual" accessibility (the number of opportunities
* reached in a given number of minutes of travel time); temporal opportunity density (analogous to a probability density
* function, how many opportunities are encountered during each minute of travel, whose integral is the cumulative
* accessibility curve).
Perhaps part of this comment was unintentionally deleted? Otherwise,
* The data retained here feed into two different kinds of results: "Dual" accessibility (the number of minutes of
* travel time needed to reach a given number of opportunities); and temporal opportunity density (analogous to a probability density
* function, how many opportunities are encountered during each minute of travel, whose integral is the cumulative
* accessibility curve).
* Note that this is one histogram _per target_ showing on how many iterations each travel time is the fastest,
* _not_ one histogram per origin/percentile showing how many destinations are reached at each travel time. The
* latter is essentially the discrete derivative of step-function accessibility and is tracked elsewhere (TemporalDensityResult).
Helpful to avoid future confusion 👍
/**
* This handles collating regional results into CSV files containing temporal opportunity density
* (number of opportunities reached in each one-minute interval, the derivative of step-function accessibility)
* as well as "dual" accessibility (the amount of time needed to reach n opportunities).
"Dual" accessibility will be recorded as -1 if `n` is not specified, or if the time needed exceeds 120 minutes.
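A rough sketch of that convention, assuming a 120-bin one-minute density array and treating a non-positive threshold as "not specified". The names are hypothetical, and reporting the end of the bin (`m + 1`) as the time needed is an assumption rather than something confirmed by the diff.

```java
/** Hypothetical illustration only; not the implementation in this PR. */
final class DualAccessFromDensitySketch {

    /**
     * Scans the per-minute opportunity density and returns the first whole-minute
     * cutoff at which the cumulative count reaches the threshold. Returns -1 when
     * no threshold is given or when it is not reached within the histogram.
     */
    static int dualAccessibility(double[] densityPerMinute, int threshold) {
        if (threshold <= 0) {
            return -1; // Assumed meaning: threshold n not specified.
        }
        double cumulative = 0;
        for (int m = 0; m < densityPerMinute.length; m++) {
            cumulative += densityPerMinute[m];
            if (cumulative >= threshold) {
                return m + 1; // Opportunities in bin m are all reached by cutoff m + 1.
            }
        }
        return -1; // Not reached within the histogram (120 minutes for a 120-bin array).
    }
}
```

Consumers of the CSV would then need to treat -1 as "no value" rather than as a travel time.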
Is the convenience method added here used elsewhere? I did not see it in a quick search of the diff.
This is an initial draft of the feature described in #875: finding the closest N destinations to each origin in regional analyses and reporting them back to the broker. Implementation should eventually be improved (e.g. some abstraction or ordering in the sorting of retained nearby destinations). There is not yet any way to collate the results to CSV or grid files for display and use.