Spark stat analyzer

Proof of concept (POC) that generates consolidated statistics from JSON files produced by navitia-stat-logger or navitia-stat-exporter.

Pre-requisites

  • Spark 1.5+ (may work with previous versions, but untested)
  • A directory of exported statistics files, stored in a tree like this:
  |
  \- <year>
      |
      \- <month>
          |
          \- <day>
              |
              \- <files>.json.log(.gz)

The files are JSON logs (one JSON object per line) and may be compressed with gzip.
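
To make the layout above concrete, here is a minimal Python 3 sketch (not taken from the project's scripts) that lists the day directories for a date range and reads one JSON object per line from a plain or gzipped file. The helper names `day_directories` and `read_json_lines` are hypothetical. The sketch also assumes that month and day directories are zero-padded to two digits and that the end date is inclusive; neither is stated here.

```python
import gzip
import json
import os
from datetime import date, timedelta


def day_directories(export_dir, start, end):
    """Yield <export_dir>/<year>/<month>/<day> for each day from start to end.

    Assumptions (not confirmed by the README): the end date is inclusive and
    month/day directory names are zero-padded to two digits.
    """
    current = start
    while current <= end:
        yield os.path.join(
            export_dir,
            "%04d" % current.year,
            "%02d" % current.month,
            "%02d" % current.day,
        )
        current += timedelta(days=1)


def read_json_lines(path):
    """Yield one parsed JSON object per non-empty line; handles .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

For example, `list(day_directories("/data/stats", date(2016, 1, 30), date(2016, 2, 1)))` yields the three directories for 30 January, 31 January and 1 February under these assumptions.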

Usage

  • For requests_calls consolidation
<path/to/spark>/bin/spark-submit  --conf spark.ui.showConsoleProgress=true --master=local[3] requests_calls.py <your_export_directory> <start_date> <end_date>

where:

  • start_date and end_date are in YYYY-MM-DD format

  • For coverage_journeys consolidation

<path/to/spark>/bin/spark-submit  --conf spark.ui.showConsoleProgress=true --master=local[3] coverage_journeys.py <your_export_directory> <start_date> <end_date>

where:

  • start_date and end_date are in YYYY-MM-DD format
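
Both scripts take their dates as YYYY-MM-DD strings. How the scripts parse them is not shown here, so the following is only a hypothetical sketch of checking such arguments before launching a job. `parse_day` and `parse_range` are illustrative names, and rejecting a start date later than the end date is an assumption.

```python
from datetime import datetime


def parse_day(value):
    """Parse a year-month-day string into a date; raises ValueError if unreadable."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def parse_range(start_text, end_text):
    """Return (start, end) as dates, assuming start must not come after end."""
    start, end = parse_day(start_text), parse_day(end_text)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end
```

A call such as `parse_range("2016-01-01", "2016-01-31")` returns a pair of `date` objects, while a value like `"01/31/2016"` raises `ValueError`.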

Note that the results are stored in the export directory.