POC of a Navitia stat consolidator using Apache Spark

niko64fx/spark-stat-analyzer

Spark stat analyzer

Proof of concept (POC) that generates consolidated statistics from JSON files produced by navitia-stat-logger or navitia-stat-exporter.

Pre-requisites

  • Spark 1.5+ (earlier versions may work but are untested)
  • A directory of exported statistics files, laid out as a tree like:
  |
  \- <year>
      |
      \- <month>
          |
          \- <day>
              |
              \- <files>.json.log(.gz)

The files are JSON logs (one JSON object per line). They may be compressed with gzip.
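To make the file format concrete, here is a minimal sketch in plain Python (no Spark) that reads one such file. The function name `read_stat_file` is hypothetical and not part of this project, and detecting compression by the `.gz` suffix is an assumption based on the file naming shown above.

```python
import gzip
import json
from pathlib import Path


def read_stat_file(path):
    """Yield one parsed JSON object per non-empty line of a stat file."""
    path = Path(path)
    # Assumption: gzip-compressed files are recognizable by a ".gz" suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines between records
                yield json.loads(line)
```

Because it is a generator, a large log file is parsed line by line rather than loaded into memory at once.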

Usage

  • For requests_calls consolidation
<path/to/spark>/bin/spark-submit  --conf spark.ui.showConsoleProgress=true --master=local[3] requests_calls.py <your_export_directory> <start_date> <end_date>

where:

  • start_date and end_date are in YYYY-MM-DD format

  • For coverage_journeys consolidation

<path/to/spark>/bin/spark-submit  --conf spark.ui.showConsoleProgress=true --master=local[3] coverage_journeys.py <your_export_directory> <start_date> <end_date>

where:

  • start_date and end_date are in YYYY-MM-DD format

Note that the results are written to the export directory.
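As an illustration of how the two date arguments could relate to the directory tree described under Pre-requisites, the sketch below expands a date range into one path per day. This is not taken from the scripts themselves: the function names are hypothetical, and both the inclusive end date and the zero-padded month and day directory names are assumptions.

```python
from datetime import datetime, timedelta
from pathlib import Path


def parse_day(value):
    """Parse a YYYY-MM-DD string; raises ValueError if it cannot be parsed."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def day_directories(export_dir, start_date, end_date):
    """Return one <year>/<month>/<day> path per day in the range."""
    start, end = parse_day(start_date), parse_day(end_date)
    if end < start:
        raise ValueError("end_date must not be before start_date")
    # Assumption: the range includes both start_date and end_date.
    days = (end - start).days + 1
    # Assumption: month and day directory names are zero-padded.
    return [
        Path(export_dir) / f"{d.year:04d}" / f"{d.month:02d}" / f"{d.day:02d}"
        for d in (start + timedelta(days=i) for i in range(days))
    ]
```

For example, `day_directories("/data/export", "2016-02-28", "2016-03-01")` yields three paths, one each for 28 February, 29 February and 1 March 2016.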
