GitHub mining replication package for the article "Microservices in the Wild: the Github Landscape". It is a short node program that takes a prefiltered set of GitHub repositories (filtered with Google BigQuery) and uses the GitHub API to find the ones that have X number of stars.

gpdeltedesco/mining-github-microservices

 
 

GitHub microservices miner - GraphQL version

Fetches repositories data, using GitHub's GraphQL API, and keeps a local index for further analysis.

What does it do?

The program obtains data in two stages:

  • index: Given a set of GraphQL queries, fetches data for the list of matching repositories.
  • fetch: For every indexed repository, fetches its corresponding git repository.

Also, it pushes an excerpt of collected data to Solr, creating a collection ready to be queried.
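
As an illustration of the kind of request the index stage could send, the sketch below builds a JSON body for a repository search against GitHub's GraphQL API. The search string, the selected fields, the helper name build_payload and the GITHUB_TOKEN variable are assumptions made for this example. They are not the queries or configuration shipped with this tool.

```shell
#!/bin/sh
# Build the JSON body for a GitHub GraphQL repository search.
# NOTE: build_payload, the search string and the selected fields are
# illustrative assumptions, not the queries used by this tool.
build_payload() {
    search="$1"   # GitHub search syntax; must not contain double quotes or backslashes
    first="$2"    # page size (the API accepts at most 100 per page)

    # The search string is passed as a GraphQL variable instead of being
    # spliced into the query text.
    query='query($q: String!, $n: Int!) { search(query: $q, type: REPOSITORY, first: $n) { repositoryCount nodes { ... on Repository { nameWithOwner url stargazerCount } } } }'

    printf '{"query": "%s", "variables": {"q": "%s", "n": %d}}\n' "$query" "$search" "$first"
}

# Usage (needs network access and a token in GITHUB_TOKEN, an assumed name):
#   build_payload 'microservices stars:>=10' 5 |
#     curl -s -H "Authorization: bearer $GITHUB_TOKEN" -d @- https://api.github.com/graphql
```

Keeping the payload builder separate from the HTTP call makes it possible to inspect the request without spending API quota.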

Usage with Docker

This tool is ready for use with Docker Compose, which will configure two services/containers:

Requirements

This version requires:

Install

  1. Clone this repo, and go to project directory

    git clone https://github.com/gpdeltedesco/mining-github-microservices
    cd mining-github-microservices
    
  2. Configure the application, setting (at least) your GitHub OAuth token

    cp config.ini.dist config.ini
    sensible-editor config.ini
    

Everything else will be built/configured on first run:

docker-compose up

If you make config changes later, please remember to rebuild the php container, executing:

docker-compose build php

Run

The main executable lives in the php container. Run docker-compose run php bin/miner for a list of available commands.

Note that, for persistence, a runtime directory will be created (this directory, and any file in it, will be owned by the root user).

Usage from CLI (development)

Requirements

This version requires:

Install

  1. Clone this repo, and go to project directory

    git clone https://github.com/gpdeltedesco/mining-github-microservices
    cd mining-github-microservices
    
  2. Create the runtime directory (for local data storage and logs)

    mkdir runtime
    
  3. Create and provision database

    sqlite3 runtime/store.sqlite < database.sql
    
  4. Install composer dependencies

    composer install
    
  5. Configure application, setting (at least) your OAuth token

    cp config.ini.dist config.ini
    sensible-editor config.ini
    

You are ready to go!
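
Before the first run, the results of the install steps above can be checked with a small script. This is a minimal sketch and not part of the tool: check_setup is a hypothetical helper, and the vendor entry assumes Composer's default install directory has not been changed.

```shell
#!/bin/sh
# Report which artifacts from the install steps are missing under a
# project directory (defaults to the current directory).
# Prints one "missing: <path>" line per absent item and returns non-zero
# if anything is missing.
# NOTE: check_setup is a hypothetical helper; "vendor" assumes Composer's
# default vendor directory.
check_setup() {
    root="${1:-.}"
    status=0
    for path in config.ini runtime runtime/store.sqlite vendor; do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return "$status"
}

# Usage, from the project directory:
#   check_setup . && bin/miner
```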

Run

Execute bin/miner for a list of commands.

Languages

  • PHP 97.6%
  • Dockerfile 2.4%