bigquery-autojob

Note: Documentation is currently not in sync with the rewrite, updating soon...

A Google Cloud Function that provides a simple, configurable way to automatically load data from GCS files into BigQuery tables.

It takes a convention-over-configuration approach and provides a sensible default configuration for common file formats (CSV, JSON, Avro, ORC, Parquet):

- The table name is automatically derived from the file's name, minus the extension and any date/timestamp suffix.
- BigQuery autodetection is enabled.
- Avro logical types are used.
- New data is appended to the table.
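The defaults listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `deriveTableName`, `defaultLoadOptions`, `SOURCE_FORMATS`, and the `DATE_SUFFIX` pattern are hypothetical names and assumptions. In particular, the exact date/timestamp suffix formats the function recognises are not stated here, so the pattern below only assumes an 8-digit date with an optional 6-digit time. The option keys (`sourceFormat`, `autodetect`, `useAvroLogicalTypes`, `writeDisposition`) are the field names of a BigQuery load job configuration.

```typescript
// Hypothetical sketch of the default conventions; not the project's real implementation.

// BigQuery load-job source formats keyed by lower-case file extension.
const SOURCE_FORMATS: Record<string, string> = {
  csv: "CSV",
  json: "NEWLINE_DELIMITED_JSON",
  avro: "AVRO",
  orc: "ORC",
  parquet: "PARQUET",
};

interface LoadDefaults {
  tableId: string;
  sourceFormat: string;
  autodetect: boolean;
  useAvroLogicalTypes: boolean;
  writeDisposition: string;
}

// Assumed suffix shape: a separator, an 8-digit date, and optionally a
// 6-digit time, e.g. "_20190131" or "-20190131T235959".
const DATE_SUFFIX = /[-_.]\d{8}(?:[-_T.]?\d{6})?$/;

// Strip any folder prefix, the extension, and the assumed date suffix.
function deriveTableName(fileName: string): string {
  const parts = fileName.split("/");
  const base = parts[parts.length - 1];
  const dot = base.lastIndexOf(".");
  const stem = dot > 0 ? base.slice(0, dot) : base;
  return stem.replace(DATE_SUFFIX, "");
}

// Build the default load options for a file, based on its extension.
function defaultLoadOptions(fileName: string): LoadDefaults {
  const ext = fileName.slice(fileName.lastIndexOf(".") + 1).toLowerCase();
  const sourceFormat = SOURCE_FORMATS[ext];
  if (sourceFormat === undefined) {
    throw new Error(`Unsupported file extension: ${ext}`);
  }
  return {
    tableId: deriveTableName(fileName),
    sourceFormat,
    autodetect: true,                 // autodetection enabled
    useAvroLogicalTypes: true,        // Avro logical types are used
    writeDisposition: "WRITE_APPEND", // new data is appended to the table
  };
}
```

Under these assumptions, a file named `cities_20190131.csv` would be loaded into a table named `cities` using the `CSV` source format.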

If the default behaviour does not suit your needs, it can be modified for all or certain files through mapping files or custom metadata.
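One way to picture this override mechanism is as successive layers of options. The sketch below is only an illustration of that idea: the real mapping-file format and the precedence rules are described in the project wiki and are not reproduced here. `resolveOptions`, `Mapping`, and `LoadOptions` are hypothetical names, mappings are assumed to match files by regular expression, and custom metadata is assumed to take precedence over mappings.

```typescript
// Hypothetical sketch of layered configuration; names and precedence are assumptions.

// Option bag whose keys mirror BigQuery load-job configuration fields.
type LoadOptions = Record<string, string | boolean | number>;

interface Mapping {
  pattern: RegExp;      // which files this mapping applies to (assumed shape)
  options: LoadOptions; // options that override the defaults for those files
}

function resolveOptions(
  fileName: string,
  defaults: LoadOptions,
  mappings: Mapping[],
  metadata: LoadOptions = {},
): LoadOptions {
  // Start from the defaults, then apply every matching mapping in order.
  let resolved: LoadOptions = { ...defaults };
  for (const mapping of mappings) {
    if (mapping.pattern.test(fileName)) {
      resolved = { ...resolved, ...mapping.options };
    }
  }
  // Per-file custom metadata is applied last (assumed to win over mappings).
  return { ...resolved, ...metadata };
}

// Example: replace the table contents for "cities" files instead of appending,
// and skip one header row for this particular file.
const exampleOptions = resolveOptions(
  "cities_20190131.csv",
  { autodetect: true, writeDisposition: "WRITE_APPEND" },
  [{ pattern: /^cities/, options: { writeDisposition: "WRITE_TRUNCATE" } }],
  { skipLeadingRows: 1 },
);
```

Files that match no mapping and carry no custom metadata simply keep the defaults.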

Quickstart

Create a new bq-autoload Google Cloud Storage bucket

```
$> gsutil mb -c regional -l europe-west1 "gs://bq-autoload"
```

Create a new Staging BigQuery dataset
```
$> bq mk --dataset "Staging"
```

Clone and deploy this repository as a cloud function triggered by changes on this GCS bucket (do not forget to replace the project id)

```
$> git clone "https://github.com/tfabien/bigquery-autoload/"              \
   && cd "bigquery-autoload"                                              \
   && npm install -g typescript                                           \
   && npm install                                                         \
   && npm run build                                                       \
   && gcloud functions deploy "bq-autoload"                               \
          --entry-point autoload                                          \
          --trigger-bucket "bq-autoload"                                  \
          --set-env-vars "PROJECT_ID={{YOUR_GCP_PROJECT_ID}}"             \
          --runtime "nodejs10"                                            \
          --memory "128MB"                                                \
          --region europe-west1
```

That's it 👍

Any file you upload to the bq-autoload GCS bucket will now automatically be loaded into a BigQuery table within seconds.

Usage

See the wiki for usage samples and advanced configuration.
