serum_to_s3

A Node.js data pipeline that scrapes Serum DEX events and pushes them to an S3 bucket for loading into a Snowflake cluster.

The event schema is derived from SerumTaxTime, the Node pipeline architecture is inspired by the 0x Data Pipeline, and the Serum event scraper code comes from Mango Markets' Serum History.

The basic logic of the scraper works like this (see the sketch after the list):

  1. Pull the list of Serum markets from Serum's repo
  2. Iterate over the markets, scrape the new trades, and append them to a CSV
  3. Check whether the CSV has grown past 250 MB (Snowflake's recommended batch size)
  4. If it has, upload it to S3 (Snowpipe takes over from there) and start a new CSV for the scrapers to append to
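
Below is a minimal sketch of that batching loop. The helpers `getSerumMarkets` and `scrapeNewTrades` are hypothetical stand-ins for the repo's actual scraper code, and the AWS SDK v3 S3 client is assumed for the upload:

```ts
import * as fs from "fs";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const MAX_CSV_BYTES = 250 * 1024 * 1024; // Snowflake's recommended batch size
const s3 = new S3Client({ region: process.env.AWS_REGION });

// Placeholder: the real pipeline pulls the market list from Serum's repo.
async function getSerumMarkets(): Promise<string[]> {
  return [];
}

// Placeholder: the real per-market scraper is described in the next section.
async function scrapeNewTrades(market: string): Promise<string[]> {
  return []; // CSV rows for the new fills
}

async function runOnce(csvPath: string): Promise<void> {
  // 1. pull the market list, 2. scrape each market and append to the CSV
  for (const market of await getSerumMarkets()) {
    const rows = await scrapeNewTrades(market);
    if (rows.length > 0) fs.appendFileSync(csvPath, rows.join("\n") + "\n");
  }

  // 3. check whether the batch is large enough to hand off to Snowpipe
  if (fs.statSync(csvPath).size > MAX_CSV_BYTES) {
    // 4. upload the batch and rotate to a fresh CSV
    await s3.send(new PutObjectCommand({
      Bucket: process.env.S3_BUCKET!,
      Key: `serum-events/${Date.now()}.csv`,
      Body: fs.readFileSync(csvPath),
    }));
    fs.truncateSync(csvPath); // the real code may open a new file instead
  }
}
```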

How the scraper works

The Serum DEX scraper depends on the structure of the Serum DEX contracts. The contracts store all filled orders in a rotating buffer, and each order pushed onto the queue increments the sequence number by one. The queue header returned in the JSON RPC response carries the current sequence number, so the difference between it and the last sequence number you saw tells you how many new events to pull from the response.
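
As a rough illustration of that sequence-number arithmetic, here is a sketch in which the types and the `fetchEventQueue` helper are hypothetical, standing in for the repo's actual event-queue decoding:

```ts
// Illustrative types; the real decoded event queue carries more fields.
interface FillEvent { seqNum: number; price: number; size: number; }
interface EventQueue { headerSeqNum: number; events: FillEvent[]; }

// Placeholder: the real scraper decodes the event queue account data
// returned by the JSON RPC call for the market.
async function fetchEventQueue(marketAddress: string): Promise<EventQueue> {
  throw new Error("decode the market's event queue account here");
}

async function pullNewEvents(
  marketAddress: string,
  lastSeqNum: number
): Promise<FillEvent[]> {
  const queue = await fetchEventQueue(marketAddress);

  // Each event pushed onto the ring buffer bumps the header's sequence
  // number by one, so the difference tells us how many events are new.
  const newCount = queue.headerSeqNum - lastSeqNum;
  if (newCount <= 0) return []; // nothing new since the last scrape

  // Keep only the newest `newCount` events from the decoded buffer.
  return queue.events.slice(-newCount);
}
```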

So each scraper is responsible for a single market: it keeps track of the last sequence number it saw in a file under pipeline/, pulls the new events based on this watermark, and then writes the updated watermark back to the file.
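
A minimal sketch of that watermark bookkeeping, assuming one JSON file per market under pipeline/ (the repo's actual file naming and format may differ):

```ts
import * as fs from "fs";
import * as path from "path";

const WATERMARK_DIR = "pipeline";

function readLastSeqNum(marketName: string): number {
  const file = path.join(WATERMARK_DIR, `${marketName}.json`);
  if (!fs.existsSync(file)) return 0; // first run: no watermark yet
  return JSON.parse(fs.readFileSync(file, "utf8")).lastSeqNum;
}

function writeLastSeqNum(marketName: string, lastSeqNum: number): void {
  fs.mkdirSync(WATERMARK_DIR, { recursive: true });
  fs.writeFileSync(
    path.join(WATERMARK_DIR, `${marketName}.json`),
    JSON.stringify({ lastSeqNum })
  );
}
```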

Installation

Use yarn to install dependencies. Set up a .env file in the same folder, using sample.env as a template.

Running

Use yarn dev to run.