ETL Service (REST API performing extract, transform and load operations) using Scala & http4s
Normally, I use Play Framework and Akka HTTP to build REST APIs in Scala and Java. For this example, however, I used http4s because I needed to build the design on FP fundamentals. http4s made this quite easy, as it supports typeful, functional design, and composition is straightforward thanks to the Cats library ecosystem.
- Extensible and maintainable design
- Built on FP fundamentals.
- Unit tests and API (integration) tests are included.
- Logging and monitoring are implemented where needed.
- Error handling with explicit error codes.
- EtlSequence - The main object of the application. It holds the full details of the sequence of operations the tool can perform, that is, operations composed one after the other.
- Operation - Holds the details of an individual operation (word count, word frequency, caps or replace) along with its body details.
Transformation and aggregation operations: the objects below are used to turn HTTP requests into typed objects, which are then passed to the service layer for further processing.
- EtlRequest
- CountRequest
- FrequencyRequest
- CapsRequest
- ReplaceRequest
- SequenceRequest
- EtlResult
- AggregationResult
- TransformationResult
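To make the idea of composing operations one after the other more concrete, here is a minimal, dependency-free sketch. It is not the project's actual code: the `EtlSketch` object, the `Operation` cases, their fields, and the rule that nothing may follow an aggregation are all assumptions made for illustration. Only the names `AggregationResult` and `TransformationResult` come from the list above.

```scala
// Hypothetical sketch: names, fields and the sequencing rule are assumptions,
// not the real definitions from the etl-service code base.
object EtlSketch {
  sealed trait Operation
  case object WordCount extends Operation
  final case class WordFrequency(word: String) extends Operation
  case object Caps extends Operation
  final case class Replace(from: String, to: String) extends Operation

  sealed trait EtlResult
  final case class AggregationResult(value: Int) extends EtlResult
  final case class TransformationResult(text: String) extends EtlResult

  // Split on whitespace and drop empty tokens.
  private def words(text: String): List[String] =
    text.split("\\s+").toList.filter(_.nonEmpty)

  // Apply a single operation to a piece of text.
  def applyOperation(text: String, op: Operation): EtlResult = op match {
    case WordCount           => AggregationResult(words(text).size)
    case WordFrequency(word) => AggregationResult(words(text).count(_ == word))
    case Caps                => TransformationResult(text.toUpperCase)
    case Replace(from, to)   => TransformationResult(text.replace(from, to))
  }

  // Fold the operations left to right. A transformation feeds its text into
  // the next step; once an aggregation has produced a number, any further
  // operation is rejected as a non-logical sequence (assumed rule).
  def runSequence(text: String, ops: List[Operation]): Either[String, EtlResult] = {
    val start: Either[String, EtlResult] = Right(TransformationResult(text))
    ops.foldLeft(start) {
      case (Right(TransformationResult(current)), op) =>
        Right(applyOperation(current, op))
      case (Right(_: AggregationResult), op) =>
        Left(s"Non-logical sequence: $op cannot follow an aggregation")
      case (failed, _) =>
        failed
    }
  }
}
```

With this sketch, `EtlSketch.runSequence("hello world hello", List(EtlSketch.Caps, EtlSketch.WordFrequency("HELLO")))` upper-cases the text first and then counts the matching word, while a sequence that places `Caps` after `WordCount` yields a `Left`.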
- All route and request/response validation test cases are in the package `test.scala.io.github.etl.route`:
  - AggregationRouteSpec - contains all tests related to aggregation operations (word count and word frequency)
  - TransformationRouteSpec - contains all tests related to transformation operations (caps and replace)
  - SequenceRouteSpec - contains all tests related to sequence validation and operations on both aggregation and transformation
- Service layer execution and validation test cases are in the package `test.scala.io.github.etl.service`:
  - AggregationServiceSpec - unit tests of aggregation operations (word count and word frequency)
  - TransformationServiceSpec - unit tests of transformation operations (caps and replace)
  - SequenceServiceSpec - unit tests of sequence validation and operations on both aggregation and transformation
- All requests are tracked by an HTTP header called `Request-Id`. This is mainly for tracking and monitoring purposes. A logger is also enabled for each request and response to record the details.
- The system generates a unique `responseId` to track the incoming `requestId`, and both are returned in the response header.
- Info and error logging is implemented where needed.
- Each response carries a descriptive header containing information about the process state:
{
"header": {
"requestId": "f045faf4-845d-4f5b-a7bd-76d6fbcf8f44",
"responseId": "f5bc3c6a-7d91-4e35-b5db-9b05fe695b90",
"statusCode": 2000,
"statusMessage": "SUCCESS"
},
"result": {}
}
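The response envelope above could be assembled roughly as follows. This is only a sketch under stated assumptions: `ResponseSketch`, `ResponseHeader`, `successHeader` and `toJson` are hypothetical names, and the JSON is rendered by hand (without string escaping) to keep the example dependency-free, whereas the project itself lists circe for JSON handling.

```scala
import java.util.UUID

// Hypothetical sketch: these names are illustrative, not the project's API.
object ResponseSketch {
  final case class ResponseHeader(
    requestId: String,
    responseId: String,
    statusCode: Int,
    statusMessage: String
  )

  // Echo the incoming request id and pair it with a freshly generated
  // response id, using the success code 2000 from the error code list.
  def successHeader(requestId: String): ResponseHeader =
    ResponseHeader(
      requestId = requestId,
      responseId = UUID.randomUUID().toString,
      statusCode = 2000,
      statusMessage = "SUCCESS"
    )

  // Hand-rolled rendering for illustration only; values are not escaped.
  def toJson(header: ResponseHeader, resultJson: String = "{}"): String =
    s"""{"header":{"requestId":"${header.requestId}","responseId":"${header.responseId}","statusCode":${header.statusCode},"statusMessage":"${header.statusMessage}"},"result":$resultJson}"""
}
```

Calling `ResponseSketch.toJson(ResponseSketch.successHeader(id))` with the value of the `Request-Id` header produces a compact version of the envelope shown above, with a new `responseId` on every call.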
- The application throws `EtlException` with an appropriate error code for `NonFatal` and `Validation` exceptions. The following error codes are implemented:
CODE_2000 = 2000 // Success status code
CODE_5000 = 5000 // Unexpected errors
CODE_5001 = 5001 // System unavailable
CODE_5002 = 5002 // Scheduled downtime
CODE_4000 = 4000 // Invalid request - Malformed Json
CODE_4001 = 4001 // Invalid request - Mandatory data missing
CODE_4002 = 4002 // Invalid request - Pattern Syntax error
CODE_4003 = 4003 // Invalid request - Non logical operations.
CODE_3000 = 3000 // Resource reader errors
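One way such codes might be attached to failures is sketched below. The shape of `EtlException` (a code plus a message), the `ErrorSketch` object and the `safely` helper are assumptions for illustration; the source only states that an `EtlException` with an error code is thrown for non-fatal and validation failures.

```scala
import java.util.regex.PatternSyntaxException
import scala.util.control.NonFatal

// Hypothetical sketch: the real EtlException and its usage may differ.
object ErrorSketch {
  val CODE_5000 = 5000 // Unexpected errors
  val CODE_4002 = 4002 // Invalid request - Pattern Syntax error

  final case class EtlException(code: Int, message: String) extends Exception(message)

  // Evaluate a computation and translate failures into coded exceptions:
  // existing EtlExceptions pass through, regex syntax problems map to 4002,
  // and any other non-fatal error maps to 5000.
  def safely[A](thunk: => A): Either[EtlException, A] =
    try Right(thunk)
    catch {
      case e: EtlException           => Left(e)
      case e: PatternSyntaxException => Left(EtlException(CODE_4002, e.getDescription))
      case NonFatal(e) =>
        Left(EtlException(CODE_5000, Option(e.getMessage).getOrElse(e.toString)))
    }
}
```

For example, `ErrorSketch.safely("a b".replaceAll("(", "x"))` returns a `Left` carrying code 4002 because `"("` is not a valid regular expression, while fatal errors such as `OutOfMemoryError` are deliberately not caught.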
- The source code for the project is on GitHub: https://github.com/anand-singh/etl-service
- Clone the project to your local system
- To run this sbt project, you need JDK 8 or later
- Execute `sbt clean compile` to build the product
- Execute `sbt run` to start the etl-service
- The etl-service should now be accessible at localhost:8080
- Execute `sbt clean test` to test the product
- Run the tests with coverage enabled to collect coverage data: `sbt clean coverage test`
- To generate the coverage reports, run `sbt coverageReport`. The reports are written to `target/scoverage-report` in both HTML and XML formats.
[info] Statement coverage.: 93.54%
[info] Branch coverage....: 100.00%
- Execute `sbt scalastyle` to check the code quality
[info] scalastyle Processed 14 file(s)
[info] scalastyle Found 0 errors
[info] scalastyle Found 0 warnings
[info] scalastyle Found 0 infos
- Scala - Scala combines object-oriented and functional programming in one concise, high-level language.
- http4s - Http4s is a minimal, idiomatic Scala interface for HTTP services.
- circe - A JSON library for Scala powered by Cats
- Specs2 - Software Specifications for Scala
- Logback - Logback is intended as a successor to the popular log4j project, picking up where log4j leaves off.
- sbt-scoverage - Plugin for SBT that integrates the scoverage code coverage library.
- Scalastyle - Scalastyle examines your Scala code and indicates potential problems with it.