Commit

[build] Refactored module structure (#1323)
Created a new integrations top-level directory, and moved venice-pulsar and
venice-samza to this, from clients.

Also added a navigating_project.md page to the docs.
FelixGV authored Nov 19, 2024
1 parent 72ce72e commit 5542657
Showing 25 changed files with 77 additions and 113 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/VeniceCI-StaticAnalysisAndUnitTests.yml
@@ -65,10 +65,10 @@ jobs:
 arg: :clients:da-vinci-client:jacocoTestCoverageVerification :clients:da-vinci-client:diffCoverage
 :clients:venice-admin-tool:jacocoTestCoverageVerification :clients:venice-admin-tool:diffCoverage
 :clients:venice-producer:jacocoTestCoverageVerification :clients:venice-producer:diffCoverage
-:clients:venice-pulsar:jacocoTestCoverageVerification :clients:venice-pulsar:diffCoverage
+:integrations:venice-pulsar:jacocoTestCoverageVerification :integrations:venice-pulsar:diffCoverage
 :clients:venice-client:jacocoTestCoverageVerification :clients:venice-client:diffCoverage
 :clients:venice-push-job:jacocoTestCoverageVerification :clients:venice-push-job:diffCoverage
-:clients:venice-samza:jacocoTestCoverageVerification :clients:venice-samza:diffCoverage
+:integrations:venice-samza:jacocoTestCoverageVerification :integrations:venice-samza:diffCoverage
 :clients:venice-thin-client:jacocoTestCoverageVerification :clients:venice-thin-client:diffCoverage --continue

 Internal:
9 changes: 7 additions & 2 deletions docs/dev_guide/dev_guide.md
@@ -6,5 +6,10 @@ permalink: /docs/dev_guide
 ---
 # Developer Guide

-This section includes guides for Venice developers. New contributors should visit the [How to Contribute to Venice](./how_to/how_to.md)
-subsection to get started. The rest of the section is to document implementation details.
+This section includes guides for Venice developers.
+
+New contributors should visit the [How to Contribute to Venice](./how_to/how_to.md) subsection to get started.
+
+Those looking for a general overview of the project structure may be interested to look at [Navigating the Project](navigating_project.md).
+
+The rest of this section is to document implementation details.
58 changes: 58 additions & 0 deletions docs/dev_guide/navigating_project.md
@@ -0,0 +1,58 @@
---
layout: default
title: Navigating the Project
parent: Developer Guides
permalink: /docs/dev_guide/navigating_project
---
# Navigating the Project

The Venice codebase is split across these directories:

- `clients`, which contains the user-facing libraries that most Venice users might be interested in. Those include:
  - `da-vinci-client`, which is the stateful client, providing "eager caching" for the Venice datasets. [Learn more](../user_guide/read_api/da_vinci_client.md).
  - `venice-admin-tool`, which is the shell tool intended for Venice operators.
  - `venice-client`, which is a one-stop-shop for all clients which an online application might need, including
    thin-client, fast-client, da-vinci-client, consumer.
  - `venice-producer`, which enables an application to perform real-time writes to Venice.
  - `venice-push-job`, which enables an offline job to push batch data to Venice.
  - `venice-thin-client`, which is the most minimal dependency one can get to issue remote reads to the Venice backend,
    by delegating as much of the query logic as possible to the Venice router tier.
- `integrations`, which contains additional libraries that some Venice users might be interested in, to connect Venice
  with other third-party systems. The rule of thumb for including a module in this directory is that it should have
  minimal Venice-specific logic, and be mostly just glue code to satisfy the contracts expected by the third-party
  system. Also, these modules are intended to minimize the dependency burden of the other client libraries. Those
  include:
  - `venice-pulsar`, which contains an implementation of a Pulsar [Sink](https://pulsar.apache.org/docs/next/io-overview/#sink),
    in order to feed data from Pulsar topics to Venice.
  - `venice-samza`, which contains an implementation of a Samza [SystemProducer](https://samza.apache.org/learn/documentation/latest/api/javadocs/org/apache/samza/system/SystemProducer.html),
    in order to let Samza stream processing jobs emit writes to Venice.
- `internal`, which contains libraries not intended for public consumption. Those include:
  - `alpini`, which is a Netty-based framework used by the router service. It was forked from some code used by
    LinkedIn's proprietary [Espresso](https://engineering.linkedin.com/espresso/introducing-espresso-linkedins-hot-new-distributed-document-store)
    document store. At this time, Venice is the only user of this library, so there should be no concern of breaking
    compatibility with other dependents.
  - `venice-client-common`, which is a minimal set of APIs and utilities which the thin-client and other modules need
    to depend on. This module used to be named `venice-schema-common`, as one can see if digging into the git history.
  - `venice-common`, which is a larger set of APIs and utilities used by most other modules, except the thin-client.
- `services`, which contains the deployable components of Venice. Those include:
  - `venice-controller`, which acts as the control plane for Venice. Dataset creation, deletion and configuration,
    schema evolution, dataset to cluster assignment, and other such tasks, are all handled by the controller.
  - `venice-router`, which is the stateless tier responsible for routing thin-client queries to the correct server
    instance. It can also field certain read-only metadata queries such as cluster discovery and schema retrieval to
    take pressure away from the controller.
  - `venice-server`, which is the stateful tier responsible for hosting data, serving requests from both routers and
    fast-client library users, executing write operations and performing cross-region replication.
- `tests`, which contains some modules exclusively used for testing. Note that unit tests do not belong here, and those
  are instead located in each of the other modules above.

Besides code, the repository also contains:

- `all-modules`, which is merely an implementation detail of the Venice build and release pipeline. No code is expected
to go here.
- `docker`, which contains our various docker files.
- `docs`, which contains the wiki you are currently reading.
- `gradle`, which contains various hooks and plugins used by our build system.
- `scripts`, which contains a few simple operational scripts that have not yet been folded into the venice-admin-tool.
- `specs`, which contains formal specifications, in TLA+ and FizzBee, for some aspects of the Venice architecture.

If you have any questions about where some contribution belongs, do not hesitate to reach out on the [community slack](http://slack.venicedb.org)!
@@ -9,7 +9,9 @@ dependencies {
 exclude group: 'org.scala-lang'
 }
 implementation project(':clients:venice-thin-client')
-implementation project(':clients:venice-samza')
+
+// TODO: clean this... Pulsar should not depend on Samza
+implementation project(':integrations:venice-samza')

 implementation libraries.samzaApi

File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion internal/venice-test-common/build.gradle
@@ -67,7 +67,7 @@ dependencies {
 implementation project(':internal:venice-common')
 implementation project(':services:venice-controller')
 implementation project(':services:venice-router')
-implementation project(':clients:venice-samza')
+implementation project(':integrations:venice-samza')
 implementation project(':clients:venice-producer')
 implementation project(':internal:venice-client-common')
 implementation project(':services:venice-server')
6 changes: 4 additions & 2 deletions settings.gradle
@@ -39,9 +39,7 @@ include 'clients:da-vinci-client'
 include 'clients:venice-admin-tool'
 include 'clients:venice-client'
 include 'clients:venice-producer'
-include 'clients:venice-pulsar'
 include 'clients:venice-push-job'
-include 'clients:venice-samza'
 include 'clients:venice-thin-client'

 // Service modules
@@ -70,6 +68,10 @@ include 'internal:alpini:router:alpini-router-api'
 include 'internal:alpini:router:alpini-router-base'
 include 'internal:alpini:router:alpini-router-impl'

+// 3rd-party system integration modules
+include 'integrations:venice-pulsar'
+include 'integrations:venice-samza'
+
 // integration tests
 include 'tests:venice-pulsar-test'

2 changes: 1 addition & 1 deletion tests/docker-images/pulsar-sink/Dockerfile
@@ -1,6 +1,6 @@

 FROM apachepulsar/pulsar:2.10.3 as pulsar-venice

-COPY clients/venice-pulsar/build/libs/pulsar-venice-sink.nar /pulsar/connectors/
+COPY integrations/venice-pulsar/build/libs/pulsar-venice-sink.nar /pulsar/connectors/

 CMD bash
2 changes: 1 addition & 1 deletion tests/venice-pulsar-test/build.gradle
@@ -16,7 +16,7 @@ sourceSets {
 }

 dependencies {
-testImplementation project(':clients:venice-pulsar')
+testImplementation project(':integrations:venice-pulsar')
 testImplementation libraries.log4j2api
 testImplementation libraries.testng
 testImplementation libraries.testcontainers
103 changes: 0 additions & 103 deletions wc-test

This file was deleted.
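
The hunks above all apply the same rename: Gradle project paths under `:clients:` for `venice-pulsar` and `venice-samza` become paths under `:integrations:`. Build files outside this commit that still use the old paths would presumably need the same change. The following is a minimal sketch of that rewrite, not part of the commit itself. The names `MOVED_MODULES` and `rewrite_project_paths` are hypothetical, and the sketch only handles the colon-prefixed project-path form seen in `build.gradle` and the CI workflow, not the `include 'clients:...'` form in `settings.gradle` or the filesystem paths in the Dockerfile.

```python
import re

# Old -> new Gradle project paths, taken from the renames shown in this commit.
MOVED_MODULES = {
    ":clients:venice-pulsar": ":integrations:venice-pulsar",
    ":clients:venice-samza": ":integrations:venice-samza",
}


def rewrite_project_paths(text: str) -> str:
    """Return `text` with the old project paths replaced by the new ones."""
    for old, new in MOVED_MODULES.items():
        # The negative lookahead leaves longer module names that merely
        # share the prefix (for example ":clients:venice-samza-foo") untouched.
        text = re.sub(re.escape(old) + r"(?![\w-])", new, text)
    return text


if __name__ == "__main__":
    line = "implementation project(':clients:venice-samza')"
    print(rewrite_project_paths(line))
    # prints: implementation project(':integrations:venice-samza')
```

Task references such as `:clients:venice-pulsar:diffCoverage` are also rewritten, because the character following the module name is a colon and so passes the lookahead.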
