Fixes #51 - Added ingress documentation and sample client application to docs (#140)

* Added Traefik related files

* Clean up SSL directory and add docs to README

* Moved traefik configuration in to dedicated README
Developed introductions for all implementations we are looking to provide references for.

* Fleshed out the TCP section a bit more

* Updated traefik documentation to match newer traefik from Helm repo

* Updated example to use 3 nodes

* Updated sample application to be a bit more verbose

* Finished reference implementations for Traefik

* Added reference sample application

* Removed references that we don't have samples for yet

* Renamed proxy to ingress
* Fixed reference to WhiteList LBP
bradfordcp authored Jul 5, 2020
1 parent 5e22324 commit 4dd8e7c
Showing 26 changed files with 1,028 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -240,3 +240,8 @@ tags
# End of https://www.toptal.com/developers/gitignore/api/go,osx,vim,linux,emacs,visualstudiocode,intellij+all

/build

# Certificates
*.pem
*.csr
*.p12
133 changes: 133 additions & 0 deletions docs/ingress/README.md
@@ -0,0 +1,133 @@
# Connecting applications to Cassandra running on Kubernetes

## Background

As long as applications run within a Kubernetes (k8s) cluster there will be a need to access those services from outside of the cluster. Connecting to a Cassandra (C*) cluster running within k8s can range from trivial to complex depending on where the client is running, latency requirements, and / or security concerns. This document aims to provide a number of solutions to these issues along with the rationale and motivation for each. The following approaches all assume a C* cluster is already up and reported as running.

## Pod Access

Any pod running within a Kubernetes cluster may communicate with any other pod, provided the container network policies permit it. Most communication and service discovery within a K8s cluster will not be an issue.

### Network Supported Direct Access

The simplest method, from an architecture perspective, for communicating with Cassandra pods involves having Kubernetes run in an environment where the pod network address space is known and advertised with routes at the network layer. In these types of environments, BGP and static routes may be defined at layer 3 in the OSI model. This allows for IP connectivity / routing directly to pods and services running within Kubernetes from **both** inside and outside the cluster. Additionally, this approach will allow for the consumption of service addresses externally. Unfortunately, this requires an advanced understanding of both k8s networking and the infrastructure available within the enterprise or cloud where it is hosted.

**Pros**

* Zero additional configuration within the application
* Works inside and outside of the Kubernetes network

**Cons**

* Requires configuration at the networking layer within the cloud / enterprise environment
* Not all environments can support this approach. Some cloud environments do not have the tooling exposed for customers to enable this functionality.

### Host Network

Host Network configuration exposes all of the host's network interfaces to the pod instead of a single virtual interface. This allows Cassandra to bind to the worker's interface with an externally accessible IP. Any container launched as part of the pod will have access to the host's interfaces; access cannot be fenced off to a specific container.

Enabling this behavior is done by passing `hostNetwork: true` in the `podTemplateSpec` at the top level.

**Pros**

* External connectivity is possible as the service is available at the node's IP instead of an IP internal to the Kubernetes cluster.

**Cons**

* If a pod is rescheduled the IP address of the pod can change
* In some K8s distributions this is a privileged operation
* Additional automation will be required to identify the appropriate IP and set it for `listen_address` and `broadcast_address`
* Only one Cassandra pod may be started per worker, regardless of the `allowMultipleNodesPerWorker` setting.
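
The automation mentioned above could take many forms. As a minimal sketch, assuming the goal is simply to find the first usable IPv4 address on the worker, the following standard-library Java program filters out loopback, link-local, and wildcard addresses. The class name `HostAddressPicker` and the selection rule are illustrative assumptions, not part of cass-operator.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: picks an address that could be written into
// listen_address / broadcast_address when hostNetwork is enabled.
final class HostAddressPicker {

    /** Returns the first IPv4 address that is not loopback, link-local, or the wildcard. */
    static Optional<InetAddress> pick(List<InetAddress> candidates) {
        for (InetAddress address : candidates) {
            if (address instanceof Inet4Address
                    && !address.isLoopbackAddress()
                    && !address.isLinkLocalAddress()
                    && !address.isAnyLocalAddress()) {
                return Optional.of(address);
            }
        }
        return Optional.empty();
    }

    /** Collects the addresses of every network interface that is currently up. */
    static List<InetAddress> localAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return result; // no interfaces reported by the platform
        }
        for (NetworkInterface nic : Collections.list(interfaces)) {
            if (nic.isUp()) {
                result.addAll(Collections.list(nic.getInetAddresses()));
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        System.out.println(
            pick(localAddresses()).map(InetAddress::getHostAddress).orElse("none found"));
    }
}
```

On a worker with several external interfaces the first match may not be the right one, so a real deployment would likely need a stricter rule, such as matching a known subnet.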

### Host Port

Host port is similar to host network, but instead of being applied at the pod level, it is applied to specified containers within the pod. For each port listed in the container's block a `hostPort: external_port` key value is included. `external_port` is the port number on the Kubernetes worker that should be forwarded to this container's port.

At this time we do not allow for modifying the cassandra container via `podTemplateSpec`, thus configuring this value is not possible without patching each rack's StatefulSet.

**Pros**

* External connectivity is possible as the service is available at the node's IP instead of an IP internal to the Kubernetes cluster.
* Easier configuration: a separate container to determine the appropriate IP is no longer required.

**Cons**

* If a pod is rescheduled the IP address of the pod can change
* In some K8s distributions this is a privileged operation
* Only one Cassandra pod may be started per worker, regardless of the `allowMultipleNodesPerWorker` setting.
* Not recommended according to K8s [Configuration Best Practices](https://kubernetes.io/docs/concepts/configuration/overview/#services).

## Services

If the application is running within the same Kubernetes cluster as the Cassandra cluster, connectivity is simple. cass-operator exposes a number of services representing a Cassandra cluster, datacenters, and seeds. Applications running within the same Kubernetes cluster may leverage these services to discover and identify pods within the target C* cluster.
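
As a rough illustration for in-cluster clients, the sketch below builds the DNS name an application might use as a contact point. It assumes the datacenter service is named `<clusterName>-<datacenterName>-service`, that the cluster DNS domain is `cluster.local`, and that the namespace is `default`; all three should be checked against `kubectl get svc` in the target environment. The class `ServiceDns` is hypothetical.

```java
// Hypothetical helper that assembles an in-cluster service DNS name.
final class ServiceDns {

    /**
     * Builds the fully qualified name of a datacenter service.
     * The "<cluster>-<dc>-service" pattern and the "cluster.local"
     * domain are assumptions; verify both in your environment.
     */
    static String contactPoint(String clusterName, String datacenter, String namespace) {
        return String.format(
            "%s-%s-service.%s.svc.cluster.local", clusterName, datacenter, namespace);
    }

    public static void main(String[] args) {
        // Names taken from the sample CassandraDatacenter in this directory.
        System.out.println(contactPoint("sample-cluster", "sample-dc", "default"));
    }
}
```

Inside the cluster, a name of this form would resolve through the cluster DNS; from outside the cluster it generally will not.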

External applications do not have access to this information via DNS as internal applications do. It is possible to forward DNS requests to Kubernetes from outside the cluster and resolve configured services. Unfortunately, this returns internal pod IP addresses, which are not routable from outside the cluster unless Network Supported Direct Access is possible within the environment. In most scenarios, external applications will not be able to leverage the exposed services from cass-operator.

### Load Balancer

It is possible to configure a service within Kubernetes, outside of those provided by cass-operator, that is accessible from outside of the Kubernetes cluster. These services have a `type: LoadBalancer` key in the `spec` block. In most cloud environments this results in a native cloud load balancer being provisioned to point at the appropriate pods with an external IP. Once the load balancer is provisioned, running `kubectl get svc` will display the external IP address that is pointed at the C* nodes.

**Pros**

* Available from outside of the cluster

**Cons**

* Requires use of an `AddressTranslator` on the client side to stop the drivers from attempting to connect directly to pods and instead direct connections to the load balancer.
* Removes the possibility of TokenAwarePolicy LBP
* Does not support TLS termination at the service layer; TLS must be terminated within the application.
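
To make the address translation requirement concrete, here is a minimal, driver-independent sketch of the idea: whatever pod address the driver discovers, the translator returns the load balancer's address instead. The class name and the `lb.example.com` host are hypothetical; a real translator would implement the driver's `AddressTranslator` interface.

```java
import java.net.InetSocketAddress;

// Hypothetical, driver-independent illustration of address translation.
final class FixedAddressTranslator {
    private final InetSocketAddress loadBalancer;

    FixedAddressTranslator(String host, int port) {
        // createUnresolved defers the DNS lookup until a connection is opened.
        this.loadBalancer = InetSocketAddress.createUnresolved(host, port);
    }

    /** Ignores the discovered pod address and always returns the load balancer. */
    InetSocketAddress translate(InetSocketAddress podAddress) {
        return loadBalancer;
    }

    public static void main(String[] args) {
        FixedAddressTranslator translator = new FixedAddressTranslator("lb.example.com", 9042);
        InetSocketAddress pod = InetSocketAddress.createUnresolved("10.42.0.17", 9042);
        System.out.println(pod + " -> " + translator.translate(pod));
    }
}
```

Because every node maps to the same address, the driver can no longer choose which replica receives a request, which is why token-aware routing is lost with this approach.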

## Ingresses

Ingresses forward requests to services running within a Kubernetes cluster based on rules. These rules may include specifying the protocol, port, or even path. They may provide additional functionality like termination of SSL / TLS traffic, load balancing across a number of protocols, and name-based virtual hosting. Behind the Ingress K8s type is an Ingress Controller. There are a number of controllers available with varying features to service the defined ingress rules. Think of Ingress as an interface for routing and an Ingress Controller as the implementation of that interface. In this way, any number of Ingress Controllers may be used based on the workload requirements. Ingress Controllers function at Layer 4 & 7 of the OSI model.

When the Ingress specification was created it focused specifically on HTTP / HTTPS workloads. From the documentation, "An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type `Service.Type=NodePort` or `Service.Type=LoadBalancer`." Cassandra workloads do NOT use HTTP as a protocol, but rather their own TCP-based protocol.

The Ingress Controllers we are looking to leverage must support TCP load balancing. This will provide routing semantics similar to those of LoadBalancer Services. If the Ingress Controller also supports SSL termination with [SNI](https://en.wikipedia.org/wiki/Server_Name_Indication), then secure access is possible from outside the cluster while _keeping Token Aware routing support_. Additionally, operators should consider whether the chosen Ingress Controller supports client SSL certificates, allowing for [mutual TLS](https://en.wikipedia.org/wiki/Mutual_authentication) to restrict access from unauthorized clients.

**Pros**

* Highly available entry point into the cluster
* _Some_ implementations support TCP load balancing
* _Some_ implementations support Mutual TLS
* _Some_ implementations support SNI

**Cons**

* No _standard_ implementation. Requires careful selection.
* Initially designed for HTTP/HTTPS-only workloads
* Many ingresses support pure TCP workloads, but this is _NOT_ defined in the original design specification. Some configurations require fairly heavy-handed templating of base configuration files. This may lead to difficult upgrade paths for those components in the future.
* _Only some_ implementations support TCP load balancing
* _Only some_ implementations support mTLS
* _Only some_ implementations support SNI with TCP workloads
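
SNI routing depends on the client placing a server name in the TLS handshake so that the ingress can select a backend before any application data is exchanged. The sketch below shows that one step using only the JDK's TLS classes. It assumes the server name is a per-node identifier such as a Cassandra host ID; the class `SniExample` is hypothetical, and the Java driver handles this internally when SNI endpoints are used.

```java
import java.util.Collections;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SNIServerName;
import javax.net.ssl.SSLParameters;

// Hypothetical illustration of setting the SNI server name on a TLS client.
final class SniExample {

    /** Builds TLS parameters whose SNI extension carries the given node identifier. */
    static SSLParameters forNode(String nodeId) {
        SSLParameters parameters = new SSLParameters();
        parameters.setServerNames(
            Collections.<SNIServerName>singletonList(new SNIHostName(nodeId)));
        return parameters;
    }

    public static void main(String[] args) {
        // Assumed to be a Cassandra host ID; the ingress would route on this value.
        SSLParameters parameters = forNode("ec448e83-8b83-407b-b342-13ce0250001c");
        System.out.println(parameters.getServerNames());
    }
}
```

These parameters would be applied to an `SSLSocket` or `SSLEngine` with `setSSLParameters` before the handshake. Since each node has a distinct server name, the client can still target individual nodes through a single ingress address.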

### Traefik

[Traefik](https://containo.us/traefik/) is an open-source Edge Router that is designed to work in a number of environments, not just Kubernetes. When running on Kubernetes, Traefik is generally installed as an Ingress Controller. Traefik supports TCP load balancing along with SSL termination and SNI. It is automatically included as the default Ingress Controller of [K3s](https://k3s.io/) and [K3d](https://k3d.io/).

#### Sample Implementations

* [Simple load balancing](traefik/load-balancing)
* [mTLS with load balancing](traefik/mtls-load-balancing)
* [mTLS with SNI](traefik/mtls-sni)

## Service Meshes


## Java Driver Configuration

Each of the three reference implementations has a corresponding configuration in the [sample application](sample-java-application) with associated configuration files and sample code.

## Sample `CassandraDatacenter` Reference

See [`sample-cluster-sample-dc.yaml`](sample-cluster-sample-dc.yaml)

## SSL Certificate Generation

See [ssl/README.md](ssl/README.md) for directions on creating CA, client, and ingress certificates.

## References

1. [Accessing Kubernetes Pods from Outside of the Cluster](http://alesnosek.com/blog/2017/02/14/accessing-kubernetes-pods-from-outside-of-the-cluster/)
1. [Traefik Docs](https://docs.traefik.io/)
1. [Kubernetes Configuration Best Practices](https://kubernetes.io/docs/concepts/configuration/overview/#services)
34 changes: 34 additions & 0 deletions docs/ingress/sample-cluster-sample-dc.yaml
@@ -0,0 +1,34 @@
# Sized to work on 3 k8s worker nodes with 1 core / 4 GB RAM
# See neighboring example-cassdc-full.yaml for docs for each parameter
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: sample-dc
spec:
  clusterName: sample-cluster
  serverType: cassandra
  serverVersion: "3.11.6"
  serverImage: datastax/cassandra:3.11.6-ubi7
  configBuilderImage: datastax/cass-config-builder:1.0.0-ubi7
  managementApiAuth:
    insecure: {}
  racks:
    - name: sample-rack
  size: 3
  allowMultipleNodesPerWorker: true
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: local-path
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
  config:
    cassandra-yaml: {}
      # authenticator: org.apache.cassandra.auth.PasswordAuthenticator
      # authorizer: org.apache.cassandra.auth.CassandraAuthorizer
      # role_manager: org.apache.cassandra.auth.CassandraRoleManager
    jvm-options:
      initial_heap_size: "800M"
      max_heap_size: "800M"
5 changes: 5 additions & 0 deletions docs/ingress/sample-java-application/.gitignore
@@ -0,0 +1,5 @@
target
client.keystore
client.truststore
*.iml
.idea
32 changes: 32 additions & 0 deletions docs/ingress/sample-java-application/README.md
@@ -0,0 +1,32 @@
# Sample Application with Kubernetes Ingress

This project illustrates how to configure and validate connectivity to Cassandra clusters running within Kubernetes. There are three reference client implementations available:

* mTLS and SNI based balancing
* Load balancing with mTLS
* Simple load balancing

At this time there is some _slight_ tweaking required to the configuration files to specify the keystore, truststore, and approach to use.

Any connections requiring TLS support should place their keystore and truststore in the `src/main/resources/` directory. If you followed the [SSL](../ssl) guide then you should already have these files available.

## Building and Running

```
mvn package
java -cp target/sample-k8s-connectivity-1.0-SNAPSHOT-jar-with-dependencies.jar com.datastax.examples.SampleApp
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Discovered Nodes
sample-dc:sample-rack:fd280adc-e55e-4f3d-97d1-138a1e1abef4
sample-dc:sample-rack:fd280adc-e55e-4f3d-97d1-138a1e1abef4
sample-dc:sample-rack:fd280adc-e55e-4f3d-97d1-138a1e1abef4
Coordinator: sample-dc:sample-rack:fd280adc-e55e-4f3d-97d1-138a1e1abef4
[data_center:'sample-dc', rack:'sample-rack', host_id:a7a45d6e-70e3-4e6d-b29c-5dba9a61a282, release_version:'3.11.6']
Coordinator: sample-dc:sample-rack:fd280adc-e55e-4f3d-97d1-138a1e1abef4
[data_center:'sample-dc', rack:'sample-rack', host_id:7e2921a6-e170-4a4f-bf0f-011ab83b3739, release_version:'3.11.6']
[data_center:'sample-dc', rack:'sample-rack', host_id:fd280adc-e55e-4f3d-97d1-138a1e1abef4, release_version:'3.11.6']
```
65 changes: 65 additions & 0 deletions docs/ingress/sample-java-application/pom.xml
@@ -0,0 +1,65 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.datastax.examples</groupId>
  <artifactId>sample-k8s-connectivity</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>sample-k8s-connectivity</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <driver.version>4.7.2</driver.version>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
  </properties>

  <dependencies>
    <dependency>
      <groupId>commons-cli</groupId>
      <artifactId>commons-cli</artifactId>
      <version>1.4</version>
    </dependency>
    <dependency>
      <groupId>com.datastax.oss</groupId>
      <artifactId>java-driver-core</artifactId>
      <version>${driver.version}</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <!-- NOTE: We don't need a groupId specification because the group is
             org.apache.maven.plugins ...which is assumed by default.
         -->
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.3.0</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id> <!-- this is used for inheritance merges -->
            <phase>package</phase> <!-- bind to the packaging phase -->
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
@@ -0,0 +1,67 @@
package com.datastax.examples;

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.config.DriverConfigLoader;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.metadata.Node;
import com.datastax.oss.driver.internal.core.metadata.SniEndPoint;

import java.net.InetSocketAddress;

public class SampleApp {
    public static void main( String[] args ) throws Exception {
        SampleApp app = new SampleApp();
        app.run();
    }

    public void run() throws Exception {
        // Swap in getMtlsLoadBalancedSession() or getMtlsSniSession() to test the other approaches
        CqlSession session = getLoadBalancedSession();

        // List every node the driver discovered through the ingress
        System.out.println("Discovered Nodes");
        for (Node n : session.getMetadata().getNodes().values()) {
            System.out.println(String.format("%s:%s:%s", n.getDatacenter(), n.getRack(), n.getHostId()));
        }
        System.out.println();

        // Query the coordinator's own row
        ResultSet rs = session.execute("SELECT data_center, rack, host_id, release_version FROM system.local");
        Node n = rs.getExecutionInfo().getCoordinator();
        System.out.println(String.format("Coordinator: %s:%s:%s", n.getDatacenter(), n.getRack(), n.getHostId()));
        rs.forEach(row -> {
            System.out.println(row.getFormattedContents());
        });
        System.out.println();

        // Query the coordinator's view of its peers
        rs = session.execute("SELECT data_center, rack, host_id, release_version FROM system.peers");
        n = rs.getExecutionInfo().getCoordinator();
        System.out.println(String.format("Coordinator: %s:%s:%s", n.getDatacenter(), n.getRack(), n.getHostId()));
        rs.forEach(row -> {
            System.out.println(row.getFormattedContents());
        });

        session.close();
    }

    private CqlSession getLoadBalancedSession() {
        return CqlSession.builder()
                .withConfigLoader(DriverConfigLoader.fromClasspath("load-balanced.conf"))
                .build();
    }

    private CqlSession getMtlsLoadBalancedSession() {
        return CqlSession.builder()
                .withConfigLoader(DriverConfigLoader.fromClasspath("mtls-load-balanced.conf"))
                .build();
    }

    private CqlSession getMtlsSniSession() {
        // Ingress address
        InetSocketAddress ingressAddress = new InetSocketAddress("traefik.k3s.local", 9042);

        // Endpoint (contact point)
        SniEndPoint endPoint = new SniEndPoint(ingressAddress, "ec448e83-8b83-407b-b342-13ce0250001c");

        return CqlSession.builder()
                .withConfigLoader(DriverConfigLoader.fromClasspath("mtls-sni.conf"))
                // Register the SNI endpoint; without this call the endpoint built above is never used
                .addContactEndPoint(endPoint)
                .build();
    }
}
@@ -0,0 +1,29 @@
package com.datastax.kubernetes;

import com.datastax.oss.driver.api.core.addresstranslation.AddressTranslator;
import com.datastax.oss.driver.api.core.context.DriverContext;
import edu.umd.cs.findbugs.annotations.NonNull;

import java.net.InetSocketAddress;

public class KubernetesIngressAddressTranslator implements AddressTranslator {
    private DriverContext driverContext;

    public KubernetesIngressAddressTranslator(DriverContext driverContext) {
        this.driverContext = driverContext;
    }

    @NonNull
    @Override
    public InetSocketAddress translate(@NonNull InetSocketAddress address) {
        String ingressAddress = driverContext.getConfig().getDefaultProfile().getString(KubernetesIngressOption.INGRESS_ADDRESS);
        int ingressPort = driverContext.getConfig().getDefaultProfile().getInt(KubernetesIngressOption.INGRESS_PORT);

        return new InetSocketAddress(ingressAddress, ingressPort);
    }

    @Override
    public void close() {
        // NOOP
    }
}