Docs front-page refresh #1071

Merged
merged 6 commits into from
Nov 9, 2023
64 changes: 52 additions & 12 deletions README.md
@@ -1,25 +1,65 @@
# K8ssandra Operator

**[Documentation site](https://docs.k8ssandra.io/)**

This is the Kubernetes operator for K8ssandra.

K8ssandra is a Kubernetes-based distribution of Apache Cassandra that includes several tools and components that automate and simplify configuring, managing, and operating a Cassandra cluster.
**[Documentation site](https://docs.k8ssandra.io/)**

k8ssandra-operator is a turnkey solution to manage [Apache Cassandra](https://cassandra.apache.org/_/index.html) and [DSE](https://www.datastax.com/products/datastax-enterprise) on Kubernetes. Apache Cassandra is the premier wide-column NoSQL data store, offering low latency, geo-replication, and the capacity to store petabytes of data. Apache Cassandra is in use in 90% of the Fortune 500 in some capacity.

DataStax Enterprise (DSE) is the DataStax distribution of Apache Cassandra, offering additional features such as advanced security, search, and graph, as well as features not yet available in Cassandra like vector search for generative AI applications.

k8ssandra-operator allows for the deployment of multiple Apache Cassandra datacenters, spanned over multiple Kubernetes clusters. The intention of this architecture is to provide geo-replication to reduce latency (by moving data closer to the end user) and improve availability (by providing multiple datacenters to serve requests in the event of a datacenter failure or network partition).

Apache Cassandra offers rack- and failure-zone-aware replication: data is both replicated and sharded for performance and protection.

k8ssandra-operator incorporates the following functionality:

### Deployment

Apache Cassandra can be deployed into multiple datacenters in separate regions or availability/failure zones. k8ssandra-operator makes this possible by enabling communication between multiple Kubernetes clusters and deploying Cassandra datacenters into them.

This distinguishes k8ssandra-operator from [cass-operator](https://github.com/k8ssandra/cass-operator), which is used internally by k8ssandra-operator but does not automate multi-region deployments.

A single k8ssandra-operator instance in a control plane cluster can manage many data plane DCs across multiple Kubernetes clusters, and split across multiple Cassandra clusters. Clusters of up to 1000 nodes have been [tested](https://dok.community/blog/1000-node-cassandra-cluster-on-amazons-eks/) and confirmed to perform well.

Advanced Cassandra features such as Change Data Capture (CDC) are supported and can be configured using Kubernetes manifests.

### Monitoring

Monitoring is a critical service in any distributed system, and k8ssandra-operator provides a rich suite of Apache Cassandra metrics via an [agent](https://github.com/k8ssandra/management-api-for-apache-cassandra) added to the Cassandra JVM.

By integrating with [Vector](https://vector.dev/), k8ssandra-operator allows metrics to flow to a location of the user's choice, including an existing [Prometheus](https://prometheus.io/) or [Mimir](https://grafana.com/oss/mimir/) instance. A variety of other protocols and systems such as AMQP, Elasticsearch, Kafka, or Redis (see [here](https://vector.dev/docs/reference/configuration/sinks/) for a full list of integrations) are also supported.

Metrics pipelines can be configured using Kubernetes custom resources, allowing for the creation of multiple pipelines to support different use cases across many clusters.

Cassandra auditing and monitoring features such as full query logging are supported and can be configured directly from a K8ssandraCluster manifest.

### Repairs and data maintenance

Apache Cassandra requires regular maintenance to ensure data is replicated consistently across the cluster. k8ssandra-operator automates this process by running repairs on a regular schedule using [Reaper](https://cassandra-reaper.io/), a widely adopted solution for anti-entropy repairs in Cassandra maintained by the K8ssandra team.

Using k8ssandra-operator, you can use Kubernetes manifests to configure and monitor the success of repair schedules across many Cassandra datacenters and clusters.

### Backup and restore

k8ssandra-operator uses [Medusa](https://github.com/thelastpickle/cassandra-medusa) to enable backup of Cassandra's SSTables to cloud storage locations such as S3 buckets, GCS and Azure storage.

Backup and restore schedules can be configured using Kubernetes manifests, allowing for declarative, auditable management of backup and restore processes.

### Flexible APIs

[Stargate](https://stargate.io/) for Apache Cassandra offers advanced APIs including integration with the [Mongoose](https://mongoosejs.com/) object modelling framework for Node.js, GraphQL, and REST. It can also enhance Cassandra's native CQL performance in some cluster topologies.

K8ssandra includes the following components:
Using k8ssandra-operator, Stargate can be deployed and configured via simple Kubernetes manifests.

* [Cassandra](https://cassandra.apache.org/)
* [Stargate](https://stargate.io/)
* [Medusa](https://github.com/thelastpickle/cassandra-medusa)
* [Reaper](http://cassandra-reaper.io/)
* [Grafana](https://grafana.com/)
* [Prometheus](https://prometheus.io/)
### Where to from here?

K8ssandra 1.x is configured, packaged, and deployed via Helm charts. Those Helm charts can be found in the [k8ssandra](https://github.com/k8ssandra/k8ssandra) repo.
This documentation covers everything from install details, deployed components, configuration references, and guided outcome-based tasks.

K8ssandra 2.x will be based on this operator.
To install k8ssandra-operator, start [here]({{< relref "install/" >}}).

One of the primary features of this operator is multi-cluster support which will facilitate multi-region Cassandra clusters.
Be sure to leave us a <a class="github-button" href="https://github.com/k8ssandra/k8ssandra" data-icon="octicon-star" aria-label="Star k8ssandra/k8ssandra on GitHub">star</a> on GitHub!

## Architecture
The K8ssandra operator is being developed with multi-cluster support first and foremost in mind. It can be used seamlessly in single-cluster deployments as well.
64 changes: 58 additions & 6 deletions docs/content/en/_index.md
@@ -12,18 +12,70 @@ description: "K8ssandra documentation: architecture, configuration, guided tasks
type: docs
---

k8ssandra-operator is a turnkey solution to manage [Apache Cassandra](https://cassandra.apache.org/_/index.html) and [DSE](https://www.datastax.com/products/datastax-enterprise) on Kubernetes. Apache Cassandra is the premier wide-column NoSQL data store, offering low latency, geo-replication, and the capacity to store petabytes of data. Apache Cassandra is in use in 90% of the Fortune 500 in some capacity.

DSE is the DataStax distribution of Apache Cassandra, offering additional features such as advanced security, analytics, and search, as well as features not yet available in Cassandra like vector search for generative AI applications.

k8ssandra-operator allows for the deployment of multiple Apache Cassandra datacenters, spanned over multiple Kubernetes clusters. The intention of this architecture is to provide geo-replication to reduce latency (by moving data closer to the end user) and improve availability (by providing multiple datacenters to serve requests in the event of a datacenter failure or network partition).

Apache Cassandra offers rack- and failure-zone-aware replication: data is both replicated and sharded for performance and protection.

k8ssandra-operator incorporates the following functionality:

### Deployment

Apache Cassandra can be deployed into multiple datacenters in separate regions or availability/failure zones. k8ssandra-operator makes this possible by enabling communication between multiple Kubernetes clusters and deploying Cassandra datacenters into them.

This distinguishes k8ssandra-operator from [cass-operator](https://github.com/k8ssandra/cass-operator), which is used internally by k8ssandra-operator but does not automate multi-region deployments.

A single k8ssandra-operator instance in a control plane cluster can manage many data plane DCs across multiple Kubernetes clusters, and split across multiple Cassandra clusters. Clusters of up to 1000 nodes have been [tested](https://dok.community/blog/1000-node-cassandra-cluster-on-amazons-eks/) and confirmed to perform well.

Advanced Cassandra features such as Change Data Capture (CDC) are supported and can be configured using Kubernetes manifests.
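
As a rough illustration of the multi-datacenter deployment described above, a `K8ssandraCluster` manifest might look like the following sketch. The cluster name, datacenter names, `k8sContext` values, Cassandra version, and storage settings are all assumptions chosen for illustration; check the CRD reference for the fields supported by your operator version.

```yaml
# Illustrative sketch only: names, contexts, version, and sizes are assumptions.
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.1"
    # Storage settings shared by every datacenter defined below.
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: standard
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 5Gi
    datacenters:
      # Each datacenter targets a different Kubernetes cluster via k8sContext.
      - metadata:
          name: dc1
        k8sContext: kind-k8ssandra-0
        size: 3
      - metadata:
          name: dc2
        k8sContext: kind-k8ssandra-1
        size: 3
```

Applied to the control plane cluster, a manifest of this shape would ask the operator to create one Cassandra cluster with two three-node datacenters, each in its own Kubernetes cluster.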

### Monitoring

Monitoring is a critical service in any distributed system, and k8ssandra-operator provides a rich suite of Apache Cassandra metrics via an [agent](https://github.com/k8ssandra/management-api-for-apache-cassandra) added to the Cassandra JVM.

By integrating with [Vector](https://vector.dev/), k8ssandra-operator allows metrics to flow to a location of the user's choice, including an existing [Prometheus](https://prometheus.io/) or [Mimir](https://grafana.com/oss/mimir/) instance. A variety of other protocols and systems such as AMQP, Elasticsearch, Kafka, or Redis (see [here](https://vector.dev/docs/reference/configuration/sinks/) for a full list of integrations) are also supported.

Metrics pipelines can be configured using Kubernetes custom resources, allowing for the creation of multiple pipelines to support different use cases across many clusters.

Cassandra auditing and monitoring features such as full query logging are supported and can be configured directly from a K8ssandraCluster manifest.

### Repairs and data maintenance

Apache Cassandra requires regular maintenance to ensure data is replicated consistently across the cluster. k8ssandra-operator automates this process by running repairs on a regular schedule using [Reaper](https://cassandra-reaper.io/), a widely adopted solution for anti-entropy repairs in Cassandra maintained by the K8ssandra team.

Using k8ssandra-operator, you can use Kubernetes manifests to configure and monitor the success of repair schedules across many Cassandra datacenters and clusters.
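
For example, enabling Reaper with automatic repair scheduling could be sketched as the following fragment of a `K8ssandraCluster` spec. This is a minimal sketch that assumes the `reaper.autoScheduling` setting is available in your operator version; the rest of the cluster definition is omitted.

```yaml
# Fragment of a K8ssandraCluster spec; the cassandra section is omitted.
# Assumes the operator version in use supports reaper.autoScheduling.
spec:
  reaper:
    autoScheduling:
      enabled: true
```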

### Backup and restore

k8ssandra-operator uses [Medusa](https://github.com/thelastpickle/cassandra-medusa) to enable backup of Cassandra's SSTables to cloud storage locations such as S3 buckets, GCS and Azure storage.

Backup and restore schedules can be configured using Kubernetes manifests, allowing for declarative, auditable management of backup and restore processes.
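
A backup schedule might be declared along these lines. This is a sketch, not a definitive configuration: the resource name, namespace, datacenter name, and cron expression are assumptions, and it presumes Medusa storage has already been configured on the `K8ssandraCluster`.

```yaml
# Illustrative sketch only: name, namespace, datacenter, and schedule are assumptions.
apiVersion: medusa.k8ssandra.io/v1alpha1
kind: MedusaBackupSchedule
metadata:
  name: nightly-backup
  namespace: k8ssandra-operator
spec:
  # Standard cron syntax: run every day at 01:30.
  cronSchedule: "30 1 * * *"
  backupSpec:
    backupType: differential
    cassandraDatacenter: dc1
```

Because the schedule is an ordinary Kubernetes resource, it can be kept in version control and reviewed like any other manifest.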

### Flexible APIs

[Stargate](https://stargate.io/) for Apache Cassandra offers advanced APIs including integration with the [Mongoose](https://mongoosejs.com/) object modelling framework for Node.js, GraphQL, and REST. It can also enhance Cassandra's native CQL performance in some cluster topologies.

Using k8ssandra-operator, Stargate can be deployed and configured via simple Kubernetes manifests.

### Where to from here?

This documentation covers everything from install details, deployed components, configuration references, and guided outcome-based tasks.

To install k8ssandra-operator, start [here]({{< relref "install/" >}}).

Be sure to leave us a <a class="github-button" href="https://github.com/k8ssandra/k8ssandra" data-icon="octicon-star" aria-label="Star k8ssandra/k8ssandra on GitHub">star</a> on GitHub!

## Features for single- and multi-cluster Kubernetes environments

| K8ssandra Operator: enhanced capabilities | Initial K8ssandra project|
| ----------- | ----------- |
| K8ssandra Operator is our most recent offering. In a **unified operator**, K8ssandra Operator provides an entirely new, solidified set of features for Kubernetes + Cassandra deployments. The features include robust management (cass-operator), API integration (Stargate), anti-entropy repairs (Reaper), and backup/restore (Medusa). Important enhancements include **multi-cluster** and **multi-region** support, which enables greater scalability and availability for enterprise apps and data. Single cluster/region deployments are also supported with K8ssandra Operator.| K8ssandra v1.4.x is our project's initial implementation. It continues to provide a set of separate Helm charts that you can use to configure and deploy Apache Cassandra&reg; into a single-cluster, single-region Kubernetes environment. |
| For enhanced capabilities, we recommend that you explore K8ssandra Operator [local install]({{< relref "install/local" >}}) topic, which focuses on single- or multi-cluster deployments on local dev **kind** Kubernetes clusters, using the provided `make` commands, `helm`, or `kustomize`. | Start in the K8ssandra v1.4.x [install](https://docs-v1.k8ssandra.io/install/local/) topics, which include the steps for single-cluster installs on local or cloud-provider Kubernetes platforms. |
## K8ssandra 1.x

We previously released a product named "k8ssandra" (as distinct from k8ssandra-operator). This product comprised a set of Helm charts and is still available for existing users.

We strongly advise new users to adopt k8ssandra-operator, since that is where development continues.

If you're using K8ssandra v1.4.x, you may continue to do so. Or consider stepping up to the project's latest implementation with K8ssandra Operator.
A comparison between the two can be found [here]({{< relref "reference/old-k8ssandra" >}}).

## Compatibility matrix

8 changes: 8 additions & 0 deletions docs/content/en/reference/old-k8ssandra/_index.md
@@ -0,0 +1,8 @@
## Features for single- and multi-cluster Kubernetes environments

| K8ssandra Operator: enhanced capabilities | Initial K8ssandra project|
| ----------- | ----------- |
| K8ssandra Operator is our most recent offering. In a **unified operator**, K8ssandra Operator provides an entirely new, solidified set of features for Kubernetes + Cassandra deployments. The features include robust management (cass-operator), API integration (Stargate), anti-entropy repairs (Reaper), and backup/restore (Medusa). Important enhancements include **multi-cluster** and **multi-region** support, which enables greater scalability and availability for enterprise apps and data. Single cluster/region deployments are also supported with K8ssandra Operator.| K8ssandra v1.4.x is our project's initial implementation. It continues to provide a set of separate Helm charts that you can use to configure and deploy Apache Cassandra&reg; into a single-cluster, single-region Kubernetes environment. |
| For enhanced capabilities, we recommend that you explore K8ssandra Operator [local install]({{< relref "install/local" >}}) topic, which focuses on single- or multi-cluster deployments on local dev **kind** Kubernetes clusters, using the provided `make` commands, `helm`, or `kustomize`. | Start in the K8ssandra v1.4.x [install](https://docs-v1.k8ssandra.io/install/local/) topics, which include the steps for single-cluster installs on local or cloud-provider Kubernetes platforms. |

If you're using K8ssandra v1.4.x, you may continue to do so. Or consider stepping up to the project's latest implementation with K8ssandra Operator.