Production Ready K8s cluster

A K8s cluster implementation ready for heavy production load

Components Used

Name : Version	Purpose	Alternatives	Advantages
Terraform 1.5.4 Docs	Hardware Provisioner Initial Setup	`Salt` `Ansible`	1. Easy syntax 2. Sufficient community and documentation 3. Much better suited for hardware provisioning
Hetzner Provider 1.42.1 Docs	Deploying servers	`Vultr` `DigitalOcean`	1. Cheaper :) 2. Good community maintaining the provider
Ansible 1.5.6 Docs	Automating Tasks	`Salt`	1. No footprint on target hosts
Helm 3.12.2 Docs	Resource Control	`None that I know of`	:)
S3cmd 2.3.0 Docs	Backup on S3	`Cyberduck` `Rclone`	1. Easy to set up 2. Huge community and documentation 3. Python (easy to customize if needed)
K8s 1.25.0-00 Docs	Orchestrator	`Docker Swarm` `Nomad`	1. K8s to Swarm is like an ocean to a puddle 2. Nomad is quite great (needs R&D)
Cri-o 1.24.6 Docs	Container Runtime Interface	`Containerd`	1. Very efficient 2. Supported by k8s sigs 3. Very light (But lacks some functionality)
Nginx Ingress Controller Chart: 0.18.1 Docs	Ingress Controller	`Traefik` `Api Gateway`	1. Much faster than Traefik 2. Proven in production 3. Good community and documentation 4. API Gateway is quite amazing (needs R&D)
Openebs Cstore 3.8.0 Docs	Storage Solution	`Openebs (Jiva)` `Ceph fs` `Rook fs` `Longhorn`	1. Much less complex than Ceph in both setup and management 2. Good community and documentation 3. Longhorn is quite nice (needs R&D)
Ubuntu 22.04 Docs	Operating system	`Debian` `Centos`	1. Bigger community than any other OS 2. Faster releases than Debian 3. Not cash grabbing like CentOS (yet :))
Cert Manager Chart: v1.12.3 Docs	Certificate Controller	`None that I know of`	:)
Fluentbit Chart: 0.37.1 Docs	Log Collector/Shipper	`Logstash` `fluentd`	1. No separate component for shipper and collector 2. No extra dependency 3. Very efficient (faster than fluentd) 4. Almost zero footprint (compared to alternatives) 5. Much easier to set up and manage 6. Good number of useful plugins
Elasticsearch Chart: 2.9.0 Docs	Log Analysis	`Loki`	1. More rigorous indexing 2. Loki needs more R&D
Kube Prometheus Stack Docs	Monitoring	`Prometheus`+`Grafana`	1. One single chart (so easier to manage and setup) 2. Preconfigured for k8s components
Haproxy latest Docs	Control plane load balancer	`CDN`	1. Easier to set up 2. Custom health check rules 3. Since the cluster is initiated on a domain, a CDN can be used too
Calico 3.26.1 Docs	Container Network Interface	`Flannel` `Cilium` `Canal`	1. Support for network policy 2. Multi AZ support 3. Quite easy to set up 4. Great documentation and community 5. Efficient L3 network 6. Configurable BGP (bird agent)
Kibana 8.9.1 Docs	Log Visualizer	`Grafana` `Datadog`	1. Free (compared to Datadog, which is awesome) 2. Built specifically for Elasticsearch, so the two are much more compatible 3. Easier to set up 4. Very lightweight

Before you begin

Note: Each Ansible role has a general and a specific README file. It is strongly encouraged to read them before firing off.

P.S.: Start with the README file of the main setup playbook

Create an API token on Hetzner
Create a server to act as the Terraform and Ansible provisioner (Ansible and Terraform must be installed on it)
Clone the project
In the modular_terraform folder, create a terraform.tfvars file
- The file must contain the following variables
  - hcloud_token = "APIKEY"
  - image_name = "ubuntu-22.04"
  - server_type = "cpx31"
  - location = "hel1"
Run terraform init to create the required lock file
Before firing off, run terraform plan to see if everything is alright
Run terraform apply
Go drink a cup of coffee and come back in 30 minutes or so (hopefully everything will be up and running by then (: )
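
The terraform.tfvars step above can be sketched as a small shell snippet. This is only a sketch: the `TF_DIR` variable is a hypothetical helper that defaults to the `modular_terraform` folder in the current directory, and `"APIKEY"` is the placeholder from the list above, to be replaced with the real Hetzner token.

```shell
set -eu

# TF_DIR is a hypothetical helper variable; it assumes the command is run
# from the root of the cloned repository.
TF_DIR="${TF_DIR:-modular_terraform}"
mkdir -p "$TF_DIR"

# Write the four variables listed above. "APIKEY" is a placeholder.
cat > "$TF_DIR/terraform.tfvars" <<'EOF'
hcloud_token = "APIKEY"
image_name   = "ubuntu-22.04"
server_type  = "cpx31"
location     = "hel1"
EOF

# The file holds a secret, so restrict it to the current user.
chmod 600 "$TF_DIR/terraform.tfvars"
```

After this, terraform init, terraform plan and terraform apply are run from inside that folder as described in the steps.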

Known issues

When creating SDS, CoreDNS and the webhook admission controller must be deleted, otherwise the CSPC will not be applied correctly
No alert manager
Haproxy could be a single point of failure (if there is no backup, namely a CDN)
Audit policy is far too general, which results in huge overhead
Terraform is limited to Hetzner
Communication is over the public network (encrypted, but still vulnerable to zero-day exploits since it is observable)
- Firewall policies minimize the observable scope
Since the update procedure on K8s differs from version to version, currently only the update from v1.25 to v1.26 is supported

Work flow

Run the following command for terraform to install dependencies and create the lock file

terraform init

Run the following command and check if there are any problems with terraform

terraform plan

Apply terraform modules and get started

terraform apply

Note: Add the Haproxy IP as the A record for the control plane domain, and add the worker IP addresses as A records for Grafana, Prometheus and Kibana
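
One way to confirm that an A record is in place is a small helper around `dig`. This is a sketch under assumptions: the function name `a_record_matches` is hypothetical, `dig` must be installed on the provisioner, and the hostname and IP in the usage line are placeholders.

```shell
# Succeeds if the A record set of HOST contains exactly the given IP.
# a_record_matches is a hypothetical helper, not part of the project.
a_record_matches() {
  local host="$1" ip="$2"
  # dig +short prints one address per line; -F -x matches a whole line literally.
  dig +short A "$host" | grep -Fxq "$ip"
}
```

For example, `a_record_matches k8s.example.com 203.0.113.10 && echo "record ok"` would print `record ok` once the record has propagated.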

Check if Prometheus works
Note: Check if all metrics are exposed properly

Check if Grafana works
Note: All dashboards are provisioned in a ConfigMap. To load a custom dashboard on startup, add it to dashboard as a .json file. It will be loaded into Grafana automatically

Check if Elasticsearch is green

kubectl get elasticsearch -n elastic-system
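
For scripted checks, the same query can be wrapped in a helper that fails unless every Elasticsearch resource reports green. This is a sketch: the function name `es_is_green` is hypothetical, and it assumes the Elasticsearch custom resource exposes its health under `.status.health`, which is the value shown in the HEALTH column of the command above.

```shell
# Succeeds only if at least one Elasticsearch resource exists in the
# namespace and all of them report health "green".
# es_is_green is a hypothetical helper; .status.health is an assumption
# about the Elasticsearch custom resource.
es_is_green() {
  local ns="${1:-elastic-system}" health h
  health="$(kubectl get elasticsearch -n "$ns" -o jsonpath='{.items[*].status.health}')"
  [ -n "$health" ] || return 1
  # jsonpath prints the values separated by spaces; check each one.
  for h in $health; do
    [ "$h" = "green" ] || return 1
  done
}
```

Usage: `es_is_green elastic-system && echo "Elasticsearch is green"`.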

Check if Kibana works

Check if Fluentbit works
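
A quick readiness check could look like the sketch below, assuming Fluentbit was deployed by its chart as a DaemonSet. The function name `daemonset_ready` is hypothetical, and the namespace and DaemonSet name passed to it are assumptions that depend on how the chart was installed.

```shell
# Succeeds if the DaemonSet has at least one desired pod and all desired
# pods are ready. daemonset_ready is a hypothetical helper.
daemonset_ready() {
  local ns="$1" name="$2" desired ready
  desired="$(kubectl get daemonset "$name" -n "$ns" -o jsonpath='{.status.desiredNumberScheduled}')"
  ready="$(kubectl get daemonset "$name" -n "$ns" -o jsonpath='{.status.numberReady}')"
  [ -n "$desired" ] && [ "$desired" -gt 0 ] && [ "$desired" = "$ready" ]
}
```

Usage, with a placeholder namespace and name: `daemonset_ready logging fluent-bit && echo "Fluentbit is running on every node"`.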

To clean up everything (including the nodes themselves)

terraform destroy
