---
tags: CockroachDB, Kubernetes, Helm, Prometheus, InfluxDB, Grafana
---

# Implementing CockroachDB on Kubernetes

This document gives an overview of a summer student project by Albert Iho in 2019. It can also be used as an introductory tutorial to the topics discussed in the document. In the project [CockroachDB](https://www.cockroachlabs.com/) was installed using the [Helm](https://helm.sh/) chart, which is suboptimal for production clusters. The project also included [testing](https://codimd.web.cern.ch/s/BkSShRAGr) of CockroachDB. Production clusters should be initiated from hand-configured YAML files for better control of the resources and better overall performance.

[TOC]

## Setting up Kubernetes

To enable automated [certificate requesting](https://kubernetes.io/docs/tasks/tls/managing-tls-in-a-cluster/#create-a-certificate-signing-request) in your Kubernetes cluster, you need to add the label ```--labels cert_manager_api=true``` to your cluster creation command. Hopefully in the future you won't need to list all the labels, but currently a Kubernetes cluster on which CockroachDB can be installed is created with the following command:

```
last updated on:
[aiho@linuxvm ~]$ date
Thu Aug  1 15:06:32 CEST 2019

openstack coe cluster create <clustername> --keypair <yourkey> --node-count 3 --cluster-template kubernetes-1.13.3-2 --flavor m2.large --labels ingress_controller=traefik --labels kube_csi_enabled=True --labels kube_csi_version=v0.3.2 --labels kube_tag=v1.13.3-12 --labels container_infra_prefix=gitlab-registry.cern.ch/cloud/atomic-system-containers/ --labels manila_enabled=True --labels cgroup_driver=cgroupfs --labels cephfs_csi_enabled=True --labels cvmfs_csi_version=v0.3.0 --labels admission_control_list=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Priority --labels flannel_backend=vxlan --labels manila_version=v0.3.0 --labels cvmfs_csi_enabled=True --labels cvmfs_tag=qa --labels cephfs_csi_version=v0.3.0 --labels cert_manager_api=true --labels monitoring_enabled=true
```

The CockroachDB recommendations for production cluster setup can be found [here](https://www.cockroachlabs.com/docs/stable/recommended-production-settings.html).

## Installation

In my implementation I followed this [tutorial](https://www.cockroachlabs.com/docs/stable/orchestrate-cockroachdb-with-kubernetes.html) using [Helm](https://helm.sh/) with the help of the [CERN docs](https://clouddocs.web.cern.ch/clouddocs/containers/tutorials/helm.html). The other approach uses YAML configuration files, which I found much harder to manage than Helm.

The first step was to create the Tiller service account to manage Helm (a minimal sketch of that setup is shown below), and after that I installed CockroachDB using manually configured values for Storage and StorageClass:

```
helm install --name <name> --set Secure.Enabled=true stable/cockroachdb \
--set Storage=<your storage size>Gi \
--set StorageClass=geneva-cephfs-testing
```

The ```geneva-cephfs-testing``` StorageClass creates Manila shares, which means **it cannot be used from your private project**. Private projects are currently not allowed to use [dynamic provisioning](https://kubernetes.io/docs/concepts/storage/dynamic-provisioning/), which means you need to move into a (preferably) test environment to install CockroachDB.
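Before the ```helm install``` above can work, Tiller has to be in place. Here is a minimal sketch of the service account setup mentioned above, assuming the conventional ```tiller``` account name and a cluster-admin binding (check the CERN Helm docs for the exact procedure):

```
# service account for Tiller in the kube-system namespace
kubectl create serviceaccount tiller --namespace kube-system

# give Tiller the rights to manage cluster resources
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount kube-system:tiller

# install Tiller into the cluster using the new service account
helm init --service-account tiller
```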
After installing the Helm chart, you should check that the dynamic provisioning worked; any debugging can be done by checking the pod logs (```kubectl logs <pod name>```) or the cluster events (```kubectl get events```). If the Manila shares exist and were successfully bound, you can move on to confirming that the certificate requests were created successfully. Certificate signing requests can be viewed with ```kubectl get csr```, and the result should look somewhat like this:

```
# kubectl get csr
NAME                                    AGE   REQUESTOR                                               CONDITION
default.client.root                     21s   system:serviceaccount:default:my-release-cockroachdb   Pending
default.node.my-release-cockroachdb-0   15s   system:serviceaccount:default:my-release-cockroachdb   Pending
default.node.my-release-cockroachdb-1   16s   system:serviceaccount:default:my-release-cockroachdb   Pending
default.node.my-release-cockroachdb-2   15s   system:serviceaccount:default:my-release-cockroachdb   Pending
```

After listing all of the CSRs, you should check that they look somewhat like this:

```
# kubectl describe csr default.node.my-release-cockroachdb-0
Name:               default.node.my-release-cockroachdb-0
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Fri, 26 July 2019 07:36:35 -0500
Requesting User:    system:serviceaccount:default:my-release-cockroachdb
Status:             Pending
Subject:
  Common Name:    node
  Serial Number:
  Organization:   Cockroach
Subject Alternative Names:
  DNS Names:     localhost
                 my-release-cockroachdb-0.my-release-cockroachdb.default.svc.cluster.local
                 my-release-cockroachdb-0.my-release-cockroachdb
                 my-release-cockroachdb-public
                 my-release-cockroachdb-public.default.svc.cluster.local
  IP Addresses:  127.0.0.1
                 10.48.1.6
Events:  <none>
```

If everything seems okay, you can approve the CockroachDB *node* certificates with

```
kubectl certificate approve <node certificate name>
```

After that you should be able to see the pods in a running state:

```
# kubectl get pods
NAME                                READY   STATUS     RESTARTS   AGE
my-release-cockroachdb-0            0/1     Running    0          6m
my-release-cockroachdb-1            0/1     Running    0          6m
my-release-cockroachdb-2            0/1     Running    0          6m
my-release-cockroachdb-init-r5pn6   0/1     Init:0/1   0          6m
```

If there are any problems with the pod creation, use the logs and events to find out what is wrong. The newest event (and usually the reason why the pod isn't working) will be the lowest message in the list. If you are unable to debug it by yourself, you can present the logs and ask for help. A successful installation of the pods looks like this:

```
# kubectl get pods
NAME                                READY   STATUS      RESTARTS   AGE
my-release-cockroachdb-0            1/1     Running     0          10m
my-release-cockroachdb-1            1/1     Running     0          10m
my-release-cockroachdb-2            1/1     Running     0          10m
my-release-cockroachdb-init-r5pn6   0/1     Completed   0          10m
```

## Accessing the database

CockroachDB has a built-in SQL client. To access the client, a pod that runs it needs to be launched. The pod's YAML can be found [here](https://github.com/cockroachdb/cockroach/blob/master/cloud/kubernetes/client-secure.yaml), and the pod is by default named ```client-secure```. In case the link doesn't work, the [YAML file](https://codimd.web.cern.ch/s/HybblSufH#Creating-the-secure-client-to-run-CockroachDB) to create the pod is in the attachments section at the end of this document. After downloading the file, you will need to set ```serviceAccountName``` to ```<helm-installation-name>-cockroachdb```, where the Helm installation name is ```my-release``` by default in the tutorial. Next up is creating the client pod with:

```
$ kubectl apply -f /PATH/TO/FILE/client-secure.yaml
```

```kubectl apply``` creates a pod or modifies an existing one. You can also use ```kubectl create``` to create Kubernetes objects from files, but objects created with ```create``` usually warn when modified using ```apply```.
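As a quick sketch of the difference, using the same file as above:

```
# create fails if an object with the same name already exists
kubectl create -f client-secure.yaml

# apply creates the pod if it is missing, otherwise patches it in place
kubectl apply -f client-secure.yaml
```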
### Running the client

After successfully creating the pod (the pod's state should be _Running_), you can access the CockroachDB client by executing the [sql](https://www.cockroachlabs.com/docs/stable/use-the-built-in-sql-client.html) command and asking for a terminal inside the pod:

```
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach-certs \
--host=my-release-cockroachdb-public
```

On successful access, the client should display its welcome message. After that you can use the [CockroachDB SQL statements](https://www.cockroachlabs.com/docs/stable/sql-statements.html) to control the database. The tutorial suggests [creating a new user](https://www.cockroachlabs.com/docs/stable/create-user.html) to access the admin control panel. **You will need to create a user to connect to the database with a database string (connecting to the secure database that the Helm installation creates demands a user and a password).** A sketch of creating such a user from the client pod is shown at the end of this section.

### Other client commands

Once the client is running as a pod, you can use all [client commands](https://www.cockroachlabs.com/docs/stable/cockroach-commands.html) by giving them as parameters to ```kubectl exec```. For example, certificates could be generated by running the [cert](https://www.cockroachlabs.com/docs/stable/create-security-certificates.html) command inside your client:

```
$ kubectl exec -it cockroachdb-client-secure \
-- ./cockroach cert create-ca \
--certs-dir=[directory where certs will be generated] \
--ca-key=[directory where key will be generated]
```

(That is according to the CockroachDB documentation; personally I created a new account inside the database and used the root key and root certificate in the database connection arguments. Please don't do this in production.)

### Execute a bash terminal inside the container

It is also possible to execute a bash terminal inside the client-secure pod. This can be used to check the insides of the pod and to understand its file layout better. For example, if you want to check that the root certificates really exist, you can open bash inside the container and then navigate to the directory where they should be:

```
# kubectl exec -it cockroachdb-client-secure /bin/bash
root@cockroachdb-client-secure:/cockroach# cd ..
root@cockroachdb-client-secure:/# cd cockroach-certs
root@cockroachdb-client-secure:/cockroach-certs# ls -l
total 8
lrwxrwxrwx. 1 root root   52 Aug  1 08:25 ca.crt -> /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
-rw-r--r--. 1 root root 1139 Aug  1 08:25 client.root.crt
-r--------. 1 root root 1679 Aug  1 08:25 client.root.key
```

A bash terminal is really useful in situations where your account creation fails. Being able to go through the insides of the container and check that everything is or isn't where it should be helps a lot with debugging.
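As promised in [running the client](https://codimd.web.cern.ch/s/HybblSufH#Running-the-client), here is a sketch of creating the user needed for the Admin UI and for application connection strings, reusing the ```kubectl exec``` pattern from above; the username and password are illustrative placeholders:

```
kubectl exec -it cockroachdb-client-secure \
-- ./cockroach sql \
--certs-dir=/cockroach-certs \
--host=my-release-cockroachdb-public \
--execute="CREATE USER roach WITH PASSWORD 'choose-a-password';"
```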
## Exposing your database

You can monitor your CockroachDB database by exposing your container and then accessing the CockroachDB [Admin UI](https://www.cockroachlabs.com/docs/stable/admin-ui-overview.html) via a browser. There are two ways of exposing the container:

- [Using Kubernetes port-forward](https://codimd.web.cern.ch/s/HybblSufH#Using-Kubernetes-port-forward)
- [Exposing your cockroachdb-public service](https://codimd.web.cern.ch/s/HybblSufH#Exposing-a-service)

Port-forwarding is the easy way out here: if you are new to the system or just want to take a quick peek at the cluster while installing it, port-forward the thing you want to use. If you are going for a production-scale monitored cluster, you will have to expose a service and have your monitoring tool (preferably [Prometheus](https://prometheus.io/)) scrape off that.

The exposed address will be considered dangerous by web browsers; so far connecting to clusters has been tested on Google Chrome by manually accepting the connection to the site (clicking through the red warning boxes).

### Using Kubernetes port-forward

Port-forwarding is a nice and easy way to check that your setup is working or that the cluster is running. The idea is very simple: you set up a server that points to the pod inside your cluster, which makes the pod accessible via IP and port. Port-forwarding is done with:

```
kubectl port-forward <pod name/thing to expose> <port to expose>
```

You can expose any of the running CockroachDB pods.

(For Windows users: I think this is simpler on Linux.) If you are running your Kubernetes cluster in a virtual environment (for example in OpenStack), and the connection to the cluster is made from a virtual machine (for example ```aiadmXX```, where XX is the machine number), you will have to use an additional parameter while port-forwarding:

```
kubectl port-forward --address 0.0.0.0 my-release-cockroachdb-0 8080
```

will start listening for connections at ```aiadmXX:8080```. If you are running Kubernetes locally, you should be able to access the admin panel with

```
kubectl port-forward my-release-cockroachdb-0 8080
```

and then going to localhost:8080 in your browser. To access the admin panel you will need to use an account you created inside the database; if you haven't created an account yet, please follow [running the client](https://codimd.web.cern.ch/s/HybblSufH#Running-the-client).

### Exposing a service

In the case of CockroachDB, the service you want to expose is ```cockroachdb-public```, which in the tutorial has the Helm name attached to it, resulting in ```my-release-cockroachdb-public```. If you are installing CockroachDB on a fresh cluster, your default services should look somewhat like this:

```
# kubectl get service
NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)              AGE
kubernetes                      ClusterIP   10.254.0.1      <none>        443/TCP              9d
my-release-cockroachdb          ClusterIP   None            <none>        26257/TCP,8080/TCP   8d
my-release-cockroachdb-public   ClusterIP   10.254.35.127   <none>        26257/TCP,8080/TCP   8d
```

The [Kubernetes docs](https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address/) have a tutorial on exposing a [deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) by using a LoadBalancer. Unluckily **OpenStack doesn't support using a LoadBalancer**, so we're going to edit the type of ```my-release-cockroachdb-public``` to ```NodePort```. Open the service in the Kubernetes editor with

```
kubectl edit service my-release-cockroachdb-public
```

Then search for and edit the [type](https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types):

```
...
#BEFORE
type: ClusterIP
...

...
#AFTER
type: NodePort
...
```

then save and exit the editor.
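The same type change can also be made non-interactively; a quick sketch, using the same service name as above:

```
kubectl patch service my-release-cockroachdb-public -p '{"spec":{"type":"NodePort"}}'
```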
After editing the service we can expose it to make it accessible from outside the cluster. Exposing is done with ```kubectl expose```:

```
kubectl expose service my-release-cockroachdb-public --type=NodePort
```

The end result should look like this:

```
# kubectl get service
NAME                            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                          AGE
kubernetes                      ClusterIP   10.254.0.1      <none>        443/TCP                          9d
my-release-cockroachdb          ClusterIP   None            <none>        26257/TCP,8080/TCP               8d
my-release-cockroachdb-public   NodePort    10.254.35.127   <none>        26257:30952/TCP,8080:30496/TCP   8d
```

where the ports mean

```
# kubectl get service
...   CLUSTER-IP      EXTERNAL-IP   PORT(S)                             AGE
...   10.254.35.127   <none>        <internal>:<external>/<protocol>    8d
```

After exposing the service you should be able to connect to it by pointing your browser to ```<node ip>:<external port>```; your cluster's node IPs can be found with ```kubectl describe nodes```. To access the admin panel you will need to use an account you created inside the database; if you haven't created an account yet, please follow [running the client](https://codimd.web.cern.ch/s/HybblSufH#Running-the-client).

## Monitoring an exposed database

CockroachDB provides a ton of data from inside the database. This data can be scraped with a monitoring tool and then viewed inside the tool or written forward into another visualizing service. By default CockroachDB offers a [Prometheus](https://prometheus.io/)-formatted metrics service, and the metrics can be found at ```<exposed_ip>:<exposed_port>/_status/vars```. Accessing the metrics does not require logging in to the CockroachDB Admin UI.

### Deploying Prometheus

The CERN docs include [a guide for Prometheus](https://clouddocs.web.cern.ch/clouddocs/containers/tutorials/prometheus.html) which uses Helm to download and use **Prometheus-operator**, a tool that implements Prometheus and makes controlling it easier. Unluckily, as of now **there is no way to reconfigure that version of Prometheus-operator to perform remote_writing** due to it automatically updating its configuration settings. In this document we will instead use a manually deployed Prometheus, following a [tutorial](https://devopscube.com/setup-prometheus-monitoring-on-kubernetes/) from devopscube. All of the YAML files can be found in the [attachments](https://codimd.web.cern.ch/s/HybblSufH#Attachments) section at the end of this document. To follow the tutorial fully, Prometheus will be deployed into a new [namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) called "monitoring". The deployment steps are:

- Create a namespace
- Create and apply a [ClusterRole](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#role-and-clusterrole) [(YAML)](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-ClusterRole-YAML)
- Create and apply a [ConfigMap](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#understanding-configmaps-and-pods) [(YAML)](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-ConfigMap-YAML)
- Create and apply a [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment) [(YAML)](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-Deployment-YAML)

### Create a namespace

Creating a namespace is very simple; it is done with

```
kubectl create namespace <namespace name>
```

and all the objects created in the namespace can be accessed by adding ```-n <namespace name>``` to your command. The YAML files are configured to use a namespace called ```monitoring```:
```
kubectl create namespace monitoring
```

### Create a ClusterRole

[ClusterRoles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#role-and-clusterrole) are used to define accesses inside the cluster. The next step is to create a new ClusterRole for Prometheus so that it can access the metrics inside Kubernetes and CockroachDB. You can use the [YAML](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-ClusterRole-YAML) from the attachments to create a working setup. Please check that the YAML isn't outdated before applying it. Save the YAML as a .yaml file and then create the role with:

```
kubectl apply -f /PATH/TO/FILE/clusterrole.yaml
```

where clusterrole.yaml is the name of the file saved on your computer.

### Create a ConfigMap

A [ConfigMap](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#understanding-configmaps-and-pods) is used to set the [Prometheus configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/). While manually deploying Prometheus you can mostly ignore every part except [scraping](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) (data collection settings), and because we are deploying on Kubernetes, the [kubernetes_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config) (how to target Kubernetes objects) is also useful. The [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) parameter was used in the project to transfer the data read by Prometheus to InfluxDB, which projected the data into [Grafana](https://grafana.com/).

The given ConfigMap [YAML](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-ConfigMap-YAML) is configured to only scrape CockroachDB data: it scrapes (collects data from) the minions by targeting the pods by name, and the cluster is scraped by targeting the [exposed service](https://codimd.web.cern.ch/s/HybblSufH#Exposing-a-service). Save it and deploy it to the monitoring namespace with:

```
kubectl apply -f /PATH/TO/FILE/configmap.yaml -n monitoring
```

The deployed ConfigMap should be listed among the ConfigMap objects under the monitoring namespace:

```
# kubectl get configmap --all-namespaces
NAMESPACE    NAME                     DATA   AGE
...
monitoring   prometheus-server-conf   2      7d23h
...
```

### Create a Deployment

To deploy Prometheus you need to create a Deployment, which will request Kubernetes to create a pod. The [Prometheus deployment YAML](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-Deployment-YAML) from the attachments deploys Prometheus running in the monitoring namespace. Please check the internet for possibly newer images; old images might lead to security risks and outdated apps. Save the deployment as a .yaml file and turn Prometheus on with:

```
kubectl apply -f /PATH/TO/FILE/deployment.yaml --namespace=monitoring
```

In this deployment the [prometheus-server-conf](https://codimd.web.cern.ch/s/HybblSufH#Prometheus-ConfigMap-YAML) ConfigMap is mounted as a file inside the Prometheus pod, which means that you can rewrite the file inside the pod by applying a new ConfigMap named ```prometheus-server-conf```, which will fully reconfigure Prometheus. For example, if you also want to scrape Kubernetes-related data (for example node CPU usage), you can modify the configmap.yaml on your computer and then apply the changes:

```
# kubectl apply -f /PATH/TO/FILE/configmap.yaml
prometheus-server-conf configured
```

After that, delete the currently running prometheus-pod so that the Deployment replaces it with one that mounts the updated ConfigMap.
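As a sketch (assuming the ```app: prometheus-server``` pod label from the attached Deployment YAML), the pod can be deleted by its label, without looking up its generated name:

```
kubectl delete pod -n monitoring -l app=prometheus-server
```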
Deleting it by name looks like this:

```
# kubectl get pods --all-namespaces
NAMESPACE    NAME                                     READY   STATUS    RESTARTS   AGE
...
monitoring   prometheus-deployment-7c878596ff-9t9kx   1/1     Running   0          4h38m
...

# kubectl delete pod prometheus-deployment-7c878596ff-9t9kx -n monitoring
```

Because no prometheus-pod is running anymore, Kubernetes starts to set up a new pod following the updated ConfigMap, and a new prometheus-pod should be running very soon. You can follow the pod creation with:

```
kubectl get pods -n monitoring
```

and if the pod isn't being created or it is not running, check the logs of the pod with:

```
kubectl logs <prometheus pod name> -n monitoring
```

to see what is wrong with it. Most failed pods are caused by bad alignment in the configmap.yaml file. I used [YAMLLint](http://www.yamllint.com/) to fix the biggest mistakes in my YAML; if the pod creation is still failing after the file has been correctly linted, then the configuration's indentation structure is wrong. [Prometheus conf.good](https://github.com/prometheus/prometheus/blob/master/config/testdata/conf.good.yml) can be used to check for correct indentation.

A [Bourne shell](https://en.wikipedia.org/wiki/Bourne_shell) can be opened inside a running prometheus-pod to explore its insides with:

```
kubectl exec -it -n monitoring <prometheus pod name> /bin/sh
```

which should allow you to check that the ConfigMap is correctly mounted and to perform other debugging inside the container. (NOTE: root inside the container is actually user #1000, which means that by default the container might contain files written by Docker (root outside the container, i.e. user #0) that you will not be able to access. Always try to reconfigure the ConfigMap to make changes to the prometheus-pod.)

### Prometheus dashboard

Prometheus pods can be [port-forwarded](https://codimd.web.cern.ch/s/HybblSufH#Using-Kubernetes-port-forward) by also specifying the namespace that the pod is in. Usually Prometheus is forwarded through port 9090. After turning on a listener you can connect to the Prometheus dashboard, which after port-forwarding should be at:

```
http://<exposed_host>:9090
```

which should redirect to ```/graph```. The first thing to check is that the configuration matches the one you have deployed. Prometheus' active configuration is found at ```/config```, or Status > Configuration. If everything seems fine on the ```/config``` page, check that the configuration is landing scrapes on targets at ```/targets```. If everything is green, move on to creating a graph to visually confirm that you are getting data out of your cluster. A good metric to test the connection with is ```sys_uptime```.

## Visualising data

In the project the data from Prometheus was forwarded to an InfluxDB instance, which forwarded it to Grafana. This meant that Grafana could be used with the simplicity of InfluxDB querying, but with the efficiency of Prometheus scraping straight from CockroachDB. If you are planning to build a similar setup, all explanations and query names can be found at ```<exposed_service>:<external_port>/_status/vars```. Using ctrl+F to search for keywords like 'node' or 'sys_cpu' makes finding the correct endpoints much faster.
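The same endpoint can also be browsed from the command line. A quick sketch, where ```<node ip>``` comes from ```kubectl describe nodes```, 30496 is the example external port mapped to 8080 above, and ```-k``` skips verification of the cluster's self-signed certificate:

```
curl -k https://<node ip>:30496/_status/vars | grep sys_uptime
```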
## Connecting to the database with an application

In the project [SQLAlchemy](https://www.sqlalchemy.org/), a Python database connection toolkit, was used to connect to the database and perform modifications in it. Connecting to a database in SQLAlchemy is done by passing the [engine](https://docs.sqlalchemy.org/en/13/core/engines.html) a string. After receiving the string, the engine connects to the database and performs the given actions using the information given in the connection string. The connection string is formed of the following elements:

```python=
'dialect://username:password@hostname:port/database?connect_args'
```

The CockroachDB database created in this document is secure by default, so additional connection arguments are needed. In my implementation I used the root user's key and certificate to access the database. **This is not recommended for production clusters.**

```python=
# connection args to connect into a secure cockroach database
connect_args = {
    'sslmode': 'require',
    'sslrootcert': 'cockroach-certs/ca.crt',
    'sslkey': 'cockroach-certs/client.root.key',
    'sslcert': 'cockroach-certs/client.root.crt'
}
```

The connection arguments are passed to the ```create_engine``` function with the connection string, and the creation clause looks somewhat like this:

```python=
# engine for database connection
engine = create_engine(
    'cockroachdb://username:password@188.185.117.55:30952/testing',
    connect_args=connect_args,
    echo=True
)
```

```echo=True``` means that every action performed by the engine is printed to the command line; the printed feedback helps a lot while learning and debugging. If you get an error saying that you lack the rights to perform an SQL command (for example ```INSERT``` or ```SELECT```), open the [CockroachDB client](https://codimd.web.cern.ch/s/HybblSufH#Running-the-client) and [grant](https://www.cockroachlabs.com/docs/stable/grant.html) the user in the connection string all the rights needed to perform the actions you wish to do:

```sql=
GRANT ALL ON DATABASE testing TO username;
```

With this information you should be able to open a [SQLAlchemy session](https://docs.sqlalchemy.org/en/13/orm/session.html) and start learning how to use SQLAlchemy or using the database actively in your code.

## Attachments

### Creating the secure client to run CockroachDB

Remember to set ```serviceAccountName``` to ```<helm-installation-name>-cockroachdb```, where the Helm installation name is ```my-release``` by default in the tutorial.

```yaml=
apiVersion: v1
kind: Pod
metadata:
  name: cockroachdb-client-secure
  labels:
    app: cockroachdb-client
spec:
  serviceAccountName: cockroachdb
  initContainers:
  # The init-certs container sends a certificate signing request to the
  # kubernetes cluster.
  # You can see pending requests using: kubectl get csr
  # CSRs can be approved using: kubectl certificate approve <csr name>
  #
  # In addition to the client certificate and key, the init-certs entrypoint will symlink
  # the cluster CA to the certs directory.
  - name: init-certs
    image: cockroachdb/cockroach-k8s-request-cert:0.4
    imagePullPolicy: IfNotPresent
    command:
    - "/bin/ash"
    - "-ecx"
    - "/request-cert -namespace=${POD_NAMESPACE} -certs-dir=/cockroach-certs -type=client -user=root -symlink-ca-from=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
    env:
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    volumeMounts:
    - name: client-certs
      mountPath: /cockroach-certs
  containers:
  - name: cockroachdb-client
    image: cockroachdb/cockroach:v19.1.3
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: client-certs
      mountPath: /cockroach-certs
    # Keep a pod open indefinitely so kubectl exec can be used to get a shell to it
    # and run cockroach client commands, such as cockroach sql, cockroach node status, etc.
    command:
    - sleep
    - "2147483648" # 2^31
  # This pod isn't doing anything important, so don't bother waiting to terminate it.
  terminationGracePeriodSeconds: 0
  volumes:
  - name: client-certs
    emptyDir: {}
```

### Prometheus ClusterRole YAML

```yaml=
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: default
  namespace: monitoring
```

### Prometheus ConfigMap YAML

The ConfigMap was linted automatically by [YAML Lint](http://www.yamllint.com/).

```yaml=
---
apiVersion: v1
data:
  prometheus.rules: ""
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    # forwarding the data read by Prometheus
    remote_write:
      - url: "https://url-to-write-to"
        tls_config:
          insecure_skip_verify: true
    scrape_configs:
      # scraping targeting the service which has all the minion endpoints
      - job_name: cockroach-cluster
        scheme: 'http'
        tls_config:
          insecure_skip_verify: true
        metrics_path: /_status/vars
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;my-release-cockroachdb-public;http
      # scraping to get data from a single node
      - job_name: cockroach-minion0  # job name in prometheus
        scheme: 'http'  # connection type
        tls_config:
          insecure_skip_verify: true
        metrics_path: /_status/vars  # url to scrape
        kubernetes_sd_configs:
          - role: pod  # role inside the kubernetes cluster
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_number]
            action: keep
            regex: default;my-release-cockroachdb-0;8080  # information to match the kubernetes labels given in source_labels
      # scraping to get data from a single node
      - job_name: cockroach-minion1  # job name in prometheus
        scheme: 'http'  # connection type
        tls_config:
          insecure_skip_verify: true
        metrics_path: /_status/vars  # url to scrape
        kubernetes_sd_configs:
          - role: pod  # role inside the kubernetes cluster
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_number]
            action: keep
            regex: default;my-release-cockroachdb-1;8080  # information to match the kubernetes labels given in source_labels
      # scraping to get data from a single node
      - job_name: cockroach-minion2  # job name in prometheus
        scheme: 'http'  # connection type
        tls_config:
          insecure_skip_verify: true
        metrics_path: /_status/vars  # url to scrape
        kubernetes_sd_configs:
          - role: pod  # role inside the kubernetes cluster
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_number]
            action: keep
            regex: default;my-release-cockroachdb-2;8080  # information to match the kubernetes labels given in source_labels
kind: ConfigMap
metadata:
  labels:
    name: prometheus-server-conf
  name: prometheus-server-conf
  namespace: monitoring
```

### Prometheus Deployment YAML

```yaml=
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        -
          name: prometheus
          image: prom/prometheus:v2.2.1  # please use newest image
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          emptyDir: {}
```
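Once the namespace, ClusterRole, ConfigMap, and Deployment have all been applied, a quick sanity check of the monitoring stack looks like this (a sketch; the names and namespace come from the attachments above):

```
# everything should be listed under the monitoring namespace
kubectl get deployment,configmap,pods -n monitoring

# then reach the dashboard as described in the Prometheus dashboard section
kubectl port-forward -n monitoring <prometheus pod name> 9090
```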