Hello,
There is minimal documentation on CHT installation on Kubernetes: Kubernetes vs Docker | Community Health Toolkit
I would like to know if anyone has implemented this and can share their experience or documentation, so that I can refer to it and make a case for migrating from Docker to Kubernetes. My focus is on having a much more robust infrastructure.
Thanks,
Job
Hi Job,
We’ve migrated CHT projects from Docker to Kubernetes and are working on adding guides in PR#1557. I’d be happy to walk through the process with you when you’re ready.
Hello @elijah
at ICRC we’d like to deploy our first CHT instance on Kubernetes. Are there any resources (Helm charts, YAML files) we can use?
thanks
Hi Frederic,
CHT’s Helm charts are hosted in this repository: GitHub - medic/helm-charts
Thanks @elijah
I suppose we should use helm-charts/charts/cht-chart-4x at main · medic/helm-charts · GitHub, correct?
Confirmed, Frederic. This is the correct link for the recommended CHT v4 deployment.
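If it helps, you can clone that repository and install from the local chart path (the clone location below is just an example, not a requirement):
git clone https://github.com/medic/helm-charts.git
cd helm-charts/charts/cht-chart-4x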
Hello,
I was looking at the Helm charts and tried to set them up locally in a Docker environment with Kubernetes, but I couldn’t find clear documentation or a clear process. Would you please share documentation for both Windows and Linux environments?
Hi Job,
Helm charts can be deployed to a Kubernetes cluster using the command helm install /path/to/chart. Additional details are available on the Helm documentation site: Helm | Using Helm
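For example, a minimal install against the cht-chart-4x chart might look like the following; the release name cht, the namespace, and the paths are placeholders rather than values from your setup:
helm install cht ./helm-charts/charts/cht-chart-4x \
  --namespace cht-dev --create-namespace \
  --values ./values.yaml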
Hi Elijah,
Yes, that’s exactly what I thought, but I kept getting this error on a couple of the YAML files:
Error: INSTALLATION FAILED: YAML parse error on cht-chart-4x/templates/api-deployment.yaml: error converting YAML to JSON: yaml: line 35: mapping values are not allowed in this context
Please share the values.yaml file that you’re using for this installation.
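While you do, helm lint and helm template can often point to the exact template and line that fails to render; the chart path, release name, and values file here are placeholders:
helm lint ./cht-chart-4x --values ./values.yaml
helm template cht ./cht-chart-4x --values ./values.yaml --debug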
project_name: "msfecare" # e.g. mrjones-dev
namespace: "ecare-dev" # e.g. "mrjones-dev"
chtversion: 4.10.0
# cht_image_tag: 4.1.1-4.1.1 #- This is filled in automatically by the deploy script. Don't uncomment this line.
# If images are cached, the same image tag will never be pulled twice. For development, this means that it's not
# possible to upgrade to a newer version of the same branch, as the old image will always be reused.
# For development instances, set this value to false.
cache_images: true
# Don't change upstream-servers unless you know what you're doing.
upstream_servers:
docker_registry: "public.ecr.aws/medic"
builds_url: "https://staging.dev.medicmobile.org/_couch/builds_4"
upgrade_service:
tag: 0.32
# CouchDB Settings
couchdb:
password: "Password" # Avoid using non-url-safe characters in password
secret: "f9053a0a-ef77-4be3-994d-87d6732600fd" # for prod, change to output of `uuidgen
user: "medic"
uuid: "7300115e-1a98-4607-a37c-50e0c9913767" # for prod, change to output of `uuidgen`
clusteredCouch_enabled: false
couchdb_node_storage_size: 100Mi
clusteredCouch:
noOfCouchDBNodes: 3
toleration: # This is for the couchdb pods. Don't change this unless you know what you're doing.
key: "dev-couchdb-only"
operator: "Equal"
value: "true"
effect: "NoSchedule"
ingress:
annotations:
groupname: "dev-cht-alb"
tags: "Environment=dev,Team=QA"
certificate: "arn:aws:iam::720541322708:server-certificate/2024-wildcard-dev-medicmobile-org-chain"
# Ensure the host is not already taken. Valid characters for a subdomain are:
# a-z, 0-9, and - (but not as first or last character).
host: "<subdomain>.dev.medicmobile.org" # e.g. "mrjones.dev.medicmobile.org"
hosted_zone_id: "Z3304WUAJTCM7P"
load_balancer: "dualstack.k8s-devchtalb-3eb0781cbb-694321496.eu-west-2.elb.amazonaws.com"
environment: "remote" # "local", "remote"
cluster_type: "eks" # "eks" or "k3s-k3d"
cert_source: "eks-medic" # "eks-medic" or "specify-file-path" or "my-ip-co"
certificate_crt_file_path: "/path/to/certificate.crt" # Only required if cert_source is "specify-file-path"
certificate_key_file_path: "/path/to/certificate.key" # Only required if cert_source is "specify-file-path"
nodes:
# If using clustered couchdb, add the nodes here: node-1: name-of-first-node, node-2: name-of-second-node, etc.
# Add equal number of nodes as specified in clusteredCouch.noOfCouchDBNodes
node-1: "" # This is the name of the first node where couchdb will be deployed
node-2: "" # This is the name of the second node where couchdb will be deployed
node-3: "" # This is the name of the third node where couchdb will be deployed
# For single couchdb node, use the following:
# Leave it commented out if you don't know what it means.
# Leave it commented out if you want to let kubernetes deploy this on any available node. (Recommended)
# single_node_deploy: "gamma-cht-node" # This is the name of the node where all components will be deployed - for non-clustered configuration.
# Applicable only if using k3s
k3s_use_vSphere_storage_class: "false" # "true" or "false"
# vSphere specific configurations. If you set "true" for k3s_use_vSphere_storage_class, fill in the details below.
vSphere:
datastoreName: "DatastoreName" # Replace with your datastore name
diskPath: "path/to/disk" # Replace with your disk path
# -----------------------------------------
# Pre-existing data section
# -----------------------------------------
couchdb_data:
preExistingDataAvailable: "false" #If this is false, you don't have to fill in details in local_storage or remote.
dataPathOnDiskForCouchDB: "data" # This is the path where couchdb data will be stored. Leave it as data if you don't have pre-existing data.
# To mount to a specific subpath (If data is from an old 3.x instance for example): dataPathOnDiskForCouchDB: "storage/medic-core/couchdb/data"
# To mount to the root of the volume: dataPathOnDiskForCouchDB: ""
# To use the default "data" subpath, remove the subPath line entirely from values.yaml or name it "data" or use null.
# for Multi-node configuration, you can use %d to substitute with the node number.
# You can use %d for each node to be substituted with the node number.
# If %d doesn't exist, the same path will be used for all nodes.
# example: test-path%d will be test-path1, test-path2, test-path3 for 3 nodes.
# example: test-path will be test-path for all nodes.
partition: "0" # This is the partition number for the EBS volume. Leave it as 0 if you don't have a partitioned disk.
# If preExistingDataAvailable is true, fill in the details below.
# For local_storage, fill in the details if you are using k3s-k3d cluster type.
local_storage: #If using k3s-k3d cluster type and you already have existing data.
preExistingDiskPath-1: "/var/lib/couchdb1" #If node1 has pre-existing data.
preExistingDiskPath-2: "/var/lib/couchdb2" #If node2 has pre-existing data.
preExistingDiskPath-3: "/var/lib/couchdb3" #If node3 has pre-existing data.
# For ebs storage when using eks cluster type, fill in the details below.
ebs:
preExistingEBSVolumeID-1: "vol-0123456789abcdefg" # If you have already created the EBS volume, put the ID here.
preExistingEBSVolumeID-2: "vol-0123456789abcdefg" # If you have already created the EBS volume, put the ID here.
preExistingEBSVolumeID-3: "vol-0123456789abcdefg" # If you have already created the EBS volume, put the ID here.
preExistingEBSVolumeSize: "100Gi" # The size of the EBS volume.
Thanks Job,
The following deployment files within the templates directory reference cht_image_tag, which has been replaced by chtversion in values.yaml:
- api-deployment.yaml
- couchdb-single-deployment.yaml
- haproxy-deployment.yaml
- healthcheck-deployment.yaml
- sentinel-deployment.yaml
After making this update the charts will compile successfully.
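As a rough illustration of the change (the exact image lines differ per file, so treat this as a sketch rather than the literal diff):
# Find every template still referencing the old value:
grep -rn "cht_image_tag" cht-chart-4x/templates/
# In each match, swap the value reference, roughly:
#   {{ .Values.cht_image_tag }}  ->  {{ .Values.chtversion }}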
Hello Elijah,
Thank you very much for this. I made the update and am now able to run installations. Now on to my next issue:
- All pods come up except couchdb and haproxy, whether I run a single CouchDB node or a clustered setup.
- The couchdb logs:
[notice] 2025-01-08T11:18:18.773849Z couchdb@127.0.0.1 <0.109.0> -------- config: [admins] medic set to '****' for reason nil
[info] 2025-01-08T11:18:18.812620Z couchdb@127.0.0.1 <0.254.0> -------- Apache CouchDB has started. Time to relax.
[notice] 2025-01-08T11:18:18.818237Z couchdb@127.0.0.1 <0.353.0> -------- rexi_server : started servers
[notice] 2025-01-08T11:18:18.819214Z couchdb@127.0.0.1 <0.357.0> -------- rexi_buffer : started servers
[warning] 2025-01-08T11:18:18.838070Z couchdb@127.0.0.1 <0.365.0> -------- creating missing database: _nodes
[info] 2025-01-08T11:18:18.838123Z couchdb@127.0.0.1 <0.366.0> -------- open_result error {not_found,no_db_file} for _nodes
[warning] 2025-01-08T11:18:18.889595Z couchdb@127.0.0.1 <0.381.0> -------- creating missing database: _dbs
[warning] 2025-01-08T11:18:18.889595Z couchdb@127.0.0.1 <0.382.0> -------- creating missing database: _dbs
[info] 2025-01-08T11:18:18.889640Z couchdb@127.0.0.1 <0.384.0> -------- open_result error {not_found,no_db_file} for _dbs
[notice] 2025-01-08T11:18:18.907307Z couchdb@127.0.0.1 <0.396.0> -------- mem3_reshard_dbdoc start init()
[notice] 2025-01-08T11:18:18.926356Z couchdb@127.0.0.1 <0.398.0> -------- mem3_reshard start init()
[notice] 2025-01-08T11:18:18.926461Z couchdb@127.0.0.1 <0.399.0> -------- mem3_reshard db monitor <0.399.0> starting
[notice] 2025-01-08T11:18:18.930542Z couchdb@127.0.0.1 <0.398.0> -------- mem3_reshard starting reloading jobs
[notice] 2025-01-08T11:18:18.930639Z couchdb@127.0.0.1 <0.398.0> -------- mem3_reshard finished reloading jobs
[info] 2025-01-08T11:18:18.952029Z couchdb@127.0.0.1 <0.405.0> -------- Apache CouchDB has started. Time to relax.
[info] 2025-01-08T11:18:18.952116Z couchdb@127.0.0.1 <0.405.0> -------- Apache CouchDB has started on http://0.0.0.0:5984/
[notice] 2025-01-08T11:18:18.967965Z couchdb@127.0.0.1 <0.426.0> -------- chttpd_auth_cache changes listener died because the _users database does not exist. Create the database to silence this notice.
[error] 2025-01-08T11:18:18.968299Z couchdb@127.0.0.1 emulator -------- Error in process <0.427.0> on node 'couchdb@127.0.0.1' with exit value:
{database_does_not_exist,[{mem3_shards,load_shards_from_db,[<<"_users">>],[{file,"src/mem3_shards.erl"},{line,430}]},{mem3_shards,load_shards_from_disk,1,[{file,"src/mem3_shards.erl"},{line,405}]},{mem3_shards,load_shards_from_disk,2,[{file,"src/mem3_shards.erl"},{line,434}]},{mem3_shards,for_docid,3,[{file,"src/mem3_shards.erl"},{line,100}]},{fabric_doc_open,go,3,[{file,"src/fabric_doc_open.erl"},{line,39}]},{chttpd_auth_cache,ensure_auth_ddoc_exists,2,[{file,"src/chttpd_auth_cache.erl"},{line,214}]},{chttpd_auth_cache,listen_for_changes,1,[{file,"src/chttpd_auth_cache.erl"},{line,160}]}]}
[error] 2025-01-08T11:18:18.968424Z couchdb@127.0.0.1 emulator -------- Error in process <0.427.0> on node 'couchdb@127.0.0.1' with exit value:
{database_does_not_exist,[{mem3_shards,load_shards_from_db,[<<"_users">>],[{file,"src/mem3_shards.erl"},{line,430}]},{mem3_shards,load_shards_from_disk,1,[{file,"src/mem3_shards.erl"},{line,405}]},{mem3_shards,load_shards_from_disk,2,[{file,"src/mem3_shards.erl"},{line,434}]},{mem3_shards,for_docid,3,[{file,"src/mem3_shards.erl"},{line,100}]},{fabric_doc_open,go,3,[{file,"src/fabric_doc_open.erl"},{line,39}]},{chttpd_auth_cache,ensure_auth_ddoc_exists,2,[{file,"src/chttpd_auth_cache.erl"},{line,214}]},{chttpd_auth_cache,listen_for_changes,1,[{file,"src/chttpd_auth_cache.erl"},{line,160}]}]}
[notice] 2025-01-08T11:18:19.015081Z couchdb@127.0.0.1 <0.474.0> -------- Missing system database _users
Waiting for cht couchdb
- The HAProxy logs:
# servers are added at runtime, in entrypoint.sh, based on couchdb-1.ecare.svc.cluster.local,couchdb-2.ecare.svc.cluster.local,couchdb-3.ecare.svc.cluster.local
server couchdb-1.ecare.svc.cluster.local couchdb-1.ecare.svc.cluster.local:5984 check agent-check agent-inter 5s agent-addr healthcheck.ecare.svc.cluster.local agent-port 5555
server couchdb-2.ecare.svc.cluster.local couchdb-2.ecare.svc.cluster.local:5984 check agent-check agent-inter 5s agent-addr healthcheck.ecare.svc.cluster.local agent-port 5555
server couchdb-3.ecare.svc.cluster.local couchdb-3.ecare.svc.cluster.local:5984 check agent-check agent-inter 5s agent-addr healthcheck.ecare.svc.cluster.local agent-port 5555
[alert] 007/111000 (1) : parseBasic loaded
[alert] 007/111000 (1) : parseCookie loaded
[alert] 007/111000 (1) : replacePassword loaded
[NOTICE] (1) : haproxy version is 2.6.17-a7cab98
[NOTICE] (1) : path to executable is /usr/local/sbin/haproxy
[ALERT] (1) : config : [/usr/local/etc/haproxy/backend.cfg:7] : 'server couchdb-servers/couchdb-1.ecare.svc.cluster.local' : parsing agent-addr failed. Check if 'healthcheck.ecare.svc.cluster.local' is correct address..
[ALERT] (1) : config : [/usr/local/etc/haproxy/backend.cfg:8] : 'server couchdb-servers/couchdb-2.ecare.svc.cluster.local' : parsing agent-addr failed. Check if 'healthcheck.ecare.svc.cluster.local' is correct address..
[ALERT] (1) : config : [/usr/local/etc/haproxy/backend.cfg:9] : 'server couchdb-servers/couchdb-3.ecare.svc.cluster.local' : parsing agent-addr failed. Check if 'healthcheck.ecare.svc.cluster.local' is correct address..
[ALERT] (1) : config : Error(s) found in configuration file : /usr/local/etc/haproxy/backend.cfg
[ALERT] (1) : config : Fatal errors found in configuration.
Hi Job,
The CouchDB logs indicate that it’s trying to create its system databases, but the writes are failing.
I would recommend setting up CHT locally using k3s, starting with a single-node deployment, then moving to a multi-node deployment, and finally to the production environment. This approach will make it easier to isolate where specific issues occur.
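In the meantime, a rough troubleshooting checklist for your current cluster; the namespace ecare and the healthcheck service name are taken from your HAProxy config, and the pod name and password are placeholders:
# See which pods are failing and why
kubectl get pods -n ecare
kubectl describe pod <couchdb-pod-name> -n ecare
# HAProxy cannot resolve the agent address; confirm the healthcheck service exists in that namespace
kubectl get svc -n ecare
# Verify CouchDB responds and reports healthy from inside the cluster
kubectl port-forward <couchdb-pod-name> 5984:5984 -n ecare
curl -u medic:<password> http://localhost:5984/_up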