Trouble with multi-node docker instance: error while creating mount source path

I’m trying to set up a multi-node instance in Docker Compose. The steps I’ve taken follow the 4.x compose steps, but with a few changes. Specifically, instead of downloading the single-node compose file, I’ve downloaded the multi-node file.

Otherwise my directory structure is:

multi-couch-test
    - compose
        - cht-core.yml
        - cht-couchdb-clustered.yml  
    - compose.yml  
    - couchdb  
    - .env
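
For anyone recreating that layout, it’s just two directories, the .env file, and the two compose files downloaded per the 4.x docs — roughly:

$ mkdir -p multi-couch-test/compose multi-couch-test/couchdb
$ cd multi-couch-test
$ touch .env
# cht-core.yml and cht-couchdb-clustered.yml go into ./compose (downloaded per the 4.x hosting docs)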

And my .env looks like this:

NGINX_HTTP_PORT=8080
NGINX_HTTPS_PORT=8443
COUCHDB_USER=medic
COUCHDB_PASSWORD=password
CHT_COMPOSE_PROJECT_NAME=cht_4_app_developer
DOCKER_CONFIG_PATH=/var/home/mrjones/Documents/medicmobile/multi-couch-test
COUCHDB_SECRET=19f3b9fb1d7aba1ef4d1c5ed709512ee
COUCHDB_UUID=e7122b1e463de4449fb05b0c494b0224
COUCHDB_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-test/couchdb
CHT_COMPOSE_PATH=/var/home/mrjones/Documents/medicmobile/multi-couch-test/compose
CHT_NETWORK=cht_4_app_developer
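
(Side note: COUCHDB_SECRET and COUCHDB_UUID are just 32-character random hex strings — values like the ones above can be generated with, for example:)

$ openssl rand -hex 16    # prints a 32-character hex string; run once each for COUCHDB_SECRET and COUCHDB_UUID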

The error I get is in my cht-upgrade-service container after I run docker compose up -d:

Error response from daemon: error while creating mount source path '/docker-compose/srv3': mkdir /docker-compose: operation not permitted

    at ChildProcess.<anonymous> (/app/src/docker-compose-cli.js:35:25)
    at ChildProcess.emit (node:events:513:28)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:293:12)

I note that this alternates between srv3, srv2 and srv1.

Looking at the mount in the upgrade service compose file, I see it’s bound to ${CHT_COMPOSE_PATH}, and my .env file above points this to /var/home/mrjones/Documents/medicmobile/multi-couch-test/compose. That path is valid and exists. Even if I manually create the three srv* directories in there, the error persists.
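
(You can also confirm what the running upgrade service actually has mounted — the container name here is taken from the docker ps output further down:)

$ docker inspect --format '{{json .Mounts}}' multi-couch-test-cht-upgrade-service-1
# should show ${CHT_COMPOSE_PATH} bound to /docker-compose inside the container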

I seem to recall that multi-node “just worked” back in the early days of 4.x - is multi-node docker still supported? Or do you need to use k8s? I just want to set up a development instance of multi-node, and docker is a lot easier for me than k8s (or k3s, etc.).

I’m trying on CHT 4.11, but I found the same error on CHT 4.17. I’m on Docker version 28.0.2, build 0442a73 on Bluefin.

Doing some more testing, I re-ran my local setup as the root user, which yielded the same error. As well, the docker ps output is the same, showing the couch instances as Created, but not running:

CONTAINER ID   IMAGE                                                 COMMAND                   CREATED         STATUS         PORTS                                                                                NAMES
b39c35255459   public.ecr.aws/medic/cht-nginx:4.11.0                 "/docker-entrypoint.…"    6 minutes ago   Up 5 minutes   0.0.0.0:8080->80/tcp, [::]:8080->80/tcp, 0.0.0.0:8443->443/tcp, [::]:8443->443/tcp   cht_4_app_developer-nginx-1
06d2530c5359   public.ecr.aws/medic/cht-api:4.11.0                   "/bin/bash /service/…"    6 minutes ago   Up 6 minutes   5988/tcp                                                                             cht_4_app_developer-api-1
33e0a3ea69c7   public.ecr.aws/medic/cht-sentinel:4.11.0              "/bin/bash /service/…"    6 minutes ago   Up 6 minutes                                                                                        cht_4_app_developer-sentinel-1
8be42fc355a5   public.ecr.aws/medic/cht-haproxy:4.11.0               "/entrypoint.sh"          6 minutes ago   Up 6 minutes   5984/tcp                                                                             cht_4_app_developer-haproxy-1
5f3bde1b98e0   public.ecr.aws/medic/cht-couchdb:4.11.0               "tini -- /docker-ent…"    6 minutes ago   Created                                                                                             cht_4_app_developer-couchdb-2.local-1
165bb9f401b8   public.ecr.aws/medic/cht-couchdb:4.11.0               "tini -- /docker-ent…"    6 minutes ago   Created                                                                                             cht_4_app_developer-couchdb-3.local-1
4cfbf25429ac   public.ecr.aws/medic/cht-couchdb:4.11.0               "tini -- /docker-ent…"    6 minutes ago   Created                                                                                             cht_4_app_developer-couchdb-1.local-1
7f0333a3e62f   public.ecr.aws/medic/cht-haproxy-healthcheck:4.11.0   "/bin/sh -c \"/app/ch…"   6 minutes ago   Up 6 minutes                                                                                        cht_4_app_developer-healthcheck-1
42bfd4d8f726   public.ecr.aws/s5s3h4s7/cht-upgrade-service:latest    "node /app/src/index…"    6 minutes ago   Up 6 minutes                                                                                        multi-couch-test-cht-upgrade-service-1

However, I spun up an Ubuntu 22.04 VM and did not get this error! As well, the couch instances all started! I’m having a separate error (502 in the browser), but I suspect this is unrelated. Here’s the output of docker ps on my Ubuntu VM:

CONTAINER ID   IMAGE                                                 COMMAND                   CREATED          STATUS          PORTS                                                                      NAMES
28f6312adc17   public.ecr.aws/medic/cht-nginx:4.11.0                 "/docker-entrypoint.…"    13 minutes ago   Up 13 minutes   0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp   cht-nginx-1
0889a22c8847   public.ecr.aws/medic/cht-sentinel:4.11.0              "/bin/bash /service/…"    13 minutes ago   Up 13 minutes                                                                              cht-sentinel-1
9248a3e20be9   public.ecr.aws/medic/cht-api:4.11.0                   "/bin/bash /service/…"    13 minutes ago   Up 13 minutes   5988/tcp                                                                   cht-api-1
8cf8c1c6e6e8   public.ecr.aws/medic/cht-couchdb:4.11.0               "tini -- /docker-ent…"    13 minutes ago   Up 13 minutes   4369/tcp, 5984/tcp, 9100/tcp                                               cht-couchdb-3.local-1
d8a1184be7c9   public.ecr.aws/medic/cht-haproxy-healthcheck:4.11.0   "/bin/sh -c \"/app/ch…"   13 minutes ago   Up 13 minutes                                                                              cht-healthcheck-1
5ceaf058b12a   public.ecr.aws/medic/cht-couchdb:4.11.0               "tini -- /docker-ent…"    13 minutes ago   Up 13 minutes   4369/tcp, 5984/tcp, 9100/tcp                                               cht-couchdb-1.local-1
f99cb5f067ad   public.ecr.aws/medic/cht-haproxy:4.11.0               "/entrypoint.sh"          13 minutes ago   Up 13 minutes   5984/tcp                                                                   cht-haproxy-1
a461451d42be   public.ecr.aws/medic/cht-couchdb:4.11.0               "tini -- /docker-ent…"    13 minutes ago   Up 13 minutes   4369/tcp, 5984/tcp, 9100/tcp                                               cht-couchdb-2.local-1
215f7c908ce1   public.ecr.aws/s5s3h4s7/cht-upgrade-service:latest    "node /app/src/index…"    13 minutes ago   Up 13 minutes                                                                              upgrade-service-cht-upgrade-service-1

I did a pair debugging session with @jkuester and he pointed out that both the .env file and the upgrade service compose file are missing declarations of DB*_DATA for the three couch servers. If you look at the bind mounts for those in the cht-couchdb-clustered.yml file, you see the default value is ./srv1 (and 2 and 3 for the 2nd and 3rd):

services:
  couchdb-1.local:
    image: public.ecr.aws/medic/cht-couchdb:4.2.0
    volumes:
      - ${DB1_DATA:-./srv1}:/opt/couchdb/data

This would resolve to the path I saw in my error (/docker-compose/srv3) - makes a lot of sense! The upgrade service runs docker compose from inside its own container, where the compose files are mounted at /docker-compose, so the relative default ./srv3 resolves to /docker-compose/srv3 - and the host’s Docker daemon then tries to create that directory at the root of the host filesystem, which is presumably why it fails on Bluefin’s immutable root but worked on the Ubuntu VM.
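
(For anyone not familiar with the ${DB1_DATA:-./srv1} syntax: it means “use DB1_DATA if it’s set, otherwise fall back to the literal ./srv1” — the same default substitution as in the shell:)

$ unset DB1_DATA; echo "${DB1_DATA:-./srv1}"
./srv1
$ DB1_DATA=/some/absolute/path; echo "${DB1_DATA:-./srv1}"
/some/absolute/path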

So to fix this, the environment: section of my compose.yaml for the upgrade service now looks like this:

    environment:
      - COUCHDB_USER
      - COUCHDB_PASSWORD
      - COUCHDB_SECRET
      - COUCHDB_UUID
      - COUCHDB_DATA
      - COUCHDB_SERVERS
      - CLUSTER_PEER_IPS
      - SVC_NAME
      - SVC1_NAME
      - SVC2_NAME
      - SVC3_NAME
      - COUCHDB_LOG_LEVEL
      - MARKET_URL_READ
      - BUILDS_SERVER
      - NGINX_HTTP_PORT
      - NGINX_HTTPS_PORT
      - CERTIFICATE_MODE
      - SSL_VOLUME_MOUNT_PATH
      - SSL_CERT_FILE_PATH
      - SSL_KEY_FILE_PATH
      - COMMON_NAME
      - EMAIL
      - COUNTRY
      - STATE
      - LOCALITY
      - ORGANISATION
      - DEPARTMENT
      - DOCKER_CONFIG=/config
      - CHT_COMPOSE_PROJECT_NAME=${CHT_COMPOSE_PROJECT_NAME:-cht}
      - CHT_NETWORK=${CHT_NETWORK:-cht-net}
      - DOCKER_CONFIG_PATH
      - CHT_COMPOSE_PATH
      - DB1_DATA
      - DB2_DATA
      - DB3_DATA

And now my .env file looks like this - where I created srv1, srv2 and srv3 in the multi-couch-4.2/couchdb directory:

NGINX_HTTP_PORT=8081
NGINX_HTTPS_PORT=8444
COUCHDB_USER=medic
COUCHDB_PASSWORD=password
CHT_COMPOSE_PROJECT_NAME=4-2-multi
DOCKER_CONFIG_PATH=/var/home/mrjones/Documents/medicmobile/multi-couch-4.2
COUCHDB_SECRET=19f3b9fb1d7aba1ef4d1c5ed709512ee
COUCHDB_UUID=e7122b1e463de4449fb05b0c494b0224
COUCHDB_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-4.2/couchdb
CHT_COMPOSE_PATH=/var/home/mrjones/Documents/medicmobile/multi-couch-4.2/compose
CHT_NETWORK=4-2-multi
DB1_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-4.2/couchdb/srv1
DB2_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-4.2/couchdb/srv2
DB3_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-4.2/couchdb/srv3
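
For completeness, creating those three data directories ahead of time is just (paths specific to my machine — adjust for your own):

$ mkdir -p /var/home/mrjones/Documents/medicmobile/multi-couch-4.2/couchdb/srv{1,2,3}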

Now the three couch servers start! As well, there’s data in each of the srv* directories:

$ ls couchdb/srv* 

couchdb/srv1:
_dbs.couch  _nodes.couch  _replicator.couch  shards  _users.couch

couchdb/srv2:
_dbs.couch  _nodes.couch  _replicator.couch  shards  _users.couch

couchdb/srv3:
_dbs.couch  _nodes.couch  _replicator.couch  shards  _users.couch

Now I’m chasing down a 502 error in the browser caused by HAProxy not being able to talk to the cluster correctly:

[ALERT]    (1) : config : [/usr/local/etc/haproxy/backend.cfg:7] : 'server couchdb-servers/couchdb' : could not resolve address 'couchdb'.
[ALERT]    (1) : config : Failed to initialize server(s) addr.

It looks like HAProxy has a single-node config in there? Between restarts of HAProxy, I can verify that couchdb is indeed not reachable, but node one is:

$ docker exec  -it 4-2-multi-haproxy-1 curl  http://medic:password@couchdb-1.local:5984
{"couchdb":"Welcome","version":"2.3.1","git_sha":"c298091a4","uuid":"e7122b1e463de4449fb05b0c494b0224","features":["pluggable-storage-engines","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

$ docker exec  -it 4-2-multi-haproxy-1 curl  http://medic:password@couchdb:5984
curl: (6) Could not resolve host: couchdb
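
To see exactly what HAProxy rendered into that backend (and which server name it’s trying to resolve), you can also dump the config file referenced in the alert — container name as above:

$ docker exec 4-2-multi-haproxy-1 cat /usr/local/etc/haproxy/backend.cfg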

(ignore the old couch 2.x version - this is a CHT 4.2 install :sweat_smile: from when I thought maybe newer CHT versions had a multi-node regression. Or maybe don’t ignore it! I’m happy to redo all this on 4.18 since 4.2 is so far out of support)

OK!!! Problem solved. The final trick needed to make this all work was, as I suspected above, to set COUCHDB_SERVERS to something other than the default value, by adding this line to my .env file:

COUCHDB_SERVERS=couchdb-1.local,couchdb-2.local,couchdb-3.local

My instance came right up!
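
(If you want to double-check that the cluster actually formed, CouchDB’s _membership endpoint should list all three nodes — same credentials and container as the curl checks above:)

$ docker exec -it 4-2-multi-haproxy-1 curl http://medic:password@couchdb-1.local:5984/_membership
# expect all three couchdb-N.local nodes under "all_nodes" and "cluster_nodes"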

To wrap up, the issue was three-fold:

  1. Add 3 extra variables to the cht-upgrade-service compose file in the environment section:
       - DB1_DATA
       - DB2_DATA
       - DB3_DATA
    
  2. Add the correct path for each of the above env vars in your .env file. For me this was as shown below, but it will be unique to each deployment:
    DB1_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-test/couchdb/srv1
    DB2_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-test/couchdb/srv2
    DB3_DATA=/var/home/mrjones/Documents/medicmobile/multi-couch-test/couchdb/srv3
    
  3. Define each of the three couch nodes, also in the .env file:
    COUCHDB_SERVERS=couchdb-1.local,couchdb-2.local,couchdb-3.local
    

After all that is set up, you should just need to do docker compose up -d and away you go!
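
i.e., for the layout above, roughly:

$ cd /var/home/mrjones/Documents/medicmobile/multi-couch-test
$ docker compose up -d
$ docker ps   # all three couchdb-N.local containers should now show "Up" rather than sticking at "Created"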


In case anyone is looking to do this, we’ve made a few slight code tweaks and updated our documentation to make this a much easier process to get set up locally. Please see the manual install steps for app developer hosting and note that there’s now a “Multi-node” option for two of the steps.
