CHT Instance crashing and failing to sync

@hareet, we are running docker version: ‘3.7’, image medicmobile/medic-os:cht-3.9.0-rc.2.

Running lsblk on the server returns the following; we are still looking for the culprit.
root@0ac0da41a982:/srv# lsblk
lsblk: dm-0: failed to get device path
lsblk: dm-0: failed to get device path
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop1 7:1 0 55.6M 1 loop
loop8 7:8 0 70.8M 1 loop
loop6 7:6 0 49.8M 1 loop
loop4 7:4 0 91.9M 1 loop
loop11 7:11 0 11.9M 1 loop
sr0 11:0 1 1.2G 0 rom
loop2 7:2 0 63.2M 1 loop
fd0 2:0 1 4K 0 disk
loop0 7:0 0 55.6M 1 loop
loop9 7:9 0 49.6M 1 loop
loop7 7:7 0 70.8M 1 loop
sda 8:0 0 3.4T 0 disk
|-sda2 8:2 0 1G 0 part
|-sda3 8:3 0 249G 0 part
`-sda1 8:1 0 1M 0 part
loop5 7:5 0 63.3M 1 loop
loop3 7:3 0 91.8M 1 loop
loop10 7:10 0 72.9M 1 loop

Three services fail at the same time: ‘horticulturalist’, ‘couchdb’ and ‘nginx’. Does anyone have an idea? Data collection has stalled and we have teams in the field.

@oyierphil Hey Philip, is it possible for you to set me up with SSH access for a few hours so I can poke around your setup? Please let me know and I’ll share my SSH key. Alternatively, we could set up a screen share at some time.

Sorry you are having issues here, but we will get this up and running!


@hareet, this is fine, ready to facilitate

@oyierphil Thanks! I have found the culprit!

@boda-project:~$ lsblk
sda                         8:0    0   3.4T  0 disk
├─sda1                      8:1    0     1M  0 part
├─sda2                      8:2    0     1G  0 part /boot
└─sda3                      8:3    0   249G  0 part
  └─ubuntu--vg-ubuntu--lv 253:0    0 124.5G  0 lvm  /

We have added a 3TB disk, but we will need to extend the filesystem on our partition to utilize that new space. From the output above, you can see that disk sda has 3 partitions, and only half of sda3 is allocated to the logical volume that holds our Ubuntu VM and data. That logical volume only has access to 124.5G, and is completely full.

To verify this idea, we logged into the container and can see the same logical volume reported as full on the directory that the CHT application saves everything to (/srv).

# docker exec -it medic-os /bin/bash
# df -h
/dev/mapper/ubuntu--vg-ubuntu--lv  123G  117G     0 100% /srv
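
To repeat this check later without reading the table by eye, something like the following could be used. It is only a minimal sketch: the function names `parse_use_pct` and `check_full` are made up for illustration, and the 90% default threshold is an arbitrary choice.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print the Use% number (without the % sign)
# of the first filesystem listed. `df -P` keeps every filesystem on one line,
# so the percentage is always field 5 of the second line.
parse_use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem holding a path is at or above a threshold.
# Usage: check_full PATH [LIMIT]   (LIMIT defaults to 90, an arbitrary value)
check_full() {
    path=$1
    limit=${2:-90}
    pct=$(df -P "$path" | parse_use_pct)
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $path is ${pct}% full"
    else
        echo "OK: $path is ${pct}% full"
    fi
}
```

Inside the container this would be called as `check_full /srv`.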

We will want to extend the partition to capture more of that 3TB you have added, and then extend the logical volume to utilize that space for the VM where you are running Docker.
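
As a rough outline of what that usually involves, here is a dry-run sketch. The device and volume names come from the lsblk output above, but the rest is assumption: that `growpart` is installed, that the root filesystem is ext4 (hence `resize2fs`), and that all free space should go to the one logical volume. It prints the steps instead of running them unless `APPLY=1` is set, and it should only be applied after a full backup.

```shell
#!/bin/sh
# Dry-run by default: each step is printed, not executed, unless APPLY=1 is set.
# DISK, PART_NUM and LV are read off the lsblk output in this thread;
# ext4 (resize2fs) and the presence of growpart are assumptions.
DISK=/dev/sda
PART_NUM=3
LV=/dev/ubuntu-vg/ubuntu-lv

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

extend_root_lv() {
    run growpart "$DISK" "$PART_NUM"      # grow partition 3 into the unallocated disk space
    run pvresize "${DISK}${PART_NUM}"     # let LVM see the larger physical volume
    run lvextend -l +100%FREE "$LV"       # hand all free extents to the logical volume
    run resize2fs "$LV"                   # grow the filesystem to fill the volume
}

extend_root_lv
```

Running it as-is only lists the four commands, which makes it easy to review them against your own layout first.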

Before attempting this, I would recommend you do a complete backup of your disks and partitions.

Please read over these and let me know if you have any questions.
Here are some helpful links:


@hareet, thank you, this is well noted. We will perform a backup of the disk, which is taking some time. We will let it run at night and complete the process in the morning.

@hareet, we let the backup run throughout the night using our virtualization software, vSphere, but it didn’t complete. We are exploring other ways of creating the backup now and will update.

@hareet
We have managed to provide more space to the volume; I realized the space was provisioned but had not been added to the LVM volume. Now the app refuses to start, with the following service status:
Package ‘horticulturalist’:
Service ‘horticulturalist’:
Status: Failure
Up: 1 seconds, Restarts: 150
Attributes: watched, expected
Service PID: None, Supervisor PID: 344

Package ‘medic-api’:
Service ‘medic-api’:
Status: Up
Up: 1 seconds, Restarts: 130
Attributes: watched, expected
Service PID: None, Supervisor PID: 387

Package ‘medic-core’:
Service ‘couchdb’:
Status: Up
Up: 428 seconds, Restarts: 0
Attributes: watched, running, expected
Service PID: 444, Supervisor PID: 427
Service ‘nginx’:
Status: Down
Up: 0 seconds, Restarts: -1
Attributes: watched, down
Service PID: None, Supervisor PID: 466
Service ‘openssh’:
Status: Up
Up: 428 seconds, Restarts: 0
Attributes: watched, running, expected
Service PID: 516, Supervisor PID: 501

Package ‘medic-couch2pg’:
Service ‘medic-couch2pg’:
Status: Failure
Up: 0 seconds, Restarts: 166
Attributes: watched, expected
Service PID: None, Supervisor PID: 562

Package ‘medic-rdbms’:
Service ‘postgresql’:
Status: Up
Up: 429 seconds, Restarts: 0
Attributes: watched, running, expected
Service PID: 632, Supervisor PID: 610

Package ‘medic-sentinel’:
Service ‘medic-sentinel’:
Status: Up
Up: 1 seconds, Restarts: 164
Attributes: watched, running, expected
Service PID: 19022, Supervisor PID: 659

Package ‘system-services’:
Service ‘cron’:
Status: Up
Up: 429 seconds, Restarts: 0
Attributes: watched, running, expected
Service PID: 725, Supervisor PID: 700
Service ‘syslog’:
Status: Up
Up: 429 seconds, Restarts: 0
Attributes: watched, running, expected
Service PID: 749, Supervisor PID: 737

And error logs as below:
[2023-01-10 06:47:56] 2023-01-10 06:47:56 ERROR: Error watching sentinel changes, restarting: { FetchError: invalid json response body at http://haproxy:5984/medic-sentinel/ reason: Unexpected token < in JSON at position 0
[2023-01-10 06:47:56] at /srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/node-fetch/lib/index.js:272:32
[2023-01-10 06:47:56] at
[2023-01-10 06:47:56] at process._tickCallback (internal/process/next_tick.js:188:7)
[2023-01-10 06:47:56] message: ‘invalid json response body at http://haproxy:5984/medic-sentinel/ reason: Unexpected token < in JSON at position 0’,
[2023-01-10 06:47:56] type: ‘invalid-json’,
[2023-01-10 06:47:56] [stack]: ‘FetchError: invalid json response body at http://haproxy:5984/medic-sentinel/ reason: Unexpected token < in JSON at position 0\n at /srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/node-fetch/lib/index.js:272:32\n at \n at process._tickCallback (internal/process/next_tick.js:188:7)’ }
Expected a 401 when accessing db without authentication.
Instead we got a 503
[2023-01-10 06:47:56] 2023-01-10 06:47:56 ERROR: Fatal error initialising medic-api
[2023-01-10 06:47:56] 2023-01-10 06:47:56 ERROR: { Error: CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb
[2023-01-10 06:47:56] at ClientRequest.net.get (/srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/@medic/server-checks/src/checks.js:63:16)
[2023-01-10 06:47:56] at Object.onceWrapper (events.js:315:30)
[2023-01-10 06:47:56] at emitOne (events.js:116:13)
[2023-01-10 06:47:56] at ClientRequest.emit (events.js:211:7)
[2023-01-10 06:47:56] at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:543:21)
[2023-01-10 06:47:56] at HTTPParser.parserOnHeadersComplete (_http_common.js:112:17)
[2023-01-10 06:47:56] at Socket.socketOnData (_http_client.js:440:20)
[2023-01-10 06:47:56] at emitOne (events.js:116:13)
[2023-01-10 06:47:56] at Socket.emit (events.js:211:7)
[2023-01-10 06:47:56] at addChunk (_stream_readable.js:263:12)
[2023-01-10 06:47:56] [stack]: ‘Error: CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb\n at ClientRequest.net.get (/srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/@medic/server-checks/src/checks.js:63:16)\n at Object.onceWrapper (events.js:315:30)\n at emitOne (events.js:116:13)\n at ClientRequest.emit (events.js:211:7)\n at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:543:21)\n at HTTPParser.parserOnHeadersComplete (_http_common.js:112:17)\n at Socket.socketOnData (_http_client.js:440:20)\n at emitOne (events.js:116:13)\n at Socket.emit (events.js:211:7)\n at addChunk (_stream_readable.js:263:12)’,
[2023-01-10 06:47:56] [message]: ‘CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb’ }
[2023-01-10 06:47:59] 2023-01-10 06:47:59 INFO: Running server checks…
[2023-01-10 06:47:59] Node Environment Options: ‘--max_old_space_size=8192’
[2023-01-10 06:47:59] Node Version: 8.11.4 in production mode
[2023-01-10 06:47:59] COUCH_URL http://haproxy:5984/medic
[2023-01-10 06:47:59] COUCH_NODE_NAME couchdb@127.0.0.1
Expected a 401 when accessing db without authentication.
Instead we got a 503
[2023-01-10 06:48:02] 2023-01-10 06:48:02 ERROR: Fatal error initialising medic-api
[2023-01-10 06:48:02] 2023-01-10 06:48:02 ERROR: { Error: CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb
[2023-01-10 06:48:02] at ClientRequest.net.get (/srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/@medic/server-checks/src/checks.js:63:16)
[2023-01-10 06:48:02] at Object.onceWrapper (events.js:315:30)
[2023-01-10 06:48:02] at emitOne (events.js:116:13)
[2023-01-10 06:48:02] at ClientRequest.emit (events.js:211:7)
[2023-01-10 06:48:02] at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:543:21)
[2023-01-10 06:48:02] at HTTPParser.parserOnHeadersComplete (_http_common.js:112:17)
[2023-01-10 06:48:02] at Socket.socketOnData (_http_client.js:440:20)
[2023-01-10 06:48:02] at emitOne (events.js:116:13)
[2023-01-10 06:48:02] at Socket.emit (events.js:211:7)
[2023-01-10 06:48:02] at addChunk (_stream_readable.js:263:12)
[2023-01-10 06:48:02] [stack]: ‘Error: CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb\n at ClientRequest.net.get (/srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/@medic/server-checks/src/checks.js:63:16)\n at Object.onceWrapper (events.js:315:30)\n at emitOne (events.js:116:13)\n at ClientRequest.emit (events.js:211:7)\n at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:543:21)\n at HTTPParser.parserOnHeadersComplete (_http_common.js:112:17)\n at Socket.socketOnData (_http_client.js:440:20)\n at emitOne (events.js:116:13)\n at Socket.emit (events.js:211:7)\n at addChunk (_stream_readable.js:263:12)’,
[2023-01-10 06:48:02] [message]: ‘CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb’ }
[2023-01-10 06:48:05] 2023-01-10 06:48:05 INFO: Running server checks…
[2023-01-10 06:48:05] Node Environment Options: ‘--max_old_space_size=8192’
[2023-01-10 06:48:05] Node Version: 8.11.4 in production mode
[2023-01-10 06:48:05] COUCH_URL http://haproxy:5984/medic
[2023-01-10 06:48:05] COUCH_NODE_NAME couchdb@127.0.0.1
Expected a 401 when accessing db without authentication.
Instead we got a 503
[2023-01-10 06:48:08] 2023-01-10 06:48:08 ERROR: Fatal error initialising medic-api
[2023-01-10 06:48:08] 2023-01-10 06:48:08 ERROR: { Error: CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb
[2023-01-10 06:48:08] at ClientRequest.net.get (/srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/@medic/server-checks/src/checks.js:63:16)
[2023-01-10 06:48:08] at Object.onceWrapper (events.js:315:30)
[2023-01-10 06:48:08] at emitOne (events.js:116:13)
[2023-01-10 06:48:08] at ClientRequest.emit (events.js:211:7)
[2023-01-10 06:48:08] at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:543:21)
[2023-01-10 06:48:08] at HTTPParser.parserOnHeadersComplete (_http_common.js:112:17)
[2023-01-10 06:48:08] at Socket.socketOnData (_http_client.js:440:20)
[2023-01-10 06:48:08] at emitOne (events.js:116:13)
[2023-01-10 06:48:08] at Socket.emit (events.js:211:7)
[2023-01-10 06:48:08] at addChunk (_stream_readable.js:263:12)
[2023-01-10 06:48:08] [stack]: ‘Error: CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb\n at ClientRequest.net.get (/srv/software/medic-api/md5-IaUJW7p4rpaMwbWhKl5C1A==/node_modules/@medic/server-checks/src/checks.js:63:16)\n at Object.onceWrapper (events.js:315:30)\n at emitOne (events.js:116:13)\n at ClientRequest.emit (events.js:211:7)\n at HTTPParser.parserOnIncomingClient [as onIncoming] (_http_client.js:543:21)\n at HTTPParser.parserOnHeadersComplete (_http_common.js:112:17)\n at Socket.socketOnData (_http_client.js:440:20)\n at emitOne (events.js:116:13)\n at Socket.emit (events.js:211:7)\n at addChunk (_stream_readable.js:263:12)’,
[2023-01-10 06:48:08] [message]: ‘CouchDB security seems to be misconfigured, see: https://github.com/medic/cht-core/blob/master/DEVELOPMENT.md#enabling-a-secure-couchdb’ }
[2023-01-10 06:48:11] 2023-01-10 06:48:11 INFO: Running server checks…
[2023-01-10 06:48:11] Node Environment Options: ‘--max_old_space_size=8192’
[2023-01-10 06:48:11] Node Version: 8.11.4 in production mode
[2023-01-10 06:48:11] COUCH_URL http://haproxy:5984/medic
[2023-01-10 06:48:11] COUCH_NODE_NAME couchdb@127.0.0.1

I have gone through the configs again and everything looks fine, but for some reason the app isn’t loading. Does anyone have an idea where else to check? @diana, @mrjones, @hareet

Hi @oyierphil

It looks like your CouchDB server is still down. Can you check the logs there?

[notice] 2023-01-11T12:48:21.560519Z couchdb@127.0.0.1 <0.18632.1> b917e8915b haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:21.572907Z couchdb@127.0.0.1 <0.18633.1> 0b2bcc546c couch_httpd_auth: Authentication failed for user medic-sentinel from 172.18.$[notice] 2023-01-11T12:48:21.573299Z couchdb@127.0.0.1 <0.18633.1> 0b2bcc546c haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[warning] 2023-01-11T12:48:22.185433Z couchdb@127.0.0.1 <0.18639.1> 16a54b9a22 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:22.185934Z couchdb@127.0.0.1 <0.18639.1> 16a54b9a22 haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[notice] 2023-01-11T12:48:22.186729Z couchdb@127.0.0.1 <0.18640.1> aee821a744 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:22.210467Z couchdb@127.0.0.1 <0.18641.1> 9973e547cb couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:22.210933Z couchdb@127.0.0.1 <0.18641.1> 9973e547cb haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[warning] 2023-01-11T12:48:22.211499Z couchdb@127.0.0.1 <0.18642.1> 545d5faf31 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:22.211960Z couchdb@127.0.0.1 <0.18642.1> 545d5faf31 haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[warning] 2023-01-11T12:48:22.774982Z couchdb@127.0.0.1 <0.18643.1> d45e81fdfe couch_httpd_auth: Authentication failed for user horticulturalist from 172.1$[notice] 2023-01-11T12:48:22.775549Z couchdb@127.0.0.1 <0.18643.1> d45e81fdfe haproxy:5984 172.18.0.2 undefined GET /medic/ 401 ok 1
[notice] 2023-01-11T12:48:24.177432Z couchdb@127.0.0.1 <0.18671.1> f413cb88d5 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:24.190766Z couchdb@127.0.0.1 <0.18673.1> bb6113ad05 couch_httpd_auth: Authentication failed for user medic-sentinel from 172.18.$[notice] 2023-01-11T12:48:24.191305Z couchdb@127.0.0.1 <0.18673.1> bb6113ad05 haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[warning] 2023-01-11T12:48:25.416446Z couchdb@127.0.0.1 <0.18739.1> 936b34eb64 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:25.416568Z couchdb@127.0.0.1 <0.18740.1> dbc4247062 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[notice] 2023-01-11T12:48:25.416921Z couchdb@127.0.0.1 <0.18739.1> 936b34eb64 haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[warning] 2023-01-11T12:48:25.439895Z couchdb@127.0.0.1 <0.18741.1> 46f66b16e3 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:25.440411Z couchdb@127.0.0.1 <0.18741.1> 46f66b16e3 haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[warning] 2023-01-11T12:48:25.442124Z couchdb@127.0.0.1 <0.18742.1> 06455fee12 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:25.442535Z couchdb@127.0.0.1 <0.18742.1> 06455fee12 haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[warning] 2023-01-11T12:48:25.537578Z couchdb@127.0.0.1 <0.18743.1> 6c094d3ff7 couch_httpd_auth: Authentication failed for user horticulturalist from 172.1$[notice] 2023-01-11T12:48:25.538016Z couchdb@127.0.0.1 <0.18743.1> 6c094d3ff7 haproxy:5984 172.18.0.2 undefined GET /medic/ 401 ok 1
[notice] 2023-01-11T12:48:26.882319Z couchdb@127.0.0.1 <0.18753.1> 5fb73f56aa haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:26.894769Z couchdb@127.0.0.1 <0.18754.1> 720ab94d0a couch_httpd_auth: Authentication failed for user medic-sentinel from 172.18.$[notice] 2023-01-11T12:48:26.895154Z couchdb@127.0.0.1 <0.18754.1> 720ab94d0a haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[warning] 2023-01-11T12:48:28.483554Z couchdb@127.0.0.1 <0.18783.1> 4a747fe794 couch_httpd_auth: Authentication failed for user horticulturalist from 172.1$[notice] 2023-01-11T12:48:28.484056Z couchdb@127.0.0.1 <0.18783.1> 4a747fe794 haproxy:5984 172.18.0.2 undefined GET /medic/ 401 ok 1
[warning] 2023-01-11T12:48:28.780454Z couchdb@127.0.0.1 <0.18784.1> 8873165338 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:28.780959Z couchdb@127.0.0.1 <0.18784.1> 8873165338 haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[notice] 2023-01-11T12:48:28.781412Z couchdb@127.0.0.1 <0.18785.1> 2659d0e966 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:28.805398Z couchdb@127.0.0.1 <0.18786.1> c20d227712 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[warning] 2023-01-11T12:48:28.805735Z couchdb@127.0.0.1 <0.18787.1> 258222178a couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:28.805863Z couchdb@127.0.0.1 <0.18786.1> c20d227712 haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1

[notice] 2023-01-11T12:48:42.545765Z couchdb@127.0.0.1 <0.19153.1> b036d6e2b6 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:42.556265Z couchdb@127.0.0.1 <0.19154.1> 9ef5746e57 couch_httpd_auth: Authentication failed for user medic-sentinel from 172.18.$[notice] 2023-01-11T12:48:42.556797Z couchdb@127.0.0.1 <0.19154.1> 9ef5746e57 haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[notice] 2023-01-11T12:48:45.227787Z couchdb@127.0.0.1 <0.19204.1> 575619db88 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:45.239706Z couchdb@127.0.0.1 <0.19205.1> 70520a2a30 couch_httpd_auth: Authentication failed for user medic-sentinel from 172.18.$[notice] 2023-01-11T12:48:45.240156Z couchdb@127.0.0.1 <0.19205.1> 70520a2a30 haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[warning] 2023-01-11T12:48:45.351570Z couchdb@127.0.0.1 <0.19206.1> 43bd9fa683 couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:45.352208Z couchdb@127.0.0.1 <0.19206.1> 43bd9fa683 haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[notice] 2023-01-11T12:48:45.353328Z couchdb@127.0.0.1 <0.19207.1> a5c3998c35 haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:45.365584Z couchdb@127.0.0.1 <0.19208.1> 0984091fc7 couch_httpd_auth: Authentication failed for user horticulturalist from 172.1$[notice] 2023-01-11T12:48:45.366002Z couchdb@127.0.0.1 <0.19208.1> 0984091fc7 haproxy:5984 172.18.0.2 undefined GET /medic/ 401 ok 1
[warning] 2023-01-11T12:48:45.373724Z couchdb@127.0.0.1 <0.19209.1> 371186236c couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:45.374315Z couchdb@127.0.0.1 <0.19209.1> 371186236c haproxy:5984 172.18.0.2 undefined GET /medic-sentinel/ 401 ok 1
[warning] 2023-01-11T12:48:45.374724Z couchdb@127.0.0.1 <0.19210.1> 139031d24e couch_httpd_auth: Authentication failed for user medic-api from 172.18.0.2
[notice] 2023-01-11T12:48:45.375082Z couchdb@127.0.0.1 <0.19210.1> 139031d24e haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1
[notice] 2023-01-11T12:48:47.836661Z couchdb@127.0.0.1 <0.19243.1> db60cbbbda haproxy:5984 172.18.0.2 undefined GET /medic 401 ok 1
[warning] 2023-01-11T12:48:47.848960Z couchdb@127.0.0.1 <0.19244.1> cd052a75c0 couch_httpd_auth: Authentication failed for user medic-sentinel from 172.18.$[notice] 2023-01-11T12:48:47.849484Z couchdb@127.0.0.1 <0.19244.1> cd052a75c0 haproxy:5984 172.18.0.2 undefined GET /_membership 401 ok 1

@diana, restarting the CouchDB and nginx services gives the following output:

Debug: Service ‘medic-core/couchdb’ exited with status 143
Info: Service ‘medic-core/couchdb’ restarted successfully
Success: Finished restarting services in package ‘medic-core’

Running curl -k https://YYYY@XXXX/api/v1/monitoring gives:

404 Not Found

404 Not Found


nginx/1.13.6

It looks like nginx is also not running. I have been trying since morning to find out why the CouchDB and nginx services are not running, with no solution yet.

Hi @oyierphil

It looks like CouchDB is up now, but the credentials it receives with requests are incorrect.
Can you try accessing CouchDB directly with curl from within the container and check if the old admin (or medic) password still works?
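
For example, a check along these lines could be run from a shell inside the container. It is a sketch under assumptions: `localhost:5984` is a guess at where CouchDB listens inside the container (override it with `COUCH_HOST`), and the helper names `couch_status` and `explain_status` are invented for illustration. It relies on CouchDB answering 401 to a request that carries wrong basic-auth credentials.

```shell
#!/bin/sh
# Request CouchDB's /_session endpoint with basic auth and print only the
# HTTP status code. COUCH_HOST defaults to localhost:5984, which is an
# assumption about the container's setup.
couch_status() {
    user=$1
    pass=$2
    curl -s -o /dev/null -w '%{http_code}' -u "$user:$pass" \
        "http://${COUCH_HOST:-localhost:5984}/_session"
}

# Translate the status code into a plain-language verdict.
explain_status() {
    case "$1" in
        200) echo "credentials accepted" ;;
        401) echo "name or password rejected" ;;
        000) echo "no response: CouchDB is not reachable at this address" ;;
        *)   echo "unexpected status $1" ;;
    esac
}
```

It would be used as `explain_status "$(couch_status medic 'the-old-password')"`, once per account you want to test.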

@diana, I get the same feedback:

404 Not Found

404 Not Found


nginx/1.13.6

curl: (7) Failed to connect to dharc-medic.jkuat.ac.ke port 5984: Connection refused

Restarting nginx exits with status 143
Debug: Service ‘medic-core/nginx’ exited with status 143
Info: Service ‘medic-core/nginx’ restarted successfully
Success: Finished restarting services in package ‘medic-core’

The medic-api service keeps failing; its log is below:
[2023-01-11 12:48:22] 2023-01-11 12:48:22 ^[[32mINFO^[[39m: Running server checks ^
[2023-01-11 12:48:22] Node Environment Options: ‘--max_old_space_size=8192’
[2023-01-11 12:48:22] Node Version: 8.11.4 in production mode
[2023-01-11 12:48:22] COUCH_URL http://haproxy:5984/medic
[2023-01-11 12:48:22] COUCH_NODE_NAME couchdb@127.0.0.1
[2023-01-11 12:48:22] 2023-01-11 12:48:22 ^[[31mERROR^[[39m: Error watching sentinel changes, restarting: { error: 'una$
[2023-01-11 12:48:22] reason: ‘Name or password is incorrect.’,
[2023-01-11 12:48:22] status: 401,
[2023-01-11 12:48:22] name: ‘unauthorized’,
[2023-01-11 12:48:22] message: ‘Name or password is incorrect.’ }
[2023-01-11 12:48:22] 2023-01-11 12:48:22 ^[[31mERROR^[[39m: Error watching sentinel changes, restarting: { error: 'una$
[2023-01-11 12:48:22] reason: ‘Name or password is incorrect.’,
[2023-01-11 12:48:22] status: 401,
[2023-01-11 12:48:22] name: ‘unauthorized’,
[2023-01-11 12:48:22] message: ‘Name or password is incorrect.’ }
[2023-01-11 12:48:22] 2023-01-11 12:48:22 ^[[31mERROR^[[39m: Fatal error initialising medic-api
[2023-01-11 12:48:22] 2023-01-11 12:48:22 ^[[31mERROR^[[39m: 'Environment variable 'COUCH_NODE_NAME' set to "couchdb@$[2023-01-11 12:48:25] 2023-01-11 12:48:25 ^[[32mINFO^[[39m: Running server checks ^
[2023-01-11 12:48:25] Node Environment Options: ‘--max_old_space_size=8192’
[2023-01-11 12:48:25] Node Version: 8.11.4 in production mode
[2023-01-11 12:48:25] COUCH_URL http://haproxy:5984/medic
[2023-01-11 12:48:25] COUCH_NODE_NAME couchdb@127.0.0.1
[2023-01-11 12:48:25] 2023-01-11 12:48:25 ^[[31mERROR^[[39m: Error watching sentinel changes, restarting: { error: 'una$[2023-01-11 12:48:25] reason: ‘Name or password is incorrect.’,
[2023-01-11 12:48:25] status: 401,

I have gone through the configs again and can’t see the culprit. I have followed all the previous troubleshooting sessions to see if I missed something, but our app is still down. Any idea where I should check next?

Okay - it’s back up!

From the logs you posted above, we can see failed authentication attempts for several users: medic-api, medic-sentinel. Based on those logs, we checked whether we could log in as medic-api and medic-sentinel with the passwords stored in /srv/storage/medic-core/passwd (see the docs on medic-os internals).
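
To see at a glance which accounts are being rejected, the CouchDB log can be tallied per user. This is a small sketch; the function name `count_auth_failures` is made up, and it assumes the log lines look like the ones pasted earlier in this thread.

```shell
#!/bin/sh
# Read CouchDB log lines on stdin and count "Authentication failed" entries
# per user, most frequent first.
count_auth_failures() {
    grep -o 'Authentication failed for user [^ ]*' \
        | awk '{ print $NF }' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

Feed it the log on standard input, for example `count_auth_failures < couchdb.log`, where `couchdb.log` stands in for wherever your CouchDB log actually lives.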

It looks like when only the CouchDB data was moved, the medic-os boot-up scripts ran and generated new passwords for the services to use. CouchDB still had the old passwords, so they needed to be updated in /srv/settings/medic-core/couchdb/local.ini:

[admins]
medic-api = secret_password
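
When editing that file it helps to confirm which accounts are actually defined under [admins]. Below is a minimal sketch for listing them; the function name `list_admins` is invented for illustration. Note that CouchDB replaces a plaintext value in this section with a hash the next time it starts, so the file will not show the password you typed afterwards.

```shell
#!/bin/sh
# Print the account names listed under [admins] in a CouchDB ini file,
# one per line. Comment lines (starting with ;) are skipped.
list_admins() {
    awk '
        /^\[/ { in_admins = ($0 == "[admins]"); next }   # track the current section
        in_admins && /=/ {
            name = $0
            sub(/[ \t]*=.*/, "", name)                      # keep only the key
            if (name !~ /^[ \t]*;/ && name != "") print name
        }
    ' "$1"
}
```

For this setup it would be called as `list_admins /srv/settings/medic-core/couchdb/local.ini`.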

Some notes:
Inside that VM, there appear to be several older cht-core tests and installations, along with their data, in various locations. I’d advise you to label or archive those other installations; it will make the server a lot easier to maintain.


@hareet, @diana, thank you very much for the support, we are up and running now, will perform clean up as advised tomorrow
