Getting error 500 when loading docs

Some of my users are logging in for the first time, and when the docs start to load, they randomly get a 500 error.

Medic API logs:-

[2022-07-04 10:09:56] 2022-07-04 10:09:56 INFO: 56d31ab2-a94e-4754-8a92-a4ec791a55af Error while requesting 'normal' changes feed
[2022-07-04 10:09:56] 2022-07-04 10:09:56 INFO: {badmatch,{'EXIT',noproc},
[2022-07-04 10:09:56] [{couch_file,pread_binary,2,
[2022-07-04 10:09:56] [{file,"src/couch_file.erl"},{line,169}]},
[2022-07-04 10:09:56] {couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,157}]},
[2022-07-04 10:09:56] {couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,434}]},
[2022-07-04 10:09:56] {couch_btree,lookup,3,[{file,"src/couch_btree.erl"},{line,284}]},
[2022-07-04 10:09:56] {couch_btree,lookup_kpnode,5,
[2022-07-04 10:09:56] [{file,"src/couch_btree.erl"},{line,304}]},
[2022-07-04 10:09:56] {couch_btree,lookup,2,[{file,"src/couch_btree.erl"},{line,274}]},
[2022-07-04 10:09:56] {couch_bt_engine,open_docs,2,
[2022-07-04 10:09:56] [{file,"src/couch_bt_engine.erl"},{line,327}]},
[2022-07-04 10:09:56] {couch_changes,send_changes_doc_ids,6,
[2022-07-04 10:09:56] [{file,"src/couch_changes.erl"},{line,583}]}]}
[2022-07-04 10:09:56] 2022-07-04 10:09:56 INFO: c6620af5-ba32-40b9-afeb-b662274fdab6 Error while requesting 'normal' changes feed
[2022-07-04 10:09:56] 2022-07-04 10:09:56 INFO: {badmatch,{'EXIT',noproc},
[2022-07-04 10:09:56] [{couch_file,pread_binary,2,
[2022-07-04 10:09:56] [{file,"src/couch_file.erl"},{line,169}]},
[2022-07-04 10:09:56] {couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,157}]},
[2022-07-04 10:09:56] {couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,434}]},
[2022-07-04 10:09:56] {couch_btree,lookup,3,[{file,"src/couch_btree.erl"},{line,284}]},
[2022-07-04 10:09:56] {couch_btree,lookup_kpnode,5,
[2022-07-04 10:09:56] [{file,"src/couch_btree.erl"},{line,304}]},
[2022-07-04 10:09:56] {couch_btree,lookup,2,[{file,"src/couch_btree.erl"},{line,274}]},
[2022-07-04 10:09:56] {couch_bt_engine,open_docs,2,
[2022-07-04 10:09:56] [{file,"src/couch_bt_engine.erl"},{line,327}]},
[2022-07-04 10:09:56] {couch_changes,send_changes_doc_ids,6,
[2022-07-04 10:09:56] [{file,"src/couch_changes.erl"},{line,583}]}]}

[2022-07-04 10:09:56] RES 56d31ab2-a94e-4754-8a92-a4ec791a55af 105.160.99.227 - GET /medic/_changes?timeout=600000&style=all_docs&heartbeat=10000&since=5366995-g1AAAAJ7eJyd0EsKwjAQBuBgBcWdC9d6Akk0SaMbvYMH0M4kpRQfK1cu9CZ6EdGTqIdwKzUmpasitAQmkGE-_smaENJOAk26uNtjomHORuGQ2sPWttWICPSzLEuTAEhH9jb2rQWKUwRZNvNHgoGtMCswqh0mEajQqio2_2HLAguvDoulMHEoqmKrH3YssPHBYZpPDJq4IrZt2kpO9rLe2YPTjwNRKSPQ1AIvHrzl6z4cKKTgqOslvHvwmSd8O5BFqGFSL-HLg_kfmoUDI6Y4o7xsNP0CbhWnJA&limit=100 HTTP/1.0 500 - 4479.850 ms

CouchDB logs:-

[error] 2022-07-04T10:13:23.968735Z couchdb@127.0.0.1 <0.26306.2494> -------- CRASH REPORT Process (<0.26306.2494>) with 0 neighbors exited with reason: no function clause matching couch_db_updater:maybe_tag_doc({​​​​​​doc,<<“ba1e817b-4d1f-4708-a41e-8753c86f757b”>>,{​​​​​​0,[]}​​​​​​,[<<128,0,0,25>>,<<233,240,209,55,27,174,…>>,…],…}​​​​​​)(line:267) at couch_db_updater:‘-sort_and_tag_grouped_docs/2-lc$^0/1-0-’/2(line:264) <= lists:map/2(line:1239) <= couch_db_updater:handle_info/2(line:165) <= gen_server:try_dispatch/4(line:601) <= gen_server:handle_msg/5(line:667) <= couch_db_updater:init/1(line:45) <= proc_lib:init_p_do_apply/3(line:247) at gen_server:terminate/7(line:812) <= couch_db_updater:init/1(line:45) <= proc_lib:init_p_do_apply/3(line:247); initial_call: {​​​​​​couch_db_updater,init,[‘Argument__1’]}​​​​​​, ancestors: [<0.6536.2502>], messages: [], links: [<0.218.0>], dictionary: [{​​​​​​idle_limit,61000}​​​​​​,{​​​​​​io_priority,{​​​​​​db_update,<<“shards/00000000-1fffffff/me…”>>}​​​​​​}​​​​​​], trap_exit: false, status: running, heap_size: 6772, stack_size: 27, reductions: 1335

[info] 2022-07-04T10:13:23.993714Z couchdb@127.0.0.1 <0.218.0> -------- db shards/00000000-1fffffff/medic.1632510268 died with reason {​​​​​​function_clause,[{​​​​​​couch_db_updater,maybe_tag_doc,[{​​​​​​doc,<<“6197b8f4-2650-45e6-ab3e-b7e755e59560”>>,{​​​​​​0,[]}​​​​​​,[<<128,0,0,25>>,<<233,240,209,55,27,174,216,56,116,74,41,142,69,49,20,211>>,<<131,104,2,109,0,0,0,7,1,4,12,131,104,1,106,109,0,0,0,5,1,2,4,131,106>>],[],false,[{​​​​​​comp_body,<<1,4,12,131,104,1,106>>}​​​​​​,{​​​​​​size_info,[]}​​​​​​,{​​​​​​atts_stream,nil}​​​​​​,{​​​​​​ejson_size,4}​​​​​​,{​​​​​​ref,#Ref<0.0.1248067591.73865>}​​​​​​]}​​​​​​],[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,267}​​​​​​]}​​​​​​,{​​​​​​couch_db_updater,‘-sort_and_tag_grouped_docs/2-lc$^0/1-0-’,2,[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,264}​​​​​​]}​​​​​​,{​​​​​​lists,map,2,[{​​​​​​file,“lists.erl”}​​​​​​,{​​​​​​line,1239}​​​​​​]}​​​​​​,{​​​​​​couch_db_updater,handle_info,2,[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,165}​​​​​​]}​​​​​​,{​​​​​​gen_server,try_dispatch,4,[{​​​​​​file,“gen_server.erl”}​​​​​​,{​​​​​​line,601}​​​​​​]}​​​​​​,{​​​​​​gen_server,handle_msg,5,[{​​​​​​file,“gen_server.erl”}​​​​​​,{​​​​​​line,667}​​​​​​]}​​​​​​,{​​​​​​couch_db_updater,init,1,[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,45}​​​​​​]}​​​​​​,{​​​​​​proc_lib,init_p_do_apply,3,[{​​​​​​file,“proc_lib.erl”}​​​​​​,{​​​​​​line,247}​​​​​​]}​​​​​​]}​​​​​​

[error] 2022-07-04T10:13:23.993778Z couchdb@127.0.0.1 <0.24805.2509> -------- rexi_server: from: couchdb@127.0.0.1(<0.12614.2498>) mfa: fabric_rpc:update_docs/3 exit:{​​​​​​function_clause,[{​​​​​​couch_db_updater,maybe_tag_doc,[{​​​​​​doc,<<“6197b8f4-2650-45e6-ab3e-b7e755e59560”>>,{​​​​​​0,[]}​​​​​​,[<<128,0,0,25>>,<<233,240,209,55,27,174,216,56,116,74,41,142,69,49,20,211>>,<<131,104,2,109,0,0,0,7,1,4,12,131,104,1,106,109,0,0,0,5,1,2,4,131,106>>],[],false,[{​​​​​​comp_body,<<1,4,12,131,104,1,106>>}​​​​​​,{​​​​​​size_info,[]}​​​​​​,{​​​​​​atts_stream,nil}​​​​​​,{​​​​​​ejson_size,4}​​​​​​,{​​​​​​ref,#Ref<0.0.1248067591.73865>}​​​​​​]}​​​​​​],[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,267}​​​​​​]}​​​​​​,{​​​​​​couch_db_updater,‘-sort_and_tag_grouped_docs/2-lc$^0/1-0-’,2,[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,264}​​​​​​]}​​​​​​,{​​​​​​lists,map,2,[{​​​​​​file,“lists.erl”}​​​​​​,{​​​​​​line,1239}​​​​​​]}​​​​​​,{​​​​​​couch_db_updater,handle_info,2,[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,165}​​​​​​]}​​​​​​,{​​​​​​gen_server,try_dispatch,4,[{​​​​​​file,“gen_server.erl”}​​​​​​,{​​​​​​line,601}​​​​​​]}​​​​​​,{​​​​​​gen_server,handle_msg,5,[{​​​​​​file,“gen_server.erl”}​​​​​​,{​​​​​​line,667}​​​​​​]}​​​​​​,{​​​​​​couch_db_updater,init,1,[{​​​​​​file,“src/couch_db_updater.erl”}​​​​​​,{​​​​​​line,45}​​​​​​]}​​​​​​,{​​​​​​proc_lib,init_p_do_apply,3,[{​​​​​​file,“proc_lib.erl”}​​​​​​,{​​​​​​line,247}​​​​​​]}​​​​​​]}​​​​​​ 
[{​​​​​​couch_db,collect_results,3,[{​​​​​​file,“src/couch_db.erl”}​​​​​​,{​​​​​​line,1251}​​​​​​]}​​​​​​,{​​​​​​couch_db,collect_results_with_metrics,3,[{​​​​​​file,“src/couch_db.erl”}​​​​​​,{​​​​​​line,1233}​​​​​​]}​​​​​​,{​​​​​​couch_db,write_and_commit,4,[{​​​​​​file,“src/couch_db.erl”}​​​​​​,{​​​​​​line,1263}​​​​​​]}​​​​​​,{​​​​​​couch_db,update_docs,4,[{​​​​​​file,“src/couch_db.erl”}​​​​​​,{​​​​​​line,1155}​​​​​​]}​​​​​​,{​​​​​​fabric_rpc,with_db,3,[{​​​​​​file,“src/fabric_rpc.erl”}​​​​​​,{​​​​​​line,334}​​​​​​]}​​​​​​,{​​​​​​rexi_server,init_p,3,[{​​​​​​file,“src/rexi_server.erl”}​​​​​​,{​​​​​​line,140}​​​​​​]}​​​​​​]

[notice] 2022-07-04T10:13:23.994029Z couchdb@127.0.0.1 <0.22169.2508> b56fc793d3 echis-production-go-ke.lg-apps.com 44.199.169.171 medic-replication PUT /medic/6197b8f4-2650-45e6-ab3e-b7e755e59560?new_edits=false 500 ok 5

[error] 2022-07-04T10:13:23.994399Z couchdb@127.0.0.1 <0.19969.2513> -------- gen_server <0.19969.2513> terminated with reason: no function clause matching couch_db_updater:maybe_tag_doc({​​​​​​doc,<<“6197b8f4-2650-45e6-ab3e-b7e755e59560”>>,{​​​​​​0,[]}​​​​​​,[<<128,0,0,25>>,<<233,240,209,55,27,174,…>>,…],…}​​​​​​)(line:267) at couch_db_updater:‘-sort_and_tag_grouped_docs/2-lc$^0/1-0-’/2(line:264) <= lists:map/2(line:1239) <= couch_db_updater:handle_info/2(line:165) <= gen_server:try_dispatch/4(line:601) <= gen_server:handle_msg/5(line:667) <= couch_db_updater:init/1(line:45) <= proc_lib:init_p_do_apply/3(line:247)

last msg: {​​​​​​update_docs,<0.24805.2509>,[[{​​​​​​doc,<<“6197b8f4-2650-45e6-ab3e-b7e755e59560”>>,{​​​​​​0,[]}​​​​​​,[<<128,0,0,25>>,<<233,240,209,55,27,174,216,56,116,74,41,142,69,49,20,211>>,<<131,104,2,109,0,0,0,7,1,4,12,131,104,1,106,109,0,0,0,5,1,2,4,131,106>>],[],false,[{​​​​​​comp_body,<<1,4,12,131,104,1,106>>}​​​​​​,{​​​​​​size_info,[]}​​​​​​,{​​​​​​atts_stream,nil}​​​​​​,{​​​​​​ejson_size,4}​​​​​​,{​​​​​​ref,#Ref<0.0.1248067591.73865>}​​​​​​]}​​​​​​]],[],true,true}​​​​​​

state: {​​​​​​db,1,<<“shards/00000000-1fffffff/medic.1632510268”>>,“./data/shards/00000000-1fffffff/medic.1632510268.couch”,{​​​​​​couch_bt_engine,{​​​​​​st,“./data/shards/00000000-1fffffff/medic.1632510268.couch”,<0.8843.2497>,#Ref<0.0.2030829571.125105>,[before_header,after_header,on_file_open],{​​​​​​db_header,7,669219,0,{​​​​​​2081313823,{​​​​​​651687,29,{​​​​​​size_info,1440362789,2117074764}​​​​​​}​​​​​​,85345309}​​​​​​,{​​​​​​2081315376,651716,76980604}​​​​​​,{​​​​​​2081283354,[],583836}​​​​​​,nil,nil,1398582393,1000,<<“b840cb641fe249b112168718c85ac066”>>,[{​​​​​​’couchdb@127.0.0.1’,0}​​​​​​],567723,1000}​​​​​​,false,{​​​​​​btree,<0.8843.2497>,{​​​​​​2081313823,{​​​​​​651687,29,{​​​​​​size_info,1440362789,2117074764}​​​​​​}​​​​​​,85345309}​​​​​​,#Fun<couch_bt_engine.id_tree_split.1>,#Fun<couch_bt_engine.id_tree_join.2>,undefined,#Fun<couch_bt_engine.id_tree_reduce.2>,snappy}​​​​​​,{​​​​​​btree,<0.8843.2497>,{​​​​​​2081315376,651716,76980604}​​​​​​,#Fun<couch_bt_engine.seq_tree_split.1>,#Fun<couch_bt_engine.seq_tree_join.2>,undefined,#Fun<couch_bt_engine.seq_tree_reduce.2>,snappy}​​​​​​,{​​​​​​btree,<0.8843.2497>,{​​​​​​2081283354,[],583836}​​​​​​,#Fun<couch_bt_engine.local_tree_split.1>,#Fun<couch_bt_engine.local_tree_join.2>,undefined,nil,snappy}​​​​​​,snappy,{​​​​​​btree,<0.8843.2497>,nil,#Fun<couch_bt_engine.purge_tree_split.1>,#Fun<couch_bt_engine.purge_tree_join.2>,undefined,#Fun<couch_bt_engine.purge_tree_reduce.2>,snappy}​​​​​​,{​​​​​​btree,<0.8843.2497>,nil,#Fun<couch_bt_engine.purge_seq_tree_split.1>,#Fun<couch_bt_engine.purge_seq_tree_join.2>,undefined,#Fun<couch_bt_engine.purge_tree_reduce.2>,snappy}​​​​​​}​​​​​​}​​​​​​,<0.19969.2513>,nil,669219,<<“1656929603990655”>>,{​​​​​​user_ctx,null,[],undefined}​​​​​​,[{​​​​​​<<“admins”>>,{​​​​​​[{​​​​​​<<“names”>>,[]}​​​​​​,{​​​​​​<<“roles”>>,[<<“national_admin”>>]}​​​​​​]}​​​​​​}​​​​​​],[#Fun<couch_doc.7.79513450>,#Fun<couch_doc.7.79513450>],nil,nil,nil,[{​​​​​​default_security_object,[]}​​​​​​,{​​​​​​time
out,100}​​​​​​,{​​​​​​create_if_missing,true}​​​​​​,{​​​​​​user_ctx,{​​​​​​user_ctx,<<“medic-replication”>>,[<<“_admin”>>],<<“cookie”>>}​​​​​​}​​​​​​],undefined}​​​​​​

extra: []

[error] 2022-07-04T10:13:23.994643Z couchdb@127.0.0.1 <0.19969.2513> -------- CRASH REPORT Process (<0.19969.2513>) with 0 neighbors exited with reason: no function clause matching couch_db_updater:maybe_tag_doc({​​​​​​doc,<<“6197b8f4-2650-45e6-ab3e-b7e755e59560”>>,{​​​​​​0,[]}​​​​​​,[<<128,0,0,25>>,<<233,240,209,55,27,174,…>>,…],…}​​​​​​)(line:267) at couch_db_updater:‘-sort_and_tag_grouped_docs/2-lc$^0/1-0-’/2(line:264) <= lists:map/2(line:1239) <= couch_db_updater:handle_info/2(line:165) <= gen_server:try_dispatch/4(line:601) <= gen_server:handle_msg/5(line:667) <= couch_db_updater:init/1(line:45) <= proc_lib:init_p_do_apply/3(line:247) at gen_server:terminate/7(line:812) <= couch_db_updater:init/1(line:45) <= proc_lib:init_p_do_apply/3(line:247); initial_call: {​​​​​​couch_db_updater,init,[‘Argument__1’]}​​​​​​, ancestors: [<0.26577.2514>], messages: [], links: [<0.218.0>], dictionary: [{​​​​​​idle_limit,61000}​​​​​​,{​​​​​​io_priority,{​​​​​​db_update,<<“shards/00000000-1fffffff/me…”>>}​​​​​​}​​​​​​], trap_exit: false, status: running, heap_size: 6772, stack_size: 27, reductions: 1336


Hi @melema

Thank you for reporting. It looks like CouchDB is the root cause of your replication error.
Could you please provide some details about:

  1. How many docs your whole server has.
  2. Server load at the time of the failure.
  3. Whether this happens consistently for some users while working fine for others.
  4. How you are running CHT-Core: for example, are you using Docker?
  5. Which versions of CHT-Core and CouchDB you are using.

Thank you!

  1. 5252733 docs in the medic database.
  2. CPU usage was below 50% and memory usage was low.
  3. This happens consistently for some users and works fine for most users.
  4. We are using Docker.
  5. CHT-Core 3.13

@diana any update on this?

Can you please also share storage drive usage?
I’m going to ask our Site Reliability Engineers about this.

Disk usage is at 40%. There is more than enough space.

Hm, we haven't seen this issue before. After some quick searching, let's verify your disk IOPS. Can you find out that figure and monitor the metric under load?
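One way to watch actual IOPS on the host while users are syncing is to sample the Linux kernel counters in `/proc/diskstats` twice and divide the difference by the interval. The snippet below is only a rough sketch under assumptions: it assumes a Linux host, and the device name (`nvme0n1` in the usage note) is a placeholder that you would replace with the volume holding the CouchDB data. The function names are made up for illustration.

```python
import time


def read_io_counts(diskstats_text, device):
    """Return (reads_completed, writes_completed) for one block device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Layout: major minor name reads_completed reads_merged sectors_read
        #         ms_reading writes_completed ...
        if len(fields) >= 8 and fields[2] == device:
            return int(fields[3]), int(fields[7])
    raise ValueError(f"device {device!r} not found")


def iops(before, after, seconds):
    """Average read + write operations per second between two samples."""
    reads = after[0] - before[0]
    writes = after[1] - before[1]
    return (reads + writes) / seconds


def sample_iops(device, interval=5.0, path="/proc/diskstats"):
    """Sample the kernel counters twice and report IOPS over the interval."""
    with open(path) as f:
        first = read_io_counts(f.read(), device)
    time.sleep(interval)
    with open(path) as f:
        second = read_io_counts(f.read(), device)
    return iops(first, second, interval)
```

Calling `sample_iops("nvme0n1")` in a loop during a first-time login would show whether the measured rate gets close to the limit of the volume.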


The logs also mention two specific docs, ba1e817b-4d1f-4708-a41e-8753c86f757b and 6197b8f4-2650-45e6-ab3e-b7e755e59560. Can you try accessing those docs with curl or with Fauxton and see if you get any errors?

These two docs were not found when we searched for them.

AWS shows the disk IOPS as 3000.

@diana any update on this issue?

Can you please elaborate? What did you try and what happened exactly?

@diana I tried searching with Fauxton, as shown below:

Can you try with curl, please? Please share the result of the curl request.

Can you share the curl command?

curl <your instance url with authorization>/medic/<document id>

@diana
I am getting this:-

{
  "error": "not_found",
  "reason": "missing"
}

@melema
are you getting the same for both documents?

Yes @diana, we get the same for both document UUIDs:
{
  "error": "not_found",
  "reason": "missing"
}

For existing document uuids (that we pick from the instance as a sample), we get the document as a response.
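For reference, CouchDB uses the `reason` field of a 404 to distinguish a doc that was deleted (`"deleted"`) from an ID it has no revision for at all (`"missing"`). Getting `"missing"` for both IDs would be consistent with the failing writes never having been committed, although that is an inference rather than something the logs state. A small sketch of how the curl responses could be classified; `classify_lookup` is a made-up helper name, not part of any CHT or CouchDB tooling:

```python
import json


def classify_lookup(status, body):
    """Map a CouchDB GET /{db}/{docid} response to a short label."""
    if status == 200:
        return "exists"
    try:
        payload = json.loads(body)
    except ValueError:
        return "unknown"
    if not isinstance(payload, dict):
        return "unknown"
    if status == 404 and payload.get("error") == "not_found":
        # "deleted": a tombstone exists for this ID.
        # "missing": no revision of this ID is stored.
        reason = payload.get("reason")
        if reason in ("missing", "deleted"):
            return reason
    return "unknown"
```

Feeding it the status code and body from each curl call makes it easy to check a longer list of IDs taken from the crash reports.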

Looking at the CouchDB logs, it seems that a PUT request is crashing the shard?

[notice] 2022-07-04T10:13:23.994029Z couchdb@127.0.0.1 <0.22169.2508> b56fc793d3 echis-production-go-ke.lg-apps.com 44.199.169.171 medic-replication PUT /medic/6197b8f4-2650-45e6-ab3e-b7e755e59560?new_edits=false 500 ok 5

So medic-replication tries to write 6197b8f4-2650-45e6-ab3e-b7e755e59560 to the database.
And then, immediately the shard crashes.
What is medic-replication? Is it the user you were trying to log in with? Or is it some other user that pushes data into your DB?
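To see which users and requests line up with the crashes, the `[notice]` request lines can be pulled apart mechanically. The sketch below assumes the field order seen in the line quoted above (level, timestamp, node, pid, request id, host, client address, user, method, target, status, a status word, duration); the group names are my own labels, not official CouchDB terminology. The `new_edits=false` parameter is what replication uses to store a revision as-is instead of generating a new one.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Field order assumed from the sample log line in this thread.
LINE = re.compile(
    r"\[(?P<level>\w+)\] (?P<time>\S+) (?P<node>\S+) (?P<pid><[\d.]+>) "
    r"(?P<req_id>\S+) (?P<host>\S+) (?P<peer>\S+) (?P<user>\S+) "
    r"(?P<method>[A-Z]+) (?P<target>\S+) (?P<status>\d{3}) \S+ (?P<ms>\d+)"
)


def parse_request_line(line):
    """Return a dict of request fields, or None if the line does not match."""
    m = LINE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    url = urlsplit(fields["target"])
    fields["status"] = int(fields["status"])
    fields["path"] = url.path
    fields["query"] = parse_qs(url.query)
    return fields
```

Filtering the parsed lines for `status == 500` and grouping by `user` and `path` would show whether every failure comes from `medic-replication` writes or whether other users are affected too.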