Shards Issue after migrating an AWS volume

Hello, we have migrated a volume from an AWS account to our on-premises server and mounted it under the /srv folder, as seen here:

version: '3.7'#################################### NOTICE ############ -

But on starting the services we are facing a shards issue with CouchDB, as seen here:

[info] 2024-11-29T03:50:09.687215Z couchdb@ <0.9.0> -------- Applicatio -

Any thoughts on how to fix this? Thanks.

Hi @Herbert

My suspicion is a bad copy, and the only recommendation would be to retry the migration.
Let us know how it went.

Actually, I retried with a new copy several times but got the same error.

Hmm. How are you copying the data? What are your steps?

I am using rsync between the two servers, i.e. the AWS one and the on-premises one, and I run the command from the on-premises one.

Can you please share the exact command?
Is your AWS instance in use - responding to requests while you are running rsync?

No, the VM on AWS just has a mounted folder that I pull using rsync with this command:

sudo rsync -avz --ignore-existing -e "ssh -i /home/myusername/.ssh/id_rsa" ubuntu@remoteipaddress:/folderonaws/ /srvFolderOnmyonPremises/

Thanks @Herbert. The command seems ok. I know we’ve had some experience with rsync failing over the network (maybe @hareet remembers these).
Can you try running a df -h on both directories after the copy to confirm that everything is identical?
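Since df -h reports usage per filesystem, it may not reveal a single damaged or missing shard file. A stricter check is to compare per-file checksums. The following is only a sketch: the function names tree_checksums and trees_match are made up for illustration, and it assumes sha256sum (GNU coreutils) is available on both machines.

```shell
# Print one "checksum  ./relative/path" line per file under a directory,
# sorted by path so that two listings can be compared line by line.
tree_checksums() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Return 0 when both trees hold the same files with the same contents,
# 1 when they differ, 2 when one of the directories cannot be read.
trees_match() {
    left=$(tree_checksums "$1") || return 2
    right=$(tree_checksums "$2") || return 2
    [ "$left" = "$right" ]
}
```

When the two directories live on different servers, as here, one option is to run tree_checksums on each server, save the output to a file, and diff the two files.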

Hi @Herbert !

Hm, the rsync command looks okay: the “-a” parameter carries most of the weight for our case. Can you describe the rest of the on-prem server setup? How many resources does it have?

From your logs, CouchDB restarts a few times:

  1. [info] 2024-11-29T03:55:46.365164Z couchdb@ <0.212.0> -------- Apache CouchDB 2.3.1 is starting.
  2. [info] 2024-11-29T04:01:15.491701Z couchdb@ <0.212.0> -------- Apache CouchDB 2.3.1 is starting.

Is there any log entry between those timestamps that points to a crash dump? Are you running CouchDB or other processes at the destination server?

A few things we want to check:

  • Sometimes, this error comes as a result of insufficient resources
  • Sometimes, we’ve seen this error as incorrect permissions on the destination server between the files that were transferred and the couchdb container
  • Can we try isolating the rsync transfer? (i.e. shut down all other processes and traffic, send data over, and then turn on processes)
  • Let’s also try copying one shard over at a time, ensuring the .shard folder is copied before the shard directory.
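For the permissions point, a rough way to spot mismatches is to list everything under the data directory that is not owned by the user the couchdb container runs as. This is a sketch only: not_owned_by is a hypothetical helper, and the expected uid is an assumption you would need to confirm inside your own container (for example with id).

```shell
# List entries under a data directory whose owner differs from the expected
# user. $1 is the directory, $2 is a user name or numeric uid (an assumption
# to be confirmed against the couchdb container).
not_owned_by() {
    find "$1" ! -user "$2"
}
```

An empty result means every file and directory under the path already belongs to that user; any line printed is a candidate for a chown.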

Restarting the container worked previously; my suspicion is that it worked in that scenario due to resource constraints.

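Another thing we could try after the above steps is removing --ignore-existing from the rsync command, since with that flag a file that already exists at the destination is skipped even if it differs from the source. Here is our complete rsync command, which works for the exact same scenario we are trying to complete in this task: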

sudo rsync -avhWt --no-compress --info=progress2 /folder/cht/couchdb-1/shards/eaaaaaa7-ffffffff/ /new-folder/cht/couchdb-1/shards/eaaaaaa7-ffffffff
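To copy one shard range at a time with the same flags, the per-shard commands can be generated first and reviewed before running them. A minimal sketch, assuming each shard range is a subdirectory of the shards folder (as in the path above); print_shard_copy_commands is a hypothetical helper that only prints the commands and copies nothing.

```shell
# Print one rsync command per shard-range subdirectory of the source shards
# folder. $1 is the source shards folder, $2 the destination shards folder.
# Paths are printed unquoted, so this assumes they contain no spaces.
print_shard_copy_commands() {
    src=${1%/}
    dst=${2%/}
    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        printf 'sudo rsync -avhWt --no-compress --info=progress2 %s/%s/ %s/%s\n' \
            "$src" "$name" "$dst" "$name"
    done
}
```

For example, print_shard_copy_commands /folder/cht/couchdb-1/shards /new-folder/cht/couchdb-1/shards prints one line per shard range, which can then be run one by one while watching the CouchDB logs.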