Issues running wdio tests locally

We’ve been trying to run the following three tests locally, each without success:

  1. default-wdio-mobile-local
  2. standard-wdio-local
  3. wdio-local

We’re running these tests to make sure that when we implement certain changes, we’re not introducing any instability along with them. Because we’ve run into failures on the CHT-owned branch’s own tests, we don’t have a baseline to work from to prove our work behaves as expected.

We’ve followed the dev setup doc where applicable, but after some tinkering it seems we also needed to run npx grunt in root for ddocs to be generated.
Are there any prerequisites or considerations we might have missed that could be causing the test failures?
What would the order of script execution be, or what’s the recommended approach?

We’ve run into some timeouts where elements are not showing up after 30000 ms; can this be considered a flaky test?
In other cases, some elements simply could not be found via the element selector.

Environment:
VMware
Memory - 4 GB
Hard Disk - 35 GB
Processors - 2
Ubuntu - 22.04.2 LTS
CHT core - branch version 4.2.x

Testing against:
Google Chrome - 122.0.6261.69

First off, it sounds like you pretty much have everything correct, but for the record, here are some background details:

Executing the e2e tests on historical (non-current) versions of cht-core is always going to be a bit dicey in terms of getting the proper local setup. The latest documented local dev environment setup steps may be misleading, since they are written to apply only to the latest cht-core code.

Probably the most reliable information to go off of would be to look at the GitHub action config for your particular cht-core version and see what versions/dependencies are being used there. For example, for cht-core versions ~4.3, I think the main things you will need are:

  • Node 16
  • Python 3
  • pyxform-medic
  • grunt-cli
  • cht-conf (generally speaking, the latest should be backwards compatible with fairly old versions)
  • Google Chrome (this might be the trickiest for really old cht-core versions. WebdriverIO is pretty sensitive to the Chrome version (and CI just runs on the latest for Ubuntu), so older WDIO versions might have issues with newer versions of Chrome. And, of course, it is a real pain to try to install previous versions of Chrome…)
  • Build the project:
    • npm ci
    • grunt
  • Run the tests (e.g. npm run wdio-local)

All that being said, it sounds like you have made it to the point where many of the tests pass, so I expect all of this is pretty much configured correctly.

We’ve run into some timeouts where elements are not showing up after 30000 ms; can this be considered a flaky test?
In other cases, some elements simply could not be found via the element selector.

From personal experience, I can confirm that test failures along these lines were a common symptom of flaky tests in cht-core (and re-running the tests would often get them to pass). When running the tests locally, I often found it useful to edit the wdio specs config to point at a particular test file so I could quickly re-run just the tests in that file, as sketched below.
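For example, something along these lines (the file name, location, and surrounding options of the wdio config depend on your cht-core version, so treat this as an illustrative sketch rather than the literal cht-core config):

```javascript
// Illustrative sketch only: in the wdio config used by the suite you are
// running, temporarily replace the glob that matches every *.wdio-spec.js
// file with the single spec you want to iterate on.
exports.config = {
  // ...leave the rest of the existing config untouched...

  specs: [
    // Re-run just this one file instead of the whole suite:
    './reports/bulk-delete.wdio-spec.js',
  ],
};
```

Just remember to revert the change before running the full suite again.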

A considerable amount of work has been done in recent versions of cht-core to improve the reliability of the e2e tests (and reduce flakiness). So, the more recent your cht-core version, the less I would expect this to be an issue.


Thank you for the detailed feedback @jkuester.

That is a point I was hoping to touch on. Would it not be advantageous to have documentation that’s split by CHT version, similar to how CouchDB splits its docs by version? Although, I do see how this could introduce quite a bit of overhead. Perhaps the differences between versions aren’t substantial enough, or don’t live long enough, to justify this?

Appreciate you taking the time to point us toward the GitHub Action config and write up a summarized list of concerns - thank you so much. The Chrome challenges have been interesting, for sure.

We’re experiencing quite variable test results. When this thread was originally opened, we had a ~70% failure rate on the three tests mentioned (with all of the standard tests failing in the previous run).
After running the tests again yesterday, without changing the setup, we’re now getting upwards of 75% success:
Mobile - 2 passed, 1 failed, 3 total (100% completed) in 00:03:22
Standard - 6 passed, 1 failed, 7 total (100% completed) in 00:15:08
Wdio-local - 58 passed, 18 failed, 76 total (100% completed) in 03:43:10

Below are the test outputs, for your interest (content too large to include in the body):

Now that we have a more condensed list of failures, we’ll definitely be taking your advice on targeting specific tests.

We’re keen to be in step with the newest CHT releases, and are endeavouring to do so as soon as possible. This thread is helping us get over the line, thank you for being so attentive :slight_smile: .

Would it not be advantageous to have documentation that’s split by CHT version?

For our documentation about running and maintaining a CHT instance, I think we typically try to make it relevant for all recent versions of the CHT (even those which are no longer officially “supported”). We call out which is the minimum version required to use a particular feature, etc. However, for the dev environment setup docs, there is just not the justification for making the same effort. Basically all CHT “development” is happening against the latest version of the code (except for very specific use cases such as yours). And to be honest we already have quite a bit of a challenge keeping the docs up to date with just what you need for the latest environment… :grimacing: (mostly it is tricky with the differences between various OS’s).

We’re keen to be in step with the newest CHT releases, and are endeavouring to do so as soon as possible. This thread is helping us get over the line, thank you for being so attentive :slight_smile: .

:heart:


Totally understandable, it’s no small undertaking. Joining the last round-up, it seems there’s quite an emphasis on tending to documentation throughout - which is awesome :slight_smile:!


After following the suggestion of trimming down the tests based on successful runs, and comparing my results with those of a colleague, we’re left with the following consistently failing items:

WDIO-local fails:

./contacts/add-custom-hierarchy.wdio-spec.js
./db/initial-replication.wdio-spec.js
./enketo/edit-report-with-attachment.wdio-spec.js
./purge/purge.wdio-spec.js
./reports/reports-subject.wdio-spec.js
./sms/gateway.wdio-spec.js
./users/add-user.wdio-spec.js

MOBILE fails:

./reports/bulk-delete.wdio-spec.js

STANDARD fails:

./enketo/immunization-visit.wdio-spec.js

My colleague made sure to align more closely with the pipeline setup suggested in the conversation above, while I continued to test with the initial setup, yet our results are basically identical.
I’m not sure what we’re missing.
Does anything stand out to you that these tests might have in common?

Unfortunately, there are just too many moving parts in those e2e tests to be able to easily narrow down a root cause (without extensive debugging). Looking at the logs you previously posted, nothing jumped out to me as an indicator of a specific problem. Which version of the cht-core code are you running tests for?

I am assuming your goal with getting these tests to run is that you have made additional code changes on top of your fork of cht-core and you are trying to make sure there has not been a regression in the existing functionality. If this is true, then one imperfect option would be to start by running the tests for your base cht-core version (without your custom changes) and see which ones fail. Then use that list as your baseline: you know you can expect those tests to continue to fail on your fork, but if any additional ones fail, there is a regression in the functionality on the fork.
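If it helps, a trivial way to keep track of that comparison is to diff the two failure lists. A minimal sketch (the spec names below are just placeholders taken from earlier in this thread, not a real baseline):

```javascript
// Hypothetical helper, not part of cht-core: flag specs that fail on the fork
// but not on the unmodified base version, since those are the likely regressions.
const baselineFailures = [
  './purge/purge.wdio-spec.js',
  './sms/gateway.wdio-spec.js',
];

const forkFailures = [
  './purge/purge.wdio-spec.js',
  './users/add-user.wdio-spec.js',
];

const regressions = forkFailures.filter((spec) => !baselineFailures.includes(spec));
console.log('Specs failing only on the fork:', regressions);
```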


That’s understandable, thank you for taking a look at the logs. We’re using the 4.2.x branch, specifically commit hash 8d99a9af559b486b3691e8dfb7f3302bb6cd373a.
Just for my own clarity, do the tests have a retry policy? We’ve noticed that some tests fail initially, and then after ~5 retries the asserted values are reflected.

You’re 100% on the money with your assumption. Everything seems to work, but I have no guarantee that I haven’t broken something else, or that, in some cases, I haven’t yet considered how the code behaves differently.
Thank you for your suggestion, we’ve been doing so in the background. Initially it seems that additional, totally unrelated tests fail when switching to the feature branch.
I’ll run a few more iterations and report back once the tests have settled.

Yes! It will retry the e2e tests 5 times.
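For reference, WebdriverIO has a couple of standard knobs for this kind of retrying. Here is a sketch of how such a retry policy can be expressed in a wdio config (I have not checked exactly which of these your 4.2.x config sets, so treat it as illustrative):

```javascript
// Illustrative only: common WebdriverIO retry options. Which of these (and
// what values) a given cht-core version actually uses is defined in its wdio config.
exports.config = {
  // Re-run a whole spec file when it fails:
  specFileRetries: 5,

  // With the Mocha framework, individual failing tests can also be retried:
  mochaOpts: {
    retries: 5,
  },
};
```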
