Unexpected search results

Good day,

When using the “Search everything” bar at the top of the people tab, we’ve noticed some unexpected results - specifically when it comes to numbers.

  1. Having a number at the end of the search term does not seem to work:

    image

  2. Search terms containing just numbers sometimes return patient names:
    image

  3. Search terms with large numbers at the start seem to work fine:

More details around this behavior would be greatly appreciated, as we plan to include the usage of this search bar in our CHW training.

Hi @Anro !

I think I know what’s happening here. When indexing documents, the title and several other fields are split up into keywords. For example, “training masi 1” becomes “training”, “masi”, and “1”. The CHT then ignores keywords that are fewer than 3 characters long (in your case, “1”), because these would often return too many results.

When you type in a search term, it also gets split up into words, but I think there’s a bug where the short words aren’t ignored. That means in your case you’re searching for “Masi” AND “1”, which gets no results because the “1” isn’t found in the index.
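To make the mismatch concrete, here is a rough sketch in Python. It is not the actual CHT code: the function names, the exact-word matching, and the `drop_short` flag are all assumptions made for illustration, and only the 3-character threshold comes from the description above.

```python
MIN_KEYWORD_LENGTH = 3  # threshold taken from the description above


def index_keywords(text):
    """Index side: split into lowercase words and drop the short ones."""
    return {w for w in text.lower().split() if len(w) >= MIN_KEYWORD_LENGTH}


def query_keywords(term, drop_short):
    """Search side: split the term; optionally drop short words as well."""
    words = term.lower().split()
    if drop_short:
        words = [w for w in words if len(w) >= MIN_KEYWORD_LENGTH]
    return words


def matches(doc_text, term, drop_short=False):
    """A document matches only when every query word is indexed (AND)."""
    indexed = index_keywords(doc_text)
    wanted = query_keywords(term, drop_short)
    return bool(wanted) and all(w in indexed for w in wanted)


title = "training masi 1"
print(index_keywords(title))                     # "1" is never indexed
print(matches(title, "masi 1"))                  # suspected bug: "1" is still required
print(matches(title, "masi 1", drop_short=True)) # short words ignored on both sides
```

With `drop_short=False` the search asks for a keyword that was never indexed, which would explain why a trailing number makes the results disappear.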

The workaround is to only search using words that are at least 3 characters long.

Does this work for your use case? Will your production facilities be numbered as in the training instance, or will they have longer names?

I found a bug already filed for this: Searching for a contact name that has a short value will return no results. · Issue #7288 · medic/cht-core · GitHub

Hi @gareth

Thank you for all the provided info. It enabled us to look more critically at how we’re structuring our hierarchy place names, while also being able to delve deeper into the duplicate data capture issue.

I’m glad to hear the “inconsistent search result” bug has been resolved in the upcoming v4.6 version.

For now, I’ve conveyed the 3-character workaround. While it will help in some cases, in the rural areas, which arguably require the most care, there seems to be an established naming process:

So far location “addresses”/descriptions appear to be the main fields affected, specifically in the informal settlements where there are no roads or erfs. Every shack gets a number, and the area gets a letter or set of letters. Those two are then combined. The CHW adds more details in the description field, e.g. “With blue door next to dead tree, one row back.”
Whole suburbs in informal areas have also been described using single letters historically, e.g. Site B or Site C in Khayelitsha.

As outlined above this often leads to places being named “NW 1”, “NW 2”, etc.
It would be impossible to search for these entities, and would be quite laborious to find the correct place.
Implementing some sort of concatenation could be dangerous as a blanket implementation, for instance if we have a place called “15 Ys Street”, which for argument’s sake is a legitimate address.

We’ve dreamt up an idea of affixing special characters (like ~) to words with fewer than 3 characters. The search would also need to be tweaked to do the same thing in order to produce a match.
These special characters would then need to be hidden from display everywhere else.
It seems like quite an undertaking, and essentially it would have the same effect as simply lowering the minimum keyword length to 2 characters.
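For what it’s worth, the padding idea can be sketched in a few lines of Python. This is purely illustrative and not how the CHT works today: `pad_short`, `display`, and the choice of `~` as filler are hypothetical, and matching is simplified to exact whole words.

```python
MIN_KEYWORD_LENGTH = 3  # the current limit discussed in this thread
PAD = "~"               # hypothetical filler character


def pad_short(word):
    """Pad a short word with the filler until it reaches the minimum length."""
    return word.ljust(MIN_KEYWORD_LENGTH, PAD)


def index_keywords(text):
    """Index side: every word is kept, short ones in padded form."""
    return {pad_short(w) for w in text.lower().split()}


def matches(doc_text, term):
    """Search side: pad the query words the same way, then AND them."""
    indexed = index_keywords(doc_text)
    wanted = [pad_short(w) for w in term.lower().split()]
    return bool(wanted) and all(w in indexed for w in wanted)


def display(word):
    """Strip the filler before showing a keyword anywhere in the UI."""
    return word.rstrip(PAD)


print(index_keywords("NW 1"))    # padded keywords for "nw" and "1"
print(matches("NW 1", "nw 1"))   # found
print(matches("NW 1", "nw 2"))   # not found
print(display(pad_short("1")))   # filler removed again
```

Since every short word ends up indexed anyway, the result is the same as having no minimum length at all, with the extra cost of stripping the filler everywhere (and `display` would also mangle a word that legitimately ends in `~`).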

Do you know of another way we could approach this? Has anyone else run into such an issue?

The search indexes (almost) every field in every contact, so you don’t need to use just the name or address. In your example you could search for “blue door” and get a result. Or, if you have a unique identifier like a national ID, you could search for that either by typing in the number or, soon, by using barcode search. Unfortunately the search works particularly poorly for addresses, where the number is usually too short to index.

I think we need to take another look at freetext search. The existing solution doesn’t perform well, and it frequently causes misunderstandings or fails to account for use cases such as yours. For example, we index all fields in the contact, but it’s clear most people just search by the name of the person or place. It would be much more efficient to search only the name, perhaps efficient enough that we could search much shorter keywords, potentially even single characters.

Tagging @michael here in case he has any thoughts about the overall search experience.

That is a good point. Is there a way to include a sort of “subtitle” in the list tile or list entry in order to visually represent the reason why this record got a hit on such a search term?
I assume it will have a screen real-estate impact, but perhaps it’s a relatively small trade off compared to the clarity it provides.
E.g. at the moment, searching for “blue door” in this case would just show a list entry with the name “NW 1”, which, as you’ve noted, is a bit jarring.
What are your thoughts?
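As a rough illustration of the “subtitle” idea, here is a Python sketch. The contact structure, the field names, and `find_matches` are all made up for this example (this is not the CHT data model), and for simplicity it requires all search words to appear in the same field.

```python
def find_matches(contacts, term):
    """Return (name, subtitle) pairs; the subtitle shows the field that matched.

    The subtitle is None when the name itself matched, since the list entry
    already shows the name.
    """
    wanted = term.lower().split()
    results = []
    for contact in contacts:
        for field, value in contact.items():
            words = str(value).lower().split()
            if wanted and all(w in words for w in wanted):
                subtitle = None if field == "name" else f"{field}: {value}"
                results.append((contact["name"], subtitle))
                break  # one list entry per contact
    return results


contacts = [
    {"name": "NW 1", "description": "With blue door next to dead tree, one row back."},
    {"name": "NW 2", "description": "Green gate at the corner."},
]

for name, subtitle in find_matches(contacts, "blue door"):
    print(name)
    if subtitle:
        print("  " + subtitle)
```

The list entry for “NW 1” would then carry the description as a second line, so it is clear why it was returned for “blue door”.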

I’m excited to hear about the barcode searching capabilities. I believe some of our stakeholders have expressed interest not too long ago.

Thank you for taking the time to further investigate the search functionalities, and the impact thereof.
Perhaps, as an interim solution, numbers specifically could be handled differently from normal words?

We’ve had a lot of questions and issues raised where people think search is broken because the term isn’t shown in the list view. While this is working exactly as intended, it’s clearly not what people expect. Furthermore, indexing all those properties is incredibly expensive (the freetext views are easily the largest views in any CHT instance), so if people are only using it to search the name, then it would be more intuitive and far faster to index only the name. So fast, in fact, that we probably wouldn’t need the 3-letter keyword limit.

I suggest we have a look at the telemetry around searching and see if we can figure out what people are using the freetext search field for and then we can make an educated decision about how it should be optimised for both usability and performance.

Hi @gareth, has the telemetry scrutiny yielded any results regarding the community’s usage of the search function?