Elasticsearch: needs better matching for non-ASCII search #3550

Open
opened 2026-02-20 16:06:50 -05:00 by deekerman · 0 comments

Originally created by @vladaman on GitHub (Feb 2, 2019).

Searching with Elasticsearch is very unpredictable when using a language other than English. Strings are indexed in UTF-8, yet searching in the same locale returns bad results.

How to replicate:

  1. Create a new organization and name it: Mánička
  2. Save, then search for Mánička
  3. Some other results are returned, but none of them actually contains Mánička

So not only can the Account not be found, but the search results also contain unrelated records that do not match the search string.

A suggestion as well: it would be nice to convert UTF-8 names to ASCII equivalents, since many users will search without local characters, such as "Manicka" in this case.
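To illustrate the suggestion, here is a minimal Python sketch of the kind of folding meant here. It is not SuiteCRM code. The function name `strip_diacritics`, the analyzer name `folding_analyzer`, and the settings dict are hypothetical and only show the idea. Elasticsearch itself ships an `asciifolding` token filter that could do this at index and query time, but whether and how SuiteCRM's index could be configured that way is an assumption.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks so that e.g. 'Mánička' becomes 'Manicka'.

    This only handles characters that decompose into a base letter plus
    combining marks. Letters without such a decomposition (for example
    'ł' or 'ß') are left unchanged, so it is narrower than full ASCII folding.
    """
    # NFKD splits 'á' into 'a' + U+0301 and 'č' into 'c' + U+030C.
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical index settings (an assumption, not SuiteCRM's actual
# configuration): a custom analyzer using Elasticsearch's built-in
# "lowercase" and "asciifolding" token filters.
HYPOTHETICAL_INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "folding_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    }
}

if __name__ == "__main__":
    print(strip_diacritics("Mánička"))  # Manicka
```

If both the indexed name and the search string go through the same folding, a search for "Manicka" and a search for "Mánička" would end up comparing the same token.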

Might be related to bug #6771.

Reference
starred/SuiteCRM-SuiteCRM#3550