Recommendations for approximate text search in Data service?

rick-hangartner · September 6, 2015, 8:13pm

Hi, looking for your recommendations on the best way to do approximate text search in the Data service.

For simplicity, suppose a Data table with one string member that can hold multi-word strings. Assume users can misspell words and may not enter the words of a query in the same order they occur in the target string in the data table column.

One might use something like Lucene to do this kind of approximate matching. Presumably one would have to run a Lucene server that first matches a query string to a string (or strings) occurring in the Data service table, and then use that result to actually query Backendless. It doesn’t appear that one could run a Lucene instance on Backendless itself.

The Backendless Data service docs say that Backendless essentially implements the SQL-92 WHERE clause where the “LIKE” predicate is probably the only relevant approximate string matching predicate. Assuming “LIKE” is available, one might propose two general solutions as a half-way step in which one isn’t concerned with word order, only a general class of per-word misspellings:

One could reduce a query string to a pattern string and try to match to that.
One could create a column of stemmed strings in Backendless and reduce the user’s query string to a stemmed string, and then perhaps to a pattern string, and match to that.
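As a rough illustration of the first option, here is a minimal Python sketch that reduces each query word to a LIKE pattern and ANDs the per-word clauses together. The reduction rule (collapse doubled letters, replace vowel runs with `%`) is only one possible choice, and the function names and the `name` column are hypothetical. It assumes ASCII English text and that the comparison is case-insensitive or the stored strings are lowercased. It will not catch transposed letters.

```python
import re

VOWEL_RUN = re.compile(r"[aeiou]+")
REPEATED_CHAR = re.compile(r"(.)\1+")


def word_to_pattern(word):
    """Reduce one word to a LIKE fragment tolerant of vowel and doubled-letter errors."""
    # Dropping everything but a-z and 0-9 also removes quotes, '%' and '_',
    # so the result is safe to embed in a quoted LIKE literal.
    cleaned = re.sub(r"[^a-z0-9]", "", word.lower())
    collapsed = REPEATED_CHAR.sub(r"\1", cleaned)  # "mispell" -> "mispel"
    return VOWEL_RUN.sub("%", collapsed)           # "mispel"  -> "m%sp%l"


def build_where_clause(column, query):
    """Build one LIKE clause per word; AND-ing them makes word order irrelevant."""
    clauses = []
    for word in query.split():
        fragment = word_to_pattern(word)
        if not fragment.strip("%"):
            continue  # nothing left to match on (e.g. the word "a")
        pattern = re.sub(r"%+", "%", f"%{fragment}%")
        clauses.append(f"{column} LIKE '{pattern}'")
    return " AND ".join(clauses)


print(build_where_clause("name", "Mispell query"))
```

Because `%` matches any run of characters, the pattern for "mispell" also matches a stored "misspell". The same reduction could be applied to a stemmed column for the second option.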

Any suggestions for a better, more flexible solution?

mark-piller · September 8, 2015, 6:33pm

Hi Rick,

Great question! Yes, the LIKE operator is available and both options you described would work. Alternatively, you could also create an event handler for the tables with searchable content. The event handler would be attached to the “create”, “update” and “remove” events. The custom code in the handler would need to replicate the searchable data (with any relevant metadata) into Amazon’s CloudSearch. Once the data is available in CloudSearch, you could use Amazon’s API to perform traditional full-text search queries.
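A minimal Python sketch of the replication step, under stated assumptions: it only maps the three events to entries in a CloudSearch JSON document batch ("add" and "delete" operations) and does not show the Backendless handler registration or the upload call. The function names, the `searchable_fields` parameter, and the use of `objectId` as the document id are assumptions for illustration.

```python
import json


def to_batch_operation(event, record, searchable_fields):
    """Map one table event to a CloudSearch document-batch operation."""
    doc_id = record["objectId"]  # assumed to be the record's unique id
    if event == "remove":
        return {"type": "delete", "id": doc_id}
    if event in ("create", "update"):
        # Copy only the fields that should be searchable.
        fields = {name: record[name] for name in searchable_fields if name in record}
        return {"type": "add", "id": doc_id, "fields": fields}
    raise ValueError(f"unsupported event: {event}")


def build_batch(events, searchable_fields):
    """Serialize a list of (event, record) pairs into one JSON batch."""
    operations = [to_batch_operation(e, r, searchable_fields) for e, r in events]
    return json.dumps(operations)


batch = build_batch(
    [
        ("create", {"objectId": "A1", "title": "hello world", "secret": "x"}),
        ("remove", {"objectId": "B2"}),
    ],
    searchable_fields=["title"],
)
print(batch)
```

An "update" is sent as another "add" with the same id, which replaces the previously indexed document.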

We’re contemplating providing direct APIs for this purpose as well which would eliminate the need for most of the steps I described.

Regards,
Mark

tobechukwu-ezenachukwu · April 20, 2018, 6:55pm

Hello Mark,

This would be an interesting feature. Is it available yet in the 4.0 or 5.0 versions of Backendless?

mark-piller · April 20, 2018, 7:01pm

Hi Tobechukwu,

We have not implemented the full text search yet. More than likely, we’ll introduce a service via the Marketplace which provides integration with an external resource (like CloudSearch).

Regards,
Mark

Ankit_Gupta · May 27, 2021, 12:04pm

Hey Mark, did we ever get to this and launch anything on the Marketplace?

mark-piller · May 27, 2021, 6:56pm

No, there were always other tasks with higher priorities.