Distance between two points (not geodata)

Hey guys,

I have a table with a text and an embedding of that text. The embedding is stored in a JSON field as an array of 1536 floats.

My goal is to fetch the closest objects to a new given embedding. The logic itself is quite simple.
In a 2D coordinate system the formula would be d=√((x2 – x1)² + (y2 – y1)²). I adjusted the formula for the 1536 dimensions of the embeddings, which enables me to calculate the distance between any two embeddings. I planned to place the formula in the properties section [ABS(SQRT(formula)) AS distance] and then sort by distance ascending.
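For reference, the generalized formula is d = √(Σ (aᵢ – bᵢ)²), summed over all dimensions. Below is a minimal Python sketch of that calculation and of the "sort by distance ascending" step, done outside the database. The function names and the `rows` layout (a list of dicts with an `embedding` key) are assumptions made for illustration; they are not part of the Backendless API.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest(query, rows, k=3):
    """Return the k rows whose 'embedding' is nearest to the query vector."""
    return sorted(
        rows, key=lambda row: euclidean_distance(query, row["embedding"])
    )[:k]


# Tiny 2-dimensional stand-ins for the real 1536-dimensional embeddings.
rows = [
    {"text": "a", "embedding": [0.0, 0.0]},
    {"text": "b", "embedding": [1.0, 1.0]},
    {"text": "c", "embedding": [5.0, 5.0]},
]
nearest = closest([0.9, 0.9], rows, k=2)
```

Since a square root is never negative, the outer ABS adds nothing to the result.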

But I encountered several problems when trying to use the “Load table objects” block:

  1. ABS & SQRT are unknown formulas
  2. When I remove ABS/SQRT I receive the error “Parenthesis allowed only after keywords. There is no keyword or token before it.”
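A side note on the first problem: for ranking alone the square root is not needed, because it is monotonic, so ordering by the squared distance gives the same order as ordering by the distance itself. The Python sketch below only demonstrates that equivalence with made-up 3-dimensional vectors; the helper names are hypothetical, and it says nothing about whether the properties field accepts such an expression or about the parenthesis error.

```python
import math


def squared_distance(a, b):
    """Sum of squared coordinate differences; needs only -, * and +."""
    return sum((x - y) * (x - y) for x, y in zip(a, b))


def rank(query, embeddings, key):
    """Indices of the embeddings ordered from nearest to farthest under key."""
    return sorted(range(len(embeddings)), key=lambda i: key(query, embeddings[i]))


embeddings = [[0.2, 0.9, 0.1], [0.8, 0.1, 0.3], [0.25, 0.8, 0.0]]
query = [0.3, 0.8, 0.1]

by_squared = rank(query, embeddings, squared_distance)
by_root = rank(query, embeddings, lambda a, b: math.sqrt(squared_distance(a, b)))
```

Both rankings come out identical, so a squared-distance expression is enough whenever only the order matters and the actual distance value is not needed.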

I hope someone can point me in the right direction to achieve the original goal (get the objects with the smallest distance to a new given embedding).

Hello @Anton_Henkelmann,

Unfortunately, Backendless doesn’t support ABS and SQRT functions.
I’ve created an internal ticket BKNDLSS-31414 to add them.

Regards,
Olha

Hello @olhadanylova

Thank you for adding the functions to BL. Do you have any ETA on this?

Right now I cannot provide any estimates, but our backend developer will start this task as soon as possible.

The 2D coordinate system is called “Cartesian”.
If your POINTs are stored in this spatial reference system (SRS), the distance function will calculate the correct distance. In that case the return value is dimensionless (i.e. it depends only on the coordinates).

For example, the WGS84 (World Geodetic System) spatial reference system, like Web Mercator and others, is intended for working with coordinates on the Earth (other SRSs exist as well).
The Cartesian SRS is intended for working with coordinates on a plane.
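To illustrate the difference, here is a small Python sketch that treats the same pair of coordinates once as points on a plane and once as latitude/longitude on the Earth. It uses the haversine formula on a sphere with a mean Earth radius as a stand-in; that is an assumption made for the example, not a description of how the database computes distances internally.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def cartesian_distance(p, q):
    """Planar distance between (x, y) points; the result is in coordinate units."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def great_circle_distance(p, q):
    """Haversine distance in metres between (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


a, b = (0.0, 0.0), (3.0, 4.0)
planar = cartesian_distance(a, b)       # plain number, no unit attached
on_earth = great_circle_distance(a, b)  # metres along the sphere's surface
```

On the plane the result is simply 5, while the same numbers read as degrees on the Earth give roughly 556 km, which is why the SRS of a column matters.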

We used to have an option to set the SRS when creating a column of type POINT/LINESTRING/POLYGON. But as I see, it was removed for some reason.
We’ll consider bringing it back.

These types are native to the DB, and operations on them are much faster.
Once the SRS selection is added, you will have to convert your spatial data into POINT/LINESTRING/POLYGON.
I think it may appear in the next release.

Just for test:
It is already possible to create a spatial type with the Cartesian SRS using export/import. The ID of that SRS is 0.

  1. export the table (schema only) that contains the column of type POINT;
  2. change the SRS to 0;
  3. add your points to the csv file (e.g. POINT (12354 8273498) );
  4. import the file.
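For step 3, the point values can be generated rather than typed by hand. The Python sketch below only formats coordinate pairs as POINT text like the example above and writes them as CSV rows. The column name `location` and the single-column layout are hypothetical: a real export file has its own header and schema information, which is not shown in this thread and must be kept as exported.

```python
import csv
import io


def to_wkt_point(x, y):
    """Format a planar coordinate pair as text, e.g. 'POINT (12354 8273498)'."""
    return f"POINT ({x} {y})"


def rows_to_csv(points, column="location"):
    """Write one POINT value per row under a (hypothetical) column header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column])
    for x, y in points:
        writer.writerow([to_wkt_point(x, y)])
    return buffer.getvalue()


csv_text = rows_to_csv([(12354, 8273498), (1.5, -2)])
```

The resulting rows would then be merged into the exported file before the import in step 4.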

Hey Oleg

Thank you for your idea on how to solve the problem.
If I understood you correctly, setting SRS = 0 allows me to place points on a Cartesian (2D) coordinate system and use the built-in distance function? However, my goal is to calculate the distance between two 1536-dimensional points. Is there an SRS for 1536 dimensions?

If I understood you correctly, setting SRS = 0 allows me to place points on a Cartesian (2D) coordinate system and use the built-in distance function?

Yes. Moreover, not only the distance function but every spatial function will adapt to the specified SRS.

Is there an SRS for 1536 dimensions?

Backendless uses MySQL, which has over 5000 SRSs, but 1536 is not one of them.
Could you point me to the description of 1536?

In the OpenAI documentation for embeddings it says that the model “text-embedding-ada-002” outputs 1536 dimensions. That’s where the requirement is coming from.

But we are talking about a spatial reference system.
Unfortunately, I didn’t find that SRS (1536), and I think that it has nothing to do with SRSs at all.
If you know that your spatial data is in Cartesian format, just use it.

An example of what an SRS description looks like.
The one for global usage:

                SRS_NAME: WGS 84
                  SRS_ID: 4326
            ORGANIZATION: EPSG
ORGANIZATION_COORDSYS_ID: 4326
              DEFINITION: GEOGCS["WGS 84",DATUM["World Geodetic System 1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.017453292519943278,AUTHORITY["EPSG","9122"]],AXIS["Lat",NORTH],AXIS["Lon",EAST],AUTHORITY["EPSG","4326"]]

The one for local usage:

                SRS_NAME: NAD83(NSRS2007) / California zone 1
                  SRS_ID: 3489
            ORGANIZATION: EPSG
ORGANIZATION_COORDSYS_ID: 3489
              DEFINITION: PROJCS["NAD83(NSRS2007) / California zone 1",GEOGCS["NAD83(NSRS2007)",DATUM["NAD83 (National Spatial Reference System 2007)",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6759"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.017453292519943278,AUTHORITY["EPSG","9122"]],AXIS["Lat",NORTH],AXIS["Lon",EAST],AUTHORITY["EPSG","4759"]],PROJECTION["Lambert Conic Conformal (2SP)",AUTHORITY["EPSG","9802"]],PARAMETER["Latitude of false origin",39.3333333333333,AUTHORITY["EPSG","8821"]],PARAMETER["Longitude of false origin",-122,AUTHORITY["EPSG","8822"]],PARAMETER["Latitude of 1st standard parallel",41.6777777777778,AUTHORITY["EPSG","8823"]],PARAMETER["Latitude of 2nd standard parallel",40,AUTHORITY["EPSG","8824"]],PARAMETER["Easting at false origin",2000000,AUTHORITY["EPSG","8826"]],PARAMETER["Northing at false origin",500000,AUTHORITY["EPSG","8827"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["X",EAST],AXIS["Y",NORTH],AUTHORITY["EPSG","3489"]]
             DESCRIPTION: NULL

Google doesn’t know about 1536…

One clarification, please.
When you talk about embeddings, what do you mean? Is it a polygon or a line? What form does this object have?
If it is not a figure but just a collection of points, it is still better to store the points as POINTs and work with collections of them.

Hi, @Anton_Henkelmann

We have added the option to select the Cartesian SRS when creating a spatial data type column. Could you please check whether it is available for you now?

Regards,
Marina

Yes, it’s available for me. Thanks for the support :)