Integrate 20,000+ records from 3rd party API

Hi. I need to import/sync 20,000+ records from an external system via its API. The sync should happen weekly. Any guidance on the best way to do this without exceeding the script execution time limit, while keeping performance optimal?

Hi,

Does the external API support paging? That is, retrieval of only portions of the 20k data set?

Regards,
Mark

Hi. It doesn’t support paging. They give us an API endpoint that returns one large JSON response with 20K records in it. Users of our Backendless (codeless) app need to search through the records and make selections. Right now we have a subset of the data in our Backendless database and users query the data from there.

In this case, you should break down the entire job into the following steps:

  1. Run a weekly timer with logic responsible for fetching the records and saving them in the file system of your Backendless app.
  2. Store in the database the current index of the next record to be processed (0 when you start). A value of -1 indicates there is no file to process.
  3. Run a more frequent timer (it can run as frequently as every 2 minutes if needed). The timer checks the current index in the database (from step 2). If the index is -1, it quits until the next run. Otherwise, it loads the file and processes the next X records, where X is small enough to handle within the allotted execution time. When done, it updates the index in the database for the next run. Once all records are processed, it sets the index back to -1. A sketch of both timers follows this list.
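For illustration, here's a minimal sketch of the two timers as JS Cloud Code (written as TypeScript). The endpoint URL, the `SyncState` and `ExternalRecords` table names, the file location, and the chunk size are all assumptions you'd adapt to your app:

```ts
// Sketch only: assumes JS Cloud Code with the Backendless SDK available.
declare const Backendless: any;

const FILE_DIR = 'sync';            // hypothetical folder in the app's file storage
const FILE_NAME = 'records.json';
const CHUNK_SIZE = 500;             // tune so one run fits the execution time limit

async function getSyncState(): Promise<any> {
  // Fetch the single state row that tracks the cursor (table name is hypothetical).
  const query = Backendless.DataQueryBuilder.create().setPageSize(1);
  const [state] = await Backendless.Data.of('SyncState').find(query);
  return state ?? {};
}

// Weekly timer: fetch the full 20k payload and stage it as a file.
async function weeklyFetchTimer(): Promise<void> {
  const response = await fetch('https://example.com/api/records'); // hypothetical endpoint
  const json = await response.text();

  // Overwrite last week's staged file with the fresh payload.
  await Backendless.Files.saveFile(FILE_DIR, FILE_NAME, json, true);

  // Point the cursor at record 0 so the frequent timer starts processing.
  const state = await getSyncState();
  state.nextIndex = 0;
  await Backendless.Data.of('SyncState').save(state);
}

// Frequent timer (e.g. every 2 minutes): process one chunk per run.
async function processChunkTimer(): Promise<void> {
  const state = await getSyncState();
  if (state.nextIndex == null || state.nextIndex === -1) return; // nothing staged

  // Read the staged file back via its URL (assumes the folder is readable).
  const fileUrl = `${Backendless.appPath}/files/${FILE_DIR}/${FILE_NAME}`;
  const records: any[] = await (await fetch(fileUrl)).json();

  const chunk = records.slice(state.nextIndex, state.nextIndex + CHUNK_SIZE);
  if (chunk.length > 0) {
    await Backendless.Data.of('ExternalRecords').bulkCreate(chunk); // hypothetical table
  }

  // Advance the cursor, or set it to -1 once every record has been written.
  const next = state.nextIndex + chunk.length;
  state.nextIndex = next >= records.length ? -1 : next;
  await Backendless.Data.of('SyncState').save(state);
}
```

The key design point is that each run of the frequent timer does a bounded amount of work, so no single invocation can hit the execution time limit regardless of how large the staged file is.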

Hope this helps.

Mark

Thanks Mark. Makes sense. I’ll try it.

@Marc_Chriqui I’d love to see what you came up with on this; I have a similar use case :slight_smile:

@Andreas_Marinopoulos I’ll share what I did as soon as I can. It worked, but to be honest I ended up going with an alternate, less complex solution that satisfied the needs of my app. Instead of copying each of the 20,000+ records into its own database row, I stored the data exactly as received from the external API in a JSON table column. My app UI loads it when needed (the JSON-to-object capabilities in Backendless are great) and retrieves the records it needs at runtime. Something like the sketch below.

Which solution to go with depends entirely on the needs of your app. Do you really need to store all of the external data in individual database rows (a time-consuming operation), or can you live with storing it in bulk (as JSON, for example) and querying it as needed? Backendless has strong JSON features. Sharing your app’s use case here might help.
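Here's roughly what the bulk-JSON approach could look like, again as a JS Cloud Code sketch. The `ExternalData` table and its `payload` JSON column are hypothetical names:

```ts
// Sketch only: assumes JS Cloud Code and a table `ExternalData` with a JSON
// column named `payload` (both hypothetical).
declare const Backendless: any;

async function weeklyBulkSync(): Promise<void> {
  const response = await fetch('https://example.com/api/records'); // hypothetical endpoint
  const records = await response.json();

  // Reuse the single snapshot row if it exists, otherwise create a new one.
  const query = Backendless.DataQueryBuilder.create().setPageSize(1);
  const [existing] = await Backendless.Data.of('ExternalData').find(query);
  const row = existing ?? {};

  // Store the entire payload in one JSON column instead of 20k+ individual
  // rows; each weekly run simply replaces the previous snapshot. (Assumption:
  // the JS SDK accepts a plain array/object for a JSON column.)
  row.payload = records;
  await Backendless.Data.of('ExternalData').save(row);
}
```

The trade-off is that you give up per-record queries at the database level in exchange for a sync job that finishes in a single, fast write.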

@Marc_Chriqui thanks. My use case is sending notifications to our students. Classes start at specific times, every HH:00 and HH:30, resulting in peaks of notifications.

I am thinking of loading the upcoming notifications into a table or into Cache via a timer, then adding a second timer that schedules the notifications ahead of time gradually, thus avoiding the peaks. Something along the lines of the sketch below.
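A rough sketch of that two-timer idea in JS Cloud Code. The `Notifications` table, its `sendAt` column, the cache key, and the `sendNotification` helper are all hypothetical:

```ts
// Sketch only: assumes JS Cloud Code with the Backendless SDK available.
declare const Backendless: any;

const CACHE_KEY = 'upcomingNotifications';

// Timer 1: load the notifications due in the next hour into Cache.
async function loadUpcomingTimer(): Promise<void> {
  const nowMs = Date.now();
  const query = Backendless.DataQueryBuilder.create()
    .setWhereClause(`sendAt >= ${nowMs} and sendAt < ${nowMs + 3600000}`)
    .setPageSize(100);
  const upcoming = await Backendless.Data.of('Notifications').find(query);
  await Backendless.Cache.put(CACHE_KEY, upcoming, 3600); // TTL in seconds
}

// Timer 2: runs every minute and drains a small batch, spreading delivery
// across the hour instead of sending everything at HH:00 / HH:30.
async function drainBatchTimer(): Promise<void> {
  const pending: any[] = (await Backendless.Cache.get(CACHE_KEY)) ?? [];
  if (pending.length === 0) return;

  const batch = pending.splice(0, 20); // tune the batch size to flatten the peak
  for (const n of batch) {
    await sendNotification(n);
  }
  await Backendless.Cache.put(CACHE_KEY, pending, 3600); // write back the remainder
}

// Placeholder for the actual delivery call (e.g. via Backendless Messaging).
async function sendNotification(n: any): Promise<void> {
  console.log('sending notification', n.objectId);
}
```

One thing to watch: if the two timers can fire at the same moment they'll race over the cache entry, so you may want a flag or a counter in the database to coordinate them.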