Getaddrinfo ENOTFOUND bl-server

Tim_Jones · August 27, 2024, 12:03am

Thanks for the quick response @mark-piller

It’s happening everywhere. A submit button that calls sellers/addSeller gets the error, and so does a bunch of my cron code when it tries to access external APIs.

I think it is a DNS issue? It sounds like the team knows what the problem is, but it hasn’t been fixed.

Tim

Tim_Jones · August 27, 2024, 12:09am

Examples of errors when trying to reach Backendless (I am assuming bl-server is you) and my server:

Error: getaddrinfo ENOTFOUND bl-server

Error: getaddrinfo ENOTFOUND secure1.meadowfarms.com

mark-piller · August 27, 2024, 12:20am

I tried various API calls in your app going directly to the database and could not reproduce. Please help me with a specific API request (custom services or not) with concrete values I can plug in to make an invocation.

Tim_Jones · August 27, 2024, 12:26am

It happens randomly; I don’t think it’s reproducible in simple testing. My crons run every few minutes and I got only 1-2 errors from each, but a bunch of different crons threw the same errors over a span of about 2 hours starting around 6pm ET.

mark-piller · August 27, 2024, 12:38am

Unfortunately, I do not have any info from you to get started with any kind of testing. Ideally it would be a test method that (even randomly) produces an error.
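
A test method of the kind requested here could be as small as a loop that resolves the failing hostname many times and records each failure. The sketch below is only an illustration, not something posted in this thread: `probeDns` and its `lookupFn` parameter are hypothetical names, and it assumes the service code runs on Node.js, where `dns.promises.lookup` goes through the same `getaddrinfo` call named in the errors.

```javascript
// Hypothetical probe: resolve one hostname repeatedly and record every failure.
const dns = require("node:dns");

// lookupFn is injectable (an assumption of this sketch) so the probe can be
// exercised with a fake resolver instead of the real network.
async function probeDns(hostname, attempts, lookupFn = (h) => dns.promises.lookup(h)) {
  const failures = [];
  for (let i = 0; i < attempts; i++) {
    try {
      await lookupFn(hostname);
    } catch (err) {
      // Keep the fields support would need to correlate with server-side logs.
      failures.push({
        attempt: i,
        at: new Date().toISOString(),
        code: err.code,
        syscall: err.syscall,
      });
    }
  }
  return { hostname, attempts, failed: failures.length, failures };
}
```

Calling `probeDns("bl-server", 500).then(console.log)` from inside a custom service would then report how many of the 500 lookups failed and when, which gives a concrete invocation even if the failures are random.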

Tim_Jones · August 27, 2024, 12:53am

Mark,

What was discovered previously?

I can’t be the only one getting these errors, can I?

My crons get these errors maybe once a week. If you read up in the thread, I’ve been posting when this happens. Generally it’s not often, but tonight there were significant failures.

Tim

Tim_Jones · August 27, 2024, 1:02am

I just got these -

Error: Backendless encountered an error while handling the request. An internal trouble ticket with ID bl-server:0B855C07-B804-4061-8B82-D18ADD1D8627 has been created and we will be investigating the issue.

Error: Backendless encountered an error while handling the request. An internal trouble ticket with ID bl-server:41D38F49-1E3F-468E-8393-3FC88F202E05 has been created and we will be investigating the issue.

Both errors have this as the cause -

Caused by: java.net.UnknownHostException

This seems related. Can you look at the details in the ticket?

Tim

Andreas_Marinopoulos · August 27, 2024, 5:43am

@mark-piller @Volodymyr_Ialovyi @Inna_Shkolnaya

We are getting the same errors, and they are very hard to reproduce.

Many different methods, random times.

We have a custom function we use to catch errors. Is there some specific type of log we can add there to help you identify the issue? Are there any logs on your end you can check if we provide specific timestamps?

APP ID: 4A47197B-AE30-FA84-FF56-0071F4010900
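
As an illustration of what a custom error-catching function could log, here is a minimal sketch. It assumes the errors are standard Node.js system errors, which carry `code`, `syscall`, `hostname`, `address` and `port` properties, and it pulls the trouble ticket ID out of messages shaped like the ones quoted earlier in the thread. `describeError` and the `context` argument are hypothetical names, not part of any Backendless API.

```javascript
// Matches the ticket format quoted in this thread, e.g.
// "bl-server:0B855C07-B804-4061-8B82-D18ADD1D8627".
const TICKET_RE = /bl-server:([0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12})/i;

// Hypothetical helper: flatten an error into one loggable record.
function describeError(err, context = {}) {
  const message = String(err && err.message);
  const ticket = TICKET_RE.exec(message);
  return {
    at: new Date().toISOString(), // UTC timestamp to match against server logs
    message,
    code: err && err.code,         // e.g. ENOTFOUND, ECONNREFUSED
    syscall: err && err.syscall,   // e.g. getaddrinfo, connect
    hostname: err && err.hostname, // set on DNS lookup failures
    address: err && err.address,   // set on connect failures
    port: err && err.port,
    ticketId: ticket ? ticket[1] : null,
    ...context,                    // caller-supplied fields, e.g. the service name
  };
}
```

Logging `JSON.stringify(describeError(err, { service: "sellers/addSeller" }))` in the catch handler would give one line per failure with a timestamp and the failing host.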

Andreas_Marinopoulos · August 27, 2024, 11:20am

Does Backendless use node.js? Does this resource mean anything?

Tim_Jones · August 27, 2024, 1:02pm

They do -

Tim_Jones · August 27, 2024, 1:34pm

Any updates on this? I’m still getting errors -

Error: getaddrinfo ENOTFOUND bl-server

/api/services/Crons/officeinsertorders

This service gets orders from an external API, formats them, and POSTs them to another API. It is run automatically every ~5 minutes.
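
While the underlying problem is open, a cron like this could retry calls that fail with the transient-looking errors reported in this thread. The following is a sketch under assumptions, not a fix confirmed by Backendless: `withRetry`, `isTransient`, and the list of error codes are illustrative choices, and retrying a POST is only safe if the receiving API tolerates duplicate submissions.

```javascript
// Error codes treated as transient in this sketch (an assumption, not an official list).
const TRANSIENT_CODES = new Set(["ENOTFOUND", "EAI_AGAIN", "ECONNREFUSED", "ECONNRESET"]);

function isTransient(err) {
  if (!err) return false;
  return TRANSIENT_CODES.has(err.code) || /socket hang up/i.test(String(err.message));
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run task(); on a transient error wait baseDelayMs * 2^attempt and try again,
// up to `retries` extra attempts. Non-transient errors are rethrown at once.
async function withRetry(task, { retries = 3, baseDelayMs = 200, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= retries || !isTransient(err)) throw err;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Wrapping only the read side, for example `withRetry(() => fetchOrders())` where `fetchOrders` stands in for the external API call, avoids the risk of posting the same orders twice.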

Tim_Jones · August 27, 2024, 2:19pm

This has gone from an infrequent issue to something happening in multiple locations every hour.

Error: getaddrinfo ENOTFOUND bl-server

/api/services/messages/sponsorPresale

This service runs ~15 times every Tuesday to generate flyers. There is no input. It successfully generated 50 flyers and got the error above 3 times.

Is this related to the outage this morning?

Tim

Kelly_Oglesby · August 27, 2024, 3:02pm

I’m getting these too, along with calls failing with bad request errors, and the only thing I see in my logs is “Task execution is aborted due to timeout.” No idea how to reproduce it.

Viktor_Mudrevsky · August 27, 2024, 3:07pm

Hi @Tim_Jones, @Andreas_Marinopoulos, @Kelly_Oglesby

Unfortunately, I can’t yet report that everything has been fully fixed, but we are actively working on it and making progress. We are aware of the increased frequency of these errors, and we are fully focused on addressing them. We are closely monitoring the situation and working to resolve the issues as quickly as possible. Thank you for your patience, and I’ll keep you updated as we continue to work through this.

Regards,
Viktor

Tim_Jones · August 27, 2024, 4:13pm

Thank you for the update @Viktor_Mudrevsky.

Do you have a sense for how much time this might take? Hours, days, weeks?

Thanks,
Tim

Tim_Jones · August 27, 2024, 4:29pm

As a quick update, I am now getting “Task execution is aborted due to timeout”.

I hope this helps you find the issue.

Tim

Tim_Jones · August 27, 2024, 6:17pm

I am now getting -

Error: socket hang up

Error: connect ECONNREFUSED 10.42.4.70:9000

Tim_Jones · August 27, 2024, 7:21pm

The errors are getting more frequent. Any update?

Error: Network Error

Error: socket hang

Error: connect ECONNREFUSED 10.42.4.75:9000

Viktor_Mudrevsky · August 28, 2024, 9:03am

Hi @Tim_Jones, @Andreas_Marinopoulos, @Kelly_Oglesby,

Please check how your applications are performing today. We resolved the issue yesterday where several of our worker nodes were not functioning correctly.

Regarding the specific errors discussed in this thread, unfortunately, they may still occur for now, but their frequency should be significantly reduced. We are continuing to work on eliminating these errors - much has already been researched and accomplished, but there is still more to do and verify.

Regards,
Viktor

Tim_Jones · August 28, 2024, 2:02pm

@Viktor_Mudrevsky

Things are much better! I had 1-2 errors last night and another 1-2 this morning. Please keep us updated; it would be nice to put this to bed. I started this thread more than 2 years ago!

Tim