504 errors against API

For the last ~5 days, a lot of my API calls to our document have come back with a 504 error, with the response body being HTML reading “We’re sorry. We’re experiencing an issue on our side.”

The request that triggers it is an HTTP GET to https://coda.io/apis/v1/docs/{docId}/tables/{tableId}/rows?limit=1&valueFormat=rich, although removing the valueFormat seems to make it work more reliably. It’s not deterministic, and this document is very large, so it does make me wonder if it’s a performance thing (e.g. a timeout on the backend). How could I go about diagnosing this? This is load-bearing infrastructure for our internal automations, so it’s causing a lot of pain when it doesn’t work.
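
One way to start narrowing this down is to time the same request with and without valueFormat and record the status codes. The following is a minimal sketch, not a definitive tool: the helper names (`rows_url`, `http_get`, `probe`), the 120-second client timeout, and the number of attempts are illustrative assumptions. It uses only the Python standard library and sends the API token as a Bearer token in the Authorization header.

```python
import time
import urllib.error
import urllib.parse
import urllib.request

API_BASE = "https://coda.io/apis/v1"


def rows_url(doc_id, table_id, limit=1, value_format=None):
    """Build the list-rows URL from the post; valueFormat is only added if given."""
    params = {"limit": limit}
    if value_format:
        params["valueFormat"] = value_format
    doc = urllib.parse.quote(doc_id, safe="")
    table = urllib.parse.quote(table_id, safe="")
    return f"{API_BASE}/docs/{doc}/tables/{table}/rows?{urllib.parse.urlencode(params)}"


def http_get(url, token, timeout=120):
    """Return the HTTP status of a GET, or None on a client-side timeout or network error."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # error responses such as 504 arrive here
    except OSError:
        return None  # URLError and socket timeouts are OSError subclasses


def probe(url, fetch, attempts=5):
    """Call fetch(url) repeatedly and record (status, elapsed_seconds) per attempt."""
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        status = fetch(url)
        results.append((status, time.perf_counter() - start))
    return results


# Example usage (placeholders, not real IDs):
#   token = "<your API token>"
#   for fmt in (None, "rich"):
#       url = rows_url("<docId>", "<tableId>", value_format=fmt)
#       print(fmt, probe(url, lambda u: http_get(u, token)))
```

Passing the fetch function into `probe` keeps the timing loop separate from the network call, so the same loop can be pointed at different tables, views, or doc copies.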

That doesn’t sound fun at all. It’s also a little concerning that there’s no report of it here

Hi @Alex_Gourley - If this is a very large doc then it sounds likely that it’s hitting some infrastructure limit in our API that’s causing requests to fail. Have you looked into reducing the size or complexity of the doc? Perhaps by archiving older records to another repository and deleting them from the doc?

Thanks for confirming. I haven’t tried reducing doc size, no. Is there any kind of guide or writeup on what tends to be expensive?

There are 7.5k rows, 117 tables and 4500 formulas (I am actually confused by the formulas count; there must be some per-row thing going on).

No, nothing public that I know of. I believe size and complexity are the two main culprits behind docs failing to load correctly, but that’s not very useful. I would suggest looking around for some easy places to make cuts and seeing if that makes a difference. If you want to do some more detailed troubleshooting, you’ll need to reach out to the support team, who should be able to connect you with an engineer.

I ran many tests and I’m still a bit stuck. Let me share what I found:

  • Some tables in the doc are fast, it’s the larger ones with lots of views which can time out.
  • The slow query is the list-rows request from my first post (see the Coda API (v1) Reference Documentation); listing tables or table columns is fast.
  • It makes no difference if the table has a filter limiting how many rows are there. Nor does it matter if I query a view with only one column unhidden.
  • I copied the doc and deleted the obvious big stuff: 5.0 GB of images (doc size down to 120 MB), and columns holding complex canvas objects. Timeouts are still common (or 60-90 seconds if I get a reply).
  • If I copy that table in the same doc (not a view, a full copy), the copy can be queried fast. Maybe it’s the comments?
  • So I copied that doc without comments. Success!! Queries under 2 seconds every time.
  • Hmm okay, so can I copy the original doc without comments to get success? NO. Still times out. What?
  • On that no-comment copy I stripped out the large files and canvases like in the success case. Still timing out!
  • I tried more times and just re-confirmed the above.

It’s almost as if there is a path-dependent procedure where you must first shrink the doc, and then clone it without comments, and only in that order will it query fast.
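
Since the failures are not deterministic, comparing doc variants on a single request each can mislead. The sketch below summarizes repeated timings per variant as a failure rate plus the median latency of successful calls. It assumes each sample is a `(status, seconds)` pair collected by whatever timing script you use; `summarize`, `compare`, and the variant names are hypothetical, and anything other than HTTP 200 is counted as a failure.

```python
import statistics


def summarize(samples):
    """Summarize (status, seconds) samples; any status other than 200 counts as a failure."""
    ok = [seconds for status, seconds in samples if status == 200]
    failed = len(samples) - len(ok)
    return {
        "attempts": len(samples),
        "failure_rate": failed / len(samples) if samples else 0.0,
        "median_ok_seconds": statistics.median(ok) if ok else None,
    }


def compare(variants):
    """Rank variants (name -> samples) by failure rate, then by median latency of successes."""
    summaries = {name: summarize(samples) for name, samples in variants.items()}

    def sort_key(item):
        _, summary = item
        median = summary["median_ok_seconds"]
        # Variants with no successful call sort last within the same failure rate.
        return (summary["failure_rate"], float("inf") if median is None else median)

    return sorted(summaries.items(), key=sort_key)


# Example usage (variant names are placeholders for the doc copies being tested):
#   ranking = compare({"original": samples_a, "shrunk_then_no_comments": samples_b})
#   for name, summary in ranking:
#       print(name, summary)
```

Running each variant for the same number of attempts, ideally interleaved in time, makes it easier to tell whether the order of operations really matters or whether the backend was simply faster during one test run.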

A workaround I found is a 2-way sync table in another doc. That seems to be fast right now. I’d appreciate more direction on things to try though.

Hi @Alex_Gourley - That’s some impressive troubleshooting, but I’m sorry to hear it didn’t reveal any quick wins. Unfortunately, the only way to provide you with more concrete guidance would be for an engineer to look into your specific doc. To do that you’ll need to contact support, who can connect you to our engineering on-call.

Thanks Eric, I’ll do that. If anyone lands here from searching this error, please say something; more datapoints could help.
