I noticed that every value returned when using the /rows API endpoint includes ids, so I thought “Great, I can just write the same data back and it will figure out which values to update” from the ids. Nope, trying to write back the same data (even if you put it in the { rows: [{ cells: [] }] } format) does not work at all. Btw the docs seem to use a reference to a RowEdit format which is not documented anywhere. You just have to figure it out from the one example, it seems?
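For reference, the write shape I ended up inferring from that one example looks roughly like the sketch below (the columns and values are placeholders, and I may well be missing options):

```ts
// Roughly the write payload shape I inferred from the docs' one example.
// Each cell names a column (by ID or, it seems, by name) and the value to write.
const payload = {
  rows: [
    {
      cells: [
        { column: "Name", value: "Alice" }, // placeholder column/value
        { column: "Score", value: 42 },     // placeholder column/value
      ],
    },
  ],
};
// Sent via POST to /docs/<docId>/tables/<tableId>/rows (placeholder IDs).
```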
It is still unclear to me how to write back structured data; every example I see involves writing plain values. I guess I just have to rely on parsing? This seems a bit fragile.
But also, from reading this it appears that you have to provide a list of columns to act as the primary key? But …there is an id already?
In my use case, I’m writing a library that handles Coda data generically, so I cannot possibly know which columns I could use as key. I know there are also the single row endpoints, but it would be nice to not need to send a dozen API calls to update a number in a dozen rows.
What am I missing? I feel I must be missing something pretty huge.
Hi @Lea_Verou - Sorry to hear about the confusion, our API documentation could certainly benefit from more guides and tutorials that demonstrate common patterns.
If you are trying to update an existing row you should use the updateRow endpoint, not the upsertRow endpoint. The updateRow endpoint takes in the row ID that you mentioned, and doesn’t require specifying any key columns.
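A rough sketch of what that call can look like (the doc, table, and row IDs below are placeholders, and you should double-check the exact request shape against the API reference):

```ts
// Sketch: update a single existing row by its row ID (all IDs are placeholders).
const token = "your-api-token";
const docId = "abc123";
const tableId = "grid-xyz";
const rowId = "i-1234567890"; // the id returned by the /rows endpoint

await fetch(`https://coda.io/apis/v1/docs/${docId}/tables/${tableId}/rows/${rowId}`, {
  method: "PUT",
  headers: {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    row: { cells: [{ column: "Status", value: "Done" }] }, // placeholder column/value
  }),
});
```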
You can think of the upsertRow endpoint as an insertRow endpoint, meant for adding new rows. It just happens to include a bonus, optional upserting feature. To use it you specify a set of key columns, and if an existing row matches the passed-in row on those columns, the existing row is updated instead of a new one being added.
“Here are the rows in my system. Add them to the table if they are new, and update them if you have an older copy.”
When upserting you don’t need to first get the existing rows, figure out how they map to your data, then update those specific rows. You just send over the data, as well as which columns in that data are the keys in the foreign system, and Coda handles the logic.
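As a sketch (placeholder IDs, and “SKU” standing in for whatever column is the key in your system), an upsert call looks roughly like this:

```ts
// Sketch: batch upsert keyed on a "SKU" column (IDs and columns are placeholders).
const token = "your-api-token";
const docId = "abc123";
const tableId = "grid-xyz";

await fetch(`https://coda.io/apis/v1/docs/${docId}/tables/${tableId}/rows`, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    rows: [
      { cells: [{ column: "SKU", value: "A-100" }, { column: "Price", value: 12 }] },
      { cells: [{ column: "SKU", value: "A-101" }, { column: "Price", value: 15 }] },
    ],
    // Rows whose SKU matches an existing row get updated; the rest get inserted.
    keyColumns: ["SKU"],
  }),
});
```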
If you want to implement a more sophisticated syncing method you will need to update rows individually, which today can only be done with updateRow. We currently don’t have a batch endpoint for updating rows.
I don’t want to use a “more sophisticated syncing method”, I just want to use the ids that already exist, rather than having to depend on some combination of cells being unique (a concept that doesn’t seem to exist anywhere on the Coda interface, nor am I aware of any way to enforce it).
Fair enough, but unfortunately that’s not something we support today. I’ll pass your feedback on to the team. If you could provide some more context on your use case, it will help the engineers understand the gaps and prioritize the request.
I’m writing a library to read a Coda table and return it as a JSON object that can be manipulated by the user and written back. Since the library needs to be able to handle any data, I cannot depend on a particular combination of columns being unique, nor can I add columns with the row id. And since the update could potentially involve numerous rows, having to make an API call per row would be very inefficient.
Hi @Lea_Verou ! Just a regular Coda user here, but I do a lot of upserting of rows. In particular, I don’t know if you’ve ever used Drafts for Mac, but the Coda workflow I created for Drafts does what I think you’re describing — I can download data from the Coda API into Drafts (in my case, as JSON), edit the JSON in Drafts (a text editor), and then upsert it back to Coda using the API again. That seems like the kind of data in/data out process you’re describing. (I’ve also been able to do the same using Shortcuts for iOS.)
Couple thoughts I had that I hope might be helpful:
It is definitely the case that the format of the JSON that comes from the Coda API and the format you need to put the JSON in to send it back are different. A lot of the code I write in my various Coda in/out programs is transforming that JSON from one shape to the other. Mostly I reduce the incoming (downloaded) JSON to just what’s within the “values” object, the direct contents of (in this case) each row, and then use that when creating my JSON to send back. The API is really fickle about getting the pattern of nested objects and arrays just right when sending back that row/cell/column/value JSON.
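Just as an illustration of that reshaping (assuming each downloaded row item keeps its cell data in a “values” object keyed by column, which is how it looks in my downloads), something like this is what I end up doing:

```ts
// Sketch: reshape rows from GET .../rows into the format the write endpoints expect.
// Assumes each item has a `values` object keyed by column ID (or name).
type CodaRowItem = { id: string; values: Record<string, unknown> };

function toWritableRows(items: CodaRowItem[]) {
  return {
    rows: items.map((item) => ({
      cells: Object.entries(item.values).map(([column, value]) => ({ column, value })),
    })),
  };
}
```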
I don’t specifically recall a prohibition against using the ID that comes through in Coda’s API as an upsert key, but I’m not doing it in any of my code, so I’m wondering if at some point I found out it wasn’t possible. In one of my uses, I have a product name that might change but a SKU that’s immutable, so I might upsert to change a name, but I can use the immutable SKU as the key column.
For another of my Coda databases (and I’m hoping this might help you) there was really nothing relevant to use as a unique key (vendor might be the same as another, price might be the same, date might be the same), so what I did was create a formula column called ID that writes the Unix epoch time to a row when I create it: DateToEpoch(thisRow.Created()). None of my other attributes is guaranteed to be unique, but the time down to the millisecond at which I create a row can only occur once. However, Coda doesn’t let you use calculation columns as keys, so I have another column, ID Duplicate, and a button system so that occasionally (or before I’m going to do some downloading/upserting) I can simply copy the row-creation epoch time from the ID column into the ID Duplicate column. That ID comes down when I download the JSON, and I can use it as the key when I upload, because it’s unique and I never need to change it.
Upsert calls can definitely contain updates to multiple rows (each incoming row being matched to an existing one by its values in the key column), and anything that doesn’t match gets added as a new row. So batch upserting definitely exists and shouldn’t require multiple API calls.
I hope that helps! Again, I’m just going by my own knowledge cobbled together from using Coda over the years, but if I can answer any questions, don’t hesitate to let me know. All best.
I hope the PMs in charge of the API are monitoring this, because that’s a pretty big DX smell. Reading and writing should be idempotent (or at least symmetrical), to the extent possible. Right now these two API endpoints appear to have been designed by completely different people, following completely different design principles!
Yup, as I mentioned I cannot do that, because a library intended for reading data and writing back data changes cannot be arbitrarily inserting columns for its own use. I do not (necessarily) control the tables I’m modifying.
Yes, that was my original understanding, and it appears to be correct. The main issue is referencing the correct rows, without actually modifying the data model. Which doesn’t appear to be possible right now (in the general case).
There is a way to set your ID value using a formula that does NOT make the column into a computed column, thus eliminating the need to copy the column later with a button or automation.
You set the “value for new rows” option for your ID column to the formula that combines the epoch timestamp and other values.
It gets executed immediately when a new row is created, giving the row the unique ID you need.
But it is not a formula column thereafter, so you can use it directly in your API calls as the key for upserting.
Just to be clear, this is still the case? I am in the same case as @Lea_Verou here; I’m building a plugin that syncs offline documents with a Coda table (each doc being a row). I am not in control of the table and I’m not going to ask my users to add a unique field to their table just to make my sync plugin work, nor am I going to depend on them managing uniqueness correctly.
I’m also not, for obvious reasons, going to add my own column and hope I don’t cause a name clash or confuse my users.
Even if I did decide to go down one of those two paths, it substantially degrades the UX of my plugin, which now needs documentation, explanations, and possibly a set of UIs just so people can create that column. It makes my sync plugin less plug-and-play, less immediate to use, and reduces its usefulness.
Rows have an ID already. There is still no way to upsert based on that id?
Can someone from Coda confirm that the only valid way to sync a large amount of data is to:
Make one query per local document to see if there’s a corresponding row id
Then decide to do an upsert or an insert
@Jad_Sarout1 - We haven’t made any changes here, so the limitations are the same as previously mentioned.
I’m not going to ask my users to add a unique field to their table just to make my sync plugin work
This doesn’t sound like such a large lift to me. I imagine each doc you’re syncing already has some sort of URL, path, or unique ID associated with it, and having it in the table too could be useful for the users, in addition to being useful for syncing.
it reduces substantially the UX of my plugin, which now needs documentation, explanations, and possibly a set of UIs just so people can create that column
Don’t you already need some way for the user to specify which columns in their table correspond to the content of these docs? It seems like you’d just have to extend whatever system you use there.
Can someone from Coda confirm that the only valid way to sync a large amount of data is to:
Make one query per local document to see if there’s a corresponding row id
Then decide to do an upsert or an insert
No, I don’t think that’s right. If you don’t want to rely on a unique column, then I think the pattern looks like:
List all of the rows in the table, noting the row IDs.
For each doc that was previously synced to one of those rows, call updateRow to update it.
For all of the rest of the docs, make a single call to upsertRows to insert them.
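Roughly, in code, that pattern could look like the sketch below (endpoint shapes from memory, so double-check against the API reference; the doc/table IDs, column names, and the local document type are placeholders):

```ts
// Sketch of the suggested pattern: per-row updates for docs already tied to a
// row ID, plus one bulk insert for everything new. All IDs/columns are placeholders.
const token = "your-api-token";
const docId = "abc123";
const tableId = "grid-xyz";
const base = `https://coda.io/apis/v1/docs/${docId}/tables/${tableId}`;
const headers = {
  Authorization: `Bearer ${token}`,
  "Content-Type": "application/json",
};

// Hypothetical local document; codaRowId is remembered from a previous sync
// (or recovered by listing the table's rows and noting their IDs).
type LocalDoc = { title: string; body: string; codaRowId?: string };

async function sync(docs: LocalDoc[]) {
  // 1. Docs previously synced to a row: update each one by its row ID.
  for (const doc of docs.filter((d) => d.codaRowId)) {
    await fetch(`${base}/rows/${doc.codaRowId}`, {
      method: "PUT",
      headers,
      body: JSON.stringify({
        row: {
          cells: [
            { column: "Title", value: doc.title },
            { column: "Body", value: doc.body },
          ],
        },
      }),
    });
  }

  // 2. All remaining docs: insert them in a single call (no keyColumns needed).
  const newDocs = docs.filter((d) => !d.codaRowId);
  if (newDocs.length > 0) {
    await fetch(`${base}/rows`, {
      method: "POST",
      headers,
      body: JSON.stringify({
        rows: newDocs.map((d) => ({
          cells: [
            { column: "Title", value: d.title },
            { column: "Body", value: d.body },
          ],
        })),
      }),
    });
  }
}
```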
Yes, I do have a UI to map columns, but that doesn’t mean there’s a unique column in the doc.
I am in the same case as @Lea_Verou above: I do not want to change people’s docs. I want to download, modify, and put back, in a non-intrusive manner.
Your solution implies people would be using my plugin all the time; that once they buy in, they’re only operating through my plugin, and that the sync would be daily. But in reality, people might use the plugin when they have 3 rows, then not touch it for months, then pick it back up when they have 1000 rows. Many changes might have happened in the meantime.
Ideally, I only have to ask people for an API key. I can then show a list of docs in the plugin config, and after one is picked, a list of tables, and done.
I certainly do show a list of fields and their mappings, but that can be automatically done (create one frontmatter key per column, named like the column), and the user may never have to touch it.
It’s a one-API-key paste, two-click operation; in the worst case, editing a few column names in situ.
With this specific custom-unique-id constraint, I need to:
Warn my user that I will create a new column; explain why.
Suggest a name, and probably allow them to change it so it fits in their naming scheme on Coda.
Or worse, give them instructions to go into Coda and add the column themselves, then have them do the process again and pick an id column.
This is what Zapier does iirc, and while it is workable, it is certainly not a nice flow.
Additionally I need to handle the following edge cases at minimum:
Two or more rows have the same ID, because the user changed something or didn’t follow instructions. Worse: I may not even discover duplicate IDs unless I read all rows, and I may update the wrong rows as a result.
User deleted values in the unique ID column, or changed values in the column.
User deleted the column.
All of which are likely to happen if more than one person operates on the document.
I could decide to not handle those cases, and just crash, of course; that’s again what Zapier does. But I simply do not find that an acceptable standard of friendly UX.
Even if I decided to crash and ask the user to change their model, I still have to:
Design a failure state
Show the column picker again
Present the explanations again
Worse: my plugin suddenly requires more emotional investment on the user’s end. Whereas before they could add the plugin to their Obsidian, enter an API key, and test, now they need to either operate on the Coda doc themselves or trust me to operate on it.
Zapier puts up with it because to them, it’s just one more API in a sea of APIs. And people can put up with the necessary changes because Zapier’s value proposition is clear, and the name Zapier holds enough clout to not feel risky.
But for me, building an Obsidian-Coda open source plugin for free, it’s not doable. If my users need to take additional steps to try some random plugin from a stranger, a large number simply aren’t going to do it. And even if they do, if I can’t present my users with what I consider a baseline standard of ease of use, I’m not going to do it at all. I’m not interested in dealing with the support requests and bug reports that would come from that.
In short, this omission from the API:
Makes my plugin many times more complicated to write (and maintain, debug, …)
Offers a much larger bug surface
Requires much more from the user, technically and emotionally
Creates many more branching states for me to handle
It’s all the more frustrating that there is a unique Row ID in the model, which makes this entire complexity feel artificial, forced, and unlike anything else in the tech space. I’m sure you have internal technical reasons, and I don’t pretend to know better than you how doable it is, but from the outside it seems an entirely absurd omission.
If you don’t want to rely on a unique column, then I think the pattern looks like:
[…]
For each doc that was previously synced to one of those rows, call updateRow to update it.
Sorry, you’re right; but that is indeed what I meant. If a person made 100 changes in 100 documents (e.g., search-and-replaced data in many docs), I now have to make 100 queries.
Again, it’s acceptable (maybe? I’m not aware whether there are API call limits and whether I’d need to space the calls), but it just feels very frustrating and inefficient. It is, again, unlike anything else.
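If I do go the one-call-per-row route, I’d presumably end up with something like the sketch below, with an arbitrary delay between calls since I don’t know the actual rate limits (the IDs and the delay value are guesses, not official numbers):

```ts
// Sketch: per-row updates spaced out with an arbitrary delay (rate limits unverified).
const token = "your-api-token";
const docId = "abc123";
const tableId = "grid-xyz";
const base = `https://coda.io/apis/v1/docs/${docId}/tables/${tableId}`;
const headers = {
  Authorization: `Bearer ${token}`,
  "Content-Type": "application/json",
};

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function updateRowsOneByOne(
  changes: { rowId: string; cells: { column: string; value: unknown }[] }[]
) {
  for (const change of changes) {
    await fetch(`${base}/rows/${change.rowId}`, {
      method: "PUT",
      headers,
      body: JSON.stringify({ row: { cells: change.cells } }),
    });
    await sleep(200); // arbitrary spacing between calls, not an official limit
  }
}
```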
I guess I’m simply unable to form a mental data model that’d make this difficult to expose.
Anyway, bottom line: My Obsidian-Coda plugin will be used by me only for the time being. Hope you change your mind some day.
For anyone ending up here; I bit the bullet and decided to create the column myself, hidden.
Only, that also seems to not be possible. The API docs certainly do not mention that anywhere, and there’s this thread:
So there isn’t even a way to do this in a friendly way.
The only way you can sync is indeed to tell the user to please create a column with such-and-such properties, not set any options on it, not set a formula, and name it such-and-such so we can find it (or name it as they wish and make sure to select it correctly when they come back, lest we overwrite a column they need), and then come back and continue the process. And to please make sure not to touch the column in the future.
Again I’m sure there are good technical reasons for those weird omissions, but they are very limiting.
@Jad_Sarout1 - Just to be clear, there certainly should be an endpoint that allows for bulk row updates, and I don’t think there is any technical reason why we don’t have one. With many things in software it’s just a matter of priorities, and while I’m campaigning internally for API improvements it can be hard to weigh the impact against other projects.