I came up with a solution to my problem that I’m happy with, so I thought I’d share. Feel free to copy:
I tried a bunch of methods to get data out of Coda and back in again. At first I had a basic import/export proof of concept working: using Copy to Clipboard to get the data out, pasting it as JSON into a cell, and uploading that object via the API. But I ran into a lot of problems, like cell character limits. Every time I hit a limitation, I engineered a solution to get larger quantities of data in or out. In the end, I can safely say that I can export/import as much data as I like. The only limitation I have now is whatever limits Coda itself has (in terms of Doc size).
The data model for this example consists of Trades, each of which can have one currency “Pairs” lookup, and multiple Tags. There are also Transactions - each of which has a link to a single Account and a single Trade. In effect, a Trade item can have many Transactions assigned to it (so a reverse lookup).
I have introduced the concept of a “Database” to my Doc - another table to which all of the tables in the app are linked. This way I can switch the “Database” to view an entirely separate dataset:
- All tables are filtered by database using the “DatabaseSelect” interactive filter.
- The “Database” field is set as the default value for new rows on all tables.
- All lookup select columns filter the available options by the currently selected database:
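For example, the selectable items for the Pair lookup on the Trades table can be filtered like this (table and column names follow the example data model above - this is a sketch of the idea, not the exact formula from my doc):

```
Pairs.Filter(Database = thisRow.Database)
```

This goes in the lookup column’s item settings, so the dropdown only ever offers rows belonging to the currently selected database.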
This gives us seamless switching between databases. From the user’s perspective, creating a new row in the Database table and selecting it will effectively give them a completely empty app to start filling with data.
Pretty simple so far. However, my end goal was to be able to export an entire database and then re-import it - either as a new database in the same doc, or to migrate to another version of the Doc (possibly a future update with extra features a user would like to take advantage of).
The goal with the export is to get a .json file of all data from all tables in one hit, which can then be imported. My first approach to export a row was to create an “Object” column on each table, and have a formula build an object for each row which can later be put into an array. In the end I settled on creating a dedicated “Export” table, containing a row for each Database, and a column for each table to be exported. This way my object definitions are all in one place and it’s much easier to manage:
In each column, I grab all of the rows from the table I’m exporting, filter by the Database of the current row, and construct my row object using the Object() function. The requirement for import is that each exportable table must have a Row ID column (thisRow.RowId()) and an import ID column (which is just a blank cell, for use later). I also include the gridReference string using the hidden ROW_KEY formula so that my import pack can extract the table ID for the API request.
With the required fields out of the way, I then add all of the columns I want to export, being sure to only grab user input (non-formulaic) values. When adding a lookup column, I take the Row ID of the lookup value, so that the relationships between the tables can be paired properly when imported:
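As a rough sketch, the formula in the Trades column of the Export table looks something like this (the column names are illustrative, and ROW_KEY is the hidden formula mentioned above):

```
Trades
  .Filter(Database = thisRow.Database)
  .FormulaMap(
    Object(
      "rowId", CurrentValue.RowId(),
      "importId", "",
      "gridReference", CurrentValue.ROW_KEY(),
      "Name", CurrentValue.Name,
      "Pair", CurrentValue.Pair.RowId(),
      "Tags", CurrentValue.Tags.RowId()
    )
  )
```

Note how the Pair and Tags lookups are reduced to Row IDs rather than row references.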
I then combine the list of rows from each table into a monolithic array, the output of which looks something like this:
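Each element is one row object carrying its export metadata plus the user-entered values - here with made-up illustrative data, showing one Trade row and one Transaction row side by side in the same array:

```json
[
  {
    "rowId": 1,
    "importId": "",
    "gridReference": "grid-AbC123xYz/i-1",
    "Name": "Long BTC",
    "Pair": 4,
    "Tags": [2, 7]
  },
  {
    "rowId": 9,
    "importId": "",
    "gridReference": "grid-DeF456uVw/i-9",
    "Amount": 250,
    "Account": 3,
    "Trade": 1
  }
]
```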
On my Databases table, I have an Export button with the following formula. I am using the free Temp File pack to force the browser to download my JSON array as a file. Great.
The only problem is, the pack can only process a certain amount of text before failing (everything has its limit). To get around this, I create an array sequence to split my monolithic array into smaller chunks. The button action then runs on a loop, triggering the Temp File pack each time - so I will get multiple files (database1.json, database2.json etc), with the array length of each file depending on my “export chunk length” setting - until the entire array is downloaded.
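The chunking logic itself is simple. In plain TypeScript (the helper names and file-naming scheme here are a sketch of the idea, not the actual button formula) it amounts to:

```typescript
// Split the monolithic export array into fixed-size chunks so that each
// Temp File call stays within the pack's text limit.
function chunkArray<T>(items: T[], chunkLength: number): T[][] {
  if (chunkLength < 1) {
    throw new Error("chunkLength must be at least 1");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkLength) {
    chunks.push(items.slice(i, i + chunkLength));
  }
  return chunks;
}

// Each chunk is downloaded as its own numbered file:
// database1.json, database2.json, ...
function fileNames(chunkCount: number, baseName = "database"): string[] {
  return Array.from({ length: chunkCount }, (_, i) => `${baseName}${i + 1}.json`);
}
```

The “export chunk length” setting is just the `chunkLength` argument: a bigger value means fewer, larger files.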
I use a file field to accept the user’s files. Then I run multiple nested RunActions() and ForEach() loops within my import button action, which performs the following tasks:
- Loop through and download all of the JSON files we just uploaded into memory, and reconstruct my monolithic array. For this I developed a simple Pack function to just retrieve a file URL as a JSON string.
- Split the array returned into chunks (the length of which is based on another configurable setting) and run a sequence similar to the one on the export button. But this time, we dump each chunk of array data into a hidden cell. The reason I split it into chunks here is that, again, I found a limitation - a cell can only handle around 40,000 characters of text before it gets upset. So we need to split it up for large datasets.
- Once the chunk cell has been populated, my loop hits the “Import chunk” action to actually perform the API request via my pack (this is another action button which I hid from view).
- I then add a delay using the hidden _DELAY() feature before the loop goes around again - just so I don’t end up firing off a million requests at once - I also added a rate limit setting for this but actually, the API and the Pack Studio seem to be able to handle A LOT of requests (trust me, I’ve been hammering it!), so I’m not sure I really needed it in the end.
- At each step, my formula updates various cells in the Database table, so that I can feed back to the user what’s happening with a progress bar and some status messages. The last thing you want when you are performing large, scary file operations is no feedback!
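The first two steps above can be sketched in TypeScript (the ~40,000-character limit is the one I ran into; the helper names are mine, not the pack’s real API):

```typescript
type Row = Record<string, unknown>;

// Step 1: rebuild the monolithic array from the downloaded file contents.
// Each file holds one JSON array; concatenate them in upload order.
function reassemble(fileContents: string[]): Row[] {
  const all: Row[] = [];
  for (const text of fileContents) {
    all.push(...(JSON.parse(text) as Row[]));
  }
  return all;
}

// Step 2: re-chunk the array as JSON strings that each fit inside a
// single cell (Coda cells choke at around 40,000 characters).
function chunkForCell(rows: Row[], maxChars = 40000): string[] {
  const chunks: string[] = [];
  let current: Row[] = [];
  for (const row of rows) {
    const candidate = JSON.stringify([...current, row]);
    if (candidate.length > maxChars && current.length > 0) {
      chunks.push(JSON.stringify(current));
      current = [row];
    } else {
      current.push(row);
    }
  }
  if (current.length > 0) {
    chunks.push(JSON.stringify(current));
  }
  return chunks;
}
```

Each string that `chunkForCell` produces is what gets dumped into the hidden cell before the “Import chunk” action fires.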
Here’s what the import Chunk button action looks like. Simple:
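In spirit, it just hands the current chunk cell to the pack’s action formula - something along these lines, where the formula and column names are placeholders rather than my pack’s real ones:

```
ImportRows(
  thisRow.ApiToken,
  thisRow.ChunkData
)
```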
My Pack function then takes the data chunk, transforms it into something the Coda API understands, and inserts the rows. But before it calls the API, it also copies the Row ID to the empty Import ID column. This way, when the data is received by the API, Coda can link all of the row relationships back together. The only problem is that for this to work, the user has to set the “Import ID” column as the display column on each table before they run the import. I wish there was a way to automate this, or some better way to link the rows without this manual step (I can’t tell you how many times I forgot to do it while developing this feature!), but I can live with this for now.
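The transformation inside the pack can be sketched like this - building the body for the Coda API’s insert-rows endpoint, and copying each exported Row ID into the “Import ID” cell on the way through. The interface and helper names are my own, and the real pack also has to resolve the table ID from `gridReference` before making the request:

```typescript
// Shape of one exported row object: the required metadata fields plus
// whatever user-entered columns the export formula included.
interface ExportedRow {
  rowId: number;
  [column: string]: unknown;
}

// Metadata fields that describe the row rather than fill a cell.
const META_FIELDS = new Set(["rowId", "importId", "gridReference"]);

// Build the request body for POST /docs/{docId}/tables/{tableId}/rows.
// The original Row ID is written into the "Import ID" column so that
// Coda can pair the lookup relationships back together on arrival.
function buildInsertBody(rows: ExportedRow[]) {
  return {
    rows: rows.map(row => ({
      cells: [
        { column: "Import ID", value: row.rowId },
        ...Object.entries(row)
          .filter(([name]) => !META_FIELDS.has(name))
          .map(([name, value]) => ({ column: name, value })),
      ],
    })),
  };
}
```

This is also why the display column matters: the API matches lookup values against the display column, so “Import ID” has to be the display column for the old Row IDs to resolve.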
Once the data is imported via the API, and Coda has completed its task of re-linking the rows and bringing them into the Doc, we now have a bunch of non-empty “Import ID” fields that we don’t want. If we want to import another dataset, those leftover fields will cause us problems - data could get overwritten or linked incorrectly. So I needed a “clean” button that lights up when the import completes - this simply erases all of the temporary import IDs in all tables, “cleaning” my database.
After all is said and done, we end up with the following user experience, which is not bad at all in my opinion (I’m using silly small file sizes to demonstrate the chunking method):
I have tested this method with around 40,000 rows - and while Coda definitely slows down, and the import takes a while, the process runs smoothly and reliably with basically unlimited amounts of data, thanks to the chunk/splitting method.
I learned a lot about Coda’s limitations while developing this feature. But I have to say I am very impressed with the sheer scope of creativity this platform provides. I am impressed with how powerful RunActions is and its sequential nature (it can be used like a promise chain, which I hadn’t realised previously).
The only downside is that I am relying heavily on hidden formulas throughout the entire process. But I can live with it.
Hopefully this helps someone out.
Enjoy your weekends