Check out this doc for an explainer on how to build an Archive Doc.
An archive doc backs up old rows from your large docs so they can be deleted. That way your current doc runs faster, but you’ll still have access to your data for building charts or running queries.
_Merge() is still an unsupported function, right? Do you feel okay about this?
What I like about the classic AddOrModifyRows() logic is that it prints the data as plain text, while in your solution you have to unpack every column with a formula. In your scenario you automatically get every column, and I love this idea, while the AddOrModifyRows() approach breaks once a column that is part of the setup is deleted. So I do see the serious advantage of your solution, but I also notice that the archive table becomes a bit heavier in this scenario due to the many unpack formulas you have to deploy. How do you feel about this?
I asked for access to the source doc to check out the details.
I like the idea of color coding each step in the rules/code so it’s clearer what should happen where, and so that when troubleshooting it’s easier to know where to look.
Since I’m unpacking the data with _Merge() I’m not overly concerned about it changing. All I’m grabbing is the underlying json datastructure, which will always be useful. If the formula were ever to break, it would just cause my archive automation to fail, which will just mean I’ll have to go change it. However, no data will be lost, and the retrieval function (ParseJSON) is definitely supported, so I’ll always be able to grab my data.
Re the overhead of the ParseJSON formulas, I’m not sure how much slower it is. Presumably this is similar to what Coda is doing under the hood when it looks up the contents of a row in a Cross Doc table anyway. Either way, this is a tradeoff in exchange for way less maintenance on my part as columns are added and removed from the source doc.
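For anyone who wants to see the shape of this, here’s a minimal sketch. The table and column names (Archive, Data, Name, Source Row) are made up for illustration, and _Merge() is the hidden, unsupported function discussed above, so treat it as use-at-your-own-risk:

```
Data column in the Archive table, written by the archive automation:
[Source Row]._Merge()

Unpacking one column back out of the JSON in the Archive table:
thisRow.Data.ParseJSON("$.Name")
```

The exact JSON path depends on what _Merge() emits in your doc, so inspect the raw text in the Data column first and adjust the path to match.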
What would be really great is if Coda supported a way to copy the row object that Cross Doc returns into a new table as an object. Then you could do this without _Merge() and could reference it with just Row.Field like usual.
Or if Coda could cross doc more than 10k rows, then you could do this without archiving to a separate table.
Or if Coda could run quickly with more than 100k rows and complex lookup formulas, then you would never have to cross doc.
All of this is worth pointing out just to show that the root of the problem is performance; all of this is a workaround. And of course, Coda’s working hard on performance, and these sorts of workarounds may always be necessary at some level of doc size and complexity.
Update! Don’t use this approach. For reasons I haven’t yet figured out, this doesn’t work. My syncs constantly fail, and so do my automations. I don’t know why. I’ll update this thread once I figure out what’s going on.
Update! Nevermind, it works! There was a performance issue because of a formula I was using. That is now fixed.
Here’s what changed.
Before, we were detecting whether the sync table was up to date with the archive table by looking at all of the rows of the archive table and trying to find one that matches. That’s silly! So many comparisons.
Now, when we archive a row, we cache the archived contents in a column on the sync table called Archived Contents. The column that checks whether the content is archived can now just compare the previously archived contents against the current row’s data. If they’re the same, the row is up to date. If they’re different, the row has changed and needs to be updated, so mark it as not archived. Duh!
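In formula terms, the change was roughly this (table and column names here are made up):

```
Before: the archived-check scanned the whole archive table
[Archive Table].Filter(Data = thisRow._Merge()).Count() > 0

Now: it just compares against the snapshot cached at archive time
thisRow.[Archived Contents] = thisRow._Merge()
```

The archive automation writes that snapshot as it archives, e.g. with ModifyRows(thisRow, thisRow.[Archived Contents], thisRow._Merge()), so each row only ever compares against its own cached copy instead of scanning the whole table.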
We went from so slow that everything was frozen to blazing fast! Thank you @Shitong_Stone_Shou on the Coda team!
The video needs to be updated to match the new formula, but I’ll have to do that later.
I’m not entirely sure about this (I would have said the same thing, though), but you need some text Coda will be able to interpret as JSON in your column.
(It seems to work like the Separate as date option, which AFAIK is only available in the menu when right-clicking on a Date field.)
Holding X archived rows in a single row in CSV format
Using ParseCSV(), FormulaMap(), and AddRow() to recover the rows
It seems to be working really well, but I just wanted to see if y’all had any opinions on limitations. I don’t want to go with this method if I’m going to run into obstacles down the road. (In this instance, I won’t need the data for queries, charts, etc. after it’s been archived; I just want to store the data and not lose it.)
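Roughly, the recovery step looks like this. Everything here is a placeholder name, and I’m assuming ParseCSV() hands back one list of values per archived line:

```
thisRow.[Archived Data].ParseCSV().FormulaMap(
  AddRow([Restored Table],
    [Restored Table].Name, CurrentValue.Nth(1),
    [Restored Table].Amount, CurrentValue.Nth(2),
    [Restored Table].Date, CurrentValue.Nth(3).ToDate()
  )
)
```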
Here’s a random example doc, just as a proof of concept:
Edit: The only real limit I’m noticing is cell length, i.e. the number of characters you can hold in a single cell:
Cell with rich text: ~7,000 character limit
Cell with no rich text: ~43,000 character limit
You can store rich values (dates, for example) as plain text and they’re still recoverable. The run I did for the 43,000-character limit was:
1156 rows
4 columns of data per row
If you had to archive more than 43,000 characters, you could easily do it in two runs.
Compared to the archiving approach, this is incredibly simple. However, it will still grow the size of the source doc, and you obviously can’t use the data in the archived rows for computations (e.g. you can’t offer summary charts).
I’ll definitely be using this in the future though. Have you noticed a speedup?
Hey Christiaan! Yeah, I tried using both Character(10) and LineBreak(); there was no difference in terms of speed or the amount of data stored, so I went with LineBreak() since, when it’s released, it will be more accessible to the average user (because seriously, Character(10) is so arbitrary).
I’m storing data in a cell because I’m essentially putting over 1,000 rows’ worth of data into a single archived row, thereby saving room and hopefully increasing the speed of the doc by archiving data in a simple way.
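For reference, the packing side is just joining each row’s columns with commas and the rows with line breaks. A rough sketch, with made-up table and column names:

```
Join(LineBreak(),
  [Rows To Archive].FormulaMap(
    Join(",", CurrentValue.Name, CurrentValue.Amount, CurrentValue.Date.ToText())
  )
)
```

One caveat: if your text values can themselves contain commas, you’d want to escape them or pick a rarer delimiter, or the recovery step will split in the wrong places.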
I have clients with large docs and plenty of rows, and this would allow their docs to scale more easily, work faster, etc.
I guess you could combine cell data with main-table data pretty easily, too.
Like, if you want to answer the question “how many sales have we made this year?”, is there an idiomatic way to simultaneously use non-archived and archived data?
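One way I could imagine doing it, assuming a hypothetical Sales table plus an Archive table with a Year column and a Data cell of packed CSV rows:

```
[Sales].Filter(Date.Year() = Today().Year()).Count() +
[Archive].Filter(Year = Today().Year())
  .FormulaMap(CurrentValue.Data.ParseCSV().Count())
  .Sum()
```

That only works for aggregates you can compute from the packed text, though; anything that needs actual row references would require unpacking first.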