How to handle large docs (Archive Docs)

Check out this doc for an explainer on how to build an Archive Doc.

An archive doc can back up old rows from your large docs so they can be deleted, that way your current doc can run faster, but you’ll still have access to your data for building charts or running queries.

Included is a video of how it works so that you can implement it on your own:

Archive Doc - Watch Video

Thank you to @Kyleigh_Johnson, @Coda_Zac, and @Lena_Webster for help figuring this out!

7 Likes

Woo Woo Woo! Arc-high-five!! :clap:

2 Likes

hi @Connor_McCormick1 , very well done et Merci!

  • _Merge() is still an unsupported function, right? Do you feel okay about this?
  • What I like about the classic AddOrModifyRows() logic is that it prints the data as plain text, while in your solution you have to unpack every column with a formula. In your scenario you have automatically every column and I love this idea, while the AddOrModifyRows() breaks once a column is deleted that is part of the set-up. So I do see the serious advantage of your solution, but I also notice that the Archive Table becomes a bit heavier in this scenario due to the many unpack formulas you have to deploy. How do you feel about this?

I asked for acces to the source doc to check out the details

thanks again, great work!
Christiaan

1 Like

Dear @Connor_McCormick1 ,

I like the idea of color coding each step in the rules/code so it’s more clear what where should happen and in case of trouble shooting easier to know where to look for :mag::mag_right:

1 Like

@Jean_Pierre_Traets, @Christiaan_Huizer, @Jono_Bouwmeester, sharing settings on the Source Doc are now updated. Let me know if you still have any problems.

@Christiaan_Huizer re your questions:

  1. Since I’m unpacking the data with _Merge() I’m not overly concerned about it changing. All I’m grabbing is the underlying json datastructure, which will always be useful. If the formula were ever to break, it would just cause my archive automation to fail, which will just mean I’ll have to go change it. However, no data will be lost, and the retrieval function (ParseJSON) is definitely supported, so I’ll always be able to grab my data.
  2. Re the overhead of the ParseJSON formulas, I’m not sure how much slower it is. Presumably this is similar to what Coda is doing under the hood when it looks up the contents of a row in a Cross Doc table anyway. Either way, this is a tradeoff in exchange for way less maintenance on my part as columns are added and removed from the source doc.

What would be really great is if Coda supported a way to copy the row object that Cross Doc returns into a new table as an object. Then you could do this without _Merge() and could reference with just Row.Field like usual.

Or if Coda could cross doc more than 10k rows, then you could do this without archiving to a separate table.

Or if Coda could run quickly with more than 100k rows and complex lookup formulas, then you would never have to cross doc.

:point_up: all of this is worth pointing out just to show that the root of the problem is performance, and all of this is a workaround. And of course, Coda’s working hard on performance, and also these sorts of workarounds may always be necessary at some level of doc size and complexity :slight_smile:

1 Like

Update! Don’t use this approach. For reasons I haven’t yet figured out, this doesn’t work. My syncs constantly fail, and so do my automations. I don’t know why. I’ll update this thread once I figure out what’s going on

Fixed!

2 Likes

Update! Nevermind, it works! There was a performance issue because of a formula I was using. That is now fixed.

Here’s what changed.

Before, we were detecting whether the sync table was up-to-date with the archive table by looking at all of the rows of the archive table and trying to find one that matches. That’s silly! So many comparisons.

Now, when we archive a row we cache the archived contents in a row of the sync table called Archived Contents. The column that checks if the content is archived now can just look at the previously archived contents to see if it matches the current row’s data. If it’s the same, then the row is up to date. If it’s different then it has changed and needs to be updated, so mark the row as not archived. Duh!

We went from so slow that everything was frozen to blazing fast! Thank you @Shitong_Stone_Shou on the Coda team!

The video needs to be updated to match the new formula, but I’ll have to do that later.

2 Likes

Thank you, Coda team! I don’t know if this is new, but just noticed it and it’s really helpful!
image

I don’t see this within my docs. Where do you get this JSON option?
Thanks @Connor_McCormick1

Might need to turn developer mode on?

coda.io/account

I’m not entirely sure about this :point_up: (I would have said the same thing though :blush: ) but you need some text Coda will be able to interpret as JSON in your column :blush:.
(It seems to work like the Separate as date which is only available in the menu when right-clicking on a Date field AFAIK :blush: )

Ahh good call! Try getting some json and pasting it in the cell

1 Like