This doc is too big to sync

When using the Cross-doc functionality, I’m getting an error:

This doc is too big to sync

When is a doc too big to sync? How do I know the size of a document?

If you write in to Support via the “?” icon at the bottom right of the doc, we can likely give you a ballpark idea of what you’re running into.

Finding a general rule of thumb is difficult, though, because docs vary quite a bit. Some docs have giant payloads coming in from Jira or other Packs and are really heavy with formulas, while others have tens of thousands of rows but are lightweight on formulas, so they run fairly quickly.

A tough part of Coda is that it can do so much, and everyone uses it in really creative ways. There are lots of outside-the-box thinkers here breaking new ground, which makes it tough to pin down standard metrics.

Thank you for your reply @BenLee.

I thought imports brought in tables, not full documents. Is that not so?

We’ve been able to synchronize tables from this document in the past, so I imagine something has changed in Coda’s limits, or we’ve simply gone over one here.

If you’re referring to my mention of some Packs pulling in a large amount of data, it’s usually a JSON dataset loaded into one column, with the other columns projected out from that initial column. This tends to create larger data sizes per row than a disconnected Coda doc would generate.

An issue you may be running into is overall doc size. This refers to the “snapshot” size: the full amount of data needed to run the doc in your browser. Loading everything this way is what lets you keep using an open Coda browser window when you lose your internet connection. When this snapshot grows too large, it can cause issues with other things, like the API that does this syncing.

We can give you a better idea of the issue through Support; it just takes an engineer looking at the stats.

Hi @BenLee, I think you’re right and the problem is overall doc size, because we’re not using Packs with large amounts of data in this doc.

There isn’t much we can do here for now, so we’ll wait for improvements in how large documents are handled.

Something that might be valuable for you to know: offline access isn’t a strong pain point for us. We’d much rather have our documents work well while we’re online than keep them working while we’re offline. With that in mind, a potential improvement would be lazy loading: only bring in the parts of the document that are needed as they’re used.

Ditto here: I’d prefer an online-first, lazy-loading approach over offline access.

I have a lot of questions about document size and performance issues, and would love a plain-language guidance document to help me plan my docs before I start building. And perhaps a dedicated category for performance-related issues here in the Community?

For example, is there a performance penalty in a one-to-many Cross-doc implementation, and if so, what are the limiting factors? In my case, I have a master document with a small-to-medium sized table (<100 rows, <30 columns). I want to use Cross-doc and filters to distribute data to 25-50 child Coda docs. Assuming that only a few of those child docs would be open concurrently, and that frequent syncs back to the master doc aren’t required, should I anticipate performance issues? From a performance perspective, is it better to filter the data in table views in the master document, or in the child documents?

Thanks,
JB

Here are a few very short facts. I’ll expand on these in my blog when I have more time.

  1. The doc size limit is 125 megabytes. This is the 0.json (schema file) and 1.json (data file) sizes combined, in their uncompressed form. A quick way to check this yourself is sketched after this list.

    When the doc goes over this limit it becomes “too big” and you won’t be able to access it through the API / Cross-doc, and automations will most likely not run either. You’ll still be able to open it in your browser — it just won’t open on Coda servers for automation/API access purposes. I got this info from Coda support.

  2. The biggest impact comes from button columns and conditional formats.
    Coda stores buttons very inefficiently: a button is set up at column level and the configuration is the same for all rows, yet in 1.json it’s duplicated for every row.
    Conditional format settings are also duplicated for each row, regardless of whether those settings apply to that row. They can even appear duplicated multiple times per row.
    Cutting down on buttons and conditional formatting is the most effective first-aid measure to bring doc size down. There are workarounds for buttons that I’ll expand on later.

  3. Cross-doc data takes more space than local tables. This is mostly because table/row identifiers take 64 characters instead of 10, so every time you reference anything from a cross-doc table, the stored formulas take roughly 5–7 times more space.
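
To make the size check from point 1 concrete, here’s a minimal sketch. This is my own tooling, not anything official from Coda: it assumes you’ve saved the doc’s two payloads from your browser’s DevTools Network tab to local files, uncompressed. The 125 MB figure is the one quoted above; newer docs name the payloads ss-schema and ss-data, as noted later in this thread.

```python
# size_check.py: rough check of a doc's snapshot size against the
# 125 MB limit quoted by Coda support. Assumes the schema and data
# payloads (0.json / 1.json, or ss-schema / ss-data in newer docs)
# were saved uncompressed from the browser DevTools Network tab.
import os
import sys

LIMIT_BYTES = 125 * 1000 * 1000  # 125 megabytes

def report(schema_path: str, data_path: str) -> None:
    schema_size = os.path.getsize(schema_path)  # uncompressed bytes on disk
    data_size = os.path.getsize(data_path)
    total = schema_size + data_size
    print(f"schema: {schema_size / 1e6:8.1f} MB")
    print(f"data:   {data_size / 1e6:8.1f} MB")
    print(f"total:  {total / 1e6:8.1f} MB (limit: {LIMIT_BYTES / 1e6:.0f} MB)")
    if total > LIMIT_BYTES:
        print("Over the limit: expect 'This doc is too big to sync' "
              "from Cross-doc / API, and automations may stop running.")

if __name__ == "__main__":
    report(sys.argv[1], sys.argv[2])
```

Run it as `python size_check.py 0.json 1.json` (or with the ss-schema / ss-data files).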

The bad news is that Coda stores doc data very inefficiently, as I learned from debugging the 0.json and 1.json files myself. The good news is that there’s room for eventual improvement.

Thanks a lot @Paul_Danyliuk: impressive and useful post, as usual.

It’s a good, concrete starting point after the many threads about this topic.

Let’s wait for feedback from Coda and hope the road to improvements isn’t too long.

Super helpful @Paul_Danyliuk

For the conditional formatting, I understand from what you’re saying that all conditional formatting rules for a table, whether they match a row or not, are embedded into each cell of each row. Say you have a rule that only affects 2 columns, though: does that rule also get embedded in the cells of the other columns?

Basically, if conditional formatting is important to my UI, can I save doc weight by having the formatting apply to a single column rather than the whole row?

I haven’t looked very closely into it; all I know is that conditional formatting can be heavy and I should dial it down. In the schema file (0.json) it looked like a lot of extra, duplicate-looking columns were generated for just a handful of rules; the columns were named with “conditional-format” in their IDs.

At the time I was looking into that, I was told Coda was working on optimizing it, so I didn’t examine it very closely.
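
If you want to gauge this yourself, a quick heuristic (again, my own approach, not an official method) is to count how many times that marker string appears in a saved copy of the schema payload; this makes no assumptions about the JSON structure:

```python
# count_cf.py: count occurrences of "conditional-format" in a saved
# schema payload (0.json / ss-schema) as a rough proxy for how much
# weight conditional formatting rules add. Heuristic only.
import sys

def main(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        text = f.read()
    hits = text.count("conditional-format")
    print(f"{hits} occurrences of 'conditional-format' "
          f"in {len(text) / 1e6:.1f} MB of schema")

if __name__ == "__main__":
    main(sys.argv[1])
```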

To those coming after, it seems these file names have recently changed:

0.json → ss-schema
1.json → ss-data

Edit: and in Chrome you may need to hover over the file size to see the uncompressed size.
