Size Limits & Doc Design Best Practices

James_Eades · May 17, 2020, 4:58pm

Hi All,

So I have been playing around with coda and I love the tool, but I am running into a few of the issues that power users like @Paul_Danyliuk, @Bill_French, and others have referenced in the community previously. I actually read through the “Why would you NOT try to manage all a Tech Start-Up’s stuff in Coda?” thread and it still left me wondering if it is possible to make things work for anything that will exceed a small task management doc or wiki in coda.

Specifically I have been working on a watered-down CRM doc and hit the hard limits of doc size when I imported some prospect data into my doc (20,000 rows) where the doc just stopped working on mobile entirely and struggled in the browser. I recognized that I had some inefficiencies in my doc filters, formulas, and structure (buttons are so useful but seem to be resource hogs). I figured I would take a second look at everything and start fresh but I find myself mentally running into a bit of a brick wall trying to plan out how to most efficiently design a doc with all of the constraints I am now aware of - at the end of the day it may not even be possible using coda which would be a big bummer. I guess I am struggling with a few of the following conundrums:

If the doc size is too large it can be split into smaller docs - but if data within the doc is dependent on other tables, splitting the doc would create another issue with the inefficiencies and limitations from cross-doc.
Was using a search table to filter a view by user, but that leads to a couple of inefficiencies with formulas - but if new views are added instead of a search table then the doc size increased even more.
Interactive control filters are WAY inefficient - my search table and filter formulas would take less than 30 seconds to calculate and show results where an interactive control filter would take 2 minutes on the same 5,000 records?
I also did the same math that others have - if I have 5 employees entering 40 notes a day that will add up to 48,000 records by the end of one year and that by itself will probably break coda (ok so you archive some of the notes after a while, but still it seems like it would very quickly break down from moderate usage).
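The row-count arithmetic in the last point can be checked with a short sketch. The 240 working days per year is an assumption chosen because it reproduces the 48,000 figure; the thread does not state it, and the function names are hypothetical.

```python
# Back-of-the-envelope row growth for a notes table that is never archived.
# ASSUMPTION: 240 working days per year (not stated in the thread; it is the
# value that reproduces the 48,000 rows quoted above).
WORKING_DAYS_PER_YEAR = 240


def rows_after(years: float, employees: int, notes_per_day: int,
               working_days: int = WORKING_DAYS_PER_YEAR) -> int:
    """Total rows accumulated after `years` if nothing is archived."""
    return int(employees * notes_per_day * working_days * years)


def days_until(limit: int, employees: int, notes_per_day: int) -> int:
    """Working days until the table first reaches `limit` rows."""
    per_day = employees * notes_per_day
    return -(-limit // per_day)  # ceiling division


print(rows_after(1, employees=5, notes_per_day=40))        # 48000
print(days_until(20_000, employees=5, notes_per_day=40))   # 100
```

Under the same assumption, notes alone would reach 20,000 rows (the size at which the doc above stopped working on mobile) after about 100 working days.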

To me this seems to be Coda’s biggest opportunity and flaw - there really should be a way to handle more data in these docs. I understand some of the under-the-hood aspects of coda thanks to Paul’s extensive research and very helpful posts on the community, but I guess I am still really surprised that this isn’t priority #1 for the company. I feel like these docs need to be more efficiently handled by coda.io - and yes good design will still come into play but when you talk about companies using something like this you are bound to hit a lot of these hard limits very quickly.

Does anyone have any solutions to these issues that still leverage coda.io? Are there any ways to split docs with data dependencies and avoid inefficiencies? Does anyone else here scratch their head when they see cross-doc and wonder if it is even worth it to use (I personally laugh every time I look into it and try to find an efficient use)?

Johg_Ananda · May 17, 2020, 6:33pm

Hey @James_Eades, I empathize with what you have written and have myself butted up against many of the limits. At the end of the day you have to realize that while that seems like a huge limitation for the app, and in a sense it is, there is still so much that can be done at the <10k row space. Take a look at the template gallery. People can organize their household, small business, meetings, processes, etc.

You, like many of us in the community here, see the massive potential to use no-code apps to just focus on the ‘building’ of our creative pursuit rather than dealing with the infrastructure (servers, security, etc). However, as we write this in May 2020, if you want to have a massively scalable app (and I know that your 5 employees @ 40 notes a day doesn’t seem ‘massively scalable’, but in this sense it is) you still have to deal with all that stuff that none of us here want to deal with (servers, security, etc).

I was somewhat early to Coda and dealt with many of these issues, and I can say Coda has been very conscientious and made a huge amount of progress. Things just work better over time as they slipstream improvements into the app. So it is getting better over time, and will continue to do so.

There is a LOT you can do with cross doc to address the limitations, but there’s a high cost in scheming, testing and implementation. I too have built a to-do manager custom for my team and once we hit ~3,000 rows it wasn’t much fun to use any more. I built an archiver that stores 30 days of data (for reporting) in the functional app and then every night moves the data that is 31+ days old out, so the app is again snappy and fun. It was tedious to develop and test the cross-doc, but definitely worth it and made me better at computer science and coda.
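The nightly archiving idea could be sketched roughly as below. This is plain Python standing in for the logic, not the actual Coda cross-doc setup (which the post does not show); the `Task` fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Task:
    # Hypothetical columns; the real doc's schema is not shown in the thread.
    task_id: int
    title: str
    updated_on: date


def archive_old_rows(live, archive, today, keep_days=30):
    """Split `live` into rows to keep and rows to move to the archive.

    Rows up to `keep_days` days old stay in the working doc; anything
    older (31+ days with the default) is appended to the archive.
    """
    cutoff = today - timedelta(days=keep_days)
    keep = [t for t in live if t.updated_on >= cutoff]
    move = [t for t in live if t.updated_on < cutoff]
    return keep, archive + move


today = date(2020, 5, 17)
live = [
    Task(1, "fresh", today),
    Task(2, "30 days old", today - timedelta(days=30)),
    Task(3, "31 days old", today - timedelta(days=31)),
]
live, archive = archive_old_rows(live, [], today)
print([t.task_id for t in live], [t.task_id for t in archive])
```

Running this once per night keeps the working table bounded by roughly 30 days of activity, whatever the total history grows to.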

So, if you’re creative and patient you can develop workarounds until Coda builds the infrastructure to scale (and probably the accompanying business model) which I am confident they will. Coda has confirmed to me several times that they are interested in allowing Makers to sell/license their docs/apps to third parties which will inherently bring the scaling issues, so it is de facto in the roadmap to eventually address.

My advice would be to work around it for now (it also will make you focus on efficiency and leanness which is a great practice to have anyways) and eventually, when your app is nice and polished and ready for the masses, hopefully Coda will be too!

Paul_Danyliuk · May 17, 2020, 7:48pm

Hey @James_Eades, welcome to the community and thanks for the mention!

I don’t know the whole picture of yours, but I believe there are ways to make Coda work bearably with over 20,000 rows (I have to believe — otherwise my work for my recent client would be a flop)

Here’s what you need to aim for:

Mostly scalar inputs in columns (i.e. numbers, strings, checkboxes, arrays of strings — NOT references or formulas)
No formulas that filter over the same table (directly or indirectly)
No buttons in this table. Build a separate single-row table to host a button and iterate instead
No conditional formatting — use something else to distinguish, e.g. a dedicated Status formula column with emoji
No unnecessary formatted text — flatten Format()s and Concatenate()s with .ToText()

Basically, these large tables should be plain fact tables (i.e. where one row registers one fact, e.g. one transaction of X money from user with ID=Y to user with ID=Z)
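As an analogy for the "plain fact table" shape, here is a minimal sketch in Python. It does not model Coda's calculation engine; the `Transaction` fields and `received_totals` are hypothetical names used only to show scalar-only rows with aggregation done in one pass outside the table, rather than a per-row formula that filters the same table.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # One row registers one fact. Every field is a scalar: users are
    # stored as plain IDs, not as references to rows in another table.
    amount: float
    from_user_id: int
    to_user_id: int


def received_totals(rows):
    """Aggregate in a single pass over the fact table.

    The alternative, a column on every row that re-filters the whole
    table, does work proportional to rows x rows.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row.to_user_id] += row.amount
    return dict(totals)


rows = [
    Transaction(10.0, from_user_id=1, to_user_id=2),
    Transaction(5.0, from_user_id=3, to_user_id=2),
    Transaction(2.5, from_user_id=2, to_user_id=1),
]
print(received_totals(rows))  # {2: 15.0, 1: 2.5}
```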

but yeah, sometimes it could be just better to use something else, like a traditional database, when you expect your app to work at scale

Johg_Ananda · May 17, 2020, 7:56pm

What exactly do you mean?

Paul_Danyliuk · May 17, 2020, 8:20pm

For example, in table Tasks those would be things like:

Finding a previous/next task for each task
Linking to a Project, which looks up all Tasks for that project, then somehow depending on that

There are lots of scenarios where I don’t have logical explanation, but they resulted in doc size blowing up. E.g. I had two tables: Lessons and Syllabi, each Lesson would link to a Syllabus, and each Syllabus would aggregate all Lessons via a simple lookup formula. Yet these two turned out to 1) not cross-doc at all, even with the column hidden, and 2) take an unreasonable amount of data in the doc file. Still not sure why, but the circular dependency caused that somehow (as Coda support told me). This is purely empirical for now; maybe one day I’ll get to the bottom of it. I suspected it was because references to rows from other tables might’ve pulled related data such as references back to this table, which in turn pulled in references to that other table and so on for quite a few levels of nesting, but I checked for that and it doesn’t seem to be the case.

James_Eades · May 17, 2020, 9:51pm

Well, certainly Coda can handle 20,000 records and do it quite well. I referenced 20,000 because that was the number of records I added when I finally crashed my doc… and it crashed pretty hard. I was attempting to use Coda as a simpler tool than a self-hosted Vtiger CRM, so I was looking to store companies, contacts, sales rep notes, quotes, orders, invoices, and a little more. When I saw “unlimited” on the pricing screen I took it to mean… unlimited.

Alas, once my doc crashed after a couple of weeks working on it I then dug under the hood and found your posts about how Coda actually works. I had assumed there was a bit more of a beefy infrastructure to support the claim of unlimited but I was misled like so many others. Unfortunately, what I was working on would have many circular dependencies in data:

Prospects (20k)
Customers (5k)
Contacts - related to customers and prospects (30k)
Quotes/Orders/Invoices - 900 orders might have 4,500 product lines
Products - related to Quotes/Orders/Invoices (15k)
Activities - related to customers (48k)

I am very doubtful Coda will be able to handle this amount of data based on what I am seeing from the community here. Unfortunately a lot of the data is dependent on other parts - orders have customers and products. I could certainly prune certain parts, but then it would no longer be worth switching the company to coda from the slow but working CRM with a MySQL database.

I am fascinated by this though and wish I knew more about the actual structure behind the scenes. I am a bit confused by some decisions and the way certain aspects are implemented - I program for fun and have generally dealt with massive amounts of data so I was really surprised by the limitation. If it is a limitation in the size of a .json I wonder why each page isn’t its own .json or why there is no MongoDB implementation in the background to better handle things.

@Paul_Danyliuk thank you very much for the clarification - I will see what I can do with those tips (some have been implemented already on my rework). I cannot wait to see you launch your site!

Paul_Danyliuk · May 17, 2020, 10:39pm

misled like so many others

Well, nothing is truly unlimited. Conventional databases have some constraints too.

Coda is catering to a wide audience. Only a fraction of it is power users. For many people and the majority of use cases, “unlimited” indeed feels like unlimited.

That’s why I said in some other thread that before any commitment, you MUST fully realize the fact that Coda is a doc. Not an app builder, not a cloud database — a doc. More user-friendly and stricter about data structure than spreadsheets, but still technically in the same category. So if your project is too big for tracking it in a single Excel file, then it’s very likely that it’s also too big for Coda.

The lack of the database behind the scenes must be a design choice. I’m not on the Coda team, so I can only speculate based on my experience in software engineering, and I’m not ready to tell why I think they went this way. But it must have been a conscious design choice at the time. Heck, I even kinda understand why they did the buttons the way they did (i.e. so that each row has its own button instance and not one declaration for the entire column).

Regarding your dataset numbers. Are those all required to be available at any time? Or the majority of it is historical data that can be tucked away / no need to recalculate it constantly? Because, frankly, if you’re dealing with 5k ongoing customers and processing lots of orders, an effort to hire a team of coders and build a bespoke piece of software sounds like a viable option to me. Not meaning to overstep my boundaries — just an outsider’s perspective on pain (value) vs cost to solve.

P.P.S. If I were hired to solve the kind of problem that you have with the volumes, my first instinct would be just that — figure out if we could split the data into the “historical” part (where nothing is linked and everything is scalar like I described above, and daily/monthly summaries for statistics are also stored as scalars in some Snapshots table) and the “working set” where live formulas and references are okay. With the capability to move data between two sets, of course (i.e. archive a customer, then “resurrect” the customer back into the working set when required)
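The working-set / historical split described above might look something like the following sketch. It is an illustration in plain Python under assumed names (`Store`, `archive`, `resurrect`, and the record fields are all hypothetical), and it leaves out the Snapshots table of precomputed summaries.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    # customer_id -> record with "live" structure (lists, links)
    working: dict = field(default_factory=dict)
    # customer_id -> flat record holding only scalar values
    historical: dict = field(default_factory=dict)

    def archive(self, customer_id: int) -> None:
        """Move a customer out of the working set, flattening to scalars."""
        record = self.working.pop(customer_id)
        self.historical[customer_id] = {
            "name": record["name"],
            # A list of linked order IDs becomes one plain text value.
            "order_ids": ",".join(str(i) for i in record["order_ids"]),
        }

    def resurrect(self, customer_id: int) -> None:
        """Bring an archived customer back and rebuild the live structure."""
        flat = self.historical.pop(customer_id)
        self.working[customer_id] = {
            "name": flat["name"],
            "order_ids": [int(i) for i in flat["order_ids"].split(",") if i],
        }


store = Store(working={7: {"name": "Acme", "order_ids": [101, 102]}})
store.archive(7)
print(store.historical[7])   # {'name': 'Acme', 'order_ids': '101,102'}
store.resurrect(7)
print(store.working[7])      # {'name': 'Acme', 'order_ids': [101, 102]}
```

The point of the round trip is that the historical side never holds anything that needs recalculating, while the working set stays small enough for live formulas and references.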

Ed_Liveikis · May 18, 2020, 3:04pm

That’s why I said in some other thread that before any commitment, you MUST fully realize the fact that Coda is a doc. Not an app builder, not a cloud database — a doc.

This was the most helpful piece of advice on how to look at Coda at the moment, Paul. You had given this to me earlier as well.

Instead of looking to replace things like Jira, big CRMs, etc., I now look at it to primarily replace other documents we normally write in things like Google Docs.

And often we do things like very simple todo list or trackers in Excel or a Google Spreadsheet. For small projects it can be much better than all the overhead of a heavy enterprise product.

We’re still using coda for CRM and project tracking, but on a small scale. If our CRM needs grew to enterprise levels, we would likely switch to a product focused around that.

Bill_French · May 18, 2020, 3:45pm

Then you certainly must understand the nature of tradeoffs.

In every product, there are advantages that lie ahead and down the pathway for each design choice. I cannot speak for the Codans, but based on my tests and evaluations, the development team has made some very clear decisions that remain true to the compass-heading of their product vision. Sometimes, these design choices are suboptimal in the true essence of cutting-edge “database technology”. Indeed, I can find a half-dozen suboptimal design choices - don’t get me started on data visualizations.

But these choices have allowed them to carve out a niche that clearly advances the science of “documents”; a realm that needed a little disruption since there have been almost zero advances in the concept of digital documents since the advent of early pioneers Wang, WordPerfect, WordStar, and Microsoft Word.

Data, seamlessly embedded and perhaps piped into documents, filtered, rendered, ordered, and charted is certainly part of that disruptive equation. However, an embedded “database” is not likely central to the spirit of their objective.

My assessment is that Coda is a platform for building document-centric point-solutions, not a platform for building “apps” in the most common sense of the term.

James_Eades · May 18, 2020, 4:05pm

Yes - that is the pill I have now swallowed. Unfortunately the marketing side of Coda does make some more lofty claims regarding capabilities which is what led me to this post. In all honesty I do believe the wording on the pricing page should be changed for legal reasons… Looking through my original desired use-case Coda will not be the right tool, so now I am exploring some other options.

Most of the said data will be moved over to a big-name enterprise system in the future, but that is on hold because of the impacts of COVID-19 - I was looking for a band-aid in the beginning.

I do have a few other (much smaller) uses for the app so I will most-likely continue to use it. If Codans could solve the riddle of doc size I think it would open up a whole new market and I am sure there would be a lot more money to be had with those capabilities. Similar services also have some hard limits on data complexity and size. Would love to see this change with Coda because it really is a very easy to use and well designed tool outside of its limitations. We will see!

Bill_French · May 18, 2020, 4:14pm

Yes, it does and there have been others who have pointed this out. But… all companies make lofty claims; it’s our duty to scratch the surface to peer into the truest capacity of any tech.

Mike_Ray · May 19, 2020, 9:24pm

Glad I read this thread before I spent too much time working on a new “app” idea instead of a “doc”.

I would have likely hit doc limitations since I was hoping to use Coda to replace other “apps” I use.

Any suggestions from the experts in this forum on a similar product(s) other than AirTable that I can use to create replacement “app(s)” [CRM, PM, etc.]?

James_Eades · May 19, 2020, 9:38pm

Everything with a similar feel to coda has a similar limitation. Unless it allows you to work with a traditional database you will probably run into these types of barriers. Even with traditional databases you can start to have performance issues if things aren’t well maintained and your code isn’t efficient.

As @Paul_Danyliuk mentioned it would likely be best to get something custom made for your use case if size limitation is a potential issue. There are some no/low code app building sites out there but I imagine they have their own limitations.

John_Coda · May 20, 2020, 12:30am

Hi Mike, I saw that your line of business is internet marketing. I created this playbook with a few digital agencies who are using Coda for parts (or most) of their business: https://coda.io/@john-scrugham/digital-agency-playbook.

The areas mostly correlate with the apps that they would replace. For example, Clients shows how to bring your CRM into Coda (though many agencies will gravitate towards Hubspot as it’s Free and pre-built).

Breno_Nunes · January 24, 2021, 2:02pm

Hi, @Paul_Danyliuk.
Could you please elaborate on how you can archive and “resurrect” data?
I’m trying to use Cross-doc for that but for me it’s a one-way street, especially with references. We can’t get them back.
In broad terms, what is your schema like?

Paul_Danyliuk · January 24, 2021, 3:02pm

I never link cross-doc items by references, but by Row IDs, Row UIDs, or your internal record IDs instead. Those are simple scalar values (numbers or text), and I re-link back to actual references on the receiving side.

E.g. in this Cross-doc table I’m importing Student UID and Class UID (red arrows; these are read from the Row object). Separately I’m importing a table of Students and a table of Classes. Then in the actual Class and Student columns (blue arrows) I’m doing [IN Students].Filter(UID = thisRow.[Student UID]).First() and the same for the class.

This has multiple benefits:

I’m not hard-dependent on row references, so if I e.g. have to delete and re-add the row and it’s a different row now, I can more easily restore its old ID back (e.g. make a manual override column) and everything will be linked correctly again.
Most importantly, I don’t have to import base tables from the same doc like with row references. Here’s what I mean:
- There’s a doc with Students table.
- There’s a doc with Classes table where the Students table is imported. Classes are linked to Students.
- There’s now a third doc where I want to import Classes along with Students.
With reference linking, I’d have to import BOTH the Classes and the Students tables from the Classes doc. That’s because each time you import a table, it becomes like a new separate table on its own, with its own table and row IDs (and that’s logical because you can add more columns onto it). Now if this had to propagate even further, that’d mean that eventually in some terminal doc you’d have a cross-doc table that’s a cross-doc of a cross-doc of a cross-doc of the original one.

With scalar ID linking though, I can always import tables from their respective source-of-truth docs and link together on the receiving side. And those don’t have to be base tables (like cross-doc references now enforce) but they can be views.
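The re-linking step on the receiving side can be sketched outside Coda as follows. This mirrors what the Filter(...).First() formula above does for each row, using a dictionary lookup; the `enrollments` table name and all field names are hypothetical, since the actual schema is only shown in a screenshot.

```python
def build_index(rows, key="uid"):
    """Map each row's scalar UID to the row itself."""
    return {row[key]: row for row in rows}


def relink(enrollments, students, classes):
    """Turn scalar UIDs back into references on the receiving side.

    `students` and `classes` are imported independently from their own
    source-of-truth docs; only the UIDs travel with the linking table.
    A UID with no match yields None, like a lookup that finds nothing.
    """
    students_by_uid = build_index(students)
    classes_by_uid = build_index(classes)
    return [
        {
            **row,
            "student": students_by_uid.get(row["student_uid"]),
            "class": classes_by_uid.get(row["class_uid"]),
        }
        for row in enrollments
    ]


students = [{"uid": "s-1", "name": "Ann"}, {"uid": "s-2", "name": "Bob"}]
classes = [{"uid": "c-1", "title": "Algebra"}]
enrollments = [
    {"student_uid": "s-2", "class_uid": "c-1"},
    {"student_uid": "s-9", "class_uid": "c-1"},  # student row was deleted
]
for row in relink(enrollments, students, classes):
    print(row["student"], row["class"]["title"])
```

Because the link is just a text value, a deleted and re-added student only needs its old UID restored for every dependent row to resolve again.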

Breno_Nunes · January 24, 2021, 8:25pm

Wow. This is really cool. Thank you
I’m trying to replicate your schema here. I hope I’m getting it right.

Danila_Sentyabov · May 5, 2021, 8:24am

Can anyone working with large-ish datasets in Coda share their current experiences? Coda boasts performance improvements in their emails, and I wonder how this translates to these kinds of workflows.

Mario · May 5, 2021, 9:18am

@Danila_Sentyabov If your dataset takes more than 5 seconds to be copied from Excel to Coda, it’s not a db to use in Coda. From what I’ve seen so far, anything over 5,000 rows becomes just unusable compared to a smaller db.
Some say rows are not the critical limit, but I disagree; even with 10 columns the doc becomes unusable…

Of course I’m using those in Coda anyway, but I would not recommend anyone work under those conditions…
For now Coda is fine with a small db that is not growing.

By small I mean mini: what we consider a small db in Excel becomes huge in Coda, more or less all of the time. I’ve stopped trying to count rows and columns because it’s really not that important: if you apply grouping it’s going to become 5x slower, if you filter it, 3x slower, all of this while pressing the “wait…” button in Chrome while the doc is not responding…

So my answer as of April 2021 is no: Coda is not designed to manage a normal dataset, and if a small one is used there is not a lot of room to grow it.

For anyone who thinks differently, I have some simple dbs that you can have fun trying to work on; I can send you pure frustration in a .csv.

P.S. Maybe if the performance improves 50x (not 0.3% faster, a lovely result but not a game changer) we will be able to use those, but this would mean a complete rewrite of how Coda works, so it’s probably easier to build another one…

Danila_Sentyabov · May 5, 2021, 10:41am

Thanks! That’s what I thought.
Coda is very promising, and is still immensely useful as a prototyping tool, and for publishing docs with a little bit of interactivity, but not as even a half-serious process management solution.
