Performance when most rows are filtered out?

I’ve read several topics about how much data a doc supports before it stops being practical. Are we talking about the data currently in view, or does it count every row across the whole document?

I will tell you my concern:

I am building a management application on top of a Coda doc. I first considered spreading it across different docs, but that seemed impractical back then… everything is linked.

So I have a huge fear about scalability.

I am tracking processes, activities, execution logs (system records such as results and KPIs), and other business-related items such as relevant organizations (customers, partners…), and so on. Beyond that, I have modelled yet another dimension to link each process and its results to its own management system, since our service operates our customers’ own systems.

So, as you can imagine, with documents and execution records alone, tables can grow FAST.

At first I modelled every piece of information related to processes in several tables, and I had to hack them into a joint table so I could easily relate them to the corresponding process. Then I thought, “why not model one big, sparse table containing every single document and record?”

Enter my huge fear: one way or another, some tables will soon hold lots and lots of records. Some of them I can break into smaller tables; others I can’t (such as communications).

I could set up some really complicated way to archive them, and somehow hack Coda to show them in a more or less “composed” document.

But… I have read that the scalability limits relate to the number of rows, and I am wondering: all rows in the document, or just the ones loaded in the current view?

Because the problem would be significantly reduced if only the loaded rows count towards the first practical limit, since I can easily keep the source tables out of sight and force filters on everyone’s views.

It’s a fear I share too.
I have broken out some tables into different documents and relied on Cross-doc sync (which has problems of its own).

The scalability issue is a massive concern when conceptual designs turn into reality and a table of 100 items rapidly grows into tens or hundreds of thousands of entries, with potentially hundreds of users.

Right now, I’m simply praying that Coda can scale at a rate faster than my company’s use of it, but already I’m starting to look at alternative options.

I love a LOT about the way Coda works; it’s a wonderful product, but… this concern won’t go away.

Dear @Juan_Manuel_Perez_Lostao and @David_Clegg,

I would like to use an analogy:

When you go out on a trip with your family, you can use a family car. :oncoming_automobile:
When you go out on a trip with the football team, you need at least a bus :bus: for everybody to join the party.

As @Paul_Danyliuk clearly stated, “everything has its limitations”, and he has proven that there are many ways to deal with them to a certain extent.

I am sure that scalability is high on Coda’s agenda, and of course it’s always good to voice your concerns :raised_hands:

The whole point of expressing concern is to raise awareness; otherwise a product team might happily believe it’s meeting customer needs whilst in reality it’s missing out on a much larger customer base.

Whilst you say it’s a ‘doc’, you’re suggesting the name automatically implies limitations. Why should it? There are some complex documents in this world. It feels a little unfair to use the term ‘doc’ to set a lower level of expectation.

One of Coda’s key selling points is the move away from spreadsheets - and it offers a very enticing proposition. However, even a relatively primitive spreadsheet can deal with hundreds of thousands of rows.

Yes, everything has limitations, but that should not stop us from understanding why such limitations exist and questioning whether they can be removed, mitigated, or reduced.

We do see continuous progress on the evolution of Coda, but that does not mean key decision makers in companies don’t have worries. Scalability is a worry, reliability is a worry, performance is a worry. I think it’s right to discuss those concerns.

It genuinely isn’t easy to sell Coda (or Airtable, or Ninox) or similar technologies to companies who feel safe with Excel and who don’t yet appreciate the advantages of Coda (there’s a few disadvantages too!). They’re looking for assurances when moving away from a tried and tested technology, however antiquated it might be!

I’ve always considered Coda as somewhere between a spreadsheet, a word processor and a database, and whilst it’s not the best at any of those things, it’s one of the very best at being a low-cost, easy-to-work-with hybrid of them all.

If people believe a 10,000+ row table is beyond the capabilities of Coda, then so be it, but I think that should come from Coda as ‘not a typical use case for our target audience’.
If Cross-doc sync between 20 documents is considered excessive, then it would be nice to know.

For instance, it’s really not unusual for a company to have a client base of 10,000 individuals. That company could easily have a sales team, an accounts team, a service department, etc., all wishing to share the client details and thus share access (via Cross-doc) to link their own data to those 10,000 different clients.
But what happens when that number reaches 11,000 and Cross-doc sync only works with 10,000? What then?

These are genuine questions that are already cropping up, and will continue to do so.
We are all advocates for Coda, and our use cases are quite varied, but it’s good to discover some of the outlying use cases, or more complex relationships that businesses are trying to address with Coda.


Thanks for your points. Coda has an awesome community, that’s for sure, and @Paul_Danyliuk always offers great insights on Coda.

@Jean_Pierre_Traets I am aware of those answers and, of course, of the usual technical limits and scalability costs of most systems.

We are a company of 6 people; we fit in the family car :smile: No, really, I understand your metaphor. I am not trying to store historical data; I am aware I will need to archive it somewhere else. I just want to track information related to ongoing activity: projects, services, and processes.

Given that point, and having already read about the limits, it’s my educated guess that:
- Since calculations are performed client-side (except for automations and Packs), the practical limits discussed in other threads relate to the processing power and memory the browser needs to handle the doc on the client side.
- Having experienced performance issues in document sections depending on their complexity and number of tables, my guess is that not every piece of the document’s information is handled at any given moment, only part of it (I am not sure about this point: I experienced it mainly at design time, so it could relate only to formula and document building rather than to document data as well).

Hence, if my assumptions are correct and data not used in the current view doesn’t severely impact performance, I know I can handle things so that any performance issues that arise can be solved by archiving old data to another document.

For example, I know I could perfectly handle the required amount of information in Excel, which is also a document, right? I just don’t know about Coda.

@David_Clegg If my understanding of how Coda works is right, Cross-doc won’t help by itself, as you are basically copying your data between docs. The ‘view’ of your doc 1 table inside doc 2 is essentially a copy of the doc 1 table kept in doc 2, so it doesn’t reduce storage consumption as long as there is at least one document that needs all the data spread across the other documents to be available.

In the end, David, I also took a leap of faith, but building a doc with a truly well-designed UX has forced me to go beyond my originally intended level of complexity. I now have a bunch of auxiliary tables and views devoted just to building some ‘modal’ views and alternate templates.

@David_Clegg By the way, my probable approach to archiving data, rather than just cross-doc-ing everything, will be as follows (with a rough sketch of how it could be scripted after the list):

- Since part of the information is related to projects and customer services, I will make a consultation doc that I will clone for every project or service. It will have one part for consulting active data from the main document (via Cross-doc), and another for historical data, which will be archived into that document and deleted from the main one.

- Every year, archive the main document and clone it to create a fresh copy, deleting data that is no longer required (similarly to how financial software works), so non-current data unrelated to a specific service or project can also be archived.

This way, if you want to consult old data from a company-wide point of view, you go to the document for the desired year; if you want data specific to a project or service, you go to that project’s document.
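For what it’s worth, here is a very rough sketch of how that “archive to another doc and delete from the main one” step could be scripted from outside Coda using the Coda REST API, instead of doing it by hand. Everything here is an assumption on my side: the doc and table identifiers and the “Completed on” column are made-up placeholders, and it assumes plain values can simply be copied over (relation/lookup columns would need extra mapping).

```python
# Rough sketch: move old rows from the live doc into an archive doc via the Coda REST API.
# All doc/table/column identifiers are placeholders -- adjust them to your own docs.
import requests

API = "https://coda.io/apis/v1"
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}  # personal token from your Coda account settings

MAIN_DOC = "docMAIN123"        # placeholder: the "live" management doc
ARCHIVE_DOC = "docARCHIVE2020" # placeholder: the yearly / per-project archive doc
TABLE = "grid-ExecutionLogs"   # placeholder: table ID (or name), assumed identical in both docs
CUTOFF = "2020-01-01"          # archive rows completed before this date


def list_rows(doc_id, table):
    """Page through every row of a table, returning (row_id, values) pairs."""
    rows, url = [], f"{API}/docs/{doc_id}/tables/{table}/rows"
    params = {"useColumnNames": "true", "limit": 100}
    while url:
        resp = requests.get(url, headers=HEADERS, params=params).json()
        rows += [(r["id"], r["values"]) for r in resp.get("items", [])]
        url, params = resp.get("nextPageLink"), None  # the next-page link already carries its token
    return rows


def copy_rows(doc_id, table, values_list):
    """Insert the given rows into the archive doc's table (plain values only)."""
    payload = {"rows": [
        {"cells": [{"column": col, "value": val} for col, val in values.items()]}
        for values in values_list
    ]}
    requests.post(f"{API}/docs/{doc_id}/tables/{table}/rows", headers=HEADERS, json=payload)


def archive_old_rows():
    # "Completed on" is a hypothetical date column; API date values are ISO-like strings,
    # so a simple string comparison against the cutoff is good enough for a sketch.
    old = [(rid, vals) for rid, vals in list_rows(MAIN_DOC, TABLE)
           if vals.get("Completed on") and str(vals["Completed on"]) < CUTOFF]
    if not old:
        return
    copy_rows(ARCHIVE_DOC, TABLE, [vals for _, vals in old])
    for rid, _ in old:  # delete from the live doc only after the copy has been sent
        requests.delete(f"{API}/docs/{MAIN_DOC}/tables/{TABLE}/rows/{rid}", headers=HEADERS)


archive_old_rows()
```

Run something like this once a year (or per project) and the live doc only ever keeps the active rows.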

Unclear scenarios:
- What if the project/service document grows too big as well? -> Should I also clone a blank new version of it?
- What if I want to see trends or run analytics involving more than one document? -> Should I build an analytics document extracting precalculated data (e.g., averages and the like) from the archived documents? (Rough sketch of that idea below.)
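On that second question, one purely hypothetical direction: rather than syncing every archived row into an analytics doc, a small script could compute the aggregates outside Coda and push only one precalculated summary row per archived year. Again, every doc, table and column name below is an invented placeholder.

```python
# Hedged sketch: pull a numeric column from each archived doc, compute the average,
# and upsert one summary row per year into a small analytics doc via the Coda REST API.
import statistics
import requests

API = "https://coda.io/apis/v1"
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

ARCHIVE_DOCS = {"2019": "docARCHIVE2019", "2020": "docARCHIVE2020"}  # placeholder: one archived doc per year
SOURCE_TABLE = "grid-ExecutionLogs"    # placeholder: raw execution records in each archive
ANALYTICS_DOC = "docANALYTICS"         # placeholder: the small "analytics" doc
SUMMARY_TABLE = "grid-YearlySummary"   # placeholder: table of precalculated KPIs


def fetch_numeric_column(doc_id, table, column):
    """Collect one numeric column from every row of a table (handles paging)."""
    values, url = [], f"{API}/docs/{doc_id}/tables/{table}/rows"
    params = {"useColumnNames": "true", "limit": 100}
    while url:
        resp = requests.get(url, headers=HEADERS, params=params).json()
        values += [row["values"].get(column) for row in resp.get("items", [])]
        url, params = resp.get("nextPageLink"), None
    return [v for v in values if isinstance(v, (int, float))]


def push_summary(year, average, count):
    """Upsert one summary row per year, keyed on the 'Year' column."""
    payload = {
        "rows": [{"cells": [
            {"column": "Year", "value": year},
            {"column": "Average lead time", "value": average},  # hypothetical KPI column
            {"column": "Records", "value": count},
        ]}],
        "keyColumns": ["Year"],  # re-running the script updates rows instead of duplicating them
    }
    requests.post(f"{API}/docs/{ANALYTICS_DOC}/tables/{SUMMARY_TABLE}/rows", headers=HEADERS, json=payload)


for year, doc_id in ARCHIVE_DOCS.items():
    lead_times = fetch_numeric_column(doc_id, SOURCE_TABLE, "Lead time (days)")  # hypothetical column
    if lead_times:
        push_summary(year, statistics.mean(lead_times), len(lead_times))
```

That way the analytics doc stays tiny no matter how large the archived documents grow.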