Tell me about your big slow docs

I’m exploring building a doc + pack to solve the problem of large, slow docs.

Can you tell me about your slow docs and your big docs?

I’ll start:
I manage a doc that runs payroll. It can generate 1000 rows per pay period, which quickly gets beyond the performance that Coda can handle.

This means that:

  1. We have to wait a long time for it to finish “Calculating…”
  2. If there is a bug, then it is slow to track down and fix because the doc isn’t responsive
  3. Complex tables that reference each other

I reduced the problem by archiving rows I no longer needed.

1 Like

Doc that manages event scheduling (crew, equipment, etc). Bogged down in particular by overlapping date calculations (which scale exponentially with row count).

Workaround:

  • Automation to prune historical and orphaned records (would kinda rather keep them for posterity, but need performance. not sad about orphan record pruning though)
  • Restrict date collision formula to only care about dates that are within a month of Today()
2 Likes

@Federico.Stefanato @Paul_Danyliuk @Xyzor_Max @Nick_HE @Nina_Kastenauer1 @Scott_Collier-Weir @Jono_Bouwmeester @Christiaan_Huizer

@Ryan_Martens2 @Johg_Ananda @Daniel_Stieber @Bill_French @joost_mineur

This is super interesting. Is there a minimum example doc you could share of the exponential date calculations problem?

insurance broker client using a big doc to track claims processing

based on an old excel spreadsheet with vba macros

niavely migrated, grew quickly in volume but also had a great many formulas that were not effecient

filters, sumif, formulamaps that did addrow and addormodifyrows all over the place

solution
better data schemas - 3rd normal form
use views instead of crazy filters and sumif stuff
but biggest improvements
move column formulas into buttons
so only executed when needed
and biggest of all
create a single document per claim
(locality of reference and all that)
then use crossdoc to accumulate a summary of each claim into a big fat (but nonactive) archive database

so LOTS of smaller docs with tons of processing
and a huge archive database that just sits there

now our problem is keeping track of all those documents, they are legal/audit essential

so building a master document tracker using the coda docs pack to index and retrieve the docs when needed

so our motto is
keep it small
keep it simple
and keep lots of them

max

3 Likes

Hey Connor,

We track activities for our members. Some members have been around for 20 years, some for just a year, or anything between these 2 numbers. Activities are tracked with time, date and location and average about 50 activities per member per year. Most years, there are about 150 active members.
We keep these logs ‘forever’ and we produce user reports for each member, listing their activities from up 20 years ago.
We also list annual totals for each member, as well as for the club, for the location and for the type of activity.
This is good size table (sic) and the date has been saved in a (Pervasive Btrieve) database, which can handle this with ease. For the last two years we have been collecting new data in Coda, which works pretty well so far, but we’ll run into the limits of what is manageable pretty soon. A native DB connection in Coda would be nice. To be honest, I think this amount of data should not reside in Coda at this point in time, but writing proper apps/programs the traditional way is to much work (for me). The last 2 years I delivered more software features in Coda than in the 18 years before in a traditional IDE environment, so I really hope Coda will keep up with my data requirements. In the foreseeable future I would need for my tables to host about 50K rows and preferably a couple of tables of similar size alongside each other. For the time being I can resort to cross docs, but I would like to do this native in one doc.

So, for sure looking forward to your large-doc solutions!

Greetings, Joost

2 Likes

Basically just imagine a table of events with a start date and end date. I need to know which events overlap with each event. So each row has an “overlapping events” column with this formula (per book 2 of the gospel of Paul)

EventsTable.Filter(
    currentValue != thisRow AND
    thisRow.[Start Date].IsNotBlank() AND
    (
      CurrentValue.[End Date] >= thisRow.[Start Date] AND
      CurrentValue.[Start Date] <= thisRow.[End Date]
    )
)

Optimally-performant as that may be, each row still triggers a recalculation of every other row, and each of those calculations gets slightly harder each time.

2 Likes

I also have a payroll doc. Right now, it just takes a few minutes to calculate monthly payroll because it’s not super huge yet (a few thousand rows). But we are growing fast so I’m worried about how it will perform when our staff team doubles or quadruples and our transactions build up over time.

I haven’t taken the time yet to analyze or debug the calculation time, but I have a feeling the primary culprit is actually a RunActions() button that triggers Coda to update expensive column formulas in between each action. So I might have to move some column formulas to buttons, or optimize those calculations somehow.

2 Likes

Lots of rows to review insurance claims for our dental practices. These aren’t kept for posterity, just for taking actions and then the tables get erased and repopulated. I manage performance by limiting the number of records that get inserted via an API call.

All of our patient data is in another software with a MySQL/MariaDB database … i use coda to present the information we need more efficiently and create buttons to take actions.

When I get a list of 1000 claims it definitely gets very slow. I’ve implemented some date and clinic pickers so data can be imported with better precision.

I’m also working on some docs for bookkeeping where I import lots of transactions from QuickBooks Online We then manipulated data quickly in coda before updating via api calls back to QuickBooks ( QuickBooksOnline is just so cumbersome and slow to try to do efficient changes). These writebacks get a special code phrase attached. To limit issues we have buttons that wipe data and import fresh, ignoring records that have been marked with the code phrases that let us know we don’t need to review them again.

So mainly we’ve resorted to using it for data manipulation and presentation, not archival storage.

3 Likes

Column formulas into buttons is a great move. I’d almost like to see some sort of option to natively do that. E.g. a formula that can be set to “calculate when needed.” Lots of calculations would be faster.

The single doc per X is a pattern I’ve used too. The only problem that it has is it’s difficult to later centralize all the data if you don’t first plan for that. I’ve talked to the Coda team about this, I proposed a “spoke and hub” solution, where the user creates a doc for a certain task, and then later once that doc is being replicated they can map the columns of their local doc to those of some archive doc, that way all the data generated by a process can still be looked at in aggregate, but the user doesn’t have to plan around it ahead of time.

2 Likes

Yeah I feel very similar. Even if Coda improves performance to the point that I can host 50k - 100k rows, I’m noticing that I would feel more comfortable knowing that there is a backup of my data somewhere uncorrelated with their systems.

1 Like

Wow that’s pretty fascinating you use it exclusively for data display.

How do you connect with QuickBooks and with MariaDB? Have you created a custom Pack?

nope id love to work on a custom pack but im still waiting for access to it:)

I use Integromat extensively and it works well. the Integromat button on coda pushes a webhook that runs a scenario that runs a query (either QBO or mysql, depending on scenarios) and then i map that to a text aggregator and then the Coda API module to bulk paste into coda. works quite well for that. just wish coda was a bit quicker overall but its still better than doing batch tasks in QBO, and it saves tons of clicks vs tasks in our dental practice management software

3 Likes

Sometimes, I still need to copy and paste csv data into the Coda and it will take forever to load about 5000 rows of data from SAP (copy table) and paste.

@Irfan_Khan interesting use case you have here. Would you mind pointing me into right direction how do you accomplish this?

1 Like

This is so cool. . .

2 Likes

@Hendrik_TnB I’m out of town this weekend but I’ll try to remember to do it Monday evening. Which part/use case in particular are you asking about ?

@Hendrik_TnB to get you in the direction tho:

For example I’ll use QuickBooks online. I want rows of transactions imported. I have a table created in Coda and I have a button to import data.

The button just call a webhook (hyperlink) to Integromat (now called Make, btw). When that webhook is called, and automation scenario runs on Integromat where a query is run to get data from QuickBooks.

That data is parsed and manipulated to JSON , and then an API call is made to coda where that data is pasted into my Coda table. It’s pasted in bulk, NOT line by line. Line by line would be very slow, and Coda has rate limits of 50/min so that would take almost 2 hours for 5000 rows, and cost you lots of extra money in Integromat. By using batch pasting you can push hundreds of rows in a single operation.

The API call runs on Coda servers and can be a bit slow, but we are talking seconds not minutes for hundreds of rows. It may take another 20sec to show in Coda after completion.

I’m not sure about 5000 rows. You may need pagination or split your chunks in Integromat. It’s doable but I haven’t had to do that yet. I’ll try to get a mini tutorial screenshared if it’s something people want to see more about.

Irfan

2 Likes

Yes please! Or make a tutorial doc and share. Would love to see