Major Scaling Issues with Large Data Set

Richard_Kaplan · May 21, 2018, 2:02am

I was patient through numerous prompts by Chrome asking if I wanted to “Wait” due to an unresponsive site… and ultimately I was able to upload two large tables of data from Google Sheets, one about 9,900 records in size (3.9 MB) and the other about 32,000 records in size (12.6 MB).

I discovered that the waiting time for a filter formula search on the larger table was just barely on the border of workable, though there may be some background indexing as it improved as the day went on.

But most notably, the ability to insert a row hyperlink anywhere on the canvas via the “@” feature has slowed so much that it is unusable throughout the document. I presume this is more affected than a filter formula search because it searches through all tables in a document.

I do realize these are pretty large tables. But that said, coda.io is of course a database! Files of this size are not an issue for Excel or Google Sheets nor for Airtable (for paid customers).

Is the plan for coda.io to be able to handle tables of this size, or is the intent for the “database” feature to be for limited personal databases only?

If it can support larger databases, the usefulness goes up quite considerably.

******* Update – as of Monday morning all of the searches in the document are effectively unusable - even after waiting 5-10 minutes they time out or do not respond. Perhaps performance varies with server load. In any event, at present Coda simply does not scale to any workable degree with moderate “database” sized tables.

I realize that in the live release, there may need to be a paid tier for larger database/table sizes - that is reasonable. But the system really will not be usable, even for free, without some way to handle larger tables. What a disappointment to have such a capable database that becomes unusable when “too much” data is added.

Philipp_Alexander_Asbrand-Eickhoff · May 21, 2018, 1:39pm

It would be really interesting to get some sort of time estimate for when the processing power will be increased, or a general rule of thumb for what is and what isn’t currently possible.

Otherwise people will keep putting work into their docs only to find out later that they are unusable.

Xinyu_Zhou · June 7, 2018, 3:19am

+1
I tend to use coda.io as an “app” for managing my data, which grows every day: say, 100 rows a day at 512 bytes per row, which translates to about 50 KB of data a day. It would only take about 80 days to catch up with the 4 MB table of @Richard_Kaplan.
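As a quick check of the arithmetic above, here is a minimal sketch. The row counts and sizes are the ones quoted in the post; reading “4 MB” as 4 × 1024 × 1024 bytes is an assumption, and the function names are made up for illustration.

```python
# Growth estimate using the figures quoted in the post above.
ROWS_PER_DAY = 100
BYTES_PER_ROW = 512
# Assumption: "4 MB" is read as 4 * 1024 * 1024 bytes.
TARGET_TABLE_BYTES = 4 * 1024 * 1024


def daily_growth_bytes(rows_per_day: int, bytes_per_row: int) -> int:
    """Bytes added to the table per day."""
    return rows_per_day * bytes_per_row


def days_to_reach(target_bytes: int, rows_per_day: int, bytes_per_row: int) -> int:
    """Whole days needed before the table reaches target_bytes."""
    per_day = daily_growth_bytes(rows_per_day, bytes_per_row)
    return -(-target_bytes // per_day)  # ceiling division


if __name__ == "__main__":
    per_day = daily_growth_bytes(ROWS_PER_DAY, BYTES_PER_ROW)
    print(f"{per_day / 1024:.0f} KB per day")
    print(days_to_reach(TARGET_TABLE_BYTES, ROWS_PER_DAY, BYTES_PER_ROW), "days")
```

Under these assumptions the table passes the 4 MB mark in a little under three months, which matches the rough 80-day figure.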

Henry_Martin · June 8, 2018, 5:19am

Any news on this, @Richard_Kaplan?

Richard_Kaplan · June 8, 2018, 2:58pm

At the recent Webinar, Maria said they hope to address this shortly.

As an interim workaround, I realized that you can embed a Google Sheet into a Coda document and it updates in real-time - that is something I do not think is even possible to do with Google Docs! Plus you can easily copy/paste data between that embedded Google Sheet and a Coda table and vice versa. It’s not a perfect solution from a design perspective, though it does give some added data manipulation and Add-on feature capability that Google Sheets has but Coda does not.

So I am using that for the moment and awaiting further word from Coda. Maria said there is a team within Coda specifically working on these performance issues.

Matthew_Oates · September 17, 2018, 12:22pm

Any further updates or improvements on this? I ask because I’m hesitant to start seriously testing Coda as an alternative if it ultimately doesn’t scale to business volume requirements.

Richard_Kaplan · September 18, 2018, 12:30pm

The API has similar scaling issues as the main app.

I have concluded that at least for now, Coda is a great tool for small projects and may be useful for medium to large projects by embedding a database viewer into a Coda page. But using Coda as a primary means of data storage for other than small projects does not seem to be feasible.

Matthew_Oates · September 18, 2018, 12:52pm

Interesting. Maria has indicated they will be focusing heavily on performance improvements starting in October. I’m okay with that, assuming there are actual improvements in 2019. Have they demonstrated a track record of delivering on promises thus far?

Matt

Richard_Kaplan · September 18, 2018, 9:00pm

They seem to be terrific at delivering on promises within their overall product goals - which is of course understandable. It seems harder to get a clear answer as to what the long-term roadmap is and what goals are particularly important.

I think an equally large issue at present is the inability to share documents publicly without the need for a login (for read-only content). Coda appears to be emphasizing the market seeking an intra-company communication / collaboration tool but not the potential market of publishing / communicating to the public at large.

Matthew_Oates · September 18, 2018, 9:35pm

Yep, that could be a major issue for us as a professional services organization that needs to interact with clients. I wonder what market segment / use case they’re focused on? That would tell us a lot about their potential goals and priorities. I’m also flabbergasted that they don’t allow attachments in cells. Wow.

While I’ve got your attention, I thought maybe I could run another issue by you. I’ve currently created a document that has about 15 tables in it. For most of the tables I’m using a formula (not a Lookup) in a Select List dropdown to assign a person from a centralized Staff table. To do this, I’m using a formula for the “selectable options” that is, basically, =[Staff].Name

Here’s the issue: for some reason it stopped working after about the 10th table I used it in. On additional tables (within the same document), it absolutely will not recognize the Staff table any more as a defined entity / object. The formula section only wants me to select fields within the same table; it won’t identify other tables at all. Is there some kind of internal limit on how many times you can use a formula on Select List fields in a single document? Just curious if you have run into this.

Cheers,

Matt

Richard_Kaplan · September 19, 2018, 5:48pm

That’s an odd issue on the 10th or subsequent table.

Might you have used non-unique names and that might be causing the issue? Puzzling

Matthew_Oates · September 19, 2018, 6:09pm

Ah, I figured it out. I happened to have an icon in the name of a table, so a formula wasn’t finding it. I continue to have challenges with columns that have the same name (e.g. “Client”) in different tables. Formulas never seem to find the right field and I’m not quite sure how to force it.

Matt

Jean_Pierre_Traets · September 20, 2018, 8:30am

Dear @Matthew_Oates,

The best approach is to give the columns distinct names, like “Client”, ClientID, Client#.
It is also recommended to use the table name in your formula, as in the sample below, where Movements is the name of the table and [Move to] is the column I am putting the filter on.

if(Movements.filter([Move to].totext()=thisRow.To.ToText() && Style.ToText()=thisRow.[Item send].ToText() AND In_Out.totext() = "Out" AND Date.Matches(daterangepicker1)).count()>0,Movements.filter([Move to].totext()=thisRow.To.totext() && Style.totext()=thisRow.[Item send].totext() AND In_Out.totext() = "Out" AND Date.Matches(daterangepicker1)).QTY.Sum(),0)

Credit for this goes to @Al_Chen_Coda and @Daniel_Stieber. On Skillshare you will find 3+ hours of training material on Coda created by Al. I can really recommend it.

Enjoy Coda,
//JP

Shannon_Massingill · September 20, 2018, 5:53pm

Use [Table Name].[Column Name] in your formulas

Carlo_De_Pascalis · November 15, 2018, 10:27am

I have a table with 386 rows and 37 columns (about 14,000 cells). Scrolling horizontally is perfectly smooth, but scrolling through it vertically feels super laggy. As soon as I apply a filter that reduces the displayed rows to about 30, it feels somewhat smooth again.

Eduardo_Cavazos · April 11, 2020, 1:48pm

I tried to import the American Standard Version of the Bible as a 4.5 MB CSV file and I ran into the same issue: many ‘Page unresponsive’ prompts in Chrome. It eventually did import; however, the resulting doc was pretty unresponsive with certain operations (creating views on the table). Ultimately, the doc didn’t appear to save properly.

Coda was however able to handle the entire book of Genesis.

The performance with that document is quite acceptable, even with things like grouped views.

Pretty impressive!

Eduardo_Cavazos · April 12, 2020, 8:00am

It turns out you don’t have to hit the ‘Wait’ button on these ‘Page unresponsive’ prompts. So if you decide to import a large CSV like this one, feel free to just let it spin; it will likely finish. I was able to successfully import the entire ASV Bible mentioned above without issue.

That said, rendering and interacting with the resulting table is probably not quite performant enough to be practical.

Eduardo_Cavazos · April 13, 2020, 6:40pm

I was curious about this claim that Airtable can handle imports of this size. So I attempted to import the ASV Bible mentioned above into Airtable. The claims are true… The import was fast, maybe a minute or so. The resulting table performance is excellent. Link to the table with the full text of the Bible: link.

So this gives me hope that the Coda team will eventually be able to get similar performance for tables. (I generally prefer Coda).

Sam_Harrison · May 26, 2020, 8:12am

I’m struggling with this issue too. I’ve got a 2.5 MB file, ~21,000 rows x 21 columns. I tried importing it as a CSV file and, as @Richard_Kaplan suggested, I just waited and waited and eventually (~20 mins) it imported. But it was super sluggish, then froze after a few minutes, and when I reloaded, the data had disappeared.

Has anyone had any success with large(ish) databases like this? My next approach will be to try and import via the API, but I suspect this will be futile if it renders the doc unusable.
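For anyone trying the API route, a minimal sketch of how the CSV could be split into smaller batches before uploading. It makes no network calls. The payload shape (`rows` / `cells` / `column` / `value`) is an assumption modelled on the Coda API's insert-rows request body and should be checked against the current API reference. `BATCH_SIZE` and all function names are hypothetical, not documented limits or helpers.

```python
import csv
import io
from typing import Dict, Iterator, List

# Hypothetical batch size; not a documented Coda limit.
BATCH_SIZE = 500


def read_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def batches(rows: List[Dict[str, str]], size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def to_payload(batch: List[Dict[str, str]]) -> dict:
    """Build one request body per batch (assumed shape, see note above)."""
    return {
        "rows": [
            {"cells": [{"column": name, "value": value} for name, value in row.items()]}
            for row in batch
        ]
    }


if __name__ == "__main__":
    sample = "Book,Chapter,Verse\nGenesis,1,1\nGenesis,1,2\nGenesis,1,3\n"
    payloads = [to_payload(b) for b in batches(read_rows(sample), size=2)]
    print(len(payloads), "payloads")
```

Each payload would then be sent as a separate request, with a pause between requests to stay under rate limits. Batching only helps with getting the data in; it would not by itself fix a doc that is sluggish to render once the rows are there.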
