Polymorphism would be nice: a way to join tables with common columns under a parent or generalized table

The desired functionality is to somehow have an umbrella table that would join together (in the SQL sense, i.e. to merge the rows of multiple tables on some common columns) multiple otherwise unrelated tables. The intention being to:

  1. Aggregate all rows from multiple tables into a view that presents the row values for the common set of columns.
  2. Considering this view of the rows as a more abstract or generalized table, to reference this table within formulas and lookups.

This has been previously discussed here:

wherein @Federico_Stefanato proposes to extend an initial parent table with subtables “that only define the differences needed to be tracked.” (see the “Proposed approaches” section below for a review of this approach).

Proposed approach is quite similar, but is more bottom-up and data-centric. Rather than create tables to be extended (which implicitly creates the concepts of table type and inheritance), I would create a table that joins together the rows of other specified tables. This more abstract parent table would have only the common columns among all of the specified or “child” tables, and would act as a view of the rows of the specified tables projected on those common columns. You could change the set of common columns by changing the columns on one of the child tables, or adding a new column to each child table.

The table itself would be read-only; adding new rows directly to this parent table, modifying columns types and settings, and modifying existing rows would all be impossible.

Under the hood, this would simply be a unidirectionally synced join of the specified tables on the common columns (would need to determine the intersection of the columns and column types of all specified tables).

The UX would be to leverage the same table creation menu, but simply allow selecting multiple existing tables (with the option to change this selection later on). This would create a new multi-view (or some other catchy name) of those specified tables, that can be referenced in formulas and behaves like any other table (with the exception of being read-only in all contexts.)

Use cases

There is a good use case explained in the discussion linked previously.

Mine specifically was to consider rows in multiple unrelated tables as events. For example, I had a Meetings, Classes, and Exams tables. All of them had a Start date time, an End date time, a Name, as they are each collections of types of events with their own unique concerns and additional columns. I wanted to render these tables in a unified calendar.

Proposed approaches

Federico’s approach proposes a more top-down way (similar to many OOP programming languages out there) of doing things, where a general base or parent table (type) is declared and then more specific sub tables (types) are declared that extend it. IMO, I find it unnecessary to pre-emptively declare the extension relationship between tables at creation time. What happens if you want to make an existing table extend another one? (even if you add an alternative UI to accomplish this, the semantics of making an existing table extend another is unclear to non-programmers.) I also find it goes against the design flows in Coda, where data is first created then processed (for example, having to create a table type for a lookup column and then create the table to be looked up would also feel weird). Ultimately, both approaches are essentially two sides of the same coin, and differ mostly in how they are framed upon creating the relationships.

Complications

Column types of common columns in the parent table

This is likely the biggest issue as there is no clear decision that intuitively follows from the design. I think naively columns should only be considered common if they have the same name and exactly the same type in all specified child tables. There could be some leeway for example in different types of dates or numbers being coerced to the most general type.

There is also the concern that columns on the parent table cannot be directly edited (adding a formula or changing a setting) as that would mean that there would be columns that are imposed onto the child tables (would need to be only editable on the parent table). IMO this is also too complex so I would not include it.

Modifications to the parent table

It should be impossible to add new rows directly to the parent table, as this introduces more complexity (the parent table becomes a concrete type that can have its own rows, as opposed to only show rows from the specified child tables, which comes with a whole new set of decisions). Also it can easily be accomplished by creating another table, with the same shape as the parent table (same columns and column types) and then simply making it another specified child of the parent table.

It might be possible to add columns to the parent table, that are automatically added to the children tables. Settings on the parent table column are simply copied over into the new column in the child tables, but can be replaced individually in each child table. Ultimately this would just be a shortcut to adding the column directly to each child table, and introduces other complications, so it is not necessary.

Distinguishing the joined rows

There would be a simple way to distinguish the joined rows by simply having a Type column common to each child table, that would be a select list with the names of each child tables as options (and an appropriate value for new rows). There could be an argument for having this column be automatically created and managed in the parent table.

Referencing the full rows

There should be some way to navigate to the full row in the specified child table from the row in the parent table (not in formulas, but through the UI). Could be as simple as a link in the full screen view of that rows like “See full row in [Child table name] table”.

Existing demand for this functionality:

3 Likes

For completeness, below is a survey of existing approaches and workarounds.

Currently there are four methods that I know of. I will illustrate them with the domain explained in the “Use cases” section in the original post.

  1. Create an Events table with the common Start, End, and Name columns. Create three views of that table, named Meetings, Classes, and Exams. Each view adds new columns for all of the properties it defines, and shows only the common columns and its relevant columns.
    The downsides are that this potentially creates a large number of unused columns (for example, the Exams view will have all the columns from the Meetings and Classes table, but they will all be hidden and empty). It also creates namespace issues, if two of the views have similar columns with slight differences (i.e. different formula or settings), then they will need to use different names.

  2. Create an Events table with the common Start, End, and Name columns. Create three new tables named Meetings, Classes, and Exams. Each new table has only the columns for all of the new properties it defines, plus an additional lookup column that references a row in the Events table, which contains all the event-related information for that row.
    The downside are that the data is no longer flat, and the event-related information cannot be directly embedded inside the tables. Also, creating a new row requires creating a new row in both the specific table (e.g. Meetings) and a new row in the Events table, and then connecting them. You end up with a large table of Events and no good way to link each row to a row in one of the specific tables (unless you create three different columns in the Events table to lookup the rows in the three different specific tables). Some of these can be solved by using automations that run upon creating new rows or buttons to create rows in both tables and link them.

  1. Similar to 2, but delay the creation of the linked events to the viewing stage with manual processing or automations. Create an Events table with the common Start, End, and Name columns. Create three new tables named Meetings, Classes, and Exams. Each new table has only the columns for all of the new properties it defines, including the event-related information such as Start, End, and Name. When it comes time to join the rows of the Meetings, Classes, and Exams tables as events, simply use a button or an automation to process all the rows from each table and create a corresponding row in the Events table that copies the event-related info from the appropriate columns in each table, joining a copy of the rows into the common Events table.
    This is not ideal but might be sufficient for many use cases as it does not alter the representation of the data in the specific tables, but it does require manually joining the rows (which can be automated using automations).

  2. Similar to 2, but push the handling of common information onto an external integration. In our use case, you might use the Google Calendar pack to create events for each row in the specific tables, and then have a sync table that pulls in all of the Google Calendar events. This has much of the same drawbacks (info no longer embedded, and need to create the new row and create a new Google Calendar event).

2 Likes

hi @louca.developer
I would use something as described in scenario 3: via a button you generate a new table that brings all data together in this new table. This approach can be follow-ed by any regular Coda user, however if you are new to Coda, you might need the proposed solution. Coding buttons is not a thing most new users feel comfortable with.

Have run into the calendar use case before, agree that would be helpful.

1 Like

Hi Louca,

I would go with option 1. It is simple, and available already.

I have played around with the idea earlier in this document: Rambling Pete's Task List on Steroids- Maker version ¡ Rambling Pete's Shopping List on Steroids.

As you can see, you can go bananas with the idea. People are way to eager to create new tables, especially in the modern era. Third normal form is out.

Regards
Piet

oh lordy… this takes me back to the 70’s and the Great Debates about inheritance, polymorphism and collections. a lot of meta-physics, philosophical, pan-dimensional, inter-galactic, deep-thought discussions among enthusiasts from every programming religion; simula, smalltalk, pascal, c, objective-c, algol, pl1, etc… etc…

even today it has resulted in divergent technologies; single-inheritance in java, prototypical-inheritance in jacascript, multi-inheritance in c++, constructors in clojure, 3rd normal form in sql, polymorphic methods in graphQl, factory patterns, interfaces,… etc… etc…

in a non-third-normal table system like Coda, you can easily use several approaches as listed in previous posts. go knock yourself out. have fun.

but the simplest are probably the best as Coda is meant for low-volume, user-amendable, widely-shared, collaborative documents to be understood by non-experts.

so i have used a KISS simple approach, so that my clients dont need computer-sciencey know-how.

i WANTED to be all deep-voodo software engineering best-practice about it. but my clients need super-simple. sigh.

so… have a single table with a Type column. select the Type from a list or another lookup table.

depending on the Type, you can use different columns as required for that type. yes, you end up with more columns (and some will be blank for most Types) but the logic remains understandable. if its a Meeting type, use the meeting-layout. if its an Exam type, expect certain columns to have appropriate values for Exams.

but there are no extra links, multiple-inserts, multiple-updates or multiple-deletes.

and no possibility of leaving “widows and orphans” all over the place. and no delays where inconsistant states await automations or buttons to ‘fix things up’.

use specific views and dialog-layouts for different types.

you may need some Switch(Type, … ) logic in a few places, but it is all contained in one table, not scattered across multiple places.

sorry if i am preaching, dont mean to be preachy.
its just my opinion, hard won by experience.
there is NO “one true way” in coda.
whatever gets the job done

respect
max

4 Likes

Agree Max.

I have never had people stop using a doc because it is too simple. But I struggle all the time to get people to use my docs because they are overwhelming. And I am not doing voodoo, just integrate a few topics that they used to do in several spreadsheets, and then they get a mind freeze.

P.

4 Likes

This is a great article and I look forward to reading the reactions.

As I read this (twice) there was a little troll in the back of my mind which my wife said she actually saw a leprechaun-like creature dancing on my shoulder whispering into my ear - abstraction layer with GraphQL and Neo4J. :wink:

I have often asked - what if Coda’s entire data visualization platform could be an abstraction of D3 or Vega? Is it possible the same question applies to the underlying datastore?

2 Likes

From my point of view, holistically this proposed approach would actually reduce complexity. Tables are no more than they seem. For example, the Meetings table is no more than a first-class, plain-old table that is self-contained and relates only to meeting instances. I find it almost more complex to have to explain to a customer why multiple concepts, that might share nothing in common save for a few properties, must share the same table, and all the nuances and complications this incurs. Not to mention the experience that would happen if a user were to realize the need for this functionality only after having already designed and filled their independent Meetings, Exams, and Classes tables. It almost requires a certain level of expertise to have the forethought of designing tables in this more convoluted way to enable some desirable functionality down the line.

I prefer patterns that condense complexity. Rather than have the complexity distributed to each of the user-facing components at every level (each table is actually a view of the Events table under the hood, and there are a list of columns that the user must remember to ignore when referencing this table, and must remember to switch on the row type when referencing in formulas, …). I would prefer to have the front facing components be as simple as possible, and pull all the complexity into a single centralized point, whether it be a button that creates new copy rows in a common Events table backed by the rows in each of the more specific tables, or the proposed table that acts as a view of all the rows onto the intersection of columns. Sure, this has downsides, but at least it is somewhat of a “set-and-forget” pattern and does not even require that the user fully understand how this work. It also doesn’t impact or dictate how the rest of the doc data will be structured and referenced, which would make removing this behaviour in the future a pain.

I settled on this approach as it is less complex, doesn’t presume any technical understanding of the more CS-ish topics you mentioned, and is imperative in the sense that it allows flexibly making decisions on how to aggregate and view data after creation, rather than inflexibly making decisions about how to relate and declare tables before creation. Another big part is that it relies entirely on existing Coda language, such as views of tables, locked tables, etc. It is nothing more than a view of multiple tables that is locked, and happens to enable more powerful functionality (polymorphism) by being referencable in formulas.

4 Likes

This is now possible with the Merge Table pack:

1 Like