Most real-world workflows will involved ‘Multi-Tiered’ databases of linked tables.
But, over time (especially during development), we can end up with rows in lower tiers that are no longer linked to valid ‘parent’ rows in the higher tiers. We call these ‘disconnected’ rows “orphans”.
For example, we might have CUSTOMERS who place many ORDERS which contain several ITEMS.
So the ITEMS table has a lookup-column, MyOrder, that links it to its order in the ORDERS table.
And the ORDERS table has a lookup-column, MyCustomer, that links it to its customer in the CUSTOMERS table.
We often find that, over time, we end up with ORDERS that are no longer linked to valid CUSTOMERS, and many ITEMS that are no longer linked to valid ORDERS. They are caused by people deleting CUSTOMERS and leaving their ORDERS behind or by adding rows that are not properly linked into the rest of the data-model. Often this also happens during development while we are testing buttons and automations.
So we need a way to carefully and safely remove all these “Orphans” from our database to leave only those rows that are properly linked to live parents etc.
To do this, we create a canvas-button, PurgeOrphans!, on a hidden page that has the following action-code (for our CUSTOMERS->ORDERS->ITEMS example)…
RunActions(
DeleteRows(Filter(ORDERS,Not(MyCustomer.In(CUSTOMERS)))),
DeleteRows(Filter(ITEMS,Not(MyOrder.In(ORDERS))))
)
We need to start at the top tier and work our way down through the tiers for the purge to work properly.
How It Works (using the ORDERS as an example)…
- the
MyCustomer.In(Customers)
selects all ORDERS that are correctly linked to existing CUSTOMERS - the Not() reverses that selection - selecting all ORDERS that are NOT linked to existing CUSTOMERS
- so the
Filter(ORDERS,Not(MyCustomer.In(CUSTOMERS)))
returns a list of all the orphaned ORDERS rows - which is then used by the DeleteRows() function to remove those orphaned rows
Having done this for the ORDERS table, the next action does the same for the orphans in the ITEMS table.
In programming terms, this is known as “Garbage Collection”, an automatic way to identify and delete anything in memory that can no longer be ‘reached’ by the valid objects in the system. But I prefer “Purge Orphans” as being a better description of what it does.
No human-orphans were harmed during the production of this post
Max
UPDATE: this process is efficient and compact BUT if you enter the formula incorrectly, it could delete a LOT of the wrong rows - so please see my SAFER “2-Phase Process” in the replies below…