A client asked me today how he could take a text field like this and extract people from it:
The “straightforward” ideas like using Split didn’t work. Links and @-references in formatted text are unlike the rest of the text: technically they occupy just one empty character and instead have an extended encoded descriptor telling what should be displayed in place of that empty character. Running a formatted text field through just strips those descriptors and that’s what you get:
While splitting by space didn’t work, splitting by empty string to extract individual characters actually did:
I tried to find a way to Filter()
out the regular characters and keep only the @-refs and came up with this solution: add and remove a character from each letter and see if it stays the same:
This actually worked and I had @-references listed in my cell. There was just one problem: it was still just formatted text values, not actual People objects or so. Besides these formatted text values retained paragraph formatting (bullets) of the original text, which is not very nice.
I knew I had to go and dissect the rich text object (aka slate
) and extract references from its descriptor. Here’s ultimately how each formatted text value is encoded (see children of different types):
I extracted portions of the descriptor marked in red: objects of type InlineStructuredValue
or actually their value
properties. For that I used the infamous Something._Merge() + ""
idiom to get the descriptor as JSON, and then ParseJson(descriptor, path)
to extract interesting pieces from that descriptor as objects. TIL that the path
parameter now supports JsonPath, which allows writing complex queries like “Get all value
s of children
whose type is this or that, and additionally the ObjectID
of those value
s is the People table”.
This is how I got all @-references:
And this is how I got just the People and formatted the column as multi-select People column:
Here’s the known catch. If you directly parse control objects (the value
s containing references) as top-level objects, Coda simply treats them as regular plain objects with no special meaning (i.e. doesn’t turn them into proper references):
It will, however, properly “unpack” objects at least one level deeper than the root. So the catch is to use ParseJson
not on the values
but on objects one level higher (in our case the children
) and then extract values
through _Deref_Object()
function.
In the above example I extracted all @-refs. To only limit @-refs to People, I had to add a filter clause on the JsonPath to only limit found children to those where value.ObjectId
is Global-Document-People
, a hidden system table containing the list of Coda users in the doc:
This is a rather Pro hack but I decided to post it anyway. My goal here was to tell you about JsonPath, the thing I just discovered myself today and show how flexible it was with regard to extracting just the right items from a complicated JSON tree.
Follow me at https://twitter.com/codatricks
Subscribe to the newsletter at codatricks.com
There’s Patreon, too
I’m building a Coda Fundamentals course, Maker Funded by Coda. Watch either of the above to know when it launches. After my course you’ll be prepared to discover solutions like this one yourself.