AI-assisted PKM

LLMs Aid in Developing a Pervasive Second Brain

I love many of the latest advancements in PKM: Tana, Mem, Obsidian, even Bardeen, and of course, Coda. The innovative approaches in keyboard expanders like Text Blaze and Espanso also blend into the goals we all share concerning second brain activities.

What’s common about these tools? Their domains mostly end in .ai. There’s a correlation between modern second brain tools and AI. No surprise; AI might be the silver bullet that KM has needed for four decades.

What else is common about these tools? None of them play well together, and none operate in all the environments where second brain activities occur.

The Interoperability Paradox

When I’m in Tana, I may stumble across something I saw in Slack that is particularly relevant to my current focus. I can go find it, or maybe I bookmarked it in Mem. I recall bookmarking it, but where exactly?

This is a common and pervasive process: we see things that are sense-worthy, snippets of information that move us in some way or compel us to take note. But they happen across a vast horizon of tools and contexts. If you’re a new startup like Tana, you now need to build a few dozen API interfaces in a futile attempt to play everywhere.

This is the interoperability paradox: building and maintaining all of those connections takes more effort than building the product itself.

No single PKM vendor can ever be everywhere. This is why glue factories like Make and Zapier have thrived, despite being Band-Aid-like duct tape: poorly architected, deeply latent, Goldbergian shit shows. These tools attract business logic (where it should not live) and create more security attack surface. Describing this as a shit show is probably being kind.

Note-Taking Funnel

I’ve often felt that capturing things that matter to us, in the context of PKM, occurs everywhere: from the top of the information funnel all the way down deep inside applications, like a field in a database. The cone of sense-making activities is very wide and very deep. This illustration demonstrates how one tool that works well at the top of the funnel needs to interface with another tool further down in the funnel.



You might be sitting on a blog and realize there’s something related to a cell in a record in Airtable, which happens to be open in the adjacent browser tab. You’re determined to capture both of these observations, but your PKM platform of choice doesn’t work in either of them and it’s not likely to - ever.

What’s the Remedy?

I think the capacity for PKM interoperability intersects with two realities:

  1. The note-taking funnel is an abstraction; an economic externality to PKM tools.
  2. Artifacts gathered in this abstraction must be easily found when needed.

These are universal attributes that hold true in all PKM activities. From innovators to users, we generally have a lofty belief that a single PKM tool can meet many, if not all, interoperability and findability requirements.

If you own a really nice electric vehicle, its true value is determined by its ability to be conveniently charged. Elon Musk was smart; he realized infrastructure was a critical economic externality of Tesla’s product success.

Your second brain’s architecture is no different; it needs infrastructure for its value to be fully realized. The two realities above outline the infrastructure that’s missing. In my prototype, an oversimplification of it looks like this.

Early Assessment

This approach is untested and still very early. But it has proven a few noteworthy things that we can explore before I get deep into the implementation details and how this infrastructure works.

Interoperability

The diagram shows only a few app silos; however, this architecture works in all silos. The OS-level script makes it possible to homogenize PKM activities across all contexts from desktop apps to web browsers, and all without custom browser plugins. It even supports applications that have yet to be invented.

Findability

How do we make snippets that are captured in Coda (or any desktop app or browser, for that matter) easily findable in AP Newswire? This is [seemingly] a search requirement. AI, specifically LLMs, lends itself well to meeting this challenge.

Walkthrough

  • OS-level scripts make it possible to capture information in any context
  • LLMs provide embedding vectors; Pinecone provides vector storage and findability
  • Cmd / → captures the current selection, vectorizes, and stores
  • Cmd \ → captures the current selection, vectorizes, and recalls
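The capture and recall flow above can be sketched in a few lines. This is only an illustration under loud assumptions, not the prototype's actual code: `embed` is a toy hashed bag-of-words function standing in for a real LLM embedding call, `SnippetStore` is an in-memory list standing in for a vector database such as Pinecone, and the names `capture`, `recall`, and `source_app` are hypothetical. The OS-level hotkey script that grabs the current selection is not shown.

```python
import hashlib
import math
import re
from dataclasses import dataclass, field

DIM = 256  # size of the toy vectors; a real embedding model defines its own


def embed(text: str) -> list[float]:
    """Toy stand-in for an LLM embedding: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class SnippetStore:
    """In-memory stand-in for a vector database (hypothetical, for illustration)."""
    items: list[tuple[list[float], dict]] = field(default_factory=list)

    def capture(self, selection: str, source_app: str) -> None:
        # Capture hotkey: vectorize the current selection and store it.
        self.items.append((embed(selection), {"text": selection, "app": source_app}))

    def recall(self, selection: str, top_k: int = 3) -> list[dict]:
        # Recall hotkey: vectorize the selection and rank stored snippets
        # by cosine similarity (a plain dot product, since vectors are unit length).
        query = embed(selection)
        scored = [
            (sum(q * v for q, v in zip(query, vec)), meta)
            for vec, meta in self.items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [meta for score, meta in scored[:top_k] if score > 0]


store = SnippetStore()
store.capture("Vector databases make semantic search practical", "Coda")
store.capture("Quarterly budget review is due Friday", "Slack")
hits = store.recall("semantic search")  # the Coda snippet should rank first
```

The point of the sketch is that the store knows nothing about Coda, Slack, or Mem beyond a metadata label, which is what lets a snippet captured in one app be recalled in any other.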


Capturing Snippets

Coda Experience

  • In Coda, highlight a noteworthy snippet
  • Press Cmd /
  • Title and keywords are automatically generated
  • Curate (if necessary)
  • Select to capture
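As a rough illustration of the "title and keywords are automatically generated" step, the sketch below drafts both from the highlighted text so the user can curate them before capturing. It is an assumption-laden placeholder: the post does not say how the prototype generates them (presumably with an LLM), so a simple word-frequency heuristic is swapped in here, and `draft_metadata`, `STOPWORDS`, and the record fields are hypothetical names.

```python
import re
from collections import Counter

# Small illustrative stopword list; not exhaustive.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
    "for", "on", "that", "this", "with", "it", "as", "be",
}


def draft_metadata(selection: str, source_app: str, max_keywords: int = 5) -> dict:
    """Propose a title and keywords for a snippet (heuristic placeholder for an LLM call)."""
    words = re.findall(r"[a-z][a-z'-]+", selection.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    keywords = [w for w, _ in counts.most_common(max_keywords)]

    # Use the first sentence as the title, truncated if it runs long.
    first_sentence = re.split(r"(?<=[.!?])\s+", selection.strip(), maxsplit=1)[0]
    if len(first_sentence) > 60:
        first_sentence = first_sentence[:57].rstrip() + "..."

    return {"title": first_sentence, "keywords": keywords,
            "text": selection, "app": source_app}


draft = draft_metadata(
    "Embeddings make notes findable. Embeddings map text to vectors.", "Coda"
)
# The user can now edit draft["title"] or draft["keywords"] before capturing.
```

Returning a plain dictionary keeps the curate step trivial: whatever the user changes is simply what gets stored alongside the snippet.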

Recalling Snippets

Slack Experience

  • In Slack, enter a topic
  • Press Cmd \
  • Browse the list
  • Select to insert

Mem Experience

  • In Mem, enter a topic
  • Press Cmd \
  • Browse the list
  • Select to insert

Superhuman (Email) Experience

  • In the email client, enter a topic in a new message
  • Press Cmd \
  • Browse the list
  • Select to insert

FreeFlow

This article is a precursor to more tests and research concerning the use of this approach. I call it FreeFlow because that’s what I envisioned when trying to put a dent in the second brain interoperability paradox.

I simply want what I am compelled to take note of, to be findable (and useable) irrespective of tool context.

I want my sense-making snippets to flow freely amongst all apps; not just my PKM app. I want my PKM artifacts to be instantly (and smartly) recalled and used in every other computing context.

Tall order indeed.


Bill, I’m very appreciative of this post. I’ve been thinking about (in less systemic terms) and encountering many of the limitations you point out here during my own PKM journey, which is more than five years old. The silo problem still strikes me as the greatest impediment, but as I’ve come closer to resolving that, the interoperability problem grows. That’s the inherent tradeoff, right?

My own experience as an early adopter of Obsidian led me to try and “do everything” using Obsidian plug-ins: project management, quantified self, collaboration, note-taking and knowledge management, and content production. Given Obsidian’s limits, though, I’ve returned to a two-app strategy that depends on Obsidian and Coda, and is, honestly, still unsatisfactory. Critics of the “do everything” approach argue that it’s better to suffer the silos and use many apps, each of which does one thing well… and they have a point. Apparently, they don’t mind the many silos problem.

I’ll be interested to see what comes of your work here. As a non-coder (and someone who constantly struggles to tinker less and get the damn work done), my first hope was in the future of “universal APIs” and an AI-based connector (think “Zapier on steroids”). The other intuition I’ve harbored is to advocate for a new OS. If, for example, Obsidian were positioned as a plain-text OS, with its data not merely available but purpose-structured for use by apps (a plain-text calendar, a plain-text task list, a plain-text etc.), you would have a single data lake with the potential for specialty, one-task apps to present views and manipulations of data. I mention these because I’m apprehensive about OS scripts doing the work, but I suppose AI could do the heavy lifting there.

Ryan J.A. Murphy has written about this frequently. One of my favorite posts of his: Obsidian, Roam, and the rise of Integrated Thinking Environments

Thanks again for the stimulating post!
Jack
