GPT Similarity Demo

Actually, there’s a middle choice as well. That work, though, is not difficult.

Hard Token Search

On the left side of the search continuum are “hard” token searches. You need the precise term to find references to it. I believe this is Coda’s default search capability, setting aside any fuzzy matching it may do on token fragments.
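To make "hard" concrete, here's a minimal sketch in Python. It is not how Coda implements search (I don't know its internals); the `docs` dictionary and the `hard_search` helper are made up for illustration. The point is only that a near-miss like "budgets" finds nothing when the text says "budget" or "budgeting".

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def hard_search(term, docs):
    """Return the ids of docs that contain `term` as an exact token."""
    needle = term.lower()
    return [doc_id for doc_id, text in docs.items() if needle in tokenize(text)]

# Hypothetical mini-corpus standing in for rows or pages in a doc.
docs = {
    "a": "Quarterly budget review for the marketing team",
    "b": "Budgeting tips and forecasts",
    "c": "Meeting notes from Tuesday",
}

exact_hits = hard_search("budget", docs)    # only the doc with the exact token
near_misses = hard_search("budgets", docs)  # no doc contains this exact token
print(exact_hits, near_misses)
```

Doc "b" is clearly about budgets, but a hard token search for "budget" never sees it.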

Abstract (AI-driven) Search

On the far right side of the continuum is the fanciful “AI-driven” search, and it’s not really token-based at all. OpenAI has already done all the heavy lifting, allowing us to converse with it in abstract and conceptual ways. This is the future vision of search - universal findability without creating inverted indices and the behind-the-scenes work as you described it. This is why Google executives have little beads of sweat forming on their foreheads. They know that universal findability is perhaps the end of their paid-search visibility model. Search engines have held information hostage since 1998, giving it up only in exchange for distracted attention.
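One building block commonly used for this kind of concept-level matching is comparing vectors instead of tokens. The sketch below is only that, a sketch: the three-number vectors are hand-made stand-ins for real embeddings (which would come from a model and have hundreds or thousands of dimensions), and `rank_by_similarity` is a hypothetical helper, not any vendor's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hand-made toy vectors (assumption): real ones would come from an embedding model.
doc_vecs = {
    "budget review": [0.9, 0.1, 0.0],
    "spending forecast": [0.8, 0.3, 0.1],
    "team offsite photos": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of "how much money did we spend?"
query_vec = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{score:.3f}  {doc_id}")
```

Notice that the query shares no words with "budget review" at all; the match comes entirely from the vectors pointing in a similar direction.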

Soft Token Search

In the middle of the findability continuum is the inverted index: Lucene and the engines built on it, such as Elasticsearch and Solr. An inverted index can be built from a corpus of documents or objects with little effort. Even small embeddable engines such as Lunr (a lightweight, Solr-inspired JavaScript library) can be integrated into a Coda Pack, for example. Currently, it’s really difficult to get all of the content objects out of Coda in a form that would let you build a really granular full-text search engine, but this is probably where Codans should place a bet: it adds no new costs (i.e., no OpenAI tax), and it produces a really good search outcome. A quick read of Lunr’s search capabilities gives you an idea of the power of full-text search.
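For anyone who hasn't looked inside one, here's a rough sketch of an inverted index in Python. It is not Lucene's or Lunr's implementation; the `stem` function is a deliberately crude stand-in for a real stemmer, and all names here are invented for the example. The idea is just: normalize tokens, map each token to the set of documents containing it, then intersect those sets at query time.

```python
import re
from collections import defaultdict

def stem(token):
    """Very crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def tokenize(text):
    """Lowercase, split into alphanumeric tokens, and stem each one."""
    return [stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]

def build_index(docs):
    """Map each stemmed token to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def soft_search(query, index):
    """Return the doc ids that match every (stemmed) term in the query."""
    postings = [index.get(token, set()) for token in tokenize(query)]
    if not postings:
        return set()
    return set.intersection(*postings)

# Hypothetical mini-corpus standing in for content objects pulled from a doc.
docs = {
    "a": "Quarterly budget review for the marketing team",
    "b": "Budgeting tips and forecasts",
    "c": "Meeting notes from Tuesday",
}

index = build_index(docs)
print(sorted(soft_search("budgets", index)))         # matches "budget" and "Budgeting"
print(sorted(soft_search("budgeting tips", index)))  # both terms must be present
```

The index is built once and queried many times, which is why this approach carries no per-query cost. Real engines add ranking, field boosts, wildcards, and much better stemming on top of the same structure.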

p.s. - Had Coda exposed the canvas as a collection of objects (as I intimated in this thread in 2019 and a few others since), there would be a Pack right now with an awesome search engine for each Coda document. There would also be a federated search solution for a workspace.

I agree. It’s a very useful approach to findability at scale. On the continuum of findability it sits between full-text inverted approaches and full-on AI-driven approaches.

Thanks for calling out semantic search approaches that don’t require a vig to search your own data. :wink:
