GPT Similarity Demo

This is fairly similar, @Jake_Nguyen. I send each row to the OpenAI API for embeddings, which returns an array of ~1,000 numbers per row. The ‘search’ is essentially treating those arrays like coordinates in ~1,000-dimensional space and measuring the distance between them.
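To make that concrete, here’s a minimal sketch of the idea (not the pack’s actual code): the REST endpoint, the `text-embedding-ada-002` model name, and the use of cosine similarity as the distance measure are my own assumptions for illustration.

```python
# A minimal sketch of the row-embedding approach described above.
import math
import os
import requests

API_KEY = os.environ["OPENAI_API_KEY"]

def embed(text: str) -> list[float]:
    """Return the embedding vector (a long array of numbers) for one row of text."""
    resp = requests.post(
        "https://api.openai.com/v1/embeddings",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": "text-embedding-ada-002", "input": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["data"][0]["embedding"]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Treat the two arrays as coordinates and measure how closely they point."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# 'Search' = embed the query once, then rank rows by similarity to it.
rows = ["the quick brown fox", "a slow tan dog", "stock market update"]
query_vec = embed("fast fox")
ranked = sorted(rows, key=lambda r: cosine_similarity(query_vec, embed(r)), reverse=True)
print(ranked)
```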

Luckily, the OpenAI API makes this a fairly quick process since you can submit requests in parallel.

Feel free to copy the doc and the pack and give it a go!

@Fran_Vidicek, looking at the code for the pack, it looks like the Prompt( ) formula uses the same API endpoint as the QuestionAnswer( ) formula, but your input to the latter is injected into a string of Q&As, which is probably intended to put GPT into a ‘question-answering mode.’ You would probably get the same response if you asked a question with either formula; however, with the Prompt( ) formula you can go beyond questions and give it commands or really whatever. As an example, I’ve used it to help generate boilerplate content by just telling it to do so.

With GPTPromptExamples( ), it looks like what you’re doing is setting up several example prompts and desired responses, then submitting a new prompt for GPT to complete given those examples. So one example I can think of: if you fed it “Water, Fire, Snow” as example prompts and “Blue, Orange, White” as example responses, then provided “Dirt” as your prompt, GPT should ideally return something like “Brown.” That’s my best guess at least.
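If that guess is right, the formula is probably assembling a few-shot prompt under the hood, something like this sketch (the formatting, model name, and endpoint are my assumptions, not the pack’s code):

```python
# A rough guess at what a GPTPromptExamples()-style formula might assemble:
# example prompts, example responses, then the new prompt for GPT to complete.
import os
import requests

examples = [("Water", "Blue"), ("Fire", "Orange"), ("Snow", "White")]
new_prompt = "Dirt"

# Stack the examples into a single completion-style prompt.
few_shot = "".join(f"Input: {p}\nOutput: {r}\n\n" for p, r in examples)
few_shot += f"Input: {new_prompt}\nOutput:"

resp = requests.post(
    "https://api.openai.com/v1/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"model": "text-davinci-003", "prompt": few_shot, "max_tokens": 5},
    timeout=30,
)
print(resp.json()["choices"][0]["text"].strip())  # ideally something like "Brown"
```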


Sorry for the delayed response.

I wasn’t referring to caching as it relates to the app itself. Instead, I’m thinking about GPT’s responses: making them sustainable without additional calls to GPT itself. It’s no different from managing costly reverse geo-encoding processes:

… given any lat/lng, transform it into a street address, but make sure you never do that more than once at a stated geo-precision.

This type of location caching makes it possible to create massive databases of specific mailing addresses by simply driving the streets with a real-time geo-tagging device. Want to know every address on the LA transit route network? Just geo-encode the routes every two seconds. We did that in LA with our transit appliances, and over 400 days we collected 500,000 addresses without paying a dime in geo-encoding fees (the free tier is 2,500 encodings per day).
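The caching pattern can be sketched in a few lines: key the cache on coordinates rounded to a stated precision and only pay for a lookup on a cache miss. `reverse_geocode()` below is a hypothetical stand-in for whatever paid geocoding API would actually be called.

```python
# A sketch of the "never geo-encode the same spot twice" idea: the cache key is
# the coordinate pair rounded to a stated precision, so repeat visits to the
# same spot are free.
PRECISION = 4  # ~11 m of lat/lng precision; choose to match your use case

address_cache: dict[tuple[float, float], str] = {}

def reverse_geocode(lat: float, lng: float) -> str:
    """Stand-in for the costly, metered reverse geo-encoding call."""
    return f"address near ({lat}, {lng})"

def cached_reverse_geocode(lat: float, lng: float) -> str:
    key = (round(lat, PRECISION), round(lng, PRECISION))
    if key not in address_cache:
        address_cache[key] = reverse_geocode(*key)  # the only billable call
    return address_cache[key]

print(cached_reverse_geocode(34.052235, -118.243683))  # hits the API
print(cached_reverse_geocode(34.052236, -118.243684))  # same rounded key: served from cache
```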

I have a hunch that similar caching approaches could create relatively large collections of shared (GPT-3) knowledge that could be sustained and utilized by larger audiences without breaking the bank.

Yeah, I think I understood you correctly, @Bill_French; the Coda OpenAI pack seems to only submit when there’s a change. So in my case I’m getting those ~1,000-number arrays per row, but it’s saving them and only resubmitting when I change the data for that row.

However, creating a shared record of these coordinates might be infeasible except for some subset or type of responses. Otherwise, you’d really only be able to save an API call if folks are putting in the exact same prompts.
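If one did want a shared cache, the key would have to be the exact text itself; a sketch of that follows, with `embed` standing in for whatever function actually makes the embeddings API call (e.g., the sketch further up).

```python
# A sketch of caching keyed on the exact prompt text: a vector is only
# recomputed when the text itself changes, and two users only share a cache
# hit if their prompts are literally identical.
import hashlib
from typing import Callable

embedding_cache: dict[str, list[float]] = {}

def cached_embedding(text: str, embed: Callable[[str], list[float]]) -> list[float]:
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in embedding_cache:
        embedding_cache[key] = embed(text)  # the only billable call
    return embedding_cache[key]
```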

I thought the definition of stale was time-based, not content based.

Perhaps I’m using the term a bit liberally; however, GPT-3 is a fixed-in-place model (as far as I’m aware), so if you submit the same prompt it should give you the same response regardless of the time. You could certainly set up an automation to resubmit prompts to keep embeddings or responses ‘fresh,’ but I’d be surprised if there was any difference. That’d make for a good experiment though!

Yes, it is. But I was referring to Coda’s definition of stale. As I understand it, Packs will refresh data even when the record or field referenced hasn’t changed. I thought it was based on some magic inside the Pack engine that determines when and if a cache hit will be passed over, thus triggering another request and response.

Um, not necessarily. It depends on the temperature and top_p settings. I’ve seen vastly different responses to identical prompts submitted back-to-back and even days later. Results are not changing because the model has changed; they’re changing because the temperature and top_p settings encourage a degree of random freshness.
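You can see the effect directly by pinning or varying the sampling settings; a sketch against the completions endpoint (the model name and prompt are assumptions for illustration):

```python
# The variation comes from sampling settings, not a changing model. With
# temperature=0 (greedy decoding), repeated calls with the same prompt should
# match; with higher temperature/top_p they often won't.
import os
import requests

def complete(prompt: str, temperature: float, top_p: float = 1.0) -> str:
    resp = requests.post(
        "https://api.openai.com/v1/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "text-davinci-003",
            "prompt": prompt,
            "max_tokens": 30,
            "temperature": temperature,  # 0 = (near-)deterministic, higher = more random
            "top_p": top_p,              # nucleus-sampling cutoff
        },
        timeout=30,
    )
    return resp.json()["choices"][0]["text"].strip()

prompt = "Suggest a name for a coffee shop on the moon:"
print(complete(prompt, temperature=0))    # stable across back-to-back calls
print(complete(prompt, temperature=0.9))  # likely to vary from run to run
```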

Hi, thank you for your reply. This method wouldn’t be scalable for millions of rows, right?

Unless OpenAI will let you store and train a database on their servers so that you don’t need to give the same text input every time.

Wish I was smart enough to understand what this meant in detail. I will bookmark this and come back at another time


What is the difference between them?

It would be great if we could use GPT to help us understand all of @Bill_French 's posts. :smile:

@Jake_Nguyen I think ‘scalable’ is probably relative. There are plenty of applications that rely on making external API calls per row on intake. Per @Bill_French 's comments, I do need to check whether Coda itself is resubmitting the data to keep info ‘fresh,’ but barring that you’d only be looking at one API call per row, or per set of text (i.e., when you change the text in a row).

I do not think that OpenAI would let folks store data on their side. The ‘so what’ of these kinds of models (in this embedding scenario) is specifically that they can produce the same or similar results given similar text, like a lookup table would, but for data with a probability space way too large to realistically build a database containing all potential options.


@Johg_Ananda correct me if I’m off here, just thought I’d get a reply in while I’m in the thread.

There are already ways to determine similarity between two strings. This is generally what’s used for autocomplete search, to make sure you’re getting relevant results if, for instance, you’re using a UK or US variation of a word’s spelling. I think this is what’s being referred to as a ‘soft’ search vs. the AI-vector-distance method my doc is using, but I may have the two backwards.

In this context I’m talking about doing a ‘hard’ search as in searching for a precise/exact string in a larger string. So you’re searching for ‘fox’ in ‘the quick brown fox jumps over the lazy dog’.

A ‘soft’ search might be searching for ‘quik’, or now with GPT you could search for ‘slow dog’ or ‘tan fox’, since the similarity scoring lets the query be grokked conceptually, which massively expands the capabilities.

Previously, to create these types of soft searches the designer would have to anticipate them and do a lot of behind-the-scenes work, whereas with this simple API call you can basically mash anything into GPT and it will give you a search/relativity score.
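To make the hard/soft distinction concrete without any AI involved, here’s a small sketch using Python’s standard library (my own illustration, not anything from the pack):

```python
# Hard vs. soft search. A 'hard' search needs the exact token; a 'soft' search
# tolerates typos by scoring string similarity (difflib is one possible approach).
import difflib

text = "the quick brown fox jumps over the lazy dog"

# Hard search: exact substring match only.
print("fox" in text)    # True
print("quik" in text)   # False - one typo and the hard search fails

# Soft search: score each word against the query and keep close matches.
def soft_search(query: str, text: str, threshold: float = 0.75) -> list[str]:
    return [
        word for word in text.split()
        if difflib.SequenceMatcher(None, query, word).ratio() >= threshold
    ]

print(soft_search("quik", text))  # ['quick'] despite the misspelling
```

The GPT-embedding approach goes a step further: ‘slow dog’ or ‘tan fox’ would score well against this sentence even though no characters match, which is something a pure string-similarity method can’t do.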


Thanks for sharing the info online, man. Rock on.

Take a look at this NLP Search if you’re looking into scalability; this is what I used.


Ha ha! There are some things we should fear AI understanding.

If true, we’d have to explain how I was able to get this working.

When you use Fine Tuning, you upload your own training data to cache-forward prompts and answers. In that process, OpenAI hands you back a pointer to a new model that is a derivative of the model the fine tuning was based on. You are able to access your derivative model as you would any other base model through the API. Ergo, your training data and new model exist inside OpenAI in joint tenancy and are accessible through – and only through – your API token.

I suspect this is all achieved virtually inside the OpenAI platform; it doesn’t actually duplicate the base model. Rather, it likely virtualizes your enhanced training data as a wrapper and simply associates it with your new model name.
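A minimal sketch of that flow, using the legacy (pre-1.0) `openai` Python client and the original fine-tunes endpoint that was current at the time of this thread; the file name and base-model choice are assumptions.

```python
import openai

openai.api_key = "sk-..."  # your API token; the derivative model is only reachable through it

# 1. Upload your own prompt/completion training data (JSONL).
upload = openai.File.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")

# 2. Ask OpenAI to build a derivative of a base model from that data.
job = openai.FineTune.create(training_file=upload.id, model="davinci")

# 3. When the job finishes, OpenAI hands back a pointer to the new model
#    (in practice you poll until the job's status is "succeeded").
fine_tuned_model = openai.FineTune.retrieve(job.id)["fine_tuned_model"]

# 4. ...which you call exactly like any base model, via your token only.
answer = openai.Completion.create(
    model=fine_tuned_model,
    prompt="What urban use cases does the product support?\n\n###\n\n",
    max_tokens=100,
)
print(answer["choices"][0]["text"])
```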

One must certainly wonder - why would OpenAI allow you to host your training data and new model derivatives on their servers? The short answer is money; they want to charge you a fee to search your own data. :wink:

Actually, there’s a middle choice as well. That behind-the-scenes work, though, is not difficult.

Hard Token Search

On the left side of the search continuum are “hard” token searches. You need the precise term to find references to the term. I believe this is Coda’s default search capability, setting aside any possible fuzzy search based on token fragments.

Abstract (AI-driven) Search

On the far right side of the continuum is the fanciful “AI-driven” search, and it’s not really token-based at all. OpenAI has already done the heavy lifting, allowing us to converse with it in abstract and conceptual ways. This is the future vision of search: universal findability without creating inverted indices and doing the behind-the-scenes work you described. This is why Google executives have little beads of sweat forming on their foreheads. They know that universal findability is perhaps the end of their paid-search visibility model. Search engines have held information hostage since 1998, giving it up only in exchange for distractive attention.

Soft Token Search

In the middle of the findability continuum is the inverted index: the likes of Lucene and engines birthed from it, such as Elasticsearch. Inverted indexes can be created from a corpus of documents or objects rather effortlessly. Even Lucene-like engines such as Solr, or embeddable ones like Lunr, can be integrated into a Coda Pack (for example). Currently, it’s really difficult to get all of the content objects out of Coda such that you could build a really granular full-text search engine, but this is probably where Codans should place a bet, because it doesn’t require new costs (i.e., the OpenAI tax) and it’s a really good search outcome. A quick read of Lunr’s search capabilities helps you get an idea of the power of full-text search.
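The core trick behind all of these engines is the inverted index itself, which is simple enough to sketch in a few lines (a toy illustration that skips stemming, ranking, and everything else a real engine adds):

```python
# An inverted index maps each token to the set of documents containing it,
# so queries become cheap set lookups instead of scanning every document.
from collections import defaultdict

docs = {
    1: "the quick brown fox jumps over the lazy dog",
    2: "a lazy afternoon by the river",
    3: "brown bears and red foxes",
}

index: dict[str, set[int]] = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.lower().split():
        index[token].add(doc_id)

def search(query: str) -> set[int]:
    """Return the ids of documents containing every token in the query."""
    token_sets = [index.get(tok, set()) for tok in query.lower().split()]
    return set.intersection(*token_sets) if token_sets else set()

print(search("lazy"))        # {1, 2}
print(search("brown lazy"))  # {1}
```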

p.s. - Had Coda exposed the canvas as a collection of objects (as I intimated in this thread in 2019 and a few others since), there would be a Pack right now with an awesome search engine for each Coda document. There would also be a federated search solution for a workspace.

I agree. It’s a very useful approach to findability at scale. On the continuum of findability it sits between full-text inverted approaches and full-on AI-driven approaches.

Thanks for calling out semantic search approaches that don’t require a vig to search your own data. :wink:


Thanks @Bill_French, would you mind explaining to @Jake_Nguyen how to use fine-tuning to solve the use case he was asking about in the following comment? It’s a bit beyond my understanding.

“Scalable” is a deeply contextual term. Is OpenAI itself scalable? That depends on how many GPUs and millions of dollars you have stacked up, right? :wink: To answer your question, we need to put a finer point on the definition of scale and the business value of using AI for or in a specific solution. But it’s clear that OpenAI does allow you to upload your data to serve as the basis for a new derivative model that is likely to provide lower-cost AI solutions.

In a broad sense, imagine you have 1,000 “objects” that each describe some knowledge facet in the context of a specific product. Let’s use CyberLandr as an example. There are about 1,000 [known] use cases for this product, but we can distill this example by simply focusing on the urban use cases, which number only ~350.

If we have a list of urban use cases, we can create a fine-tuned model that includes all known urban use cases and easily generate questions about each one. In fact, given a use case, we can ask GPT-3 to generate five questions about it. Armed with the question variants for each use case, we can build a training data set using three of the questions, holding back two questions as our test data set. The purpose of this work is to create a chat tool that can carry on a conversation with prospective CyberLandr buyers. We have another project that helps CyberLandr owners locate unique places to utilize Cybertruck and CyberLandr in new ways - i.e., wine country, farm tourism, deep overlanding where electricity may be scarce.

We submit the questions and answers to GPT-3 as the basis for our new model, and then we test that model using the test data set we withheld from the training process. We can easily gauge the performance with confidence scores and develop the data we need to determine whether more training is required.
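As a sketch of that training-data step (the use case and questions below are invented; the prompt/completion JSONL layout follows the original fine-tuning API’s format):

```python
# For each use case: generate question variants, keep three for training and
# hold back two as a test set, then write both sets as JSONL.
import json

use_cases = {
    "It works as a mobile office while parked in an urban garage.": [
        "Can I work remotely from inside it downtown?",
        "Does it function as a mobile office in the city?",
        "Is it practical as a workspace while parked in a garage?",
        "Could I take meetings from it in an urban lot?",
        "Can it serve as an office during a workday in town?",
    ],
    # ...roughly 350 urban use cases in the real project
}

train_rows, test_rows = [], []
for answer, questions in use_cases.items():
    for q in questions[:3]:  # three variants go into training
        train_rows.append({"prompt": q + "\n\n###\n\n", "completion": " " + answer + " END"})
    for q in questions[3:]:  # two are held back to test the tuned model
        test_rows.append({"prompt": q + "\n\n###\n\n", "completion": " " + answer + " END"})

with open("training_data.jsonl", "w") as f:
    f.writelines(json.dumps(row) + "\n" for row in train_rows)
with open("test_data.jsonl", "w") as f:
    f.writelines(json.dumps(row) + "\n" for row in test_rows)
```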

Everything in GPT has a cost, but a fine tuning approach is almost universally less costly than prompt-engineering your solution with preamble answers. This is why I believe fine tuning is likely one of the best ways to scale GPT projects up to a level of financial practicality.

IMPORTANT: AI models need data, and lots of it. The more data, the more valuable your fine tuned model will become. A fine tuned model is the ideal way to build ever-increasing value in AI, and it serves us well as a framework for improving the model.

Almost every integration with OpenAI [thus far] has been tactical; everyone wants an AI checkmark on their product. If your AI project is strategic, you will create workflows that harvest user experience in ways that improve the model - ergo, it needs an element of ML as much as it begins with AI.

p.s. - If Coda were smart (and there’s plenty of evidence to support this), everything in this community would be harvested in real-time into a fine tuned model that could carry on exceedingly helpful GPTChat-like conversations. Of course, almost everything CODAChat might say would likely mention @Paul_Danyliuk in some way. LOL
