Indexing at Scale for Fast API Retrieval

I’m always pushing my luck. If this were an actual physical exercise, I would be trim and fit.

At Stream It we are beginning to use Coda as a CRM that represents the ground truth of our CyberLandr reservations as well as many data attributes surrounding each customer. This brings with it the need for reporting and analytics to retrieve table rows quickly. Of course, Coda’s API lacks a search method, and recall of records is limited to a single “key” field (i.e., set as display column).

My interpretation of the Coda API docs suggest that there is only one key field for row retrieval. This forces us to pull all rows to find specific rows based on other field values, a decidedly unfavorable approach if performance matters and the table happens to have thousands of rows.

What are the options?

I thought it would be simple to use the API to retrieve rows from a table whose key is Order ID when all I have is customer email address. I figured, just create a new view and set the display column to email address. But, this also changes the original table’s key field. Ooops!

Coda is so powerful, but sometimes it’s not obvious how it can overcome issues like this. Knowing that …

  • We need a separate table to act as an index for fast access to specific rows;
  • We can create an independent index based on the original table;
  • Such index could easily be a hash index composed of objects based on fields that are not key fields.

The answer is simple; a table with a single column (Objects) and a single row (1) which contain a JSON hash index. Armed with this table/row/cell, applications can use this to locate orders very quickly (less than 10ms).

Constructing the index requires a simple formula that serializes all the email addresses and their order IDs.

Utilizing the index in Javascript is equally elegant. In a single API call I can know the key that points to this customer’s data. And to be clear, the index may contain more than the name->value pair; you are free to add many fields to create a fast lookup using any fields as keys.

  var result = CodaAPI.getRow(cCodaDocID, "CRM Data - Email Index", "1", {"useColumnNames" : true});
  var objects = JSON.parse(oIndex.values["Objects"]);
  var orderID = objects[emailAddress].orderID;

Possible Gotcha

One must wonder, what is the maximum cell size that could handle a JSON-formatted string? I’ve tested this hash index approach with 3,000 to 10,000 rows and Coda doesn’t seem phased. If there is a max cell size, it would be good to know.

This is great! Going to keep looking into this more

I tested max cell size before, and it was in the range of ~14k characters. But that was before Canvas cells. Need to do more testing to see if canvas cells can contain more characters or have a limit at all

1 Like

I’ve created some modestly sized indexes that have > 150k in a single cell. I believe one has more than 250k.

Amazing! My test was a year or two back, so likely grown since then. Time for new testing!