This pack is incredible! I have just spent a bit of time testing this pack with a will and a report on a mining company. Fabulous results - saved me a ton of time finding out what I needed to know. Check it out. Wizardry!
The server is now using multiprocessing, giving us ~3x speed for bigger documents!
A 66-page PDF used to take 43 seconds. It now finishes in 14 seconds.
This will allow even larger documents to be processed within the 100-second timeout.
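For anyone curious what "using multiprocessing" can look like in general, here is a minimal Python sketch. It is not the pack's server code: `ocr_page` is a hypothetical stand-in that only upper-cases text, and the worker count of 4 is an assumption. It only shows the idea of splitting a document into pages and handing them to several worker processes.

```python
from multiprocessing import Pool


def ocr_page(page_text: str) -> str:
    """Hypothetical stand-in for per-page OCR work; it only upper-cases text."""
    return page_text.upper()


def ocr_document(pages: list[str], workers: int = 4) -> list[str]:
    """Process pages in separate worker processes, keeping page order."""
    with Pool(processes=workers) as pool:
        # pool.map returns results in the same order as the input pages
        return pool.map(ocr_page, pages)


def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the old run time to the new run time."""
    return before_seconds / after_seconds


if __name__ == "__main__":
    pages = [f"page {n}" for n in range(1, 67)]  # 66 placeholder pages
    results = ocr_document(pages)
    print(len(results), results[0])
    print(round(speedup(43, 14), 1))  # timings quoted in the post
```

The `speedup` helper just checks the arithmetic from the post: 43 seconds down to 14 seconds is roughly a 3x improvement.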
Keep in mind, if you expect more than:
40,000 characters: change your output format, see post
300,000 characters: be careful how you use ReadTextFile() on the Text File output. In the worst case, Coda will automatically cancel your formula if you try to display the result directly.
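As a rough way to keep the two thresholds straight, here is a small Python sketch. The function name and the wording of the returned advice are made up for illustration; only the 40,000 and 300,000 character limits come from the post above.

```python
OUTPUT_FORMAT_LIMIT = 40_000   # above this, change the output format
TEXT_FILE_CAUTION = 300_000    # above this, avoid displaying ReadTextFile() output directly


def output_advice(char_count: int) -> str:
    """Return a short recommendation for a document of the given length (hypothetical helper)."""
    if char_count > TEXT_FILE_CAUTION:
        return "use Text File output and do not display ReadTextFile() directly"
    if char_count > OUTPUT_FORMAT_LIMIT:
        return "change the output format"
    return "default output is fine"


if __name__ == "__main__":
    for size in (12_000, 85_000, 450_000):
        print(size, "->", output_advice(size))
```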
The Scan() formula now supports more engine options, like engine: "gpt-4o-mini", which allows AI-based OCR. This means it can capture extra details like colors, text positions, and more.
Other
I’ve raised the number of free credits from 50 to 100.
Hi Sam, thanks for your question!
That setup in the video should be able to handle it!
I’d recommend adding some columns to the Invoice table, such as
Currency (Text)
Creation Date (Date)
Expiration Date (Date)
as well as any other data points that generally occur in your invoices. Otherwise the AI will probably put them in the Other Info column, if you have one.
Imagine you’re messaging a person and asking them to fill out your Coda rows: is there enough information to understand what value each cell should have, and how many rows each table needs?
If not, make your table names and column names more descriptive, or simply give additional instructions with the prompt parameter.
I’ve worked hard to build a general, dynamic solution. It’s an involved process under the hood, but I’ve tried to make it as intuitive as possible for the user. If the AI runs into issues at any point, it will abort the process and create a clear error message for the user, hopefully explaining what actions they can take to make it work.
For anyone who has used this pack in the past and gotten mediocre OCR results – I highly recommend trying again, and be sure to try one of the new models. I’m finding great results with both gpt-4o and gpt-4o-mini. Totally amazing.
Change the default model to one of the better ones, and also publish a bit more instructions on how to change the model. I suspect many people will have better results that way.
Add a pack action for Scan in addition to the formula entrypoint. I was initially baffled by the fact that only ScanToRows was available as an action. Obviously more proficient Coda users know that you can wrap any Pack formula in a ModifyRows() action and turn it into a PackAction, but I suspect many will not know that.
That makes sense. I’ve tried my best to plan for the future by making the default engine “Balanced”: new models will continuously appear, and this allows me to accommodate them. It’s a very interesting problem though, as all models vary in cost, speed, quality, and I/O.
For cost and speed, I’ve done some linear regression to get estimates.
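To give a feel for what a linear regression estimate like this could look like, here is a small Python sketch using ordinary least squares. The sample numbers are invented for illustration and are not the pack's measurements, and the real estimates may well use different inputs, such as file size or model type.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate(x: float, slope: float, intercept: float) -> float:
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up sample data: page counts and processing times in seconds
    pages = [1, 2, 3, 4]
    seconds = [3, 5, 7, 9]
    slope, intercept = fit_line(pages, seconds)
    print(slope, intercept)
    print(estimate(10, slope, intercept))
```

The same fit can be run a second time with cost per document in place of seconds to get a cost estimate.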
Changing Scan() to an action is technically a breaking change, but it will only affect workflows where it’s been used outside a button.
Using Scan() as a formula is strongly discouraged as it will consume your credits with every automatic recalculation.
Also, as this pack isn’t seeing much traffic, this felt like the best path forward.