DocVision: Analyze images with cutting-edge computer vision AI, all from within Coda.
- Enter an image URL and a simple prompt
- Run the analysis
- Get the AI's text response
Example prompt: “How many people are visible in this image?”
Example response: “There are seven people visible in this image.”
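The three steps above (image URL plus prompt in, processing, text answer out) can be sketched in a few lines of Python. This is only an illustration of the flow, not DocVision's actual implementation: the names `analyze_image`, `validate_image_url`, and `stub_backend` are hypothetical, the example URL is a placeholder, and the backend is a stand-in that returns a canned answer where a real deployment would call the vision language model.

```python
from typing import Callable
from urllib.parse import urlparse


def validate_image_url(url: str) -> str:
    """Accept only absolute http(s) URLs (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid http(s) image URL: {url!r}")
    return url


def analyze_image(image_url: str, prompt: str,
                  backend: Callable[[str, str], str]) -> str:
    """Check the inputs, then hand them to whatever backend runs the model."""
    cleaned_prompt = prompt.strip()
    if not cleaned_prompt:
        raise ValueError("prompt must not be empty")
    return backend(validate_image_url(image_url), cleaned_prompt)


def stub_backend(image_url: str, prompt: str) -> str:
    # Assumption: a stand-in for the real model call, returning a fixed answer.
    return "There are seven people visible in this image."


answer = analyze_image(
    "https://example.com/team-photo.jpg",  # placeholder URL
    "How many people are visible in this image?",
    stub_backend,
)
print(answer)
```

Passing the backend in as a function keeps the input checks separate from the model call, so the stub can be swapped for a real inference client without touching the rest.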
Currently powered by moondream2, a collection of state-of-the-art vision language models (VLMs) designed to run in private server environments and remain affordable at scale.
Check out the demo here: https://coda.io/@bstocks/docvision-v0-1