Searchable PDF (or keyword extraction)

Hi!

A team at my company is thinking about moving to coda.io to set up a product management database with a lot of PDF files…

However, they are currently leaning towards SharePoint, because PDF files uploaded there are searchable.
You can just type in a keyword and every PDF is searched for it.
I know a similar question has been asked before (Search within PDF and Images (OCR)), but I wanted to ask if

  1. there is any (simple) workaround for this,
  2. @coda_account is planning to implement such a feature in the future (in which case they would go with coda.io and just wait),
  3. there is a way to at least automatically extract keywords from a PDF in Coda.

Thanks for any hint :slight_smile:

Hey! Depending on the complexity and encodings of the PDFs, you might have some luck adding a column that simply reads the file contents, which Coda search happily picks up on.
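
To illustrate what "simply reading the file contents" can and cannot recover, here is a rough sketch that pulls runs of printable ASCII out of raw bytes. The function name `extractPrintableRuns` and the `minLength` threshold are made up for this example, and how you get the bytes out of Coda is not shown. It only finds words in PDFs whose text is stored uncompressed; compressed or specially encoded PDFs will mostly yield noise.

```typescript
// Hypothetical helper (not part of Coda or any library):
// collect runs of printable ASCII characters from raw file bytes.
function extractPrintableRuns(bytes: Uint8Array, minLength: number = 4): string[] {
  const runs: string[] = [];
  let current = "";

  bytes.forEach((b) => {
    if (b >= 0x20 && b <= 0x7e) {
      // Printable ASCII: keep extending the current run.
      current += String.fromCharCode(b);
    } else {
      // Non-printable byte ends the run; keep it only if long enough.
      if (current.length >= minLength) {
        runs.push(current);
      }
      current = "";
    }
  });

  // Flush a run that reaches the end of the file.
  if (current.length >= minLength) {
    runs.push(current);
  }
  return runs;
}

// Usage: the byte values here are a tiny stand-in for real file contents.
const sampleBytes = new Uint8Array([0, 1, 72, 101, 108, 108, 111, 255, 97, 98, 3]);
const sampleRuns = extractPrintableRuns(sampleBytes);
```

Joining the resulting runs into one text column would give Coda search something to match against, at least for the simpler PDFs.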

PDF decoding is hard. To do this properly, I’d try finding a Node package that parses PDFs and use it in a custom Coda Pack.
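
For the keyword part of the question, once a parser has turned a PDF into plain text, the extraction itself can be fairly simple. The sketch below is only an assumption about how that step might look: `extractKeywords`, `STOP_WORDS` and the sample text are invented for illustration, the PDF parsing is left out entirely, and the stop-word list is English-only and far from complete.

```typescript
// Hypothetical stop-word list; extend it for your own documents.
const STOP_WORDS: string[] = [
  "the", "and", "for", "with", "that", "this", "are", "was", "from", "has", "have", "not",
];

// Hypothetical helper: return the most frequent non-stop-words in a text.
function extractKeywords(text: string, limit: number = 5): string[] {
  // Object.create(null) avoids clashes with built-in keys such as "constructor".
  const counts: Record<string, number> = Object.create(null);

  // Words are at least three characters long and start with a letter.
  const words: string[] = text.toLowerCase().match(/[a-z][a-z0-9-]{2,}/g) || [];
  words.forEach((word) => {
    if (STOP_WORDS.indexOf(word) === -1) {
      counts[word] = (counts[word] || 0) + 1;
    }
  });

  // Most frequent first; ties are broken alphabetically so results are stable.
  return Object.keys(counts)
    .sort((a, b) => counts[b] - counts[a] || (a < b ? -1 : a > b ? 1 : 0))
    .slice(0, limit);
}

// Usage: in practice the text would come from a PDF parser.
const sampleText = "The pump manual describes the pump valve and the valve seal. Pump!";
const sampleKeywords = extractKeywords(sampleText, 3);
```

Storing the returned keywords in a text column next to each file would make them searchable in the doc, though frequency counting like this is crude and will miss anything the parser could not decode.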

Thank you so much for your answer!
I tried it, but it didn’t turn out so well either :smiley:

Haha yeah, not ideal, but worth a try! If we’re lucky, someone more knowledgeable has an easy fix to make a few more words appear, perhaps something like configuring the encoding when reading the file in the Coda Pack.
