Hi,
Any pointers on a library for extracting text from PDF files that works inside a serverless function? pdf-parse and others simply fail. The file is uploaded to a blob, and from there it should be parsed and passed on for embedding in Pinecone. This works fine with other file types, but text extraction from a PDF in a serverless environment seems to be quite a challenge…
Any experience? Recommendations?
My site is at test.riskgpt.io