Extracting PDF Contents
It seems to have taken decades for many businesses and government agencies to move from pushing paper to digital transactions. I was reminded of this recently when I went to town hall and had to push a bit of paper myself. Surprisingly, banks and other financial institutions still offer paper options. I guess being in the world of tech comes with its own biases. Despite the seemingly slow adoption, the migration is still happening, and more and more things are going purely digital. The real estate market is finally making the switch to mostly digital, which makes this a great time to know how to take advantage of it.
Fortunately, storing a PDF in FileMaker is as easy as drag-and-drop or the click of a button. Getting information out of that PDF still requires some type of external software. In the FileMaker market you'll find a variety of options, both free and paid.
In this video, I'll walk you through one of the easiest freely available options for pulling PDF content into FileMaker. With this license-free option, you'll never have to worry about getting at the exact content you wish to extract from any PDF.
Comments
pdftohtml not found
I installed pdftohtml using brew.
which pdftohtml returns /usr/local/bin/pdftohtml
For some reason BE_ExecuteSystemCommand does not find pdftohtml even though the symbolic link is in bin. "/usr/local/bin/pdftohtml -h" executes in terminal but not in BE_ExecuteSystemCommand.
You need the full path to the binary
In case you haven't already figured it out, given that you asked back in April: you have to give BE_ExecuteSystemCommand the full path to where the binary itself is installed. In my case that is "/Volumes/Macintosh\ HD/opt/homebrew/bin/pdftotext (put path to pdf here)". Notice the escaped space in "Macintosh HD".
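A likely reason the bare command name fails is that a plugin launched from a GUI application does not inherit the PATH your Terminal session uses, so Homebrew's directories are never searched. As a rough sketch (not part of the video or the BaseElements plugin), the Python below shows the same idea outside FileMaker: resolve the absolute path of the binary, then build a command string with any spaces quoted. The helper names `resolve_binary` and `build_command` and the fallback directory list are assumptions made for illustration.

```python
import os
import shlex
import shutil


def resolve_binary(name, extra_dirs=("/opt/homebrew/bin", "/usr/local/bin")):
    """Return the absolute path of `name`, or None if it cannot be found.

    The current PATH is searched first. The extra directories are an
    assumption: they are the usual Homebrew locations on macOS.
    """
    found = shutil.which(name)
    if found:
        return found
    return shutil.which(name, path=os.pathsep.join(extra_dirs))


def build_command(binary_path, pdf_path):
    """Build a shell command string with both paths safely quoted.

    The trailing "-" asks pdftotext to write the text to standard output
    instead of a file.
    """
    return f"{shlex.quote(binary_path)} {shlex.quote(pdf_path)} -"


if __name__ == "__main__":
    binary = resolve_binary("pdftotext")
    if binary is None:
        print("pdftotext not found; install it or pass its full path")
    else:
        # Hypothetical PDF path, used only to show the quoting.
        print(build_command(binary, "/Users/me/My Docs/report.pdf"))
```

The resulting string is the kind of value you would hand to BE_ExecuteSystemCommand. Quoting the path achieves the same thing as the backslash-escaped space above.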
PDF to Text
Thank you sir!
Thanks to this video I was able to convert a PDF file to text using XpdfReader (the pdftotext command) and the BaseElements plugin. I needed to transform the PDF file into text to search for patients' blood test values and automatically fill in the fields in a FileMaker database. This method has saved the nurses tedious hours of manual entry!
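For anyone curious what that kind of search can look like, here is a minimal sketch in Python rather than a FileMaker script. It is not the commenter's actual solution. It assumes the extracted text puts each result on its own line as a label followed by a number; the sample report, the labels, and the helper names `pdf_to_text` and `find_value` are all made up for illustration.

```python
import re
import subprocess


def pdf_to_text(pdftotext_path, pdf_path):
    """Run pdftotext and return the extracted text.

    "-layout" keeps the original column layout and "-" sends the text
    to standard output. Pass the full path to the binary.
    """
    result = subprocess.run(
        [pdftotext_path, "-layout", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_value(text, label):
    """Return the first number after `label` at the start of a line, or None."""
    pattern = re.compile(
        rf"^\s*{re.escape(label)}\s*[:=]?\s*(-?\d+(?:[.,]\d+)?)",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Accept a decimal comma as well as a decimal point.
    return float(match.group(1).replace(",", "."))


if __name__ == "__main__":
    # Made-up sample standing in for the output of pdf_to_text().
    sample = "Patient report\nHemoglobin   13.5 g/dL\nGlucose: 92 mg/dL\n"
    for label in ("Hemoglobin", "Glucose", "Sodium"):
        print(label, find_value(sample, label))
```

In FileMaker the equivalent would be to store the command's result in a text field or variable and pick the values out with the built-in text functions. Real lab reports vary a lot in layout, so the pattern would need adjusting to each format.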
pdftotext instead of pdftohtml
Hi Matt, could you please provide an example of how to modify the script so that the content of the PDF can be written into a standard text field? I need to extract the text from the PDF and then search for a specific string inside that text to further process it.
PDFtoHTML
I am a Windows user and have not been able to install pdftohtml successfully. I downloaded the pdftohtml .exe file, created a bin folder in my local data at C:\Users\AngusCameron\AppData\Local\bin, and put the .exe in that folder. That didn't work. Can you please point me in the direction of a command-line utility for Windows that will manage the installation?