Full-Text Indexing the Contents of PDF Files

You can configure ADAM to read the contents of PDF files and to include this content in its full-text indexes. This allows your users to find the records containing these files by simply specifying a few words that appear in the PDF files.

Here's how to set this up:

  • Go to File Types in the ConfigStudio and open the PDF File Type.
  • Make sure that the IFilter Media Engine is selected.
  • Make sure that the ReadContent Catalog Action is selected.
  • Install an IFilter that supports reading the contents of PDF files. You have several options here:
    • Install the Adobe PDF IFilter v6.0. It is free and the easiest option to install, but internal and customer tests have indicated that it is rather unstable. This filter can be downloaded from the Adobe website.
    • Install the latest Adobe Reader. Since Adobe Reader 7.0.5, every Reader install also includes an IFilter, and this one is also free. The Reader can also be downloaded from the Adobe website.
    • Install the Foxit PDF IFilter. This is a relatively expensive product, but it works faster than Adobe's implementation.
  • Make sure that the Fulltext Index File Content setting is set to true.
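Before testing the index, it can help to confirm that Windows actually has a PDF IFilter registered. The following Python sketch walks one common registration layout in the registry: the .pdf extension points to a persistent handler, and that handler lists its filter under the IFilter interface ID. The function names are hypothetical, the layout is an assumption (some installers register the handler in a different way), and none of this is part of ADAM. Note also that a 32-bit Python on 64-bit Windows sees only the 32-bit registry view, so a 64-bit IFilter may not show up there.

```python
import sys

# Interface ID of the IFilter COM interface; a persistent handler lists its filter under this key.
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def read_default_value(subkey):
    """Return the default value of HKEY_CLASSES_ROOT\\<subkey>, or None if the key is missing."""
    import winreg  # Windows-only standard library module, imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            value, _ = winreg.QueryValueEx(key, "")  # "" selects the default value
            return value
    except OSError:
        return None


def find_ifilter_clsid(extension, read=read_default_value):
    """Follow extension -> persistent handler -> IFilter CLSID; None if any step is missing."""
    handler = read(extension + r"\PersistentHandler")
    if not handler:
        return None
    return read(r"CLSID\%s\PersistentAddinsRegistered\%s" % (handler, IFILTER_IID)) or None


if __name__ == "__main__" and sys.platform == "win32":
    clsid = find_ifilter_clsid(".pdf")
    print("PDF IFilter CLSID:", clsid or "none registered")
```

If the script reports that no filter is registered, the IFilter installation most likely did not complete, or it was registered for a different bitness than the one being inspected.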

And that's it. You may have to restart IIS to ensure that this process detects the new IFilter, but ADAM can now read the contents of PDF files.
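If a restart is needed, it can be scripted. The sketch below simply calls the standard iisreset command-line tool from Python; the function name is hypothetical, and it assumes the script runs from an elevated (administrator) session on the server that hosts ADAM.

```python
import subprocess


def restart_iis(run=subprocess.run):
    """Run `iisreset /restart` and return True when the command exits with code 0."""
    result = run(["iisreset", "/restart"], check=False)
    return result.returncode == 0


if __name__ == "__main__":
    print("IIS restarted" if restart_iis() else "iisreset reported an error")
```

Restarting IIS briefly interrupts every site on the server, so it is usually best done outside working hours.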

To be able to use this feature for files already cataloged in the database, you need to submit a maintenance job to Recreate the previews and thumbnails of these files.

Comments

Friday, 13 April 2012, Leen Van Gompel says:
For 32-bit systems, it is best to install the latest version of Adobe Reader. For 64-bit systems, an IFilter is still needed; however, it should be an IFilter built specifically for x64 systems, such as Adobe PDF IFilter 9 for x64 platforms, which can be found at http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025