Wednesday, 10 February 2010

Apache Solr and state-of-the-art search techniques

Drupal's file handling capabilities keep getting better. Beyond the core upload module, the filefield module for CCK has enabled us to build sites with all sorts of files: documents, images, music, videos, and so forth. Searching within these documents, however, has never been a common feature on Drupal sites. Some solutions have existed, particularly for extracting text from PDFs and common word-processing documents. With Apache Solr, the attachments module, and an extension library called Tika, things can be much better. With Tika you can extract text not only from Microsoft Office, OpenOffice, and PDF documents; you can also get text and metadata from images, songs, Flash movies, and zipped archives. The extracted text is then searchable as part of the normal Apache Solr-driven site search.
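To make the idea a bit more concrete, here is a rough sketch of how a file could be handed to Solr for Tika-based extraction outside of Drupal. It only builds the request URL for Solr's extracting request handler (Solr Cell), which wraps Tika. This is not necessarily how the attachments module does it internally. The `/update/extract` path assumes the handler is registered under that name in solrconfig.xml, and the base URL, the document id, and the helper name `build_extract_url` are made up for illustration.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base, doc_id, commit=True):
    """Build the URL for posting one file to Solr's extracting handler.

    solr_base: hypothetical Solr base URL, e.g. "http://localhost:8983/solr"
    doc_id:    value stored in the document's unique "id" field
    commit:    ask Solr to commit right away so the text is searchable
    """
    # "literal.<field>" sets a field value directly instead of extracting it.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    # Assumes the handler is mapped to /update/extract in solrconfig.xml.
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    # The file itself would be sent as the body of an HTTP POST to this URL.
    print(build_extract_url("http://localhost:8983/solr", "node-42-report.pdf"))
```

The file contents would then be sent as the body of a POST to that URL, for example with curl or any HTTP client, after which the extracted text can be queried like any other indexed content.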

Read more at: acquia.com/blog/use-apache-solr-search-files
Acquia Search: acquia.com/products-services/acquia-search
Slides: State-of-the-art Drupal search with Apache Solr
