Following the releases of EXT:solr 3.1.1 and EXT:tika 2.0.0, we have released version 2.1.1 of the solrfal add-on. This version requires EXT:solr 3.1.1 and supports TYPO3 6.2 LTS and 7.6 LTS.
New in this release
- Support for TYPO3 7.6 LTS
- Test runner prepared for Travis CI
- Unified namespaces to use ApacheSolrForTypo3\Solrfal
What can be done with solrfal?
Solrfal is a connector between TYPO3's FAL (File Abstraction Layer) and Apache Solr for TYPO3. It allows you to extract the content of your files (using Apache Tika and EXT:tika) and index it on your Solr server. Thanks to Apache Tika, content can be extracted from a wide range of file types (PDF, Microsoft Word & Excel, JPEG, MP3…).
How to Search Files with EXT:solr, EXT:solrfal & EXT:tika
To integrate a file search in TYPO3 with EXT:solr you need three components:
Apache Tika: Tika is available as a standalone command-line application (Tika app), integrated into Solr (Solr Cell), and as a standalone Tika server. The advantage of the standalone Tika server is that it provides all the features of Tika and does not need to start a new Java process for each file whose metadata is extracted. Compared to the Solr Cell handler, the Tika server additionally offers language detection for a file or string.
Apache Tika for TYPO3 Extension: The Tika extension, developed by Ingo Renner, provides the functionality to access Apache Tika in its app, server and Solr Cell forms.
TYPO3 Solr FAL Extension: Solrfal provides the connector between the File Abstraction Layer and EXT:solr and uses EXT:tika to extract the data from files.
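To make the difference between the Tika app and the Tika server more concrete, here is a minimal sketch of how each could be called by hand. It is not what EXT:tika does internally, only an illustration. The jar location in `TIKA_APP_JAR`, the function names, and the server URL (Tika server's default port 9998 on localhost) are assumptions you will need to adapt.

```shell
#!/bin/sh
# Assumed locations, not taken from the article: adjust to your installation.
TIKA_APP_JAR="${TIKA_APP_JAR:-/opt/tika/tika-app-1.11.jar}"
TIKA_SERVER_URL="${TIKA_SERVER_URL:-http://localhost:9998}"

# Tika app: every call starts a fresh Java process.
extract_text_with_app() {
    java -jar "$TIKA_APP_JAR" --text "$1"
}

# Tika server: the JVM is already running, so each extraction is one HTTP PUT.
extract_text_with_server() {
    curl --silent --fail --upload-file "$1" \
        --header 'Accept: text/plain' \
        "$TIKA_SERVER_URL/tika"
}

# Language detection through the server's /language/stream resource.
detect_language_with_server() {
    curl --silent --fail --request PUT --data-binary "@$1" \
        "$TIKA_SERVER_URL/language/stream"
}

# Example calls (need a running server and a real file):
#   extract_text_with_server ./manual.pdf
#   detect_language_with_server ./manual.txt
```

With many files in the Index Queue, the server variant avoids paying the JVM start-up cost for every single document.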
Setting up Apache Tika for TYPO3
In the next steps we will show you how to get started with a local Tika server, EXT:tika, and EXT:solrfal. Afterwards we will use them to index PDF files in Solr.
You can watch the video:
Dateien mit Solr in TYPO3 mit Solrfal und Tika durchsuchen (in German; "Searching files with Solr in TYPO3 using Solrfal and Tika").
Or follow the steps below:
Download and start a Tika server for development
Before you start, make sure that Java is installed and that TYPO3 is configured and running EXT:solr (version 3.1.1).
Now download and install the Apache Tika server (choose one of the mirrors from www.apache.org/dyn/closer.cgi/tika/tika-server-1.11.jar). The following commands are meant for a development context only! In a production context you should install the server via your distribution and make sure the daemon is configured properly with init scripts etc.:
mkdir -p /opt/tika
wget mirror.dkd.de/apache/tika/tika-server-1.11.jar -O /opt/tika/tika-server-1.11.jar
adduser --system --no-create-home tika
chown tika:www-data /opt/tika/tika-server-1.11.jar
chmod 550 /opt/tika/tika-server-1.11.jar
java -jar /opt/tika/tika-server-1.11.jar
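The last command keeps running in the foreground, so a second shell is handy for checking that the server actually answers. The following is a small sketch under the assumption that the server listens on Tika's default address, http://localhost:9998, and that `curl` is installed; the function name `wait_for_tika` is made up for this example.

```shell
#!/bin/sh
# Assumed default address of a locally started Tika server.
TIKA_SERVER_URL="${TIKA_SERVER_URL:-http://localhost:9998}"

# Poll the /version resource once per second until the server responds
# or the given number of attempts (default: 30) is used up.
wait_for_tika() {
    attempts="${1:-30}"
    while [ "$attempts" -gt 0 ]; do
        if curl --silent --fail --output /dev/null "$TIKA_SERVER_URL/version"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Example:
#   wait_for_tika 30 && echo "Tika server is up"
```

The function returns a non-zero exit status if the server never responds, which makes it easy to use in a provisioning script.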
Set up Tika (available in the TER)
- Install EXT:tika version 2.0.0
- Configure EXT:tika through the Extension Manager to use the Tika server path "/opt/tika/tika-server-1.11.jar" and enable the new Tika backend module for the solr extension
- Open the status report in the Reports module and verify that the Solr and Tika reports are OK
Set up Solrfal (available for EAP partners on typo3-solr.com only)
- Install EXT:solrfal from the provided zip file.
- Include the solrfal TypoScript template "Search - FAL File Indexing"
- Enable file indexing in TYPO3 by setting "plugin.tx_solr.index.enableFileIndexing = 1"
- Add the solrfal scheduler task
Add files to pages and index them
- Add some PDF files to a page
- Initialize the Index Queue and let the scheduler task process the queue
- See the files in the frontend
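If you prefer the command line while testing, the steps above can be sketched roughly as follows: trigger the scheduler through TYPO3's CLI dispatcher, then ask Solr directly whether file documents have arrived. The TYPO3 path, the Solr core URL, and the function names are placeholders. The filter `type:tx_solr_file` is an assumption about how solrfal marks file documents, so check a document in your own index before relying on it.

```shell
#!/bin/sh
# Placeholders, not taken from the article: adjust to your setup.
TYPO3_ROOT="${TYPO3_ROOT:-/var/www/typo3}"
SOLR_CORE_URL="${SOLR_CORE_URL:-http://localhost:8080/solr/core_en}"

# Run all due scheduler tasks once (needs the usual _cli_scheduler backend user).
run_scheduler() {
    php "$TYPO3_ROOT/typo3/cli_dispatch.phpsh" scheduler
}

# Query the core for file documents matching a search term.
# Assumption: solrfal stores files with type "tx_solr_file".
search_files() {
    curl --silent --fail --get \
        --data-urlencode "q=$1" \
        --data-urlencode 'fq=type:tx_solr_file' \
        --data 'wt=json' \
        --data 'rows=5' \
        "$SOLR_CORE_URL/select"
}

# Example:
#   run_scheduler
#   search_files "annual report"
```

An empty result right after initializing the Index Queue usually just means the scheduler task has not processed the queue yet.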
Do you have feature requests, questions or want to get involved?
There are many ways to get involved in EXT:solr:
- Join our channel on typo3.slack.com
- Become a sponsor and support the development of Apache Solr for TYPO3