Plugin settings are modified by opening the SearchPhp plugin's configuration in Extras > Extensions > Manage Extensions. The configuration form contains all the required description texts right next to the input fields. In addition, the following three sections give an overview of the configuration options and how they affect each other.
Search Configuration and Crawler Settings
To enable frontend search, the "frontend search enabled" checkbox needs to be checked. Once this is done, the Frontend & Crawler settings become visible. Each setting is explained in the description text above the input field. The following settings are available:
- ignore language
- fuzzy search - show search suggestions
- display only search result of the current domain / sub-domain
- max threads for crawler (applies if multi-threading is enabled)
- maximum link depth for crawler
- Search content start and search content end delimiters (enclosing HTML comments that restrict the content relevant for search to a specific area of the page)
- Search categories
- Start-URLs for crawler (entry points for crawler start) *)
- Regexes for valid URIs (regexes defining valid links) *)
- Regexes for forbidden URIs (regexes defining links that should not be followed)
*) These are the minimum settings required for the crawler to work.
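To illustrate how the valid and forbidden URI regexes might interact, here is a minimal Python sketch. It is not the plugin's implementation: the patterns, the function name should_follow, and the rule "follow a link only if it matches at least one valid pattern and no forbidden pattern" are assumptions made for illustration.

```python
import re

# Hypothetical example patterns; real values are entered in the plugin settings.
VALID_URI_PATTERNS = [r"^https?://www\.example\.com/.*"]
FORBIDDEN_URI_PATTERNS = [r".*\.pdf$", r".*/admin/.*"]


def should_follow(url, valid=VALID_URI_PATTERNS, forbidden=FORBIDDEN_URI_PATTERNS):
    """Return True if url matches a valid pattern and no forbidden pattern.

    Assumption: forbidden patterns take precedence over valid ones.
    """
    if not any(re.match(pattern, url) for pattern in valid):
        return False
    return not any(re.match(pattern, url) for pattern in forbidden)


if __name__ == "__main__":
    for candidate in (
        "http://www.example.com/en/products",
        "http://www.example.com/files/manual.pdf",
        "http://other.example.org/",
    ):
        print(candidate, should_follow(candidate))
```

Under these assumptions, only the first candidate would be followed. The second is excluded by a forbidden pattern and the third matches no valid pattern.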
Search Categories
Search results can be sorted into categories by including an HTML meta tag on the corresponding page. The meta tag has the name "cat" and its content is the name of the category.
E.g.:
<meta name="cat" content="testcategory" />
In order to enable search by categories, each available category has to be specified in the options form.
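As a rough illustration of how a crawler could read the category from a page, the following Python sketch extracts the content of the first meta tag named "cat". The class and function names are hypothetical, and the plugin itself may parse pages differently.

```python
from html.parser import HTMLParser


class CategoryMetaParser(HTMLParser):
    """Collects the content of the first <meta name="cat"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.category = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here by default.
        if tag != "meta" or self.category is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == "cat":
            self.category = attributes.get("content")


def extract_category(html):
    """Return the category name of a page, or None if no "cat" meta tag exists."""
    parser = CategoryMetaParser()
    parser.feed(html)
    return parser.category


if __name__ == "__main__":
    page = '<html><head><meta name="cat" content="testcategory" /></head></html>'
    print(extract_category(page))
```

A page without the meta tag yields None here, meaning it would not be assigned to any category.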
Sitemap Generator (since Version 1.0)
Each time the crawler finishes, a sitemap is created according to the sitemap protocol. The sitemap can be reached at /sitemap.xml: the plugin adds a redirect to pimcore that forwards this URL, with status code 301, to /website/var/tmp/sitemap.xml, where the actual sitemap resides.
For each host that appears in the search index, a separate sitemap is generated. The sitemap.xml itself is a sitemap index file that points to the domain-specific sitemaps. If only one domain is present, the same mechanism is used and the sitemap index file contains only one link. If several domains are used, it is important that the main domain comes first in the list of crawler start URLs in the plugin settings: the first URL in the crawler config provides the domain used in the sitemap index file to point to the specific sitemap files. When working with several different sites, it is also important to create the corresponding robots.txt files to allow sitemap cross submits. The robots.txt files have to be created manually; they are not generated by the plugin.
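The structure of such a sitemap index can be sketched in Python as follows. This is only an illustration of the sitemap protocol's index format, not the plugin's code: the function name and the per-host file naming scheme (sitemap-<host>.xml) are assumptions, since the actual file names are not documented here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(start_urls, hosts):
    """Build a sitemap index with one <sitemap> entry per host.

    The first start URL provides the domain used in every <loc> entry.
    """
    main = urlsplit(start_urls[0])
    base = f"{main.scheme}://{main.netloc}"
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for host in hosts:
        sitemap = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(sitemap, "loc")
        # Hypothetical naming scheme for the host-specific sitemap files.
        loc.text = f"{base}/sitemap-{host}.xml"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap_index(
        ["http://www.example.com/", "http://shop.example.org/"],
        ["www.example.com", "shop.example.org"],
    ))
```

Because every entry points to the main domain, the sitemap for a secondary host lives on a different host than the pages it lists. In the sitemap protocol, this kind of cross submit is authorized by a Sitemap: line in the robots.txt of each secondary host that names the sitemap location.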