Setting up Sphinx

By Brett Estrade http://www.justanswer.com/computer/expert-bestrade/

sphinx.conf:

    source example_xmlpipe_source
    {
        type = xmlpipe2
        xmlpipe_command = /path/to/bin/sphinxpipe2.pl
    }

    index xmlpipe_source
    {
        source = example_xmlpipe_source
        path = /path/to/index_file_prefix
        docinfo = extern
    }

command (assuming sphinxpipe2.pl outputs valid xmlpipe2 XML):

    indexer --config /path/to/sphinx.conf --all   # creates indexes

The above should just create the indexes. To bring up the search server (searchd), add the following to sphinx.conf:

    searchd
    {
        compat_sphinxql_magics = 0
        listen = 192.168.0.2:9312
        listen = 192.168.0.2:9306:mysql41
        log = /path/to/searchd.log
        query_log = /path/to/query.log
        read_timeout = 30
        max_children = 30
        pid_file = /path/to/searchd.pid
        max_matches = 1000000
        seamless_rotate = 1
        preopen_indexes = 1
        unlink_old = 1
        workers = threads   # for RT to work
        binlog_path = /path/to/sphinx_binlog
    }

If searchd is already running, the indexer command needs the “--rotate” flag so that searchd picks up the rebuilt indexes each time they are updated:

    indexer --rotate --config /path/to/sphinx.conf --all

Searching

Note that the “listen = 192.168.0.2:9306:mysql41” line above defines a MySQL-compatible listening interface. This means you can point a MySQL client to “192.168.0.2:9306” and issue SELECT statements as described here: http://sphinxsearch.com/docs/archives/1.10/sphinxql.html
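As a rough illustration of querying that listener, here is a minimal Python sketch. It assumes the third-party pymysql package is installed and that the index is named xmlpipe_source as in the configuration above. The helper names build_search and search are hypothetical, and the exact SELECT syntax your Sphinx version accepts should be checked against the SphinxQL page linked above.

```python
def build_search(index, terms, limit=20):
    """Return a SphinxQL statement and its parameters for a full-text match."""
    # Index names cannot be sent as bound parameters, so validate before formatting.
    if not index.isidentifier():
        raise ValueError("unexpected index name: %r" % (index,))
    limit = int(limit)
    if limit < 1:
        raise ValueError("limit must be at least 1")
    query = "SELECT * FROM {} WHERE MATCH(%s) LIMIT {}".format(index, limit)
    return query, (terms,)


def search(host, port, index, terms, limit=20):
    """Send the query over the mysql41 listener and return the raw rows."""
    import pymysql  # assumption: third-party MySQL client library, not part of Sphinx

    query, params = build_search(index, terms, limit)
    conn = pymysql.connect(host=host, port=port, user="")
    try:
        with conn.cursor() as cur:
            cur.execute(query, params)  # the client escapes the search terms
            return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Shows the generated statement only; nothing is sent to a server here.
    print(build_search("xmlpipe_source", "hello world", 5))
```

With searchd running, a call such as search("192.168.0.2", 9306, "xmlpipe_source", "hello world") would send the statement to the listener configured above.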

Using the PHP Sphinx Client is covered starting at listing 12 of this article: http://www.ibm.com/developerworks/library/os-php-sphinxsearch/#list12

Note the difference between fields and attributes. Fields provide the text that is indexed and subject to full-text searching. Attributes are values stored alongside each document that Sphinx uses for filtering (and for sorting and grouping) at query time.

The XML formatted data returned by the xmlpipe script needs to be in the format described at this page: http://sphinxsearch.com/docs/archives/1.10/xmlpipe2.html
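To make the expected shape concrete, here is a minimal sketch of a script that prints an xmlpipe2 docset. It is written in Python rather than Perl purely for illustration. The field names (title, body) and the published timestamp attribute are hypothetical choices, not something the setup above requires.

```python
import sys
from xml.sax.saxutils import escape, quoteattr

FIELDS = ("title", "body")              # hypothetical full-text fields
ATTRS = (("published", "timestamp"),)   # hypothetical attribute (name, type)


def render_docset(documents):
    """Return an xmlpipe2 docset string for an iterable of document dicts."""
    out = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # The schema declares which elements are fields and which are attributes.
    out.append("<sphinx:schema>")
    for name in FIELDS:
        out.append("<sphinx:field name=%s/>" % quoteattr(name))
    for name, kind in ATTRS:
        out.append("<sphinx:attr name=%s type=%s/>" % (quoteattr(name), quoteattr(kind)))
    out.append("</sphinx:schema>")

    # One sphinx:document element per document, with escaped text content.
    element_names = FIELDS + tuple(name for name, _ in ATTRS)
    for doc in documents:
        out.append('<sphinx:document id="%d">' % doc["id"])
        for name in element_names:
            out.append("<%s>%s</%s>" % (name, escape(str(doc[name])), name))
        out.append("</sphinx:document>")

    out.append("</sphinx:docset>")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = [{"id": 1, "title": "Hello & welcome",
               "body": "First test document.", "published": 1012325463}]
    sys.stdout.write(render_docset(sample))
```

A script like this would be what xmlpipe_command points to. The indexer reads whatever the script writes to standard output.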

Putting it all together:

1. write a Perl script to read over the unzipped files and output the XML describing each document
2. install Sphinx on your server and configure it to use the xmlpipe script
3. run the indexer utility (assuming the xmlpipe script works)
4. point a PHP script at the search server, using either the MySQL client interface (on that special mysql41 port) or the traditional Sphinx port and client
5. given the search results, retrieve the field of interest (if just a snippet in the index) or the full document if that’s what you need
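Step 1 could start from something like the following sketch, shown in Python rather than Perl for illustration. The function name iter_documents is hypothetical, and numbering files sequentially is only an assumption made to give each document the unique, non-zero integer ID that xmlpipe2 expects. These IDs shift when files are added or removed, so a stable ID taken from your own data is preferable when one exists.

```python
import os


def iter_documents(root):
    """Yield (doc_id, path, text) for every file found under root."""
    doc_id = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Replace undecodable bytes so one bad file does not stop the run.
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read()
            doc_id += 1  # IDs start at 1; zero is not a valid document ID
            yield doc_id, path, text


if __name__ == "__main__":
    # Lists what would be indexed from the current directory.
    for doc_id, path, text in iter_documents("."):
        print(doc_id, path, len(text))
```

Each yielded tuple would then be written out as one document element by the xmlpipe script.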