Getting Started with Apache SOLR

By Tapan Avasthi

www.attuneuniversity.com

Apache SOLR Introduction and Configuration

In layman's terms, a search engine is a program that retrieves, in one place, a list of the documents that match the keywords we enter. The purpose of a search engine is to promote your expertise and your business to the rest of the globe and to earn revenue from that exposure.

Apache SOLR 4.0 is designed specifically for accurate, enhanced, and rapid search. Here we discuss a few key features that were recently introduced in SOLR 4.0. SOLR is built on a Java library called Apache Lucene. It is one of the most popular search engines for the web platform because it can act as an indexing server, handle multiple search tasks, and recommend content based on end users' search queries. It works primarily with two elements: HTTP and XML.

Apache SOLR supports a better user experience through pagination, data indexing, sorting, highlighting, auto-suggestion, data modeling, and more.

Apache SOLR 4.0 comes with new features that help perform more accurate searches and refine search criteria for real business markets:
o NRT (near-real-time search)
o Partial Document Update
o Easy Replication with Apache ZooKeeper
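To make the Partial Document Update feature a little more concrete, the sketch below builds the kind of XML message that SOLR 4.0 accepts for changing one field of an already-indexed document. The document ID `book1` and the field names `id` and `price` are made up for illustration, and the helper name `partial_update_xml` is hypothetical. The `update="set"` attribute is the modifier that asks SOLR to replace a single field value. Whether a given installation accepts such updates depends on its configuration, so treat this as a sketch rather than a recipe.

```python
import xml.etree.ElementTree as ET


def partial_update_xml(doc_id, field, value):
    """Build an <add> message that sets one field on an existing document."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # The unique key tells SOLR which document to change ("id" is assumed here).
    ET.SubElement(doc, "field", name="id").text = str(doc_id)
    # update="set" asks SOLR to replace only this field's value.
    ET.SubElement(doc, "field", name=field, update="set").text = str(value)
    return ET.tostring(add, encoding="unicode")


print(partial_update_xml("book1", "price", 99))
```

The resulting message would be posted to the update handler in the same way as a normal document.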

Getting Started with APACHE SOLR 1

Apache SOLR Configuration

The following prerequisite must be installed to get Apache SOLR up and running quickly on your system.

Install Java

Java is a programming language and a free computing platform, expressly designed for use in the distributed environment of the Internet.

If you are using a Linux distribution or Mac OS, Java may already be provided by the vendor in its installation packages. If you are using Windows, you may need to download and install Java manually.

Download the latest version of Java from http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html and choose the package that matches your operating system (32-bit or 64-bit). Afterwards, set JAVA_HOME in your system environment variables as follows:

o Go to Control Panel\All Control Panel Items\System and click Advanced system settings.
o Click the Environment Variables button on the Advanced tab.
o Add a JAVA_HOME variable that points to your Java installation directory.
o Then add JAVA_HOME to the Path variable as well (on Windows, typically as %JAVA_HOME%\bin).
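One way to double-check the JAVA_HOME setting is a small script like the following. It is only a sketch: the helper name `java_from_home` is hypothetical, and it assumes the usual JDK layout in which the launcher lives in a `bin` subdirectory of JAVA_HOME.

```python
import os
from pathlib import Path


def java_from_home(env=None):
    """Return where the java launcher should be, or None if JAVA_HOME is unset."""
    env = os.environ if env is None else env
    home = env.get("JAVA_HOME")
    if not home:
        return None
    # Assumption: standard JDK layout with the launcher under bin/.
    launcher = "java.exe" if os.name == "nt" else "java"
    return Path(home) / "bin" / launcher


java = java_from_home()
if java is None:
    print("JAVA_HOME is not set")
elif java.is_file():
    print("Found Java launcher at", java)
else:
    print("JAVA_HOME is set, but no launcher was found at", java)
```

Open a new terminal before running it, since environment variable changes usually do not reach terminals that are already open.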

Once your Java path is set, you are ready to install Apache SOLR on your operating system.

For Windows
o Download the latest version of SOLR from http://lucene.apache.org/solr
o Extract and install SOLR on the system. (By default it comes with Jetty as the application server.)
o Launch SOLR from the example directory with the following command in a Windows terminal:
  java -jar start.jar

For Linux
o Download SOLR by running the following command in a Linux terminal:
  wget http://download.nextag.com/apache//lucene/solr/3.4.0/apache-solr-3.4.0.tgz
o Extract the archive and change into the extracted directory:
  tar -zxvf apache-solr-3.4.0.tgz
  cd apache-solr-3.4.0
o Start Apache SOLR from the example directory:
  cd example
  java -jar start.jar


Congratulations! You have installed Apache SOLR, and it is now running on your system.
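If you would like to confirm from a script that the server is answering, a minimal check could look like this. It assumes the bundled Jetty listens on port 8983 and serves SOLR under `/solr`, as the URLs used later in this guide suggest. The helper name `solr_is_up` is hypothetical.

```python
import http.client
import urllib.request


def solr_is_up(base_url="http://localhost:8983/solr/", timeout=3.0):
    """Return True if an HTTP GET on base_url succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


print("SOLR reachable:", solr_is_up())
```

A result of False right after launch may only mean that Jetty has not finished starting, so it is worth retrying after a few seconds.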


Indexing Your Data Using SOLR

Apache SOLR offers real-time indexing, automated index replication, logging, automated failover and recovery, multiple search indexes, server statistics logging, full-text search, load-balanced querying, faceted search, and scalability, flexibility, and extensibility. Apache SOLR can be considered the "server-ization" of Lucene.

Before proceeding to index data with Apache SOLR, let’s take a look at the layout of SOLR.

Layout of Apache SOLR

By default, the example SOLR home directory is example/solr. It contains the following:
o bin - Files for more advanced setups are placed here.
o conf - Contains the files that set the Solr configuration.
o conf/schema.xml - The schema for the index, including the field type definitions for the given dataset.
o conf/solrconfig.xml - The primary Solr configuration file.
o data - Contains the actual Lucene index data in binary format.
o lib - Additional Solr plug-in JAR files can be placed here.
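The layout above can be checked with a few lines of Python. This is only a sketch: the helper name `missing_entries` is hypothetical, the list of entries is copied from the description above, and the exact layout may differ between SOLR versions. Some directories, such as data, may not appear until the server has run at least once.

```python
from pathlib import Path

# Entries described above, relative to the SOLR home directory.
EXPECTED = ["bin", "conf/schema.xml", "conf/solrconfig.xml", "data", "lib"]


def missing_entries(solr_home, expected=EXPECTED):
    """Return the expected entries that do not exist under solr_home."""
    home = Path(solr_home)
    return [entry for entry in expected if not (home / entry).exists()]


# Run from the directory that contains the example folder.
print("Missing:", missing_entries("example/solr"))
```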


Apache Solr can index data using four mechanisms:
o Solr’s native XML format
o CSV (character-separated values), in which a separator character, often a comma, delimits the values
o Rich documents such as PDF, XLS, DOC, and PPT
o Solr-Binary, which is analogous to Solr-XML and contains the same data in binary format
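To show how the first two formats relate, the sketch below renders the same two records once as a Solr XML `<add>` message and once as CSV with a header row. The records and the field names `id` and `name` are invented for illustration and would have to match fields defined in your schema.xml. The helper names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample records; field names must match the schema.
DOCS = [
    {"id": "1", "name": "Getting Started"},
    {"id": "2", "name": "Indexing Data"},
]


def to_solr_xml(docs):
    """Render records as a Solr XML <add> message, one <doc> per record."""
    add = ET.Element("add")
    for record in docs:
        doc = ET.SubElement(add, "doc")
        for field_name, value in record.items():
            ET.SubElement(doc, "field", name=field_name).text = value
    return ET.tostring(add, encoding="unicode")


def to_csv(docs):
    """Render the same records as CSV; the first line names the fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(docs[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(docs)
    return buffer.getvalue()


print(to_solr_xml(DOCS))
print(to_csv(DOCS))
```

Both outputs carry the same data; XML spells out the field name for every value, while CSV states the field names once in the header line.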


Your Apache SOLR server is up and running, but it does not contain any data yet. We can modify the SOLR index by posting commands to SOLR that add, update, or delete documents, and we can query it with select. By default, the SOLR package comes with sample data located in the example/exampledocs directory.

The data to be searched is stored in the example/exampledocs folder. First ensure that the SOLR server from the previous step is running, then type the following:

  exampledocs$ java -jar post.jar *.xml

The response should be something like:

  SimplePostTool: version 1.2
  SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
  SimplePostTool: POSTing files to http://localhost:8983/solr/update
  SimplePostTool: POSTing file xmllisa1.xml
  SimplePostTool: COMMITting Solr index changes.

You can now search the indexed data from the Query tab in the Apache SOLR admin panel, or directly with a URL such as:

  http://localhost:8983/solr/collection1/select?q=solr&wt=xml
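The same two interactions, posting to the update URL and querying the select URL, can be sketched in Python. The URLs are the ones shown above. The helper names `update_request` and `select_url` are hypothetical, and `rows` is used here only as an example of an extra query parameter. Nothing is sent until the prepared request is passed to `urllib.request.urlopen`.

```python
import urllib.parse
import urllib.request

SOLR = "http://localhost:8983/solr"


def update_request(xml_body, base=SOLR):
    """Prepare (but do not send) a POST of an XML message to the update URL."""
    return urllib.request.Request(
        base + "/update",
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def select_url(query, core="collection1", base=SOLR, **params):
    """Build a select URL like the one above; extra params are appended."""
    query_string = urllib.parse.urlencode({"q": query, "wt": "xml", **params})
    return f"{base}/{core}/select?{query_string}"


print(select_url("solr"))
print(select_url("solr", rows=5))

# To actually commit pending changes on a running server:
# urllib.request.urlopen(update_request("<commit/>"))
```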


Experimenting with Text Analysis

Let’s get comfortable with Solr’s analysis page, an experimentation and troubleshooting tool that is absolutely indispensable. With it, you can try different combinations of configuration and verify whether you get the desired result. It is especially helpful when particular queries are not matching text that you feel they should match. Have a look at Solr's admin page, where you will see a link named [ANALYSIS]. Enter the text shown in the snapshot below.

Now click the Analyze button. You will see the output below in the Index Analyzer.
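To give a feel for what the analysis page displays, here is a toy analysis chain written in plain Python. It is not Solr's implementation. The three stages (whitespace tokenizing, lowercasing, stop-word removal), the stop-word list, and the sample text are all assumptions chosen for illustration. Like the analysis page, it records the tokens after every stage so you can see where a term changes or disappears.

```python
# A deliberately small stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "with"}


def whitespace_tokenize(text):
    return text.split()


def lowercase(tokens):
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Run the chain and record the tokens produced by each stage."""
    tokens = whitespace_tokenize(text)
    stages = [("tokenize", tokens)]
    for name, step in (("lowercase", lowercase), ("stop words", remove_stop_words)):
        tokens = step(tokens)
        stages.append((name, tokens))
    return stages


for stage, tokens in analyze("Getting Started with the Apache SOLR"):
    print(f"{stage:<12}{tokens}")
```

If a query and the indexed text end up with different tokens after the final stage, they will not match, which is the kind of mismatch the analysis page helps you spot.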
