Improved internal website search with Apache Solr

Trust-IT Services – 12/06/2019

1. About Apache Solr

Apache Solr is an open source enterprise search platform built on the Java-based Apache Lucene library. Solr exposes an HTTP API that makes it easier to build applications that perform full-text content searches, with a special focus on internet-based search applications. Solr is stable, scalable and reliable, and provides a wide set of core search functions that can be added to a web application. Solr creates an index of the documents and content available on a website; any user can then query that index to retrieve the most relevant items.
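As a minimal sketch of what interacting with Solr over HTTP can look like: a search is an HTTP GET against a core's `/select` handler, and the reply is a JSON document containing a hit count and the matching documents. The host, the core name `rda` and the helper names below are assumptions made for illustration, not details of the RDA deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: host and core name are placeholders.
SOLR_SELECT = "http://localhost:8983/solr/rda/select"


def build_select_url(query, rows=10):
    """Return the GET URL for a full-text query against the /select handler."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


def parse_results(body):
    """Extract the total hit count and the matching documents from a JSON reply."""
    response = json.loads(body)["response"]
    return response["numFound"], response["docs"]


def search(query, rows=10):
    """Run the query against a live Solr server (requires network access)."""
    with urlopen(build_select_url(query, rows)) as reply:
        return parse_results(reply.read().decode("utf-8"))
```

Calling `search("data sharing")` would only work against a running Solr instance; `build_select_url` and `parse_results` can be exercised on their own without a server.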

2. Implementing Apache Solr on RDA website

The main characteristics of Solr that will be fully implemented on the RDA website are:

• Matching capabilities: Apache Solr offers advanced full-text search. It supports matching on phrases, grouping, wildcards, joins and more, across any data type.
• Suggestions while querying: Solr supports auto-complete while searching (“typeahead” search) and spell checking (for which third-party modules have to be installed).
• Support for standards-based open interfaces and data formats: Solr uses the standards-based open interfaces XML, JSON and HTTP.
• Indexing capabilities: Solr takes advantage of Lucene’s Near Real-Time Indexing capabilities, so newly added content becomes searchable almost immediately. Its built-in Apache Tika integration also simplifies indexing rich content such as Microsoft Word, Adobe PDF and other common formats.
• Security: Solr has robust built-in security features such as SSL, authentication and role-based authorization.
• Performance: Solr delivers search results faster than the traditional SQL-based search in Drupal core and is designed to stay responsive under heavy load. It achieves fast responses because it searches a prebuilt index rather than scanning the text directly.
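The Performance point rests on the idea of an inverted index: each word is mapped once, at indexing time, to the documents that contain it, so a query becomes a few lookups instead of a scan over all text. The sketch below is a deliberately simplified illustration of that idea with made-up sample documents; it is not how Lucene is actually implemented.

```python
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample content, for illustration only.
documents = {
    1: "Research data sharing policies",
    2: "Working group on data repositories",
    3: "Annual plenary meeting agenda",
}
index = build_index(documents)
```

With this index, `search(index, "data sharing")` intersects two small sets rather than reading every document, which is the property that keeps index-based search fast as the amount of content grows.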

Solr must be installed on a standalone server. For this reason, a bridge between Solr and the RDA website will be established using a dedicated Drupal module: the “Search API Solr Search module” provides a Solr backend for the “Drupal Search API module”.

What type of content will RDA users retrieve?


Using the Search API Solr module, RDA users will be able to search and retrieve any public content and document available on the RDA website, such as:

• Content pieces published on the public website
• Content pieces published in the public Groups
• Files (Microsoft Word, Adobe PDF and other common formats) uploaded to the public website as well as to the Group pages
• Content in public wiki pages

Below are some examples demonstrating the query syntax:
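As a hedged illustration, the sketch below lists a few queries written in the standard Solr (Lucene) query syntax, covering phrases, wildcards, boolean grouping, field-scoped terms and required/excluded terms, and shows how each would be URL-encoded as the `q` parameter of a search request. The search terms and the `title` field name are assumptions chosen for illustration, not taken from the RDA index.

```python
from urllib.parse import urlencode

# Illustrative queries in the standard Solr (Lucene) query syntax.
# The field name "title" is an assumption about the index schema.
EXAMPLE_QUERIES = {
    "phrase": '"data management plan"',
    "wildcard": "repositor*",
    "grouping": "(metadata OR provenance) AND standards",
    "field": "title:plenary",
    "exclude": "+data -draft",
}


def to_query_string(query):
    """URL-encode a query as the q parameter of a Solr search request."""
    return urlencode({"q": query})


for name, query in EXAMPLE_QUERIES.items():
    print(f"{name:9s} {query:40s} {to_query_string(query)}")
```

The quoted form matches the exact phrase, `repositor*` matches any word starting with that prefix, parentheses group boolean clauses, `title:` restricts a term to one field, and `+`/`-` mark terms as required or excluded.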


Implementation steps:

• Installation of Solr and proper configuration to run with the RDA website content types
• Setup of the Search API module
• Setup of the Search API Solr module
• Configuration of the RDA website to send content to Solr for indexing
• Retrieval of search results from Solr and their display on the RDA website, both on a stand-alone page and through a view module
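In the setup described above, the Search API Solr module is what sends content to Solr. To make the indexing step more concrete, the sketch below shows what an equivalent request to Solr's JSON update handler could look like when built by hand. The endpoint, the core name `rda`, the document ids and the field names are all assumptions for illustration.

```python
import json
from urllib.request import Request

# Assumed endpoint; commit=true makes the documents searchable right away.
SOLR_UPDATE = "http://localhost:8983/solr/rda/update?commit=true"


def build_update_request(documents):
    """Build (without sending) a POST request that adds documents to the index."""
    body = json.dumps(documents).encode("utf-8")
    return Request(
        SOLR_UPDATE,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents; the field names are assumptions about the schema.
pages = [
    {"id": "node/101", "title": "Plenary agenda", "body": "Sessions and schedule"},
    {"id": "node/102", "title": "Group wiki", "body": "Public wiki content"},
]
request = build_update_request(pages)
```

Passing `request` to `urllib.request.urlopen` would submit the documents to a running Solr server; the sketch stops short of that so it can be read and run without one.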

Note: Solr search must be used as a replacement for the core content search. This means that the search facility currently available on the RDA website will be replaced by Solr Search.
