21 June 2014 Author: Jean-Claude Dauphin [email protected]

J-ISIS Reference Manual

The latest J-ISIS Distribution zip file can be downloaded from:

http://kenai.com/projects/j-isis/downloads

(Version 1.3)

21/06/2014

J-ISIS Reference Manual – 21 June 2014


Other documents providing details on specific J-ISIS features:

Data Entry Documentation: Data Entry with Pick Lists, Validation Rules. BLOB Images. Plus Digital Library documents.

How to transfer a WinISIS database to J-ISIS: Step-by-Step Instructions for moving a WinISIS or DOS CDS/ISIS Database to J-ISIS.

J-ISIS Network Client Server: This document explains how to use J-ISIS in a network.

J-ISIS Installation On Mac OSX Mountain Lion: This document explains how to install and customize J-ISIS on Mac OS X.

J-ISIS and Web Technologies: This document explains how J-ISIS is related to Web technologies and how to use J-ISIS formats (CDS/ISIS formatting language) for creating HTML/XHTML display formats that contain hypertext.

Pick Lists Validation Documentation: Pick Lists For Data Entry and Data Validation.

Retrieving MARC Format Bibliographic Records from Z39.50 Servers using J-ISIS Z39.50 client: Step by Step Instructions for Retrieving MARC Format Bibliographic Records from Z39.50 Servers using J-ISIS Z39.50 client and Importing them to J-ISIS.

Step by Step Instructions for Importing MARC Format Bibliographic Records to J-ISIS: As an example, the public domain Gutenberg MARC format bibliographic records are imported to J-ISIS and it is shown how to update a record for displaying a hyperlink to the original document.

Playing with new Web technologies in J-ISIS rev 1: This document explains and demonstrates how some of the features offered by HTML5, CSS3, and JavaScript can be used inside J-ISIS print formats.

Web-JISIS Documentation: Web-JISIS Documentation for the first alpha release of Web-JISIS.


Foreword

J-ISIS is a new multiplatform Free and Open Source Software (FOSS) ISIS that provides the same successful concepts and functionalities as the current UNESCO CDS/ISIS FOR WINDOWS (WinISIS) software. J-ISIS removes many of the limitations and restrictions of WinISIS, uses a Client/Server architecture, is fully UNICODE, and benefits from the latest software developments. J-ISIS follows the CDS/ISIS concepts in order to preserve the assets and experience of its users, so that users familiar with the CDS/ISIS software family will also be familiar with J-ISIS and will find the same concepts. On the developer side, the main objective is to develop a long-term solution that is modular, easily maintainable and extensible.

One of the requirements was that J-ISIS should be usable by small libraries and document centers in developing countries that don't have Internet access, or the necessary IT infrastructure and knowledge for installing a Web server. They need an easy-to-install application that provides a rich user interface. J-ISIS uses the client/server pattern: it is a desktop application that acts as a database server as well as a rich client application. The rich client communicates with a specifically designed database server, which can be reached on the same machine or over a local network. J-ISIS can therefore be used on a single host machine or over a small local network, without Internet access or a Web server installed.

A second requirement was that the J-ISIS application should be usable as a rich desktop application as well as a Web application. Today, a number of technologies are vying to add interactivity to Web applications: technologies such as Ajax are becoming more mature, and JavaScript libraries such as jQuery, YUI, GWT, Dojo, etc. provide GUI widgets that make it possible to develop more interactive Web applications. Built around the J-ISIS desktop application, which includes a database server, Web-JISIS is a Web application that uses the Struts 2 framework, Sitemesh, Ajax and jQuery. Please note that the J-ISIS desktop and Web-JISIS applications are complementary.

J-ISIS is not an Integrated Library System (ILS) like ABCD; it is a non-relational (NoSQL) database management system that uses the ISIS concepts and is particularly well suited for the storage and retrieval of bibliographic information. While it is possible to use J-ISIS for publishing an OPAC and for managing acquisitions, loans and patrons/users, as is done with WinISIS, some specific ILS modules are under development.

J-ISIS databases and all related files such as indexes, FDT, FST, Worksheets, etc. are fully UNICODE using UTF-8 encoding and are interoperable between different platforms (i.e. you can copy databases without conversion between Windows, Mac OS X and Linux).

Like CDS/ISIS, J-ISIS is a flexible Information Storage and Retrieval system designed specifically for the computerized management of structured non-numerical data bases. One of the major advantages offered by the generalized design of the system is that J-ISIS is able to manipulate an unlimited number of data bases, each of which may consist of completely different data elements. Furthermore J-ISIS uses the TCP/IP protocol to communicate between computers and is a database server that follows the Client/Server architecture.

For advanced users, J-ISIS offers a wide range of programming facilities allowing the development of specialized applications through the use of its powerful print formats. The J-ISIS embedded Web browser and Web server offer the possibility to use new Web technologies such as HTML5, CSS3 and JavaScript inside ISIS print formats. The Groovy programming language has replaced ISIS Pascal, offering the same functionalities and much more. For computer programmers, the ISIS_DLL is replaced by the J-ISIS core library (jisis-core.jar), which is interoperable between different platforms and provides all the necessary tools for developing J-ISIS based applications. J-ISIS is thus one of the software packages available today that are based on CDS/ISIS technology.

Acknowledgements

First of all, CDS/ISIS technology would not exist if Giampaolo Del Bigio had not created it. The original CDS/ISIS ran on an IBM mainframe and was designed in the mid-1970s under Mr Giampaolo Del Bigio for UNESCO's Computerized Documentation System (CDS). It was based on the internal ISIS (Integrated Set of Information Systems) at the International Labour Organization in Geneva. For many years, Abel L Parker from BIREME/PAHO/WHO, now Director of SciELO, has given strong support to CDS/ISIS technology. Ernesto Spinak from BIREME and Egbert de Smet of the University of Antwerp have always been key CDS/ISIS technology experts and have also provided their help to the J-ISIS project. Many people from the ISIS user community have also provided constructive criticism and suggestions and took the time to test J-ISIS. I would like to thank most particularly Tigran Zargaryan, Francesco Dell'Orso, Renate Morgensten, Bridgette Heron, Family Britz, Sara Diana Telias, María Mercedes MacLean, Vladimir Rubtsov, Nguyen Hue, Gerhard Riesthuis, Albert Lemort, Marie-Christine de Bouët du Portal, Philippe Cousson, Amjad Ali Malik and Prakash R.


Table Of Content

OTHER DOCUMENTS PROVIDING DETAILS ON SPECIFIC J-ISIS FEATURES

1. SYSTEM OVERVIEW
   A. The J-ISIS Data Base
   B. System functions
   C. Data base structure
      1. DATA BASE DEFINITION FILES
      2. DIRECTORY STRUCTURE FOR A DATABASE NAMED CDS
      3. MASTER FILE
      4. INVERTED FILE/ INDEXES
      5. RELATIONSHIPS BETWEEN THE FILES
   D. System architecture
   E. Web Technologies
      1. J-ISIS WEB BROWSER
      2. J-ISIS WEB SERVER

2. INSTALLATION
   2.1 Updating from a previous release
   2.2 UNIX/Linux Special Setting
   2.3 Configuration Files

3. J-ISIS THEORETICAL LIMITS

4. STARTING J-ISIS

5. J-ISIS – CLIENT/SERVER APPLICATION
   A. The Application Main Window
   B. Output Console Window
   C. Database menu

6. J-ISIS DATABASE SERVER

7. OPENING DATABASES

8. VISUALIZING DATABASE CONTENT
   8.1 Data Viewer
   8.2 DB Browser


   8.3 Dictionary Browsing

9. THE FORMATTING LANGUAGE
   A. Field Selectors
      1. FIELD COMMAND
      2. SUBFIELD COMMAND
      3. EXTRACTING A FRAGMENT OF A FIELD OR SUBFIELD
      4. FIELD OCCURRENCES
      5. INDENTATION COMMAND
      6. MFN COMMAND
   B. Mode command
   C. Horizontal and vertical spacing commands
   D. Literals
   E. Dummy field selectors
   F. Expressions
      1. NUMERICAL EXPRESSIONS
      2. STRING EXPRESSIONS
      3. BOOLEAN EXPRESSIONS
   G. Functions
      1. NUMERICAL FUNCTIONS
      2. STRING FUNCTIONS
      3. BOOLEAN FUNCTIONS
   H. IF command
   I. Repeatable groups
   J. Format errors
   K. Including an external format
   L. Format variables
   M. WHILE command
   N. CISIS functions
   O. The XHTML/CSS/JavaScript Display environment
   P. Adding Hypertext links to formats: the LINK command
   Q. WinISIS Windows commands not implemented in J-ISIS
   R. Differences with WinISIS

10. FIELD DEFINITION TABLE (FDT)
   A. Introduction
   B. General data base design guidelines
      1. DATA ELEMENTS
      2. FIELDS AND SUBFIELDS
      3. REPEATABLE FIELDS
      4. CONTROL CHARACTERS
   C. J-ISIS Databases And MARC Records
      1. ISO 2709 MARC RECORDS
      2. ALL ISO 2709 MARC FORMATS USE THE SAME MARC RECORD STRUCTURE
      4. INFORMATION INTERCHANGE FORMAT (IIF), ANSI Z39.2, ISO STANDARD 2709
      5. XML MARC RECORD STRUCTURE MARCXML
      6. J-ISIS RECORD STRUCTURE AND MARC RECORDS
      CHECKING IMPORT/EXPORT OF A ISO2709 MARC21 FORMAT RECORD

11. THE FIELD SELECT TABLE (FST)


   A. FST parameters
      1. DATA EXTRACTION FORMAT
      2. INDEXING TECHNIQUES
      3. FIELD IDENTIFIER
      4. FIELD NAME
   B. Index/Inverted file FST
      1. BUILDING A SEARCH INDEX
   C. Reformatting FST

12. THE FST MANAGER
   12.1 Presentation
      A) FST MANAGER CONTROL PANEL
      B) FST MANAGER CREATE/DELETE FST ENTRIES BUTTONS

13. BUILDING A SEARCH INDEX
   Understanding the Indexing Process

14. SEARCHING
   14.1 Guided Search
      A) SINGLE TERM SEARCHING
      B) MULTIPLE TERM SEARCHING
   14.2 Expert Search
      14.2.1 TERMS
      14.2.2 FIELDS
      14.2.3 TERM MODIFIERS
      14.2.3 BOOLEAN OPERATORS
      14.2.4 GROUPING
      FIELD GROUPING
      14.2.5 ESCAPING SPECIAL CHARACTERS
   14.3 Expert Search Examples
      14.3.1 USING BOOLEAN OPERATORS
      14.3.2 WILDCARD SEARCH EXAMPLE:
      14.3.3 PROXIMITY SEARCH
      14.3.4 SEARCHING ARABIC (ISA DATABASE)

15. SEARCH HISTORY

16. WARNING ON A GENERAL EDITING ISSUE

17. IMPORTING
   16.1 Importing ISO 2709 files
   17.2 Importing MARC files

18. EXPORTING

   A) Select Output Format
   B) Output Parameters

19. PFT MANAGER
   19.1 Presentation
   19.2 Re-Using Plain Old WinISIS PFTs
   19.3 Problems you may be faced when using old PFTs

20. J-ISIS GROOVY CONSOLE
   Features
      RUNNING SCRIPTS
      EDITING FILES
      INPUT AREA
      HISTORY AND RESULTS

21. GROOVY PROGRAMMING LANGUAGE
   21.1 Classes & Scripts
   21.2 Groovy Tutorial
   21.3 Using Groovy to write Format exits (Call from the PFT to external functions)

22. DATABASE CREATION
   Field Definition Table (FDT) – Database Structure
   Data Entry Worksheet
   Field Selection Table (FST)

23. DATA ENTRY
   23.1 Data Entry Worksheet
      CHANGING THE ORDER OF THE WORKSHEET FIELDS
   23.2 Data Entry Module
      A) PRESENTATION
      B) DATA ENTRY PROCESS
   23.3 Advanced Worksheet Editor Module
      ENRICH A WORKSHEET CREATED WITH THE WORKSHEET EDITOR TO GET SUBFIELD DATA ENTRY.
      CREATE OR MODIFY AN ADVANCED WORKSHEET
   23.4 Advanced Data Entry
      A) ADVANCED DATA ENTRY CONTROL PANEL
      B) ADVANCED DATA ENTRY WINDOW
      ADD/DELETE/CLEAR FIELD OCCURRENCES
      PICK LIST EXAMPLE
      COPY RECORD CONTENT FROM ONE DATABASE TO ANOTHER

24. SORTING AND PRINTING
   24.1 Quickly Printing All/Or a Selected Range of Records
   24.2 Sorting the Records before Printing


25. MULTILINGUAL UNICODE DATABASES
   25.1 Windows
   25.2 Full fonts:
   25.3 Configuring a J-ISIS database to use a special font.

26. CLIENT Z39.50

27 AUTHENTICATION AND AUTHORIZATION IN J-ISIS
   27.1 Introduction
   27.2 INI File Sections Understood By Shiro:
   27.3 Shiro INI File Used By J-ISIS
   27.4 Default J-ISIS 'Shiro.ini' File
   27.5 Some examples

ANNEX 1
   Installing Java SE Development Kit 7u45
      1.1 DOWNLOADING JDK 1.7 U45
      1.2 INSTALLING JDK 1.7U45 ON WINDOWS
   Checking Java Runtime Environment Settings

ANNEX 2
   How to run J-ISIS in Spanish:

ANNEX 3
   HOW TO USE JISIS CORE LIBRARY IN GROOVY SCRIPTS OR OTHER WEB APPLICATIONS
      1 J-ISIS Core Library Application Programming Interface (API)
      2 Code Snippets:
      3 The API
      4 WRITING A GROOVY APPLICATION TO PRODUCE A PDF CATALOGUE


1. System overview

A. The J-ISIS Data Base

The J-ISIS data base is based on CDS/ISIS concepts. It allows you to build and manage structured non-numerical data bases, i.e. data bases whose major constituent is text. Although J-ISIS deals with text and words, and therefore offers many of the features normally found in word-processing packages, it does more than just text processing. This is because the text that J-ISIS processes is structured into data elements that you define.

In the most general terms you may think of a J-ISIS data base as a file of related data that you collect to satisfy the information requirements of a given user community. It may be for example a simple file of addresses or a more complex file such as a library catalogue or a directory of research projects. Each unit of information stored in a data base consists of discrete data elements, each containing a particular characteristic of the entity being described. For example, a bibliographic data base will contain information on books, reports, journal articles, etc. Each unit will, in this case, consist of such data elements as author, title, date of publication, etc.

Data elements are stored in fields, each of which is assigned a numeric tag indicative of its contents. You may think of the tag as the name of the field as it is known by J-ISIS. The collection of fields containing all data elements of a given unit of information is called a record.

The unique characteristic of J-ISIS is that it is specifically designed to handle fields (and consequently records) of varying length, thus allowing, on the one hand, an optimal utilization of your disk storage and, on the other, complete freedom in defining the maximum length of each field. A field may be optional (i.e. it may be absent in one or more records), and it may contain a single data element, or two or more variable length data elements. In the latter case the field is said to contain subfields, each of which is identified by a 2-character subfield delimiter preceding the corresponding data element. Furthermore a field may be repeatable, i.e. any given record may contain more than one instance, or occurrence, of the field.

J-ISIS uses the standard CDS/ISIS subfield separator character, "^". A subfield delimiter is a 2-character code preceding and identifying a variable length subfield within a field. It consists of the character ^ followed by an alphabetic or numeric character, e.g. ^a.

A separate character is used to separate the occurrences of a repeatable field during data entry. By default CDS/ISIS uses a percent sign (%), which effectively reserves it for this purpose. If you need to enter percent signs as data, you may define another character to be used instead.
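The delimiter conventions described above can be illustrated with a short Python sketch. It is not J-ISIS code: the function names and the sample field content are hypothetical, and it only assumes the two conventions stated in this section (the 2-character ^x subfield delimiter and the default % occurrence separator used during data entry).

```python
def split_occurrences(raw, separator="%"):
    """Split data-entry input into the occurrences of a repeatable field."""
    return [part.strip() for part in raw.split(separator) if part.strip()]


def parse_subfields(occurrence):
    """Return (code, value) pairs for one occurrence.

    Text before the first '^' carries no subfield code and is returned
    under the empty code ''.
    """
    pieces = occurrence.split("^")
    result = []
    if pieces[0]:
        result.append(("", pieces[0]))
    for piece in pieces[1:]:
        if piece:  # ignore a stray '^' with nothing after it
            result.append((piece[0], piece[1:]))
    return result


# Hypothetical imprint field with two occurrences, typed in one go.
raw_field = "^aParis^bUnesco^c1965%^aGeneva^bILO^c1970"
parsed = [parse_subfields(occ) for occ in split_occurrences(raw_field)]
for occurrence in parsed:
    print(occurrence)
```

Each occurrence comes back as a list of (subfield code, value) pairs, which mirrors the field / occurrence / subfield hierarchy described in this section.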

B. System functions

The major functions provided by J-ISIS allow you to:

- Define data bases containing the required data elements
- Enter new records into a given data base
- Modify, correct or delete existing records
- Automatically build and maintain fast access files for each data base in order to maximize retrieval speed
- Retrieve records by their contents, through a sophisticated search language
- Display the records or portions thereof according to your requirements
- Sort the records in any sequence desired
- Print partial or full catalogues and/or indexes
- Develop specialized applications using the J-ISIS core library and the Groovy programming language.


C. Data base structure

Although a J-ISIS data base appears as a single file of information, it actually consists of a number of logically related but physically distinct computer files that are stored under a directory structure whose root directory is named after the data base. The management of the directory structure and physical files is the responsibility of J-ISIS and you do not normally have to know their structure in detail in order to operate a data base. However some basic knowledge of the purpose and function of the major directories and files associated with a data base will help you to understand the system better.

1. Data base definition files

Before a data base can be accessed for processing, it must be made known to J-ISIS by defining certain characteristics of its record structure and contents. The Data base definition services allow you to create and/or modify a data base definition. A J-ISIS data base definition consists of the following components, each stored in a separate file:

Field Definition Table (FDT): The FDT defines the fields which may be present in the records of the data base and their characteristics.

Data entry worksheet(s) (WKS): One or more screen layouts used to create and/or update the master records of the data base. J-ISIS provides a specially designed editor to create these worksheets.

Display format(s) (PFT): Display formats define precise formatting requirements for either on-line display of records during browsing and searching, or for the generation of printed output products such as catalogues and indexes. J-ISIS provides a powerful and comprehensive formatting language which allows you to display the contents of a record in any desired way. It also provides a specially designed editor to create these PFTs.

Field Select Table(s) (FST): One FST defines the fields of the data base to be made searchable through the Inverted file. Additional FSTs define the most frequently used sorting requirements for the data base.


2. Directory structure for a database named CDS

The /ifdt directory contains the J-ISIS database FDT, which is named "master.fdt" so that it is independent of the database name.

The /ifst directory contains the J-ISIS database FSTs. The FST used for indexing is called "master.fst". Other FSTs can be stored in this directory.

The /ipft directory contains the J-ISIS database PFTs.

The /iwks directory contains the J-ISIS database Worksheets.

The /idata directory contains the Berkeley DB database that stores all the records of a given data base. This is in fact a set of files which are logically accessed as a single database called the Master File.

The /indexes directory contains a subdirectory called master that contains the main index files generated by the Lucene open-source search software.

The /idocs directory contains the documents which are referenced by this database.

The /images directory contains the images which are referenced by this database.
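As a quick way to check a database folder against the layout listed above, the following Python sketch builds the expected paths and reports which subdirectories are missing. It is an illustration only: the helper names are hypothetical, and it assumes the eight subdirectories sit directly under a root directory named after the database, as this section describes.

```python
from pathlib import Path

# Subdirectory names as listed in this section.
DB_SUBDIRS = ("ifdt", "ifst", "ipft", "iwks", "idata", "indexes", "idocs", "images")


def expected_layout(db_home, db_name):
    """Map each subdirectory name to its expected path under the database root."""
    root = Path(db_home) / db_name
    return {name: root / name for name in DB_SUBDIRS}


def missing_dirs(db_home, db_name):
    """List the subdirectories from this section that do not exist on disk."""
    layout = expected_layout(db_home, db_name)
    return [name for name, path in layout.items() if not path.is_dir()]


layout = expected_layout("home_example_db", "CDS")
fdt_path = layout["ifdt"] / "master.fdt"       # the FDT file
fst_path = layout["ifst"] / "master.fst"       # the FST used for indexing
index_dir = layout["indexes"] / "master"       # the main Lucene index files
print(fdt_path.as_posix())
```

Calling missing_dirs("home_example_db", "CDS") on a real installation would return an empty list if every directory named above is present.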

3. Master File

A Berkeley DB database stores all the records of a given data base. It is in fact a set of computer files, located in a single directory named /idata, which are logically accessed as a single database called the Master File. The Master File contains all the records of a given data base, each record consisting of a set of variable length fields. Each record is identified by a unique number, automatically assigned by J-ISIS when the record is created, called the Master File Number or MFN.

In order to provide fast access to each Master file record, Berkeley DB uses a B-tree, which is in fact an index mapping an MFN to a record in the Master file (this data structure replaces the ISIS Cross Reference File and plays the same role). You may create, modify or delete Master file records by means of the J-ISIS Data Entry Module.

4. Inverted File/ Indexes

Although a master record can be retrieved directly by its MFN, through the B-tree, additional ways of accessing a record are, of course, necessary. In the retrieval of bibliographic records, for example, it may be desirable to access a record by author, by subject, or by any other data element occurring in the record. J-ISIS allows you to provide a virtually unlimited number of access points for each record through the creation of special computer files logically called the Inverted file. J-ISIS uses the Lucene open-source search software for creating the Inverted file and performing searches. The Inverted file contains all terms which may be used as access points during retrieval for a given data base and, for each term, a list of references to the Master file record(s) from which the term was extracted. The collection of all access points for a given data base is called the dictionary. You may think of the Inverted file as an index to the contents of the Master file. For example, suppose four master records (with MFN 18, 204, 766 and 1039) contain the keyword ADULT EDUCATION. The logical structure of the corresponding Inverted file entry would be:

ADULT EDUCATION 18 204 766 1039

Because each term will normally have a different number of records indexed under it, the logical records in an Inverted file are of varying length. Here again, in order to provide the fast retrieval of each access point, the Inverted file actually consists of several physical files. J-ISIS allows selective creation of Inverted files for each data base. You may select fields, subfields or elements thereof. In addition, by specifying appropriate options, you may extract individual words, phrases or descriptors from selected fields. You define the searchable elements for a given data base by means of a Field Select Table (FST), which contains the fields to be inverted and the indexing technique to be used for each field.
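The logical structure of the Inverted file can be mimicked in a few lines of Python. This is only a sketch of the concept: J-ISIS itself delegates indexing to Lucene, and the record layout, the tag number 69 and the sample keywords used here are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_file(master_file, indexed_tags):
    """Map each extracted term to the sorted list of MFNs that contain it.

    master_file:  {mfn: {tag: [occurrence, ...]}}
    indexed_tags: tags whose occurrences are indexed whole, one term per
                  occurrence (a stand-in for the FST described above)
    """
    postings = defaultdict(set)
    for mfn, record in master_file.items():
        for tag in indexed_tags:
            for occurrence in record.get(tag, []):
                postings[occurrence.strip().upper()].add(mfn)
    return {term: sorted(mfns) for term, mfns in postings.items()}


# Hypothetical master records; tag 69 stands for a keyword field.
master_file = {
    18: {69: ["Adult education"]},
    204: {69: ["Adult education", "Literacy"]},
    766: {69: ["Adult education"]},
    1039: {69: ["Adult education", "Rural areas"]},
}
inverted = build_inverted_file(master_file, indexed_tags=[69])
print("ADULT EDUCATION", *inverted["ADULT EDUCATION"])
```

The keys of the resulting mapping play the role of the dictionary, and each value is the variable-length list of MFNs indexed under that term.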

5. Relationships between the files

The logical relationship between the major files of a J-ISIS data base is best perceived by examining the way in which retrieval is performed. Retrieval from a data base is done by specifying a set of search terms which are looked up in the Inverted File to locate the list of MFNs associated with each term. These lists are then manipulated by the program according to the search operators you have specified in your search formulation until, at the end of the search, a single list, called the hit list, is obtained, corresponding to the MFNs of the records satisfying your search formulation. If at this point you request a display of the records retrieved, J-ISIS will read each record in the hit list from the Master file, format it according to the specified format and display it on the screen. You may also save one or more hit lists, which you may later use to print the records using the Print Dialog services. A saved hit list is called a save file.
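The retrieval flow described above (look up each term, combine the MFN lists with the search operators, then read the hits from the Master file) can be sketched in Python as follows. The data and helper names are hypothetical and the set operations only illustrate the logic; they are not the J-ISIS or Lucene implementation.

```python
def search_and(*posting_lists):
    """MFNs present in every list."""
    hits = set(posting_lists[0])
    for postings in posting_lists[1:]:
        hits &= set(postings)
    return sorted(hits)


def search_or(*posting_lists):
    """MFNs present in at least one list."""
    hits = set()
    for postings in posting_lists:
        hits |= set(postings)
    return sorted(hits)


def search_not(left, right):
    """MFNs in the left list that are absent from the right list."""
    return sorted(set(left) - set(right))


# Hypothetical Inverted file and Master file contents.
inverted = {
    "ADULT EDUCATION": [18, 204, 766, 1039],
    "LITERACY": [204, 512],
    "RURAL AREAS": [1039],
}
master_file = {18: "Record A", 204: "Record B", 512: "Record C",
               766: "Record D", 1039: "Record E"}

# ADULT EDUCATION AND (LITERACY OR RURAL AREAS)
hit_list = search_and(
    inverted["ADULT EDUCATION"],
    search_or(inverted["LITERACY"], inverted["RURAL AREAS"]),
)
display = [f"{mfn}: {master_file[mfn]}" for mfn in hit_list]
print(hit_list, display)
```

Saving hit_list for later printing corresponds to what this section calls a save file.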

D. System architecture

J-ISIS uses a Client/Server architecture, is fully UNICODE, and benefits from the latest software developments. It uses the NetBeans Platform as a framework for creating rich client applications that can be written once and then run on any operating system. It also uses several open source libraries. Nowadays, almost everyone relies upon libraries and frameworks written by someone else. On the developer side, the main objective is to develop a long-term solution that is modular, easily maintainable and extensible.

J-ISIS is developed as a modular application and uses the best practices offered by the NetBeans Platform framework. Some modules add user interface elements and others are just libraries.

J-ISIS databases and all related files such as indexes, FDT, FST, Worksheets, etc. are fully UNICODE using UTF-8 encoding and are interoperable between different platforms (i.e. you can copy databases without conversion between Windows, Mac OS X and Linux).

J-ISIS is a Client/Server application that works as a database server as well as a client. When you start J-ISIS, you in fact start a J-ISIS database server listening on port 1111 by default. As a client, you can connect either to the database server on the local machine (localhost) or to another machine that has J-ISIS running and therefore also plays the role of a J-ISIS database server. In that case, you provide the IP address of that machine as "Host Name", "192.168.0.13" for example. You can get it by typing "ipconfig" in a command window of the server machine. Requests are passed as messages to the database server and results are returned. Sockets are used to communicate between the client and the database server through TCP/IP.
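Before entering a host name in the client, it can be useful to check that a J-ISIS database server is actually listening there. The Python sketch below only tests whether a TCP connection can be opened on the given port; it does not speak the J-ISIS message protocol, which is not described in this section, and the function name is hypothetical.

```python
import socket


def is_server_reachable(host="localhost", port=1111, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    1111 is the default database server port mentioned above; this is a
    plain connectivity test and sends no J-ISIS request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_server_reachable("192.168.0.13") would return True when J-ISIS is running on that machine and the port is not blocked by a firewall.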

The basic component of J-ISIS is its menu system, which allows you to call upon the various services. However, in order to manage and operate your data bases you must also learn a number of techniques which are specific to J-ISIS, such as the search language or the formatting language. Techniques are in turn implemented by using a set of tools which J-ISIS provides for this purpose. For example, if you want to carry out a search in a data base, you must first select the appropriate commands in the menus and then formulate your search requirements, which must follow the rules of the J-ISIS search language. You must therefore know this technique. To actually enter the search you use a tool called the "search window". Whereas a technique entails the intellectual process of transforming a requirement (such as retrieving information on the effects of solar radiation on marine fauna) into the specific search language of J-ISIS, a tool is a more mechanical and generally more widely applicable facility (for example the editor is not only used to enter search formulations but also to create or modify records).

E. Web Technologies

The J-ISIS application has an embedded Web browser that is used for displaying records in raw format and the output of display formats. It also includes its own Web server, which is used for serving static web resources such as HTML pages or any document file.

1. J-ISIS Web Browser

The J-ISIS Web browser component is based on the JavaFX embedded browser, a user interface component that provides a web viewer and full browsing functionality through its API. The JavaFX embedded browser component is based on WebKit, an open source web browser engine. It supports Cascading Style Sheets (CSS), JavaScript, Document Object Model (DOM), and HTML5.

Supported Features of HTML5

The current implementation of the WebView component supports the following HTML5 features:

- Canvas
- Media Playback
- Form controls (except for <input type="color">)
- Editable content
- History maintenance
- Support for the <meter> and <progress> tags
- Support for the <details> and <summary> tags
- DOM
- SVG
- Support for domain names written in national languages


2. J-ISIS Web Server

J-ISIS contains an embedded Jetty HTTP web server listening on port 8585. Thus whenever J-ISIS is running on a machine, the J-ISIS database server is listening on port 1111 and a Web server is listening on port 8585. The document root of the Web server is defined by the first DEF_HOME value defined in the dbhome.conf file.

For example, if J-ISIS is installed in folder "C:\jisis_suite 21 December 2012\jisis_suite" and the dbhome.conf content is:

# Upper/lower case is important under unix
DEF_HOME=./home_example_db

then the J-ISIS Web server document root is: C:\jisis_suite 21 December 2012\jisis_suite\home_example_db

Having J-ISIS running, and typing http://localhost:8585 will display all databases which are children of the document root.
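The way the document root follows from dbhome.conf can be reproduced with a small Python sketch. It is an illustration under stated assumptions: the function names are hypothetical, the file is assumed to contain simple key=value lines with # comments as in the example above, and a relative DEF_HOME is resolved against the installation folder because that is what the Windows example shows.

```python
from pathlib import PureWindowsPath


def first_def_home(conf_text):
    """Return the value of the first uncommented DEF_HOME* entry, or None."""
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip().startswith("DEF_HOME"):
            return value.strip()
    return None


def document_root(install_dir, conf_text):
    """Resolve the Web server document root for a Windows installation."""
    def_home = first_def_home(conf_text)
    if def_home is None:
        raise ValueError("no DEF_HOME entry found")
    return PureWindowsPath(install_dir) / def_home


DBHOME_CONF = """\
# Upper/lower case is important under unix
DEF_HOME=./home_example_db
"""
root = document_root(r"C:\jisis_suite 21 December 2012\jisis_suite", DBHOME_CONF)
print(root)
```

With the example configuration this prints the same document root as given above; an absolute DEF_HOME value replaces the installation folder entirely.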


2. Installation

You should first uninstall any previous Java installation and install the Java SE Development Kit 7u21 (which includes JavaFX) from:

http://www.oracle.com/technetwork/java/javase/downloads/jdk7u21-downloads-1859576.html

Please see Annex 1 for a step-by-step procedure.

To install J-ISIS from the J-ISIS Distribution zip file, unzip the file. The unzip application will suggest creating a parent folder named "jisis_suite XXXX", XXXX being the J-ISIS release date. You can keep this parent folder name and change the place where it is created. Once installed, J-ISIS will consist of the following directory layout:

Note the location of your J-ISIS directory. [It will now be referred to as: $JISIS_HOME]

In this example, it is:

“C:\jisis_suite XXXX”

2.1 Updating from a previous release

If you have already installed a previous version of J-ISIS, the best strategy is as follows:

a) Install the new release in a new folder and NOT ON TOP OF THE PREVIOUS ONE

b) Copy your working databases from the \jisis_suite\home_example_db directory of the previous J-ISIS installation to that of the new one

c) It is recommended to make a backup of the previous installation's database directory, \jisis_suite\home_example_db, just in case

d) Delete the previous J-ISIS installation.


It may be necessary to save your working databases which are under the \jisis_suite\home_example_db directory of the previous J-ISIS installation. There is a parent directory per database. Thus if you have a database named Ernesto_DB, you will have a directory called Ernesto_DB under the \jisis_suite\home_example_db directory.

jisis_suite------+------bin
                 |
                 +------conf
                 |
                 +------etc
                 |
                 +------home_example_db------+------ASFAEX
                 |                           |
                 |                           +------Ernesto_DB
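Since each database is simply a directory under home_example_db, the backup and copy steps of the update procedure can be scripted. The Python sketch below is one possible way to do it, not a J-ISIS tool: the function name is hypothetical, and it deliberately skips any database that already exists in the new installation (such as the example databases shipped with the distribution).

```python
import shutil
from pathlib import Path


def migrate_databases(old_home, new_home, backup_dir):
    """Back up the old database root, then copy each database to the new one.

    Returns the names of the databases that were copied. Databases already
    present in the new installation are left untouched.
    """
    old_home, new_home = Path(old_home), Path(new_home)
    # Step c) first: full backup (raises if backup_dir already exists).
    shutil.copytree(old_home, backup_dir)
    copied = []
    # Step b): one directory per database under the old database root.
    for db_dir in sorted(p for p in old_home.iterdir() if p.is_dir()):
        target = new_home / db_dir.name
        if not target.exists():
            shutil.copytree(db_dir, target)
            copied.append(db_dir.name)
    return copied
```

A call could look like migrate_databases(r"C:\old\jisis_suite\home_example_db", r"C:\new\jisis_suite\home_example_db", r"C:\backup\home_example_db"), where all three paths are placeholders for your own folders. The previous installation should only be deleted after checking the copied databases.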

2.2 UNIX/Linux Special Setting

Debian, Ubuntu and many other Linux distributions are switching to OpenJDK (Open Java Development Kit), a free and open source implementation of the Java Platform, Standard Edition (Java SE).

Unfortunately OpenJDK does not include the latest version of JavaFX, so you will need to install the Oracle Java SE Development Kit 7u40 or higher (which includes JavaFX) from: http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

A good tutorial is available here:

Ubuntu Linux: Install Latest Oracle Java 7 http://www.cyberciti.biz/faq/howto-installing-oracle-java7-on-ubuntu-linux/

You should also set the DEF_HOME variable in the “dbhome.conf” file as follows:

DEF_HOME=../home_example_db (please note the two dots)

The "dbhome.conf" file is described below in section 2.3.

2.3 Configuration Files

The /conf directory located under the jisis_suite directory contains two configuration files: dbhome.conf and server.conf.

dbhome.conf

This file is located at $JISIS_HOME/jisis_suite/conf/dbhome.conf and contains one or several database root directories. J-ISIS creates the directory structure of each database as a child of such a root directory.

In the J-ISIS distribution, the database root directory is defined as:

$JISIS_HOME/jisis_suite/home_example_db

and the dbhome.conf file has the following lines:

DEF_HOME=./home_example_db
#DEF_HOME2=D:\MyDb

Note: lines beginning with # are considered as comments.

The document root of the Web server is defined by the first DEF_HOME value defined in the dbhome.conf file. For example, if J-ISIS is installed in folder

"C:\jisis_suite XXXX\jisis_suite" and the dbhome.conf content is:

# Upper/lower case is important under unix DEF_HOME=./home_example_db

Then the J-ISIS Web server document root is: C:\jisis_suite XXXX\jisis_suite\home_example_db server.conf This file is located in directory JISIS_HOME/jisis_suite/conf/server.conf and define the data base server port. It can also be used to define the J-ISIS Web server IP address for networking, the default is localhost.

# Server configuration file
port=1111
# Following line must be uncommented and updated to server machine IP address for networking
#jetty.webserver.baseurl=http://192.168.0.11:8585/
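The two configuration files use a simple key=value layout with # comments. The following is a minimal Python sketch of how such a file could be read. It is only an illustration: the function name parse_conf is hypothetical, and J-ISIS itself may parse these files differently.

```python
def parse_conf(text):
    """Parse key=value lines, skipping blank lines and lines starting with #."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if sep:  # keep only well-formed key=value lines
            settings[key.strip()] = value.strip()
    return settings


# Sample contents copied from the examples in this section.
DBHOME = """# Upper/lower case is important under unix
DEF_HOME=./home_example_db
#DEF_HOME2=D:\\MyDb
"""

SERVER = """# Server configuration file
port=1111
#jetty.webserver.baseurl=http://192.168.0.11:8585/
"""

print(parse_conf(DBHOME))
print(parse_conf(SERVER))
```

Commented-out entries such as #DEF_HOME2 are skipped, so only the active DEF_HOME and port values are returned.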

3. J-ISIS theoretical limits

A single database managed by Berkeley DB can be up to 2^48 bytes, or 256 terabytes. Berkeley DB scales in terms of the amount of data it manages, the capabilities of the devices on which it runs, and the distance over which applications distribute data. The largest installations may reach petabytes.

Size of a field: In theory: 2^31 - 1 = 2147483647 bytes (~2 Gigabytes). In practice: heap size and end of virtual memory.

Size of a record: Record sizes up to two gigabytes. Frequently used data is cached in memory.

Number of occurrences of a field: In theory: 2^31 - 1 = 2147483647 (about 2 billion). In practice: end of virtual memory.

Number of fields of a record: In theory: 2^31 - 1 = 2147483647 (about 2 billion). In practice: end of virtual memory.

Number of lines of a FST: In theory: 2^31 - 1 = 2147483647 (about 2 billion). In practice: end of virtual memory.

Number of records in a database: The largest number a four-byte signed integer can hold is 2,147,483,647.
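The figures above can be checked with a few lines of Python. This is only an arithmetic check of the stated limits, using binary units (1 terabyte taken as 2^40 bytes, 1 gigabyte as 2^30 bytes), which is an assumption about how the figures were rounded.

```python
TIB = 2 ** 40             # bytes in one (binary) terabyte
GIB = 2 ** 30             # bytes in one (binary) gigabyte

max_db_bytes = 2 ** 48    # stated Berkeley DB single-database limit
int32_max = 2 ** 31 - 1   # largest four-byte signed integer

print(max_db_bytes // TIB)        # 256
print(int32_max)                  # 2147483647
print(round(int32_max / GIB, 2))  # 2.0
```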

4. Starting J-ISIS

The zip distribution contains two Windows launchers and a shell script to be used under Unix/Linux:

- J-ISIS is started from an executable launcher under Windows (jisis_suite.exe for 32-bit Windows and jisis64_suite.exe for 64-bit Windows).

- J-ISIS is started from a shell script under Unix/Linux (jisis_suite).

These three files are located in the jisis_suite/bin directory.

Double click on "jisis_suite.exe" or "jisis_suite64.exe", which are located in the "$JISIS_HOME/jisis_suite/bin" directory, to start J-ISIS.

You can also create a shortcut to "jisis_suite.exe" or "jisis_suite64.exe" and drag it to the desktop.

5. J-ISIS – Client/Server Application

A. The Application Main Window

This application main window is displayed each time J-ISIS is started.

[Figure: application main window, with callouts for the J-ISIS release, menu bar, tool bar and output window activation buttons]

The main components of this application main window are:
- the J-ISIS application title bar, at the top of the window;
- the menu bar, which provides access to all the J-ISIS functions (some of these functions may also be activated by clicking on the various buttons of the tool bar);
- the tool bar, located just under the menu, which provides quick mouse access to the most frequently used functions;
- the output window activation buttons (the output window shows application messages).

B. Output Console Window

The output console window is very important as it shows application messages that could be information messages on the status of an action or error messages.


Output Console

Clicking on the right mouse button will open the following context menu:

The "Clear" command is very useful when working on print formats in the PFT editor, for clearing the syntax error messages.

C. Database menu

This menu contains the following commands:

- Database Menu

i. Open Connection

J-ISIS is a Client/Server application that works as a database server as well as a client. When you start J-ISIS, you in fact start a J-ISIS database server listening on port 1111 by default.

As a client, you can connect either to the database server on the local machine (localhost) or to another machine that has J-ISIS running and therefore also acts as a J-ISIS database server. In that case, provide the IP address of that machine as "Host Name", "192.168.0.13" for example. You can get it by typing "ipconfig" in a command window on the server machine. The server machine is the computer where the databases are stored.

Important Note:

When you start J-ISIS, the first thing to do is to open a connection to the Database Server. There are two ways of opening a connection:

The first is by selecting one of the most recently used databases from the list displayed when selecting "Most Recently Used DB...";

The second is by selecting this command ("Open Connection..."). In that case J-ISIS will display the open connection dialog box.

ii. Open Database…

There are three ways you may open a database:
- The first is by selecting one of the most recently used databases from the list displayed when selecting "Most Recently Used DB...";
- The second is by selecting this command;
- The third is by clicking on the open database button on the toolbar.

When selecting a database from the most recently used database list, J-ISIS will: i) open the connection to the database server; ii) open the database; iii) open the Data Viewer displaying the 1st database record.

In the last two cases, the connection to the database server must already have been established, and J-ISIS will display the open database dialog box.

Double clicking on a database or selecting a database and clicking on Finish button will open the database and the Data Viewer displaying the 1st database record.

iii. New Database…

Using this command it is possible to create new J-ISIS databases with the Database Definition Wizard, which consists of 4 main steps:
- Definition of database name and selection of database home
- Definition of fields (FDT)
- Definition of a data entry worksheet (Default Worksheet)
- Definition of indexing rules (for searching) (FST)

iv. Most Recently Used DB...

Selecting this command will display the most recently used database names. Selecting one of these database names will open the database server connection as well as the database.

v. Re Index Database

This command rebuilds the index of the currently open database.

vi. J-ISIS Import Wizard

This command allows you to import data from external files recorded according to several standard formats for information interchange, such as ISO-2709, MarcXML, MODS and Dublin Core. When you select this command, J-ISIS provides the Import Wizard, which consists of 4 main steps:
- Select external file, encoding and format
- Import Options (Load, Merge or Update)
- Database name and structure (FDT, FST)
- Parameters (line length, subfield separator, Reformatting FST, Move Leader Info into 30XX fields)

vii. Export Database

This command allows you to extract a database, or a portion of it, normally for transmitting it to other users. You may also use this command to perform some reformatting of the records of a database and then use the import function to store the reformatted data into the original or a different database. When you select this command, J-ISIS will display the Export Wizard, which consists of 2 main steps:
- Select the output format (ISO-2709 / MarcXML / MODS / Dublin Core)
- Output Parameters

viii. Database Backup

NOT IMPLEMENTED!

Database backup and restore are fairly easy. To back up, copy the database folder with all its subdirectories to another location or into a zip file. To restore, delete the database folder to be replaced and copy the saved folder with all its subdirectories back in its place.

ix. Application Display Font Selection

This command allows you to select the font to be used for the language.

x. PrintSort

This command allows you to print the output of a given query and/or to print a selected range of records. You may sort the records by virtually any combination of fields and subfields. The field(s) by which the records are sorted may be used as headings in printing. When you select this command, J-ISIS displays the PrintSort window, which has two tabs: one to define the print parameters and another one to provide the specific sorting and page layout parameters you require for that particular print run.

xi. Close

This command closes the currently selected database. All associated windows, such as a search window, will also be automatically closed.

xii. Close all

This command closes all the currently open databases.

xiii. Print

xiv. Exit

This command terminates J-ISIS. All open databases will automatically be closed.

6. J-ISIS Database Server

J-ISIS is a Client/Server application that works as a database server as well as a client. When you start J-ISIS, you in fact start the J-ISIS client application. The J-ISIS client application must communicate with a J-ISIS database server listening on port 1111 by default. Requests are passed as messages to the database server and results are returned. Sockets are used to communicate between the client and the database server through TCP/IP.

The J-ISIS database server may be on the same machine or on a remote machine. The "Open Connection..." dialog lets you define the machine on which the J-ISIS database server is running (and also where the databases are stored).


Clicking on "Open Connection..." will open the dialog below. By default, "localhost" is proposed as the database server, which means that the J-ISIS database server and the desktop client application are on the same machine. You can connect to another machine running J-ISIS by providing the IP address of that machine. The databases and all related files are stored on the server machine. Enter "admin" and "admin" as "User" and "Password" respectively. User administration is not yet implemented and you should keep the default "admin"/"admin" values.
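Before opening a connection to a remote machine, it can be useful to check that something is listening on the database server port. The Python sketch below only tests whether a TCP connection can be established; it does not speak the J-ISIS message protocol, which is not described here. The function name server_reachable and the timeout value are illustrative assumptions; the default port 1111 comes from server.conf.

```python
import socket


def server_reachable(host="localhost", port=1111, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


if __name__ == "__main__":
    # Replace "localhost" with the server IP address, e.g. "192.168.0.13".
    print(server_reachable("localhost", 1111))
```

A False result means that no process accepted the connection within the timeout, for example because J-ISIS is not running on that machine or a firewall blocks the port.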

7. Opening Databases

Clicking on "Open Database…" will open the dialog below.

This dialog displays the list of databases defined under the root directory

“C:\jisis_suite XXXX\jisis_suite\home_example_db”

which is defined symbolically by DEF_HOME.

Opening a database is done by selecting a database in the list and clicking on the "Finish" button, or by double clicking on a database. In both cases, the database is opened and the Data Viewer module is launched for displaying the 1st database record.

You may define several root directories for the databases. Each root directory is defined by a "DEF_HOME" variable stored in the "$JISIS_HOME/jisis_suite/conf/dbhome.conf" file ($JISIS_HOME being the J-ISIS install directory).

All databases are defined in a sub-directory of the root directory

$JISIS_HOME/jisis_suite/conf/dbhome.conf

Content:

DEF_HOME=./home_example_db

#DEF_HOME2=D:\MyDb

The database root directory is defined as: $JISIS_HOME/jisis_suite/home_example_db

Eight J-ISIS databases are provided with this distribution (ASFAEX, cds, AUTOR, etc)

Select the "ASFAEX" database, for example, and click on the "Finish" button, or double click on ASFAEX.

The ASFAEX database is opened and displayed in the Data Viewer. You can see the databases that are opened by putting the mouse cursor on the "Databases Pool" tab on the left.

You can also see the connections that are established by putting the mouse cursor on the "Connections Pool" tab on the left.

When you open a database, J-ISIS checks several properties of the database, such as the FST and the indexes. For big databases (more than 50 000 records), it may therefore take a couple of seconds before the application responds again.

You will also see messages displayed in the Output Window, such as the FST and the validity of the print formats used in the FST. This Output window is quite useful for debugging and informing the user.

8. Visualizing Database Content

8.1 Data Viewer

Opening a database is done by selecting a database in the list and clicking on the "Finish" button, or by double clicking on a database. In both cases, the database is opened and the Data Viewer module is launched for displaying the 1st database record. J-ISIS records are displayed according to print format definitions written in the ISIS formatting language. The J-ISIS Data Viewer uses a web browser component that is XHTML, JavaScript and CSS compliant.

The "RAW" format is always available and used by default. It displays the record content as fields.

The user can change the default "RAW" format by specifying a user defined "RAW" format in the PFT Manager module. This can be achieved by opening the "RAW" format, writing a new format and saving it. To get the above display, the "RAW" format must be empty.

For example, we can copy and paste the HTML2 format:

We save it and click on the Data Viewer tab:


You can change the format to any format, "egbert" for example:

After clicking on egbert in the list, you will see the record formatted according to the "egbert" print format.

If you have Internet access, you can click on the links:

Clicking on the first link will bring you to the UNESCO home page.

You can study the ISIS Print Format "egbert" with the "PFT Manager" in the "Tools" menu. The PFT Manager can also be used to create, edit and test Print Formats.

8.2 DB Browser

The Database Browser provides a tabular view of the database, with records along the rows and fields along the columns.

o The first column contains the Master File Number (MFN) and is frozen horizontally.

o Cells can be enlarged by dragging the vertical separator line.

o The cell height allows 3 lines to be displayed.

o Clicking on a cell whose text is longer than what can be displayed provides a vertical scroll bar to view the whole cell.

o Columns (except the MFN) can be dragged and dropped.

Further developments and improvements under consideration:
o Selection of fields/columns to display
o Filtering
o Searching with highlighting
o Sorting
o Changing the display font
o Changing the display format (1 record per cell with RAW format)
o Printing / Print Preview

Text in Arabic can be aligned from right to left, and vice versa, by pressing Ctrl/Asterisk, using the asterisk key on the numeric pad.

8.3 Dictionary Browsing

This option lets you display and browse the indexed terms. J-ISIS displays information about the index and the indexed terms in a table with 4 columns that contain, respectively, the term index, the FST field identifier prefixed with an underscore character ( _ ) or the FST field name (if one is provided in the FST entry), the term value, and the frequency. Please note that the frequency is the number of records that contain the term, not the number of times the term occurs.

Terms are sorted by FST field identifier/name and term value. A selected Term value can be copied by clicking on the right mouse button and Ctrl/C Copy. It can then be pasted in the Search Form.

Taking as example the CDS database, the dictionary looks like this:

The "Quick Search" fields provide a quick way of searching for a term by typing its first characters in the Query field:

Typing "CE" or "ce", for example, will display only the terms that begin with "CE".

You can refine the search by typing more characters: adding the letter n will display only the terms that begin with "cen".

9. The Formatting Language

The J-ISIS formatting language follows the CDS/ISIS formatting language syntax, with some minor exceptions. The main difference is that the output produced by interpreting a format is formatted as HTML and displayed in the J-ISIS embedded browser, or written to a file as HTML in the PrintSort module, although it can also be printed as plain text.

The formatting language allows you to define precise formatting requirements for data base records. Through this language you may select one or more specific data elements in the order you want and optionally insert constant text of your choice, e.g. to label some or all the fields, as well as specify vertical and horizontal spacing requirements. A collection of formatting commands in the language described in this chapter is called a format. In general a format defines a subset of your data base record, which may then be used by J-ISIS to perform a given function. Although formats are primarily used to specify the way records are displayed on the screen or on file, they are also widely used throughout the system any time you need to perform specific operations on one or more data elements. For example, in a Field Select Table (FST), you use a format to specify to which data a given indexing technique is applied. The formatting language is therefore the core of many J-ISIS operations and an efficient use of J-ISIS requires a thorough knowledge of this technique. All formatting commands may be entered in upper or lower case or a combination of the two.

To a novice, some formats may appear to be very complex, suggesting that the formatting language itself is complex. In fact, all formats, even the most complicated ones, are made up of one or more simple commands or statements, separated by commas or spaces. The apparent complexity stems from the fact that there may be many of these commands in a format. Thus the key to understanding formats is to analyse each command individually. J-ISIS offers a format editor with syntax highlighting and checking of the displayed output result when applying the format to records. Although all formats are defined using the same formatting language, they may be categorized, depending on their intended use, as follows:

Display formats: used for displaying records on the screen or printing records on external files (in the latter case they are referred to as print formats);

Extraction formats: used in FSTs to define the data to be indexed.

When J-ISIS processes a format, it works with three objects: a data base record, the format and a work area where the output produced by the format is stored. The commands are executed sequentially in the order they are listed in the format. Some commands produce data (e.g. the contents of a given field), while others produce actions (such as skipping to a new line, leaving one or more blank lines, etc.). The data produced are stored as lines of text in the work area, which is then passed to the relevant program for further processing, e.g. for display.

When designing a format to display data, all features offered by HTML (Hyper Text Markup Language) and CSS (Cascading Style Sheets) can be used by inserting literals with the appropriate HTML markup tags and CSS to style the HTML elements (http://www.w3schools.com/html/default.asp). For example, the following format: mhl,('


'v70'

') applied to the cds database record with MFN 1 will display in the PFT Manager as follows:

Unless otherwise noted, all the formatting examples in the following sections refer to the sample record given below, where the content of each field is given as actually stored in the record. This record is taken from the sample CDS data base contained in the original CDS/ISIS installation as supplied by UNESCO.

MFN = 4

Tag   Contents
24    Electric hygrometer apparatus for measuring water-vapour loss from plants in the field
26    ^aParis^bUnesco^c1965
30    ^ap. 247-257^billus.
44    Methodology of plant eco-physiology: proceedings of the Montpellier Symposium
50    Incl. Bibl.
      Paper On:
70    Grieve, B.J.
70    Went, F.W.

A. Field Selectors

Field selectors are commands used to extract a specific field or subfield from a record. A special command allows you to extract the MFN of the record, even though the MFN is not, properly speaking, a field (the MFN has no tag and is not defined in the FDT).

1. Field command

To extract a field from a record, code the letter V followed by the tag of the field to be extracted. The V (a mnemonic code for Variable length field) is the command telling J-ISIS that you want to extract a field. It may be entered in either upper or lower case. Below are some examples.

Format   Output
v24      Electric hygrometer apparatus for measuring water-vapour loss from plants in the field
v26      ^aParis^bUnesco^c1965
v30      ^ap. 247-257^billus.
v44      Methodology of plant eco-physiology: proceedings of the Montpellier Symposium

2. Subfield command

To extract a particular subfield from a given field, just append the corresponding subfield delimiter to the tag, as shown below. Note that you may use the special subfield delimiter ^* to select the first subfield, whichever this may be. In this case the first subfield need not be preceded by an actual subfield delimiter. Note that, as alphabetic subfield delimiters are case insensitive, you may enter the subfield delimiter code in either upper or lower case.

Format   Output
v26^a    Paris
v26^b    Unesco
v30^a    p. 247-257
V26^*    Paris
v44^*    Methodology of plant eco-physiology: proceedings of the Montpellier Symposium
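The subfield rules above can be illustrated with a short Python sketch. It is not the J-ISIS implementation: the function name subfield is hypothetical, and the sketch assumes that a subfield delimiter is always the character ^ followed by a single code character.

```python
import re


def subfield(field, code):
    """Return the content of subfield `code` of a stored field value.

    code is a single character such as 'a', or '*' for the first subfield.
    """
    if code == "*":
        # First subfield, whether or not the field starts with a delimiter.
        body = field[2:] if field.startswith("^") else field
        return body.split("^", 1)[0]
    for match in re.finditer(r"\^(.)([^^]*)", field):
        # Alphabetic subfield codes are case insensitive.
        if match.group(1).lower() == code.lower():
            return match.group(2)
    return ""  # subfield absent


imprint = "^aParis^bUnesco^c1965"
print(subfield(imprint, "a"))   # Paris
print(subfield(imprint, "B"))   # Unesco
print(subfield(imprint, "*"))   # Paris
```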


3. Extracting a fragment of a field or subfield

You may need, in some cases, to extract a portion of a field which is not a subfield, particularly when the field has a fixed format throughout a data base (e.g. a standardized date of the form YY-MM-DD). You may do this by appending the offset/length command immediately after the field or subfield command to which it applies. This command may be coded *offset.length or *offset or .length, where:
- *offset indicates the position of the first character to be extracted from the field or subfield (character positions are counted from zero, i.e. the first character is at position 0, the second at position 1, etc.); if omitted, J-ISIS assumes the offset to be zero;
- .length indicates the number of characters to be extracted; if omitted, the remainder of the field (or subfield) starting from offset will be taken.

Some examples of this command are given below, where it is assumed that the sample record also contains a field 1 as follows:

99-Nov-05

Format              Output
V1*3.3              Nov
V1.2                99
V1*7                05
V1*7,v1*2.4         05-Nov
V1*7,v1*2.5,v1.2    05-Nov-99
v26.3               ^aP
v26^b*2.4           esco

Note, in the last two examples, the difference in handling a subfielded field: when this is referenced as a field (e.g. v26), offset zero represents the first actual position of the field, whereas when a specific subfield is selected (e.g. v26^b), offset zero represents the first data character after the subfield delimiter.
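The offset/length command behaves like a string slice counted from zero. The Python sketch below reproduces the examples of the table; the function name fragment is hypothetical and the sketch is only an illustration of the rule, not the J-ISIS code.

```python
def fragment(text, offset=0, length=None):
    """Emulate *offset.length: start at `offset`, take `length` characters.

    A missing offset means 0; a missing length means "to the end".
    """
    end = None if length is None else offset + length
    return text[offset:end]


date = "99-Nov-05"                                  # field 1 of the sample record
print(fragment(date, 3, 3))                         # V1*3.3      -> Nov
print(fragment(date, length=2))                     # V1.2        -> 99
print(fragment(date, 7))                            # V1*7        -> 05
print(fragment(date, 7) + fragment(date, 2, 4))     # V1*7,v1*2.4 -> 05-Nov

# Field referenced as a whole: offset 0 is the first stored character.
print(fragment("^aParis^bUnesco^c1965", length=3))  # v26.3       -> ^aP
# Subfield ^b selected first: offset 0 is its first data character.
print(fragment("Unesco", 2, 4))                     # v26^b*2.4   -> esco
```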


4. Field occurrences

It is possible to access individual occurrences of a repeatable field by specifying the occurrence number or range, enclosed in square brackets, immediately following the field selector. For example:

v10[1]       retrieves the first occurrence of field 10
v10[2..4]    retrieves the 2nd through the 4th occurrence of field 10
v10[3..]     retrieves the 3rd through the last occurrence of field 10
v10[1]^a     retrieves subfield ^a in the 1st occurrence of field 10
v10[2]*4.3   retrieves a fragment of the 2nd occurrence of field 10

It is coded as [first[..last]], where first and last refer to the first (or only) and last occurrences to be extracted, respectively. If first is greater than the actual number of occurrences, no output is generated. The same happens if the field is not repeatable and first is set to 2 or more. However, if first is set to 1 and the selector is applied to a non-repeatable field, the content is output normally. This component must be used outside a repeatable group; otherwise, it is ignored. If the double dot (..) is used and last is missing, LAST is assumed. The LAST keyword takes the value of the total number of occurrences of the field.

Examples:

Format                       Output
V70[2]+|; |                  Wynter, Hector
V70[2..5]+|; |               Wynter, Hector; Faure, Edgar
V70[1..]|; |                 Jóború, Magda; Wynter, Hector; Faure, Edgar;
v70[1..3]|; |                Jóború, Magda; Wynter, Hector; Faure, Edgar;
V70[3]                       Faure, Edgar
"AUTHORS: "v70[2..]|; |      AUTHORS: Wynter, Hector; Faure, Edgar;

Please note that the example record above has MFN 137 in database CDS.
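The occurrence selection rules can be sketched in Python as a slice over the list of occurrences. The function name select and the LAST constant are hypothetical names chosen for this illustration; literals and repeatable groups are left out.

```python
LAST = "LAST"  # stands for the total number of occurrences


def select(values, first, last=None):
    """Return occurrences first..last (numbered from 1) of a field.

    last=None selects the single occurrence `first`;
    last=LAST selects everything from `first` to the end.
    """
    if last is None:
        last = first
    elif last == LAST:
        last = len(values)
    # Slicing yields an empty list when `first` exceeds the occurrence count.
    return values[first - 1:last]


authors = ["Jóború, Magda", "Wynter, Hector", "Faure, Edgar"]  # field 70, MFN 137
print(select(authors, 2))        # ['Wynter, Hector']
print(select(authors, 2, 5))     # ['Wynter, Hector', 'Faure, Edgar']
print(select(authors, 1, LAST))  # all three occurrences
print(select(authors, 4))        # []
```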

5. Indentation command

When J-ISIS executes a field or subfield command it will normally output the field contents at the current line position, which depends on the last command executed. If the field cannot be fully contained in the current line, J-ISIS will create as many additional lines as required. Normally the continuation lines begin at line position 1. You may alter this by providing an indentation command, which must immediately follow the field (or subfield) command. The indentation command is coded (f,c) or (f), where:

f indicates the number of spaces to be left from the left margin before formatting the first (or only) line of the field. It is only effective if the field is formatted at the beginning of a line, otherwise it is ignored;

c indicates the number of spaces to be left from the left margin before formatting all continuation lines of a field formatted on more than one line.

A value of zero may be specified for either f or c. If only f is needed, you may omit c (J-ISIS will supply zero by default). However, if c is required you must also specify f. Some examples are given below.

Format: V44
Output:
Methodology of plant eco-physiology: proceedings of the
Montpellier Symposium

Format: V44(10)
Output:
          Methodology of plant eco-physiology: proceedings
of the Montpellier Symposium

Format: V44(5,9)
Output:
     Methodology of plant eco-physiology: proceedings of
         the Montpellier Symposium

Format: V44(0,8)
Output:
Methodology of plant eco-physiology: proceedings of the
        Montpellier Symposium
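The effect of (f,c) can be reproduced with the Python standard library textwrap module. This is a sketch only: the function name indent_field is hypothetical, the line width of 58 characters is an assumption chosen because it is consistent with the line breaks of the examples, and the field is assumed to start at the beginning of a line.

```python
import textwrap


def indent_field(text, f=0, c=0, width=58):
    """Wrap `text`, indenting the first line by f and continuation lines by c."""
    return textwrap.fill(
        text,
        width=width,                 # assumed line width, not stated in the manual
        initial_indent=" " * f,
        subsequent_indent=" " * c,
        break_on_hyphens=False,      # keep words such as eco-physiology whole
    )


title = ("Methodology of plant eco-physiology: proceedings of the "
         "Montpellier Symposium")    # field 44 of the sample record

for f, c in [(0, 0), (10, 0), (5, 9), (0, 8)]:
    print(indent_field(title, f, c))
    print()
```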


6. MFN command

To extract the MFN of a record, code the following: MFN or MFN(d), where d is the number of digits to be displayed. If (d) is omitted, 6 digits will be displayed by default.

Format   Output
MFN      000004
MFN(3)   004
MFN(2)   04
MFN(1)   4

Note that you may use the F function (see under F(expr-1,expr-2,expr-3)) to suppress the leading zeros.
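The MFN command amounts to zero padding to a fixed number of digits. A minimal Python sketch follows; the function name mfn is hypothetical, and the behaviour when the MFN has more digits than d is not described above, so this sketch simply leaves the number unchanged in that case.

```python
def mfn(number, digits=6):
    """Format an MFN with leading zeros, as MFN or MFN(d) would."""
    # zfill pads on the left with zeros and never truncates (an assumption here).
    return str(number).zfill(digits)


print(mfn(4))     # 000004
print(mfn(4, 3))  # 004
print(mfn(4, 2))  # 04
print(mfn(4, 1))  # 4
```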


B. Mode command

J-ISIS may display data in three different modes:

proof mode: in this mode, fields are displayed exactly as they are stored in the record. Note that CDS/ISIS does not insert any separator between fields or occurrences of a repeatable field. It is therefore your responsibility to ensure adequate separation of fields by using spacing commands, literals or repeatable groups as appropriate (see under "Horizontal and vertical spacing commands", "Literals", and "Repeatable groups"). This mode is normally used to display records for proofreading purposes.

heading mode: this mode is normally used for headings when printing catalogues and indexes. All control characters embedded in the data, such as filing information (see under "Filing information") and descriptor delimiters (< and >) are ignored (except as noted below), whereas subfield delimiters are replaced by punctuation (see below).

data mode: this mode is similar to heading mode, but, in addition, each field is automatically suffixed with a full stop (.) followed by two spaces (or just two spaces if the field already ends with a punctuation mark). The repeatable field separator is replaced with a full stop and a space (. ) or just a space if the occurrence already ends with a punctuation mark. Note, however, that this automatic punctuation is suppressed if the field selector is followed by a suffix-literal (see under "Literals").

When CDS/ISIS formats a subfielded field in heading or data mode it will automatically replace embedded subfield delimiters by punctuation marks (the initial subfield delimiter, if any, is always ignored). Furthermore, the special character combination "> <" is replaced by "; ", thus providing a simple way to format fields containing lists of key phrases enclosed in triangular brackets (and saving keystrokes during data entry). The standard subfield delimiter replacement table provided is as follows:

^a                 replaced by "; "
^b through to ^i   replaced by ", "
all others         replaced by ". "

A mode command is coded Mmc, where:

m specifies the mode as follows:
P   proof mode
H   heading mode
D   data mode

c specifies case translation as follows:
U   data are converted to upper case
L   data are left unchanged

A mode command may appear as many times as necessary in a format, each remaining in effect until it is changed by a subsequent one. In the absence of an explicit mode command, CDS/ISIS will use MPL by default (proof mode, no upper case conversion). Examples of mode commands are given in the table below.

Format    Output
mpl,v24   Electric hygrometer apparatus for measuring water-vapour loss from plants in the field
mhl,v24   An Electric hygrometer apparatus for measuring water-vapour loss from plants in the field
mdl,v24   An Electric hygrometer apparatus for measuring water-vapour loss from plants in the field.
mdu,v24   AN ELECTRIC HYGROMETER APPARATUS FOR MEASURING WATER-VAPOUR LOSS FROM PLANTS IN THE FIELD.
mpl,v26   ^aParis^bUnesco^c1965
mhl,v26   Paris, Unesco, 1965
mdu,v26   PARIS, UNESCO, 1965.
mdl,v69   Paper on: Paper on: hygrometers; plant transpiration; moisture; water balance.
mdl,v70   Grieve, B.J. Went, F.W.
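The subfield delimiter replacement performed in heading and data mode can be sketched in Python as follows. The function names heading_mode and data_mode are hypothetical, the set of characters treated as "punctuation marks" is an assumption, and filing information and descriptor delimiters (< and >) are not handled in this sketch.

```python
import re


def heading_mode(field):
    """Replace embedded subfield delimiters by punctuation (heading mode)."""
    text = re.sub(r"^\^.", "", field)  # the initial delimiter is ignored

    def punctuation(match):
        code = match.group(1).lower()
        if code == "a":
            return "; "
        if "b" <= code <= "i":
            return ", "
        return ". "

    return re.sub(r"\^(.)", punctuation, text)


def data_mode(field):
    """Heading mode plus a full stop and two spaces after the field."""
    text = heading_mode(field)
    # Assumed punctuation set; the manual does not list the marks.
    if text[-1:] in ".,;:!?":
        return text + "  "
    return text + ".  "


imprint = "^aParis^bUnesco^c1965"
print(heading_mode(imprint))       # mhl,v26 -> Paris, Unesco, 1965
print(data_mode(imprint).upper())  # mdu,v26 -> PARIS, UNESCO, 1965. (plus two spaces)
```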

C. Horizontal and vertical spacing commands

The formatting language provides five commands to control horizontal and vertical spacing: Xn, Cn, /, # and %. They are described below.

The Xn command inserts n spaces before formatting the next data. However, if fewer than n positions are available on the current line, CDS/ISIS will simply skip to a new line. Thus, for example, if the next available position on the current line is 77 and the defined line width is 80, the execution of the command X7 will cause the next data to be formatted at the beginning of the next line (and not at the third position of the next line). (Not supported in WinISIS graphical mode, where it has the same effect as TAB.)

The Cn command causes the next data to be formatted starting from position n of the current line. If the current line position is greater than n, then the next data will be formatted starting on position n of the following line. This facility allows you to produce tabular output. Note that if n is greater than the line width, the command is ignored.

The / command is similar to a carriage return on a typewriter, i.e. it forces a new line and causes therefore the next data to be formatted at the beginning of a line. However, unlike a carriage return, multiple adjacent / commands, although syntactically correct, have the same effect as a single / command, i.e. a / will never produce blank lines.


The # command is provided for this purpose. It performs the same function as the /, but the skipping to a new line is unconditional. Thus you may use the combination / # to ensure that one (and only one) blank line will appear on the output (note that the combination ## may cause one or two blank lines to be inserted depending on whether the line being formatted when the first # is executed is empty or not). The use of the # command may cause a problem in those cases where the fields selected may be absent. This situation is best illustrated through the following example:

/#V10/#V20/#V30

If all fields are present in the record, the result will be that fields 10, 20 and 30 will each start on a new line and be preceded by a single blank line. However, if field 20 is missing there will be two blank lines between field 10 and field 30. This may be undesirable: if, in fact, what you want is a single blank line between each field regardless of the presence or absence of some of the fields, then the above format will not produce the desired results. The % command is provided to solve this problem. Its effect is to suppress all contiguous blank lines (if any), existing between the current line and the last non-blank line, at the time this command is executed. Thus the following format:

%##V10%##v20%##v30 ......

will produce one and only one blank line between each field even when one or more of them are absent from a given record. A line with absent fields and an unattached Cn or Xn command is not empty; dummy field selectors may be used to prevent the output of spaces (see E. Dummy field selectors).

D. Literals

A literal is a string of characters, enclosed between appropriate delimiters, which will be inserted as is in the output. Literals may be used, for example, to label fields. Three types of literals may be specified:

conditional literals: define text which will be output only if the associated field is present in the record. If the associated field selector is a subfield command (e.g. v24^a), the text will be included only if the requested subfield is present in the field. If the associated field selector specifies a repeatable field, the text will only be included once, regardless of the number of occurrences of the field. Conditional literals are enclosed in double quotation marks ("), e.g. "Title: "

repeatable literals: like conditional literals, they define text to be output only if the associated field or subfield is present in the record. If the field is repeatable, however, the literal will be repeated for each occurrence of the field. Repeatable literals are enclosed in vertical bars (|) e.g. |Author: |

unconditional literals: define text which is always output regardless of the presence of fields. Unconditional literals are enclosed in single quotes ('), e.g. 'Summary'. As unconditional literals are always output as a single block of text (i.e. they cannot be split between two lines), their length should not exceed the line width otherwise they will be truncated. To output text exceeding one line, you should break it down into two or more literals. You may also provide any required indentation by using the Cn command.

Note that a literal must escape its own delimiter if that delimiter is part of the literal content, e.g. an unconditional literal can contain a single quote if it is escaped as \' (it may, however, contain a double quote and/or a vertical bar without escaping).

Conditional and/or repeatable literals are associated with a field or a subfield by their position in the format: literals preceding a field selector (also called prefix-literals) will be output before the field contents, whereas literals following the field selector (also called suffix-literals) will be output after the field contents. If a repeatable prefix-literal is immediately followed by a "+" sign (e.g. |xxx|+) it will be output before all but the first occurrence of the field. If a repeatable suffix-literal is immediately preceded by a "+" sign (e.g. +|xxx|) it will be output after all but the last occurrence of the field.


Repeatable prefix-literals and all suffix-literals are formatted as if they were physically part of the associated field contents, and obey therefore the field indentation command, if any. Conditional prefix-literals do not inherit the field indentation (you may however use the Cn command to provide indentation, if required). A given field may be associated with more than one literal. In this case the various literals must be specified following the rules and the order given below:

Prefix-Literals

1. One or more conditional prefix-literals. A conditional prefix-literal may be followed by other conditional prefix-literals, vertical and horizontal spacing commands, and mode commands. All commands between the first conditional prefix-literal and the associated field selector become themselves conditional and will only be executed if the field is present, otherwise they will be ignored.

2. One, and only one, repeatable prefix-literal. This, if present, must immediately precede the associated field selector.

Suffix-Literals

1. One, and only one, repeatable suffix-literal. This, if present, must immediately follow the associated field selector.

2. One, and only one, conditional suffix-literal. If present, it must immediately follow the repeatable suffix-literal, if any, or the associated field selector.

3. Suffix-literals must not be separated by commas, and there must be no comma between the field selector and the first suffix-literal: a comma signifies the end of the suffix-literals associated with a given field selector.

Null literals (i.e. zero-length literals such as "" or ||) are allowed, and may be used, for example, as prefix-literals, to provide conditional vertical spacing or, as suffix-literals, to temporarily suppress the automatic final punctuation that CDS/ISIS supplies when data mode is active. Unlike in CDS/ISIS for DOS, in J-ISIS (as in WinISIS) literals will not honour upper case translation if set by a preceding mode command. Examples of the different types of literals are given below:

Format:  'MFN: ',mfn(3)/mdl,"Title: "v24(0,7)
Output:  MFN: 004
         Title: An Electric hygrometer apparatus
                for measuring water-vapour loss from plants in the field.

Format:  'MFN: ',mfn(3)/mdl,"Title: ",mdu,v24(0,7)
Output:  MFN: 004
         Title: AN ELECTRIC HYGROMETER APPARATUS
                FOR MEASURING WATER-VAPOUR LOSS FROM PLANTS IN THE FIELD.

Format:  'MFN: ',mfn(3)/mdu,"Title: ",v24(0,7)
Output:  MFN: 004
         Title: AN ELECTRIC HYGROMETER APPARATUS
                FOR MEASURING WATER-VAPOUR LOSS FROM PLANTS IN THE FIELD.

Format:  v70
Output:  Grieve, B.J.Went, F.W.

Format:  v70|; |
Output:  Grieve, B.J.; Went, F.W.;

Format:  v70+|; |
Output:  Grieve, B.J.; Went, F.W.

Format:  |; |v70
Output:  ; Grieve, B.J.; Went, F.W.

Format:  |; |+v70
Output:  Grieve, B.J.; Went, F.W.

Format:  "Authors"/v70(3,3)+|; |
Output:  Authors
            Grieve, B.J.; Went, F.W.

Format:  "(by: "v70+|; |")"
Output:  (by: Grieve, B.J.; Went, F.W.)

Format:  mdl,v26
Output:  Paris, Unesco, 1965.

Format:  mdl,v26""
Output:  Paris, Unesco, 1965

Format:  mdl,v26,""/#v99,v30^a
Output:  Paris, Unesco, 1965. p. 247-257.

Format:  mdl,v26,""/#v44|: | ,v30^a
Output:  Paris, Unesco, 1965.

         Methodology of plant eco-physiology: proceedings of the Montpellier Symposium: p. 247-257.
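The rules for repeatable literals and the "+" sign can be illustrated outside J-ISIS. The Python sketch below is only a model of the behaviour described above: render_field is a hypothetical helper, and a repeatable field is represented as a list of occurrence strings.

```python
def render_field(occurrences, prefix="", suffix="",
                 prefix_plus=False, suffix_plus=False):
    """Join the occurrences of a field with repeatable literals.

    prefix_plus models |xxx|+ (prefix skipped before the first occurrence);
    suffix_plus models +|xxx| (suffix skipped after the last occurrence).
    """
    out = []
    last = len(occurrences) - 1
    for i, occurrence in enumerate(occurrences):
        if prefix and not (prefix_plus and i == 0):
            out.append(prefix)
        out.append(occurrence)
        if suffix and not (suffix_plus and i == last):
            out.append(suffix)
    return "".join(out)


def with_conditionals(body, before="", after=""):
    """Conditional literals are output once, and only if the field produced data."""
    return before + body + after if body else ""


authors = ["Grieve, B.J.", "Went, F.W."]
print(render_field(authors))                                  # v70
print(render_field(authors, suffix="; "))                     # v70|; |
print(render_field(authors, suffix="; ", suffix_plus=True))   # v70+|; |
print(render_field(authors, prefix="; "))                     # |; |v70
print(render_field(authors, prefix="; ", prefix_plus=True))   # |; |+v70
print(with_conditionals(render_field(authors, suffix="; ", suffix_plus=True),
                        before="(by: ", after=")"))           # "(by: "v70+|; |")"
```

Indentation, mode commands and the automatic punctuation of data mode are deliberately left out of this sketch.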


E. Dummy field selectors

A dummy field selector allows the conditional output of a literal based on the presence or absence of a given field or subfield without printing the contents of the associated field. Dummy field selectors are coded as follows:

Dt or Dt^x or Nt or Nt^x where:

D or N identifies this as a dummy field selector. D indicates that all associated conditional literals must be printed only if the field is present. N indicates that they must be output only if the field is absent;

t is the tag of the field controlling the output of literals;

^x is an optional subfield delimiter code. When given, it indicates that the output of literals is controlled by the presence or absence of the specified subfield (note, however, that the absence of a field also implies the absence of a specific subfield of that field).

A dummy field selector is normally preceded by at least one conditional prefix-literal (which may be null), possibly followed by one or more other conditional prefix-literals, vertical and horizontal spacing commands, mode commands and/or escape commands. Dummy field selectors may not have suffix-literals. A repeatable prefix-literal is allowed.

A dummy field selector may be used to avoid the unexpected blank lines which may be output in the presence of unattached spacing commands. The required spacing command is attached to a dummy field selector as a prefix- literal and preceded by a null literal, e.g. ""CnDt [where t is field tag for a previously formatted field]

Some examples of these commands are given below:

Format                            Output
"[Only in English]"n76            [Only in English]
"(Anon.)"n70,v70+|; |             Grieve, B.J.; Went, F.W.
"(Anon.)"n80,v80+|; |             (Anon.)
"[Conference paper]"d44           [Conference paper]
"[no date]" n26^c,v26^c           1965
"[no date]"n27^c,v27^c            [no date]
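The D and N selectors amount to a presence test that controls a literal. The following Python sketch models that test only; the function names and the record layout (a dict of tag to subfield dict, with "" standing for a field without subfields) are assumptions made for the illustration, as are the subfield codes a and b of field 26.

```python
def present(record, tag, subfield=None):
    """True if the field (and, when given, the subfield) exists in the record."""
    if tag not in record:
        return False  # an absent field implies that its subfields are absent too
    return subfield is None or subfield in record[tag]


def dummy(kind, record, tag, literal, subfield=None):
    """Model Dt / Dt^x (kind 'D') and Nt / Nt^x (kind 'N')."""
    is_present = present(record, tag, subfield)
    if kind.upper() == "D":
        return literal if is_present else ""
    if kind.upper() == "N":
        return "" if is_present else literal
    raise ValueError("kind must be 'D' or 'N'")


# Hypothetical record: field 26 has subfields a, b and c; fields 27, 76 and 80 are absent.
record = {
    26: {"a": "Paris", "b": "Unesco", "c": "1965"},
    44: {"": "Methodology of plant eco-physiology"},
    70: {"": "Grieve, B.J."},
}

print(dummy("N", record, 76, "[Only in English]"))    # field 76 absent: literal is output
print(dummy("D", record, 44, "[Conference paper]"))   # field 44 present: literal is output
print(repr(dummy("N", record, 26, "[no date]", "c"))) # subfield present: nothing is output
print(dummy("N", record, 27, "[no date]", "c"))       # field absent: literal is output
```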


F. Expressions

The formatting language allows you to evaluate values and/or compare values through the use of expressions. Expressions are constructs that, when executed, return a value. This value may be a string of characters (e.g. the contents of a given field or a literal), in which case the expression is called a string expression; a number, in which case the expression is called numerical; or it may be a truth value (True or False), in which case the expression is called Boolean.

J-ISIS also provides a set of functions, which, on the basis of arguments you provide, perform a specific process and return a value. Functions returning a number are called numerical functions; those returning a string of characters are called string functions; and those returning a truth value are called boolean functions.

Only string functions may be used directly as formatting commands. Numerical expressions may be used in boolean expressions or as arguments of functions. Boolean expressions and boolean functions may only be used in the context of an IF command.

1. Numerical expressions

Numerical expressions are formed with operands which have a numerical value and operators specifying the calculations to be performed. The operands you may use in a numerical expression are as follows:

numerical constants: such as 5 18 98.65; numerical constants may be represented as optionally signed integers, decimal numbers, or in scientific exponent notation, e.g. 1.5E5 (meaning 1.5 times 10 to the power 5, i.e. 150000);

numerical functions: such as val(v10) (these are described under "Numerical functions");

MFN: the value of the MFN of a record;

numerical expressions: when used as an operand, an expression must be enclosed in parentheses, for example (val(v20)-5).

The available operators are:

+   addition (or unary +);
-   subtraction (or unary -);
*   multiplication;
/   division.

As in normal algebra, in the absence of parentheses, unary operators are executed first and multiplications and divisions are performed before additions and subtractions. A series of two or more operators at the same level are executed from left to right. You may use parentheses to alter this order of evaluation: expressions enclosed in parentheses are evaluated first and, inner parenthetical expressions are evaluated before outer expressions.

Note that, as field selectors (e.g. v10 or v20^a) yield a string of text, they cannot be used as operands in a numerical expression. The VAL function, however, may be used to convert the contents of a field or subfield to a numerical value.

Likewise, a numerical expression cannot be displayed directly but must first be converted to a character string using the F function. Examples of numerical expressions are given below (where it is assumed that MFN=10, v1^a=10, v1^b=20 and v2=30):

Expression                     Value
0.155e+3                       155
1e-3                           0.001
2*3+9                          15
2*(3+9)                        24
10-(4*(2-1))                   6
15*0.001                       0.015
mfn+100                        110
val(v2)+val(v1^a)*7.5          105
(val(v1^a)-val(v1^b))/100      -0.1


Note that in the third expression it was necessary to put blanks around the asterisk for it to be recognized as the multiplication symbol.
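The evaluation order described above is the same as in ordinary arithmetic, so the table can be cross-checked in any language with the usual operator precedence. The Python sketch below does this under the stated assumptions (MFN=10, v1^a=10, v1^b=20, v2=30); the dictionary record and the simplified val helper are hypothetical stand-ins for field selectors and for the VAL function.

```python
# Values assumed in the examples: MFN=10, v1^a=10, v1^b=20, v2=30.
mfn = 10
record = {"v1^a": "10", "v1^b": "20", "v2": "30"}


def val(selector):
    """Simplified VAL: field contents are text and must be converted before use."""
    return float(record[selector])


results = {
    "2*3+9": 2 * 3 + 9,                    # multiplication before addition
    "2*(3+9)": 2 * (3 + 9),                # parentheses are evaluated first
    "10-(4*(2-1))": 10 - (4 * (2 - 1)),    # inner parentheses before outer ones
    "mfn+100": mfn + 100,
    "val(v2)+val(v1^a)*7.5": val("v2") + val("v1^a") * 7.5,
    "(val(v1^a)-val(v1^b))/100": (val("v1^a") - val("v1^b")) / 100,
}

for expression, value in results.items():
    print(f"{expression:28} {value:g}")
```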

2. String expressions

String expressions are formed with operands which are strings of characters. As J-ISIS provides no explicit string operator, a string expression always consists of a single operand which may be one of the following:

unconditional literals: such as 'some text';

field selectors: which may include an offset/length command (e.g. v26^c*2.2);

string functions: such as S(v24,v25,v26) (these are described under "String functions");

string variables: such as s1:='ISIS'.

3. Boolean expressions

Boolean expressions are used to determine whether a set of one or more conditions is true or false, and evaluate to a truth value. The operands of a boolean expression can be one of the following:

relational expressions: which compare two values and determine whether a given relationship exists (see below), such as MFN<10;

boolean functions: such as p(v24), which return a truth value (these are discussed under "Boolean functions").

Relational expressions allow you to determine whether a certain relationship between two values holds. The general form of a relational expression is:

expression-1 relational-operator expression-2

where:

expression-1 is either a numerical or a string expression;

relational-operator is one of the following:

=    Equal to
<>   Not equal to
<    Less than
<=   Less than or equal to
>    Greater than
>=   Greater than or equal to
:    Contains (may only be used for string expressions)

expression-2 is an expression of the same type as expression-1, i.e. expression-1 and expression-2 must be either both numerical or both string expressions.

The relational operators = <> < <= > >= have their normal meaning when applied to numerical expressions (within the limits of the precision of numerical values defined under "Numerical expressions"). When comparing string expressions, the following rules apply:

except for the : (Contains) operator, strings are compared exactly as they occur, i.e. upper case and lower case letters are compared according to their corresponding UNICODE character code (e.g. A will compare less than a);

two string expressions are not considered equal unless they have the same length. If two expressions yielding strings of different length are such that they are character for character equal up to the length of the shorter one, then the shorter string is considered to be less than the longer one.


The : (contains) operator searches for a string of characters (defined by expression-2) in another string (defined by expression-1). If the second operand occurs anywhere in the first operand the result is True. This operator is case insensitive: lower case letters are considered to be equal to the corresponding upper case letters. For example, the result of:

v10 : 'chemis'

would be True if, and only if, field 10 contains the string chemis, otherwise the result would be False. Note that the second operand may be any arbitrary string of characters and need not be an actual word. Thus, in the example above, the result would be True not only if field 10 contained the word chemist but also if it contained chemistry, biochemistry, photochemistry, etc.

The operands of a Boolean expression may be combined with the following Boolean operators:

NOT this operator produces the value True if its operand is False, and the value False if its operand is True. The NOT operator may only be used as a unary operator, i.e. it is always applied to the Boolean expression which follows it;

AND this operator produces the value True if both operands are True. If either operand is False then the result is False;

OR this operator performs an inclusive-OR operation. The result is True if either or both operands are True, otherwise it is False.

In evaluating Boolean expressions, and in the absence of parentheses, CDS/ISIS will execute NOT operations first, then AND operations before OR operations. Series of two or more operators at the same level are executed from left to right. You may use parentheses to alter this order of evaluation: expressions enclosed in parentheses are evaluated first and, inner parenthetical expressions are evaluated before outer expressions.

Examples of boolean expressions are given below:

Expression                            Value
Mfn=4                                 True
not mfn=4                             False
not (not mfn=4)                       True
v24='plants'                          False
v24: 'plants'                         True
v24: 'PLANTS'                         True
v44.6='method'                        False
v44.6='Method'                        True
v24: 'plants' and v44: 'method'       True
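The comparison rules and the boolean examples can be reproduced with ordinary string operations. The Python sketch below is a model under assumptions: the record is taken to have MFN 4, field 24 and field 44 hold the title strings shown in the literal examples earlier in this chapter, contains is a hypothetical helper for the : operator, and v44.6 is read as the first six characters of field 44.

```python
def contains(text, fragment):
    """Model of the ':' operator: case-insensitive substring test."""
    return fragment.casefold() in text.casefold()


# Assumed record contents (taken from the literal examples in this chapter).
mfn = 4
v24 = ("An Electric hygrometer apparatus for measuring water-vapour loss "
       "from plants in the field.")
v44 = "Methodology of plant eco-physiology: proceedings of the Montpellier Symposium"

examples = {
    "mfn=4": mfn == 4,
    "not mfn=4": not mfn == 4,
    "not (not mfn=4)": not (not mfn == 4),
    "v24='plants'": v24 == "plants",           # equality needs the same length
    "v24: 'plants'": contains(v24, "plants"),
    "v24: 'PLANTS'": contains(v24, "PLANTS"),  # ':' ignores case
    "v44.6='method'": v44[:6] == "method",     # '=' is case sensitive
    "v44.6='Method'": v44[:6] == "Method",
    "v24: 'plants' and v44: 'method'": contains(v24, "plants") and contains(v44, "method"),
}

for expression, value in examples.items():
    print(f"{expression:34} {value}")

# Ordering rules: code-point order, and a shorter prefix sorts first.
print("A" < "a", "chem" < "chemist")
```

Python's not, and and or happen to follow the same precedence as the NOT, AND and OR operators described above, which is why the expressions can be transcribed directly.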


G. Functions

A function evaluates a value (called the function value or the returned value) which is then substituted for the function in the calculation of the expression. Functions may have one or more arguments, which you must supply, the value of which is used in the evaluation of the function value. Thus the value of a function depends on the value of the arguments supplied. These are enclosed in parentheses and separated by commas. Arguments may be of three types:

format: a J-ISIS format, which may contain any legal formatting command; except for the REF function (see "REF(expression,format)"), when a format is used as an argument, it is the text resulting from its execution which is passed to the function, rather than the format itself;

numerical expression: when a numerical expression is used as an argument it is first evaluated and the value of the expression is then passed to the function;

field selector: a field selector argument may be either a field or a subfield command; it may not contain an offset/length command.

The available functions and the corresponding arguments are described below, classified according to the type of value they return.

1. Numerical functions

a. VAL(format)

The VAL function returns the numerical value of its argument. The argument format is a J-ISIS format and may contain any legal formatting command. J-ISIS executes the argument to produce a string of text. This is then scanned from left to right until a valid numeric value is found (which may be in scientific exponent notation). The VAL function returns this numeric value converted to its internal machine representation, suitable for performing calculations. If no numeric value can be identified a value of zero is returned. If the text contains more than one numerical value only the first one is returned. For example (assuming that v1^a = 10, v1^b=20 and v2=30):


Format                       Value
val('15.79')                 15.79
val(v1)                      10
val(v1^a)                    10
val(v2)                      30
val('19',v1^b)               1920
val('xxxx7yyyy8zzzz')        7
val('abs 5.8e-4 ml')         0.00058
val('water')                 0
val('Jul-Aug 1985')          1985
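The left-to-right scan performed by VAL can be approximated with a regular expression. The Python sketch below is an approximation rather than the J-ISIS implementation: the pattern NUMBER is an assumption about what counts as a valid numeric value (optional sign, digits, optional decimal part, optional exponent), and the argument is taken to be the text already produced by the format.

```python
import re

# Assumed shape of a numeric value: sign, digits, decimals, exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def val(text):
    """Return the first numeric value found in text, or 0 if there is none."""
    match = NUMBER.search(text)
    return float(match.group()) if match else 0.0


for text in ["15.79", "19" + "20", "xxxx7yyyy8zzzz",
             "abs 5.8e-4 ml", "water", "Jul-Aug 1985"]:
    print(f"{text!r:20} {val(text):g}")
```

Note that "19" + "20" stands for val('19',v1^b): the format first produces the text 1920, and only then is the text scanned.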

Note that the fields with tags 1 and 2 do not exist in the CDS database.

b. RSUM(format)

The RSUM function returns the sum of one or more numerical values. The text produced by the argument is scanned from left to right, as for the VAL function, and all the numerical values it contains are added together. The final total is the function value. Individual values must be separated by one or more non-numeric characters, and it is your responsibility to insert these through the format supplied as argument. RSUM may be used to compute the sum of the numerical values contained in all the occurrences of a given repeatable field. For example (assuming that field 1 has four occurrences containing 1, 2, 3 and 4):

Format                       Value
rsum('10,20,30')             60
Rsum(v1|;|)                  10
Rsum(v1|,|,'48,3.5')         61.5

c. RMIN(format)

The RMIN function returns the minimum value of one or more numerical values. The text produced by the argument is scanned from left to right, as for the VAL function, and all the numerical values are extracted. The algebraically smallest of these is the function value. Individual values must be separated by one or more non-numeric characters, and it is your responsibility to insert these through the format supplied as argument. RMIN may be used to compute the minimum of the numerical values contained in all the occurrences of a given repeatable field. For example (assuming that field 1 has four occurrences containing 10, 20, 30 and 40):

Format                       Value
rmin('1,2,-3')               -3
rmin(v1|;|)                  10
rmin(v1|,|,'48,3.5')         3.5

d. RMAX(format)

The RMAX function returns the maximum value of one or more numerical values. The text produced by the argument is scanned from left to right, as for the VAL function, and all the numerical values are extracted. The algebraically largest of these is the function value. Individual values must be separated by one or more non-numeric characters, and it is your responsibility to insert these through the format supplied as argument. RMAX may be used to compute the maximum of the numerical values contained in all the occurrences of a given repeatable field. For example (assuming that field 1 has four occurrences containing 10, 20, 30 and 40):

Format                       Value
rmax('1,2,-3')               2
rmax(v1|;|)                  40
rmax(v1|,|,'48,3.5')         48

e. RAVR(format)

The RAVR function returns the average value (arithmetic mean) of one or more numerical values. The text produced by the argument is scanned from left to right, as for the VAL function, and all the numerical values are extracted. The average value is then computed and returned as the function value. Individual values must be separated by one or more non-numeric characters, and it is your responsibility to insert these through the format supplied as argument. RAVR may be used to compute the average value of the numerical values contained in all the occurrences of a given repeatable field. For example (assuming that field 1 has four occurrences containing 10, 20, 30 and 40):

Format                       Value
ravr('1,2,-3')               0
ravr(v1|; |)                 25
ravr(v1|,|, '48,3.5')        25.25

f. L(format)

The L function uses the text produced by the argument as a search term to the inverted file and returns the MFN of the first posting (if any). The retrieved postings are sorted into ascending numerical order of the MFNs, thus the L function will return the lowest MFN. Before looking up the inverted file the term is automatically normalized, i.e. stripped of diacritics and converted to upper case. If the term is not found the function value is zero. The L function is normally used in conjunction with the REF function to implement table lookup (see under "REF(expression, format)" for examples on the use of the L function).

Note that the argument format is executed using the current display mode (see "Mode command"). This is important because the use of an incorrect mode may result in not finding the term in the inverted file. As a general rule you should use the same mode used in the inverted file FST.

g. LR((format)[, from, to])

Like the L function, LR searches the inverted file for the term defined by format, and returns all the postings of the term. The retrieved postings are sorted into ascending numerical order of the MFNs, thus the LR function will return the MFNs from the lowest to the highest. For example:

ref(lr((v10)),v1,v2)

will retrieve fields 1 and 2 from all the records posted under the term contained in field 10. You may limit the range of postings to be retrieved by using the optional from and to parameters. For example:

lr((v10),3,7)

will only retrieve postings 3 to 7. The parameters "from" and "to" are optional and must be valid numeric expressions.
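Returning to the aggregate functions RSUM, RMIN, RMAX and RAVR described above: all four extract every number from the text produced by their argument and then reduce the list. The Python sketch below models this; the pattern NUMBER, the helper field_text and the choice to return 0 when no number is found are assumptions of the sketch, not documented J-ISIS behaviour.

```python
import re

# Assumed shape of a numeric value: sign, digits, decimals, exponent.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def numbers(text):
    """Extract every numeric value from text, scanning left to right."""
    return [float(token) for token in NUMBER.findall(text)]


def rsum(text):
    return sum(numbers(text))


def rmin(text):
    values = numbers(text)
    return min(values) if values else 0.0  # assumption: 0 when nothing is found


def rmax(text):
    values = numbers(text)
    return max(values) if values else 0.0  # assumption: 0 when nothing is found


def ravr(text):
    values = numbers(text)
    return sum(values) / len(values) if values else 0.0


def field_text(occurrences, suffix):
    """Text produced by a format such as v1|;| for a repeatable field."""
    return "".join(occurrence + suffix for occurrence in occurrences)


v1 = ["10", "20", "30", "40"]
print(rsum("10,20,30"), rmin("1,2,-3"), rmax("1,2,-3"), ravr("1,2,-3"))
print(rmin(field_text(v1, ";")), rmax(field_text(v1, ";")), ravr(field_text(v1, "; ")))
print(ravr(field_text(v1, ",") + "48,3.5"))
```

The repeatable suffix-literal (|;| or |,|) is what guarantees that consecutive occurrences are separated by a non-numeric character; without it the occurrences 10 and 20 would be read as the single number 1020.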

h. NPST(format)

Like the L function, NPST searches the inverted file for the term defined by format, and returns the number of postings of the term.
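The relationship between L, LR and NPST can be pictured with a dictionary that maps each normalized term to its postings. The Python sketch below is a simplified model: the class InvertedFile and its methods are hypothetical, a posting is reduced to a bare MFN, and the from and to parameters are read as 1-based, inclusive posting numbers.

```python
import unicodedata


def normalize(term):
    """Strip diacritics and convert to upper case, as done before a lookup."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).upper()


class InvertedFile:
    def __init__(self):
        self.postings = {}  # normalized term -> set of MFNs

    def add(self, term, mfn):
        self.postings.setdefault(normalize(term), set()).add(mfn)

    def lr(self, term, start=None, end=None):
        """All postings in ascending MFN order, optionally postings start..end."""
        mfns = sorted(self.postings.get(normalize(term), ()))
        if start is None:
            return mfns
        return mfns[start - 1:end]

    def l(self, term):
        """Lowest MFN posted under the term, or 0 if the term is not found."""
        mfns = self.lr(term)
        return mfns[0] if mfns else 0

    def npst(self, term):
        """Number of postings of the term."""
        return len(self.lr(term))


index = InvertedFile()
for mfn in (12, 3, 7):
    index.add("plant", mfn)
index.add("Café", 5)

print(index.l("Plant"), index.lr("plant"), index.lr("plant", 2, 3), index.npst("plant"))
print(index.l("cafe"), index.l("missing"))
```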

i. NOCC(Vtt)

Returns the number of occurrences of field tt. For example:

f(nocc(v70))


j. OCC

The OCC function returns the number of the current occurrence within a repeatable group. It may be used to produce numbered lists. For example the format:

(v70/)

will produce one line for each occurrence in the field 70. The following format:

(if p(v70) then f(occ,1,0),'. ' fi,v70/)

will produce a numbered list as the following:

1. First Author

2. Second Author

3. Third Author

k. SIZE(format)

Returns the size of the string generated by format. Note that this function is mode-sensitive. For example, if field 10 contains 20 characters, then size(mpl,v10) will return 20, whereas size(mdl,v10) will return either 22, if the last character of the field is a period (counting the 2 spaces automatically generated in data mode), or 23 otherwise (counting the period and the 2 spaces automatically generated in data mode).

l. TYPE(type, format)

This function tests whether the string generated by format is of the type defined by type and returns 1 if the string corresponds to the specified type or 0 if it does not. The TYPE function has two different forms:

TYPE('pattern',format); or TYPE(numerical expression,format).

The first form may be used to test if the string corresponds to a certain pattern. For example:

type('XXA-99-99-99',v10)


will return 1 if field 10 corresponds to the pattern or 0 otherwise. The second form may be used to test other conditions according to the value of numerical expression, which must be one of the following:

1 - alphanumeric (the string contains only alphabetic or numeric characters);
2 - alphabetic (the string contains only alphabetic characters);
3 - numeric (the string contains only numeric characters);
4 - decimal integer (the string is an optionally signed integer, e.g. -24);
5 - decimal number (the string is a numeric value, including scientific notation).

For example: type(3,v40) will return 1 if field 40 contains only the digits 0-9.
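The five numeric type codes can be expressed as simple string tests. The Python sketch below covers only the second form, TYPE(numerical expression,format); the function name isis_type, the treatment of an empty string as matching no type, and the exact regular expressions are assumptions of the sketch.

```python
import re

INTEGER = re.compile(r"[-+]?[0-9]+")
DECIMAL = re.compile(r"[-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?")


def isis_type(code, text):
    """Return 1 if text is of the type given by code (1 to 5), otherwise 0."""
    if code == 1:    # alphanumeric
        ok = text.isalnum()
    elif code == 2:  # alphabetic
        ok = text.isalpha()
    elif code == 3:  # numeric: only the digits 0-9
        ok = text != "" and all(c in "0123456789" for c in text)
    elif code == 4:  # decimal integer, optionally signed
        ok = INTEGER.fullmatch(text) is not None
    elif code == 5:  # decimal number, including scientific notation
        ok = DECIMAL.fullmatch(text) is not None
    else:
        raise ValueError("type code must be between 1 and 5")
    return 1 if ok else 0


for code, text in [(1, "abc123"), (2, "abc123"), (3, "2014"),
                   (4, "-24"), (4, "2.4"), (5, "5.8e-4")]:
    print(code, repr(text), isis_type(code, text))
```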

m. TAG

This command works only in a repeatable group and is meant for listing the field tags in the record in their order of insertion. Repeatable tags will list all occurrences. Syntax: tag (returns a numeric value). This command works in all WinISIS versions since 1997.

Example:

(if p(v0) then f(tag,0,0),| |,v0/,fi)

The above will list all tags of the current record with their content:

100 ^cTRINIDAD AND TOBAGO^t(1809)66-00000 200 250 325 350


2. String functions

String functions may be used both as operands of string expressions and as formatting commands. When used as a command, the function value will be formatted as if it was a field in the record.

a. F(expr-1,expr-2,expr-3)

The F function converts a numeric value from its internal floating point representation to a character string. The three arguments are all numerical expressions. The first argument, expr-1, is the number to be converted. The second argument, expr-2, is the minimum output width and the third argument, expr-3, is the number of decimal places. The second and third arguments are optional, but do not omit them if a well defined output is required. Note, however, that expr-2 cannot be omitted if expr-3 is present.

expr-2 gives the minimum width, i.e. the function value will be a character string of at least expr-2 characters and, if the converted numerical value requires expr-2 characters or less, it will be right adjusted within this width. If the number of characters required to represent the value of expr-1 is greater than the width given, then J-ISIS will use additional character positions as needed. In this case the output string will be longer than expr-2 characters.

expr-3 defines the number of decimal places. If missing, the result may be in scientific exponent notation and, if expr-2 is also missing, a default width will be used. If present, the result will be a rounded fixed point representation of expr-1, with expr-3 digits after the decimal point. If expr-3 is zero then expr-1 is first rounded to the nearest integer and output as an integer value with no decimal point. For fixed point and integer conversion, if the integer part of the number is too large to be represented, the output is replaced by a series of asterisks (*).

The F function may be used to align a column of numbers on the decimal point by choosing an appropriate width. Examples of the F function are given below.

Format           Value
F(1)             1.000000e+00
f(1,10)          1.000000
F(-1,10,2)       -1.00
f(1,5,2)         1.00
F(1,8,2)         1.00
f(mfn,1,0)       4
F(mfn,2,0)       4
F(mfn,3,0)       4
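The width and decimals arguments of F behave much like a fixed-point format specification with right adjustment. The Python sketch below models only the three-argument form; the function name f is reused for readability, MFN is assumed to be 4 as in the table, and the asterisk overflow rule and the forms with omitted arguments are not modelled. Python rounds on the binary value of the number, so halfway cases may differ from J-ISIS.

```python
def f(value, width, decimals):
    """Fixed-point conversion, right adjusted in at least `width` characters."""
    text = "%.*f" % (decimals, value)  # `decimals` digits after the point
    return text.rjust(width)           # never truncates: longer text is kept whole


mfn = 4
for call, result in [
    ("F(-1,10,2)", f(-1, 10, 2)),
    ("f(1,5,2)", f(1, 5, 2)),
    ("F(1,8,2)", f(1, 8, 2)),
    ("f(mfn,1,0)", f(mfn, 1, 0)),
    ("F(mfn,3,0)", f(mfn, 3, 0)),
    ("f(1234.5,3,1)", f(1234.5, 3, 1)),
]:
    print(f"{call:16} [{result}]")
```

The square brackets in the printed output make the padding spaces visible, which shows how a common width aligns a column of numbers on the decimal point.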


b. REF(expression, format)

The REF function allows you to extract data from an alternate master file record. The first argument is a numerical expression giving the MFN of the alternate record to be selected and the second argument is the format to be applied to this record. If the value of expression does not correspond to the MFN of an existing record in the data base, then REF will produce a null string (i.e. no output).

The process performed by the REF function is represented below, where it is assumed that the current record being formatted is record 1. As you can see from this example, the REF function is a very powerful device, as it allows you to gather together data which is in fact stored in different records of the data base, and make it appear to the user as if stored in the same record. In this first example, records 1 and 98 are linked by specifying in field 4 of record 1 the MFN of the related record, which contains the name of the country in English and French: through your format you may then select either language, by simply specifying the relevant tag in the related record.

In certain cases, the linking of records by means of the MFN may be inconvenient from the point of view of entering the data. Besides the fact that any typing error in the MFN of the related record will result in displaying wrong data, it may require time to determine the correct MFN to use. In the case shown in the figure below, for example, it may well be that the source document from which the data is entered already contains a normalized country code ('UK' in our case). Using the MFN as a link to the country record may therefore require the consultation of listings or a search in the data base to find out that the MFN of the record corresponding to 'UK' is 98: it would be much more convenient to be able to enter 'UK' rather than '98'.

This is in fact possible and you can obtain the same output given in Figure 56, by organizing the data base in such a way that you may take advantage of the L function.

The L function finds the MFN corresponding to a search term. You may use it therefore to convert a character string (such as 'UK') to an MFN. In order to be able to use the L function you must establish a unique relationship between a given character string and its corresponding MFN. The Inverted file provides such a mechanism (see "Inverted file"). In our example, therefore, it would be sufficient to invert field 10 of the 'country' records to establish a unique relationship between the country code and the corresponding MFN (note that the concept of uniqueness is important as the L function assumes that the key it is searching for has one and only one posting. It is your responsibility to make this relationship unique by using, if necessary, a search term prefix as indicated under "Inverted file FST"). The figure below illustrates this technique. It is assumed here that field 10 of the 'country' records is inverted with the prefix 'CC='.

J-ISIS makes no assumption as to the nature of a relationship existing between two records. It simply provides a mechanism for linking records. A particular implementation would normally convey to the user the meaning of a relationship through an appropriate use of the formatting language and a specific data base design. For example if a bibliographic record must be linked to a supplier record and to a borrower record, you should use two different fields to store the link to the supplier and to the borrower in order to reflect the different nature of these relationships.


Note, furthermore, that because the second argument of the REF function is a format, it is possible to use this function in a recursive manner, to establish hierarchical relationships of higher orders, such as those that would be required to display the hierarchical relationships in a Thesaurus. As many REF function references as wanted may be used in a format.

Figure 1 - REF example

Note that in J-ISIS the above format should be:

mpl,v1/v2/v3/ref(l('CC='v4),v11)
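The table lookup performed by ref(l('CC='v4),v11) can be modelled with two dictionaries, one for the master file and one for the inverted file. The Python sketch below is an illustration only: the record contents are hypothetical (in particular the title in field 1 and the use of field 12 for the French name), a format is represented by a Python function applied to the referenced record, and the term normalization done by L is reduced to upper-casing.

```python
# Hypothetical master file: MFN -> {tag: value}.
master = {
    1: {1: "Some title", 4: "UK"},                            # current record
    98: {10: "UK", 11: "United Kingdom", 12: "Royaume-Uni"},  # 'country' record
}

# Hypothetical inverted file: field 10 of the country records, prefix 'CC='.
inverted = {"CC=UK": [98]}


def l(term):
    """MFN of the first posting of the term, or 0 if the term is not found."""
    postings = inverted.get(term.upper(), [])
    return min(postings) if postings else 0


def ref(mfn, fmt):
    """Apply fmt to the record with the given MFN; no output if it does not exist."""
    record = master.get(mfn)
    return fmt(record) if record is not None else ""


current = master[1]
english = ref(l("CC=" + current[4]), lambda record: record[11])
french = ref(l("CC=" + current[4]), lambda record: record[12])
print(english, "/", french)
print(repr(ref(l("CC=ZZ"), lambda record: record[11])))  # unknown code: no output
```

Because L returns 0 for an unknown term and no record has MFN 0, an unknown country code simply produces no output, as described for REF above.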

The REF function format argument is evaluated in a plain text context while the global format is evaluated in an HTML context. However J-ISIS uses an embedded browser that expects HTML formatted text. If the Ref function format contains vertical and/or horizontal spacing commands, the HTML tag

 
should enclose the format as in the following example:

IF P(V28) THEN '

',(REF(L('Z='V28), c3, V1/,x10,V2,/,c10, V3/)),'
'FI,/


Another REF function example taking data from 2 databases:

You can test the above example yourself:

i) Click on "Close All Databases" in the "Database" menu;
ii) Open the LIBCAT database;
iii) Open the "PFT Manager";
iv) Select the testRef1 format;
v) Click on the "Apply" button.

You should get the same results as above.

c. Alternate data base

The REF, L, LR and NPST functions may specify a data base qualifier to refer to an alternate data base. When the application is so designed, the data base specified in the REF function may be different from the one specified in the L or LR functions. The data base qualifier is specified as follows:

->dbref

where dbref is the name of the alternate data base (which must be in the data base path specified in parameter 5 of SYSPAR.PAR or for which there is a dbn.PAR in the data base path). For example:

ref->bib(l->book(v10),v100,v200)

In this example, supposing the current data base is CDS.MST, the different parts of the format will be executed as follows:

Format piece              Database
ref->bib(..)              CDS
l->book(..),v100,v200)    BIB
v10                       CDS

BOOK's MFN corresponding to term v10 of CDS is used as the reference for the REF function, which retrieves the content of BIB's v100 and v200.

More realistically, you may use a format such as the one below (the single quotes are optional):

ref->'item'(lr->'item'((|CN=|v37)),v100," "v200/)

v37 is a field in the current database CDS containing the class number. The above format looks up the inverted file of the ITEM database for CN=.. (which has multiple postings). It retrieves the MFN of each posting in ITEM, opens each record, and displays the contents of fields v100 and v200 on separate lines. Note that the use of a repeatable group is mandatory with the LR function (hence the double parentheses).

d. More on REF and L

It is possible to display, apparently as one record, information that has come from more than one record in the database. It is also possible to combine records from different databases.

The REF function allows you to link one record to another by means of the MFN of the second record. For example, you could have one record for a journal article and another for the journal containing it. The first could refer to the second by quoting its MFN.

Article record:
MFN 259
10 Walker, Gladys
20 The care of azaleas
30 6

Journal record:
MFN 6
20 Houseplants Monthly

Here field 10 is used for author, 20 for title (the article title in the first case and the journal title in the second) and 30 is used in the article record to contain the MFN of the relevant journal record. You can then display the combined information using the display format v10/v20/'In: 'ref(val(v30),v20)

The val function converts the contents of field 30 to a number. (Apart from the MFN, all fields in a CDS/ISIS record are held as strings of characters and cannot be used in calculations as they stand.) The ref function then extracts field 20 from the record with that number as its MFN. The result will be:


Walker, Gladys
The care of azaleas
In: Houseplants Monthly
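The linking step can be pictured outside J-ISIS with a small Python sketch. It is only an illustration: the dictionary `master` and the helpers `val`, `ref` and `show_article` are hypothetical stand-ins for the master file and the formatting functions, not J-ISIS code.

```python
# Hypothetical in-memory stand-in for a master file: MFN -> {tag: value}.
master = {
    259: {10: "Walker, Gladys", 20: "The care of azaleas", 30: "6"},
    6: {20: "Houseplants Monthly"},
}


def val(text):
    """Rough analogue of val(): convert the text of a field to a number."""
    return int(text)


def ref(mfn, tag):
    """Rough analogue of ref(mfn, vTAG): field `tag` of record `mfn`, or ''."""
    return master.get(mfn, {}).get(tag, "")


def show_article(mfn):
    """Mimics the format v10/v20/'In: 'ref(val(v30),v20)."""
    rec = master[mfn]
    return "\n".join([rec[10], rec[20], "In: " + ref(val(rec[30]), 20)])


print(show_article(259))
```

The journal title is never stored in the article record; it is fetched at display time from the record whose MFN is held in field 30.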

There are two limitations to using the ref function like this. One is that when you enter the article record you need to know that "Houseplants Monthly" is record 6. The other is that the article and journal records need to be in the same database, whereas you might prefer to keep them separate.

The first limitation can be overcome by using the L or "look-up" function. (A capital L is used here just to make it easier to read but lower-case works just as well.) This looks for a term in the inverted file in order to identify the related record. So, instead of using MFNs to make the link, we could use some sort of abbreviation, e.g.

Article record
MFN 259
10 Walker, Gladys
20 The care of azaleas
30 HM

Journal record
MFN 6
20 Houseplants Monthly
100 HM

Now field 30 is used in the article record to contain an abbreviation of the journal title. Field 100 in the journal record contains the abbreviation for that journal. "HM" should be easier to remember than "6". In the Field Selection Table, you then need an entry

Tag: 100 Technique: 0 Format: v100

so that the abbreviation is entered in the inverted file. In fact, it is more reliable to use a prefix for this term (see Section 7.3) so that it cannot be confused with the same term from another source (e.g. HM from a title "A visit by HM the Queen", or field 30 from article records). In the Field Selection Table we could use

Tag: 100 Technique: 0 Format: "ABBREV="v100

To display the combined information you can then use v10/v20/'In: 'ref(L('ABBREV='v30),v20)
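As a rough illustration of the look-up, the Python sketch below builds a tiny prefixed index and resolves the abbreviation to an MFN. The names `build_inverted`, `L`, `ref` and `show_article` are hypothetical, and returning 0 when a term has no posting is an assumption of this sketch.

```python
# Hypothetical in-memory stand-in for a master file: MFN -> {tag: value}.
master = {
    259: {10: "Walker, Gladys", 20: "The care of azaleas", 30: "HM"},
    6: {20: "Houseplants Monthly", 100: "HM"},
}


def build_inverted(records, tag, prefix):
    """Index field `tag` as prefix+value -> list of MFNs (the postings)."""
    index = {}
    for mfn, rec in records.items():
        if tag in rec:
            index.setdefault(prefix + rec[tag], []).append(mfn)
    return index


# Plays the role of the FST entry: Tag 100, Technique 0, "ABBREV="v100
inverted = build_inverted(master, 100, "ABBREV=")


def L(term):
    """Analogue of L(): MFN of the first posting, 0 if none (assumption)."""
    postings = inverted.get(term, [])
    return postings[0] if postings else 0


def ref(mfn, tag):
    """Analogue of ref(mfn, vTAG): field `tag` of record `mfn`, or ''."""
    return master.get(mfn, {}).get(tag, "")


def show_article(mfn):
    """Mimics v10/v20/'In: 'ref(L('ABBREV='v30),v20)."""
    rec = master[mfn]
    return "\n".join([rec[10], rec[20], "In: " + ref(L("ABBREV=" + rec[30]), 20)])


print(show_article(259))
```

Because only field 100 is indexed under the ABBREV= prefix, a bare "HM" coming from another field cannot be mistaken for the journal key.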

Note that it is the abbreviation in field 30 of the article record that needs to be looked up to find the relevant journal record.

The second limitation - the need to have both records in the same database - is removed in J-ISIS (as in CDS/ISIS for Windows). Suppose that the journal records are in a separate database called JNL. You can then display the combined information using the display format

v10/v20/'In: 'ref->JNL(L->JNL('ABBREV='v30),v20)

Here both the linking (with ref) and the look-up (with L) are done in the database JNL.

Another use in a library context might be to run a circulation system. You could have a database of books (called BOOK) and a database of registered borrowers (called BORR). When a book is issued to a borrower, you could enter the borrower code in a field of the book record. When the book is returned you could delete the borrower code. The REF function allows you to produce a display or printout combining information from the two records. A simple example is shown below.

BORR database
10 Christopher Sabanda
20 17 Fourth Street
30 Harare
40 CS230

BOOK database
100 Francis Zikonda
200 Introduction to patent law
900 CS230

In the BORR database, field 40 is used for the borrower code. Let us assume that it has been put onto the index file as BC=CS230. In the BOOK database, a borrower code is entered in field 900 when the book is borrowed. If you need to produce a recall notice, you could use a format like this to print from the book database:

'Dear ',ref->BORR(L->BORR('BC='v900),v10)/'The following book is now overdue.'/'Kindly return it immediately.'/v100/v200

e. Using the REF, L and LR functions with three databases:

1. BOOKS Database

FDT:

FST:

Sample Data:

2. LOAN Database

FDT:

FST:

Sample Data:

3. Member Database

FDT:

FST:

Sample Data:


We create a PFT in the Loan Database that should produce the following output:

Member ID: 14590(1) Name: Amjad Ali Malik(2)

Acc#: B014803(1) Title: Electromagnetic principles of integrated optics(3) Issue Date: 10-02-2014(1)
Acc#: B014804(1) Title: Optoelectronic technology and lightwave communications systems(3) Issue Date: 10-03-2014(1)

(1) Data from Loan DB
(2) Data from Member DB
(3) Data from Book DB

The PFT is shown below; clicking on Apply produces the desired result.

Selecting the Loans PFT in the Data View displays the records as follows:

f. S(format)

The S function returns the text produced by its argument. As mentioned earlier, CDS/ISIS provides no explicit operators for string expressions. The S function may be used, however, to implement string concatenation. It is particularly useful in Boolean expressions, where it may be used to implement an implicit OR, which is more efficient (and more concise) than using an explicit OR operator. For example, the two following Boolean expressions:

S(mdl,v10,v20,v30) : 'water'

v10 : 'water' or v20 : 'water' or v30 : 'water'

are equivalent (they are both True if any of the fields 10, 20 or 30 contains the string 'water'), but the first will execute faster than the second.


g. Substring: SS(pos, length, format)

A substring of a string can be produced in two different ways:

1. by using the *offset.length construct with the S function, as in the following example: S(v24,v69)*3.5 (in this case CDS/ISIS will extract 5 characters starting from the 4th position of the string returned by S);

2. by using the substring function SS(pos,len,format). This function takes the substring of the string returned by format, beginning at position pos and len characters long. For example, SS(1,5,v30) will extract the first 5 characters of field 30.

The main difference between the two forms is that in the SS function both pos and len can be numeric expressions, while in the *offset.length construct the values must be numeric constants. Note also that * works with an offset (starting from 0), while SS works with a position (starting from 1).
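The offset/position distinction is easy to get wrong, so here is a minimal Python sketch of the two forms. The helper names `star` and `ss` are hypothetical, and treating a position below 1 as 1 is an assumption of the sketch, not documented behaviour.

```python
def star(text, offset, length):
    """Analogue of S(...)*offset.length: skip `offset` characters, keep `length`."""
    return text[offset:offset + length]


def ss(pos, length, text):
    """Analogue of SS(pos,len,format): `length` characters from 1-based `pos`."""
    start = max(pos, 1) - 1  # assumption: positions below 1 are treated as 1
    return text[start:start + length]


sample = "ABCDEFGHIJ"
print(star(sample, 3, 5))  # offset 3 skips three characters
print(ss(4, 5, sample))    # position 4 names the same starting character
```

Both calls select the same five characters: offset 3 and position 4 designate the same starting point.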

h. DATE(exp)

Returns the current date and/or time in the format specified by the numeric expression exp. The value of exp can be one of the following:

1 - returns a date stamp identical to the one returned by the DATESTAMP function of ISIS Pascal, i.e. an 18-byte string of the form MM-DD-YY HH:MM:SS (e.g. date(1) could return: 09-30-97 15:03:44);
2 - returns only the date (e.g. date(2) could return: 09-30-97);
3 - returns only the time (e.g. date(3) could return: 15:03:44).
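For readers more familiar with strftime-style patterns, the three layouts can be sketched in Python as follows. The function `date` and the table `FORMATS` are hypothetical names; the sketch separates date and time with a single space, as in the printed example, and makes no claim about the exact byte length of the real stamp.

```python
from datetime import datetime

# strftime patterns mirroring the three documented values of exp.
FORMATS = {
    1: "%m-%d-%y %H:%M:%S",  # date stamp: MM-DD-YY HH:MM:SS
    2: "%m-%d-%y",           # date only
    3: "%H:%M:%S",           # time only
}


def date(exp, now=None):
    """Analogue of DATE(exp); `now` can be injected to get repeatable output."""
    moment = now if now is not None else datetime.now()
    return moment.strftime(FORMATS[exp])


fixed = datetime(1997, 9, 30, 15, 3, 44)
print(date(1, fixed))
print(date(2, fixed))
print(date(3, fixed))
```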


i. DB

The DB function returns the name of the current data base.

j. Format exits

In a format you may invoke Groovy programs you have written to perform special formatting functions required by a particular application, which could not otherwise be obtained by using the formatting language. These programs are called Format exits. As format exits are developed to satisfy specific needs, their description is beyond the scope of the formatting language. J-ISIS provides, however, a normalized way to interface format exits with the formatting language.

From the point of view of the formatting language a Format exit is a string function with a format argument. The argument format is first executed and its output is passed to the function. A format exit returns a character string which J-ISIS handles as if it was a field in the record being formatted.

From the point of view of J-ISIS a Format exit is a Groovy Method that can be written with the Groovy Console. Before a Format exit can be referenced in a format, the corresponding program must have been compiled successfully.

A Format exit is invoked as follows:

&Name(format)

Where:

& identifies this as a Format exit invocation;
Name is the name of the Groovy function;
format is the argument.

Format exit


Please note that the Groovy code gets access to the format output as a String, through the binding object and its "format" property.

Format                      Output
&PFTExampleFunc('xxx')      xxx
&PFTExampleFunc(v26^a)      Paris
&PFTExampleFunc(mhu,v24)    AN ELECTRIC HYGROMETER APPARATUS FOR MEASURING WATER-VAPOUR LOSS FROM PLANTS IN THE FIELD

3. Boolean functions

a. P(field selector)

The P function returns True if the record being formatted contains at least one occurrence of the field or subfield indicated by the argument. For example:

Format       Value
p(v24)       True
p(v26^d)     False
p(v70[2])    True
p(v80)       False


b. A(field selector)

The A function returns True if the record being formatted contains no occurrence of the field or subfield indicated by the argument. Note that the absence of a field implies the absence of all its subfields. Therefore, if the field selector specifies a subfield, the A function returns True if either the field is present but the specified subfield is absent, or the field itself is absent. For example:

Format       Value
a(v24)       False
a(v24^s)     True
a(v26^d)     True
a(v80)       True
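The presence and absence rules, including the subfield case, can be sketched in Python as follows. The record layout, the helper `subfields` and the sample values (in particular the ^b and ^c subfields of field 26) are assumptions made for illustration only.

```python
def subfields(value):
    """Split '^aParis^bUnesco' into {'a': 'Paris', 'b': 'Unesco'}."""
    return {part[0]: part[1:] for part in value.split("^")[1:] if part}


def p(record, tag, sub=None, occ=None):
    """Analogue of P(): True if the field (or subfield, or occurrence) exists."""
    occurrences = record.get(tag, [])
    if occ is not None:
        occurrences = occurrences[occ - 1:occ]  # keep only occurrence `occ`
    if sub is None:
        return len(occurrences) > 0
    return any(sub in subfields(value) for value in occurrences)


def a(record, tag, sub=None, occ=None):
    """Analogue of A(): True exactly when P() is False."""
    return not p(record, tag, sub, occ)


# Hypothetical record: tag -> list of occurrences.
record = {
    24: ["An electric hygrometer apparatus"],
    26: ["^aParis^bUnesco^c1965"],
    70: ["Grieve, B.J.", "Went, F.W."],
}

print(p(record, 24), p(record, 26, "d"), p(record, 70, occ=2), p(record, 80))
print(a(record, 24), a(record, 24, "s"), a(record, 26, "d"), a(record, 80))
```

With this sample record the two printed rows reproduce the Value columns of the P and A tables above.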

H. IF command

The IF command allows you to implement context-sensitive formats, i.e. formats able to produce output which may vary depending on the contents of the record being formatted. It is coded as follows:

IF condition THEN format-1 ELSE format-2 FI

where:

condition is a Boolean expression as defined under "Boolean expressions";
format-1 is a J-ISIS format which will be executed if and only if the Boolean expression is True;
format-2 is a J-ISIS format which will be executed if and only if the Boolean expression is False.

The ELSE format-2 clause is optional and may be omitted. The IF, THEN and FI keywords are always required, although format-1 may be omitted when an ELSE clause follows (e.g. whenever nothing has to be output if condition is True). An IF command may therefore also take one of the following alternate forms:

IF condition THEN format-1 FI
IF condition THEN ELSE format-2 FI

As there is no restriction on the commands you may use in format-1 or format-2, IF commands may be nested to any desired depth. The FI keyword must, in this case, be used to close each IF command (you may think of IF and FI as a pair of parentheses). For example:

if p(v1) then v24 else if p(v2) and a(v3) then v5 fi fi

The IF command is particularly useful for developing generalized formats for integrated data bases, which contain different types of records. In this case you will normally have a distinctive mark for each type of record (typically there will be a field containing a code identifying the type of record). Thus, by checking the type of record with an IF command, you may perform, through a single format, specific formatting for each type.

I. Repeatable groups

A repeatable group consists of a set of formatting commands enclosed in parentheses. The meaning of each command is the same as described above, except that repeatable fields are handled in a special way. In order to understand the concept of repeatable groups you should first know how J-ISIS handles repeatable fields. In the absence of any other indication, J-ISIS treats all the occurrences of a repeatable field (in the order they have been entered) as a single string of text. A repeatable group alters the way J-ISIS would normally handle the occurrences of a repeatable field, by processing one occurrence at a time rather than all together. This process is described below, with some examples.

When J-ISIS encounters the opening parenthesis of a repeatable group it proceeds as follows:

1. An occurrence counter is initialized to 1.

2. The format enclosed in parentheses is then executed in such a way that all field selectors within the group will only output the occurrence of the field corresponding to the current occurrence counter.

3. If no output was produced (i.e. if there were no more occurrences of any of the repeatable fields referenced within the group), then the processing of the repeatable group is terminated. Otherwise the occurrence counter is increased by 1 and steps 2 and 3 are repeated.
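The three steps above can be sketched in Python. This is a simplified model, not the J-ISIS implementation: `run_group` and its `field` callback are hypothetical names, a record is assumed to map a tag to its list of occurrences, and the text of the final pass (the one that finds no field data) is simply discarded.

```python
def run_group(record, body):
    """Execute `body` once per occurrence until no field yields any data."""
    output = []
    occurrence = 1                       # step 1: initialise the counter
    while True:
        produced = False

        def field(tag):
            """Return only the current occurrence of `tag` ('' if exhausted)."""
            nonlocal produced
            values = record.get(tag, [])
            if occurrence <= len(values):
                produced = True
                return values[occurrence - 1]
            return ""

        text = body(field)               # step 2: run the enclosed format
        if not produced:                 # step 3: nothing left, stop
            break
        output.append(text)
        occurrence += 1                  # otherwise move to the next occurrence
    return "".join(output)


record = {70: ["Grieve, B.J.", "Went, F.W."]}
# Mimics the repeatable group (v70/): one occurrence per line.
print(run_group(record, lambda field: field(70) + "\n"), end="")
```

Note that the body runs one time more than there are occurrences; it is this last pass that detects the end of the data.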


Note that all formatting commands within a repeatable group are processed one occurrence at a time (as explained above), including, therefore, fields referenced in IF commands, expressions and functions, as well as in string functions used as commands. Because of the processing explained above, you should not use unconditional literals within a repeatable group (if you do, they will be output one time more than you would expect). In most cases the use of simple formatting commands, such as the mode command or repeatable literals, is sufficient to adequately handle repeatable fields, as in the examples given below:

Format       Output
mpl,v70      Grieve, B.J.Went, F.W.
mdl,v70      Grieve, B.J. Went, F.W.
v70+|; |     Grieve, B.J.; Went, F.W.

There are cases, however, where you will need to format repeatable fields in other ways. A frequent case is, for example, the need to format each occurrence on a new line, which may only be done by using a repeatable group, as shown below:

Format: v70/v26^a
Output:
Grieve, B.J.Went, F.W.
Paris

Format: (v70/),v26^a
Output:
Grieve, B.J.
Went, F.W.
Paris

In the first case the newline command (/) is executed after formatting all the occurrences of field 70; whereas, in the second case, it is executed after each occurrence.

The example below shows the handling of repeatable subfielded fields (assuming that the record contains two occurrences of field 20 as indicated). Here the use of a repeatable group has helped to properly display the various subfields of each occurrence of the repeatable field in a tabular manner.

Field  Record content
20     ^aNew York^bMcGraw Hill^c1988
20     ^aLondon^bAcademic Press^c1975

Format: /(v20^a,c13,v20^b,c31,v20^c/)
Output:
New York    McGraw Hill       1988
London      Academic Press    1975

Format: /v20^a,c13,v20^b,c31,v20^c/
Output:
New YorkLondon McGraw HillAcademic Press 19881975

If you need to output a literal before the data produced by a repeatable group, you may use an unconditional literal or a conditional literal. Note, however, that if you use a conditional literal it must be associated with a field selector (a repeatable group is not a field selector); you must use a dummy field selector for this purpose (see below). As a further example of a repeatable group, assume that in a personal history record field 10 contains the previous employers of a person and field 20 contains the functions that the person had when working for a particular employer. In such a record, both field 10 and 20 would be repeatable, since a person might have worked for more than one employer. This is a case where a logical relationship exists between two repeatable fields. Below is an example of use of a repeatable group to display these two fields (it also illustrates the use of a dummy field selector).

Record contents
10 Bedford and Associates
20 Junior programmer
10 Van Allen Inc.
20 System programmer
10 Michigan University
20 Lecturer in Computer Science

Format: "Employment History"/#d10,(v10(6,6)/v20(12,12)/#)

Output:
Employment History

      Bedford and Associates
            Junior programmer

      Van Allen Inc.
            System programmer

      Michigan University
            Lecturer in Computer Science

Format: "Employment History"/d10,(c7,v10|: |,c37,v20/)

Output:
Employment History
      Bedford and Associates:       Junior programmer
      Van Allen Inc.:               System programmer
      Michigan University:          Lecturer in Computer Science

Repeatable groups cannot be nested (i.e. a repeatable group may not contain another repeatable group), unless the inner group is contained in the format argument of a REF function. Thus, for example, the following is a valid format:

(v10,ref(val(v20),v10,(v20,v30)))

whereas the following is invalid and will produce an error message:

(V10, (v20,V30))

Note that the use of a repeatable group is mandatory whenever:

1. you use a repeatable field as the argument of the L function;

2. the first argument of the REF function references a repeatable field.

You should also consider whether the use of a repeatable group would be called for whenever a repeatable field is used in the Boolean expression of an IF command.

J. Format errors

It is recommended to use the J-ISIS PFT Manager to create and test a J-ISIS format. Clicking on the button will perform a syntax analysis of the format to ensure that it conforms to the formatting language rules. The message "ISIS FMT Parser: PFT program parsed successfully." will be displayed in the output console if the format is parsed successfully.

Whenever J-ISIS detects an error in the format, it interrupts the formatting and issues messages such as:

ISIS FMT Parser: Encountered errors during parse.
Encountered " "AAAAA "" at line 1, column 65. ...

This message, together with the offending token ("AAAAA" in the example) and the row/column numbers, will help you determine the erroneous part of the format.


Please note that the cursor row/column numbers are displayed in the bottom left corner and change as you move the cursor in the format editor.

Messages are appended to the Output Console, so it may be useful to clear the Output Console before checking the format syntax, to get rid of previous error messages. Clicking the right mouse button opens a context menu; clicking on Clear then clears the Output Console.

K. Including an external format

You may include an external format in a format by using the @name function, where name is the name of the format to be included. This format must be in the data base path (as specified in parameter 5 of SYSPAR.PAR or parameter 10 of dbn.PAR). For example:

if v1='BIB' then @fmt1 else @fmt2 fi

In this example, the contents of field 1 will determine which of fmt1 or fmt2 will be executed.


L. Format variables

J-ISIS (like CDS/ISIS) predefines ten (10) numeric and ten (10) string format variables which you may use in your format as applicable. The ten numeric variables are named E0 through E9 and the ten string variables are named S0 through S9. The numeric variables are initialized to 0, while the string variables are initialized to null strings, each time a format is executed.

You may assign or change the value of a numeric variable as follows:

En:=numeric expression (for example: e1:=val(v10)+5)

and you may assign or change the value of a string variable as follows:

Sn:=(format) (for example: s5:=(v10)).

Note that the parentheses around format are required. A numeric variable may be used anywhere a numeric value can be used, e.g. as operand of a numeric expression as in if e1+10<25 then ... fi. As any other numeric value, a numeric variable cannot be directly displayed, but must first be converted using the F function. A string variable may be used both as operand of a string expression and as a formatting command.

M. WHILE command

The WHILE command provides looping capabilities so that you can repeatedly execute a format. It is coded as follows:

WHILE condition (format)

where:

condition is a Boolean expression as defined on p. 55 of the CDS/ISIS Reference Manual;
format is the CDS/ISIS format to be repeatedly executed while the Boolean expression is True.

If the initial value of condition is False then format will not be executed at all. For the loop to end you must provide inside "format" whatever commands are necessary to render "condition" False whenever the loop must be terminated. If an infinite loop is generated, WinISIS will not respond to the user. For example:

e1:=1,e2:=nocc(v70), while e1<=e2 (f(e1,1,0),'. ',v70[e1]/ e1:=e1+1)

The example above displays each occurrence of field 70 on a new line preceded by the number of the occurrence, e.g.:

1. First Author

2. Second Author

3. Third Author
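For comparison, the same counting loop can be written in Python. This is only a sketch of the control flow; the function name `number_occurrences` and the list standing in for the occurrences of field 70 are illustrative assumptions.

```python
def number_occurrences(v70):
    """Mirror of: e1:=1,e2:=nocc(v70), while e1<=e2 (... e1:=e1+1)."""
    lines = []
    e1, e2 = 1, len(v70)                        # e1:=1, e2:=nocc(v70)
    while e1 <= e2:                             # while e1<=e2 ( ... )
        lines.append(f"{e1}. {v70[e1 - 1]}\n")  # f(e1,1,0),'. ',v70[e1]/
        e1 += 1                                 # e1:=e1+1 lets the loop end
    return "".join(lines)


print(number_occurrences(["First Author", "Second Author", "Third Author"]), end="")
```

Omitting the increment of e1 would leave the condition True forever, which is the infinite-loop situation described above.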

A more complex example is given below.

s1:=(v69),e0:=size(s1),e1:=1,e3:=1,
while e1<=e0 (
  while e1<=e0 and ss(e1,1,s1)<>'<' (e1:=e1+1)
  e2:=e1+1,
  while e2<=e0 and ss(e2,1,s1)<>'>' (e2:=e2+1),
  s2:=(ss(e1+1,e2-e1-1,s1)),
  if size(s2)>0 then f(e3,1,0),'. ',s2/ e3:=e3+1 fi,
  e1:=e2+1
)

In this example, we scan field 69 for the occurrences of keywords enclosed in < >, and display each keyword preceded by its sequence number, e.g.:

1. First Keyword

2. Second Keyword

3. Third Keyword
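The scanning logic is easier to follow in a general-purpose language. The Python sketch below is one way to implement the behaviour described (find each < ... > pair and number the non-empty keywords); the names `ss` and `list_keywords` are hypothetical and the sample string is invented.

```python
def ss(pos, length, text):
    """Analogue of SS(pos,len,format) with a 1-based position."""
    return text[pos - 1:pos - 1 + length]


def list_keywords(s1):
    """Number every non-empty keyword enclosed in < > inside s1."""
    lines = []
    e0, e1, e3 = len(s1), 1, 1
    while e1 <= e0:
        # Advance e1 to the next '<' (or past the end of the string).
        while e1 <= e0 and ss(e1, 1, s1) != "<":
            e1 += 1
        # Advance e2 to the matching '>' (or past the end of the string).
        e2 = e1 + 1
        while e2 <= e0 and ss(e2, 1, s1) != ">":
            e2 += 1
        s2 = ss(e1 + 1, e2 - e1 - 1, s1)   # text between the delimiters
        if len(s2) > 0:
            lines.append(f"{e3}. {s2}\n")
            e3 += 1
        e1 = e2 + 1                        # resume after the '>'
    return "".join(lines)


sample = "Plants <First Keyword> and <Second Keyword>, <Third Keyword>"
print(list_keywords(sample), end="")
```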


N. CISIS functions

instr(string1, string2)    find string

Function type: Numeric

Syntax: instr(string1,string2)

Definition: Returns a number specifying the starting position of the string generated by string2, found in the string generated by string1. If there is no match, the return value is zero.

Notes: Both string1 and string2 must generate strings, otherwise a syntax error occurs. The use of the S function may help in cases where a complex string is required as a parameter.

Examples:
if instr(v5,'ab')>0 then v5/, fi,
if instr(s(|'|v1|'|),v5)>0 then v1, fi,
left(v18,instr(v18,'.')-1),

iocc    occurrence index

Function type: Numeric

Definition: Returns the occurrence index number (starting from 1), otherwise returns zero.

Examples:
("Author: "v1/,if iocc > 3 then 'et al.',break, fi),
(f(iocc,3,0),|.|v10/),

left(string, length)    left substring

Function type: String

Syntax: left(string,length)

Definition: Returns a new string containing the leftmost characters of the original string generated by string. length specifies the actual number of characters to be read from string, starting from left to right.

Notes: If length is greater than the size of the string generated by string, the function returns the whole string. If length is zero or is set to a negative number, the function returns a NULL string.

Examples:
if left(v1^n,2)='Ma' then v1^n/, fi,
left(v1,instr(v1,'.')-1),

mid(string, start, length)    substring

Function type: String

Notes: If start is greater than the size of string, the function returns a NULL string. If start is zero or is set to a negative number, the default is 1.

Examples:
mid(v2,2,80),
mid(v1,instr(v1,'key'),size(v1))/,

replace(string1, string2, string3)

Function type: String

Syntax: replace(string1,string2,string3)

Definition: Returns a new string, after replacing string2 with string3 in string1.

Notes: If string2 is a null string or is not found in string1, string1 is returned unchanged. replace is case sensitive for both the search string (string2) and the replace string (string3).

Examples:
replace('Mary And John','And','and')/,
if replace(v1^a,'01x','01X')= '894501X' then v1^n/, fi,
replace(s(v304,v333),',',', ')/,
replace(s(if v415='spanish' then v299 else 'none' fi),v1,v759)/,

right(string, length)

Function type: String

Syntax: right(string,length)

Definition: Returns a new string containing the rightmost characters of the original string (string). length gives the actual number of characters to be read from string, starting from right to left.

Notes: If length is greater than the size of string, the function returns string. If length is zero or is set to a negative number, the function returns nothing.

Examples:
if right(v1^n,1) = 'r' then v1^n/, fi,
right(v65,4)/,
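The edge cases of these string functions can be summarised in a short Python sketch. It models one reading of the notes above (1-based positions, zero for "not found", empty string for NULL); returning an empty string from `mid` when the length is not positive is an assumption of the sketch, since no definition of mid is given.

```python
def instr(string1, string2):
    """1-based position of string2 inside string1, or 0 when there is no match."""
    return string1.find(string2) + 1


def left(string, length):
    """Leftmost `length` characters; the whole string if `length` is larger."""
    return string[:length] if length > 0 else ""


def right(string, length):
    """Rightmost `length` characters; the whole string if `length` is larger."""
    return string[-length:] if length > 0 else ""


def mid(string, start, length):
    """`length` characters from 1-based `start`; start < 1 defaults to 1."""
    if start < 1:
        start = 1
    if start > len(string) or length <= 0:  # length <= 0: assumption
        return ""
    return string[start - 1:start - 1 + length]


title = "report.final.txt"
print(left(title, instr(title, ".") - 1))    # text before the first '.'
print(right(title, 3))
print(mid(title, instr(title, "final"), 5))
```

The first call follows the pattern left(v18,instr(v18,'.')-1) from the examples: instr locates the first full stop and left keeps everything before it.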

continue    repeatable conditional branching

Syntax: continue

Definition: Executes the next occurrence of a repeatable group if at least one data field has a subsequent occurrence.

Examples:
(if iocc = 1 then continue else v10/ fi),
(f(iocc,1,0),'=',v70,continue/),

break    conditional branching/quitting

Syntax: break

Definition: Breaks the execution of a repeatable group format. When outside a repeatable group, quits the execution of the current format.

Notes: The execution resumes after the end of the repeatable group. When used inside a ref function, execution continues with the format after the function.

Examples:
(if iocc > 10 then '10+ occurrences'/,break else v5^n|-|,v5^s,/, fi,),

select ... case ... elsecase ... endsel    conditional branch control

Syntax: select expression case option-1: format-1 case option-2: format-2 ... [elsecase format-0] endsel

Definition: Evaluates expression and compares the result to each case option (option-1, option-2). If an option matches expression, the appropriate block of formatting language specifications is executed (format-1, format-2); otherwise the elsecase clause (if defined) is executed (format-0).

Notes: expression must generate a string or numeric value. If expression evaluates to a string, all option values in case clauses must be of string type; if expression is numeric, option values must also be numeric.

Examples:
select s(v5)
  case '1': ,f(val(v5)/2,2,2)/,
  case '2': ,v5/,
  case '3': ,v6,'-',v1/,
  elsecase ,|Error in field v5 = |v5/,
endsel,

select nocc(v7)
  case 0: 'absent'/,
  case 1: 'one occurrence'/,
  case 2: 'two occurrences'/,
  elsecase 'more than 2 occurrences'/,
endsel,


O. The XHTML/CSS/JavaScript Display environment

WinISIS offers graphical display commands for the Windows graphic environment. However, most of these WinISIS Windows commands are specific to Microsoft. J-ISIS uses an embedded browser for displaying format output and thus offers all XHTML/CSS/JavaScript facilities, so most of these Windows commands can be replaced by XHTML with CSS and JavaScript. XHTML, CSS and JavaScript offer a rich array of text-writing capabilities and will help you design formats that greatly improve your output. For example, CSS lets you choose the font (1) to be used for text output. Some basic CSS can be learned from W3Schools: http://www.w3schools.com/html/html_css.asp

JavaScript Examples:

1 A font is a collection of characters that have a unique combination of height, width, typeface, character set, and other attributes. An application uses fonts to display or print text of various faces or sizes. For example, word processing applications use fonts to give the user a "what you see is what you get" (WYSIWYG) interface.


Using an external JavaScript file

This example uses the code presented in the article "Creating a Code 39 Barcode using HTML, CSS and JavaScript" on the CodeProject web site: http://www.codeproject.com/Articles/146336/Creating-a-Code-39-Barcode-using-HTML-CSS-and-Java

The Print format barcode1.pft is available with the ICOMOS database and looks like this in the PFT Manager:

The external JavaScript file "code39.js" is located in the /ICOMOS/ipft folder and is loaded from the HTML section as follows:

''/ ' Code39 Barcode'/ ' '/ ' '/ ''/

The ID CSS selector #barcode is embedded and references the "barcode" ID in the XHTML. External CSS files could also be loaded through URLs. Note that in this case the external files are served by the J-ISIS embedded Web server, and URLs are relative to the doc root folder of that server. Thus the URLs can be accessed remotely, in a local network or from the web.

Clicking on the Apply button and selecting the last ICOMOS record produces the following display:

P. Adding Hypertext links to formats: the LINK command

The LINK command allows you to add interactivity to your format, by establishing a relationship between a field (or set of fields) of a record and an action to be performed. The general format of the LINK command is as follows:

LINK((descriptor),action)

where:

descriptor is a format describing to the user the action to be taken; the output of this format is displayed using color 2 (normally green, by default) and underlined; this text can be clicked with the mouse; note that this format must be enclosed in parentheses;

action is a format telling J-ISIS the action to be performed; the output of this format is not displayed and must be one of the hypertext commands listed below, which will be executed whenever the user clicks on the item.

a. OPENFILE command

This command lets J-ISIS automatically find the proper application to open the specified file, if one is installed on your computer.

Syntax: link(('Click to open'),'OPENFILE file://c:/mypage.doc')


Note: J-ISIS uses uniform resource locators (URLs) to address documents. Documents may be located anywhere on the Web, in remote or local J-ISIS database folders, or on your computer's file system. Blanks inside a web URL path must be replaced by "%20". The path separators should be slashes, not backslashes. The "file://" URL protocol should be used to address file names on the local computer; in that case disk letters should be written "C|" instead of "C:", and blanks should be kept.

link(('PDF From local FileSystem'),'OPENFILE file://C|/jisis- workspace/home_test_db/LIBCAT/idocs/ifla bibliographic record.')

If an application on your computer is associated with DOC documents (for instance MS-Word), the command will open it to show the file mypage.doc. OPENFILE replaces the CMD command in many cases and can be used in menu options as well.

You can also open a web address:
link(('UNESCO'), 'OPENFILE http://www.unesco.org')#

or open your favourite mail software to write an e-mail:
link(('Write'),'OPENFILE mailto:[email protected]')#

or open any document on a shared network directory:
link(('Documentation'), 'OPENFILE file://computer-1/Public/file1.pdf')#

or open a DOC file located on a remote computer (IP address) or on localhost:
link(('Word DOC From Server'),'OPENFILE http://localhost:8585/LIBCAT/idocs/J-ISIS%20Presentation%20New.doc'),#
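When LINK commands are generated by a script, the URL rules given in the note above are easy to get wrong. The Python sketch below shows one way to apply them; the helper names `local_file_url`, `web_url` and `openfile_link` are hypothetical and not part of J-ISIS.

```python
from urllib.parse import quote


def local_file_url(path):
    """Build a file:// URL: forward slashes, 'C|' drive form, blanks kept."""
    path = path.replace("\\", "/")
    if len(path) >= 2 and path[1] == ":":
        path = path[0] + "|" + path[2:]
    return "file://" + path


def web_url(base, path):
    """Build a web URL; quote() turns blanks into %20 (and escapes other
    unsafe characters as well) while leaving the '/' separators alone."""
    return base.rstrip("/") + "/" + quote(path.lstrip("/"))


def openfile_link(label, url):
    """Assemble the text of a LINK command that uses OPENFILE."""
    return f"link(('{label}'),'OPENFILE {url}')"


print(openfile_link("Local DOC", local_file_url(r"C:\docs\my page.doc")))
print(openfile_link(
    "Word DOC From Server",
    web_url("http://localhost:8585", "LIBCAT/idocs/J-ISIS Presentation New.doc"),
))
```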

b. CMD command

NOTE: The CMD command is not implemented, for security reasons.


Q. WinISIS Windows commands not implemented in J-ISIS

In WinISIS, most of these commands are specific to Microsoft Windows; they can now be replaced by XHTML and CSS commands. Some WinISIS commands, such as GOTO and GOBACK, are more complicated to implement because of the Client/Server architecture. However, these commands may be implemented in the future, when JSON communication between the client and the server becomes available.

FONTS

COLS

Paragraph Formatting

Indentation M(indent,indent)

Tabs TAB or TAB(value)

Centering QC

Justification QJ

Right alignment QR

Frame BOX

New page NP

Image insertion PICT

Character Formatting

Bold b

Italic i

Underline ul

Font choice fn

Font size fsn

Text color fln

Escape |


LINK Commands

These commands may be implemented in the future, when JSON communication between the client and the server becomes available.

GOTO

LOGOTO term

LAGOTO/xxx term

GOBACK

FORMAT format

BROWSE base,mfn,format

TEXTBOX format

TEXTBOXCHILD format

TEXTBOXRCHILD format

TEXTBOXLOAD format

TEXTBOXRCHILDLOAD format

TEXTBOXIMG format

TEXTBOXCHILDIMG format

TEXTBOXRCHILDIMG format

PROMPT TEXTBOX…

VIEW base, mfn, format

R. Differences with WinISIS

There are some differences in the print formatting language syntax between WinISIS and J-ISIS. J-ISIS uses a grammar to define the syntax and to generate the syntax analyzer (parser). The grammar was designed from the WinISIS Reference Manual and is stricter than WinISIS.

For example:

• %/ is not accepted and should be replaced by %#

• V07 should be replaced by V7

• A “conditional literal” should be followed by a field

• Literals can contain the literal delimiter escaped:

o 'l\'inventaire'

o "quote: \"AhAh\""

o | value1 \| value2 |

The J-ISIS print formatting language is also used for indexing, sorting, printing, reformatting, validating, exporting and importing records. The formatting language has a strict syntax and semantics and formats entered by the user are parsed before being accepted by the system.

10. Field Definition Table (FDT)

A. Introduction

The Field Definition Table (FDT) provides information on the contents of the master records in a given data base. In particular it defines the various fields which may be present and a number of parameters for each field. The FDT is used to control the creation of data entry worksheets for the data base and to validate the contents of fields. It is created or modified by means of the Edit -> Field Definition Table command of the main menu bar. A sample FDT, as displayed by the line editor, is shown below.

Each line of the FDT defines one field of the Master file record and contains 7 parameters: the field tag, name, type, presence of indicators (MARC21), repeatability, first subfield, and subfield delimiters or pattern. These are described below.

Field Tag - The tag is a unique numeric value identifying the field. As in CDS/ISIS, you will use the tag of the field each time you want J-ISIS to perform a given operation on the field. The tag is stored in the master record and is associated with the contents of the corresponding field.

Field Name - The field name is a descriptive name you assign to the field. It is normally used in data entry worksheets to label the field on the screen. You may consider that this is the name of the field as you know it, whereas the tag is the name by which the field is known to J-ISIS.

Field type - The field type indicates possible restrictions on the data characters which may be stored in the field. The field type may be one of the following:

ALPHANUMERIC ALPHABETIC NUMERIC PATTERN DATE TIME BLOB URL DOC

Indicators – Indicates if the field has indicators as defined in bibliographic formats such as MARC21. If this check box is checked, the advanced worksheet editor will automatically generate data entry fields for the indicators. In that case, the indicators are stored in the first subfield as defined below.

First Subfield – Indicates if the first subfield of a subfielded field has a subfield delimiter. Note that the first subfield of a subfielded field need not have a subfield delimiter, provided that it is always present. For example, if in a title field you wanted to use a subfield for the subtitle, the title part of the field, which will obviously always be present, need not have an explicit delimiter. Thus the following entry for this field would be possible:

Il nome della rosa^bUn manoscritto

If this box is checked, the advanced worksheet editor will automatically generate a data entry element for this implicit subfield.

Repeatability - This parameter defines whether the field is repeatable (i.e. it may occur more than once in any given record) or not.

Subfields/Pattern - Depending on the type of field defined, this entry defines either the set of subfields, if any, allowed in the field, or the pattern (for type PATTERN).

Subfields - If the field contains subfields, the allowed subfield identifiers are defined here, in the order in which they must appear. Note that the caret (^) identifying the subfield delimiter is not entered. For example, if a field may contain the subfields ^a, ^b and ^c, these are defined in the FDT as abc (and not as ^a ^b ^c).

B. General data base design guidelines

The generalized nature of J-ISIS allows you to define data bases according to your specific requirements. J-ISIS never makes any assumptions about the data you are processing and, in particular, it has no knowledge of their meaning. It simply provides a set of functions, normally required in any information storage and retrieval package, which help you in establishing efficient information systems. Because of this, it is impossible to provide a set of fixed rules for designing a data base, but only broad guidelines. The following paragraphs cover some basic topics on data base design. However, in order to get the most out of J-ISIS you should be fully familiar with all the facilities it offers, and particularly with the specific techniques described in this chapter, as a poor data base design may later prevent the use of some of the J-ISIS features. For example, a thorough understanding of such advanced features as the REF function of the formatting language (see under "REF(expression, format)") or the Groovy programming services is essential in the design of integrated data bases.

1. Data elements

A data element, as its name implies, is an elementary piece of information. The first step in designing your data base should be a careful and comprehensive analysis of the data elements required. Items normally eligible to be selected as data elements would be those that must be able to be processed individually. In determining this, you should ask yourself typical questions such as: "Will the item be needed for sorting?"; "Must it be searchable?"; "Will there ever be a need to print it differently than others, e.g. in bold face or upper case?"; etc. If the answer to any of these questions is yes, then the item should be selected as a data element.

2. Fields and subfields

Data elements may be stored in fields or subfields. A field is identified by a numeric tag and is defined in the FDT of the data base. You may think of the tag as the name of the field as it is known by J-ISIS. Each time you want J-ISIS to perform an operation on a particular data element you must supply the tag of the field where that data element is stored. For example, in the FDT given above, the title is assigned tag 24. If you wanted to display the contents of the title field you would ask J-ISIS to display V24 (which is the formatting language command to display a field). J-ISIS normally treats the contents of a field as a continuous string of characters and as a single entity. You may, however, subdivide a field into subfields. In this case the field contains more than one data element, each being stored in a different subfield. Unlike fields, subfields are not identified by a tag but by a subfield delimiter. A subfield delimiter is a 2-character code preceding and identifying a variable length subfield within a field. It consists of the caret character (^) followed by an alphabetic or numeric character, e.g., ^a. In the FDT listed above, the Imprint field has been defined as containing the place of publication, publisher and date of publication in the three subfields a, b and c respectively. A sample Imprint could be:

^aParis^bUnesco^c1985

A field containing subfields may be accessed as a single entity, by referring only to the tag of the field (e.g. v26). In this case J-ISIS provides options for displaying subfield delimiters (normally for proofreading purposes), or automatically replacing them by punctuation marks. However, because subfields are identifiable through their subfield delimiter, you may also access each subfield individually, by specifying both the field tag and the relevant subfield delimiter. For example, V26^b refers to the Publisher subfield of the Imprint field and V26^a to the Place of publication subfield. In designing a data base, remember that the J-ISIS formatting language has a facility for automatically replacing subfield delimiters by punctuation marks. Try, if possible, to choose delimiter codes in such a way that the replacing punctuation is suitable for the application; otherwise you will have to format each subfield individually. The standard delimiter replacement table is given under "Mode command". Note that the first subfield of a subfielded field need not have a subfield delimiter, provided that it is always present. For example, if in a title field you wanted to use a subfield for the subtitle, the title part of the field, which will obviously always be present, need not have an explicit delimiter. Thus the following entry for this field would be possible:

Il nome della rosa^bUn manoscritto
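The subfield conventions described above can be illustrated with a short Python sketch. It is not J-ISIS code: the function names and the placeholder code "*" used for an implicit first subfield are assumptions made for this illustration only.

```python
import re

# Hypothetical placeholder code for a first subfield that has no delimiter.
IMPLICIT_CODE = "*"


def parse_subfields(field):
    """Split a field string into (code, value) pairs.

    A subfield delimiter is the caret followed by one letter or digit.
    Text found before the first delimiter is returned under IMPLICIT_CODE.
    """
    # With a capturing group, re.split returns
    # [text_before_first_delimiter, code1, value1, code2, value2, ...]
    parts = re.split(r"\^([A-Za-z0-9])", field)
    subfields = []
    if parts[0]:
        subfields.append((IMPLICIT_CODE, parts[0]))
    for code, value in zip(parts[1::2], parts[2::2]):
        subfields.append((code, value))
    return subfields


def subfield(field, code):
    """Return the first subfield with the given code, or '' if absent."""
    for found_code, value in parse_subfields(field):
        if found_code == code:
            return value
    return ""


if __name__ == "__main__":
    print(parse_subfields("^aParis^bUnesco^c1985"))
    print(subfield("Il nome della rosa^bUn manoscritto", "b"))
```

Accessing one subfield here plays the role that V26^b plays in the formatting language; the sketch ignores indicators and repeatable occurrences.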

3. Repeatable fields

In those cases where a given data element may occur more than once in a given record, J-ISIS will create as many fields as required to hold all the occurrences of the data element. This type of field is called a repeatable field. A typical example is the Author field in a bibliographic record. All the occurrences of a repeatable field have the same tag. J-ISIS offers facilities for handling and formatting repeatable fields. You can access a particular occurrence of a repeatable field individually through the formatting language, so the first occurrence of a repeatable field (e.g. the first author) can be given a particular treatment if needed. Repeatable fields may contain subfields, which gives you a facility for handling 2-dimensional tabular data (one dimension being the field, and the other the subfields). Furthermore, you may define a field to be repeatable even though it contains a single data element. It may be useful, for example, to be able to break down a relatively long text such as an abstract or a summary, into paragraphs to improve its legibility in a printout. By defining such a field as repeatable, you may then use the formatting language facilities, provided for repeatable fields, to indent the first line of each paragraph. Another example is when you want to be able to search such long fields by words. By entering each paragraph as a separate occurrence, you may later use the (F) operator of the search language to restrict to a paragraph the search for two or more words, which you would not be able to do if the field was not repeatable (see under "Field level and proximity search operators").

4. Control characters

Certain characters stored in a field, although keyed in as data, will be interpreted by J-ISIS as control characters, rather than data characters, and will normally activate some special type of processing. Control characters are normally reserved for J-ISIS use and may not therefore be used as data. Subfield delimiters are an example of control characters. Other control characters recognised by J-ISIS are described below.

a. Search term delimiters

Search term delimiters may be used to identify key terms or phrases assigned to each record to enable its retrieval. The various techniques which J-ISIS provides to index records are described under The Field Select Table (FST). Keywords may be delimited in either of two ways: by enclosing them between a pair of slashes (/.../) or by enclosing them in triangular brackets (<...>). The advantage of triangular brackets over slashes is that brackets, unlike slashes, are reserved characters: J-ISIS provides options to either display the brackets or suppress them, whereas no option is provided to suppress slashes. When brackets are suppressed, they are normally deleted from the displayed version of the field, except when an open bracket immediately follows a closed one: in this case J-ISIS will replace them with a semicolon and a space. For example, by selecting the appropriate display mode the following entry:

<university course><documentation training><library school>

will be displayed as follows:

university course; documentation training; library school.

Except for the case mentioned above, you must ensure that the required spaces precede and follow the open and closed bracket respectively. For example, when keywords are embedded within other text in the field as below:

Mission report describing a <university course> in <documentation training> at an East African <library school>

the spaces surrounding the keywords must be present in order to produce the correct display:

Mission report describing a university course in documentation training at an East African library school

If the field was entered as follows:

Mission report describing a<university course>in<documentation training>at an East African<library school>

J-ISIS would display it as:

Mission report describing auniversity courseindocumentation trainingat an East Africanlibrary school

In other words, J-ISIS simply ignores the brackets and does not replace them with spaces.
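The bracket suppression rule can be mimicked in a few lines of Python. This is only a sketch of the behaviour described in this section, assuming keywords are entered as <keyword>; the function name is hypothetical and the code is not taken from J-ISIS.

```python
def suppress_brackets(text):
    """Mimic the display of a field when triangular brackets are suppressed.

    A closing bracket immediately followed by an opening one becomes
    a semicolon and a space; every other bracket is simply deleted,
    with no space inserted in its place.
    """
    text = text.replace("><", "; ")
    return text.replace("<", "").replace(">", "")


if __name__ == "__main__":
    print(suppress_brackets("<university course><documentation training><library school>"))
    print(suppress_brackets("describing a<university course>in<documentation training>"))
```

The second call shows why the surrounding spaces must be keyed in with the data: nothing is substituted for a deleted bracket.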


b. Filing information

When producing printed catalogues you will need to sort the contents of one or more fields in order to print the records in the required sequence. J-ISIS will try to produce a sorting sequence according to normally accepted filing rules, but sometimes this will not be possible. In these cases J-ISIS offers you the possibility to state explicitly how a given field must be sorted by supplying filing information at the time you enter the data. Filing information is permanently recorded in the field. This facility allows you to instruct J-ISIS to replace or ignore any sequence of data characters in a field whenever the field is used as a filing element, by using one of the following specifications:

<text-a=text-b> in this case, J-ISIS will replace text-a by text-b when the field is used in sorting, but use text-a (and ignore text-b) when displaying the field;

<text-a> in this case text-a will be ignored when sorting and only used to display the field.

Below are a few cases in which this facility is normally used (but its use is not restricted to just these cases):

Entered as     <The> evolution of information systems
Sorted as      EVOLUTION OF INFORMATION SYSTEMS
Displayed as   The evolution of information systems

Entered as     <100=one hundred> days
Sorted as      ONE HUNDRED DAYS
Displayed as   100 days

Entered as     <Mc=Mac>Pherson, J
Sorted as      MACPHERSON J
Displayed as   McPherson, J
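The two filing specifications can be sketched in Python as follows. The sketch assumes the entry syntax <text-a=text-b> and <text-a>, as in the example "<100=one hundred> days". The function names are hypothetical, and the sort key normalization (upper case, punctuation dropped) is inferred from the examples above rather than taken from the J-ISIS sources.

```python
import re

# <text-a> or <text-a=text-b>; group 1 is text-a, group 2 is text-b (or None).
_FILING = re.compile(r"<([^<>=]*)(?:=([^<>]*))?>")


def display_form(entered):
    """Keep text-a and drop the brackets and any '=text-b' part."""
    return _FILING.sub(lambda m: m.group(1), entered)


def sort_form(entered):
    """Use text-b in place of text-a, or nothing when only text-a is given.

    The upper-casing and removal of punctuation are assumptions that
    reproduce the 'Sorted as' lines of the examples.
    """
    replaced = _FILING.sub(lambda m: m.group(2) or "", entered)
    return " ".join(re.findall(r"\w+", replaced.upper()))


if __name__ == "__main__":
    for entry in ("<The> evolution of information systems",
                  "<100=one hundred> days",
                  "<Mc=Mac>Pherson, J"):
        print(display_form(entry), "|", sort_form(entry))
```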

C. J-ISIS Databases And MARC Records

1. ISO 2709 MARC Records The term MARC stands for: Machine Readable Cataloging. MARC records consist of structure, markup and content.

The structure of all MARC records is based on an exchange format for bibliographic records as specified in the ANSI/NISO Z39.2 and ISO 2709:1996 standards.

The markup and content are different for the different national formats (UNIMARC, USMARC, CANMARC, UKMARC, MARC21, etc.) and reflect the cataloging standards in use, such as cataloging rules, classification schemes and subject headings.

The cataloging rules describe what information to enter in a MARC record, where to find this information on the resource, the order in which to enter this information, and even the punctuation that should be used for entering each different piece of this information.

For example, the MARC21 format provides instructions on how to enter information into a MARC record. The MARC21 coding manuals tell us which field tags should be used for which type of information, what indicators are appropriate for those field tags, which subfields contain which pieces of information, and whether or not field tags and subfields are repeatable. (Information data tagging, MARC21 tags and subfield codes).


It's important to distinguish between:

• Structure – record syntax (MARC leader, directory, indicators, variable fields, etc.)

• Content – the descriptive information (title, subject, coded description, etc.)

• Markup – data tagging (MARC 21 tags and subfield codes)

2. All ISO 2709 MARC Formats use the same MARC record structure

The structure of MARC records is pretty straightforward, but it is not human readable. It consists of a byte stream with four building blocks:

Leader The leader is a fixed length field of 24 characters containing record processing information such as the record length, the status of the record, the type of material being catalogued and the base address of data. The base address of the data is the starting position for the variable fields.

Directory The directory immediately follows the leader and provides an index to the data fields. For each data field the directory provides an entry containing the field identifier or tag (three digits), the field length (four digits) and the starting position (five digits). The directory is terminated by a field separator.

Variable Fields The variable fields containing the actual record data follow the directory. There are three kinds of variable fields:

• Control number field (a special control field identified by tag 001)

• Control fields (identified by tags 002 through 009)

• Data fields (identified by tags 010 through 999)

Control fields contain only data (no indicators or subfield delimiters), while data fields contain indicators and subfields.
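The leader and directory layout described above can be explored with a small Python sketch. It builds a synthetic two-field record and reads it back through the directory. The helper names, the leader values and the sample data are illustrative assumptions; the terminator byte values 30 (field) and 29 (record) are those defined by ISO 2709.

```python
FT = b"\x1e"  # field terminator (byte value 30)
RT = b"\x1d"  # record terminator (byte value 29)
SF = b"\x1f"  # subfield delimiter (byte value 31)


def build_record(fields):
    """Serialize (tag, data) pairs into a minimal ISO 2709 byte string."""
    directory = b""
    body = b""
    for tag, data in fields:
        chunk = data + FT
        # Directory entry: tag (3 digits), length (4 digits), start (5 digits).
        directory += b"%s%04d%05d" % (tag.encode("ascii"), len(chunk), len(body))
        body += chunk
    base = 24 + len(directory) + 1       # leader + directory + its terminator
    total = base + len(body) + 1         # + record terminator
    # 24-character leader; the status/type codes are placeholders.
    leader = b"%05dnam  22%05d   4500" % (total, base)
    return leader + directory + FT + body + RT


def parse_record(record):
    """Return (leader, [(tag, data), ...]) using the directory."""
    leader = record[:24]
    base = int(leader[12:17])            # base address of data
    directory = record[24:base - 1]      # byte base-1 is the field terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        # Drop the trailing field terminator from the field data.
        fields.append((tag, record[base + start:base + start + length - 1]))
    return leader, fields


if __name__ == "__main__":
    rec = build_record([("001", b"12345"), ("245", b"10" + SF + b"aArithmetic")])
    print(parse_record(rec))
```

In the data field 245 the first two bytes are the indicators, followed by subfields introduced by the subfield delimiter; the control field 001 carries data only.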

4. Information Interchange Format (IIF), ANSI Z39.2, ISO standard 2709

IIF is the "Information Interchange Format", a record serialization format specified in ISO standard 2709, also published as ANSI Z39.2. IIF is mostly a plaintext format, in that almost all information is encoded using ASCII characters (no binary numbers) and the only control characters used are byte values 29 (record terminator, RT), 30 (field terminator, FT) and 31 (subfield delimiter). This standard specifies the requirements for a generalized information interchange format that will accommodate many types of data, especially bibliographic description of all forms of materials and related data such as authority, holdings, circulation, etc. It describes a generalized structure, a framework, designed specifically for exchange of data between processing systems and not necessarily for use as a processing format within systems. This standard does not specify the content of a record and does not, in general, assign meaning to tags, indicators, or data element identifiers. Such specifications shall be provided by particular implementations of the standard. The format may be used for the interchange of records using various communication media.

5. XML MARC Record Structure MARCXML

The Library of Congress' Network Development and MARC Standards Office is developing a framework for working with MARC data in an XML environment. This framework is intended to be flexible and extensible to allow users to work with MARC data in ways specific to their needs. The framework will contain many components such as schemas, stylesheets, and software tools developed and maintained by the Library of Congress.

6. J-ISIS Record Structure and MARC Records

The record object model implemented by J-ISIS is close to the MARC record structure, making J-ISIS particularly well suited for managing bibliographic data. Data elements are stored in fields, each of which is assigned a numeric tag indicative of its contents. A field may be optional (i.e. it may be absent in one or more records), it may contain a single data element, or two or more variable length data elements. In the latter case the field is said to contain subfields, each of which is identified by a 2-character subfield delimiter preceding the corresponding data element. Furthermore a field may be repeatable, i.e. any given record may contain more than one instance, or occurrence, of the field. J-ISIS uses the standard CDS/ISIS separator, the character "^". A subfield delimiter is a 2-character code preceding and identifying a variable length subfield within a field. It consists of the character ^ followed by an alphabetic or numeric character, e.g. ^a. J-ISIS is very flexible and allows you to define, through the field definition table, fields that correspond to the different national formats (UNIMARC, USMARC, CANMARC, UKMARC, MARC21, etc.). Furthermore, fields may be set to have indicators. The Advanced Worksheet Editor and Data Entry modules allow records with indicators to be edited. The information contained in the Leader of a MARC record (positions: 05/06/07/08/09/17/18/19) can be stored in special fields of J-ISIS, with tags over 3000. The present version uses tags 3000+offset (offset = byte position; for example, Ldr/05 is saved in field 3005). The transfer of Leader data to and from these special J-ISIS fields is done when importing/exporting.

Important Note: When creating a new database intended to store MARC records, it is important to respect the MARC rules for identifying the different types of variable fields: i) control number field (a special control field identified by tag 001); ii) control fields (identified by tags 002 through 009); iii) data fields (identified by tags 010 through 999). Control fields contain only data (no indicators or subfield delimiters), while data fields contain indicators and subfields. Following these rules will ensure correct data export to ISO2709, MARCXML and other XML formats.

However, by default J-ISIS makes no assumptions about the variable field types; when importing, all fields are assumed to be data fields, whatever their tag.
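The Leader mapping described above (tag = 3000 + byte position, for positions 05 to 09 and 17 to 19) can be sketched in Python as shown below. That each special field holds exactly one Leader character is an assumption, as are the function names and the sample leader; this is not the J-ISIS import/export code.

```python
# Leader positions that the manual says are stored in special fields.
LEADER_POSITIONS = (5, 6, 7, 8, 9, 17, 18, 19)
LEADER_TAG_BASE = 3000


def leader_to_fields(leader):
    """Map selected Leader positions to tags 3005, 3006, ... (import side)."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    return {LEADER_TAG_BASE + pos: leader[pos] for pos in LEADER_POSITIONS}


def fields_to_leader(fields, template):
    """Write the special fields back into a 24-character leader (export side).

    'template' supplies the positions that are not stored in fields,
    such as the record length and the base address of data.
    """
    chars = list(template)
    for tag, value in fields.items():
        chars[tag - LEADER_TAG_BASE] = value
    return "".join(chars)


if __name__ == "__main__":
    sample = "01142cam  2200301 a 4500"  # illustrative leader
    stored = leader_to_fields(sample)
    print(stored[3005], stored[3018])
    print(fields_to_leader(stored, "0" * 24))
```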


Checking import/export of an ISO 2709 MARC21 format record

For this purpose, we will use the single-record file for Carl Sandburg's Arithmetic provided by the Library of Congress.

Original MARC (2709) Record (sandburg.mrc)

Import In J-ISIS Database


Once the external ISO-2709 file has been successfully imported, we can view the J-ISIS database by clicking on "Data Viewer" in the "Browse" menu.


Export from J-ISIS Database to ISO2709 MARC file sandburg.iso

MARC-8 standard for character coding, and so will code this position in the MARC leader with a blank space.

zero for No lines

Once the J-ISIS sandburg database has been exported as an external ISO-2709 file, we can use any file compare tool to compare the exported file "sandburg.iso" with the original file "sandburg.mrc". For example, using the TextPad Compare Tool, we get:

Compare: (<)C:\jisis_suite 18 June 2014\jisis_suite\work\sandburg.iso (1142 bytes) with: (>)C:\jisis-workspace\LOC-Test files\sandburg.mrc (1142 bytes)

The files are identical
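Any byte-level comparison gives the same verdict as a dedicated compare tool. As a minimal sketch, the following Python function (a hypothetical helper, not part of J-ISIS) reports whether two files are identical and, if not, where they first differ.

```python
from pathlib import Path


def first_difference(path_a, path_b):
    """Return None if the files are byte-identical.

    Otherwise return the offset of the first differing byte; if one file
    is a prefix of the other, this is the length of the shorter file.
    """
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    if a == b:
        return None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    return min(len(a), len(b))
```

For the round trip described here, one would call first_difference on the exported "sandburg.iso" and the original "sandburg.mrc" and expect None.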

Export from J-ISIS Database to MARCXML file sandburg.mrcxml Until now we have been processing MARC data in ISO 2709 format, but J-ISIS can also read (import) and write (export) data in MARCXML format.

Once the J-ISIS sandburg database has been exported as an external MARCXML file, we can compare the exported file "sandburg.mrcxml" with the LC MARCXML file "sandburg.xml". For example, using the TextPad Compare Tool, we get:

Compare: (<)C:\jisis-workspace\LOC-Test files\sandburg.xml (3522 bytes) with: (>)C:\jisis_suite 18 June 2014\jisis_suite\work\sandburg.mrcxml (3522 bytes)

The files are identical

J-ISIS can also read (import) and write (export) data in other XML formats such as MODS and Dublin Core. In fact, when importing, the input is first pre-processed with a stylesheet that converts it to MARCXML, and the result is then imported into a J-ISIS database. In the same way, a J-ISIS database is exported to MARCXML and then processed with a stylesheet to produce the other XML format(s).

The Library of Congress provides stylesheets to transform MODS data to MARCXML and MARCXML to MODS, as well as other stylesheets to transform different bibliographic formats from and to MARCXML.

Please note that these stylesheets are accessed remotely over the Internet. An Internet connection is therefore necessary, and retrieving a stylesheet from the Library of Congress may take some time depending on the connection quality.


11. The Field Select Table (FST)

A Field Select Table (FST) defines criteria for extracting one or more elements from a J-ISIS database record. Depending on the context in which an FST is being used, these elements may then be used: i) to create the Lucene index entries for the record from which they were extracted; ii) to sort records in the desired sequence before producing a printed report; or iii) to reformat records during an import or export operation. An element can be generally defined as a fragment of a record resulting from a particular process. Although in many cases elements will be actual data elements, i.e. a field or a subfield, in other cases they may be words, phrases, or any other piece of data which has a particular meaning to a specific application.

FSTs are created or modified by means of the FST Manager under Tools > FST Manager. A sample FST is displayed below:

An FST consists of one or more lines, each defining three mandatory parameters and an optional parameter:

1. A field identifier (column labeled ID);

2. A field name (optional);

3. An indexing technique (column labeled Technique); and

4. A data extraction format coded using the J-ISIS formatting language (column labeled Format).


Whenever J-ISIS is requested to extract elements using an FST, it will read the relevant database records and carry out, for each record and for each FST entry, the following process:

1. Execute the format to extract from the record the corresponding data;

2. Apply the specified indexing technique to the data produced by the format; and

3. Assign to each element thus produced the specified field identifier prefixed with an underline character ( _ ) or the field name if defined.

The process described above is strictly mechanical and is performed exactly as described. There is no transmission of knowledge between one step and the next, only of data, although all steps co-operate in achieving the desired result. For example, the fact that a particular field was extracted during step 1 is not known to step 2: step 1 uses the full power of the formatting language to produce a string of characters and pass it on to the step 2. This step operates on this character string according to the specified indexing technique. Indexing techniques are defined as processes on character strings, not on records or fields. It is because of this generalized design that FSTs may be used for such different purposes as defining the contents of the Inverted file or specifying the sorting requirements of a printed listing, which might appear, at first sight, totally unrelated. In the most general terms, you may think of an FST as a device able to produce elements of data required to perform a certain task.

A. FST parameters The four parameters of an FST line are described below in the order they are processed.

1. Data extraction format

This is coded using the J-ISIS formatting language described under "The Formatting Language". Because the data produced by this format is not meant to be displayed, but further processed, J-ISIS does not restrict the line width to any particular value and, consequently, it will never split data between lines. The concept of lines, however, may be relevant to a particular indexing technique applied to the output produced by the format. In this case J-ISIS will guarantee that lines will only be created in response to explicit new line commands you specify in the format. Because of this, certain formatting commands such as the C, the indentation or the escape sequence commands would normally be irrelevant in a data extraction format and may, in some cases, produce unexpected results. They should therefore be avoided, unless they are necessary to achieve the intended result. On the other hand, the mode (see "Mode command") selected to output certain fields may be instrumental to the correct functioning of a particular indexing technique: certain techniques in fact require a specific mode (this is indicated under each indexing technique discussed below). It is your responsibility to insert the appropriate mode command(s) in the data extraction format, if necessary. Also note that requesting upper case translation may adversely affect further processes applied to the data produced by the FST. As a general rule you should not request upper case translation (use modes mpl, mhl or mdl as applicable, rather than mpu, mhu or mdu), unless you are sure it is needed and will not have any side effects. J-ISIS will automatically perform normalization of terms before storing them in the index (or dictionary). Normalization means conversion to upper case and stripping of diacritic characters.

For example, all elements generated by the Inverted file FST will be normalized before they are stored in the dictionary, even when the FST produces them in lower case with diacritic characters. Query terms are also normalized, both to find the right term suggestions and to run the search. For example, Jóború will be indexed as JOBORU, and querying for Jóború will retrieve records with fields/subfields containing JOBORU, but only Jóború will be highlighted.
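The normalization step (upper case, diacritics stripped) can be approximated in Python with the standard unicodedata module. This is a sketch of the idea only; the actual analyzer used by J-ISIS and Lucene may treat some characters differently, and the function name is an assumption.

```python
import unicodedata


def normalize_term(term):
    """Convert a term to upper case and strip its diacritics."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", term)
    # Combining marks have a non-zero canonical combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


if __name__ == "__main__":
    print(normalize_term("Jóború"))
```

Applying the same function to both index terms and query terms is what makes a query for Jóború match the stored JOBORU.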


2. Indexing Techniques

An indexing technique specifies a particular processing to be performed on the data produced by the format in order to identify the specific elements to be indexed. There are eleven indexing techniques which you may use. They are given a numeric code from 0 to 10 as explained below.

a. Indexing technique 0

Build an element from each line extracted by the format. This technique is normally used to index whole fields or subfields. Note, however, that J-ISIS will build elements from lines, not from fields. This is because J-ISIS looks upon the output of the format as a string of characters where fields are no longer identifiable. It is therefore your responsibility to produce the correct data through the format, especially when you are indexing repeatable fields and/or more than one field. In other words, when using this technique, your data extraction format should output one line for each element to be indexed.

b. Indexing technique 1

Build an element from each subfield or line extracted by the format. As J-ISIS will search the output of the format for subfield delimiter codes, for this technique to work correctly your format must specify proof mode (or no mode at all, as this is the default mode), because it is the only mode preserving the subfield delimiter codes on output (remember that heading and data mode replace subfield delimiters by punctuation marks). Note that indexing technique 1 is in fact a shortcut to using indexing technique 0. For example:

Record content: ^aParis^bUnesco^c1965

FST                          Format output            Elements (terms) produced
1 1 mpl,v26                  ^aParis^bUnesco^c1965    Paris
                                                      Unesco
                                                      1965
1 0 mhl,v26^a/v26^b/v26^c    Paris                    Paris
                             Unesco                   Unesco
                             1965                     1965
1 1 mdl,v26                  Paris, Unesco, 1965      Paris, Unesco, 1965

c. Indexing technique 2

Builds an element from each term or phrase enclosed in triangular brackets (<...>). Any text outside brackets is not indexed. Note that this technique requires proof mode, because the other modes delete the brackets. The advantages of using triangular brackets over using slashes (indexing technique 3) are discussed under "Search term delimiters".

A field containing "Mission report describing a <university course> in <documentation training> at an East African <library school>" will produce the following elements when indexed with this technique:

university course
documentation training
library school

d. Indexing technique 3

Does the same processing as indexing technique 2 except that terms or phrases are enclosed in slashes (/../). For example the following text:

Mission report describing a /university course/ in /documentation training/ at an East African /library school/

will produce the following elements when indexed with this technique:

university course
documentation training
library school

e. Indexing technique 4

Build an element from each word in the text extracted by the format. A word is any sequence of contiguous alphabetic characters. When you use this indexing technique, you may prevent certain non-significant words from being indexed by defining them in a special file called the Stopword file (see under "Creating a stopword file" for details on how to build a stopword file). Note: when this technique is used to index an entire field containing subfield delimiters, you must specify heading or data mode (mhl or mdl) in the corresponding data extraction format so that subfield delimiter replacement will take place before indexing; otherwise alphabetic subfield delimiter codes will be considered part of a word. It is also advisable to use heading or data mode if the field being indexed contains filing information, so that only the display form of the field is indexed and any data required for sorting the field is ignored (see under "Filing information").

f. Indexing techniques 5 to 8

The following 4 indexing techniques allow you to specify a prefix for search terms extracted with indexing techniques 1, 2, 3 and 4. These techniques are numbered 5, 6, 7 and 8 respectively. The prefix is specified in the data extraction format as an unconditional literal as follows:

'dp...pd', [format]

Where: 'd' is a delimiter of your choice (which does not occur in the prefix itself) and p...p is the actual prefix.

For example: 1 8 '/TI=/',v24

This will index each word of field 24 and prefix each term with TI=.

g. Indexing technique 9

Indexing technique 9 is specific to J-ISIS. It indexes the sentences found in the output produced by the print formatting language.

h. Indexing technique 10

Technique 10 generates N-gram tokens from the phrases recognized in the text.
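To make the element-building rules above more concrete, here is a minimal Python sketch of techniques 2, 3 and 4 and of the prefix handling of techniques 5 to 8. It is only an illustration: the function name extract_elements is hypothetical, the input is assumed to be the text already produced by the data extraction format, and it is not the code J-ISIS actually runs (techniques 0, 1, 9 and 10 are left out).

```python
import re

# A word is any sequence of contiguous alphabetic characters
WORD = re.compile(r"[^\W\d_]+")

def extract_elements(technique, text):
    """Return the index elements built from `text` (illustrative only)."""
    prefix = ""
    if 5 <= technique <= 8:
        # Techniques 5-8 are techniques 1-4 plus a prefix written as 'dp...pd'
        delimiter = text[0]
        end = text.index(delimiter, 1)
        prefix, text = text[1:end], text[end + 1:]
        technique -= 4
    if technique == 2:
        elements = re.findall(r"<([^<>]+)>", text)   # terms in <...>
    elif technique == 3:
        elements = re.findall(r"/([^/]+)/", text)    # terms in /.../
    elif technique == 4:
        elements = WORD.findall(text)                # every word
    else:
        raise ValueError("technique not covered by this sketch")
    return [prefix + element.strip() for element in elements]

print(extract_elements(8, "/TI=/saline water"))
```

With the format '/TI=/',v24 the formatted text starts with the delimited prefix, which is why the sketch strips it from the front before extracting the words.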

3. Field identifier

The field identifier is a number (in the range 1-32767) which is assigned to each element created during the analysis/extraction step. The meaning of the field identifier depends on the purpose the FST is being used for, as explained below.

Index/Inverted file FST: J-ISIS will use the field identifier prefixed with an underline character ( _ ) as the Lucene field name if the FST entry field name is empty. Otherwise, the FST entry field name will be used as the Lucene field name. The Lucene field name is used for indexing and searching.

Sorting FST: the field identifier is the field tag to be used in a user-supplied heading format (see “Heading format”).

Reformatting FST: the field identifier is the ISO tag to be assigned to an exported field, or the J-ISIS tag to be assigned to an imported field (see under “Reformatting FST”).
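The rule for deriving the Lucene field name of an index FST entry can be sketched in a few lines of Python. The function name lucene_field_name is hypothetical, and the check on the entry name assumes that "alphanumeric" means ASCII letters and digits (the naming rule itself is stated in the Field Name section).

```python
import re

# Assumed reading of the naming rule: ASCII letters, digits and underscore,
# not starting with a digit
NAME_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def lucene_field_name(field_id, field_name=""):
    """Return the Lucene field name for one index FST entry (sketch)."""
    if not 1 <= field_id <= 32767:
        raise ValueError("field identifier must be in the range 1-32767")
    if not field_name:
        # Empty name column: identifier prefixed with an underline character
        return f"_{field_id}"
    if not NAME_RULE.fullmatch(field_name):
        raise ValueError(f"invalid FST entry name: {field_name!r}")
    return field_name

print(lucene_field_name(24))           # name column empty
print(lucene_field_name(24, "Title"))  # name column filled in
```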

4. Field Name

The FST entry field name, if present, is used as the name of the Lucene field. Otherwise, J-ISIS uses the field identifier prefixed with an underline character ( _ ). The Lucene field name is used for indexing and searching. For example, we could extend the CDS data base FST with field names as follows:

Important Notes:

i) Field selection table entry names should consist of alphanumeric or underscore characters only and must not start with a digit. For example, CD/DVD/Cartridge is not allowed as a field name but CD_DVD_Cartridge is allowed.

ii) If several field selection table entries have the same ID, then the corresponding names should be the same, as in the following example:

After any change in the index FST (called master) of a database, you will need to re-index the data base to make the changes effective. Saving the modified FST and re-indexing the CDS database will change the dictionary as follows:

The Lucene field names are now the ones specified in the FST. It will also be reflected in the Search module:

B. Index/Inverted file FST

As noted earlier, one FST for each data base defines the contents of the corresponding index/inverted file. The elements built by this FST, once stored in the inverted file, constitute the dictionary of searchable terms for the data base.

1. Building a Search Index

J-ISIS uses Lucene for building the index and searching. The Index FST is used to produce the terms that will be stored in the Lucene index. One Lucene Document will be generated for each J-ISIS Database record, storing the elements produced by the FST entry fields as well as the master file number (MFN) to further retrieve the J-ISIS database records.

J-ISIS supports all advanced search language facilities provided by Lucene, such as proximity search operators, phrase search, wildcard searches, etc.

Furthermore, in order to support Lucene advanced search language facilities, it is necessary to index the individual words in the record fields using indexing technique 4.

In this example, the two repeatable fields with tags 8 and 9 each have two lines in the FST. The first line generates an index term for each occurrence, and the second line generates an index term for each word recognized in all occurrences. Both lines process the same field and assign the same ID to the format output.

As a simple example, the FST lines for ID 9 applied to the occurrence:

Will produce the following index terms:

HACIENDA and PUBLICA are produced by the line 9 4 (V9/) while HACIENDA PUBLICA is produced by the line 9 0 (V9/).

If we now consider the following occurrence of field 8:

J-ISIS will apply the two FST lines that involve field 8:

The first line tells J-ISIS to create an index term for each occurrence of field 8 in a particular record. Thus the whole text of the occurrence will be indexed. In this particular case that is not very useful, except that in the Guided Search you will get the whole text as a suggestion if you begin to type "apropi".

The second line is more interesting because it tells J-ISIS to create an index term for all words recognized in all occurrences, i.e. Apropiación, indebida, de, tributos, etc.

Indexing individual tokens like this opens the door for using all advanced search language facilities provided by Lucene, such as proximity search operators, phrase search, wildcard searches, etc.

If no stopwords file is provided, indexing with technique 4 indexes every token. If a stopwords file is provided, only the tokens that are not listed in it are indexed.
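The effect of having two FST lines with the same ID (one with technique 0, one with technique 4) can be pictured as follows. This is a simplified Python model, not the Lucene API: the document is a plain dictionary, the function names are hypothetical, and the occurrence text "Hacienda publica" is assumed from the index terms quoted above. Diacritics are not stripped in this sketch.

```python
import re

def index_terms(occurrences, technique, stopwords=frozenset()):
    """Terms produced by one FST line using (Vnn/) as its format (sketch)."""
    terms = []
    for occurrence in occurrences:
        if technique == 0:
            # One term per line, i.e. per occurrence
            terms.append(occurrence.upper())
        elif technique == 4:
            # One term per word, skipping stopwords
            for word in re.findall(r"[^\W\d_]+", occurrence):
                if word.lower() not in stopwords:
                    terms.append(word.upper())
    return terms

def build_document(mfn, field_id, occurrences, stopwords=frozenset()):
    """One document per record: the MFN plus the terms of both FST lines."""
    terms = index_terms(occurrences, 0) + index_terms(occurrences, 4, stopwords)
    return {"mfn": mfn, f"_{field_id}": terms}

print(build_document(1, 9, ["Hacienda publica"]))
```

Storing the MFN next to the terms is what lets a hit in the index lead back to the J-ISIS record.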

2. Stopwords File (stopwords.txt)

If you are indexing a field by separate words (indexing technique 4) you may want to prevent common, non-informative words such as 'an' or 'the' from being indexed. This can be achieved by setting up a stopwords file for the database. Words in the stopwords file will not be indexed by indexing technique 4 (though they may still appear as part of phrases produced with other indexing techniques). Note that there can only be one stopwords file for a given database, not different files for different fields.

The stopwords file needs to be set up outside J-ISIS using a text editor or word processor. It must be named stopwords.txt and must reside in the same folder as the FDT file for the database.

The file must contain one stopword on each line with no preceding spaces, and the words must be in lowercase letters. An example is shown below.

a
able
about
above
according
accordingly
across
actually
after
afterwards
again
against
all
allow
allows
almost
alone
along
already
......

The J-ISIS installation directory has a subdirectory called /stopwords that contains some predefined stopwords files for English, French and Spanish. If you wish to use the French or Spanish list, copy the relevant stopwords file into the /ifdt directory of the database and rename it to stopwords.txt.
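Before copying a stopwords file into place, it can be handy to check that it respects the layout described above. The following Python sketch does that; the function name load_stopwords is hypothetical and the UTF-8 encoding is an assumption, since the manual does not state the encoding of stopwords.txt.

```python
from pathlib import Path

def load_stopwords(path):
    """Read a stopwords file: one lowercase word per line, no leading spaces."""
    words = set()
    lines = Path(path).read_text(encoding="utf-8").splitlines()  # encoding assumed
    for number, line in enumerate(lines, 1):
        if not line:
            continue  # tolerate empty lines in this sketch
        if line != line.lstrip() or line != line.lower():
            raise ValueError(f"line {number}: expected a lowercase word with no leading spaces")
        words.add(line.rstrip())
    return words
```

A call such as load_stopwords("stopwords.txt") returns the set of words, or raises an error naming the first offending line.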

C. Reformatting FST

A reformatting FST can be provided during Import/Export operations. Reformatting will be performed before importing or exporting records if this option is checked and a FST name is provided. When used as a reformatting tool, the FST is interpreted in the following manner:

Each line of the FST represents an output field;

Each output field is assigned an ISO tag equal to the field identifier defined in the corresponding FST line;

The data extraction format given in the FST defines the contents of the field. In this format you must use the J-ISIS tag of the fields as defined for the database. Each line produced by the format (or each element, if the FST specifies indexing techniques 2, 3, or 4) will generate a new occurrence of the output field.

Assume for example that your database contains the following fields:

70 Author (repeatable)

24 Title

69 Keywords (repeatable)

50 Notes

A reformatting FST could be:

ID    IT   Format      Comment
1     0    mfn         Output field 1 contains the MFN
100   0    (V70/)      Output field 100 same as input field 70 (note the use of a repeatable group in the format to output each occurrence of field 70 as a separate line)
200   0    V24         Output field 200 same as input field 24
300   0    |<|V69|>|   Output field 300 contains keywords enclosed in <...>, each keyword taken from one occurrence of input field 69
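The interpretation rules can be mimicked in a few lines of Python. This is only a sketch of the four FST lines above, hard-coded rather than parsed from a real FST; the function name reformat and the record layout (a dictionary mapping tags to lists of occurrences) are assumptions made for the illustration.

```python
def reformat(mfn, record):
    """Apply the example reformatting FST to one record (sketch).

    `record` maps J-ISIS tags to lists of occurrences.
    """
    output = {
        1: [str(mfn)],                                         # 1 0 mfn
        100: list(record.get(70, [])),                         # (V70/): one line per occurrence
        200: ["".join(record.get(24, []))],                    # V24
        300: ["".join(f"<{k}>" for k in record.get(69, []))],  # |<|V69|>| on a single line
    }
    # Drop output fields for which the format produced nothing
    return {tag: occs for tag, occs in output.items() if any(occs)}

# Made-up sample record, for illustration only
sample = {70: ["Author A", "Author B"], 24: ["A title"],
          69: ["first keyword", "second keyword"], 50: ["A note"]}
print(reformat(4, sample))
```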

Note that, as none of the formats references field 50, this field will not be exported. You may therefore use a reformatting FST to only export selected fields. You will find below a FST created in the FST Manager that reflects the example.

You can test the reformatting FST option by exporting the CDS database with the above reformatting FST.

[Figure labels: export-test.iso; Reformatting FST]

If we now import the export-test.iso file into a J-ISIS database, we can see the changes made by the reformatting FST.

12. The FST Manager

12.1 Presentation

The FST manager offers a workspace for creating, editing, testing, and deleting FSTs.

The master index FST, as well as other types of FST, can be edited. The format syntax can be checked, and so can the output produced when the FST is applied to specific database records.

The top panel displays the available fields from the Field Definition Table (FDT), and the bottom panel displays the Field Selection Table entries.

• Selecting a row in the top panel and clicking on the down arrow adds a new entry to the FST with an ID identical to the field tag. The FST ID can be changed and may differ from any FDT tag.

• Selecting a row in the bottom panel and clicking on the up arrow removes the FST entry.

• The FST named "master" is the FST used for indexing the database and cannot be deleted.

A) FST Manager Control Panel

The FST Manager control panel contains the following items:

Allows a different FST to be selected. Clicking on this field displays the list of available FSTs (as defined in the /ifst folder of the server machine).

Clicking on this icon creates a new FST. A dialog asking for the name of the new FST pops up, and an empty FST is displayed in the bottom panel after you enter the name and click on OK. You can create FST lines by selecting a FDT line in the top panel and clicking on the down arrow icon between both panels. You can remove a FST line by selecting a FST line in the bottom panel and clicking on the up arrow icon between both panels (this operation has no effect on the FDT top panel).

Clicking on this icon saves the current FST. The FST is formatted in XML and saved in the /ifst folder of the server machine using Unicode UTF-8 encoding.

Clicking on this icon deletes the current FST. The XML FST file is deleted from the /ifst folder on the server machine.

Clicking on this icon checks the syntax of the formatting language commands written in the FST entries. The Output Console displays the FST entries analyzed and any syntax error detected.

Clicking on this icon applies the FST extraction commands to the first database record and displays the extracted terms in the Output Console. You can then check the terms that will be extracted and indexed according to the FST entries (formatting commands and indexing technique).

Output is displayed differently in case the FST is not the "master" index FST. For example if we want to reformat CDS database title field 24 for MARC21 we could have a FST named "reformat-marc21" and an entry such as:

Clicking on this icon would display the following in the Output Console:

Clicking on this icon applies the FST to the previous record and displays the results in the Output Console.

Clicking on this icon applies the FST to the next record and displays the results in the Output Console.

Clicking on this icon applies the FST to the last database record and displays the results in the Output Console.

You can enter a specific MFN; clicking on the icon then applies the FST to the record identified by this MFN.

B) FST Manager Create/Delete FST Entries Buttons

You can create FST lines by selecting a FDT line in the top panel and clicking on the down arrow icon between both panels.

You can remove a FST line by selecting a FST line in the bottom panel and clicking on the up arrow icon between both panels (This operation has no effect on the FDT top panel).

Checking this box will store a copy of the record in the index. This is intended to be used with search servers such as Solr.

Checking this box creates a catch-all search field that references all terms extracted by the FST. This is intended for keyword queries in the style of Google.

13. Building a Search Index

Before the data stored in a J-ISIS database or referenced by J-ISIS records can be searched quickly, it must be indexed.

Understanding the Indexing Process

We can break down the indexing process into four functionally distinct steps:

i) The text is first collected from records and/or documents. It should be plain text, stripped of any formatting commands.
ii) The plain text is then analyzed according to the Field Selection Table (FST) to produce a stream of terms.
iii) The terms are normalized (i.e. converted to uppercase and stripped of diacritics).
iv) Finally, those terms are added to the index.
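The steps above can be sketched with a toy inverted index in Python. The names normalize and build_index are hypothetical, the analysis step is reduced to word splitting (indexing technique 4), and a dictionary of sets stands in for the Lucene index; the real J-ISIS pipeline is driven by the FST.

```python
import re
import unicodedata
from collections import defaultdict

def normalize(term):
    """Convert to uppercase and strip diacritics."""
    decomposed = unicodedata.normalize("NFD", term)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.upper()

def build_index(records):
    """records maps MFN -> plain text; returns term -> set of MFNs."""
    index = defaultdict(set)
    for mfn, text in records.items():                      # i) collect
        text = unicodedata.normalize("NFC", text)
        for word in re.findall(r"[^\W\d_]+", text):        # ii) analyze
            index[normalize(word)].add(mfn)                # iii) normalize, iv) add
    return index

index = build_index({1: "Apropiación indebida", 2: "apropiacion"})
print(sorted(index["APROPIACION"]))
```

Because both spellings are reduced to the same term, a query typed without accents still reaches the record written with accents.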

14. Searching

J-ISIS provides two search methods: Guided Search, and Expert Search, which gives access to all Lucene capabilities.

Guided search is the simpler of the two and only supports Boolean terms. The structure of the search is constrained by the user interface, making it difficult to enter incorrect queries. Expert search permits a wider range of searching functions including proximity searching and searching for repeatable fields.

Fields that are indexed for searching are specified in the field selection table. Fields that are not indexed cannot be searched unless a free text search is used, which scans the entire contents of the records.

Indexed Terms

The indexed terms are normalized (i.e. converted to uppercase and stripped of diacritics).

i) Guided Search

The suggested terms are normalized. For example, "Jóború, Magda" is indexed and displayed as "JOBORU, MAGDA".

A search on "JOBORU, MAGDA" will nevertheless retrieve the record containing "Jóború, Magda".

ii) Expert Search

The user must enter the query, which is broken up into terms and operators. There are two types of terms: single terms and phrases. The single terms and phrases are normalized, and stopwords are removed if a stopwords.txt file is provided. This processing is done before the search is performed. Phrase searching on a sequence of terms is only possible on FST fields that use indexing technique 4 to extract the terms to index.

For example, FST field 24 (Title) of database CDS is indexed with technique 4, so we can do phrase searching on sequences of terms.

In this example, we search for "saline water" in field _24 (Title), which is the default search field. The record containing the sequence of terms "saline water" is retrieved even though only the individual words "saline" and "water" are indexed as terms, not "saline water". In that case we could also make a proximity search, for example "biology tropical"~10, which means: find records with biology and tropical within 10 words of each other.

Or we can search for a long phrase:

iii) Index Terms for hyphenated words

Extracting index terms with indexing technique 4 splits hyphenated words, i.e. "sud-ouest" is indexed as "sud" and "ouest". The advantage is that the records are retrieved by either the query "sud ouest" or the query "sud-ouest".

Search without the hyphen:

Search with the hyphen:

Please note that in that case, the hyphenated word is highlighted because the record text matches the query term.

14.1 Guided Search

Guided Search is selected by default. The new Guided Search module uses auto-complete user interface features that provide users with suggested queries or results as they type their query in the search box. This is also commonly called autosuggest or incremental search. The J-ISIS auto-complete implementation is very fast: even on large indices, suggestions are produced within a few milliseconds, so the user sees results pop up while typing.

A) Single Term Searching

Typing “T” or “t” on the CDS data base pops up the following results:

“Keywords” is the FST field name (also the Lucene field name) and the number between square brackets is the number of occurrences of the term.
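One common way to obtain such suggestions quickly is a binary search in a sorted term list. The Python sketch below shows the idea only; it is not how J-ISIS implements auto-complete, and the function name suggest, the tuple layout and the occurrence counts are all made up for the illustration.

```python
from bisect import bisect_left

def suggest(terms, typed, limit=10):
    """terms: sorted list of (TERM, field name, occurrence count) tuples."""
    prefix = typed.upper()               # typing "t" or "T" gives the same result
    start = bisect_left(terms, (prefix,))
    result = []
    for entry in terms[start:]:
        if not entry[0].startswith(prefix) or len(result) == limit:
            break
        result.append(entry)
    return result

# Hypothetical dictionary extract
terms = sorted([("TAIWAN", "Keywords", 1), ("SALINE", "Title", 2), ("TECHNIQUES", "Title", 3)])
print(suggest(terms, "t"))
```

The binary search jumps directly to the first term starting with the typed prefix, so the cost of a suggestion barely depends on the size of the dictionary.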

If we want to select a term from the pop-up list, we just click on the term in the list. For example, if we click on “TAIWAN”, the term is completed in the query field.

Now, if we click on the “Search” button, we get:

B) Multiple Term Searching

We can connect several terms by “AND” or “OR” using the “Match all of the following” (AND) and “Match any of the following” (OR) radio buttons.

For example, we keep “Taiwan” as first term and we click on the “+” to display a new search term line:

We type “geo”

And we click on the suggested term GEOMORPHOLOGY in the "Title" field to fill the search box with the “GEOMORPHOLOGY” term

We click on search doing the default AND matching (match all of the following):

Now we enable the “Match any of the following” radio button, and click on the “Search” button:

The query is therefore: “Taiwan” OR “geomorphology” in any search field

The list of records is by default presented from the most relevant retrieved record to the least relevant. You can check the 'Sort By MFN' check box to display the list in ascending MFN order.

We can change the PFT to html2:

Another Example:

You can refine the query on specific fields:

Suggestions will be restricted to terms that are in the specific field "Title".

14.2 Expert Search

The “Guided Search” checkbox is checked by default and must be unchecked to use Expert Search. When the checkbox is unchecked, a new panel is displayed in which a query can be entered:

[Figure labels: Query Panel; Unchecked checkbox]

Lucene query syntax should be used, knowing that the field names are either the FST field identifiers prefixed with the underline character "_" or the field names provided in the FST entry name column. See:

http://lucene.apache.org/java/2_1_0/queryparsersyntax.pdf

14.2.1 Terms

A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases. A Single Term is a single word such as "test" or "hello". A Phrase is a group of words surrounded by double quotes such as "hello dolly". Multiple terms can be combined together with Boolean operators to form a more complex query (see below).

Note: You can use a phrase query only when a FST field is indexed with indexing technique 4.

14.2.2 Fields

Lucene supports fielded data. The index FST of a J-ISIS data base defines the Lucene fields: each FST line (or entry) defines a Lucene field. The Lucene field name is the FST entry field name or, if no field name is specified, the FST entry field identifier prefixed with the underline character "_".

When performing a search you can either specify a field, or use the default field. The default Lucene field is the first field defined in the FST.

You can search any field by typing the FST field name, or the FST field identifier prefixed with the underline character ( _ ), followed by a colon ":" and then the term you are looking for.

[Figure labels: FST field identifier; FST field name]

As an example, let's assume that the CDS data base FST contains three entries with field identifiers 24, 69 and 70, and that the FST name column has empty cells.

If you look at the dictionary, the Lucene field names are _24, _69 and _70 respectively. The first FST entry has a field identifier of 24, therefore _24 (which is also the title) will be the default Lucene field.

If you want to find the records that contain broadcasting in the title and India as a keyword, you can enter:

_24:broadcasting AND _69:India

or

broadcasting AND _69:India

Since field identifier 24 (Title) is the default field, the field indicator is not required.

Note: The FST field identifier is only valid for the term that it directly precedes, so the query:

_69:satellite broadcasting

will only find "satellite" in the field _69. It will find "broadcasting" in the default field (in this case _24).
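The scope of a field indicator can be illustrated with a very small Python sketch that attaches a field to every term of a query. It is deliberately naive (it splits on spaces, so quoted phrases and parentheses are not handled) and is in no way the Lucene query parser; the function name assign_fields is hypothetical.

```python
def assign_fields(query, default_field="_24"):
    """Return (field, term) pairs for a simple query without phrases."""
    clauses = []
    for token in query.split():
        if token in ("AND", "OR", "NOT"):
            continue  # operators carry no field
        field, separator, term = token.partition(":")
        if separator:
            clauses.append((field, term))           # explicit field indicator
        else:
            clauses.append((default_field, token))  # falls back to the default field
    return clauses

print(assign_fields("_69:satellite broadcasting"))
```

To search both words in the keywords field, each term needs its own indicator (_69:satellite _69:broadcasting) or the terms must be grouped as described under Field Grouping.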

14.2.3 Term Modifiers

Lucene supports modifying query terms to provide a wide range of searching options.

Wildcard Searches

Lucene supports single and multiple character wildcard searches:
o To perform a single character wildcard search use the "?" symbol.
o To perform a multiple character wildcard search use the "*" symbol.

The single character wildcard search looks for terms that match the pattern with each "?" replaced by exactly one character. For example, to search for "wind" or "wood" you can use the search:

w??d

Multiple character wildcard searches look for 0 or more characters. For example, to search for technical, technique, techniques, technologies, or technology, you can use the search:

tec*

You can also use the wildcard searches in the middle of a term:

tec*ques

Please note that in that case term highlighting doesn't work for the time being.

Note: You cannot use a * or ? symbol as the first character of a search.
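The meaning of the two wildcard symbols can be reproduced by translating the pattern into a regular expression. The Python sketch below works on a plain list of terms and is only meant to show which terms a pattern selects; the function names are hypothetical and Lucene evaluates wildcards differently internally.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ? (one character) and * (zero or more) into a regex."""
    if pattern[:1] in ("*", "?"):
        raise ValueError("a wildcard cannot be the first character of a search")
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def match_terms(pattern, terms):
    regex = wildcard_to_regex(pattern)
    return [term for term in terms if regex.fullmatch(term)]

print(match_terms("w??d", ["WIND", "WOOD", "WORLD", "WATER"]))
```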

Fuzzy Searches

Lucene supports fuzzy searches based on the Levenshtein distance (edit distance) algorithm. To do a fuzzy search use the tilde, "~", symbol at the end of a single word term. Lucene's FuzzyQuery matches terms similar to a specified term. The Levenshtein distance algorithm determines how similar terms in the index are to a specified target term (see http://en.wikipedia.org/wiki/Levenshtein_Distance for more information about Levenshtein distance). Edit distance is another term for Levenshtein distance; it is a measure of similarity between two strings, where distance is measured as the number of character deletions, insertions, or substitutions required to transform one string into the other. For example, the edit distance between three and tree is 1, because only one character deletion is needed.

For example, to search for a term similar in spelling to "wood" use the fuzzy search:

wood~

This search will find terms like flood, world, wind, tool, book, and food.

An additional (optional) parameter can specify the required similarity. The value is between 0 and 1; with a value closer to 1, only terms with a higher similarity will be matched. For example:

roam~0.8

The default that is used if the parameter is not given is 0.5.
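For readers who want to check edit distances by hand, here is the classic dynamic programming computation of the Levenshtein distance in Python. It only computes the distance between two strings; how Lucene converts a distance into the similarity value between 0 and 1 is not covered by this sketch.

```python
def edit_distance(a, b):
    """Number of deletions, insertions or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution (free if equal)
            ))
        previous = current
    return previous[-1]

print(edit_distance("three", "tree"))
```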

Proximity Searches

Lucene supports finding words that are within a specific distance of each other. To do a proximity search use the tilde, "~", symbol at the end of a phrase. For example, to search for "vegetation" and "tropical" within 10 words of each other in a document use the search:

"vegetation tropical"~10

Range Searches

Range Queries allow one to match documents whose field(s) values are between the lower and upper bound specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically.

mod_date:[20020101 TO 20030101]

This will find documents whose mod_date fields have values between 20020101 and 20030101, inclusive. Note that Range Queries are not reserved for date fields. You could also use range queries with non-date fields:

Title:{Aida TO Carmen}

This will find all documents whose titles are between Aida and Carmen, but not including Aida and Carmen. Inclusive range queries are denoted by square brackets. Exclusive range queries are denoted by curly brackets.
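Because the bounds are compared lexicographically, a range query behaves like a string comparison. The short Python sketch below (the function name in_range is hypothetical) shows the difference between inclusive and exclusive bounds, and why numeric values need a fixed width such as the yyyymmdd form of the date example.

```python
def in_range(value, lower, upper, inclusive=True):
    """Lexicographic range test, as in [lower TO upper] or {lower TO upper}."""
    if inclusive:
        return lower <= value <= upper
    return lower < value < upper

print(in_range("20020101", "20020101", "20030101"))       # square brackets
print(in_range("Aida", "Aida", "Carmen", inclusive=False))  # curly brackets
print(in_range("9", "10", "20"))                          # strings, not numbers
```

The last call returns False because "9" sorts after "20" as a string, even though 9 is smaller than 10 as a number.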

14.2.3 Boolean Operators

Boolean operators allow terms to be combined through logic operators. Lucene supports AND, "+", OR, NOT and "-" as Boolean operators.

Note: Boolean operators must be written in upper case.

OR

The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.

To search for documents that contain either "soils and vegetation" or just "vegetation" use the query:

"soils and vegetation" vegetation

or

"soils and vegetation" OR vegetation

AND

The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND. To search for documents that contain plants AND techniques use the query:

plants AND techniques

"water balance" AND hydrature

+ (required operator)

The "+" or required operator requires that the term after the "+" symbol exist somewhere in the field of a single document. To search for documents that must contain "saline" and may contain "water" use the query:

+saline water

NOT

The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT. To search for documents that contain "biology" but not "water" use the query:

biology NOT water

Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:

NOT water

- (prohibit operator)

The "-" or prohibit operator excludes documents that contain the term after the "-" symbol. To search for documents that contain "water" but not "saline" use the query:

water -saline
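The set vocabulary used above (union, intersection, difference) maps directly onto Python sets, which makes it easy to predict what a Boolean query returns. The postings below are invented MFN numbers used purely for illustration; relevance ranking is ignored.

```python
# Hypothetical postings: index term -> set of MFNs (made-up numbers)
postings = {
    "BIOLOGY": {3, 8, 15},
    "WATER": {8, 21},
    "SALINE": {21, 30},
}

def hits(term):
    return postings.get(term.upper(), set())

either = hits("biology") | hits("water")    # biology OR water  (union)
both = hits("saline") & hits("water")       # saline AND water  (intersection)
without = hits("biology") - hits("water")   # biology NOT water, biology -water
required = hits("saline")                   # +saline water: water is optional

print(sorted(either), sorted(both), sorted(without), sorted(required))
```

In the "+saline water" case the optional term does not change which records are retrieved; in Lucene it only influences their ranking.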

14.2.4 Grouping

Lucene supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the Boolean logic for a query. To search for either "biology" or "measurements", together with either "plant" or "plants", use the query:

(biology OR measurements) AND (plant OR plants)

or, with a wildcard:

(biology OR measurements) AND plant*

This eliminates any confusion and makes sure that plant or plants must exist in the title, and that at least one of the terms biology or measurements must also exist in the title field (_24 in the CDS database).

Field Grouping

Lucene supports using parentheses to group multiple clauses to a single field. To search for a title that contains both the word "biology" and the phrase "saline water" use the query:

_24:(+biology +"saline water")

14.2.5 Escaping Special Characters

Lucene supports escaping special characters that are part of the query syntax. The current list of special characters is:

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

To escape these characters use the \ before the character. For example, to search for (1+1):2 use the query:

\(1\+1\)\:2
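When a query is built by a program rather than typed by hand, the escaping can be automated. The following Python helper is a minimal sketch with a hypothetical name (escape); as a simplifying assumption it escapes every & and | individually instead of treating && and || as pairs.

```python
# Characters that are part of the query syntax (& and | handled one by one)
SPECIAL = set('+-&|!(){}[]^"~*?:\\')

def escape(text):
    """Put a backslash in front of every special character."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in text)

print(escape("(1+1):2"))
```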

14.3 Expert Search Examples

14.3.1 Using Boolean Operators

The index term “INTERNATIONAL COOPERATION” has 3 occurrences in FST field _69 (Keywords)
The index term “INTERNATIONAL STANDARDS” has 1 occurrence in FST field _69 (Keywords)
The index term “INTERNATIONAL CONFLICT” has 2 occurrences in FST field _69 (Keywords)
The index term “INTERNATIONAL RELATIONS” has 1 occurrence in FST field _69 (Keywords)

Expert Search on terms in field 69 of data base CDS:

First, we uncheck the “Guided Search” checkbox and we enter the query as follows:

_69:"international conflict" OR _69:"international cooperation" OR _69:"international standards" OR _69:"international relations"

14.3.2 Wildcard Search Example:

14.3.3 Proximity Search

Field 24 is indexed with technique 4, i.e. the output produced by the print format "MHU, V24" is tokenized into uppercase words that are individually indexed.

For example, to search for "measurement" and "plants" within 10 words of each other in FST field 24 (Title) of a record, we can use the search:

"measurement plants"~10

Note: In this particular case, the search terms are not highlighted. This is a bug that will hopefully be fixed in the next release.
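The "within 10 words of each other" condition can be approximated in Python by comparing token positions. This is a simplified model with a hypothetical function name (within); Lucene's own slop computation differs in its details, and the sample title below is only used as illustrative text.

```python
def within(tokens, first, second, max_distance):
    """True if `first` and `second` occur at most `max_distance` positions apart."""
    first_positions = [i for i, token in enumerate(tokens) if token == first]
    second_positions = [i for i, token in enumerate(tokens) if token == second]
    return any(abs(i - j) <= max_distance
               for i in first_positions for j in second_positions)

# Sample title, tokenized into uppercase words as technique 4 would do
title = "TECHNIQUES FOR THE MEASUREMENT OF TRANSPIRATION OF INDIVIDUAL PLANTS".split()
print(within(title, "MEASUREMENT", "PLANTS", 10))
```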

14.3.4 Searching Arabic (ISA Database)

15. Search History

This box contains the list of the search expressions which have been executed so far. For each expression it gives the set number, the data base name, the number of hits and the search expression. A new element is added to this list each time a search expression is executed.

Double clicking an element of this list will display the corresponding results in the Data Viewer List.

Clicking on OK will open a Data Viewer List where you can browse only the retrieved records.

16. WARNING ON A GENERAL EDITING ISSUE

There are many places in J-ISIS GUI where the data is presented in a table and the user has the possibility to edit a cell by clicking on the cell. These include the FDT Editor, FST Editor and many other places.

For example in the FST manager:

If you start modifying the format by clicking on the cell and typing directly in the cell ('new text' literal added) as follows:

The modification is not taken into consideration (i.e. physically saved) until you press “ENTER” or click on another cell. Even if the change is displayed, it may not have been saved!

17. Importing

17.1 Importing ISO 2709 files

You should have established a database server connection before importing. In the examples below, we will use the WinISIS cds example database, which has been exported twice, to the ISO 2709 files cds0 and cds80 available in “jisis_suite 21 December 2012\jisis_suite\Test DB\WinISIS cds”. The ISO file cds0 contains one record per line, and the ISO file cds80 contains records that are split into lines of 80 characters.

The import of databases in ISO 2709 format has been extensively tested. WinISIS databases in ISO 2709 format with the encodings CP850, CP1256 (Arabic Windows) and UTF-8 have been successfully imported. Big databases with more than 170 000 (Louvre DB), 370 000 (MARC DB) and 1 800 000 (Index Translationum) records have been successfully imported.

Please note that, for performance reasons, indexing is not performed when importing and should be done afterwards through the “Re Index Database” menu item of the “Database” menu bar.

Step 1: Select External File

Select the appropriate format, encoding, and the external file. Please note that the CDS WinISIS database uses code page 850 (CP850), a code page that was used in Western Europe under DOS. The default encoding is ISO-8859-1, which is used by Windows, so the encoding must be changed to CP850. Then click on “Next”.
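Why the encoding matters can be seen with a few lines of Python, which knows both code pages. The word used here is just a sample containing an accented letter, and the function name decode_field is hypothetical; the point is that the same bytes give different characters depending on the encoding chosen.

```python
def decode_field(raw, encoding="cp850"):
    """Decode the bytes of an imported field with the given encoding."""
    return raw.decode(encoding)

raw = "Bibliothèque".encode("cp850")      # bytes as a DOS program would have written them

print(decode_field(raw))                  # decoded with CP850: text is intact
print(repr(decode_field(raw, "iso-8859-1")))  # wrong encoding: the accented letter is lost
```

If accented letters look wrong in the imported records, the encoding selected in this step is the first thing to check.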

Click on “Next” as we want to create a new database.

Step 2: Select the Import Option

The available options are similar to those available in WinISIS.

Step 3: Database
o Provide the database name,
o Click on “Create a Database from Existing Plain Old FDT and FST”
o Provide the FDT and FST path:

Step 4: Parameters

Change the default parameters if needed

Click on “Finish”, and then you will see the following dialog:

Click on “OK”, check the parameters and click on OK if they are correct.

Import will start and you can follow the status at the bottom on the right side.

When import is finished, you will get the following dialog:

Click on “OK” and you can now browse the database (“Browse” -> “DB Browser”):

DON’T FORGET TO INDEX THE DATABASE! J-ISIS uses Lucene to index the database records. Terms are generated from the formats provided in the FST. The index can be rebuilt at any time for the current DB, through the “Re Index Database” menu item of the “Database” menu bar. All WinISIS indexing techniques are implemented.

Wait until the progress indicator disappears and you can see:

You can now check the index by browsing the dictionary:

17.2 Importing MARC files

Importing MARC files differs from the ISO2709 import described above in two ways: you can import the record leader information into 30XX fields by checking the check box in step 4, and you can re-use the MARC FDT and FST templates. Please note that the information stored in 30XX fields (if any) will be moved back into the record leader when exporting.

There is one small MARC file named “summerland.mrc” and an extract from ABCD called “marc-ABCD.iso”, both located in “\jisis_suite\Test DB\marc”.

Step 2 is identical, and in step 3 you select the MARC template FDT and FST:

And in step 4:

• Don't forget to change the Input line length to "0"

• Check the "Move Leader Info into 30XX fields" checkbox

If you look at the database in the "Data Viewer", you will see:

18. Exporting

The "Export database" menu item allows you to extract all or part of the data of a database, normally in order to transmit it to other users. You may also use this command to reformat the records of a database and then use the import function to store the reformatted data into the original or a different database.

You can export in the following MARC formats: ISO 2709, MarcXML and MODS. The structure of all MARC records is based on an exchange format for bibliographic records as specified in the ANSI/NISO Z39.2 and ISO 2709:1996 standards. If the database contains leader fields 3000:3004, these fields will be placed in the record leader when exporting. Let's try to export the simple MARC database (summerland) that we just imported.
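For readers unfamiliar with the record leader mentioned above, here is a minimal Java sketch of how the fixed 24-character leader of an ISO 2709 / MARC record can be read. It is not taken from J-ISIS: the class MarcLeaderDemo and its method names are hypothetical, the sample leader is an invented value, and the mapping between leader positions and the J-ISIS 30XX fields is not shown because this manual does not specify it. Only the standard positions are used: record length (0-4), indicator count (10), subfield code length (11) and base address of data (12-16).

```java
/** Hypothetical demo class (not part of J-ISIS). */
final class MarcLeaderDemo {

    /** Total record length in bytes, leader positions 0-4. */
    static int recordLength(String leader) {
        return Integer.parseInt(leader.substring(0, 5));
    }

    /** Number of indicator characters per field, leader position 10. */
    static int indicatorCount(String leader) {
        return Character.digit(leader.charAt(10), 10);
    }

    /** Length of a subfield code (delimiter + identifier), leader position 11. */
    static int subfieldCodeLength(String leader) {
        return Character.digit(leader.charAt(11), 10);
    }

    /** Offset at which the field data starts, leader positions 12-16. */
    static int baseAddressOfData(String leader) {
        return Integer.parseInt(leader.substring(12, 17));
    }

    public static void main(String[] args) {
        String leader = "00714cam a2200205 a 4500"; // invented sample, 24 characters
        System.out.println("Record length: " + recordLength(leader));
        System.out.println("Base address:  " + baseAddressOfData(leader));
    }
}
```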

A) Select Output Format

Keeping "ISO2709" and clicking on the "Next" button will pop up the dialog below:

B) Output Parameters

Name of output ISO file: Enter the name of the output file. Extension will be ".iso" for ISO2709, ".mrcxml" for MarcXML and ".mod" for MODS.

Output Directory: You can select the output directory where the output file will be stored. The "work" directory is used by default.

Field separator: This field defines the field separator character to be used in the output file. The standard field separator defined in ISO 2709 is the ASCII character 30 (hexadecimal 1E).

Record separator: This field defines the record separator character to be used in the output file. The standard record separator defined in ISO 2709 is the ASCII character 29 (hexadecimal 1D).

Subfield delimiter: This field defines the separator character for subfields to be used in the output file. J-ISIS usually uses the character ^. However, many bibliographic standards use the character $. The default is "^". You may also specify any other ASCII character by using the combo box.
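The role of the three separators can be sketched in a few lines of Java. This is a simplified illustration, not the J-ISIS exporter: the class Iso2709Separators and its methods are hypothetical, and only the data portion of a record is modelled (a real ISO 2709 record also has a leader and a directory in front of the fields).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo class (not part of J-ISIS). */
final class Iso2709Separators {
    static final char FIELD_SEPARATOR = 0x1E;  // ASCII 30
    static final char RECORD_SEPARATOR = 0x1D; // ASCII 29

    /** Ends every field with the field separator and the record with the record separator. */
    static String joinFields(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (String field : fields) {
            sb.append(field).append(FIELD_SEPARATOR);
        }
        return sb.append(RECORD_SEPARATOR).toString();
    }

    /** Splits the data portion of one record back into its fields. */
    static List<String> splitFields(String data) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == FIELD_SEPARATOR) {
                fields.add(data.substring(start, i));
                start = i + 1;
            } else if (c == RECORD_SEPARATOR) {
                break; // end of this record
            }
        }
        return fields;
    }

    /** Replaces one subfield delimiter character with another, e.g. '^' with '$'. */
    static String changeSubfieldDelimiter(String field, char from, char to) {
        return field.replace(from, to);
    }

    public static void main(String[] args) {
        String data = joinFields(Arrays.asList("^aParis^bUnesco", "1976"));
        for (String field : splitFields(data)) {
            System.out.println(changeSubfieldDelimiter(field, '^', '$'));
        }
    }
}
```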

19. PFT Manager

19.1 Presentation

The PFT Manager offers a workspace for creating, editing, testing, converting and deleting PFTs.

It includes a PFT editor with syntax highlighting, copy/paste, undo/redo, syntax checking, PFT generation, etc.

The format editor has a left column used to display row numbers and a text area used to enter and display format elements. Format elements are displayed in different colors as they are recognized. The current row number is displayed in red, and the cursor row/column positions are displayed in the bottom left corner.

The top control panel has two parts: i) the left framed part, which lets you select, create, save or delete a format; ii) the text box editing buttons.

Text box editing buttons -

Clicking on the Generate HTML button opens a dialog that offers to generate different types of formats with HTML formatting.

Clicking on the Convert button opens a dialog for converting plain old formats to UNICODE UTF-8.

Clicking on the Find/Replace button opens a dialog that stays on top of the editor and lets you find or replace strings in the editor.

The Quote button is enabled when some text is selected in the editor. Clicking on the Quote button inserts single quotes (') at the beginning and end of each line covered by the selection, so that the quoted lines are parsed as J-ISIS format unconditional literals. This is useful when you insert HTML and/or JavaScript code in the format. Please note that the quote character is also part of the JavaScript language, so embedded quotes must be escaped as \'.
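The effect of quoting can be sketched as follows. This is a hypothetical Java helper, not the code behind the Quote button: the class QuoteLinesDemo and the method quoteLines are invented for illustration, and it is an assumption of this sketch that embedded quotes are escaped automatically (in the editor you may have to add the \' escapes yourself).

```java
/** Hypothetical demo class (not part of J-ISIS). */
final class QuoteLinesDemo {

    /** Wraps every line of the selection in single quotes, escaping embedded quotes as \'. */
    static String quoteLines(String selection) {
        StringBuilder out = new StringBuilder();
        String[] lines = selection.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append('\'').append(lines[i].replace("'", "\\'")).append('\'');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteLines("<b>Title</b>\nalert('hi');"));
    }
}
```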

Clicking on the Syntax button will perform a syntax analysis of the edited format to ensure that it conforms to the formatting language rules. The message "ISIS FMT Parser: PFT program parsed successfully." will be displayed in the output console if the format is parsed successfully.

Whenever J-ISIS detects an error in the format, it interrupts the parsing and issues messages such as:

ISIS FMT Parser: Encountered errors during parse. Encountered " "AHAHAERROR "" at line 2, column 16. Was expecting one of: ......

This message, together with the offending token ("AHAHAERROR " in the example) and the row/column positions, will help you determine the erroneous part of the format.

Clicking on the Apply button opens a browser window in the bottom panel, displaying the output of the editor format applied to the first database record.
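If you collect parser messages (for example when checking many old formats), the row and column reported in the error message quoted earlier can be extracted with a regular expression. The Java sketch below assumes only the "at line N, column M" wording shown in that message; the class ParseErrorPosition is hypothetical and is not part of J-ISIS.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical demo class (not part of J-ISIS). */
final class ParseErrorPosition {
    private static final Pattern POSITION =
            Pattern.compile("at line (\\d+), column (\\d+)");

    /** Returns {line, column}, or null when the message contains no position. */
    static int[] lineAndColumn(String message) {
        Matcher m = POSITION.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    public static void main(String[] args) {
        String message = "ISIS FMT Parser: Encountered errors during parse. "
                + "Encountered \" \"AHAHAERROR \"\" at line 2, column 16.";
        int[] position = lineAndColumn(message);
        System.out.println("line " + position[0] + ", column " + position[1]);
    }
}
```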

You can browse database records, or jump to a specific record and the output produced by applying the editor format to the record will be displayed.

Clicking on the Source Tab will display the XHTML+CSS+JavaScript code:

19.2 Re-Using Plain Old WinISIS PFTs

You can copy the old WinISIS PFTs into the /ipft directory of the J-ISIS database.

After copying the PFTs, closing all databases and re-opening the cds database, you will have access to these formats.

19.3 Problems you may face when using old PFTs

a) WinISIS formats may be split arbitrarily into lines of 80 characters, as in this example:

Clicking on the "Syntax" button will give an error, as shown above. The solution is to rework the format so that line splitting doesn't occur in the middle of an expression.

b) Strange characters are displayed:

J-ISIS is UNICODE: all data, including data stored in files, is stored using a UNICODE encoding.

The solution is to change the encoding to UNICODE by clicking on the "Convert" button:

Select the encoding of the old PFT, which is CP850 for most old DOS/Windows formats, and then click on the OK button. This changes the encoding to UNICODE UTF-8 and reloads the format. Once converted, the UNICODE format needs to be saved.
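Outside the PFT Manager, the same kind of conversion can be approximated for a single file with a few lines of Java. This is a sketch under assumptions, not what the Convert button actually runs: the class PftTranscoder and its methods are hypothetical, and IBM850 is the Java charset name for CP850 (assumed to be available, as in a full JDK).

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical demo class (not part of J-ISIS). */
final class PftTranscoder {

    /** Re-encodes bytes written in the given legacy charset as UTF-8. */
    static byte[] toUtf8(byte[] raw, String sourceCharset) {
        String text = new String(raw, Charset.forName(sourceCharset));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Reads a legacy format file and writes a UTF-8 copy of it. */
    static void convertFile(Path source, Path target, String sourceCharset) throws IOException {
        Files.write(target, toUtf8(Files.readAllBytes(source), sourceCharset));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PftTranscoder old.pft new.pft
        if (args.length == 2) {
            convertFile(Paths.get(args[0]), Paths.get(args[1]), "IBM850");
        }
    }
}
```

Writing to a separate target file keeps the original format untouched, which is convenient if the source encoding turns out to be something other than CP850.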

Click on "Save" to keep the new encoding.

20. J-ISIS Groovy Console

The J-ISIS Groovy Swing Console allows a user to enter and run Groovy scripts. This chapter documents the features of this user interface.

J-ISIS provides a Groovy Console tab that you can open through the "Tools"->"Groovy Console" menu item:

Important Note: When you open the Groovy console in J-ISIS, a console window is also attached to the J-ISIS application. This console window is not used directly, but it is needed for sending the output produced by Groovy scripts to the bottom panel. It will disappear when the J-ISIS Groovy module is closed. It looks like this:

Basics

1. The Console has an input area and an output area.
2. You type a Groovy script in the input area.
3. When you select "Run" from the "Actions" menu, the console compiles the script and runs it.
4. Anything that would normally be printed on System.out is printed in the output area.
5. If the script returns a non-null result, that result is printed.

Features

Running Scripts

Handy tips for running scripts:

• Ctrl+Enter and Ctrl+R are both shortcut keys for "Run Script".
• If you highlight just part of the text in the input area, then Groovy runs just that text.
• The result of a script is the value of the last expression executed.
• You can turn the System.out capture on and off by selecting "Capture System.out" from the "Actions" menu.

Editing Files

You can open any text file, edit it, run it (as a Groovy Script) and then save it again when you are finished.

• Select File -> Open (shortcut key Ctrl+O) to open a file
• Select File -> Save (shortcut key Ctrl+S) to save a file
• Select File -> New File (shortcut key Ctrl+Q) to start again with a blank input area

History and results

• You can pop up a GUI inspector on the last (non-null) result by selecting "Inspect Last" from the "Actions" menu. The inspector is a convenient way to view lists and maps.
• The console remembers the last ten script runs. You can scroll back and forth through the history by selecting "Next" and "Previous" from the "Edit" menu. Ctrl-N and Ctrl-P are convenient shortcut keys.
• The last (non-null) result is bound to a variable named '_' (an underscore).
• The last result (null and non-null) for every run in the history is bound into a list variable named '__' (two underscores). The result of the last run is __[-1], the result of the second to last run is __[-2] and so forth.

And more

• You can attempt to interrupt a long-running task by clicking the "interrupt" button on the small dialog box that pops up when a script is executing.
• You can change the font size by selecting "Smaller Font" or "Larger Font" from the "Actions" menu.
• The console can be run as an Applet thanks to groovy.ui.ConsoleApplet.
• Code is auto-indented when you hit return.
• You can drag and drop a Groovy script over the text area to open a file.
• You can modify the classpath with which the script in the console is being run by adding a new JAR or a directory to the classpath from the Script menu.
• Error hyperlinking from the output area when a compilation error is expected or when an exception is thrown.

Notes:

1. The Groovy Console has a specific menu bar and toolbar, which are embedded in the Groovy Console window. You create, edit, save, load or execute Groovy scripts through them.
2. It may be necessary to add the J-ISIS libraries path to the classpath through the Script->Add Directory to ClassPath menu item.

In J-ISIS, the Groovy programming language can be used in two different contexts:

• Using Groovy to write Format exits (calls from the PFT to external functions). The value returned by the script is then used to build the format output.

• Writing a Groovy program that is executed inside the Groovy console through the Exec toolbar button. This feature is intended to replace ISIS Pascal. The ISIS dll is replaced by the jisis-core.jar library.

21. Groovy Programming Language

Groovy is a dynamic language for the Java™ Virtual Machine (JVM). It offers full object-orientation, scripting, optional typing, operator customization, lexical declarations for the most common data types, advanced concepts like closures and ranges, compact property syntax and seamless Java™ integration. From Groovy, you can call any Java code exactly as you would from Java. You can also call Groovy code from Java.

21.1 Classes & Scripts

A Groovy class declaration looks like a Java class declaration. The default visibility modifier is public:

class MyClass {
    void myMethod(String argument) {
    }
}

When a .groovy file or any other source of Groovy code contains code that is not enclosed in a class declaration, then this code is considered a Script, e.g.

println "Hello World"

Scripts differ from classes in that they have a Binding that serves as a container for undeclared references (which are not allowed in classes):

println text // expected in Binding
result = 1 // is put into Binding

Methods may have parameters with or without default values and may return an expression:

def someMethod(para1, para2 = 0, para3 = 0) {
    // Method code goes here
    return expression
}

21.2 Groovy Tutorial

A Groovy tutorial is available at the following URL:

http://groovy.codehaus.org/Beginners+Tutorial

21.3 Using Groovy to write Format exits (Call from the PFT to external functions)

The TestFunc, SimpleTestFunc and pdfCatalogue Groovy scripts are provided with this distribution in the jisis-suite\work directory, and the SimpleTestFunc and TestFunc PFTs are defined in the ASFAEX database.

For example, you can create the following Simple Groovy Function:

You can test the function by clicking on the Execute Groovy Script Toolbar button as below:

Save it in the work directory with the name SimpleTestFunc.groovy. You can then create a SimpleTestFunc PFT that calls the SimpleTestFunc Groovy script.

A Format exit is invoked as follows:

&Name(format)

Where:

& identifies what follows as a Format exit invocation;
Name is the name of the Groovy script to be executed;

Groovy (and Format exit) Naming

Every programming language has its own set of rules and conventions for the kinds of names that you're allowed to use, and the Groovy programming language is no different. It follows Java rules and conventions for naming variables, classes and methods. It can be summarized as follows:

Variable names are case-sensitive. A variable's name can be any legal identifier — an unlimited-length sequence of Unicode letters and digits, beginning with a letter, the dollar sign "$", or the underscore character "_". The convention, however, is to always begin your variable names with a letter, not "$" or "_". Additionally, the dollar sign character, by convention, is never used at all. You may find some situations where auto-generated names will contain the dollar sign, but your variable names should always avoid using it. A similar convention exists for the underscore character; while it's technically legal to begin your variable's name with "_", this practice is discouraged. White space is not permitted.

Subsequent characters may be letters, digits, dollar signs, or underscore characters. Conventions (and common sense) apply to this rule as well. When choosing a name for your variables, use full words instead of cryptic abbreviations. Doing so will make your code easier to read and understand. In many cases it will also make your code self-documenting; fields named cadence, speed, and gear, for example, are much more intuitive than abbreviated versions, such as s, c, and g. Also keep in mind that the name you choose must not be a keyword or reserved word.

If the name you choose consists of only one word, spell that word in all lowercase letters. If it consists of more than one word, capitalize the first letter of each subsequent word. The names marcRecord and jisisRecord are prime examples of this convention. Please note that naming a Groovy function MARC-DISPLAY() is not correct, because the hyphen is not a legal identifier character, while MARC_DISPLAY() is accepted.
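The naming rule can be checked mechanically with the character tests from the standard Java library. The sketch below is only an illustration: the class ExitNameCheck is hypothetical, it is not how J-ISIS validates Format exit names, and it does not reject keywords or reserved words.

```java
/** Hypothetical demo class (not part of J-ISIS). */
final class ExitNameCheck {

    /** True when the name uses only characters that are legal in a Java (and Groovy) identifier. */
    static boolean isLegalIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true; // note: keywords such as "class" are not rejected here
    }

    public static void main(String[] args) {
        System.out.println("MARC-DISPLAY: " + isLegalIdentifier("MARC-DISPLAY"));
        System.out.println("MARC_DISPLAY: " + isLegalIdentifier("MARC_DISPLAY"));
    }
}
```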

Save it by clicking on "Save".

If you now click on the Apply button, you will see the output generated by applying this format to the first record:

Note that the output produced will be the same for all database records as we don't use any data from the record.

The full J-ISIS core library API is available for developing Groovy Scripts. It means that you can access the current record, the current database, the output produced by the format, or even other databases, FSTs, PFTs, parsing and execution of PFT, including searching, the index dictionary, iText Open Source PDF library, etc.

Below is an example that accesses data from database records:

Please note the statement:

IRecord rec = binding.getVariable("record");

This statement allows the script to access the current record which is provided through a &Name(format) command in a PFT. The current Database and the output produced by the format can be accessed through the following statements:

IDatabase db = binding.getVariable("db")

String format = binding.getVariable("format")

Writing a PFT called "TestFunc" with a single format exit command and clicking on the Apply button displays the following:

Clicking on the Last button will display data from the last database record.

22. Database Creation

To create a database in J-ISIS, a database definition wizard is used. This consists of a sequence of dialogs that prompt the user for input to create four core database elements:

o Field definition table (FDT),
o Data entry worksheet (WKS),
o Default print format definition (PFT),
o Field selection table (FST).

The field definition table defines the tag, name and type of fields in the database. Data entry worksheets create data entry interfaces that include only those fields that the user selects. Print format definitions are written in the ISIS formatting language and define the appearance of records. The field selection table selects fields to index and a corresponding indexing method.

Clicking on "Next" displays a wizard panel containing the Field Definition Table Editor.

Field Definition Table (FDT) – Database Structure

The Field Definition Table (FDT) provides information on the contents of the master records in a given database. In particular, it defines the various fields which may be present and a number of parameters for each field. The FDT is used to control the creation of data entry worksheets for the database and to validate the contents of fields.

A field is created or updated by providing data in the fields of the upper line:

Each line of the FDT defines one field of the Master file record and contains 7 parameters: the field tag, name, type, presence of indicators (Marc21), repeatability, first subfield, and subfield delimiters or pattern. These are described below.

Field Tag - The tag is a unique numeric value identifying the field. As in CDS/ISIS, you will use the tag of the field each time you want J-ISIS to perform a given operation on the field. The tag is stored in the master record and is associated with the contents of the corresponding field.

Field Name - The field name is a descriptive name you assign to the field. It is normally used in data entry worksheets to label the field on the screen. You may consider that this is the name of the field as you know it, whereas the tag is the name by which the field is known to J-ISIS.

Field type - The field type indicates possible restrictions on the data characters which may be stored in the field. The field type may be one of the following:

Indicators – Indicates if the field has indicators as defined in bibliographic formats such as Marc21. If this check box is checked, the advanced worksheet editor will automatically generate data entry fields for the indicators.

First Subfield – Indicates if the first subfield of a subfielded field has a subfield delimiter.

Note that the first subfield of a subfielded field need not have a subfield delimiter, provided that it is always present. For example, if in a title field you wanted to use a subfield for the subtitle, the title part of the field, which will obviously always be present, need not have an explicit delimiter. Thus the following entry for this field would be possible:

Il nome della rosa^bUn manoscritto

If this box is checked, the advanced worksheet editor will automatically generate a data entry element for this implicit subfield.

Repeatability - This parameter defines whether the field is repeatable (i.e. it may occur more than once in any given record) or not.

Subfields/Pattern - Depending on the type of field defined, this entry defines either the set of subfields, if any, allowed in the field, or the pattern (for type PATTERN).

Subfields - If the field contains subfields, the allowed subfield identifiers are defined here, in the order in which they must appear. Note that the caret (^) identifying the subfield delimiter is not entered. For example, if a field may contain the subfields ^a, ^b and ^c, these are defined in the FDT as abc (and not ^a ^b ^c).

Let's create a field with tag 10, "title" as name and no subfields:

We first define the tag, name and subfields in the line editor, and then click on the « Add/Update » button to create the field in the FDT table below.

Let's create a second, repeatable field with tag 20, "Authors" as name and subfields "ab".

Clicking on "Next" displays a wizard panel containing the Worksheet Editor.

Data Entry Worksheet

Clicking on the double down arrow button creates worksheet fields from the FDT fields, as shown above.

Clicking on the "Next" button displays a wizard panel containing the Field Selection Table Editor.

Field Selection Table (FST)

Fields can be moved from the FDT into the FST or removed from the FST by clicking on the down and up arrows respectively.

Clicking on the "Finish" button creates the database.

23. Data Entry

The J-ISIS Edit menu provides a Data Entry module and an Advanced Data Entry module. The Data Entry module lets you enter data at the field level, specifying the subfield delimiters explicitly, while the Advanced Data Entry module displays a hierarchical view of the worksheet fields and subfields that lets you enter data at both the subfield and field level.

Data is entered manually through a data entry interface specified by the user through a worksheet definition. Data entry worksheets or Advanced Data Entry worksheets are used to create and/or update the master records of the database. The J-ISIS Edit menu provides two specially designed editors to create these worksheets: the Data Entry Worksheet module and the Advanced Worksheet Editor. The worksheet XML file format is compatible with both editors, and the Advanced Worksheet Editor additionally lets you enter detailed information for data entry at the subfield level.

23.1 Data Entry Worksheet

Whenever you select the Data Entry Worksheet module of the Edit menu, it opens the data entry worksheets editor.

There is a correspondence between the field definition table (FDT) fields and the worksheet fields: a worksheet field corresponds to the FDT field with the same tag. The top panel displays the field definition table (FDT), from which fields can be selected and copied to create a worksheet field in the bottom worksheet panel. Worksheet fields can be edited, while FDT fields cannot be changed.

To create a worksheet field you need to select a field in the FDT panel and to click on the down arrow icon.

To remove a worksheet field, you need to select a field in the worksheet panel and click on the up arrow icon.

When a field is selected in the FDT panel, the double down arrow icon is enabled and you can click on it to create worksheet fields for all FDT fields.

When a field is selected in the worksheet panel, the double up arrow icon is enabled and you can click on it to remove all worksheet fields.

Select a worksheet

Create a new worksheet; a dialog pops up asking for the worksheet name.

Delete the current worksheet.

Restore the worksheet as it was before any modification.

Save the current worksheet.

Close the worksheet editor

Changing the order of the worksheet fields

The worksheet fields are defined sequentially in the order of their respective creation. The data entry module will display the data entry fields according to the worksheet field sequence.

The order of the worksheet fields can be changed in the worksheet editor by selecting a field row and dragging it to the desired position.

23.2 Data Entry Module

A) Presentation

Whenever you select the Data Entry module of the Edit menu, the first worksheet is selected and a window like this is displayed:

[Figure: data entry window, with callouts for Worksheet Selection, Worksheet Scroll Bar and Data Entry Fields]

This window displays the field prompts, default values and help messages defined in the worksheet named "Default Worksheet". The data fields are empty unless a default value is defined in the worksheet. A scroll bar lets you view the fields that do not fit into the viewing space. It contains a bar (or thumb) that can be dragged along a trough (or track) to move the body of the worksheet, as well as an arrow at either end for precise adjustments.

Repeatable fields are displayed with a "+" button that lets you add occurrences:

When a repeatable field has more than one occurrence, each occurrence after the first is displayed with a "-" button. Clicking on that button deletes the corresponding occurrence.

Pick List Button

Following WinISIS implementation, Pick Lists are defined in a file called databaseName.val where databaseName stands for the name of the database. This file must be stored in the /iwks folder of the database and is unique. Pick lists can be defined for all fields that need it, and only the pick lists defined in the selected worksheet will be available. The following pick list:

69:choice:<>:notype:multi::'Keywords'/'keyword 1'/'keyword 2'/'keyword 3'

is defined in a file called CDS.val stored in the /CDS/iwks/ folder. The data entry window provides a "Pick List" button in front of the data field, and the data area is greyed out because of the notype pick list parameter.
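To make the structure of such a pick list line easier to see, here is a small Java sketch that takes the example line apart. It is not the J-ISIS parser: the class PickListLine is hypothetical, and the sketch assumes only what the example shows, namely a numeric tag first, colon-separated options such as notype and multi, and a series of single-quoted strings separated by "/". It does not interpret the other options or assign a special role to any of the quoted strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical demo class (not part of J-ISIS). */
final class PickListLine {
    final int tag;            // field tag the pick list applies to
    final boolean noType;     // "notype" option present
    final boolean multi;      // "multi" option present
    final List<String> items; // the quoted strings, in order

    private PickListLine(int tag, boolean noType, boolean multi, List<String> items) {
        this.tag = tag;
        this.noType = noType;
        this.multi = multi;
        this.items = items;
    }

    static PickListLine parse(String line) {
        // Everything before the first quote is the colon-separated header.
        int firstQuote = line.indexOf('\'');
        String header = firstQuote < 0 ? line : line.substring(0, firstQuote);
        List<String> options = Arrays.asList(header.split(":"));

        // Everything from the first quote on is the list of quoted strings.
        List<String> items = new ArrayList<>();
        if (firstQuote >= 0) {
            Matcher m = Pattern.compile("'([^']*)'").matcher(line.substring(firstQuote));
            while (m.find()) {
                items.add(m.group(1));
            }
        }
        return new PickListLine(Integer.parseInt(options.get(0).trim()),
                options.contains("notype"), options.contains("multi"), items);
    }

    public static void main(String[] args) {
        PickListLine p = parse(
                "69:choice:<>:notype:multi::'Keywords'/'keyword 1'/'keyword 2'/'keyword 3'");
        System.out.println("tag=" + p.tag + " notype=" + p.noType
                + " multi=" + p.multi + " items=" + p.items);
    }
}
```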

Pick List Button

Clicking on the Pick List button pops up the Pick List dialog, which allows multiple selections (multi parameter).

We can, for example, select the first element with a click and the third element with Ctrl/click, and then click on OK. The result is:

Note: You can double click an element in the Pick List dialog, but then only the double clicked element is retained, which is not what we want in this case.

Digital Library Document Selection Button

A worksheet that contains a DOC type field will initially display a single occurrence data entry field with a special button for selecting the document to load. The document to load may be in any format (PDF, DOC, DOCX, XLS, PPT, etc.).

Once selected, the document is converted into plain text and stored in the first occurrence of the DOC field. It is also copied verbatim into the "/idocs" database folder, and the hyperlink (URL) is stored in the second occurrence of the DOC field, thus allowing access to the original document.

Digital Library Selection Button

Clicking on the DL Selection Button will pop up a dialog for selecting a document:

Selecting, for example, the document "Z39-2.pdf" in the list above will fill the occurrences of field 10 (Document Text) as follows:

The first occurrence contains the extracted text that will be used for indexing. The second occurrence still contains the local link (or path), because the document remains on the client side until the record, together with a copy of the original document, is saved on the server side.

After saving the record by clicking on the diskette icon, we can see the server hyperlink. A copy of the original document is now stored in the DL-example/idocs folder.

Looking at the record in the data viewer, we can see that the second occurrence contains a link to the server side document, which can be served by the embedded Jetty Web server.

Clicking on the link will launch the PDF reader:

You can define a RAW format or another format that displays record content without the extracted text but with the document hyperlink and possibly other information.

B) Data Entry Process

Subfields and repeatable fields are permitted. Existing records can be modified or deleted through the same interface. Records are stored in a Berkeley DB that plays the role of the Master File. There is only one Berkeley DB per database. A Lucene index and a field selection table (FST) are associated with each database. The field selection table defines the print format to be applied to a record for extracting the terms to index.

The basic data entry facilities called CRUD (Create, Retrieve, Update & Delete using a user worksheet) are implemented, and the index is updated each time a record is saved or deleted.

The "Dictionary Browser", "DB Browser" and "Data Viewer" are synchronized with the "Data Entry" when they are opened simultaneously in the application.

Subfielded fields

When you enter a field containing subfields you must key in the required subfield delimiters in front of each subfield. A subfield delimiter is a 2-character code preceding and identifying a variable length subfield within a field. It consists of the character ^ followed by an alphabetic or numeric character, e.g.

^a

If the subfield code is alphabetic, you may enter it in either upper or lower case: J-ISIS makes no difference between ^a and ^A. You may therefore use the most convenient form.

Do not insert spaces or punctuation marks either before or after the subfield delimiter, unless you have been specifically instructed to do so. Entering spaces or punctuation may adversely affect the printing of the field later on. Here is an example of a field with three subfields:

^aParis^bUnesco^c1976
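The way such a field value breaks down into subfields can be sketched in Java as follows. This is an illustration, not J-ISIS code: the class SubfieldParser is hypothetical, storing the implicit first subfield under the key '*' is a convention chosen for this sketch, and a repeated subfield code simply overwrites the earlier value. Codes are lower-cased because ^a and ^A are equivalent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical demo class (not part of J-ISIS). */
final class SubfieldParser {

    /** Splits a field value into subfields keyed by their (lower-cased) code. */
    static Map<Character, String> parse(String field) {
        Map<Character, String> subfields = new LinkedHashMap<>();
        String[] parts = field.split("\\^", -1);
        // Text before the first ^ is the implicit first subfield, if any.
        if (!parts[0].isEmpty()) {
            subfields.put('*', parts[0]);
        }
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].isEmpty()) {
                continue; // a ^ with nothing after it carries no subfield
            }
            char code = Character.toLowerCase(parts[i].charAt(0));
            subfields.put(code, parts[i].substring(1));
        }
        return subfields;
    }

    public static void main(String[] args) {
        System.out.println(parse("^aParis^bUnesco^c1976"));
        System.out.println(parse("Il nome della rosa^bUn manoscritto"));
    }
}
```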

Another example:

Repeatable fields

If the field you are entering is repeatable and you need to enter more than one occurrence, enter each occurrence separately, and click on the repeatable field icon ("+") for each new occurrence to be added.

Data Editing Commands

Copy/Paste (Ctrl/C, Ctrl/V) and Undo/Redo (Ctrl/Z, Ctrl/Y) can be used during data entry.

Control Panel

The data entry window control panel contains the following items:

Allows you to select a different worksheet. Clicking on this field displays the list of available worksheets (as defined in the /iwks folder).

This field contains the current MFN number. Clicking on this field allows you to edit a particular record by typing the desired MFN number and then pressing the Enter key.

Displays the first record. If you are editing a search result the first record matching the search expression is displayed. If you are editing the data base sequentially, the first data base record is displayed.

Displays the previous record. If you are editing a search result the previous record (if any) matching the search expression is displayed.

Displays the next record. If you are editing a search result the next record (if any) matching the search expression is displayed.

Displays the last record. If you are editing a search result the last record matching the search expression is displayed. If you are editing the data base sequentially, the last data base record is displayed.

Creates a new record. The current worksheet is displayed with all its fields empty.

Saves the current record in the Master file.

This toggle switch allows you to show (or remove) empty fields from the screen.

This toggle switch allows you to enable Right To Left data entry and display.

Applies the validation rules, if any.

Cancels all the changes made and restores the record to its initial status.

Creates a new record with the same content as the current one. The created record is assigned the next available MFN.

Clears the contents of all the fields in the worksheet.

Copies the current record to the stack.

Pastes a record from the stack.

How to delete a field: Empty the content of the 1st occurrence and delete all other occurrences. The field will be deleted when the record is saved.

BLOB Fields with Images

J-ISIS offers two methods for storing images: i) images are stored separately in the /images folder of the database and are accessed through an image hyperlink; ii) images are stored in the record as a Binary Large Object (BLOB).

You can define a BLOB field type in the FDT. A BLOB field lets you copy and paste a mixture of text and images, or a single image. The Data Entry module allows entering images with Cut and Paste; in that case, the image is stored in the record itself.

23.3 Advanced Worksheet Editor Module

The advanced worksheet editor uses a Tree-Table layout. It allows defining a worksheet that goes down to the subfield level, defining repetitive subfields, and including field indicators and implicit subfields.

THE ADVANCED WORKSHEET EDITOR BUILDS WORKSHEETS THAT ARE INTENDED TO BE USED BY THE ADVANCED DATA ENTRY MODULE. HOWEVER, THESE WORKSHEETS REMAIN COMPATIBLE WITH THE STANDARD DATA ENTRY MODULE.

There are two use cases:

1) Enrich a worksheet created with the Worksheet Editor to get subfield data entry.

2) Create or Modify an advanced worksheet

Enrich a worksheet created with the Worksheet Editor to get subfield data entry.

The default worksheet definition created at database creation is defined with the standard worksheet editor. This worksheet, or any worksheet created with the Data Entry Editor, defines data entry fields at the field level. It doesn't include the subfields. The Advanced Worksheet Editor allows defining data entry fields at the subfield level. If we load a standard worksheet, the subfields are not defined. For example, the CDS1 worksheet would look as follows:

The field with tag 12 has subfields and a first implicit subfield. To make them available, we remove the tag 12 field from the worksheet

and add the tag 12 field again from the Available Fields panel.

We now have the new field 12 with a "+" node that can be expanded to see the subfields by clicking on it:

We now have worksheet entries for the subfields. We can change the default prompt and indicate whether the subfield is repetitive (the type can also be changed).

Double-clicking on the "v12^*" prompt cell allows editing the prompt:

Once the subfield prompts are entered, it will look like this:

This process can be repeated for each field that contains subfields. Please note that it is important to define the subfields correctly in the FDT so that they are reflected in the worksheet definition.

Create or Modify an advanced worksheet

The following dialog pops up when clicking on the New button:

An empty worksheet definition is then displayed.

Clicking on the Add All button will add all the fields to the worksheet definition.

Fields with subfields have a "+" node that can be expanded to see them by clicking on it:

The subfield prompts can now be edited to replace the default Vxx^x prompts with meaningful text.

Standard and advanced worksheet definitions are stored in the same file in XML format. Worksheets created or modified with the advanced worksheet editor contain more information but remain compatible with the standard data entry module.

We have used the "Add All" button to move all fields, but fields can also be selected individually and inserted anywhere in the bottom Tree-Table. You just have to select the node (or tree root) after which you want to insert the field, and the sub-nodes will be created automatically.

It's quite easy to define template worksheets for Marc21 bibliographic or authority records and for Unimarc bibliographic records.

Example of a Marc21 bibliographic worksheet:

Example of a Marc21 authority control worksheet:

23.4 Advanced Data Entry

The Advanced Data Entry Editor uses a Tree-Table layout and worksheets defined with the advanced worksheet editor. Editing is done at the subfield level if a field has subfields or at the field level for fields without subfields.

It also provides interactivity for the basic functions of entering, editing, viewing, or deleting records, that is, CRUD (Create, Read, Update, Delete).

A) Advanced Data Entry Control Panel

The data entry window control panel contains the following items:

Allows you to select a different worksheet. By clicking on this field the list of available worksheets (as defined in the /iwks folder) is displayed.

This field contains the current MFN number. Clicking on this field allows you to edit a particular record by typing the desired MFN number and then pressing the Enter key.

Displays the first record. If you are editing a search result the first record matching the search expression is displayed. If you are editing the data base sequentially, the first data base record is displayed.

Displays the previous record. If you are editing a search result the previous record (if any) matching the search expression is displayed.

Displays the next record. If you are editing a search result the next record (if any) matching the search expression is displayed.

Displays the last record. If you are editing a search result the last record matching the search expression is displayed. If you are editing the data base sequentially, the last data base record is displayed.

Creates a new record. The current worksheet is displayed with all its fields empty.

Saves the current record in the Master file, updating the index.

Deletes the current record in the Master file, updating the index.

This toggle switch allows you to show (or remove) empty fields from the screen.

This toggle switch allows you to enable Right To Left data entry and display.

Applies the validation rules, if any.

Cancels all the changes made and restores the record to its initial state.

Creates a new record with the same content as the current one. The created record is assigned the next available MFN.

Clears the contents of all the fields in the worksheet.

Copies the current record to the stack.

Pastes a record from the stack.

B) Advanced Data Entry Window

When selecting "Advanced Data Entry" from the "Edit" menu, we get a data entry form driven by the first worksheet in alphabetical order (CDS1 for the CDS database). A different worksheet can be selected if needed. The data field and subfield areas are empty and the MFN equals 0. This is the data entry New state.

Please note that fields with subfields have a "+" node that is not initially expanded. They can be expanded to see the subfields by clicking on "+":

Data entry can be started by double-clicking on the prompt cell of a subfield, or of a field without subfields.

Pressing the Enter key saves the current data and moves the cursor to the next editable data element.

The Up and Down keys move the cursor to the previous or next editable data element, respectively.

The Pick button is enabled if a pick list is provided for a field.

Dark pink cells cannot be edited.

Clicking on the pencil opens a dialog with an editor.

Add/Delete/Clear field occurrences

Right-clicking on a selected repeatable field occurrence opens the following context menu:

Clicking on Add Occurrence will add an occurrence:

Pick List Example

Copy Record Content From One Database To Another

You can open two databases at the same time and start the Advanced Data Entry module on both. You can then copy the content of a record in one database and, after clicking on New in the other database, paste it there. The new record must be saved if you are satisfied with the content.

24. Sorting and Printing

Sorting and printing are done on an open database. Printing output is always directed to a disk file.

Printing can be done without sorting the database records, i.e. the records will be printed in master file number (MFN) order.

This command allows you to print all the records or a selected range of records. You may sort the records by virtually any combination of fields and subfields.

The GUI offers two tabs called "Print" and "Sort", respectively.

24.1 Quickly Printing All/Or a Selected Range of Records

1. Specify which records you want to print. You may print the whole database or a specific range of records. You can enter a list of MFNs and/or MFN ranges separated by commas. For example: 1,10,100-150, 50.

2. Select the print format, which defines which fields must be printed and how they should be formatted.

3. Give a name to the output file and select the directory where it should be stored. HTML is for the moment the only output format supported, but you can send the output to a PDF file when printing if you have installed a PDF driver such as PdfCreator: http://sourceforge.net/projects/pdfcreator/

4. Click on the "Print" button.
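
The record selection syntax of step 1 can be illustrated with a short Java sketch. It is only an illustration, not the code used by J-ISIS: the class name MfnSelection and the method expand are hypothetical, and the sketch assumes that MFNs are kept in the order in which they are typed and that ranges are inclusive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of J-ISIS: expands an MFN selection string. */
final class MfnSelection {

    /** Expands a list such as "1,10,100-150, 50" into individual MFNs, in the order written. */
    static List<Long> expand(String selection) {
        List<Long> mfns = new ArrayList<>();
        for (String token : selection.split(",")) {
            String item = token.trim();
            if (item.isEmpty()) {
                continue; // tolerate stray commas
            }
            int dash = item.indexOf('-');
            if (dash < 0) {
                // A single MFN
                mfns.add(Long.parseLong(item));
            } else {
                // An inclusive range "first-last" (assumption: ascending only)
                long first = Long.parseLong(item.substring(0, dash).trim());
                long last = Long.parseLong(item.substring(dash + 1).trim());
                if (first > last) {
                    throw new IllegalArgumentException("Descending range: " + item);
                }
                for (long mfn = first; mfn <= last; mfn++) {
                    mfns.add(mfn);
                }
            }
        }
        return mfns;
    }

    public static void main(String[] args) {
        List<Long> mfns = expand("1,10,100-150, 50");
        System.out.println(mfns.size() + " records selected, first = " + mfns.get(0));
    }
}
```

Under these assumptions, the example 1,10,100-150, 50 selects 54 records: MFN 1, MFN 10, the 51 records from 100 to 150, and MFN 50.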

Please note the "SORTING" radio buttons, where you choose between "Don't Use Sorting", "Use the selected Hit Sort File for driving the output" and "Sort the records according to keys defined in the Sort Tab".

J-ISIS displays the following message once printing on a disk file is done.

Open the output file in your favorite browser:

24.2 Sorting the Records before Printing

You will need to click on the "Sort" tab to specify the sorting parameters. As a first example, we will sort the ASFAEX records on field 543 "Date of publication":

FSTs are discussed in detail in the FST chapter. You may either supply the name of a pre-defined FST or enter one directly. To use a pre-defined FST, enter its name preceded by an at sign (@). The @ sign tells J-ISIS that this is a name rather than an actual FST. To provide an actual FST, you must enter its three components separated by spaces, in the following order: field identifier, indexing technique, and format. If you need to enter a multi-line FST, separate each line with a + sign surrounded by spaces. Here are two sample entries: the first one instructs J-ISIS to use the pre-defined FST called AUTHOR; the second instructs the system to create a sort key from field 10 and a sort key from each descriptor in field 20.

@AUTHOR

1 0 V10 + 1 2 V20
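
How such an entry breaks down into its components can be sketched in Java. This is a minimal illustration under stated assumptions, not the J-ISIS parser: the class InlineFst and its Entry type are hypothetical names, and the sketch assumes that a format never contains a + sign surrounded by spaces.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of J-ISIS: splits an inline FST into its entries. */
final class InlineFst {

    /** One FST line: field identifier, indexing technique, and format. */
    static final class Entry {
        final int id;
        final int technique;
        final String format;

        Entry(int id, int technique, String format) {
            this.id = id;
            this.technique = technique;
            this.format = format;
        }
    }

    /** Parses an inline FST such as "1 0 V10 + 1 2 V20" into its entries. */
    static List<Entry> parse(String fst) {
        List<Entry> entries = new ArrayList<>();
        // Lines are separated by a plus sign surrounded by spaces.
        for (String line : fst.trim().split("\\s+\\+\\s+")) {
            // Split into at most three parts so that the format may itself contain spaces.
            String[] parts = line.trim().split("\\s+", 3);
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected 'id technique format': " + line);
            }
            entries.add(new Entry(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), parts[2]));
        }
        return entries;
    }

    public static void main(String[] args) {
        for (Entry e : parse("1 0 V10 + 1 2 V20")) {
            System.out.println(e.id + " | " + e.technique + " | " + e.format);
        }
    }
}
```

A name preceded by @, such as @AUTHOR, is not handled by this sketch, since it refers to a stored FST rather than to an inline definition.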

Thus, we define one sorting key by checking the check box of the first sorting key, we keep the default length of 15 characters, and we provide the FST entry: "543 0 V543"

Coming back to the Sort tab, we:

1) Check the "Sort the records according to keys defined in the Sort Tab" radio button

2) Give a name to the output file ("SortOnDate")

3) Select the SortOnDate PFT:

4) Click on the "Print" button

25. Multilingual UNICODE Databases

J-ISIS is fully UNICODE for text storage and indexing. If you are unable to read some Unicode characters in your browser, it may be because your system is not properly configured. Here are some basic instructions for configuring it. There are two basic steps:

• Install fonts that cover the characters you need
• Configure J-ISIS to use them.

25.1 Windows

For Windows XP, additional languages are installed as follows: Start > Settings > Control Panel > Regional and Language Options. In the Languages tab, check the Supplemental language support option(s) you want. Setting both options will install all optional fonts. This adds fonts as well as system support for these languages.

25.2 Full fonts

If you have Microsoft Office 2000 or a newer version, you can get the Arial Unicode MS font, which is the most complete. To get it, insert the Office CD and do a custom install. Choose Add or Remove Features. Click the (+) next to Office Tools, then International Support, then the Universal Font icon, and choose the installation option you want.

25.3 Configuring a J-ISIS database to use a special font.

1) Select the database:

2) Select the font for the database

Arial Unicode MS is the best choice as it allows mixing languages, alphabets and scripts:

26. Client Z39.50

The Z39.50 client works with MARC21 and UNIMARC. You can access Z39.50 servers with a user ID and password, and run parallel searches on multiple servers. Records are converted to UNICODE. You can select the records that you wish to export from the retrieved records, and export them to ISO 2709, XML, MarcXML, Text and J-ISIS DB.

The J-ISIS distribution provides templates for Marc21 and Unimarc, i.e. FDT and FST (plus worksheets in the future) for each format, so that it is very easy to create a new database based on these formats and to get bibliographic records with Z39.50.

The Export button exports the selected lines to ISO 2709, XML, MarcXML, Text and J-ISIS DB.

27 Authentication And Authorization in J-ISIS

27.1 Introduction

When securing systems, two elements of security are important: authentication and authorization. Though the two terms mean different things, they are sometimes used interchangeably because of their respective roles in application security.

Authentication deals with verifying a user's identity. When you authenticate users, you confirm that they really are who they claim to be. In most applications, authentication is done through a combination of a user name and password. As long as users choose passwords that are sufficiently difficult for others to guess, the combination of a user name and password is usually enough to establish identity. However, other means of authentication, such as fingerprints, certificates, and generated keys, are also available.

Once the authentication process successfully establishes identity, authorization takes over to restrict or grant access. It is possible that through authentication, a user can log in to a system but, through authorization, not be allowed to do anything. It is also possible to have a certain level of authorization for users that have not been authenticated.

J-ISIS uses the Apache Shiro open-source security framework to handle authentication and authorization. Apache Shiro is a powerful and flexible security framework that cleanly handles authentication, authorization, enterprise session management, single sign-on and cryptography services. Shiro is able to read authorization data from Active Directory, LDAP, INI files, properties files and databases.

This first implementation of J-ISIS authentication and authorization configures Shiro by reading data from an INI file called "shiro.ini" located in the /conf subfolder of the jisis_suite folder. Future implementations will use a J-ISIS database to store authentication and authorization data.

27.2 INI File Sections Understood By Shiro:

# ======
# Shiro INI configuration
# ======

[main]
# Objects and their properties are defined here,
# Such as the securityManager, Realms and anything
# else needed to build the SecurityManager

[users]
# The 'users' section is for simple deployments
# when you only need a small number of statically-defined
# set of User accounts.

[roles]
# The 'roles' section is for simple deployments
# when you only need a small number of statically-defined
# roles.

[urls]
# The 'urls' section is used for url-based security
# in web applications. We'll discuss this section in the
# Web documentation

A) [main] Section

B) [users] Section

The [users] section allows you to define a static set of user accounts. Here's an example:

[users]
admin = secret
lonestarr = vespa, goodguy, schwartz
darkhelmet = ludicrousspeed, badguy, schwartz

Line Format

Each line in the [users] section must conform to the following format:

username = password, roleName1, roleName2, ..., roleNameN

• The value on the left of the equals sign is the username
• The first value on the right of the equals sign is the user's password. A password is required.
• Any comma-delimited values after the password are the names of roles assigned to that user. Role names are optional.
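
The line format above can be read with a few lines of Java. The sketch below is not Shiro's own INI parser; UserLine is a hypothetical class that only illustrates how the user name, the password and the optional role names are separated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: reads one line of the [users] section; this is not Shiro's parser. */
final class UserLine {
    final String username;
    final String password;
    final List<String> roles;

    private UserLine(String username, String password, List<String> roles) {
        this.username = username;
        this.password = password;
        this.roles = roles;
    }

    /** Parses "username = password, roleName1, ..., roleNameN". */
    static UserLine parse(String line) {
        int eq = line.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Missing '=': " + line);
        }
        String username = line.substring(0, eq).trim();
        String[] values = line.substring(eq + 1).split(",");
        // The first value after the equals sign is the password and is required.
        String password = values[0].trim();
        if (username.isEmpty() || password.isEmpty()) {
            throw new IllegalArgumentException("User name and password are required: " + line);
        }
        // Any remaining comma-delimited values are role names (optional).
        List<String> roles = new ArrayList<>();
        for (int i = 1; i < values.length; i++) {
            String role = values[i].trim();
            if (!role.isEmpty()) {
                roles.add(role);
            }
        }
        return new UserLine(username, password, Collections.unmodifiableList(roles));
    }

    public static void main(String[] args) {
        UserLine user = parse("lonestarr = vespa, goodguy, schwartz");
        System.out.println(user.username + " has roles " + user.roles);
    }
}
```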

C) [roles] Section

The [roles] section allows you to associate Permissions with the roles defined in the [users] section. Again, this is useful in environments with a small number of roles or where roles don't need to be created dynamically at runtime. Here's an example:

[roles]
# 'admin' role has all permissions, indicated by the wildcard '*'
admin = *
# The 'schwartz' role can do anything (*) with any lightsaber:
schwartz = lightsaber:*
# The 'goodguy' role is allowed to 'drive' (action) the winnebago (type) with
# license plate 'eagle5' (instance specific id)
goodguy = winnebago:drive:eagle5

Line Format

Each line in the [roles] section must define a role-to-permission(s) key/value mapping within the following format:

rolename = permissionDefinition1, permissionDefinition2, ..., permissionDefinitionN

where permissionDefinition is an arbitrary String, but most people will want to use strings that conform to the org.apache.shiro.authz.permission.WildcardPermission format for ease of use and flexibility. See the Permissions documentation for more information on Permissions and how you can benefit from them.

[urls]

This section and its options are useful for Web Applications and will be described in the Web-JISIS Authentication and Authorization chapter.

27.3 Shiro INI File Used By J-ISIS

Authorization has three core elements: permissions, roles, and users. Permissions represent explicitly what can be done in J-ISIS application. A Role is a named entity that typically represents a set of permissions. Roles are typically assigned to user accounts, so by association, users can 'do' the things attributed to various roles. Users are allowed to perform certain actions in your application through their association with roles or direct permissions.

The [roles] section allows you to associate Permissions with the roles defined in the [users] section.

Line Format

Each line in the [roles] section must define a role-to-permission(s) key/value mapping within the following format:

rolename = permissionDefinition1

J-ISIS recognizes permission definitions (permissionDefinition) written as strings in the following format:

database:action_group:database(s)

"database" is the domain name. It is followed by the colon (:) separator and an action group name that defines the list of permitted actions, then by another colon (:) separator and the list of databases to which the action permissions apply. (Note that the colon is a special character used to delimit the next part in the permission string.)

As a first attempt, the J-ISIS Shiro INI file recognizes three roles that define, respectively, permissions for all actions, for the 'oper' action group, and for the 'guest' action group:

[roles]
ROLE_ADMIN = database:*:*
ROLE_OPER = database:oper:*
ROLE_GUEST = database:guest:*

They are defined as follows:

Role         Permissions        User

ROLE_ADMIN   database:*:*       User can do any action on any database.

ROLE_OPER    database:oper:*    User is authorized to: i) Create/Read/Update/Delete records; ii) manage worksheets and PFTs on any database, but is not authorized to refactor database(s), re-index or run a Groovy script.

ROLE_GUEST   database:guest:*   User is only authorized to perform read-only actions such as browsing database(s), searching, or Print/Sort on any database.

The last wildcard character "*" grants a user access to any database hosted on the server machine, but we could also add new roles for specific databases, as in the following example.

[users]
guest = guest, ROLE_GUEST
admin = admin, ROLE_ADMIN
cds-admin = cds-admin, ROLE_ADMIN_CDS
amj-admin = amj-admin, ROLE_ADMIN_AMJ

[roles]
ROLE_ADMIN = database:*:*
ROLE_OPER = database:oper:*
ROLE_GUEST = database:guest:*
ROLE_ADMIN_CDS = database:*:CDS
ROLE_ADMIN_AMJ = database:*:AMJ_BOOKS,database:*:AMJ_LOAN,database:*:AMJ_Member
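
To see how such permission strings grant or deny access, the following Java sketch compares a granted permission with a requested one, part by part. It is a simplified illustration and not Shiro's WildcardPermission class: DbPermission is a hypothetical name, the comparison is case-sensitive, and the form of the requested string (database:action_group:database name) is an assumption about how a check could be expressed.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Simplified illustration of colon-delimited wildcard matching.
 * Hypothetical class; it is NOT Shiro's WildcardPermission.
 */
final class DbPermission {

    /** Returns true if the granted permission string covers the requested one. */
    static boolean implies(String granted, String requested) {
        String[] g = granted.split(":");
        String[] r = requested.split(":");
        for (int i = 0; i < g.length; i++) {
            Set<String> grantedTokens = tokens(g[i]);
            if (i >= r.length) {
                // The granted permission has more parts than the request:
                // every extra part must be a wildcard.
                if (!grantedTokens.contains("*")) {
                    return false;
                }
            } else if (!grantedTokens.contains("*") && !grantedTokens.containsAll(tokens(r[i]))) {
                return false;
            }
        }
        // Parts missing from the granted permission are treated as wildcards.
        return true;
    }

    /** Splits one part into its comma-separated tokens. */
    private static Set<String> tokens(String part) {
        Set<String> result = new HashSet<>();
        for (String token : part.split(",")) {
            result.add(token.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(implies("database:*:CDS", "database:oper:CDS"));
        System.out.println(implies("database:*:CDS", "database:oper:AMJ_BOOKS"));
    }
}
```

Under these assumptions, ROLE_ADMIN_CDS (database:*:CDS) covers a request such as database:oper:CDS but not database:oper:AMJ_BOOKS.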

Implicit Permissions (read-only actions)

• View records within a specific logical database.
• Search and Display records within a specific database (Search->Open Search Window)

This includes access to the following J-ISIS menu actions:

Database
• Open Connection
• Open Database
• MRU
• Application Display Font Selection
• Print Sort
• Close Database
• Close All Databases

Browse
• Data Viewer
• DB Browser
• Open Dictionary Window
• Search History

Edit
• Field Definition Table (read only)
• Field Selection Table (read only)

Search
• Open Search Window

View
• IDE log

27.4 Default J-ISIS 'Shiro.ini' File

[main]
#cm = org.apache.shiro.authc.credential.HashedCredentialsMatcher
#cm.hashAlgorithm = SHA-512
#cm.hashIterations = 1024
# Base64 encoding (less text):
#cm.storedCredentialsHexEncoded = false

#iniRealm.credentialsMatcher = $cm

[users]
jdoe = jdoe, ROLE_ADMIN
guest = guest, ROLE_GUEST
asmith = asmith, ROLE_GUEST
admin = admin, ROLE_ADMIN
cds-admin = cds-admin, ROLE_ADMIN_CDS
cds-oper = cds-oper, ROLE_OPER_CDS
cds-guest = cds-guest, ROLE_GUEST_CDS
amj-admin = amj-admin, ROLE_ADMIN_AMJ
oper = oper, ROLE_OPER

[roles]
ROLE_ADMIN = database:*:*
ROLE_GUEST = database:guest:*
ROLE_OPER = database:oper
ROLE_ADMIN_CDS = database:*:CDS
ROLE_OPER_CDS = database:oper:CDS
ROLE_GUEST_CDS = database:guest:CDS
ROLE_ADMIN_AMJ = database:*:AMJ_BOOKS,database:*:AMJ_LOAN,database:*:AMJ_Member

27.5 Some examples

Connection to the CDS database as a user with role "oper" (for operator). The user name is defined as 'cds-oper' and the password as 'cds-oper'. They are defined in the 'shiro.ini' file and can be changed.

The 'cds-oper' user is granted the actions defined by the ROLE_OPER_CDS role, which is written as 'database:oper:CDS'. 'database' is the domain, 'oper' represents a group of granted actions, and 'CDS' specifies that this group of actions is granted only for the 'CDS' database. Thus, when clicking on the 'Open Database...' menu item, the user will see only the CDS database.

Clicking on the 'New Database...' menu item, or on any menu item that runs actions which are not granted, will pop up a dialog.

Connection to the AMJ_BOOKS, AMJ_LOAN and AMJ_Member databases as a user with role "admin" (for Administrator). If the connection was already established for a user 'cds-oper' with ROLE_OPER_CDS permissions as in the previous example, we first need to close the current connection. This can be done by moving the mouse cursor above the 'Connections Pool' tab on the left to display the 'Connection Pool' window. After selecting the connection and clicking on the left mouse button, a context menu is displayed. Clicking on 'Close Connection' will close all open databases as well as the connection.

We can now open a new connection with user name and password 'amj-admin' by clicking on the menu item 'Database'->'Open Connection'.

We can now open the three databases and display the LOAN database records with the Loans PFT

Annex 1

Installing Java SE Development Kit 7u45

Please note that it is important to remove any previously installed JDK 1.7 uXX before installing the latest JDK. See next section “Checking Java Runtime Environment Settings” for checking which JDK is installed.

1.1 Downloading JDK 1.7 u45

You can download the latest JDK 1.7 from Oracle:

http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

Accept the license agreement and download the appropriate JDK for your operating system:

The Windows 32-bit version of JDK 1.7 u45 is named jdk-7u45-windows-i586.exe (Windows XP or Windows 7 32-bit) and the 64-bit version is named jdk-7u45-windows-x64.exe. NOTE: new versions or updates may be available. If you download a new version or an update, the file name may be slightly different from jdk-7u45-windows-i586.exe or jdk-7u45-windows-x64.exe.

1.2 Installing JDK 1.7u45 on Windows

Follow the steps below to install JDK 1.7:

1. Double click jdk-7u45-windows-i586.exe or jdk-7u45-windows-x64.exe to run the installation program. You will see the JDK License dialog displayed.

2. Click Accept to display the JDK Custom Setup dialog.

3. You may install the JDK in a custom directory. For simplicity, don't change the directory. Click Next to install the JDK. After a while, the JRE Custom Setup dialog is displayed.

4. You may install the JRE in a custom directory. For simplicity, don't change the directory. Click Next to install the JRE.

5. When the installation has completed, the completion dialog is displayed. Click Finish to close the dialog.

Checking Java Runtime Environment Settings

Windows XP

• Click on the Start button and then click on the Control Panel option.
• Double click on the Java icon to open the Java Control Panel.

Click on the Java Panel

Click on the View button

Checking Java Runtime Environment Settings

These settings will be used when a Java application is launched. The Java Runtime Environment Settings dialog looks like the following on Windows:

Each row in the Java Runtime Versions panel represents a Java Runtime Environment that is installed in your computer. You may modify the value in each cell by double-clicking it:

• Platform: The version of the Java Runtime Environment
• Product: The full version number of the Java Runtime Environment (which includes the update number)
• Location: The URL that Java Update Scheduler uses to launch automatic updates
• Path: The full path name of the Java Runtime Environment
• Runtime Parameters: Optional custom options used to override the Java Plug-in default startup parameters
• Enabled: If this check box is not selected, then Java Plug-in and Java Web Start will not use this JRE to deploy Java applications. Note that desktop applications already installed in your computer can still use this JRE if this check box is not selected.
• Click the Find button to launch the JRE Finder. This utility searches for unregistered private Java Runtime Environments installed in your computer and adds them to the Java Runtime Versions panel.
• Click the Add button to manually add a Java Runtime Environment to the Java Runtime Versions panel. When you click the Add button, a new row appears in the Java Runtime Versions panel; however, there are no values for Platform, Product, Path, Runtime Parameters, and Enabled; you must specify them yourself.
• Click the Remove button to remove the selected Java Runtime Environment from the Java Runtime Versions panel.

Windows 7, Vista

• Click on the Start button and then click on the Control Panel option.
• In the Control Panel Search enter Java Control Panel.
• Click on the Java icon to open the Java Control Panel.

Windows 8

• Use search to find the Control Panel: press Windows logo key + W to open the Search charm to search settings

OR

• Drag the mouse pointer to the bottom-right corner of the screen, then click on the Search icon.
• In the search box enter Java Control Panel.
• Click on the Java icon to open the Java Control Panel.

Mac OS X 10.7.3 and above

• Click on the Apple icon at the upper left of the screen.
• Go to System Preferences
• Click on the Java icon to access the Java Control Panel.

Annex 2

How to run J-ISIS in Spanish:

You can run J-ISIS in Spanish by editing the jisis_suite.conf file, which is in the /etc/ folder of the J-ISIS installation, and changing the command line switches:

# command line switches

default_options="--branding jisis_suite -J-Xms64m -J-Xmx256m -J-Dnetbeans.logger.console=true -J-Dnetbeans.slow.system.clipboard.hack=false -J-ea -J-Duser.language=en -J-Duser.country=EN"

Change the language and country codes from English to Spanish as follows:

default_options="--branding jisis_suite -J-Xms64m -J-Xmx256m -J-Dnetbeans.logger.console=true -J-Dnetbeans.slow.system.clipboard.hack=false -J-ea -J-Duser.language=es -J-Duser.country=ES"

Save the jisis_suite.conf file and restart J-ISIS.
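
The same edit can be scripted. The Java sketch below rewrites the two locale switches inside a default_options value; LocaleSwitches and withLocale are hypothetical names that are not part of J-ISIS, and the sketch assumes that both switches are already present in the value, as in the default file.

```java
/** Hypothetical helper, not part of J-ISIS: rewrites the locale switches of default_options. */
final class LocaleSwitches {

    /** Returns the options with user.language and user.country set to the given codes. */
    static String withLocale(String options, String language, String country) {
        // Each switch is replaced in place; all other switches are left untouched.
        return options
                .replaceAll("-J-Duser\\.language=[A-Za-z]+", "-J-Duser.language=" + language)
                .replaceAll("-J-Duser\\.country=[A-Za-z]+", "-J-Duser.country=" + country);
    }

    public static void main(String[] args) {
        String english = "--branding jisis_suite -J-Xms64m -J-Xmx256m -J-ea "
                + "-J-Duser.language=en -J-Duser.country=EN";
        System.out.println(withLocale(english, "es", "ES"));
    }
}
```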

Annex 3

How to use the J-ISIS core library in Groovy scripts or other Web Applications

1 J-ISIS Core Library Application Programming Interface (API)

J-ISIS (like CDS/ISIS) is not a relational database system. Records are variable-length and are identified by a unique ID called the master file number (mfn). Records are made of variable-length fields identified by a tag. Fields can be repetitive and have several occurrences; a non-repetitive field has a single occurrence. An occurrence can contain several subfields. The classes and objects that come naturally from the CDS/ISIS application are connections, database server, databases, records, fields, occurrences, subfields, field definition table, indexes, field selection table, queries, etc.

The Java programming language provides a mechanism for defining a type that permits multiple implementations: interfaces. Interfaces cleanly separate the API from the implementation. By convention, in J-ISIS, interface names begin with the letter "I".

Interfaces provide the method signatures that class implementations must provide.

The figure summarizes the J-ISIS main interfaces:

IConnection (ID = hostName, port)
    1 : m
IDatabase (ID = dbHome, dbName)
    1 : m
IRecord (ID = mfn)
    1 : m
IField (ID = tag)
    1 : m
IOccurrence (ID = sequence index)
    1 : m
ISubfield (ID = subfield delimiter)

For each J-ISIS database server connection identified by a host name and a port (1111), we may have m databases.

For each database identified by a root folder dbHome and a database name dbName, we may have m records.

For each record identified by the master file number mfn, we may have m fields.

For each field identified by the field tag, we may have m occurrences.

For each occurrence identified by its sequential number, we may have m subfields.

For each subfield identified by its subfield delimiter, we have data.
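
The one-to-many relationships described above can be pictured with plain Java collections. The following toy model is only an illustration of the containment hierarchy from record down to subfield; ToyRecord is a hypothetical class and is not one of the J-ISIS interfaces or classes, whose methods are not reproduced here. The sample data is made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the containment hierarchy; this is NOT a J-ISIS class. */
final class ToyRecord {
    final long mfn;
    // tag -> occurrences; each occurrence maps a subfield delimiter to its data
    private final Map<Integer, List<Map<Character, String>>> fields = new LinkedHashMap<>();

    ToyRecord(long mfn) {
        this.mfn = mfn;
    }

    /** Adds one occurrence to the field with the given tag and returns it for filling. */
    Map<Character, String> addOccurrence(int tag) {
        Map<Character, String> occurrence = new LinkedHashMap<>();
        fields.computeIfAbsent(tag, t -> new ArrayList<>()).add(occurrence);
        return occurrence;
    }

    /** Number of occurrences of a field; 0 if the record has no such field. */
    int occurrenceCount(int tag) {
        List<Map<Character, String>> occurrences = fields.get(tag);
        return occurrences == null ? 0 : occurrences.size();
    }

    /** Data of one subfield, or null if the field, occurrence or subfield is absent. */
    String subfield(int tag, int occurrenceIndex, char delimiter) {
        List<Map<Character, String>> occurrences = fields.get(tag);
        if (occurrences == null || occurrenceIndex < 0 || occurrenceIndex >= occurrences.size()) {
            return null;
        }
        return occurrences.get(occurrenceIndex).get(delimiter);
    }

    public static void main(String[] args) {
        ToyRecord rec = new ToyRecord(1);
        // Made-up sample data: a repeatable field 12 with two occurrences
        rec.addOccurrence(12).put('a', "Paris");
        rec.addOccurrence(12).put('a', "Lyon");
        System.out.println("MFN " + rec.mfn + ", field 12 occurrences: " + rec.occurrenceCount(12));
    }
}
```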

The figure summarizes the classes that implement the J-ISIS main interfaces on the client side:

ConnectionNIO (params = hostName, port)
    1 : m
ClientDbProxy (params = dbHome, dbName)
    1 : m
Record (params = mfn)
    1 : m
Field (params = tag)
    1 : m
StringOccurrence (params = sequence index)
    1 : m
Subfield (params = subfield code)

The Java Packages which are associated with these classes are the following:

import org.unesco.jisis.corelib.common.IConnection
import org.unesco.jisis.corelib.client.ConnectionNIO
import org.unesco.jisis.corelib.client.ClientDbProxy
import org.unesco.jisis.corelib.common.IDatabase
import org.unesco.jisis.corelib.record.IRecord
import org.unesco.jisis.corelib.record.IField
import org.unesco.jisis.corelib.record.StringOccurrence
import org.unesco.jisis.corelib.record.Subfield

2 Code Snippets:

Establishing a connection, opening a database and browsing the database record by record:

import org.unesco.jisis.corelib.common.IConnection
import org.unesco.jisis.corelib.client.ConnectionNIO
import org.unesco.jisis.corelib.client.ClientDbProxy
import org.unesco.jisis.corelib.common.IDatabase
import org.unesco.jisis.corelib.record.IRecord
import org.unesco.jisis.corelib.record.IField
import org.unesco.jisis.corelib.record.StringOccurrence
import org.unesco.jisis.corelib.record.Subfield

def snippet1() {
    // Initialize the server parameters
    def username = "admin"
    def password = "admin"
    def port = "1111"
    def hostname = "localhost"
    // Establish a connection to the server
    def connection = ConnectionNIO.connect(hostname, Integer.valueOf(port), username, password)
    // Create a Database object bound to this server
    ClientDbProxy db = new ClientDbProxy(connection)
    // Let's use DB ASFAEX on root defined by DEF_HOME
    def dbHome = "DEF_HOME"
    def dbName = "ASFAEX"
    // Open the database
    db.getDatabase(dbHome, dbName)
    // Get first record
    IRecord rec = db.getFirst()
    // Iterate over the records in the database until there are no more
    while (rec != null) {
        // Process the record
        println '\n======Record MFN: ' + rec.getMfn() + ' ======'
        IField field = rec.getField(700)
        // Skip records that have no field 700
        if (field != null) {
            println field.getFieldValue()
        }
        // …..
        // Get the next sequential record in the mfn order
        rec = db.getNext()
    }

    // Close the database
    db.close()
}

snippet1()

The snippet1 Groovy script is provided in the /work directory of the J-ISIS installation, so you can load the file and execute it. It prints the content of field 700 for every record.


Note: By reusing the Groovy code above, you can change field data and update records with the IDatabase updateRecord(Record record) method. However, it is strongly recommended to back up the database before making such modifications. Backing up and restoring a database is straightforward. To back up, copy the database folder with all its subdirectories to another location or into a zip file. To restore, delete the database folder to be replaced and copy the backed-up folder, with all its subdirectories, back in its place.
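The backup step described in the note can be scripted. The following Java code is only a minimal sketch of a recursive folder copy using the standard java.nio.file API; it is not part of J-ISIS. The class name DatabaseBackup and the method name copyTree are hypothetical, and the two folder paths are whatever you pass on the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical helper (not part of J-ISIS): copies a database folder
// with all its subdirectories to a backup location.
public class DatabaseBackup {

    // Recursively copies sourceDir and everything below it into targetDir.
    public static void copyTree(Path sourceDir, Path targetDir) throws IOException {
        try (Stream<Path> paths = Files.walk(sourceDir)) {
            // Files.walk visits a directory before its entries,
            // so each target directory exists before files are copied into it.
            for (Path source : (Iterable<Path>) paths::iterator) {
                Path target = targetDir.resolve(sourceDir.relativize(source));
                if (Files.isDirectory(source)) {
                    Files.createDirectories(target);
                } else {
                    Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java DatabaseBackup <database-folder> <backup-folder>");
            return;
        }
        copyTree(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

To restore, the same method can be called with the two arguments swapped, after the database folder to be replaced has been deleted.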


Exploring the record data

// Get the number of fields in the record
int nfields = rec.getFieldCount();

// Iterate over all fields
for (int i=0; i<nfields; i++) {
    // Get the i-th field
    IField field = rec.getFieldByIndex(i);

    // Get the number of occurrences
    int nocc = field.getOccurrenceCount();

    if (nocc>0) {
        // Iterate over the occurrences
        for (int j=0; j<nocc; j++) {
            // Process occurrence j of the field
        }
    }
}

Processing Specific Fields – Example 1

// Get the Monographic Level Authors (tag 200)
field = rec.getField(200);
// Process the field if it exists
if (field != null) {
    // Get the number of occurrences
    int nocc = field.getOccurrenceCount();

    if (nocc>0) {
        // Output a title if we have occurrences
        chapter.add (new Paragraph ("Monographic Level Authors:", h1Font));
        // Build a list from the occurrences
        List list = new List (false, 30);
        for (int i=0; i<nocc; i++) {
            list.add (new ListItem (field.getStringOccurrence(i)));
        }
        chapter.add (list);
    }
}


Processing Specific Fields – Example 2

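// Get the Corporate Authors (tag 210)
field = rec.getField(210);
if (field != null) {
    // A field has at least one occurrence
    int nocc = field.getOccurrenceCount();
    if (nocc>0) {
        // Output a title if we have occurrences
        chapter.add (new Paragraph ("Corporate Authors:", h1Font));

        // Build a list from the subfields in the occurrences
        List list = new List (false, 30);
        // Iterate over the occurrences
        for (int i=0; i<nocc; i++) {
            StringOccurrence occ = field.getOccurrence(i);
            Subfield[] subfields = occ.getSubfields();
            // Iterate over the subfields of this occurrence
            for (int j=0; j<subfields.length; j++) {
                list.add (new ListItem (subfields[j].getData()));
            }
        }
        chapter.add (list);
    }
}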

3 The API

public interface IRecord extends Serializable {

// Get the type of record

public int getRecordType();

// Get field with tag “tag”

public IField getField(int tag) throws DbException;

// Get the field with index “index”

public IField getFieldByIndex(int index) throws DbException;

// Get the number of fields

public int getFieldCount() throws DbException;


// Get MFN

public long getMfn();

// Get a List of fields

public List getFields() throws DbException;

// Set the MFN

public void setMfn(long mfn);

// Get an html representation

public String toHtml();

// Get a serialized value

public byte[] toBytes() throws IOException;

}

public interface IField extends Serializable {

// Get field tag

public int getTag();

// Get field type

public int getType();

// True if the field has occurrences, false otherwise

public boolean hasOccurrences();

// True if the field has subfields, false otherwise

public boolean hasSubfields();

public Object getFieldValue();

// Get the overall field value; occurrences are separated by %
// and subfields by their respective delimiters

public String getStringFieldValue();

// Get the occurrence value as an Object
// for occurrence number occur, in the range [0, nOccu-1]

public Object getOccurrenceValue(int occur);


// Get the occurrence value as an IOccurrence object
// for occurrence number occur, in the range [0, nOccu-1]

public IOccurrence getOccurrence(int occur);

// Get the occurrence value as a string (the value should be of type String)
// for occurrence number occur, in the range [0, nOccu-1]

public String getStringOccurrence(int occur);

// Get the subfield with delimiter "subfield" as a string
// for occurrence number occur

public String getSubfield(int occur, String subfield);

// Get the number of occurrences

public int getOccurrenceCount();

public void setFieldValue(Object value) throws DbException;

public void setOccurrence(int occur, Object value) throws DbException;

public void removeOccurrence(int occur) throws DbException;

public void setType(int type);

public byte[] toBytesEx() throws IOException;

public int fromBytes(byte[] buf, int pos);

}
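The comment on getStringFieldValue() says that occurrences are separated by % and subfields by their delimiters. The Java code below is a minimal sketch of how such a string could be split by hand. It assumes the usual CDS/ISIS convention of a ^ character followed by a one-character subfield code; that delimiter, the class name FieldValueSketch, and both method names are assumptions made for illustration and are not part of the J-ISIS API. Inside J-ISIS, getOccurrence and getSubfield already give access to the same parts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of J-ISIS): splits an overall field value
// such as "^aFAO^bRome%^aUNESCO^bParis" into occurrences and subfields.
public class FieldValueSketch {

    // Splits the overall field value into occurrences, which are separated by '%'.
    public static List<String> occurrences(String fieldValue) {
        List<String> result = new ArrayList<>();
        for (String occurrence : fieldValue.split("%")) {
            if (!occurrence.isEmpty()) {
                result.add(occurrence);
            }
        }
        return result;
    }

    // Splits one occurrence into subfields, assuming each subfield starts
    // with '^' followed by a one-character code (an assumption, see above).
    // If a code is repeated, the last value wins in this simple sketch.
    public static Map<Character, String> subfields(String occurrence) {
        Map<Character, String> result = new LinkedHashMap<>();
        String[] parts = occurrence.split("\\^");
        // parts[0] is the text before the first delimiter; it is skipped here.
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].isEmpty()) {
                continue;
            }
            result.put(parts[i].charAt(0), parts[i].substring(1));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String occurrence : occurrences("^aFAO^bRome%^aUNESCO^bParis")) {
            System.out.println(subfields(occurrence));
        }
    }
}
```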

public interface IOccurrence extends Serializable {

// Returns true if this occurrence has subfields

public boolean hasSubfields();

// Returns the occurrence value

public Object getValue();

// Returns the subfield with subfieldTag

public String getSubfield(String subfieldTag);

// Sets the occurrence value


public void setValue(Object value);

// Get a bytes representation

public byte[] toBytesEx() throws IOException;

// Build the occurrence from bytes

public int fromBytes(byte[] buf, int pos);

}

public interface ISubfield extends Serializable {

// Returns the Subfield code that identifies the data element.

public char getSubfieldCode();

// Sets the data element identifier.

public void setSubfieldCode(char code);

// Returns the data element.

public String getData();

// Sets the data element.

public void setData(String data);

// Returns true if the given regular expression matches a subsequence
// of the string

public boolean find(String pattern);

}


public interface IDatabase {

// Open a database

public void getDatabase(String dbHome, String dbName) throws DbException;

// Create a database

public boolean createDatabase(CreateDbParams createDbParam) throws DbException;

// Get the number of records in the database

public long getRecordsCount() throws DbException;

// Get the server connection for this database

public IConnection getConnection();

// Get the home string

public String getDbHome();

// Get the database name

public String getDbName();

// Close database

public boolean close() throws DbException;

/**************************************************

* Management of observers that will be notified

* when a change in the database occurs

**************************************************/

// Add an observer for this database

public void addObserver(Observer newObserver);

// Delete the observer for this database

public void deleteObserver(Observer observer);

/**************************************************
 * Create, read, update and delete (CRUD)
 * The four basic functions of persistent storage
 **************************************************/

// Create a new empty record; the returned IRecord contains
// the allocated MFN

public IRecord addNewRecord() throws DbException;


// Add a record to the database without updating the index

public Record addRecord(Record record) throws Exception;

// Read record with key "mfn"

public IRecord getRecord(long mfn) throws DbException;

// Read record with key "mfn" using the cursor method

public IRecord getRecordCursor(long mfn) throws DbException;

// Update an existing record, or create a new record if mfn=0

public Record updateRecord(Record record) throws Exception;

// Delete record with key "mfn"

boolean deleteRecord(long mfn) throws DbException;

// Get the last mfn allocated

public long getLastMfn() throws DbException;

/************************************
 * Record iteration
 ************************************/

public IRecord getFirst() throws DbException;

public IRecord getLast() throws DbException;

public IRecord getNext() throws DbException;

public IRecord getPrev() throws DbException;

// Get last accessed record (current record)

public IRecord getCurrent() throws DbException;

/***************************************
 * Reading a chunk of records
 ***************************************/

public List getRecordChunck(int from, int to) throws DbException;

public List getRecordChunck(long fromMfn, int nRecords) throws DbException;

public Vector getRecordChunck(long[] mfnChunck) throws DbException;

/*******************************
 * Field Definition Table
 *******************************/

public FieldDefinitionTable getFieldDefinitionTable() throws DbException;

public boolean saveFieldDefinitionTable(FieldDefinitionTable fdt) throws DbException;

/*******************************
 * Field Selection Tables
 *******************************/

public FieldSelectionTable getFieldSelectionTable() throws DbException;

public boolean saveFieldSelectionTable(FieldSelectionTable fst) throws DbException;

public String[] getFstNames() throws Exception;

public boolean saveFst(String name, FieldSelectionTable fst) throws Exception;

public FieldSelectionTable getFst(String name) throws DbException;

public boolean removeFst(String name) throws DbException;

public String getDefaultFstName() throws DbException;

/*******************************
 * Print Formats
 *******************************/

public String getDefaultPrintFormat() throws DbException;

public String getDefaultPrintFormatName() throws DbException;

public String getPrintFormat(String name) throws DbException;

public String getPrintFormatAnsi(String name) throws DbException;

public String[] getPrintFormatNames() throws DbException;

public boolean removePrintFormat(String name) throws DbException;

public void saveDefaultPrintFormat(String format) throws DbException;

public boolean savePrintFormat(String name, String format) throws Exception;


/*******************************
 * Worksheets
 ******************************/

public WorksheetDef getWorksheetDef(String name) throws DbException;

public String[] getWorksheetNames() throws DbException;

public boolean removeWorksheetDef(String worksheetName) throws DbException;

public boolean saveWorksheetDef(WorksheetDef wkDef) throws Exception;

/*******************************
 * Index
 ******************************/

public boolean buildIndex() throws DbException;

public boolean clearIndex() throws DbException;

public boolean reIndex() throws DbException;

public IndexInfo getIndexInfo() throws DbException;

/***************************************
 * Reading a chunk of dictionary terms
 ***************************************/

public List getDictionaryTermsChunck(int from, int to) throws DbException;

public List getDictionaryTermsChunckEx(String from, int n) throws DbException;

public List getSortedDictionaryTermsChunck(int from, int to)throws DbException;

public long getDictionaryTermsCount() throws DbException;

/*******************************
 * Search
 ******************************/

public long[] search(String query) throws DbException;

public long[] searchLucene(String query) throws DbException;

}
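The record iteration methods follow a cursor pattern: getFirst() positions on the first record, getNext() advances, and, as the loop in snippet1 shows, a null result signals the end. The Java code below is a minimal sketch of how such a cursor could be wrapped so that a for-each loop can be used. RecordCursor is a hypothetical stand-in interface that mirrors only those two method names, and inMemory is a toy data source for demonstration; neither is part of J-ISIS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch (not part of J-ISIS): adapts a getFirst/getNext
// cursor that returns null at the end to a standard Iterable.
public class CursorIterationSketch {

    // Stand-in for the two iteration methods of IDatabase used here.
    public interface RecordCursor<T> {
        T getFirst() throws Exception;
        T getNext() throws Exception;
    }

    // Wraps the cursor; each call to iterator() restarts from the first record.
    public static <T> Iterable<T> records(RecordCursor<T> cursor) {
        return () -> new Iterator<T>() {
            // The upcoming record is fetched one step ahead of the caller.
            private T next = fetch(true);

            private T fetch(boolean first) {
                try {
                    return first ? cursor.getFirst() : cursor.getNext();
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return next != null;
            }

            @Override
            public T next() {
                if (next == null) {
                    throw new NoSuchElementException();
                }
                T current = next;
                next = fetch(false);
                return current;
            }
        };
    }

    // Toy in-memory cursor used only to demonstrate the adapter.
    public static RecordCursor<String> inMemory(List<String> items) {
        return new RecordCursor<String>() {
            private int pos;

            public String getFirst() { pos = 0; return at(); }

            public String getNext() { pos++; return at(); }

            private String at() { return pos < items.size() ? items.get(pos) : null; }
        };
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        for (String record : records(inMemory(Arrays.asList("rec1", "rec2", "rec3")))) {
            seen.add(record);
        }
        System.out.println(seen);
    }
}
```

Because this adapter reads one record ahead, the underlying cursor is already positioned on the following record while the current one is being processed. Code that relies on the cursor position during the loop should use the plain while loop instead.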


public interface IConnection {

public void close() throws DbException;

public void echo() throws DbException;

public String[] getDbHomes() throws DbException;

public Vector getDbNames(String dbHome) throws DbException;

public UserInfo getUserInfo();

public String getServer();

public int getPort();

}

4 Writing a Groovy Application to produce a PDF catalogue

Suppose that we want to produce a catalogue of the records in the ASFAEX example database. We will use the J-ISIS core library to extract the records and the iText open source library to format the catalogue.

Generating a document in PDF, RTF or HTML with iText involves the following five steps:

Step 1: Create a Document.

Step 2: Get a DocWriter instance (in this case, a PdfWriter instance)

Step 3: Open the Document.

Step 4: Add content to the Document.

Step 5: Close the Document.

A document is created as follows:

Document doc = new Document(PageSize.A4)

By default, the orientation is Portrait. You can change this to Landscape by invoking the rotate method:

Document doc = new Document(PageSize.A4.rotate())


The Document class describes a document's page size (Letter, Legal, A4, and so on), margins, and other important attributes. It is also a container for a document's chapters, sections, images, paragraphs, and other content.

The Groovy code is provided in the pdfCatalogue.groovy file located in the work directory:

import org.unesco.jisis.corelib.client.ClientDbProxy;
import org.unesco.jisis.corelib.client.ConnectionPool;
import org.unesco.jisis.corelib.common.Global;
import org.unesco.jisis.corelib.common.IDatabase;
import org.unesco.jisis.corelib.record.IRecord;
import org.unesco.jisis.corelib.record.IField;
import org.unesco.jisis.corelib.record.StringOccurrence;
import org.unesco.jisis.corelib.record.Subfield;
import org.unesco.jisis.corelib.client.ConnectionNIO;
import org.unesco.jisis.corelib.common.IConnection;
import java.awt.Color;
import java.io.*;
import com.lowagie.text.*;
import com.lowagie.text.pdf.*;

class pdfCatalogue {

def bf = BaseFont.createFont (BaseFont.HELVETICA,

BaseFont.CP1252,

BaseFont.NOT_EMBEDDED);

// Establish a title font for all record titles.

def titleFont = new Font (Font.HELVETICA, 18, Font.BOLD,


new Color (0, 0, 128));

def h1Font = new Font(Font.HELVETICA, 12, Font.BOLD,

new Color(0, 0, 128));

def process() {

// Create an instance of the Document class

Document doc = new Document ();

PdfWriter writer;

writer = PdfWriter.getInstance (doc,

new FileOutputStream ("asfaex.pdf"));

writer.setViewerPreferences (PdfWriter.PageModeUseOutlines);

doc.open ();

initDocument(doc, writer);

def username = "admin";

def password = "admin";

def port = "1111";

def hostname = "localhost";

// Establish a connection to the server

def connection_ = ConnectionNIO.connect(hostname, Integer.valueOf(port),

username, password);

// Create a Database object bound to this server

ClientDbProxy db_ = new ClientDbProxy(connection_)


// Let's use DB ASFAEX defined on root DEF_HOME

def dbHome = "DEF_HOME";

def dbName = "ASFAEX"

// Open the database

db_.getDatabase(dbHome, dbName)

// Get first record

IRecord rec = db_.getFirst();

// Iterate over the records

while (rec != null) {

// Create a record chapter.

doc.add (recordChapter(rec));

rec = db_.getNext();

}

// Close the database, then the document and its writer

db_.close();

doc.close ();

writer.close();

}


def initDocument(doc, writer) {

// Establish a footer that shows the page number between a pair of dashes.

HeaderFooter footer = new HeaderFooter (new Phrase ("- "), new Phrase (" -"));

footer.setAlignment (Element.ALIGN_CENTER);

doc.setFooter (footer);

// Create the title page.

PdfContentByte cb = writer.getDirectContent ();

cb.rectangle (doc.left (), doc.bottom (), (float)(doc.right () - doc.left ()),

(float)(doc.top ()-doc.bottom ()));

cb.stroke ();

cb.beginText ();

cb.setFontAndSize (bf, 34);

cb.showTextAligned (PdfContentByte.ALIGN_CENTER, "ASFA",

(float)((doc.right ()-doc.left ()) / 2 + doc.leftMargin ()),

(float)((doc.top ()-doc.bottom ()) / 2 + doc.topMargin ()),

0);

cb.setFontAndSize (bf, 12);

cb.showTextAligned (PdfContentByte.ALIGN_CENTER,"The Aquatic Sciences and Fisheries Abstracts (ASFA) Bibliographic Database",

(float)((doc.right ()-doc.left ()) / 2 + doc.leftMargin ()),

(float)((doc.top ()-doc.bottom ()) / 2 + doc.topMargin ()-18),

0);

cb.endText ();

// Create the Introduction chapter.


Paragraph title = new Paragraph ("Introduction", titleFont);

title.setAlignment (Element.ALIGN_CENTER);

title.setSpacingAfter (18.0f);

Chapter chapter = new Chapter (title, 0);

chapter.setNumberDepth (0);

Paragraph p = new Paragraph ("The Aquatic Sciences and Fisheries Abstracts (ASFA) Bibliographic Database is the " +
"principal information product produced through the cooperative efforts of the international " +
"network of ASFA Partners (http://www.fao.org/fishery/asfa/1,1/en) and FAO. " +
"The database contains more than 1 million bibliographic references (or records) to the world's " +
"aquatic science literature accessioned since 1971.");

p.setAlignment (Element.ALIGN_JUSTIFIED);

chapter.add (p);

doc.add (chapter);

}


def recordChapter (rec) {

// Create a record chapter.

Paragraph title = new Paragraph ("Record "+rec.getMfn(), titleFont);

title.setAlignment (Element.ALIGN_CENTER);

title.setSpacingAfter (18.0f);

Chapter chapter = new Chapter (title, 1);

chapter.setNumberDepth (0);

chapter.setBookmarkOpen (false);

chapter.setBookmarkTitle (("Record "+rec.getMfn()));

// Get the English Title (tag 220)

IField field = rec.getField(220);

chapter.add (new Paragraph ("English Title:", h1Font));

Paragraph p = new Paragraph (field.getStringFieldValue());

p.setAlignment (Element.ALIGN_JUSTIFIED);

chapter.add (p);

// Get the Original Title (tag 224)

field = rec.getField(224);

chapter.add (new Paragraph ("Original Title:", h1Font));

p = new Paragraph (field.getStringFieldValue());

p.setAlignment (Element.ALIGN_JUSTIFIED);

chapter.add (p);

// Get the Serial Title (tag 324)

field = rec.getField(324);

chapter.add (new Paragraph ("Serial Title:", h1Font));

p = new Paragraph (field.getStringFieldValue());

p.setAlignment (Element.ALIGN_JUSTIFIED);


chapter.add (p);

// Get the Abstract (tag 700)

field = rec.getField(700);

chapter.add (new Paragraph ("Abstract:", h1Font));

p = new Paragraph (field.getStringFieldValue());

p.setAlignment (Element.ALIGN_JUSTIFIED);

chapter.add (p);

/*

Image image = Image.getInstance ("mercury.gif");

image.setAlignment (Image.ALIGN_MIDDLE);

chapter.add (image);

*/

// Get the Monographic Level Authors (tag 200)

field = rec.getField(200);

if (field != null) {

int nocc = field.getOccurrenceCount();

if (nocc>0) {

chapter.add (new Paragraph ("Monographic Level Authors:", h1Font));

List list = new List (false, 30);

for (int i=0; i<nocc; i++) {

list.add (new ListItem (field.getStringOccurrence(i)));

}

chapter.add (list);

}

}


// Get the Corporate Authors (tag 210)

field = rec.getField(210);

if (field != null) {

// A field has at least one occurrence

int nocc = field.getOccurrenceCount();

if (nocc>0) {

chapter.add (new Paragraph ("Corporate Authors:", h1Font));

List list = new List (false, 30);

for (int i=0; i<nocc; i++) {

StringOccurrence occ = field.getOccurrence(i);

Subfield[] subfields = occ.getSubfields();

for (int j=0; j<subfields.length; j++) {

list.add (new ListItem (subfields[j].getData()));

}

}

chapter.add (list);

}

}

return chapter;

}

}

We create an instance of the pdfCatalogue class and we call the process method:

def catalogue = new pdfCatalogue()

catalogue.process()

Click on the “Execute Groovy Script” toolbar button to execute the pdfCatalogue script.

If you don’t provide a full path, the output file “asfaex.pdf” is stored in the J-ISIS root folder.

