Catherine Harbor

The Birth of the Music Business: Public Commercial Concerts In London 1660–1750
Volume 2

Appendix A. The Register of Music in London Newspapers 1660–1750 Database

A.1 Database Design and Construction

Initial database design decisions were dictated by the overriding concern that the Register of Music in London Newspapers 1660–1750 should be a source-oriented rather than a model-oriented database, with the integrity of the source being preserved as far as possible (Denley, 1994: 33-43; Harvey and Press, 1996). The aim of the project was to store a large volume of data that had no obvious structure and to provide a comprehensive index to it that would serve both as a finding aid and as a database in its own right (Hartland and Harvey, 1989: 47-50). The result was what Harvey and Press (1996: 10) term an ‘electronic edition’ of the texts in the newspapers, together with an index or coding scheme that provided an easy way of retrieving the desired information.

The stored data was divided into a text base, with its physical and locational descriptors, and the index database. The design and specification of the database tables was undertaken by Charles Harvey and Philip Hartland using techniques of entity-relationship modelling and relational data analysis. These techniques are discussed in numerous texts on databases and database design and have been applied to purely historical data (Hartland and Harvey, 1989; Harvey and Press, 1996: 103-130). The Oracle relational database management system was used to create the tables and to enter, store and manipulate the data.

Initially the Register of Music took the form of three linked tables: TITLE, LOCATION and TEXT. The TITLE table stored the full titles of the newspapers and their associated codes; its structure is shown in Table A-1. Any changes in title were recorded and given a new code.

Table A-1: The TITLE Table

| Field Name | Type | Length | Description |
|---|---|---|---|
| NEWSCODE | CHAR | 4 | A unique code identifying the title of the newspaper, normally based on the initial letters of the title words |
| NEWSNAME | CHAR | 180 | The title of the newspaper in full |

The LOCATION table gave sufficient details to locate the musical reference exactly, narrowing down to a particular issue of a particular newspaper and a location within the issue in question. The table structure is shown in Table A-2.

Table A-2: The LOCATION Table

| Field Name | Type | Length | Description |
|---|---|---|---|
| TEXTNUM | NUMBER | 7 | A unique numerical identifier incremented automatically within the data-entry application; used to synchronise the LOCATION and TEXT tables |
| NEWSCODE | CHAR | 4 | A unique code identifying the title of the newspaper, normally based on the initial letters of the title words |
| DAY | NUMBER | 2 | Day of the month of publication |
| MONTH | CHAR | 3 | Month of the year of publication |
| YEAR | NUMBER | 4 | Year of publication |
| PAGENUM | NUMBER | 2 | The page on which the musical reference occurs |

The TEXT table stored the transcription of the musical reference and various labels classifying its content. Text was transcribed in its original spelling but without indication of font changes or original line-length; it was felt that retention of these would serve no useful purpose and would necessitate an extremely complicated interface with a word-processor outside Oracle. The table structure is shown in Table A-3. Originally 38 columns were created to store the texts, but this was later expanded to 52 columns to accommodate longer texts. This was still not sufficient for a few very long texts, but trials with further expansion caused the software to crash when saving records.

Table A-3: The TEXT Table

| Field Name | Type | Length | Description |
|---|---|---|---|
| TEXTNUM | NUMBER | 7 | A unique numerical identifier used to synchronise the LOCATION and TEXT tables |
| NATURE | CHAR | 13 | Classification of the nature of the musical reference as advertisement, news, puff-preview, report, commentary or miscellaneous |
| PROSE | CHAR | 1 | These three fields (PROSE, VERS and CORRS) allowed classification of the content of the musical reference as prose, verse or correspondence by use of a simple Y (yes) or N (no) check; they need not be mutually exclusive |
| VERS | CHAR | 1 | See PROSE |
| CORRS | CHAR | 1 | See PROSE |
| L1, L2, L3… L38 | CHAR | 80 | The actual text of the musical reference |

The relationships between these three tables are shown graphically using an entity-relationship diagram in Figure A-1; Harbor (1996; 2006) describes the entire database in detail. A data-entry application with easy-to-use forms was designed using Oracle’s Interactive Application Facility (IAF);[84] the first screen of this is shown in Figure A-2.

[84] IAF was a character-cell video tool used to generate data-entry forms for early versions of the Oracle relational database management system. Complete information about Oracle version 4, which was used for the earliest version of the Register, can be found in the program documentation (1982–5). Bronzite (1989) provides a more accessible discussion of Oracle version 5, most of which also applies to Oracle version 4.

Figure A-1: Register of Music Entity-Relationship Diagram Version 1

TITLE: # NEWSCODE, * NEWSNAME
LOCATION: # TEXTNUM, * NEWSCODE, * DAY, * MONTH, * YEAR, * PAGENUM
TEXT: # TEXTNUM, * NATURE, * PROSE, * VERSE, * CORRS, * L1, L2, L3...
Relationship labels: TITLE and LOCATION (contains / is in); LOCATION and TEXT (contains / is at)
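
The version 1 structure described in Tables A-1 to A-3 and Figure A-1 can be sketched roughly as follows. This is an illustration only, not the original Oracle definition: it uses Python's built-in sqlite3 module in place of Oracle, so the column types are SQLite approximations; the key and REFERENCES clauses are an assumption about how the links between the tables might be expressed; only three of the L1… text columns are created; and the sample row (newspaper code 'XX', 'Example Newspaper') is invented.

```python
import sqlite3

# SQLite stands in for Oracle; types approximate the CHAR/NUMBER
# definitions given in Tables A-1 to A-3. Keys are an assumption.
SCHEMA = """
CREATE TABLE TITLE (
    NEWSCODE TEXT PRIMARY KEY,      -- CHAR 4
    NEWSNAME TEXT NOT NULL          -- CHAR 180
);
CREATE TABLE LOCATION (
    TEXTNUM  INTEGER PRIMARY KEY,   -- NUMBER 7
    NEWSCODE TEXT NOT NULL REFERENCES TITLE(NEWSCODE),
    DAY      INTEGER,               -- NUMBER 2
    MONTH    TEXT,                  -- CHAR 3
    YEAR     INTEGER,               -- NUMBER 4
    PAGENUM  INTEGER                -- NUMBER 2
);
CREATE TABLE "TEXT" (
    TEXTNUM INTEGER PRIMARY KEY REFERENCES LOCATION(TEXTNUM),
    NATURE  TEXT,                   -- CHAR 13
    PROSE   TEXT,                   -- 'Y' or 'N'
    VERS    TEXT,                   -- 'Y' or 'N'
    CORRS   TEXT,                   -- 'Y' or 'N'
    L1 TEXT, L2 TEXT, L3 TEXT       -- only three of the 80-character fields
);
"""


def build_demo():
    """Create an in-memory database holding one invented record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO TITLE VALUES ('XX', 'Example Newspaper')")
    conn.execute("INSERT INTO LOCATION VALUES (1, 'XX', 1, 'JAN', 1700, 2)")
    conn.execute(
        'INSERT INTO "TEXT" VALUES '
        "(1, 'advertisement', 'Y', 'N', 'N', 'First line of text', NULL, NULL)"
    )
    return conn


def locate(conn, textnum):
    """Join the three tables to recover where a given text was printed."""
    sql = """
        SELECT t.NEWSNAME, l.DAY, l.MONTH, l.YEAR, l.PAGENUM, x.L1
        FROM LOCATION l
        JOIN TITLE t ON t.NEWSCODE = l.NEWSCODE
        JOIN "TEXT" x ON x.TEXTNUM = l.TEXTNUM
        WHERE l.TEXTNUM = ?
    """
    return conn.execute(sql, (textnum,)).fetchone()
```

Calling `locate(build_demo(), 1)` returns the newspaper title, date, page and first line of text for text number 1, which is the lookup that the shared TEXTNUM and NEWSCODE fields make possible.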

Figure A-2: Register of Music Data Entry Screen Version 1

*** RHBNC REGISTER OF MUSIC IN LONDON NEWSPAPERS 1660 - 1800 ***
DATA ENTRY AND CORRECTION
-------------------------------- TEXT LOCATION --------------------------------
NEWSPAPER CODE ____ TEXT NO _______
NEWSPAPER TITLE _______________________________________________________________________________
DATE DAY __ MONTH ___ YEAR ____ PAGE NO __
------------------------- TEXT DESCRIPTION AND CONTENT -------------------------
Text No _______ Nature _____________ Prose _ Verse _ Correspondence _
Text
_______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________
_______________________________________________________________________________

Technical limitations in early versions of the database software necessitated the storing of the text of each musical reference in a large number of 80-character fields (data-type CHAR) within the TEXT table. It could not be stored in a single field of data-type LONG as the Structured Query Language (SQL) used with Oracle and many other relational database management systems had no facilities for searching fields of this type at that time.[85] The selected design meant that searches of the text of the references were possible but were rather cumbersome. An example SQL query follows:

    SELECT * FROM TEXT
    WHERE L1 LIKE 'Handel%'
       OR L2 LIKE 'Handel%'
       OR L3 LIKE 'Handel%'
       OR L4 LIKE 'Handel%' …

It was quickly realised that the multiplicity of detail within the texts required more sophisticated searching techniques than simple string searches.
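
To show why this style of search was cumbersome, the sketch below generates the repeated LIKE condition for every line column instead of writing it out by hand. It is an illustration under stated assumptions rather than the project's own code: SQLite (through Python's sqlite3 module) stands in for Oracle, only four text columns are created, the function names are hypothetical, and the three sample texts are invented. Note that SQLite's LIKE ignores case for ASCII letters by default, which may differ from the behaviour of the original system.

```python
import sqlite3

N_COLS = 4  # illustrative only; the Register used 38, later 52, columns


def build_query(n_cols):
    """Build one 'Ln LIKE ?' condition per text column, joined with OR."""
    conditions = " OR ".join(f"L{i} LIKE ?" for i in range(1, n_cols + 1))
    return f'SELECT TEXTNUM FROM "TEXT" WHERE {conditions} ORDER BY TEXTNUM'


def search(conn, pattern, n_cols=N_COLS):
    """Return the text numbers in which any line column matches pattern."""
    sql = build_query(n_cols)
    # The same pattern has to be supplied once for every column searched.
    return [row[0] for row in conn.execute(sql, [pattern] * n_cols)]


def demo_connection():
    """Create a small in-memory TEXT table holding invented sample texts."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"L{i} TEXT" for i in range(1, N_COLS + 1))
    conn.execute(f'CREATE TABLE "TEXT" (TEXTNUM INTEGER PRIMARY KEY, {cols})')
    conn.executemany(
        'INSERT INTO "TEXT" VALUES (?, ?, ?, ?, ?)',
        [
            (1, "Handel's new opera will be performed", "this evening", None, None),
            (2, "A concert of vocal music", "Handel conducting", None, None),
            (3, "A ball at the assembly rooms", None, None, None),
        ],
    )
    return conn
```

With the pattern `'Handel%'`, as in the query above, a text is found only when the name begins one of the stored lines; a pattern such as `'%opera%'` is needed to find a word elsewhere in a line. With 52 columns the generated statement contains 52 such conditions.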
Initially it was planned to construct an index to the texts; this single table would act as an independent database, containing information extracted from the full texts. The material transcribed into the Register being of the utmost variety, it was felt that no single format would encompass all of the items included; the only way to index such heterogeneous information was to fragment the text into discrete details and index each separately. While providing an immensely flexible and sensitive research tool, this system of indexing, given the name ‘Dynamo’ by the project team, would have been enormously time-consuming to set up manually, involving as it did the review of every single record, and entry into the index of every single relevant detail, no matter how many times it occurred in the database.

[85] Numerous books are available explaining SQL and how it is used; Date and Darwen (1997) give a good introduction.

Essentially the shortcomings of the database were caused by the technical limitations of the database software then available:

• Spelling in the seventeenth and eighteenth centuries was not yet standardised, especially where names were concerned, and dealing with these orthographical vagaries when searching posed problems.
• The use of multiple CHAR fields in which to store the text resulted in:
    o A cumbersome editing method; this was especially problematic for texts that were repeated with minor changes or additions that increased the length of a line beyond the 80-character limit, thus necessitating re-entry of the complete text.
    o An unwieldy search method (see above), although this could be alleviated by running a stored SQL query containing a substitution variable character that could be replaced by the required text at run time.
• Very long texts could still not be stored in their entirety but had to span two or more text numbers.
• Indexing of text using the ‘Dynamo’ system would be very time-consuming.
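
The storage and editing limits listed above follow from simple arithmetic: 52 fields of 80 characters hold at most 52 × 80 = 4,160 characters per text number. The sketch below illustrates this, and also how a small insertion near the start of a text can push later lines past the 80-character limit so that every subsequent field changes. It assumes that texts were broken at word boundaries to fit each field, which the source does not state; the function names are hypothetical and the sample words are invented.

```python
import textwrap

FIELD_WIDTH = 80   # length of each CHAR field holding one line of text
MAX_FIELDS = 52    # number of text columns after the expansion from 38


def to_fields(text, width=FIELD_WIDTH):
    """Split a text at word boundaries into lines of at most width characters."""
    return textwrap.wrap(text, width=width)


def to_records(text, width=FIELD_WIDTH, max_fields=MAX_FIELDS):
    """Group the lines into records of at most max_fields fields each.

    Each record would need its own text number, so a result with more
    than one record models a text spanning two or more text numbers.
    """
    fields = to_fields(text, width)
    return [fields[i:i + max_fields] for i in range(0, len(fields), max_fields)]


def changed_fields(old, new, width=FIELD_WIDTH):
    """Return the 1-based positions of the fields that differ after an edit."""
    a, b = to_fields(old, width), to_fields(new, width)
    n = max(len(a), len(b))
    a += [""] * (n - len(a))
    b += [""] * (n - len(b))
    return [i + 1 for i in range(n) if a[i] != b[i]]
```

For example, `to_records("word " * 1000)` yields two records, because the 5,000-character text exceeds the 4,160-character capacity of a single text number, and `changed_fields` reports every field as altered when a single word is inserted at the start of a three-line text.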

A.2 Updating Search and Retrieval Systems

In the late 1980s the development by Oracle Corporation of SQL*TextRetrieval, a free-text retrieval and search module, provided the means to solve many of the shortcomings identified above by allowing searching of the previously unsearchable data type LONG.[86] When loaded into the SQL*TextRetrieval module, each word in the LONG field within a table was indexed and its exact location recorded, thus making searches for individual words within LONG fields
