Testing the Impact of Syllable Aggregation in Romanized Fields of Chinese Language Bibliographic Records


Clement Arsenault
Faculty of Information Studies, University of Toronto, Canada

Abstract: Today, two Romanization systems for Chinese data are in use in most libraries in the Western world: 1) Wade-Giles, and 2) Hanyu pinyin (simply referred to as pinyin). In 1997, the Library of Congress officially announced the adoption of pinyin for Romanizing Chinese data in its bibliographic records. One of the main problems in implementing the pinyin standard for library use is that pinyin, as opposed to Wade-Giles, aggregates Chinese "words" into single linguistic units. Chinese characters represent monosyllabic morphemes rather than words and are equally spaced from one another, so the Chinese text, in its original form, provides no visual cues as to where a word starts or ends. When the script is romanized, however, syllables or words must be separated from one another, since most information retrieval techniques require the identification of "visual words". The Romanized strings can therefore be divided either into monosyllables or into polysyllabic words. This study explores the impact of using either unaggregated (monosyllabic) or aggregated (polysyllabic) pinyin Romanization in Chinese-language bibliographic records. An experiment using transaction log analysis was carried out to observe variations in the retrieval performance of title searches, both phrase and keyword, in a large OPAC of Chinese-language records. General results are presented and a summary of the pros and cons of each method is given.

1. Introduction

Until the mid 1980s, the first online public access catalogues (OPACs) developed in large institutions, and the bibliographic databases produced and maintained by cataloguing agencies, had no built-in capabilities to handle non-Roman scripts. Mainly because of limitations of coding space for large character sets, non-Roman scripts were represented solely by romanized fields. Entering non-Roman vernacular script in MARC records is now technically possible, but even today most local OPACs in the Western world are not equipped with the typographical utilities needed to display the characters contained in these records, let alone with a proper interface to input them into query strings. This leaves the end-user back at square one, that is, with romanized entries.

Today, two Romanization systems for Chinese data are in use in most libraries in the Western world: 1) Wade-Giles, the system used in most North American libraries; 2) Pinyin, the system developed and officially adopted in 1958 by the People's Republic of China (PRC), called Hanyu pinyin but simply referred to as pinyin, used mainly in European and Australian libraries. With the recent adoption of the Hanyu pinyin Romanization standard (pinyin) by the Library of Congress (LC), the replacement of Wade-Giles strings with pinyin entries in bibliographic records is imminent and will affect many libraries in North America in the coming years. Using pinyin instead of Wade-Giles will have a significant impact on retrieval in OPACs. The conversion from Wade-Giles to pinyin will likely be beneficial, since the great majority of end-users are more familiar with pinyin than with Wade-Giles (Young, 1992). Pinyin entries in bibliographic records can be constructed following either a monosyllabic or a polysyllabic pattern. The goal of the current study is to investigate how polysyllabic transcription affects retrieval performance in item-specific title searching in OPACs.

2. Background of the Research

2.1. Basic Characteristics of Chinese Language

There exists a quasi one-to-one syllable-morpheme-character pattern in Chinese (Kratochvil, 1968, 156), in the sense that virtually each character represents, at a given time, a single syllable. This quasi one-to-one relationship between syllables, morphemes and characters has often been a source of confusion in defining what, in Chinese, constitutes a word. It is estimated that around 28% of Chinese words are composed of one character, while 67% are two-character words; the remaining 5% are formed with three or more characters (Suen, 1986, 8). While there exist several thousand Chinese characters, modern standard Chinese (Mandarin) has only about 1300 different syllables (counting tones). There is inevitably a large number of homophone characters. The problem is compounded when tones are ignored, as is the case in Romanized fields of bibliographic records: the number of unique syllables is then reduced to around 408. Unless tones are marked, a little over 400 different syllables must therefore represent the thousands of existing Chinese characters. This is, needless to say, a source of great confusion for users who rely solely on monosyllabic Romanized fields for the identification and retrieval of their bibliographic references. Expressing linguistic word units in aggregated polysyllabic form greatly helps reduce the number of homonyms produced by the monosyllabic transcription method (Anderson, 1972, 12; King, 1983).

2.2. Conversion of Chinese Script in Bibliographic Records

Transliteration has been defined as the process of representing the characters of one alphabet, the target script, into those of another alphabet, the host script (Wellisch, 1978, 28). Because Chinese is a non-phonological writing system, it is impossible to transliterate, in the strict sense of the term, Chinese characters into Roman letters.
The only type of script conversion possible is indirect transcription, that is, using the writing system of one language to represent the sounds of the Chinese characters. Some studies have shown that library users are usually not very successful at retrieving items for which only a Romanized form has been entered in the bibliographic record (Aissing, 1992; Young, 1992). However, in North America, where most automated systems function primarily with the Roman script, Romanization, if used alongside the original script, could be used to enhance access.

2.3. Parsing of Romanized Chinese Entries

In a Chinese text, apart from punctuation, which indicates the end of sentences and their syntactic division, there are no visual cues as to where syntactic words start and end. This lack of visual boundaries does not mean that syntactic words do not exist in Chinese. In a Romanized text, the level of ambiguity created by homophony is such that it is often nearly impossible to make any sense of unaggregated (monosyllabic) Romanized Chinese text. Research has shown that the ambiguity is resolved about 95% of the time when syllables are aggregated into words (King, 1983, 57). Word segmentation is not an easy task, largely because the delimitation of words as syntactic units is often based on both historical and cultural conventions. To this day, no definitive standard on word segmentation of Chinese has been unanimously adopted.

For bibliographic control, pinyin entries in bibliographic records can be constructed following either a monosyllabic or a polysyllabic pattern. Although the former is easier and less costly to implement, it seems rational to believe that it creates data strings that are inadequate for effective and efficient online retrieval of records, because monosyllabic pinyin transcription can only produce somewhere around 410 different syllables for indexing (408 if diacritic marks are ignored by the retrieval algorithm).
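The indexing consequence of the two patterns can be illustrated with a small sketch. It is not the system used in the study: the three titles, the record numbers, and the function names `build_index` and `keyword_search` are all hypothetical, and the word and syllable boundaries are supplied by hand rather than computed.

```python
from collections import defaultdict

# Hypothetical toneless pinyin titles. Each title is a list of words,
# and each word is a list of its syllables (segmentation done by hand).
TITLES = {
    1: [["zhong", "guo"], ["li", "shi"]],   # "Zhongguo lishi"
    2: [["shi", "jie"], ["di", "li"]],      # "Shijie dili"
    3: [["li", "shi"], ["yan", "jiu"]],     # "Lishi yanjiu"
}


def build_index(titles, aggregate):
    """Build an inverted index from index term to set of record ids.

    aggregate=False indexes every syllable separately (monosyllabic pattern);
    aggregate=True joins the syllables of each word (polysyllabic pattern).
    """
    index = defaultdict(set)
    for record_id, words in titles.items():
        for syllables in words:
            terms = ["".join(syllables)] if aggregate else syllables
            for term in terms:
                index[term].add(record_id)
    return index


def keyword_search(index, terms):
    """Return the records that contain every query term (Boolean AND)."""
    hits = [index.get(term, set()) for term in terms]
    return set.intersection(*hits) if hits else set()


mono_index = build_index(TITLES, aggregate=False)
poly_index = build_index(TITLES, aggregate=True)

# Looking for the word "lishi" under each pattern.
print(sorted(keyword_search(mono_index, ["li", "shi"])))  # [1, 2, 3]
print(sorted(keyword_search(poly_index, ["lishi"])))      # [1, 3]
```

In this toy data the monosyllabic keyword search also retrieves record 2, because "shi" and "li" occur there as parts of two unrelated words, while the aggregated index returns only the two titles that actually contain "lishi".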
Using the polysyllabic method is potentially beneficial for end-users, since combining single syllables into linguistic units greatly reduces ambiguity and dramatically increases the number of individual units available for indexing. The decision to use mono- or polysyllabic Romanization will have direct implications on browsing, indexing and retrieval in OPACs and will have vast repercussions on the services we offer to library users. Recognizing the significance of that problem, the Research Libraries Group (RLG) published, in 1987, the Chinese Aggregation Guidelines (RLG, 1987) and adopted a policy where Chinese characters and romanized syllables could be joined with a special "aggregator" character. RLG will soon offer its subscribers the possibility of downloading records with or without these aggregators, which means that local OPACs could contain records in either mono- or polysyllabic Romanization.

3. Research Methodology

The experiment was primarily designed to measure the difference in retrieval performance in OPAC searches when replacing Wade-Giles (WG) entries with monosyllabic pinyin (mPY) and with polysyllabic pinyin (pPY) in Chinese-language bibliographic records. The focus was on item-specific retrieval using the exact-title and the keywords-in-title search modes. Data were obtained by asking 30 library users to perform a specific retrieval task. All participants were native Chinese speakers and graduate students at the University of Toronto. Three treatment groups were defined, namely WG, mPY, and pPY. Each participant was assigned to a specific treatment, 6 to WG and 12 to each of the pinyin groups. The task consisted of using Romanization to search two lists of 20 monograph titles each, in a database containing ca. 50,000 bibliographic records for Chinese monographs. The database records contained Romanized fields only while the printed lists of titles were given in original Chinese characters, so the