CSIC 2014( March )

CSI Communications | ISSN 0970-647X | Volume No. 37 | Issue No. 12 | March 2014 | ₹ 50/-
www.csi-india.org

Cover Story: A Big Need for Indic-Language Solutions
Technical Trends: Digitized Document Processing – Character Recognition Techniques
Article: Card Skimming A Major Threat to E-Commerce
Article: BITCOIN – An Overview of the Popular Digital Cryptocurrency
CIO Perspective: Critical Success Factors of Global Video Conference: A Case Study
Utility Computing and Cloud Computing - Summary of Speech by Prof. V. Rajaraman

Inviting Proposals from CSI Student Branches to Organize National / Regional / State Level CSI Student Conventions During the Year 2014-15

Computer Society of India (CSI) organizes National, Regional, and State level Student Conventions annually, at the active Student Branches across India. These Conventions promote the awareness on technological developments and applications, and foster creative professional orientations among the student community. The Conventions offer excellent opportunities to the students to manifest their technical proficiency and prowess through paper presentations, discussions and extensive interactions with peers and pioneers.

CSI invites Proposals from Student Branches to conduct the National / Regional / State level Student Conventions to be held during the academic year 2014-15 (April to March).

Criteria: The proposing Student Branch should be very active, with a track record of several CSI activities, and be in good standing through the years 2013-14 and 2014-15. The proposals for convention will be evaluated, broadly based on the parameters given below.
a) Number of years of continuous valid student branch at the college (without break)
b) Average student strength over the past three years
c) Number, quality and level of activities at the student branch
d) Prompt submission of activity reports and financial accounts
e) Ability to attract good speakers from Industry
f) Availability of infrastructure and other resources
g) Financial strength and potential
h) Accessibility and other general conditions

Schedule:
State, Regional Student Conventions: To be conducted before January 2015
National Student Convention: To be conducted in February/March 2015

All the State and Regional Student Conventions are to be completed according to the above schedule, so that the winners can participate in the National Student Convention to be held in February / March 2015. The CSI Student Convention Manual (Please see http://www.csi-india.org/web/education-directorate/student-convention-manual1) describes the guidelines and norms to conduct the student conventions.

The Proposal: Interested Student Branches are requested to send electronic proposals with all necessary data, including the information stated below.
a) Type of convention proposed: National/Regional/State level (Proposers of National Convention must have ability to mobilize participation from multiple states and experience of having conducted regional/state level convention earlier)
b) Proposed dates (at least two days) – please indicate two sets of dates
c) A statement of case why the SB should be considered favourably for the proposed event
d) Signed undertaking by the head of the institution to provide all the required support (Document with Scanned signature)
e) Name and contact details of the coordinator-designate for the proposed convention

How to send: The Student Branches may send the proposals through the respective Regional Student Co-ordinator (http://www.csi-india.org/web/guest/about-csi) who may subsequently forward the proposals to the National Student Co-ordinator (mini.ulanat@gmail.com), with a copy to Education Directorate (admn.offi [email protected]).

Time line: Interested Student Branches may please send the proposals with all details through proper channel as explained above to reach CSI Education Directorate before 10 April 2014.

Selection: A Committee constituted by CSI, including the Honorary Secretary, National Student Co-ordinator, Director (Education) will assess the proposals and make the decisions.

CSI Support: CSI extends partial financial assistance, in accordance with the availability of budgetary resources, subject to the approval of the Executive Committee. CSI also supports the publicity efforts for the Conventions.

Convention Helpline: CSI-Education Directorate shall be pleased to offer any information or help on the convention. Please do contact Mr Gnanasekaran (email: admn.offi [email protected] Mobile : 98403 41902) for any assistance.
Rajan T Joseph
Director (Education)
Computer Society of India
Education Directorate, National Headquarters
C I T Campus, 4th Cross Road, Taramani, Chennai 600113
Ph: +91-44-2254 1102/1103/2874; Fax: +91-44-2254 1143

CSI Communications
Contents
Volume No. 37 • Issue No. 12 • March 2014

Editorial Board
Chief Editor: Dr. R M Sonar
Editors: Dr. Debasish Jana, Dr. Achuthsankar Nair
Resident Editor: Mrs. Jayshree Dhere
Published by: Executive Secretary Mr. Suchit Gogwekar, For Computer Society of India
Design, Print and Dispatch by: CyberMedia Services Limited

Cover Story
A Big Need for Indic-Language Solutions. Dr. Deepali Kamthania

Technical Trends
Digitized Document Processing – Character Recognition Techniques. K Magesh

Research Front
Catalysing the Education Revolution Through ICT Based 24X7 Engagements of Students. Dr. K Kotecha and Dr. Richa Mishra
A Research Oriented Undergraduate Curriculum: Design Principles and Concrete Realization. Rajeev Sangal

Articles
Know Your Metadata. Sriram Raghavan and Prof. S V Raghavan
Social Media and Educational Institutions- Evolutionary Dynamics. Prerna Lal
Card Skimming A Major Threat to E-Commerce. Hemant Kumar Saini and Anurag Jagetiya
Role of Optimization Techniques in Digital Image Watermarking. Baisa L Gunjal and Dr. Suresh N Mali
BITCOIN – An Overview of the Popular Digital Cryptocurrency. Mr. K V N Rajesh

Practitioner Workbench
Programming.Tips() » How to Connect PHP into MYSQL Database? Dr. K Valarmathi
Programming.Learn(“R”) » File Input and Output – Part II. Umesh P and Silpa Bhaskaran

CIO Perspective
Critical Success Factors of Global Video Conference: A Case Study. Dushyant Thatte, Gurmeet Rao, Nupur Ray, and Sandeep Bhatt

Security Corner
Information Security » Security Features in Contemporary Browsers for the Users. Krishna Chaitanya Telikicherla, Harigopal K B Ponnapalli and Dr. Ashutosh Saxena

PLUS
IT.Yesterday(): CSI Surat Chapter. Dr. N L Kalthia
Brain Teaser. Dr. Debasish Jana
Ask an Expert. Dr. Debasish Jana
Happenings@ICT: ICT News Briefs in February 2014. H R Mohan
Inaugural Speech Summary: Utility Computing and Cloud Computing. Dr. Anirban Basu
CSI News
CSI Reports

Please note: CSI Communications is published by Computer Society of India, a non-profit organization. Views and opinions expressed in the CSI Communications are those of individual authors, contributors and advertisers and they may differ from policies and official statements of CSI. These should not be construed as legal or professional advice. The CSI, the publisher, the editors and the contributors are not responsible for any decisions taken by readers on the basis of these views and opinions.

Although every care is being taken to ensure genuineness of the writings in this publication, CSI Communications does not attest to the originality of the respective authors’ content.

© 2012 CSI. All rights reserved. Instructors are permitted to photocopy isolated articles for non-commercial classroom use without fee. For any other copying, reprint or republication, permission must be obtained in writing from the Society. Copying for other than personal use or internal reference, or of articles or columns not owned by the Society without explicit permission of the Society or the copyright owner is strictly prohibited.

Published by Suchit Gogwekar for Computer Society of India at Unit No. 3, 4th Floor, Samruddhi Venture Park, MIDC, Andheri (E), Mumbai-400 093. Tel. : 022-2926 1700 • Fax : 022-2830 2133 • Email : [email protected] Printed at GP Offset Pvt. Ltd., Mumbai 400 059.

Know Your CSI

Executive Committee (2013-14/15) »
President: Prof. S V Raghavan, [email protected]
Vice-President: Mr. H R Mohan, [email protected]
Hon. Secretary: Mr. S Ramanathan, [email protected]
Hon. Treasurer: Mr. Ranga Rajagopal, [email protected]
Immd. Past President: Mr. Satish Babu, [email protected]

Nomination Committee (2013-2014)
Prof. H R Vishwakarma, Dr. Ratan Datta, Dr. Anil Kumar Saini

Regional Vice-Presidents
Region - I: Mr. R K Vyas. Delhi, Punjab, Haryana, Himachal Pradesh, Jammu & Kashmir, Uttar Pradesh, Uttaranchal and other areas in Northern India. [email protected]
Region - II: Prof. Dipti Prasad Mukherjee. Assam, Bihar, West Bengal, North Eastern States and other areas in East & North East India. [email protected]
Region - III: Prof. R P Soni. Gujarat, Madhya Pradesh, Rajasthan and other areas in Western India. [email protected]
Region - IV: Mr. Sanjeev Kumar. Jharkhand, Chattisgarh, Orissa and other areas in Central & South Eastern India. [email protected]
Region - V: Mr. Raju L kanchibhotla. Karnataka and Andhra Pradesh. [email protected]
Region - VI: Mr. C G Sahasrabudhe. Maharashtra and Goa. [email protected]
Region - VII: Mr. S P Soman. Tamil Nadu, Pondicherry, Andaman and Nicobar, Kerala, Lakshadweep. [email protected]
Region - VIII: Mr. Pramit Makoday. International Members. [email protected]
Recommended publications
  • Two Schemas for Online Character Recognition of Telugu Script Based on Support Vector Machines
    2012 International Conference on Frontiers in Handwriting Recognition. Two Schemas for Online Character Recognition of Telugu script based on Support Vector Machines. Rajkumar.J, Mariraja K., Kanakapriya,K., Nishanthini, S., Chakravarthy, V.S., Indian Institute of Technology, Madras. [email protected] , [email protected] , [email protected] , [email protected] , [email protected]
    Abstract: We present two schemas for online recognition of Telugu characters, involving elaborate multi-classifier architectures. Considering the three-tier vertical organization of a typical Telugu character, we divide the stroke set into 4 subclasses primarily based on their vertical position. Stroke level recognition is based on a bank of Support Vector Machines (SVMs), with a separate SVM trained on each of these classes. Character recognition for Schema 1 is based on a Ternary Search Tree (TST), while for Schema 2 it is based on a SVM. The two schemas yielded overall stroke recognition performances of 89.59% and 96.69% respectively surpassing some of the recent online recognition
    … genetic algorithms for recognition of Devnagari script was described in (Jitendra and Chakravarthy 2008). A system for OHCR of Tamil and Telugu characters based on elastic matching was proposed in (Prashanth et al 2007). Jagdeesh Babu et al (2007) present a Hidden Markov Model (HMM) based OHCR system for Telugu, at symbol level and not at character level. In this paper we develop an OHCR system using SVMs for Telugu script. Telugu is a language spoken in the southern part of India. The language consists of 16 vowels and 35 consonants that can combine to form about 10000 composite characters. Defining a stroke as what is written between touch-down and lift-off of the pen, we found 235 unique strokes in Telugu script.
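    The abstract above says that Schema 1 recognises characters with a Ternary Search Tree (TST). The excerpt does not show how the tree is keyed, so the following is only a minimal sketch of the data structure, assuming (hypothetically) that each recognised stroke is reduced to an integer class ID and that a character is looked up by its sequence of stroke IDs. The IDs and characters in the usage lines are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One TST node: a stroke ID plus three children."""
    key: int
    lo: Optional["Node"] = None    # subtree for smaller stroke IDs
    eq: Optional["Node"] = None    # subtree for the next stroke in the sequence
    hi: Optional["Node"] = None    # subtree for larger stroke IDs
    value: Optional[str] = None    # character stored where a sequence ends


class TernarySearchTree:
    """Maps stroke-ID sequences to characters (hypothetical encoding)."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, strokes: Sequence[int], character: str) -> None:
        if not strokes:
            raise ValueError("a character needs at least one stroke")
        self.root = self._insert(self.root, strokes, 0, character)

    def _insert(self, node, strokes, i, character):
        key = strokes[i]
        if node is None:
            node = Node(key)
        if key < node.key:
            node.lo = self._insert(node.lo, strokes, i, character)
        elif key > node.key:
            node.hi = self._insert(node.hi, strokes, i, character)
        elif i + 1 < len(strokes):
            node.eq = self._insert(node.eq, strokes, i + 1, character)
        else:
            node.value = character
        return node

    def lookup(self, strokes: Sequence[int]) -> Optional[str]:
        """Return the stored character, or None if the sequence is unknown."""
        node, i = self.root, 0
        while node is not None and i < len(strokes):
            key = strokes[i]
            if key < node.key:
                node = node.lo
            elif key > node.key:
                node = node.hi
            elif i + 1 == len(strokes):
                return node.value
            else:
                node, i = node.eq, i + 1
        return None


# Invented stroke IDs, for illustration only.
tree = TernarySearchTree()
tree.insert([12, 40], "A")
tree.insert([12], "B")
tree.insert([7, 3, 9], "C")
print(tree.lookup([12, 40]), tree.lookup([12]), tree.lookup([7, 3]))
```

    A TST stores one stroke per node, so sequences that share a prefix share nodes, and a lookup walks at most one `eq` link per stroke.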
  • Bringing Ol Chiki to the Digital World
    Typography and Education, Typography Day 2016, http://www.typoday.in. Bringing Ol Chiki to the digital world. Saxena, Pooja, [email protected]; Panigrahi, Subhashish, Programme Officer, Access to Knowledge, Center for Internet and Society, Bengaluru (India), [email protected]. Abstract: Can a typeface turn the fate of an indigenous language around by making communication possible on digital platforms and driving digital activism? In 2014, a project was initiated with financial support from the Access to Knowledge programme at The Center for Internet and Society, Bangalore to look for answers to this question. The project’s goal was to design a typeface family supporting Ol Chiki script, which is used to write Santali, along with input methods that would make typing in Ol Chiki possible. It was planned that these resources would be released under a free license, with the hope to provide tools to Santali speakers to read and write in their own script online. Key words: Ol Chiki script, typeface design, minority script, Santali language. 1. Introduction: The main aim of this paper is to share the experiences and knowledge gained by working on a typeface and input method design project for a minority script from India, in this case Ol Chiki, which is used to write Santali. This project was initiated by the Access to Knowledge programme at the Center for Internet and Society (CIS-A2K), whose mandate is to work towards catalysing the growth of the free and open knowledge movement in South Asia and in Indic languages. From September 2012, CIS has been actively involved in growing the open knowledge movement in India through a grant received from the Wikimedia Foundation (WMF).
  • Dr. UB Pavanaja
    Dr. U.B. Pavanaja - a profile. Born in a village bordering Karnataka and Kerala. Education - MSc (Mysore Univ), PhD (Bombay Univ), post-doctoral research in Taiwan. Scientist at BARC, Bombay for 15 years. Published many research articles. Active member of the Kannada Sangha of BARC, an association for the propagation of science through Kannada. Edited the Kannada science magazine of Kannada Sangha, BARC, "belagu", for 5 years. Gave a new look to the magazine through DTP. Organized many seminars in Kannada on various science topics at BARC. Brought a PC culture to the chemist communities at BARC. Resigned from BARC in June 1997 and moved to Bangalore to work for Kannada and computers. Made the first Kannada program, called "Kannada Kali" (Learn Kannada), a game which helps nursery kids and non-Kannadigas to learn the Kannada alphabet, in 1993. The program has since evolved to include graphics and multimedia. It is well appreciated by many Kannadigas and non-Kannadigas. Many Kannada organizations all over the world use this program to introduce the Kannada alphabet to children at their Kannada classes. This program is available for free download at the Vishva Kannada site. Put up the Kannada web site Vishva Kannada (http://www.vishvakannada.com/) in Dec. 1996. It has many firsts to its credit - first Kannada web site, first Kannada online magazine, first Indian language web site to use dynamic fonts. Actively involved with Kannada Ganaka Parishat (KGP). KGP is a voluntary body formed by computer professionals, literary persons and Kannada enthusiasts for the standardization and usage of Kannada on computers. KGP is the official certifying agency of the Karnataka Govt for Kannada software.
  • Digital Review of Asia Pacific 2007-2008
    DIGITAL REVIEW of Asia Pacific 2007-2008. Reports on 31 economies, 2 sub-regional associations. ICT4D in Asia Pacific: An Overview of Emerging Issues; Mobile and Wireless Technologies for Development in Asia Pacific; The Role of ICTs in Risk Communication in Asia Pacific; Localization in Asia Pacific; Key Policy Issues in Intellectual Property and Technology in Asia Pacific; State and Evolution of ICTs: A Tale of Two Asias. www.digital-review.org. Supplementary news, reports and analyses are available for download at: http://www.digital-review.org
    CHIEF EDITOR: Felix Librero. ASSOCIATE EDITOR: Patricia B. Arinto. EDITORIAL BOARD: Ilyas Ahmed, Danny Butt, Claude-Yves Charron, Suchit Nanda, Maria Ng Lee Hoon, Milagros Rivera, Rajesh Sreenivasan, Krishnamurthy Sriramesh, Jian Yan Wang.
    CONTRIBUTING AUTHORS: Frederick John Abo, Musa Abu Hassan, Muhammad Aimal Marjan, Zorayda Ruth Andam, Lkhagvasuren Ariunaa, Batpurev Batchuluun, Axel Bruns, Danny Butt, Donny B.U., Elizabeth V. Cardoza, Claude-Yves Charron, Kapil Chawla, Masoud Davarinejad, Deng Jianguo, Hj Abd Rahim Derus, Joâo Câncio Freitas, John Fung, Atanu Garai, Goh Seow Hiong, Lelia Green, Nalaka Gunawardene, Tan; Salman Malik, Jamshed Masood, Ram Mohan, Charles Mok, Rapin Mudiardjo, Frederick Noronha, Them Oo, Sushil Pandey, Adam Peake, Phonpasit Phissamay, Gopi Pradhan, Ananya Raihan, Naomi Robinson, Massood Saffari, Lorraine Carlos Salazar, George Sciadas, Basanta Shrestha, Abhishek Singh, Rajesh Sreenivasan, Krishnamurthy Sriramesh
  • Human Language Computing in Indian Languages - a Holistic Perspective
    Human Language Computing in Indian Languages - A Holistic Perspective. Swaran Lata, Country Manager, W3C India; Director & Head, TDIL Programme, Dept of Information Technology, Govt. of India. E-mail: [email protected]
    Organization of presentation: Languages of India and its distribution; Technology Development for Indian Languages Programme; Phases of TDIL Programme; Paradigm Shift – Consortium mode projects; Linguistic Resources developed; Standardization Efforts (Core, Linguistic Resources); Testing and Evaluation Initiatives; Possible Collaborations with EU Programme; Future Directions.
    Languages of India: INDIA, A Primer. Total Population: 1,028,737,436 (Source: Census of India 2001). States: 28; UTs: 07. [Map of India showing languages by state and union territory.] Languages (percentage to total population): Hindi (41.03), Bengali (8.11), Telugu (7.19), Marathi (6.99), Gujarati (4.48), Malayalam (3.21), Manipuri (0.14).
    Linguistic Scenario in India (Source: Census 2001, India). Each entry gives language: speakers; percentage to total population; State(s).
    Assamese: 13,168,484; 1.28; Assam
    Bengali: 83,369,769; 8.11; Andaman & Nicobar Islands, Assam, Tripura, West Bengal
    Bodo: 1,350,478; 0.13; Assam
    Dogri: 2,282,589; 0.22; Jammu and Kashmir
    Gujarati: 46,091,617; 4.48; Dadra and Nagar Haveli, Daman and Diu, Gujarat
    Hindi: 422,048,642; 41.03; Andaman and Nicobar Islands, Arunachal Pradesh, Bihar, Chandigarh, Chhattisgarh, Delhi, Haryana, Himachal Pradesh, Jharkhand, Madhya Pradesh, Rajasthan, Uttar Pradesh and Uttarakhand
    Kannada: 37,924,011; 3.69; Karnataka
  • Issues in Representation of Indic Scripts in Unicode
    Issues in Representation of Indic Scripts in Unicode. Dr. Om Vikas, Government of India, Ministry of Communications & Information Technology, Department of Information Technology. Email: [email protected]. UTC # 102, Unicode Technical Committee Meeting, February 7-10, 2005, Mountain View, CA, USA.
    Contents: Characteristics of Indic script; India’s initiative for Language Technology Development; Awaiting updates in Unicode; Proposal for New Scripts.
    Characteristics of Indic Scripts. Linguistic Scenario in India: India is a Multilingual Multiscript Country. Twenty Two constitutionally recognized Indian Languages are mentioned as follows with their scripts within parentheses: Hindi (Devanagari), Konkani (Devanagari), Marathi (Devanagari), Nepali (Devanagari), Sanskrit (Devanagari), Sindhi (Devanagari/Urdu), Kashmiri (Devanagari/Urdu); Assamese (Assamese), Manipuri (Manipuri), Bangla (Bangali), Oriya (Oriya), Gujarati (Gujarati), Punjabi (Gurumukhi), Telugu (Telugu), Kannada (Kannada), Tamil (Tamil), Malayalam (Malayalam), Urdu (Urdu), Bodo (Devanagari, Assamese), Dogri (Devanagari), Maithili (Devanagari), Santhali (Ol Chiki). There are 10 Indic Scripts in vogue. Less than 5 percent of people can either read or write English. Over 95 percent of the population is normally deprived of the benefits of English-based Information Technology.
    Characteristics of Indian Languages: What You Speak Is What You Write (WYSIWYW); script grammar describes transformation rules; relatively word-order-free; common phonetic based alphabet; common concept terms (from Sanskrit). Indian languages owe their origin to Sanskrit, hence they have in common rich cultural heritage and treasure of knowledge. Indic scripts have originated from Brahmi script.
  • Proposal for a Devanagari Script Root Zone Label Generation Rule-Set (LGR)
    Proposal for a Devanagari Script Root Zone Label Generation Rule-Set (LGR). LGR Version: 3.0. Date: 2019-03-06. Document version: 6.3. Authors: Neo-Brahmi Generation Panel [NBGP].
    1 General Information/ Overview/ Abstract: This document lays down the Label Generation Rule Set for the Devanagari script. Three main components of the Devanagari Script LGR i.e. Code point repertoire, Variants and Whole Label Evaluation Rules have been described in detail here. All these components have been incorporated in a machine-readable format in the accompanying XML file named "proposal-devanagari-lgr-06mar19-en.xml". In addition, a document named “devanagari-test-labels-06mar19-en.txt” has been provided. It contains a list of valid and invalid labels as per the Whole Label Evaluation laid down in Section 7 of this document. The labels have been tagged as valid and invalid under the specific rules [1]. In addition, the file also lists the set of labels which can produce variants as laid down in Section 6 of this document.
    2 Script for which the LGR is proposed: ISO 15924 Code: Deva. ISO 15924 Key N°: 315. ISO 15924 English Name: Devanagari (Nagari). Latin transliteration of native script name: dévanâgarî.
    [1] The categorization of invalid labels under specific rules is given as per the general understanding of the LGR Tool by the NBGP. During testing with any LGR tool, whether a particular label gets flagged under the same rule or the different one is totally dependent on the internal implementation of the LGR Tool. In case of discrepancy among the same, the fact that it is an invalid label should only be considered.
  • Automatic Transliterator from One Indic Script to Another
    European Scientific Journal, May edition, vol. 8, No.11. ISSN: 1857 – 7881 (Print), e-ISSN 1857-7431. AUTOMATIC TRANSLITERATION AMONG INDIC SCRIPTS USING CODE MAPPING FORMULA. Ahmad Hweishel AL-Farjat, Applied Science Department, AlBalqa Applied University, Jordan, Aqaba. Abstract: This paper discusses the development of an automatic transliterator which transliterates one Brahmi-origin Indic script to another. The methodology is used specifically to map a text from one writing system to another; it does not map the sounds of one language to the best matching script of another language. Here, we discuss a method of transliteration based on finding the patterns in the Unicode chart. It mainly works for the following Indic scripts: Bangla, Devanagari, Gurumukhi, Gujarati, Kannada, Malayalam, Oriya, Tamil, and Telugu. Each of these Indic scripts can be interchangeably transliterated. Keywords: Translation, Transliteration, Script, Arabic, Kannada. Introduction: As the majority of the population knows more than one language, they understand spoken or verbal communication; however, when it comes to scripts or written communication, the number diminishes, and so a need arises for transliteration tools which can convert text written in one language's script to another script. Transliteration is mapping of pronunciation and articulation of words written in one script into another script. Transliteration should
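    The excerpt above describes transliterating by finding patterns in the Unicode chart but does not reproduce the formula. One such pattern is that the nine scripts listed occupy consecutive 128-code-point Unicode blocks with a largely parallel layout, so a character can often be moved to another script by shifting its code point by the difference between block starts. The sketch below assumes that offset rule; the function name and the fallback of leaving a character unchanged when the shifted code point is unassigned are illustrative choices, not the paper's method.

```python
import unicodedata

# First code point of each script's Unicode block (each block spans 0x80 code points).
BLOCK_START = {
    "devanagari": 0x0900,
    "bengali": 0x0980,
    "gurmukhi": 0x0A00,
    "gujarati": 0x0A80,
    "oriya": 0x0B00,
    "tamil": 0x0B80,
    "telugu": 0x0C00,
    "kannada": 0x0C80,
    "malayalam": 0x0D00,
}
BLOCK_SIZE = 0x80


def transliterate(text: str, source: str, target: str) -> str:
    """Shift each character of the source script into the target script's block.

    Characters outside the source block are copied unchanged. If the shifted
    code point is unassigned in the target script (Tamil, for example, has
    fewer consonants), the original character is kept; that fallback is an
    assumption of this sketch.
    """
    src, dst = BLOCK_START[source], BLOCK_START[target]
    out = []
    for ch in text:
        cp = ord(ch)
        if src <= cp < src + BLOCK_SIZE:
            mapped = chr(cp - src + dst)
            # unicodedata.name returns the default ("") for unassigned code points.
            out.append(mapped if unicodedata.name(mapped, "") else ch)
        else:
            out.append(ch)
    return "".join(out)


# Devanagari KA, MA, LA shifted into the Telugu block.
print(transliterate("\u0915\u092e\u0932", "devanagari", "telugu"))
```

    A plain offset ignores script-specific conventions (conjunct formation, characters with no counterpart), so a production transliterator would need per-script exception tables on top of it.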
  • A Survey of Software Localization Work
    Volume 4, No. 8, August 2013. Journal of Global Research in Computer Science. REVIEW ARTICLE. Available Online at www.jgrcs.info. A SURVEY OF SOFTWARE LOCALIZATION WORK. Manisha Bhatia*1, Varsha Tomar2 and Aparna Sharma3. *1Computer Science, Banasthali University, Jaipur, Rajasthan, India [email protected]; 2Information Technology, Banasthali University, Jaipur, Rajasthan, India [email protected]; 3Computer Science, Banasthali University, Jaipur, Rajasthan, India [email protected]. Abstract: Localization concerns the translation of digital content and software, and their appropriate presentation to end users in different locales. Localization is important because having software, a website or other content in several languages, and meeting several sets of cultural expectations is an important international marketing advantage. This paper presents the difference between software localization, globalization and internationalization and compares the traditional document translation with respect to software localization. The paper further includes the survey of various localization software and software localization services, methods and tools available worldwide. As part of conclusion the steps followed for localizing a software project is included. We intend this paper to be useful to researchers and practitioners interested in software localization. Keywords: Localization (L10n), Internationalization (I18n), Globalization (G11n), Document Translation, Indic Computing.
    INTRODUCTION: Localization is the process of adapting a product or service to a particular language, culture, and desired local "look-and-feel" (Rouse, 2005). Software Localization is more than the translation of a product's User Interface.
    In computing, internationalization and localization are means of adapting computer software to different languages, regional differences and technical requirements of a target market. Internationalization (i18n) is the process of designing a software application so that it can be adapted to
  • Proposal for a Devanagari Script Root Zone Label Generation Rule-Set (LGR)
    Proposal for a Devanagari Script Root Zone Label Generation Rule-Set (LGR). LGR Version: 3.0. Date: 2018-07-27. Document version: 6.1. Authors: Neo-Brahmi Generation Panel [NBGP].
    1 General Information/ Overview/ Abstract: This document lays down the Label Generation Rule Set for the Devanagari script. Three main components of the Devanagari Script LGR i.e. Code point repertoire, Variants and Whole Label Evaluation Rules have been described in detail here. All these components have been incorporated in a machine-readable format in the accompanying XML file named "Proposal-lgr-devanagari-20180727.xml". In addition, a document named “Devanagari-test-labels-20180727.txt” has been provided. It contains a list of valid and invalid labels as per the Whole Label Evaluation laid down in Section 7 of this document. The labels have been tagged as valid and invalid under the specific rules [1]. In addition, the file also lists the set of labels which can produce variants as laid down in Section 6 of this document.
    2 Script for which the LGR is proposed: ISO 15924 Code: Deva. ISO 15924 Key N°: 315. ISO 15924 English Name: Devanagari (Nagari). Latin transliteration of native script name: dévanâgarî.
    [1] The categorization of invalid labels under specific rules is given as per the general understanding of the LGR Tool by the NBGP. During testing with any LGR tool, whether a particular label gets flagged under the same rule or the different one is totally dependent on the internal implementation of the LGR Tool. In case of discrepancy among the same, the fact that it is an invalid label should only be considered.
  • An Extensive Literature Review on CLIR and MT Activities in India
    International Journal of Scientific & Engineering Research, Volume 4, Issue 2, February-2013. ISSN 2229-5518. An Extensive Literature Review on CLIR and MT activities in India. Kumar Sourabh. Abstract: This paper addresses the various developments in Cross Language IR and machine transliteration system in India. First part of this paper discusses the CLIR systems for Indian languages and second part discusses the machine translation systems for Indian languages. Three main approaches in CLIR are machine translation, a parallel corpus, or a bilingual dictionary. Machine translation-based (MT-based) approach uses existing machine translation techniques to provide automatic translation of queries. The information can be retrieved and utilized by the end users by integrating the MT system with other text processing services such as text summarization, information retrieval, and web access. It enables the web user to perform cross language information retrieval from the internet. Thus CLIR is naturally associated with MT (Machine Translation). This Survey paper covers the major ongoing developments in CLIR and MT with respect to following: Existing projects, Current projects, Status of the projects, Participants, Government efforts, Funding and financial aids, Eleventh Five Year Plan (2007-2012) activities and Twelfth Five Year Plan (2012-2017) Projections. Keywords: Machine Translation, Cross Language Information Retrieval, NLP.
    1. INTRODUCTION: Information retrieval (IR) system intends to retrieve relevant documents to a user query where the query is a set of keywords. Monolingual Information Retrieval - refers to the
    … support truly cross-language retrieval. Many search engines are monolingual but have the added functionality to carry out translation of the retrieved pages from one language to another, for example, Google, yahoo and AltaVista.
  • Indic Typesetting – Challenges and Opportunities
    Indic Typesetting – Challenges and Opportunities. S. Rajkumar, Linuxense Information Systems, “Lalita Mandir”, 16/1623, Jagathy, Trivandrum 695014, India. [email protected]. Abstract: Asia boasts a wide variety of scripts, most of which are complex from the perspective of a computer scientist or engineer. This is true in the case of Indic scripts which are classified in the realm of complex scripts. All of the Indic scripts require a special process to transform from a Unicode text to the actual glyph metrics for TeX to process. In this paper I will talk about the processing of Unicode text to produce high quality typeset material for Indic scripts using OpenType fonts. I will cover the OpenType standards for Indic scripts and other facilities OpenType provides for advanced typesetting.
    Introduction: TeX has remained and continues to remain one of the best typesetting systems of the world. But with the passage of time, TeX has been used for purposes that were not taken into account during its design phase. One such “flaw” is that it was based on 8-bit tables internally. This applies among others to the number of glyph in font files and the number of characters in a language.
    … a full 16 bits and is based on Unicode, and thus supports 64k glyph in a single font. OpenType also supports advanced typographical control such as ligatures, kerning, small caps etc, which were available in TeX for a long time, plus swash variants, contextual ligatures, old-style figures, multi-script baselines etc, which are not part of TeX. What sets apart OpenType from others that offer these features, including TeX, is that the ren-