Boost.Regex John Maddock Copyright © 1998-2010 John Maddock

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Table of Contents

Configuration
    Compiler Setup
    Locale and traits class selection
    Linkage Options
    Algorithm Selection
    Algorithm Tuning
Building and Installing the Library
Introduction and Overview
Unicode and Boost.Regex
Understanding Marked Sub-Expressions and Captures
Partial Matches
Regular Expression Syntax
    Perl Regular Expression Syntax
    POSIX Extended Regular Expression Syntax
    POSIX Basic Regular Expression Syntax
    Character Class Names
        Character Classes that are Always Supported
        Character classes that are supported by Unicode Regular Expressions
    Collating Names
        Digraphs
        POSIX Symbolic Names
        Named Unicode Characters
    The Leftmost Longest Rule
Search and Replace Format String Syntax
    Sed Format String Syntax
    Perl Format String Syntax
    Boost-Extended Format String Syntax
Reference
    basic_regex
    match_results
    sub_match
    regex_match
    regex_search
    regex_replace
    regex_iterator
    regex_token_iterator
    bad_expression
    syntax_option_type
        syntax_option_type Synopsis
        Overview of syntax_option_type
        Options for Perl Regular Expressions
        Options for POSIX Extended Regular Expressions
        Options for POSIX Basic Regular Expressions
        Options for Literal Strings
    match_flag_type
    error_type
    regex_traits
    Interfacing With Non-Standard String Types
        Working With Unicode and ICU String Types
            Introduction to using Regex with ICU
            Unicode regular expression types
            Unicode Regular Expression Algorithms
            Unicode Aware Regex Iterators
        Using Boost Regex With MFC Strings
            Introduction to Boost.Regex and MFC Strings
            Regex Types Used With MFC Strings
            Regular Expression Creation From an MFC String
            Overloaded Algorithms For MFC String Types
            Iterating Over the Matches Within An MFC String
    POSIX Compatible C API's
    Concepts
        charT Requirements ...
Recommended publications
  • Traits: Experience with a Language Feature
    Traits: Experience with a Language Feature
    Emerson R. Murphy-Hill, The Evergreen State College, Evergreen Parkway NW, Olympia, WA, mureme@evergreen.edu
    Andrew P. Black, OGI School of Science & Engineering, Oregon Health and Science University, NW Walker Rd, Beaverton, OR, black@cse.ogi.edu

    ABSTRACT
    This paper reports our experiences using traits, collections of pure methods designed to promote reuse and understandability in object-oriented programs. Traits had previously been used to refactor the Smalltalk collection hierarchy, but only by the creators of traits themselves. This experience report represents the first independent test of these language features. Murphy-Hill implemented a substantial multi-class data structure called ropes that makes significant use of traits. We found that traits improved understandability and reduced the number of methods that needed to be written by 46%.

    Categories and Subject Descriptors
    D.2.3 [Programming Languages]: Coding Tools and Techniques - object-oriented programming
    D.3.3 [Programming Languages]: Language Constructs and

    ... the desired semantics of that method changes, or if a bug is found, the programmer must track down and fix every copy. By reusing a method, behavior can be defined and maintained in one place.

    In object-oriented programming, inheritance is the normal way of reusing methods—classes inherit methods from other classes. Single inheritance is the most basic and most widespread type of inheritance. It allows methods to be shared among classes in an elegant and efficient way, but does not always allow for maximum reuse.

    Consider a small example. In Squeak [7], a dialect of Smalltalk, the class Collection is the superclass of all the classes that implement collection data structures, including Array, Heap, and Set. The property of being empty is common to many objects—it simply requires that the object have a size method, and that the method returns zero.
  • Javascript and the DOM
    Javascript and the DOM
    Introduzione alla programmazione web – Marco Ronchetti 2020 – Università di Trento

    The web architecture with smart browser
    The web programmer also writes programs which run on the browser. Which language? Javascript!
    [Figure labels: HTTP Get + params; File System; Smart browser; Server httpd; Cgi-bin; Internet; Query SQL; Client process; DB; Data]
    Evolution 3: execute code also on client! (How?)

    Javascript and the DOM
    1- Adding dynamic behaviour to HTML

    Example 1: onmouseover, onmouseout
    <!DOCTYPE html>
    <html>
      <head>
        <title>Dynamic behaviour</title>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
      </head>
      <body>
        <div onmouseover="this.style.color = 'red'"
             onmouseout="this.style.color = 'green'">
          I can change my colour!</div>
      </body>
    </html>
    The dynamic behaviour is on the client side! (The file can be loaded locally)

    Example 2: onmouseover, onmouseout
    <body>
      <div
        onmouseover="this.style.background='orange';
                     this.style.color = 'blue';"
        onmouseout="this.innerText='and my text and position too!';
                    this.style.position='absolute';
                    this.style.left='100px';
                    this.style.top='150px';
                    this.style.borderStyle='ridge';
                    this.style.borderColor='blue';
                    this.style.fontSize='24pt';">
        I can change my colour...
      </div>
    </body>

    JavaScript is event-based
    UiEvents: these event objects inherit the properties of the UiEvent:
    • The FocusEvent
    • The InputEvent
    • The KeyboardEvent
    • The MouseEvent
    • The TouchEvent
    • The WheelEvent
    See https://www.w3schools.com/jsref/obj_uievent.asp

    Test and Gym
    https://www.jdoodle.com/html-css-javascript-online-editor/ (editor panels: JavaScript, HTML head, HTML body, CSS)

    Javascript and the DOM
    2- Introduction to the language

    JavaScript History
    • JavaScript was born as Mocha, then “LiveScript” at the beginning of the 94’s.
  • The ELinks Manual
    The ELinks Manual
    Table of Contents
    Preface
    1. Getting ELinks up and running
        1.1. Building and Installing ELinks
        1.2. Requirements
        1.3. Recommended Libraries and Programs
        1.4. Further reading
        1.5. Tips to obtain a very small static elinks binary
        1.6. ECMAScript support?!
            1.6.1. Ok, so how to get the ECMAScript support working?
            1.6.2. The ECMAScript support is buggy! Shall I blame Mozilla people?
            1.6.3. Now, I would still like NJS or a new JS engine from scratch.
        1.7. Feature configuration file (features.conf)
  • Introduction to Javascript
    Introduction to JavaScript
    Lecture 6, CGS 3066, Fall 2016, October 6, 2016

    JavaScript
    • Dynamic programming language. Program the behavior of web pages.
    • Client-side scripts to interact with the user.
    • Communicates asynchronously and alters document content.
    • Used with Node.js in server side scripting, game development, mobile applications, etc.
    • Has thousands of libraries that can be used to carry out various tasks.

    JavaScript is NOT Java
    • Names can be deceiving.
    • Java is a full-fledged object-oriented programming language.
    • Java is popular for developing large-scale distributed enterprise applications and web applications.
    • JavaScript is a browser-based scripting language developed by Netscape and implemented in all major browsers.
    • JavaScript is executed by the browsers on the client side.

    JavaScript and other languages
    JavaScript borrows elements from a variety of languages.
    • Object orientation from Java.
    • Syntax from C.
    • Semantics from Self and Scheme.

    What's a script?
    • A program written for a special runtime environment.
    • Interpreted (as opposed to compiled).
    • Used to automate tasks.
    • Operates at very high levels of abstraction.

    What's JavaScript?
    • Developed at Netscape to perform client side validation.
    • Adopted by Microsoft in IE 3.0 (1996).
    • Standardized by Ecma in 1997. Current standard is ECMAScript 6 (2015).
    • Specifications for ECMAScript 2016 are out.
    • CommonJS used for development outside the browser.

    JavaScript uses
    • JavaScript has an insanely large API and library.
    • It is possible to do almost anything with JavaScript.
    • Write small scripts/apps for your webpage.
  • NINETEENTH PLENARY MEETING of ISO/IEC JTC 1/SC 22 London, United Kingdom September 19-22, 2006 [20060918/22] Version 1, April 17, 2006
    NINETEENTH PLENARY MEETING OF ISO/IEC JTC 1/SC 22
    London, United Kingdom, September 19-22, 2006 [20060918/22]
    Version 1, April 17, 2006

    1. OPENING OF PLENARY MEETING (9:00 hours, Tuesday, September 19)
    2. CHAIRMAN'S REMARKS
    3. ROLL CALL OF DELEGATES
    4. APPOINTMENT OF DRAFTING COMMITTEE
    5. ADOPTION OF THE AGENDA
    6. REPORT OF THE SECRETARY
        6.1 SC 22 Project Information
        6.2 Proposals for New Work Items within SC 22
        6.3 Outstanding Actions From the Eighteenth Plenary of SC 22
    Page 1 of 7 JTC 1 SC 22, 2005 Version 1, April 14, 2006
        6.4 Transition to ISO Livelink
            6.4.1 SC 22 Transition
    7. ACTIVITY REPORTS
        7.1 National Body Reports
        7.2 External Liaison Reports
            7.2.1 ECMA International (Rex Jaeschke)
            7.2.2 Free Standards Group (Nick Stoughton)
            7.2.2 Austin Joint Working Group (Nick Stoughton)
        7.3 Internal Liaison Reports
            7.3.1 Liaison Officers from JTC 1/SC 2 (Mike Ksar)
            7.3.2 Liaison Officer from JTC 1/SC 7 (J. Moore)
            7.3.3 Liaison Officer from ISO/TC 37 (Keld Simonsen)
            7.3.5 Liaison Officer from JTC 1 SC 32 (Frank Farance)
        7.4 Reports from SC 22 Subgroups
            7.4.1 Other Working Group Vulnerabilities (Jim Moore)
            7.4.2 SC 22 Advisory Group for POSIX (Stephen Walli)
        7.5 Reports from JTC 1 Subgroups
            7.5.1 JTC 1 Vocabulary (John Hill)
            7.5.2 JTC 1 Ad Hoc Directives (John Hill)
    8.
  • The Cedar Programming Environment: a Midterm Report and Examination
    The Cedar Programming Environment: A Midterm Report and Examination
    Warren Teitelman†
    CSL-83-11, June 1984 [P83-00012]
    © Copyright 1984 Xerox Corporation. All rights reserved.

    CR Categories and Subject Descriptors: D.2.6 [Software Engineering]: Programming environments.
    Additional Keywords and Phrases: integrated programming environment, experimental programming, display oriented user interface, strongly typed programming language environment, personal computing.

    † The author's present address is: Sun Microsystems, Inc., 2550 Garcia Avenue, Mountain View, Ca. 94043. The work described here was performed while employed by Xerox Corporation.

    Xerox Corporation, Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, California 94304

    Abstract: This collection of papers comprises a report on Cedar, a state-of-the-art programming system. Cedar combines in a single integrated environment: high-quality graphics, a sophisticated editor and document preparation facility, and a variety of tools for the programmer to use in the construction and debugging of his programs. The Cedar Programming Language is a strongly-typed, compiler-oriented language of the Pascal family. What is especially interesting about the Cedar project is that it is one of the few examples where an interactive, experimental programming environment has been built for this kind of language. In the past, such environments have been confined to dynamically typed languages like Lisp and Smalltalk.

    The first paper, "The Roots of Cedar," describes the conditions in 1978 in the Xerox Palo Alto Research Center's Computer Science Laboratory that led us to embark on the Cedar project and helped to define its objectives and goals.
  • Proposal to Refocus TC39-TG1 on the Maintenance of the ECMAScript, 3rd Edition Specification
    Proposal to Refocus TC39-TG1 On the Maintenance of the ECMAScript, 3rd Edition Specification
    Submitted by: Yahoo! Inc. (Douglas Crockford) and Microsoft Corporation (Pratap Lakshman & Allen Wirfs-Brock)

    Preface
    We believe that the specification currently under development by TC39-TG1 as ECMAScript 4 is such a radical departure from the current standard that it is essentially a new language. It is as different from ECMAScript 3rd Edition as C++ is from C. Such a drastic change is not appropriate for a revision of a widely used standardized language and cannot be justified in light of the current broad adoption of ECMAScript 3rd Edition for AJAX style web applications. We do not believe that consensus can be reached within TC39-TG1 based upon its current language design work. However, we do believe that an alternative way forward can be found and submit this proposal as a possible path to resolution.

    Proposal
    We propose that the work of TC39-TG1 be reconstituted as two (or possibly three) new TC39 work items as follows:

    Work item 1 – Ongoing maintenance of ECMAScript, 3rd Edition.
    In light of the broad adoption of ECMAScript, 3rd Edition for web browser based applications it is clear that this language will remain an important part of the world-wide-web infrastructure for the foreseeable future. However, since the publication of the ECMAScript, 3rd Edition specification in 1999 there has been feature drift between implementations and cross-implementation compatibility issues arising from deficiencies and ambiguities in the specification. The purpose of this work item is to create a maintenance revision of the specification (a 4th Edition) that focuses on these goals: Improve implementation conformance by rewriting the specification to improve its rigor and clarity, and by correcting known points of ambiguity or underspecification.
  • Smalltalk Idioms
    Smalltalk Idioms
    Farewell and a wood pile
    Kent Beck

    IT’S THE OBJECTS, STUPID
    S me awhile to see the obvious. Sometimes even longer than that. Three or four times in the last month I’ve been confronted by problems I had a hard time solving. In each case, the answer became clear when I asked myself the simple question, “How can I make an object to solve this problem for me?” You think I’d have figured it out by now: got a problem? make an object for it.

    Here’s an example: I had to write an editor for a tree structure. There were several ways of viewing and editing the tree. On the left was a hierarchical list. On the top right was a text editor on the currently selected node of the tree.

    If we parsed the string “@years”, the resulting picture would look like Figure 6. When the BinaryFunction unwraps its children, the right function will be in place.

    As I said, several times in the last month I’ve faced baffling problems that became easy when I asked myself the question, “How could I make an object to solve this problem for me?” Sometimes it was a method that just didn’t want to be simplified, so I created an object just for that method. Sometimes it was a question of adding features to an object for a particular purpose without cluttering the object (as in the editing example). I recommend that the next time you run into a problem that just doesn’t
  • International Standard ISO/IEC 9075-2
    INTERNATIONAL STANDARD ISO/IEC 9075-2
    Fifth edition, 2016-12-15
    Information technology — Database languages — SQL — Part 2: Foundation (SQL/Foundation)
    Technologies de l’information — Langages de base de données — SQL — Partie 2: Fondations (SQL/Fondations)
    Reference number ISO/IEC 9075-2:2016(E)
    © ISO/IEC 2016

    COPYRIGHT PROTECTED DOCUMENT
    © ISO/IEC 2016, Published in Switzerland
    All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
    ISO copyright office, Ch. de Blandonnet 8 • CP 401, CH-1214 Vernier, Geneva, Switzerland
    Tel. +41 22 749 01 11, Fax +41 22 749 09 47, www.iso.org, [email protected]

    Contents
    Foreword
    Introduction
    1 Scope
    2 Normative references
  • Proof/Épreuve
    INTERNATIONAL STANDARD ISO 21757-1
    First edition
    Document management — ECMAScript for PDF — Part 1: Use of ISO 32000-2 (PDF 2.0)
    PROOF/ÉPREUVE
    Reference number ISO 21757-1:2020(E)
    © ISO 2020

    COPYRIGHT PROTECTED DOCUMENT
    © ISO 2020
    All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission.
  • Evolution of the Major Programming Languages
    COS 301 Programming Languages
    Evolution of the Major Programming Languages
    UMaine School of Computing and Information Science COS 301 - 2018

    Topics
    • Zuse’s Plankalkül
    • Minimal Hardware Programming: Pseudocodes
    • The IBM 704 and Fortran
    • Functional Programming: LISP
    • ALGOL 60
    • COBOL
    • BASIC
    • PL/I
    • APL and SNOBOL
    • SIMULA 67
    • Orthogonal Design: ALGOL 68

    Topics (continued)
    • Some Early Descendants of the ALGOLs
    • Prolog
    • Ada
    • Object-Oriented Programming: Smalltalk
    • Combining Imperative and Object-Oriented Features: C++
    • Imperative-Based Object-Oriented Language: Java
    • Scripting Languages
    • A C-Based Language for the New Millennium: C#
    • Markup/Programming Hybrid Languages

    Genealogy of Common Languages

    Alternate View

    Zuse’s Plankalkül
    • Designed in 1945
    • For computers based on electromechanical relays
    • Not published until 1972, implemented in 2000 [Rojas et al.]
    • Advanced data structures:
      – Two’s complement integers, floating point with hidden bit, arrays, records
      – Basic data type: arrays, tuples of arrays
    • Included algorithms for playing chess
    • Odd: 2D language
    • Functions, but no recursion
    • Loops (“while”) and guarded conditionals [Dijkstra, 1975]

    Plankalkül Syntax
    • 3 lines for a statement:
      – Operation
      – Subscripts
      – Types
    • An assignment
  • 1. with Examples of Different Programming Languages Show How Programming Languages Are Organized Along the Given Rubrics: I
    AGBOOLA ABIOLA, CSC302, 17/SCI01/007
    COMPUTER SCIENCE ASSIGNMENT

    1. With examples of different programming languages, show how programming languages are organized along the given rubrics:
       i. Unstructured, structured, modular, object oriented, aspect oriented, activity oriented and event oriented programming requirement.
       ii. Based on domain requirements.
       iii. Based on requirements i and ii above.
    2. Give a brief preview of the evolution of programming languages in chronological order.
    3. Vividly distinguish between the modular programming paradigm and the object oriented programming paradigm.

    Answer 1i).

    UNSTRUCTURED LANGUAGE   DEVELOPER                                     DATE
    Assembly Language                                                     1949
    FORTRAN                 John Backus                                   1957
    COBOL                   CODASYL, ANSI, ISO                            1959
    JOSS                    Cliff Shaw, RAND                              1963
    BASIC                   John G. Kemeny, Thomas E. Kurtz               1964
    TELCOMP                 BBN                                           1965
    MUMPS                   Neil Pappalardo                               1966
    FOCAL                   Richard Merrill, DEC                          1968

    STRUCTURED LANGUAGE     DEVELOPER                                     DATE
    ALGOL 58                Friedrich L. Bauer and co.                    1958
    ALGOL 60                Backus, Bauer and co.                         1960
    ABC                     CWI                                           1980
    Ada                     United States Department of Defence           1980
    Accent R                NIS                                           1980
    Action!                 Optimized Systems Software                    1983
    Alef                    Phil Winterbottom                             1992
    DASL                    Sun Microsystems Laboratories                 1999-2003

    MODULAR LANGUAGE        DEVELOPER                                     DATE
    ALGOL W                 Niklaus Wirth, Tony Hoare                     1966
    APL                     Larry Breed, Dick Lathwell and co.            1966
    ALGOL 68                A. Van Wijngaarden and co.                    1968
    AMOS BASIC              François Lionet and Constantin Stiropoulos    1990
    Alice ML                Saarland University                           2000
    Agda                    Ulf Norell; Catarina Coquand (1.0)            2007
    Arc                     Paul Graham, Robert Morris and co.            2008
    Bosque                  Mark Marron                                   2019

    OBJECT-ORIENTED LANGUAGE  DEVELOPER                                   DATE
    C*                      Thinking Machine                              1987
    Actor                   Charles Duff                                  1988
    Aldor                   Thomas J. Watson Research Center              1990
    Amiga E                 Wouter van Oortmerssen                        1993
    ActionScript            Macromedia                                    1998
    BeanShell               JCP                                           1999
    AngelScript             Andreas Jönsson                               2003
    Boo                     Rodrigo B.