POSIX Programmer's Manual <Regex.H>(P) Regex.H>

<regex.h>(P) POSIX Programmer’sManual <regex.h>(P) _regex.h> NAME regex.h − regular expression matching types SYNOPSIS #include <regex.h> DESCRIPTION The <reg ex.h> header shall define the structures and symbolic constants used by the regcomp(), regexec(), regerror(), and regfree() functions. The structure type regex_t shall contain at least the following member: size_t re_nsub Number of parenthesized subexpressions. The type size_t shall be defined as described in <sys/types.h> . The type regoff_t shall be defined as a signed integer type that can hold the largest value that can be stored in either a type off_t or type ssize_t.The structure type regmatch_t shall contain at least the following members: regoff_t rm_so Byte offset from start of string to start of substring. regoff_t rm_eo Byte offset from start of string of the first character after the end of substring. Values for the cflags parameter to the regcomp() function are as follows: REG_EXTENDED Use Extended Regular Expressions. REG_ICASE Ignore case in match. REG_NOSUB Report only success or fail in regexec(). REG_NEWLINE Change the handling of <newline>. Values for the eflags parameter to the regexec() function are as follows: REG_NOTBOL The circumflexcharacter ( ’ˆ’ ), when taken as a special character,does not match the beginning of string. REG_NOTEOL The dollar sign ( ’$’ ), when taken as a special character,does not match the end of string. The following constants shall be defined as error return values: REG_NOMATCH regexec() failed to match. REG_BADPAT Invalid regular expression. REG_ECOLLATE Invalid collating element referenced. REG_ECTYPE Invalid character class type referenced. REG_EESCAPE Trailing ’\’ in pattern. IEEE/The Open Group 2003 1 <regex.h>(P) POSIX Programmer’sManual <regex.h>(P) REG_ESUBREG Number in \digit invalid or in error. REG_EBRACK "[]" imbalance. REG_EPAREN "" or "()" imbalance. REG_EBRACE "\{\}" imbalance. REG_BADBR Content of "\{\}" invalid: not a number,number too large, more than twonumbers, first larger than second. REG_ERANGE Invalid endpoint in range expression. REG_ESPACE Out of memory. REG_BADRPT ’?’ , ’*’ ,or ’+’ not preceded by valid regular expression. REG_ENOSYS Reserved. The following shall be declared as functions and may also be defined as macros. Function prototypes shall be provided. int regcomp(regex_t *restrict, const char *restrict, int); size_t regerror(int, const regex_t *restrict, char *restrict, size_t); int regexec(const regex_t *restrict, const char *restrict, size_t, regmatch_t[restrict], int); void regfree(regex_t *); The implementation may define additional macros or constants using names beginning with REG_. The following sections areinformative. APPLICATION USAGE None. RATIONALE None. FUTURE DIRECTIONS None. SEE ALSO <sys/types.h> ,the System Interfaces volume of IEEE Std 1003.1-2001, regcomp(), the Shell and Utili- ties volume of IEEE Std 1003.1-2001 COPYRIGHT Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1, 2003 Edi- tion, Standard for Information Technology -- Portable Operating System Interface (POSIX), The Open Group Base Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of Electrical and Electron- ics Engineers, Inc and The Open Group. In the event of anydiscrepancybetween this version and the original IEEE and The Open Group Standard, the original IEEE and The Open Group Standard is the referee document. The original Standard can be obtained online at http://www.open- group.org/unix/online.html . IEEE/The Open Group 2003 2 agrep(1) agrep(1) agrep NAME agrep − print lines approximately matching a pattern SYNOPSIS agrep [OPTION]... PA TTERN [FILE]... DESCRIPTION Searches for approximate matches of PATTERN in each FILE or standard input. Example: ‘agrep −2 optimize foo.txt’ outputs all lines in file ‘foo.txt’ that match "optimize" within twoerrors. E.g. lines which contain "optimise", "optmise", and "opitmize" all match. OPTIONS Regexp selection and interpretation: −e PA TTERN, −−regexp=PA TTERN Use PA TTERN as a regular expression; useful to protect patterns beginning with −. −i, −−ignore−case Ignore case distinctions (as defined by the current locale) in PA TTERN and input files. −k, −−literal Treat PA TTERN as a literal string, that is, a fixed string with no special characters. −w, −−word−regexp Force PA TTERN to match only whole words. A "whole word" is a substring which either starts at the beginning or the record or is preceded by a non-word constituent character.Simi- larly,the substring must either end at the end of the record or be followed by a non-word constituent character.Word-constituent characters are alphanumerics (as defined by the current locale) and the underscore character.Note that the non-word constituent characters must sur- round the match; theycannot be counted as errors. Approximate matching settings: −D NUM, −−delete−cost=NUM Set cost of missing characters to NUM. −I NUM, −−insert−cost=NUM Set cost of extra characters to NUM. −S NUM, −−substitute−cost=NUM Set cost of incorrect characters to NUM.Note that a deletion (a missing character) and an insertion (an extra character) together constitute a substituted character,but the cost will be the that of a deletion and an insertion added together.Thus, if the const of a substitution is set to be larger than the sum of the costs of deletion and insertion, direct substitutions will neverbe done. −E NUM, −−max−errors=NUM Select records that have atmost NUM errors. -# Select records that have atmost # errors (# is a digit between 0 and 9). Miscellaneous: −d PA TTERN, −−delimiter=PA TTERN Set the record delimiter regular expression to PA TTERN.The text between twodelimiters, before the first delimiter,and after the last delimiter is considered to be a record. The default record delimiter is the regexp "\n", so by default a record is a line. PA TTERN can be anyregu- lar expression that does not match the empty string. Forexample, using −d "ˆFrom " defines mail messages as records in a Mailbox format file. −v, −−invert−match Select non-matching records instead of matching records. −V, −−version Print version information and exit. −y, −−nothing Does nothing. This options exists only for compatibility with the non-free agrep program. TRE agrep 0.7.5 November 21, 2004 1 agrep(1) agrep(1) −−help Display a brief help message and exit. Output control: −B, −−best−match Only output the best matching records, that is, the records with the lowest cost. This is cur- rently implemented by making twopasses overthe input files and cannot be used when read- ing from standard input. −−color, −−colour Highlight the matching strings in the output with a color marker.The color string is taken from the GREP_COLOR environment variable. The default color is red. −c, −−count Only print a count of matching records per each input file, suppressing normal output. −h, −−no−filename Suppress the prefixing filename on output when multiple files are searched. −H, −−with−filename Prefix each output record with the name of the input file where the record was read from. −l, −−files−with−matches Only print the name of each input file which contains at least one match, suppressing normal output. The scanning for each file will stop on the first match. −n, −−record−number Prefix each output record with its sequence number in the input file. The number of the first record is 1. −q, −−quiet, −−silent Do not write anything to standard output. Exit immediately with zero exit status if a match is found. −s, −−show−cost Print match cost with output. −−show−position Prefix each output record with the start and end offset of the first match within the record. The offset of the first character of the record is 0. The end position is givenasthe offset of the first character after the match. −M, −−delimiter−after By default, the record delimiter is the newline character and is output after the matching record. If −d is used, the record delimiter will be output before the matching record. This option causes the delimiter to be output after the matching record. With no FILE,orwhen FILE is -, reads standard input. If less than two FILEsare given −h is assumed, otherwise −H is the default. DIAGNOSTICS Exit status is 0 if a match is found, 1 for no match, and 2 if there were errors. If −E or -# is not speci- fied, only exact matches are selected. PA TTERN is a POSIX extended regular expression (ERE) with the TRE extensions. REPORTING BUGS Report bugs to the TRE mailing list <[email protected]>. COPYRIGHT Copyright © 2002-2004 Ville Laurikari. This is free software, and comes with ABSOLUTELYNOWARRANTY.You are welcome to redis- tribute this software under certain conditions; see the source for the full license text. TRE agrep 0.7.5 November 21, 2004 2 REGEX(3) Linux Programmer’sManual REGEX(3) REGEX NAME regcomp, regexec, regerror,regfree − POSIX regexfunctions SYNOPSIS #include <sys/types.h> #include <regex.h> int regcomp(regex_t *preg,const char *regex,int cflags); int regexec(const regex_t *preg,const char *string,size_t nmatch,regmatch_t pmatch[],int eflags); size_t regerror(int errcode,const regex_t *preg,char *errbuf ,size_t errbuf_size); void regfree(regex_t *preg); DESCRIPTION POSIX Regex Compiling regcomp() is used to compile a regular expression into a form that is suitable for subsequent regexec() searches. regcomp() is supplied with preg,apointer to a pattern buffer storage area; regex,apointer to the null- terminated string and cflags,flags used to determine the type of compilation. All regular expression searching must be done via a compiled pattern buffer,thus regexec() must always be supplied with the address of a regcomp() initialized pattern buffer. cflags may be the bitwise-or of one or more of the following: REG_EXTENDED Use POSIX Extended Regular Expression syntax when interpreting regex.Ifnot set, POSIX Basic Regular Expression syntax is used. REG_ICASE Do not differentiate case. Subsequent regexec() searches using this pattern buffer will be case insensitive.

POSIX Programmer's Manual <Regex.H>(P) Regex.H>

Hardware Pattern Matching for Network Traffic Analysis in Gigabit

Comprehensive Examinations in Computer Science 1872 - 1878

Trusted Research Environments (TRE) a Strategy to Build Public Trust and Meet Changing Health Data Science Needs

A Personalized Cloud-Based Traffic Redundancy Elimination for Smartphones Vivekgautham Soundararaj Clemson University, [email protected]

Branchclust Tutorial

Approximate Pattern Matching with Index Structures

How to Think Like a Computer Scientist

A Best-First Anagram Hashing Filter for Approximate String Matching

The Answer Question in Question Answering Systems Engenharia

Comparing Namedcapture with Other R Packages for Regular Expressions by Toby Dylan Hocking

The Power of Prediction: Cloud Bandwidth and Cost Reduction

Necessary to Conduct Most Analyses Commercial Software Typically Have (Very Bad) Installers Open Source Software Typically Have