Characters and Strings
57:017, Computers in Introduction Engineering—Quick Review Fundamentals of Strings and Characters of C Character Strings Character Handling Library String Conversion Functions Standard Input/Output Library Functions String Manipulation Functions of the String Handling Library Comparison Functions of the String Handling Library Search Functions of the String Handling Library Other Functions of the String Handling Library
Fundamentals of Strings Introduction and Characters z Introduce some standard library functions z Characters z Easy string and character processing z Character constant z Programs can process characters, strings, lines of z represented as a character in single quotes text, and blocks of memory z 'z' represents the integer value of z z stored in ASCII format ((yone byte/character ) z These techniques used to make: z Strings z Word processors z Series of characters treated as a single unit z Can include letters, digits and special characters (*, /, $) z Page layout software z String literal (string constant) - written in double quotes z Typesetting programs z "Hello" z etc. z Strings are arrays of characters z The type of a string is char * i.e. “pointer to char” z Value of string is the address of first character
The ASCII Character Set Strings and Characters z String declarations z Declare as a character array or a variable of type char * char color[] = "blue"; char *colorPtr = "blue"; z Remember that strings represented as character arrays end with '\0' z color has 5 eltlements z Inputting strings z Use scanf char word[10]; scanf("%s", word); z Copies input into word[] z Do not need & (because word is a pointer) z Remember to leave room in the array for '\0'
1 Chars as Ints Inputting Strings using scanf char word[10]; z Since characters are stored in one-byte ASCII representation, char *wptr; they can be considered as one-byte integers. … z C allows chars to be treated as ints and vice versa—e.g.: // This is OK as long as input string is less int main() { // than 10 characters in length int ivar; scanf((,“%s”, word); char cvar; ivar = ‘b’; // But this is not, since wptr does not have associated cvar = 97; // storage locations to hold the string printf (“ivar=%d cvar=‘%c’\n”, ivar, cvar); scanf(“%s”, wptr); }
//This would be OK: wptr = word; ivar=98 cvar=‘a’ scanf(“%s”, wptr);
Character Handling Library Character Handling Library
Prototype Description
z Character handling library int isdigit( int c ) Returns true if c is a digit and false otherwise.
z Includes functions to perform useful tests and int isalpha( int c ) Returns true if c is a letter and false otherwise. manipulations of character data int isalnum( int c ) Returns true if c is a digit or a letter and false otherwise. int isxdigit( int c ) Returns true if c is a hexadecimal digit character and false otherwise. z Each function receives a character (an int) or EOF as an int islower( int c ) Returns true if c is a lowercase letter and false otherwise. argument int isupper( int c ) Returns true if c is an uppercase letter; false otherwise. int tolower( int c ) If c is an uppercase letter, tolower returns c as a lowercase letter. Otherwise, tolower z The following slide contains a table of all the returns the argument unchanged. int toupper( int c ) If c is a lowercase letter, toupper returns c as an uppercase letter. Otherwise, toupper functions in
1 /* Fig. 8.2: fig08_02.c 2 Using functions isdigit, isalpha, isalnum, and isxdigit */ 33 isxdigit( '7' ) ? "7 is a " : "7 is not a ", 3 #include
2 1 /* Fig. 8.6: fig08_06.c String Conversion Functions 2 Using atof */ 3 #include
10 d = atof( "99.0" ); Prototype Description 11 printf( "%s%.3f\n%s%.3f\n", double atof( const char *nPtr ) Converts the string nPtr to double. 12 "The string \"99.0\" converted to double is ", d, 13 "The converted value divided by 2 is ", int atoi( const char *nPtr ) Converts the string to . nPtr int 14 d / 2.0 ); long atol( const char *nPtr ) Converts the string nPtr to long int. 15 return 0; double strtod( const char *nPtr, Converts the string nPtr to double. 16 } char **endPtr ) long strtol( const char *nPtr, Converts the string nPtr to long. The string "99.0" converted to double is 99.000 char **endPtr, int base ) The converted value divided by 2 is 49.500 unsigned long strtoul( const char Converts the string nPtr to unsigned *nPtr, char **endPtr, int base ) long.
1 /* Fig. 8.13: fig08_13.c 2 Using gets and putchar */ Standard Input/Output 3 #include
A Non-recursive Version of reverse() String Manipulation Functions of the String Handling Library
In the previous example, the function reverse() was written z String handling library has functions to recursively. Here is a non-recursive version of the function: z Manipulate string data z Search strings void reverse( const char * const sPtr ) { z Tokenize strings int i=0; z Determine string length
/* fin d en d o f the s tr ing. Cou ld a lso use s tr len () */ while (sPtr[i] != '\0') i++; Function prototype Function description char *strcpy( char *s1, Copies string s2 into array s1. The value of s1 is /* print the string from end to beginnning */ const char *s2 ) returned. for (; i>=0; i--) char *strncpy( char *s1, Copies at most n characters of string s2 into array s1. putchar(sPtr[i]); const char *s2, size_t n ) The value of s1 is returned. putchar('\n'); char *strcat( char *s1, Appends string s2 to array s1. The first character of const char *s2 ) s2 overwrites the terminating null character of s1. return; The value of s1 is returned. } // end reverse function char *strncat( char *s1, Appends at most n characters of string s2 to array s1. const char *s2, size_t n ) The first character of s2 overwrites the terminating null character of s1. The value of s1 is returned.
3 Comparing Character Strings Comparing Character Strings Consider the following example: z To determine if two character strings are char array1[ ] = “abcdef”; equal, they must be compared character-by char array2[ ] = “abcdef”; character if (array1 == array2) z C has libraryyp functions to perform character printf(“the arrays compare as equal\n”); string comparisons else printf(“the arrays do not compare as equal\n); The arrays do not compare as equal
Why?? Because the values of the pointers are being compared, not the contents of the strings.
Comparison Functions of the Search Functions of the String Handling Library String Handling Library
z Comparing strings Function prototype Function description z Computer compares numeric ASCII codes of characters in string char *strchr( const char *s, Locates the first occurrence of character c in string s. If c is found, a pointer to c in int c ); z Appendix D has a list of character codes s is returned. Otherwise, a NULL pointer is returned. size_t strcspn( const char Determines and returns the length of the initial segment of string s1 consisting of *s1, const char *s2 ); characters not contained in string s2. int strcmp( const char *s1, const char *s2 ); size_ t strspn( const char Determines and returns the length of the initial segment of string s1 consisting only *s1, const char *s2 ); of characters contained in string s2. z Compares string to s1 s2 char *strpbrk( const char Locates the first occurrence in string s1 of any character in string s2. If a character
z Returns a negative number if s1 < s2, zero if s1 == s2 or a *s1, const char *s2 ); from string s2 is found, a pointer to the character in string s1 is returned. Other- wise, a NULL pointer is returned. positive number if s1 > s2 char *strrchr( const char *s, Locates the last occurrence of c in string s. If c is found, a pointer to c in string s is int c ); returned. Otherwise, a NULL pointer is returned. char *strstr( const char *s1, Locates the first occurrence in string s1 of string s2. If the string is found, a pointer int strncmp( const char *s1, const char *s2, size_t n ); const char *s2 ); to the string in s1 is returned. Otherwise, a NULL pointer is returned. z Compares up to n characters of string s1 to s2 char *strtok( char *s1, const A sequence of calls to strtok breaks string s1 into “tokens”—logical pieces such char *s2 ); as words in a line of text—separated by characters contained in string s2. The first z Returns values as above call contains s1 as the first argument, and subsequent calls to continue tokenizing the same string contain NULL as the first argument. A pointer to the current token is returned by each call. If there are no more tokens when the function is called, NULL is returned.
1 /* Fig. 8.29: fig08_29.c 2 Using strtok */ 1 /* Fig. 8.27: fig08_27.c 3 #include
4 Other Functions of the 1 /* Fig. 8.37: fig08_37.c
String Handling Library 2 Using strerror */
3 #include
z Creates a system-dependent error message based on 5
errornum 6 int main()
z Returns a pointer to the string 7 { z size_t strlen( const char *s ); 8 printf( "%s\n", strerror( 2 ) ); z Returns the number of characters (before NULL) in string s 9 return 0; 10 }
No such file or directory
An Example Problem Statement z Take as input a string of characters (command line z Character and string processing example argument), and search for words as substrings z Use standard char/string library functions (fgets(), z Example: strlen(), strcpy, etc) ./a.out thisisnotatest hi z Top-down design, with divide-and-conquer is is AND step-wise refinement no at z Use a new “Unix” specific utility (“pipe”) his not ate this test
Understanding the Problem and Proposing an Refinement Algorithm/Solution z Take string input (argv[]) z Generate all substrings Input string example: test z Compare all generated substrings to a All possible (contiguous) sub-strings: “dic tionary ” to de term ine if su bs tr ing is a “real” word t, e, s, t te, es, st tes, est test
5 Refinement Refinement
For each “len”gth 1 to n (n == length of input string) z For len = 1 to strlen(input) For each character i in input string: for i = 0 to strlen(input) get each substring at position i, length “len” get_chars(input,temp,i,len)
Refinement -- Algorithm start dstart
source dest get_chars(source, destination, position, length=1 t e s t t length); start dstart z Function to pass in the source string, the source dest destination string, the position in the source length=2 t e s t e string, and the length of characters to acquire start dstart source dest
length=2 t e s t e s
#include 6 How do we compare these Output substrings to a dictionary? ./wordFind.exe test %cat file temp = t z Directs the contents of “file” to the “standard input” -- temp = e the contents are printed to the screen temp = s temp = t z This is a mechanism that we can use to send the temp = te content s of a l arge di c tionary fil e to our program temp = es z Unix pipe -- “|” temp = st %cat file | ./testProgram.exe temp = tes z Instead of sending the contents of the file to the temp = est screen, we can “pipe” them to our program temp = test Simple program to show how a Refinement “pipe” may work z Take string input (argv[]) #include printf("word = %s\n",word); } Expand the program to handle Output a large number of words: #define NUM_WORDS 45427 % more file #define MAX 30 // max word length test #include 7 Even simpler method for Output input/output %cat linux.words | ./readWords.exe word = Aarhus z cat is most useful when directly taking the output 0 from one program and sending it as the input to word = Aaron 1 another word = Ababa z If file already exists,use the Input Redirection Shell 2 word = aback CdCommand, < 3 z % wordFind.exe argvstuff < dictionary.txt word = abaft 4 z Can do the same for output (Output Redirection word = abandon Shell Command 5 word = abandoned z % wordFind.exe argvstuff for(length=1;length<=strlen(argv[1]);length++){ /* read in a string from the command line for(i=0;i<=strlen(argv[1])-length;i++) { and try all permutations against a dictionary */ get_chars(argv[1],temp,i,length); //printf("temp = %s\n",temp); #define NUM_WORDS 45427 #define MAX 30 // max word length for(j=0;j Output Future Modifications %cat dictionary.txt | ./wordFind.exe thisisnotatest z Modify to try all possible combinations of hi substrings: is is z the = t, h, e, th, te, he, ht, et, eh, the, teh, hte, het, no eht at z Read from file (instead of pipe/stdio) his z not Keep dictionary in memory and do multiple ate words this test 8