
Characters, Character Strings, and string-manipulation functions in C

see Kernighan & Ritchie – Section 1.9, Appendix B3

Characters

 Printable characters (and some non-printable ones) are represented as 8-bit numeric values
   Stored in variables of type char

 7-bit ASCII character code is used in C
   compatible with the Latin-1 and UTF-8 encodings (codes 0 to 127 are identical in all three)

Character   ASCII decimal   ASCII hexadecimal   ASCII binary
A           65              0x41                0100 0001
a           97              0x61                0110 0001
B           66              0x42                0100 0010
b           98              0x62                0110 0010
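Because characters are just small integers, ordinary arithmetic and bit operations work on them. The sketch below assumes an ASCII-compatible character set, where an uppercase letter and its lowercase partner differ only in the 0x20 bit (compare the binary column above). The names flip_case_ascii and show_code are hypothetical helpers, not library functions.

```c
#include <stdio.h>

/* Hypothetical helper, assuming ASCII: toggle the case of a letter by
   flipping the 0x20 bit; any other character is returned unchanged. */
char flip_case_ascii(char c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
        return (char)(c ^ 0x20);
    return c;
}

/* Hypothetical helper: print a character next to its decimal and
   hexadecimal codes. */
void show_code(char c)
{
    printf("%c %d 0x%02X\n", c, c, (unsigned)(unsigned char)c);
}
```

On an ASCII system, show_code('A') prints "A 65 0x41", and flip_case_ascii('A') returns 'a'.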

Character Strings (Text)

 C does not have a “string” type!

 C has arrays of variables of its data types, including characters

 Arrays must have a constant, fixed length

 Text is kept in character arrays

 The arrays must be as big as or bigger than the text strings
   In practice, they are almost always bigger

Strings of Characters in Character Arrays

 C character strings are held in memory as ASCII values, with an ASCII 0, or '\0', at the end.
   “Null-terminated strings”
   Also called ASCIIZ

 Character arrays must be big enough to include the null character as well as the printable characters

 Any extra elements in the array may be filled with more nulls, with garbage, or with remnants of previous strings

Putting a String in a Character Array
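To make the sizing rule concrete, here is a minimal sketch that compares an array's capacity with the string stored in it. The helper name spare_elements is hypothetical; it assumes the caller passes the array's true size and that the string is null-terminated within the array.

```c
#include <string.h>

/* Hypothetical helper: how many array elements are left over once the
   string's characters and its terminating null have been stored.
   Assumes s is null-terminated within an array of array_size elements. */
size_t spare_elements(const char *s, size_t array_size)
{
    return array_size - (strlen(s) + 1);
}
```

For an array declared as char bfrA[10] = "abcdefg"; the call spare_elements(bfrA, sizeof bfrA) gives 2: seven letters plus the null use eight of the ten elements. For char bfrB[] = "hijklm"; the compiler makes the array seven elements long, so nothing is left over.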

 Initialize an array when you declare it:
    char bfrA[10] = "abcdefg";  // last two chars unused
    char bfrB[] = "hijklm";     // array is just big enough

 Assign characters one-by-one:
    char bfrA[10];
    bfrA[0] = 'a';
    bfrA[1] = 'b';
    bfrA[2] = 'c';
    ⁝
    bfrA[7] = '\0';
   We'll see a better way shortly

String Input
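As a minimal sketch of bounded input, the helper below wraps fgets() and removes the newline that fgets() stores when a whole line fits in the array. The name read_line is hypothetical, and passing the stream as a parameter is an assumption made so the same code works for stdin or a file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from 'in' into bfr (an array of
   'size' elements) and strip the trailing newline that fgets() keeps.
   Returns 1 on success, 0 at end of input or on error. */
int read_line(char *bfr, int size, FILE *in)
{
    if (fgets(bfr, size, in) == NULL)
        return 0;
    bfr[strcspn(bfr, "\n")] = '\0';   /* cut at the newline, if any */
    return 1;
}
```

A typical call would be read_line(name, sizeof name, stdin) with char name[21]; at most 20 characters are stored, and the rest of a longer line stays in the input for the next call.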

 scanf() - use “%s” for the format specifier, and supply a character array
   Amount of input can be limited with the size modifier: “%20s” will read at most 20 characters, so the array needs at least 21 elements to leave room for the terminating null

 fgets() expects a character array
   fgets() also expects the array size, to limit the amount of input text

 gets() also expects a character array, but it cannot limit the amount of input, so don't use it – always use fgets() instead

 The result is an array of characters that are valid up to the terminating null

String Output

 printf(), puts(), fputs() - all expect a null-terminated character string

 They will keep printing characters until they see a null
   So if you give them a character array that doesn’t contain a null, they’ll keep going - off the end of the array, and until they happen to run into a 0 byte somewhere in memory (or run out of legal memory)

Example - Caesar Cipher

 Also known as "rot-N" for "rotate N characters"
   rot-13 is a common one

 Simply done with character arithmetic

 Use Boolean variables

 Repeated inputs

 Rotation count as command-line input

Solution

Sometimes side-effects are good, even necessary...
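One possible sketch of the rotation step follows; it is not necessarily the solution intended here. It uses character arithmetic and Boolean flags, and it changes the caller's array in place, which is the useful side effect. The function name rot_n is hypothetical, and the code assumes ASCII, where the letters of each case are contiguous.

```c
#include <stdbool.h>

/* Rotate every letter in text by n places, in place (a deliberate side
   effect on the caller's array). Non-letters are left alone.
   Assumes ASCII, where 'A'..'Z' and 'a'..'z' are contiguous. */
void rot_n(char *text, int n)
{
    n = ((n % 26) + 26) % 26;           /* fold any count into 0..25 */
    for (int i = 0; text[i] != '\0'; i++) {
        char c = text[i];
        bool is_upper = (c >= 'A' && c <= 'Z');
        bool is_lower = (c >= 'a' && c <= 'z');
        if (is_upper)
            text[i] = (char)('A' + (c - 'A' + n) % 26);
        else if (is_lower)
            text[i] = (char)('a' + (c - 'a' + n) % 26);
    }
}
```

A full program would wrap this in a main() that takes the rotation count from argv[1] (for example with atoi()) and calls rot_n() on each line read with fgets() until input ends. Applying rot-13 twice restores the original text.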

Character-String Functions in C

Functions That Work With Character Strings

Find these in <string.h>

 strlen(char *src)
   reports the number of "meaningful" characters
   stops counting at the first null character

 strcmp(char *dest, char *src)
   compares two string arrays
   returns 0 if they match each other
   returns a negative value (not necessarily -1) if dest comes before (is "less than") src
   returns a positive value (not necessarily +1) if dest comes after src

 note: "dest == src" would only test whether both names refer to the same array in memory, not whether the two texts match

Finding a Character In a String

 char *strchr(char *s, int c)
   Return the location of the first occurrence of character c in the string s
    . This is a pointer, not an index/offset
    . NULL is returned if c does not occur in s

 char *strrchr(char *s, int c)
   Return the location of the last occurrence of character c in the string s

 char *strstr(char *haystack, char *needle)
   Return the location of the first occurrence of the substring needle within the (larger) string haystack

Example: the strchr() function
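Since the search functions return pointers, a little pointer arithmetic is needed when an index is wanted. The helper below is a minimal sketch; the name offset_in is hypothetical, and the sample string in the note after it is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a pointer returned by strchr(), strrchr()
   or strstr() into an index within s, or -1 if the search failed (NULL).
   Assumes 'found' is NULL or points into the string s. */
long offset_in(const char *s, const char *found)
{
    return (found == NULL) ? -1L : (long)(found - s);
}
```

With s set to "I am what I am!", strchr(s, 'a') points at index 2, strstr(s, "what") at index 5, and strrchr(s, 'a') at index 12, where the remaining text is "am!". Searching for 'z' returns NULL, which offset_in() reports as -1.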

strrchr() – finds "am!"

Making New Strings

 strncpy(char *dest, char *src, size_t n)
    . size_t is an unsigned integer type, typically unsigned or unsigned long
   copies a text string from src array into dest array
   copies at most n characters
    . if the src string has n or more characters, the copied result will not be null terminated!
   set n <= the dest array's length to prevent buffer overflow

 strncat(char *dest, char *src, size_t n)
   appends at most n characters from src array to end of dest array
    . unlike strncpy(), the result is always null terminated, so dest needs room for its current text, up to n more characters, and the null

Putting a String in a Character Array

 Assign characters to the array:
    char bfrA[10];

    /* Ugh:
    bfrA[0] = 'a';
    bfrA[1] = 'b';
    bfrA[2] = 'c';
    ⁝
    bfrA[7] = '\0';
    */

    strncpy(bfrA, "abcdefg", 10);   // Better than character-by-character

Avoiding Buffer Overflow
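Because strncpy() leaves the destination unterminated when the source is too long, a common defensive pattern is to copy one character fewer than the array holds and then write the null explicitly. The sketch below shows that pattern; the name copy_text is hypothetical, and it assumes dest_size is the true size of dest and is greater than zero.

```c
#include <string.h>

/* Hypothetical helper: copy src into dest (an array of dest_size
   elements), truncating if necessary, and always leave dest
   null-terminated. Assumes dest_size > 0. */
char *copy_text(char *dest, const char *src, size_t dest_size)
{
    strncpy(dest, src, dest_size - 1);  /* leave room for the null */
    dest[dest_size - 1] = '\0';         /* terminate even when truncated */
    return dest;
}
```

With char small[4], the call copy_text(small, "abcdefg", sizeof small) stores "abc" and a null instead of running past the end of the array. The trade-off is silent truncation, so a caller that cares should compare strlen(src) with dest_size first.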

 Recognize, but don't use these...
   Older, unbounded versions of strncpy(), strncat():

 strcpy(char *dest, char *src)
   copies a text string from one array into another
   appends null character to the end
   buffer overflow occurs if src, including its null, does not fit in dest!

 strcat(char *dest, char *src)
   appends a text string to the end of another
   also appends null character
   can also overflow the dest

Implementing String Functions

 Functions that operate on strings actually work on character arrays

 A function must step through each character of the arrays, checking for the null character

 Typical function uses a loop to step through the array

An Implementation of “strlen()”

 Prototype:
    unsigned strlen(char *src);

 Equivalent to:
    unsigned strlen(char src[]);

 A simple counting loop works:
    unsigned strlen(char *src)
    {
        unsigned i;
        for (i = 0; src[i] != '\0'; i++)
            ;   /* empty body: the loop header does all the work */
        return i;
    }

An Implementation of “strncpy()”

 Prototype:
    char *strncpy(char *dest, char *src, unsigned maxlen);

 Equivalent to:
    char *strncpy(char dest[], char src[], unsigned maxlen);

 Use a loop to copy (see "man strncpy"):
    char *strncpy(char *dest, char *src, unsigned maxlen)
    {
        unsigned i;
        for (i = 0; i < maxlen && src[i] != '\0'; i++)
            dest[i] = src[i];
        for ( ; i < maxlen; i++)    // pad with nulls, if possible
            dest[i] = '\0';
        return dest;
    }
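The same loop-until-null pattern carries over to the other string functions. As a further sketch (not the library's actual implementation), a strcmp()-style comparison can walk both arrays together and stop at the first difference or at the null. It is named my_strcmp, a hypothetical name, to avoid clashing with the library function.

```c
#include <stddef.h>

/* Sketch of a strcmp()-style comparison. Returns 0 if the strings match,
   a negative value if a sorts before b, and a positive value if a sorts
   after b. Both arguments are assumed to be null-terminated. */
int my_strcmp(const char *a, const char *b)
{
    size_t i = 0;
    while (a[i] != '\0' && a[i] == b[i])
        i++;                        /* skip the matching prefix */
    /* Compare as unsigned char so codes above 127 order consistently. */
    return (unsigned char)a[i] - (unsigned char)b[i];
}
```

The result is the difference between the first pair of characters that differ, which is why callers should test for "less than zero" or "greater than zero" instead of exactly -1 or +1.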