Homework 2 Joe College Pattern Matching with Grep MAT 4970 January 23, 2012
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  GNU Grep: Print Lines That Match Patterns Version 3.7, 8 August 2021GNU Grep: Print lines that match patterns version 3.7, 8 August 2021 Alain Magloire et al. This manual is for grep, a pattern matching engine. Copyright c 1999{2002, 2005, 2008{2021 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled \GNU Free Documentation License". i Table of Contents 1 Introduction ::::::::::::::::::::::::::::::::::::: 1 2 Invoking grep :::::::::::::::::::::::::::::::::::: 2 2.1 Command-line Options ::::::::::::::::::::::::::::::::::::::::: 2 2.1.1 Generic Program Information :::::::::::::::::::::::::::::: 2 2.1.2 Matching Control :::::::::::::::::::::::::::::::::::::::::: 2 2.1.3 General Output Control ::::::::::::::::::::::::::::::::::: 3 2.1.4 Output Line Prefix Control :::::::::::::::::::::::::::::::: 5 2.1.5 Context Line Control :::::::::::::::::::::::::::::::::::::: 6 2.1.6 File and Directory Selection:::::::::::::::::::::::::::::::: 7 2.1.7 Other Options ::::::::::::::::::::::::::::::::::::::::::::: 9 2.2 Environment Variables:::::::::::::::::::::::::::::::::::::::::: 9 2.3 Exit Status :::::::::::::::::::::::::::::::::::::::::::::::::::: 12 2.4 grep Programs :::::::::::::::::::::::::::::::::::::::::::::::: 13 3 Regular Expressions ::::::::::::::::::::::::::: 14 3.1 Fundamental Structure ::::::::::::::::::::::::::::::::::::::::
- 
												  “Linux at the Command Line” Don Johnson of BU IS&T We’Ll Start with a Sign in Sheet“Linux at the Command Line” Don Johnson of BU IS&T We’ll start with a sign in sheet. We’ll end with a class evaluation. We’ll cover as much as we can in the time allowed; if we don’t cover everything, you’ll pick it up as you continue working with Linux. This is a hands-on, lab class; ask questions at any time. Commands for you to type are in BOLD The Most Common O/S Used By BU Researchers When Working on a Server or Computer Cluster Linux is a Unix clone begun in 1991 and written from scratch by Linus Torvalds with assistance from a loosely-knit team of hackers across the Net. 64% of the world’s servers run some variant of Unix or Linux. The Android phone and the Kindle run Linux. a set of small Linux is an O/S core programs written by written by Linus Richard Stallman and Torvalds and others others. They are the AND GNU utilities. http://www.gnu.org/ Network: ssh, scp Shells: BASH, TCSH, clear, history, chsh, echo, set, setenv, xargs System Information: w, whoami, man, info, which, free, echo, date, cal, df, free Command Information: man, info Symbols: |, >, >>, <, ;, ~, ., .. Filters: grep, egrep, more, less, head, tail Hotkeys: <ctrl><c>, <ctrl><d> File System: ls, mkdir, cd, pwd, mv, touch, file, find, diff, cmp, du, chmod, find File Editors: gedit, nedit You need a “xterm” emulation – software that emulates an “X” terminal and that connects using the “SSH” Secure Shell protocol. ◦ Windows Use StarNet “X-Win32:” http://www.bu.edu/tech/support/desktop/ distribution/xwindows/xwin32/ ◦ Mac OS X “Terminal” is already installed Why? Darwin, the system on which Apple's Mac OS X is built, is a derivative of 4.4BSD-Lite2 and FreeBSD.
- 
												![User Commands Grep ( 1 ) Grep – Search a File for a Pattern /Usr/Bin/Grep [-Bchilnsvw] Limited-Regular-Expression [Filename](https://docslib.b-cdn.net/cover/5418/user-commands-grep-1-grep-search-a-file-for-a-pattern-usr-bin-grep-bchilnsvw-limited-regular-expression-filename-1165418.webp)  User Commands Grep ( 1 ) Grep – Search a File for a Pattern /Usr/Bin/Grep [-Bchilnsvw] Limited-Regular-Expression [FilenameUser Commands grep ( 1 ) NAME grep – search a file for a pattern SYNOPSIS /usr/bin/grep [-bchilnsvw] limited-regular-expression [filename...] /usr/xpg4/bin/grep [-E -F] [-c -l -q] [-bhinsvwx] -e pattern_list... [-f pattern_file]... [file...] /usr/xpg4/bin/grep [-E -F] [-c -l -q] [-bhinsvwx] [-e pattern_list...] -f pattern_file... [file...] /usr/xpg4/bin/grep [-E -F] [-c -l -q] [-bhinsvwx] pattern [file...] DESCRIPTION The grep utility searches text files for a pattern and prints all lines that contain that pattern. It uses a compact non-deterministic algorithm. Be careful using the characters $, ∗, [,ˆ,, (, ), and \ in the pattern_list because they are also meaning- ful to the shell. It is safest to enclose the entire pattern_list in single quotes ’ . ’. If no files are specified, grep assumes standard input. Normally, each line found is copied to standard output. The file name is printed before each line found if there is more than one input file. /usr/bin/grep The /usr/bin/grep utility uses limited regular expressions like those described on the regexp(5) manual page to match the patterns. /usr/xpg4/bin/grep The options -E and -F affect the way /usr/xpg4/bin/grep interprets pattern_list. If -E is specified, /usr/xpg4/bin/grep interprets pattern_list as a full regular expression (see -E for description). If -F is specified, grep interprets pattern_list as a fixed string. If neither are specified, grep interprets pattern_list as a basic regular expression as described on regex(5) manual page. OPTIONS The following options are supported for both /usr/bin/grep and /usr/xpg4/bin/grep: -b Precede each line by the block number on which it was found.
- 
												  Regular Expressions Grep and Sed Intro PreviouslyLecture 4 Regular Expressions grep and sed intro Previously • Basic UNIX Commands – Files: rm, cp, mv, ls, ln – Processes: ps, kill • Unix Filters – cat, head, tail, tee, wc – cut, paste – find – sort, uniq – comm, diff, cmp – tr Subtleties of commands • Executing commands with find • Specification of columns in cut • Specification of columns in sort • Methods of input – Standard in – File name arguments – Special "-" filename • Options for uniq Today • Regular Expressions – Allow you to search for text in files – grep command • Stream manipulation: – sed Regular Expressions What Is a Regular Expression? • A regular expression (regex) describes a set of possible input strings. • Regular expressions descend from a fundamental concept in Computer Science called finite automata theory • Regular expressions are endemic to Unix – vi, ed, sed, and emacs – awk, tcl, perl and Python – grep, egrep, fgrep – compilers Regular Expressions • The simplest regular expressions are a string of literal characters to match. • The string matches the regular expression if it contains the substring. regular expression c k s UNIX Tools rocks. match UNIX Tools sucks. match UNIX Tools is okay. no match Regular Expressions • A regular expression can match a string in more than one place. regular expression a p p l e Scrapple from the apple. match 1 match 2 Regular Expressions • The . regular expression can be used to match any character. regular expression o . For me to poop on. match 1 match 2 Character Classes • Character classes [] can be used to match any specific set of characters. regular expression b [eor] a t beat a brat on a boat match 1 match 2 match 3 Negated Character Classes • Character classes can be negated with the [^] syntax.
- 
												  Linux Terminal Commands Man Pwd Ls Cd Olmo SPython Olmo S. Zavala Romero Welcome Computers File system Super basics of Linux terminal Commands man pwd ls cd Olmo S. Zavala Romero mkdir touch Center of Atmospheric Sciences, UNAM rm mv Ex1 August 9, 2017 Regular expres- sions grep Ex2 More Python Olmo S. 1 Welcome Zavala Romero 2 Computers Welcome 3 File system Computers File system 4 Commands Commands man man pwd pwd ls ls cd mkdir cd touch mkdir rm touch mv Ex1 rm Regular mv expres- sions Ex1 grep Regular expressions Ex2 More grep Ex2 More Welcome to this course! Python Olmo S. Zavala Romero Welcome Computers File system Commands 1 Who am I? man pwd 2 Syllabus ls cd 3 Intro survey https://goo.gl/forms/SD6BM6KHKRlDOpZx1 mkdir 4 touch Homework 1 due this Sunday. rm mv Ex1 Regular expres- sions grep Ex2 More How does a computer works? Python Olmo S. Zavala Romero Welcome Computers File system Commands man pwd ls cd mkdir touch rm mv Ex1 Regular 1 CPU Central Processing Unit. All the computations happen there. expres- sions 2 HD Hard Drive. Stores persistent information in binary format. grep 3 RAM Random Access Memory. Faster memory, closer to the CPU, not persistent. Ex2 More 4 GPU Graphics Processing Unit. Used to process graphics. https://teentechdaily.files.wordpress.com/2015/06/computer-parts-diagram.jpg What is a file? Python Olmo S. Zavala Romero Welcome Computers File system Commands man pwd ls cd mkdir touch rm mv Section inside the HD with 0’s and 1’s. Ex1 Regular File extension is used to identify the meaning of those 0’s and 1’s.
- 
												  How to Build a Search-Engine with Common Unix-ToolsThe Tenth International Conference on Advances in Databases, Knowledge, and Data Applications Mai 20 - 24, 2018 - Nice/France How to build a Search-Engine with Common Unix-Tools Andreas Schmidt (1) (2) Department of Informatics and Institute for Automation and Applied Informatics Business Information Systems Karlsruhe Institute of Technologie University of Applied Sciences Karlsruhe Germany Germany Andreas Schmidt DBKDA - 2018 1/66 Resources available http://www.smiffy.de/dbkda-2018/ 1 • Slideset • Exercises • Command refcard 1. all materials copyright, 2018 by andreas schmidt Andreas Schmidt DBKDA - 2018 2/66 Outlook • General Architecture of an IR-System • Naive Search + 2 hands on exercices • Boolean Search • Text analytics • Vector Space Model • Building an Inverted Index & • Inverted Index Query processing • Query Processing • Overview of useful Unix Tools • Implementation Aspects • Summary Andreas Schmidt DBKDA - 2018 3/66 What is Information Retrieval ? Information Retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an informa- tion need (usually a query) from within large collections (usually stored on computers). [Manning et al., 2008] Andreas Schmidt DBKDA - 2018 4/66 What is Information Retrieval ? need for query information representation how to match? document document collection representation Andreas Schmidt DBKDA - 2018 5/66 Keyword Search • Given: • Number of Keywords • Document collection • Result: • All documents in the collection, cotaining the keywords • (ranked by relevance) Andreas Schmidt DBKDA - 2018 6/66 Naive Approach • Iterate over all documents d in document collection • For each document d, iterate all words w and check, if all the given keywords appear in this document • if yes, add document to result set • Output result set • Extensions/Variants • Ranking see examples later ...
- 
												  Introduction to BASH: Part IIIntroduction to BASH: Part II By Michael Stobb University of California, Merced February 17th, 2017 Quick Review ● Linux is a very popular operating system for scientific computing ● The command line interface (CLI) is ubiquitous and efficient ● A “shell” is a program that interprets and executes a user's commands ○ BASH: Bourne Again SHell (by far the most popular) ○ CSH: C SHell ○ ZSH: Z SHell ● Does everyone have access to a shell? Quick Review: Basic Commands ● pwd ○ ‘print working directory’, or where are you currently ● cd ○ ‘change directory’ in the filesystem, or where you want to go ● ls ○ ‘list’ the contents of the directory, or look at what is inside ● mkdir ○ ‘make directory’, or make a new folder ● cp ○ ‘copy’ a file ● mv ○ ‘move’ a file ● rm ○ ‘remove’ a file (be careful, usually no undos!) ● echo ○ Return (like an echo) the input to the screen ● Autocomplete! Download Some Example Files 1) Make a new folder, perhaps ‘bash_examples’, then cd into it. Capital ‘o’ 2) Type the following command: wget "goo.gl/oBFKrL" -O tutorial.tar 3) Extract the tar file with: tar -xf tutorial.tar 4) Delete the old tar file with rm tutorial.tar 5) cd into the new director ‘tutorial’ Input/Output Redirection ● Typically we give input to a command as follows: ○ cat file.txt ● Make the input explicit by using “<” ○ cat < file.txt ● Change the output by using “>” ○ cat < file.txt > output.txt ● Use the output of one function as the input of another ○ cat < file.txt | less BASH Utilities ● BASH has some awesome utilities ○ External commands not
- 
												  Laboratory ManualLINUX SHELL PROGRAMMING LAB (4CS4-24) LABORATORY MANUAL FOR IV -SEMESTER COMPUTER SCIENCE& ENGINEERING Linux Shell Programming Lab- 4CS4-24 INTERNAL MARKS: 30 EXTERNAL MARKS: 20 Department of Computer Science and Engineering GLOBAL INSTITUTE OF TECHNOLOGY, JAIPUR Department of Computer Science & Engineering Global Institute of Technology, Jaipur LINUX SHELL PROGRAMMING LAB (4CS4-24) Scheme as per RTU M.M M.M. Subject Exam Sessional/ Total Name of Subject L T P End Code Hrs. Mid M.M. Term Term 4CS4-24 Linux Shell Prog. 2 - - 2 30 20 50 Lab Assessment criteria A. Internal Assessment: 30 In continuous evaluation system of the university, a student is evaluated throughout semester. His/her performance in the lab, attendance, practical knowledge, problem solving skill, written work in practical file and behavior are main criteria to evaluate student performance. Apart from that a lab quiz will be organize to see program programming skill and knowledge about the proposed subject. B. External Assessment: 20 At the end of the semester a lab examination will be scheduled to check overall programming skill, in which student will need to solve 2 programming problems in time span of 3 hours. C. Total Marks: 30+20=50 Department of Computer Science & Engineering Global Institute of Technology, Jaipur LINUX SHELL PROGRAMMING LAB (4CS4-24) SYLLABUS AS PER RTU 1. Use of basic Unix Shell Commands: ls, mkdir, rmdir, cd, cat, banner, touch, file, wc, sort, cut, grep, dd, dfspace, du, ulimit. 2. Commands related to inode, I/O redirection, piping, process control commands, mails. 3. Shell Programming: shell script exercise based on following: a.
- 
												  Introduction to Bash ProgrammingBash Scripting: Advanced Topics CISC3130, Spring 2013 Dr. Zhang 1 Outline Review HW1, HW2 and Quiz2 Review of standard input/output/error How to redirect them ? Pipeline Review of bash scripting Functions Here documents Arrays 2 Homework 2 match phone numbers in text file 7188174484, 718-817-4484, (718)817,4484 817-4484, or 817,4484, 8174484. (01)718,817,4484, 01,718-817-4484 grep -f phone.grep file.txt , where phone.grep: [^0-9][0-9]\{10\}$ Match 10 digits at end of line [^0-9][0-9]\{10\}[^0-9] Match 10 digits, and a non-digit char [^0-9][0-9]\{3\}\-[0-9]\{3\}\-[0-9]\{4\}$ 718-817,4484 at end of line [^0-9][0-9]\{3\}\-[0-9]\{3\}\-[0-9]\{4\}[^0-9] [^0-9][0-9]\{3\}\,[0-9]\{4\}$ [^0-9][0-9]\{3\}\,[0-9]\{4\}[^0-9] [^0-9][0-9]\{3\}\-[0-9]\{4\}$ [^0-9][0-9]\{3\}\-[0-9]\{4\}[^0-9] 3 Homework 2 [^0-9]*\([0-9]\{2\}\)\([0-9]\{3\}\)[0-9]\{3\}\,[0-9]\{4\}$ [^0-9]*\([0-9]\{2\}\)\([0-9]\{3\}\)[0-9]\{3\}\,[0-9]\{4\}[^0-9] [^0-9]*\([0-9]\{2\}\)\([0-9]\{3\}\))?[0-9]\{3\}\-[0-9]\{4\}$ [^0-9]*\([0-9]\{2\}\)\([0-9]\{3\}\))?[0-9]\{3\}\-[0-9]\{4\}[^0-9] [^0-9]*[0-9]\{2\}\,[0-9]\{3\}\,[0-9]\{3\}\,[0-9]\{4\}$ [^0-9]*[0-9]\{2\}\,[0-9]\{3\}\-[0-9]\{3\}\-[0-9]\{4\}[^0-9] [^0-9]*[0-9]\{2\}\,[0-9]\{3\}\-[0-9]\{3\}\-[0-9]\{4\}$ [^0-9]*[0-9]\{2\}\,[0-9]\{3\}\,[0-9]\{3\}\,[0-9]\{4\}[^0-9] 4 Homework 2 Write a sed script file that remove all one-line comments from C/C++ source code files.
- 
												  UNIX II:Grep, Awk, Sed October 30, 2017 File Searching and ManipulationUNIX II:grep, awk, sed October 30, 2017 File searching and manipulation • In many cases, you might have a file in which you need to find specific entries (want to find each case of NaN in your datafile for example) • Or you want to reformat a long datafile (change order of columns, or just use certain columns) • Can be done with writing python or other scripts, today will use other UNIX tools grep: global regular expression print • Use to search for a pattern and print them • Amazingly useful! (a lot like Google) grep Basic syntax: >>> grep <pattern> <inputfile> >>> grep Oklahoma one_week_eq.txt 2017-10-28T09:32:45.970Z,35.3476,-98.0622,5,2.7,mb_lg,,133,0.329,0.3,us,us1000ay0b,2017-10-28T09:47:05.040Z,"11km WNW of Minco, Oklahoma",earthquake,1.7,1.9,0.056,83,reviewed,us,us 2017-10-28T04:08:45.890Z,36.2119,-97.2878,5,2.5,mb_lg,,41,0.064,0.32,us,us1000axz3,2017-10-28T04:22:21.040Z,"8km S of Perry, Oklahoma",earthquake,1.4,2,0.104,24,reviewed,us,us 2017-10-27T18:39:28.100Z,36.4921,-98.7233,6.404,2.7,ml,,50,,0.41,us,us1000axpz,2017-10-28T02:02:23.625Z,"33km NW of Fairview, Oklahoma",earthquake,1.3,2.6,,,reviewed,tul,tul 2017-10-27T10:00:07.430Z,36.2851,-97.506,5,2.8,mb_lg,,25,0.216,0.19,us,us1000axgi,2017-10-27T19:39:37.296Z,"19km W of Perry, Oklahoma",earthquake,0.7,1.8,0.071,52,reviewed,us,us 2017-10-25T15:17:48.200Z,36.2824,-97.504,7.408,3.1,ml,,25,,0.23,us,us1000awq6,2017-10-25T21:38:59.678Z,"19km W of Perry, Oklahoma",earthquake,1.1,5,,,reviewed,tul,tul 2017-10-25T11:05:21.940Z,35.4134,-97.0133,5,2.5,mb_lg,,157,0.152,0.31,us,us1000awms,2017-10-27T21:37:47.660Z,"7km
- 
												  Lecture 7 Regular Expressions and Grep 7.1LECTURE 7 REGULAR EXPRESSIONS AND GREP 7.1 Regular Expressions 7.1.1 Metacharacters, Wild cards and Regular Expressions Characters that have special meaning for utilities like grep, sed and awk are called metacharacters. These special characters are usually alpha-numeric in nature; a "e# e$- ample metacharacters are: caret &�'( dollar &+'( question mar- &.'( asterisk &/'( backslash &1'("orward slash &2'( ampersand &3'( set braces &4 an) 5'( range brac-ets &[...]). These special characters are interpreted contextually 0y the utilities. If you need to use the spe- cial characters as is without any interpretation, they need to be prefixed with the escape character (\). 8or example( a literal + can be inserted as 1+! a literal \can be inserted as \\. Although not as commonly used, numbers also can be turned into metacharacters 0y using escape character. 8or example( a number 9 is a literal number tw*( while 19 has a special meaning. Often times( the same metacharacters ha:e special meaning "or the shell as #ell. Thus #e need to be careful when passing metacharacters as arguments to utilities. It is a standard practice to embed the metacharacter arguments in single quotes to stop the shell from interpreting the metacharacters. $grep a*b file . asteris- gets interpreted 0y shell $grep ’a*b’ file . asterisk gets interpreted 0y 6rep utility Although single quotes are needed only "or arguments with metacharacters( as a 6**) practice( the pattern arguments *" the 6rep( sed and a#- utilities are always embedded in single quotes. $grep ’ab’ file $sed ’/a*b/’ file 7.1 Regular Expressions 47 Wild card Meaning given 0y shell / Zero or more characters .
- 
												  LFCS Exam Name: Linux Foundation Certified System AdministratorVendor: Linux Foundation Exam Code: LFCS Exam Name: Linux Foundation Certified System Administrator Version: DEMO QUESTION 1 What is the output of the following command? echo "Hello World" | tr -d aieou A. Hello World B. eoo C. Hll Wrld D. eoo Hll Wrld Answer: C QUESTION 2 Given a file called birthdays containing lines like: YYYY-MM-DD Name 1983-06-02 Tim 1995-12-17 Sue Which command would you use to output the lines belonging to all people listed whose birthday is in May or June? A. grep '[56]' birthdays B. grep 05?6? birthdays C. grep '[0-9]*-0[56]-' birthdays D. grep 06 birthdays | grep 05 Answer: C QUESTION 3 Which keyword must be listed in the hosts option of the Name Service Switch configuration file in order to make host lookups consult the /etc/hosts file? Answer: files QUESTION 4 Which command can be used to delete a group from a Linux system? A. groupdel B. groupmod C. groups D. groupedit Answer: A QUESTION 5 How many IP-addresses can be used for unique hosts inside the IPv4 subnet 192.168.2.128/28? (Specify the number only without any additional information.) Answer: 14 QUESTION 6 What is the purpose of the command mailq? A. It fetches new emails from a remote server using POP3 or IMAP. B. It is a multi-user mailing list manager. C. It is a proprietary tool contained only in the qmail MTA. D. It queries the mail queue of the local MTA. E. It is a command-line based tool for reading and writing emails.