Saturday, June 16, 2012

Linux 15 - Search textfiles with regular expressions

  • grep
  • grep -F (fgrep)
  • grep -E (egrep)
  • sed
  • regex
Grepping means searching text for a pattern and printing every line that matches it.
Per dictionary.com, grep is an acronym that stands for:
Globally search for the Regular Expression and Print the lines containing matches to it.
 
grep

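# grep oo file.txt
books
Books
book
doogy

What it does: Finds the letters 'oo' in the file 'file.txt' (case-sensitive) and prints each matching line.

# grep boo file.txt
books
book

What it does: The same search for 'boo'. 'Books' is left out because the match is case-sensitive.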
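To follow along, a sample file is needed. The post never lists file.txt in full, so the contents below are an assumption, reconstructed from the words that show up in the command outputs (ten words, one per line). A rough sketch:

```shell
# Assumed contents of file.txt, reconstructed from the outputs in this post.
# Working in a temporary directory avoids overwriting a real file.txt.
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

# Lines containing 'oo' (case-sensitive)
matches=$(grep oo file.txt)
printf '%s\n' "$matches"
# books
# Books
# book
# doogy
```

The single quotes around 'fa$t' keep the shell from treating $t as a variable.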

# grep -n oo file.txt
2:books
3:Books
4:book
6:doogy

What it does: Finds the letters 'oo' in the file 'file.txt' (case-sensitive) and, with -n, prefixes each matching line with its line number in the file.

# grep -i boo file.txt
books
Books
book

What it does: Case-insensitive search for 'boo' (-i).
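Single-letter options can be combined. As a small sketch (file.txt is recreated here from the words that appear in this post's outputs, which is an assumption about its exact contents), -i and -n together give a case-insensitive search with line numbers:

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

# -i ignores case, -n adds line numbers; -in is shorthand for -i -n
numbered=$(grep -in boo file.txt)
printf '%s\n' "$numbered"
# 2:books
# 3:Books
# 4:book
```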

# ls | grep ile
file2.txt
file.txt

What it does: Uses piping to grep the output of the ls command

# cat file.txt | grep oo
books
Books
book
doogy

What it does: Gives the same result as the first example, but cat reads the file and pipes its contents to grep.
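grep does not care where its input comes from, so cat is optional here. A quick sketch comparing three ways of feeding it the same file (the file contents are an assumption, rebuilt from the outputs in this post):

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

via_cat=$(cat file.txt | grep oo)   # cat sends the file through a pipe
via_arg=$(grep oo file.txt)         # grep opens the file itself
via_stdin=$(grep oo < file.txt)     # the shell redirects the file to grep's stdin

# All three should hold the same four lines
[ "$via_cat" = "$via_arg" ] && [ "$via_arg" = "$via_stdin" ] && echo "same output"
```

Passing the file name directly saves one process; the pipe form is mostly useful when the input comes from another command, as in the ls example.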
# grep ple file.txt
purple
plenty

# grep ^ple file.txt
plenty

What it does: Grepping 'ple' finds both 'purple' and 'plenty'. With '^ple' the search is restricted to lines that begin with 'ple'.

# grep s$ file.txt
baldes
books
Books

What it does: Finds the lines that end in 's'. The $ anchors the match to the end of the line.
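The anchors can be used together, and it is usually safer to put the pattern in single quotes so the shell leaves characters such as $ alone. A minimal sketch, assuming file.txt holds the ten words seen in this post's outputs:

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

ends_in_s=$(grep 's$' file.txt)        # lines ending in 's'
whole_line=$(grep '^book$' file.txt)   # lines that are exactly 'book'

printf '%s\n' "$ends_in_s"
# baldes
# books
# Books
printf '%s\n' "$whole_line"
# book
```

With both anchors in place, 'books' and 'Books' no longer match because something follows 'book' on those lines.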

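# grep .o file.txt
books
Books
book
doogy

# grep .ple file.txt
purple

What it does: The dot matches any single character, so '.o' finds any character followed by 'o', and '.ple' finds 'ple' preceded by at least one character. 'plenty' is not matched because nothing comes before its 'ple'.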
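Because the dot matches anything, it needs a backslash when a real dot is meant. A short sketch with three made-up file names (file_txt is hypothetical, added only to show the difference):

```shell
# Hypothetical file names, one per line
names=$(printf '%s\n' file.txt file2.txt file_txt)

any_char=$(printf '%s\n' "$names" | grep 'file.txt')   # the dot matches '.' and '_'
literal=$(printf '%s\n' "$names" | grep 'file\.txt')   # only a real dot matches

printf '%s\n' "$any_char"
# file.txt
# file_txt
printf '%s\n' "$literal"
# file.txt
```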

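grep -F (fgrep) - fixed-string grep, matches literal strings

# fgrep fa$ file.txt
fa$t

What it does: Treats the pattern as plain text rather than a regular expression, so the '$' is an ordinary character and the line 'fa$t' is found.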
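To see why the literal mode matters, the same pattern can be run both ways. In a regular expression a trailing $ is the end-of-line anchor, so plain grep looks for lines ending in 'fa' and finds nothing. This sketch assumes the file contents reconstructed from this post's outputs, and uses grep -F, which newer GNU grep releases recommend over the fgrep name:

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

fixed=$(grep -F 'fa$' file.txt)        # literal f, a, $ -> matches fa$t
regex=$(grep 'fa$' file.txt || true)   # 'fa' at end of line -> no match, grep exits 1

printf 'fixed: %s\n' "$fixed"
# fixed: fa$t
printf 'regex: %s\n' "$regex"
# regex:
```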

grep -E (egrep) - extended regex grep

# egrep .ple file.txt
purple

# egrep '^(b|d)' file.txt
baldes
books
book
doogy

What it does: Finds lines that begin with either b or d.

# egrep '^(b|d)oo' file.txt
books
book
doogy
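When every alternative is a single character, a bracket list does the same job as the group. A small sketch comparing the two forms with grep -E, assuming file.txt contains the ten words seen in this post's outputs:

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

with_group=$(grep -E '^(b|d)oo' file.txt)   # alternation: b or d, then oo
with_class=$(grep -E '^[bd]oo' file.txt)    # bracket list: one of b, d, then oo

printf '%s\n' "$with_class"
# books
# book
# doogy
```

The group form is still the one to reach for when the alternatives are longer than one character.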

# egrep '^[a-k]' file.txt
baldes
books
Books
book
kitten
doogy
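fa$t

What it does: Finds lines that begin with a character in the range a to k. 'Books' most likely shows up because ranges follow the locale's sort order on this system, where uppercase letters fall between the lowercase ones; in the plain C locale only lowercase a to k would match.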

# egrep '^[a-k]|[A-K]' file.txt
baldes
books
Books
book
kitten
doogy
purple
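fa$t
skip
plenty

What it does: The ^ applies only to the first alternative, so this matches lines that begin with a to k, or that contain a character from [A-K] anywhere in the line. That is why 'purple', 'skip' and 'plenty' appear even though they do not start with a to k.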
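Range results like these depend on the locale. As a hedged sketch, forcing the C locale with LC_ALL=C makes [a-k] mean exactly the lowercase letters a to k, and putting both ranges inside one bracket list keeps the ^ anchor on both. The file contents are an assumption, rebuilt from this post's outputs:

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

# C locale: ranges follow plain ASCII order
lower=$(LC_ALL=C grep -E '^[a-k]' file.txt)      # first letter is lowercase a-k
both=$(LC_ALL=C grep -E '^[a-kA-K]' file.txt)    # first letter is a-k in either case

printf '%s\n' "$lower"
# baldes
# books
# book
# kitten
# doogy
# fa$t
printf '%s\n' "$both"
# baldes
# books
# Books
# book
# kitten
# doogy
# fa$t
```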

sed

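# sed -e 's/oo/00/' file.txt
baldes
b00ks
B00ks
b00k
kitten
d00gy
purple
fa$t
skip
plenty

What it does: Replaces the first 'oo' on each line with '00' and prints every line, changed or not.

# sed -re 's/^(B|b)/C/' file.txt
Caldes
Cooks
Cooks
Cook
kitten
doogy
purple
fa$t
skip
plenty

What it does: Uses an extended regular expression (-r) to replace a leading 'B' or 'b' with 'C'.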
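The s command only changes the first match on each line unless the g flag is added. The words in file.txt each contain 'oo' once, so the difference does not show there; this sketch uses a made-up input line instead:

```shell
# 'book look' is a hypothetical input line with two matches
first_only=$(printf 'book look\n' | sed 's/oo/00/')    # first match per line
every=$(printf 'book look\n' | sed 's/oo/00/g')        # g = every match on the line

printf '%s\n' "$first_only"
# b00k look
printf '%s\n' "$every"
# b00k l00k
```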

regex

  • ^ beginning of line; as the first character inside brackets, e.g. [^a-z], it matches any character not in the list
  • $ end of line
  • . any character
  • [a-z] range
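The two meanings of ^ are easy to mix up, so here is a quick sketch of both, again assuming file.txt holds the ten words seen in this post's outputs:

```shell
cd "$(mktemp -d)"
printf '%s\n' baldes books Books book kitten doogy purple 'fa$t' skip plenty > file.txt

starts_b=$(grep '^b' file.txt)      # outside brackets: anchor to start of line
not_b=$(grep '^[^b]' file.txt)      # inside brackets: any first character except 'b'

printf '%s\n' "$starts_b"
# baldes
# books
# book
printf '%s\n' "$not_b"
# Books
# kitten
# doogy
# purple
# fa$t
# skip
# plenty
```

'Books' lands in the second group because the list [^b] excludes only the lowercase letter.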
