Exercises 12.10 Multiple Choice Questions
1.
11-9-1: What will the following code print?
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:', line):
        print(line)
Any line containing 'From'
The '^' symbol anchors the match to the beginning of the line, so 'From' elsewhere in the line does not count.
Any line that starts with 'From'
It must include the colon to match.
Any line that starts with 'From:'
Correct! The regular expression matches any line beginning with 'From:'.
Any line containing 'From:'
The '^' symbol matches strings at the beginning of the line.
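The anchoring behavior can be checked without the data file. The following is a minimal sketch that uses a few made-up lines (`sample_lines` is a hypothetical stand-in for the contents of mbox-short.txt) and keeps only those that `re.search('^From:', ...)` accepts.

```python
import re

# Hypothetical sample lines standing in for mbox-short.txt.
sample_lines = [
    'From: stephen.marquard@uct.ac.za',                           # starts with 'From:'
    'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008',   # no colon after 'From'
    'Reply sent. From: someone@example.com',                      # 'From:' is not at the start
]

# '^' anchors the pattern to the start of the string being searched.
matched = [line for line in sample_lines if re.search('^From:', line)]
print(matched)
```

Only the first line is kept: the second lacks the colon, and the third contains 'From:' but not at the beginning.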
2.
11-9-2: Which of the following options is the correct way to import the regex library?
import regex
The standard library module has a shorter name than this.
import re
This is how to import the regex library.
import Regex
Try shortening regex and using all lowercase.
import Re
Close, but Python is case sensitive.
3.
11-9-3: Which of these option(s) will match digits in a string? Select all that apply.
[0-9]
This is one way to match all digits using regex.
\D
\D matches any non-digit character.
\d
\d matches any digit from 0 to 9.
\w
\w matches any word character (letters, digits, and the underscore), so it is not specific to digits.
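To see how the four character classes differ, the sketch below runs each one over the same short string (the string `text` is an arbitrary example, not from the exercise). Note that for ordinary Python 3 strings, `\d` also accepts decimal digits from other scripts, which `[0-9]` does not; for ASCII text the two behave the same.

```python
import re

text = 'Room 42_b'  # arbitrary example string

digits_class = re.findall(r'[0-9]', text)  # explicit range of digits
digits_d = re.findall(r'\d', text)         # shorthand digit class
non_digits = re.findall(r'\D', text)       # everything that is NOT a digit
word_chars = re.findall(r'\w', text)       # letters, digits, and underscore

print(digits_class)  # ['4', '2']
print(digits_d)      # ['4', '2']
print(non_digits)    # ['R', 'o', 'o', 'm', ' ', '_', 'b']
print(word_chars)    # ['R', 'o', 'o', 'm', '4', '2', '_', 'b']
```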
4.
11-9-4: Which of these lines will be matched when the following code is run?
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'\$.+', line):
        print(line)
It will cost you $1.00
There is a dollar sign followed by one or more characters.
From: stephen.marquard@uct.ac.za $a
This will match a dollar sign followed by at least one character.
$2.50 is your change
The dollar sign in this line is followed by more than one character.
You owe three dollars.
The string needs to include a dollar sign.
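The four answer choices can be tested directly. This sketch applies the same pattern to each of them, plus one extra line (`'Total: $'`, added here purely for illustration) in which the dollar sign has nothing after it.

```python
import re

candidates = [
    'It will cost you $1.00',
    'From: stephen.marquard@uct.ac.za $a',
    '$2.50 is your change',
    'You owe three dollars.',
    'Total: $',  # illustrative extra: '$' with no character after it
]

# '\$' is a literal dollar sign; '.+' requires at least one more character.
matched = [line for line in candidates if re.search(r'\$.+', line)]
print(matched)
```

The first three lines match. The last two do not: one has no dollar sign, and the other has a dollar sign with nothing following it.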
5.
11-9-5: True or false? The following code will match only the first email (up to the @ sign) in the string.
import re
stri = 'From: stephen.a.smith@espn.com, drake@hotmail.com, frenchMontana@gmail.com'
stri = stri.rstrip()
print(re.findall('From:.+@', stri))
True
Try again!
False
The '+' quantifier is greedy, so '.+@' extends to the last '@' in the string instead of stopping at the first email.
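The difference between greedy and non-greedy matching is easiest to see side by side. The sketch below reuses the string from the question and compares `.+` with its non-greedy form `.+?` (the non-greedy variant is added here for contrast and is not part of the original question).

```python
import re

stri = 'From: stephen.a.smith@espn.com, drake@hotmail.com, frenchMontana@gmail.com'

# Greedy: '.+' takes as much as it can, so the match ends at the LAST '@'.
greedy = re.findall('From:.+@', stri)

# Non-greedy: '.+?' takes as little as it can, so the match ends at the FIRST '@'.
lazy = re.findall('From:.+?@', stri)

print(greedy)  # ['From: stephen.a.smith@espn.com, drake@hotmail.com, frenchMontana@']
print(lazy)    # ['From: stephen.a.smith@']
```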
6.
11-9-6: Which of the following options will the regular expression ^X-.*: [0-9.]+ match?
X-DSPAM-Probability: Accurate
The regular expression requires a digit or period after the colon and space, and this line has a word there instead.
X-DSPAM-Confidence: 0.8475
This line matches the given regular expression.
X-Wolverine-Confidence: 1
This line also matches the given regular expression.
X Wolverine-Confidence: 0.53
This is missing a dash between 'X' and other letters.
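As a quick check, the sketch below applies the pattern to each of the four options and records whether `re.search` finds a match. The dictionary name `results` is just a convenience for this example.

```python
import re

pattern = r'^X-.*: [0-9.]+'

lines = [
    'X-DSPAM-Probability: Accurate',
    'X-DSPAM-Confidence: 0.8475',
    'X-Wolverine-Confidence: 1',
    'X Wolverine-Confidence: 0.53',
]

# Map each line to True/False depending on whether the pattern matches.
results = {line: bool(re.search(pattern, line)) for line in lines}

for line, ok in results.items():
    print(ok, line)
```

Only the second and third lines match: the first has no digit or period after ': ', and the fourth does not start with 'X-'.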
7.
11-9-7: Which of the following regex statements will match and return any price of the form $3.45 or $23.32 or $400 found in the string 'x'?
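re.findall(r'\$\d+\.\d{2}', x)
This requires a '.' and two digits after the dollar amount, so it would miss a price like $400.
re.findall(r'$\d+(?:\.\d{2})?', x)
The '$' needs to be escaped with a '\'. Unescaped, it anchors the match to the end of the string.
re.findall(r'\$\d+(\.\d{2})?', x)
Because of the capturing group, findall returns only the '.' and the two digits after it (or an empty string when they are absent), not the whole price.
re.findall(r'\$\d+(?:\.\d{2})?', x)
The non-capturing group makes the period and two trailing digits optional, and findall returns the whole price.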
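Running all four options on one string shows how the escape and the group type change what `findall` returns. In this sketch the sample string `x` and the variable names are made up for illustration.

```python
import re

x = 'Prices: $3.45, $23.32 and $400'  # hypothetical sample string

required_cents = re.findall(r'\$\d+\.\d{2}', x)      # cents are mandatory
unescaped = re.findall(r'$\d+(?:\.\d{2})?', x)       # '$' is an end-of-string anchor here
capturing = re.findall(r'\$\d+(\.\d{2})?', x)        # findall returns only the group
non_capturing = re.findall(r'\$\d+(?:\.\d{2})?', x)  # findall returns the whole match

print(required_cents)  # ['$3.45', '$23.32']
print(unescaped)       # []
print(capturing)       # ['.45', '.32', '']
print(non_capturing)   # ['$3.45', '$23.32', '$400']
```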
8.
11-9-8: Which of these command lines will match lines in mbox-short.txt starting with 'UMich'?
# grep '^Umich' mbox-short.txt
grep is case sensitive!
$ grep '^UMich' mbox-short.txt
This is the correct use of the grep command.
$ grep 'UMich' mbox-short.txt
This is missing the '^' character, so it matches 'UMich' anywhere in the line.
# grep 'UMich' mbox-short.txt
Not quite the correct symbol at the start of the line.
9.
11-9-9: If you are having trouble remembering what methods the regex library contains, which command can you use to list them?
help()
This starts the interactive help system; it does not list the methods of the regex library directly.
dir(regex)
The standard library module is named re, not regex.
dir(re)
This returns a list of the names defined in the re module, including all of its functions.
import re
This will import the regular expression module.