Quoting Expressions

A string in "" expands variables and escape sequences such as \n (interpolation)
A string in '' is taken as-is (only \\ and \' are treated as escapes)

Use q for a single-quoted string with a delimiter of your choice

q|this is to be quoted|

Use qq for a double-quoted string with a delimiter of your choice

qq~this is to be double quoted~
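A minimal sketch of the difference between the two quoting styles. The variable $name and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

my $name = 'World';                # hypothetical sample value

my $single = q|Hello, $name\n|;    # like '...': nothing is expanded
my $double = qq~Hello, $name\n~;   # like "...": $name and \n are expanded

print "$single\n";   # prints: Hello, $name\n
print $double;       # prints: Hello, World
```

Any punctuation character can serve as the delimiter, which avoids escaping quotes inside the string.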

Text Quoting
 
print <<MARK ;
Content
MARK
 
Example
 
my $str1 = <<END;
text text text
      text tex text
text text text
END
 
$text =~ s|find\s+this\s+line|$str1|g;
 
Use <<'MARK' for a single-quoted (literal) here-document and <<"MARK" for a double-quoted (interpolating) one; a bare <<MARK behaves like <<"MARK"
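A short sketch contrasting the two here-document forms. The variable $user and the END marker are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $user = 'alice';   # hypothetical sample value

# Double-quoted here-document: $user is interpolated.
my $interpolated = <<"END";
Hello, $user
END

# Single-quoted here-document: the text is kept exactly as written.
my $literal = <<'END';
Hello, $user
END

print $interpolated;   # prints: Hello, alice
print $literal;        # prints: Hello, $user
```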
 
Use qr// for a quoted, compiled regular expression that can be stored in a variable and reused
 
$pattern = qr/pattern/;
/$pattern/;
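A small sketch of reusing a compiled pattern, both directly and embedded in a larger expression. The sample lines are invented for illustration.

```perl
use strict;
use warnings;

my $pattern = qr/(\d+)\s+apples/i;   # compiled once; the /i flag travels with it

my @lines = ('3 apples', 'no fruit', '12 APPLES');   # hypothetical input
my @counts;
for my $line (@lines) {
    push @counts, $1 if $line =~ $pattern;   # use the compiled pattern directly
}

# A qr// object can also be interpolated into a larger pattern.
my $anchored = '7 apples' =~ /^$pattern$/ ? 1 : 0;

print "@counts\n";   # prints: 3 12
```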
 
Use the trailing /x in a regular expression to spread the expression over separate lines and add comments
 
/(abc)  # comment on abc
 (xyz)   # comment on xyz
 (123)
/x
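A runnable sketch of the same idea, assuming an ISO-style date as sample input. Under /x, literal white space in the pattern is ignored, so a real space has to be written as \s or an escaped space.

```perl
use strict;
use warnings;

my $date_re = qr/
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
/x;

my ($year, $month, $day) = '2024-01-31' =~ $date_re;   # hypothetical input
print "$year $month $day\n";   # prints: 2024 01 31
```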
 
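Use the trailing /o in a regular expression to ask Perl to compile a pattern containing interpolated variables only once (largely obsolete; qr// is usually the better choice)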
 
 

Pattern Matching

Use the Regexp::Common module (from CPAN) for ready-made patterns for common expressions
 
Find pattern in line, return true/false

if ($_ =~ /pattern/), which is the same as if (/pattern/)

Pattern not found

if (!/pattern/), which is the same as if ($_ !~ /pattern/)

Substitution

$_ =~ s/pattern/newpattern/g

Case insensitive

if (/pattern/i)
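The matching, negation, substitution, and case-insensitive forms above can be sketched together. The log lines and strings are hypothetical.

```perl
use strict;
use warnings;

my @log = ('ERROR: disk full', 'info: all good', 'Error: retry');   # hypothetical
my (@errors, @others);
for (@log) {                               # each element is aliased to $_
    if (/error/i)  { push @errors, $_ }    # same as $_ =~ /error/i
    if (!/error/i) { push @others, $_ }    # same as $_ !~ /error/i
}

my $text = 'cat hat cat';
(my $copy = $text) =~ s/cat/dog/g;         # /g replaces every occurrence

print scalar(@errors), " errors, ", scalar(@others), " other\n";   # prints: 2 errors, 1 other
print "$copy\n";                                                   # prints: dog hat dog
```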

A set of characters

[ABCDE], [A-Z], or [A-Z0-9]

Not a set of characters

[^ABC]

One or the other

$s =~ /^(A|B)$/
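A brief sketch of character sets, negated sets, and anchored alternation applied to a made-up list of codes.

```perl
use strict;
use warnings;

my @codes = ('A', 'B', 'AB', 'C7', 'x9');   # hypothetical input

my @a_or_b      = grep { /^(A|B)$/ }      @codes;   # exactly "A" or exactly "B"
my @upper_digit = grep { /^[A-Z][0-9]$/ } @codes;   # one uppercase letter, one digit
my @not_abc     = grep { /^[^ABC]/ }      @codes;   # first character is not A, B, or C

print "@a_or_b | @upper_digit | @not_abc\n";   # prints: A B | C7 | x9
```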

Special shortcuts

\d == [0-9] \D == non-digit
\w == [0-9A-Za-z_] \W == non-word character
\s == [ \t\n\r\f] \S == non-white space
\h == horizontal white space
\v == vertical white space
\R == line ending
\N == non-newline
 . == any character except newline (any character at all with /s)
\b == word boundary
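A small sketch exercising a few of these shortcuts on a made-up record that contains an underscore, a tab, and a space.

```perl
use strict;
use warnings;

my $line = "id_42\tJohn Smith";   # hypothetical record: tab, then a space

my ($word)   = $line =~ /(\w+)/;              # \w includes the underscore
my ($number) = $line =~ /(\d+)/;              # first run of digits
my @fields   = split /\s+/, $line;            # \s matches the tab and the space
my $whole    = $line =~ /\bJohn\b/ ? 1 : 0;   # "John" stands alone as a word
my $partial  = $line =~ /\bJoh\b/  ? 1 : 0;   # "Joh" is followed by "n": no boundary

print "$word $number @fields $whole $partial\n";
# prints: id_42 42 id_42 John Smith 1 0
```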

This or That

th(is|at)

Repeat Quantifiers after a character

?     - zero or one
*     - zero or more
+     - one or more
{x,y} - between x and y times
{x}   - exactly x times
*?    - zero or more but non-greedy
+?    - one or more but non-greedy
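A sketch of how greedy and non-greedy quantifiers differ on the same input, plus the counted forms. The sample strings are invented.

```perl
use strict;
use warnings;

my $html = '<b>one</b> and <b>two</b>';   # hypothetical input

my ($greedy) = $html =~ m{<b>(.*)</b>};    # .* runs to the last </b>
my ($lazy)   = $html =~ m{<b>(.*?)</b>};   # .*? stops at the first </b>

my $year_ok  = '2024' =~ /^\d{4}$/   ? 1 : 0;   # exactly four digits
my $range_ok = 'aaaa' =~ /^a{2,3}$/  ? 1 : 0;   # four a's is outside 2..3

print "$greedy\n";   # prints: one</b> and <b>two
print "$lazy\n";     # prints: one
```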


Others

(abc)  the string "abc", treated as a single group


Extract matched string

Use of parentheses

(pattern1)(pattern2)(pattern3) will be extracted to $1, $2, and $3
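A minimal sketch of extracting groups, first through $1, $2, $3 and then through a list-context match. The time string is a made-up example.

```perl
use strict;
use warnings;

my $time = '12:34:56';   # hypothetical input
my ($h, $m, $s);

if ($time =~ /(\d+):(\d+):(\d+)/) {
    ($h, $m, $s) = ($1, $2, $3);   # only trust $1.. after a successful match
}

# In list context the match returns the captured groups directly.
my @parts = $time =~ /(\d+):(\d+):(\d+)/;

print "$h $m $s\n";   # prints: 12 34 56
```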


 


Parameter Matching and Capture

Parameters are captured by enclosing a matching expression in parentheses, for example (\S*)...(\w+). The captured parameters can then be accessed as $1, $2, $3, etc.

Ex: $d =~ s/\d{2}(\d{2})(\d{2})(\d{2})/$2\/$3\/$1/; #date translation
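The date translation above can be run as follows, assuming the input is in YYYYMMDD form (an assumption; the original does not state the format). A different delimiter is used so the slashes in the replacement need no escaping.

```perl
use strict;
use warnings;

my $d = '20240131';   # hypothetical YYYYMMDD input

# Skip the century, capture YY, MM, DD, and reorder to MM/DD/YY.
$d =~ s|\d{2}(\d{2})(\d{2})(\d{2})|$2/$3/$1|;

print "$d\n";   # prints: 01/31/24
```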

Add a label to a captured expression (named capture)

(\S+) becomes (?<label>\S+) and can be referenced by $+{label}, an entry in the %+ hash
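A short sketch of named captures, which need Perl 5.10 or later. The labels key and value and the sample string are chosen for illustration.

```perl
use strict;
use warnings;

my $pair = 'color=blue';   # hypothetical input
my ($key, $value);

if ($pair =~ /(?<key>\w+)=(?<value>\S+)/) {
    ($key, $value) = ($+{key}, $+{value});   # look up captures by label
}

print "$key -> $value\n";   # prints: color -> blue
```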

To group without capturing (non-capturing parentheses)

(?:....)
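A sketch of how a non-capturing group leaves the numbering of the remaining captures alone. The sample phrase is made up.

```perl
use strict;
use warnings;

my $phrase = 'that cat';   # hypothetical input

# (?:is|at) groups the alternation without taking a capture number.
my @grouped = $phrase =~ /th(?:is|at)\s+(\w+)/;

# With plain parentheses the alternation becomes $1 and the word becomes $2.
my @captured = $phrase =~ /th(is|at)\s+(\w+)/;

print "@grouped\n";    # prints: cat
print "@captured\n";   # prints: at cat
```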

Use a non-greedy match where it limits backtracking, to avoid a performance penalty

Avoid using the match variables $`, $&, and $', or use the /p flag

Once these match variables are used anywhere in a program, they incur a performance penalty on subsequent matches (on Perl versions before 5.20). To save the match data only for a specific match, add the suffix /p to the expression, m/..../p, and read ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} instead.
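A minimal sketch of the /p form, which needs Perl 5.10 or later. The input string is a made-up example.

```perl
use strict;
use warnings;

my $str = 'foo=bar';   # hypothetical input
my ($pre, $match, $post);

if ($str =~ /=/p) {
    # These replace $`, $&, and $' for a match made with /p.
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "$pre | $match | $post\n";   # prints: foo | = | bar
```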

 

Performance 

To test the performance of a regexp, use the Benchmark module

use Benchmark qw(timethese);

# first argument: number of iterations; second: hash of label => code to time
timethese($count, {
........
});
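A runnable sketch of such a comparison, timing a pattern padded with .* against the bare pattern. The labels padded and plain, the iteration count, and the input string are assumptions for illustration; actual timings depend on the machine and the Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

my $input = '     STRING     ';   # hypothetical input

# Both patterns succeed on this input; only the work done differs.
my $both_match = ($input =~ /.*STRING.*/ && $input =~ /STRING/) ? 1 : 0;

# Run each sub 100_000 times; timethese prints a timing line per label
# and returns a hash reference of Benchmark objects keyed by label.
my $results = timethese(100_000, {
    padded => sub { my $hit = $input =~ /.*STRING.*/ },
    plain  => sub { my $hit = $input =~ /STRING/ },
});
```

Passing a negative count instead tells Benchmark to run each sub for at least that many CPU seconds, which can give steadier numbers for very fast code.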

 Example

 ".*STRING.*" matches "     STRING     "