Wednesday, July 28, 2010

grep regular expressions

Special Characters

Here, we outline the special characters for grep. egrep uses extended regular expressions, which in GNU grep are no more functional than standard (basic) regular expressions; only the syntax differs, and the list of special characters grows. For example, a literal | in grep is written \| in egrep, and vice versa; there are other differences too, so check the man page for details. In grep, the following characters are considered special and need to be "escaped" to match literally:
*  \  .  [  ]  ^  $
Note that in basic regular expressions a $ sign loses its special meaning if characters follow it, and the caret ^ loses its special meaning if other characters precede it.
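As a small sketch of escaping and of anchors losing their meaning, the commands below run grep in its default (basic) mode over a few invented sample lines. The variable names and the sample text are illustrative assumptions, and the behaviour shown is what GNU grep does.

```shell
# Invented sample lines, one per row.
sample='a.b
axb
cost $5
x^y'

# An unescaped dot matches any single character: a.b and axb both match.
dot_any=$(printf '%s\n' "$sample" | grep 'a.b')

# Escaping the dot makes it literal: only a.b matches.
dot_literal=$(printf '%s\n' "$sample" | grep 'a\.b')

# A $ followed by other characters is an ordinary character.
dollar_mid=$(printf '%s\n' "$sample" | grep '$5')

# A ^ preceded by other characters is an ordinary character.
caret_mid=$(printf '%s\n' "$sample" | grep 'x^y')

printf '%s\n' "$dot_any" "$dot_literal" "$dollar_mid" "$caret_mid"
```

The patterns are single-quoted so that the shell does not try to expand $5 before grep sees it.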
Square brackets behave a little differently. The rules for square brackets go as follows:
  • A closing square bracket loses its special meaning if placed first in a list. For example, []12] matches ], 1, or 2.
  • A dash - loses its usual meaning inside lists if it is placed last.
  • A caret ^ loses its special meaning if it is not placed first.
  • Most special characters lose their meaning inside square brackets.
  • Outside square brackets, a * at the very beginning of the regular expression loses its special meaning.
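The bracket rules above can be tried out with a sketch like the following, again using basic grep on invented one-token lines. The variable names are hypothetical, and the last command relies on the leading * being treated as a literal, as GNU grep does in basic mode.

```shell
# Invented sample lines, one token per row.
sample='1
2
]
-
^
a
*star'

# ] placed first in the list is literal: matches the lines 1, 2 and ].
bracket_first=$(printf '%s\n' "$sample" | grep '[]12]')

# - placed last in the list is literal: matches the lines 1, 2 and -.
dash_last=$(printf '%s\n' "$sample" | grep '[12-]')

# ^ not placed first is literal: matches the lines 1 and ^.
caret_later=$(printf '%s\n' "$sample" | grep '[1^]')

# . and * are ordinary characters inside brackets: only *star matches.
inside_list=$(printf '%s\n' "$sample" | grep '[.*]')

# A leading * is an ordinary character in a basic regular expression.
leading_star=$(printf '%s\n' "$sample" | grep '*s')

printf '%s\n' "$bracket_first" "$dash_last" "$caret_later" "$inside_list" "$leading_star"
```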

A regular expression may be followed by one of several repetition operators:
? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
{n} The preceding item is matched exactly n times.
{n,} The preceding item is matched n or more times.
{n,m} The preceding item is matched at least n times, but not more than m times.
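
To see each repetition operator in action, here is a minimal sketch that uses grep -E (extended syntax, where the operators are written without backslashes) on an invented list of lines containing between zero and four b characters. The anchors ^ and $ force the whole line to match; the variable names are illustrative only.

```shell
# Invented sample lines: a, then zero to four b, then c.
sample='ac
abc
abbc
abbbc
abbbbc'

# ? : b is optional, matched at most once (ac, abc).
optional=$(printf '%s\n' "$sample" | grep -E '^ab?c$')

# * : zero or more b (all five lines).
zero_or_more=$(printf '%s\n' "$sample" | grep -E '^ab*c$')

# + : one or more b (every line except ac).
one_or_more=$(printf '%s\n' "$sample" | grep -E '^ab+c$')

# {n} : exactly two b (abbc).
exactly_two=$(printf '%s\n' "$sample" | grep -E '^ab{2}c$')

# {n,} : two or more b (abbc, abbbc, abbbbc).
two_or_more=$(printf '%s\n' "$sample" | grep -E '^ab{2,}c$')

# {n,m} : at least two but not more than three b (abbc, abbbc).
two_to_three=$(printf '%s\n' "$sample" | grep -E '^ab{2,3}c$')

printf '%s\n' "$two_to_three"
```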

In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions
\?, \+, \{, \|, \(, and \).
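
The correspondence between the two syntaxes can be sketched as below. The sample lines and variable names are invented for illustration, and the backslashed \+ and \| forms in basic mode are assumed to be available, as they are in GNU grep.

```shell
# Invented sample lines.
sample='cat
dog
a+b
aab'

# Basic syntax: a plain + is an ordinary character, so only a+b matches.
basic_plain_plus=$(printf '%s\n' "$sample" | grep 'a+b')

# Basic syntax: \+ means one or more, so only aab matches.
basic_escaped_plus=$(printf '%s\n' "$sample" | grep 'a\+b')

# Extended syntax: the roles are swapped.
ext_plain_plus=$(printf '%s\n' "$sample" | grep -E 'a+b')
ext_escaped_plus=$(printf '%s\n' "$sample" | grep -E 'a\+b')

# Alternation: \| in basic syntax corresponds to | in extended syntax.
basic_alt=$(printf '%s\n' "$sample" | grep 'cat\|dog')
ext_alt=$(printf '%s\n' "$sample" | grep -E 'cat|dog')

# Interval: \{2\} in basic syntax corresponds to {2} in extended syntax.
basic_interval=$(printf '%s\n' "$sample" | grep 'a\{2\}b')
ext_interval=$(printf '%s\n' "$sample" | grep -E 'a{2}b')

printf '%s\n' "$basic_alt"
```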

6 comments:

  1. only . and * have special meanings in grep

  2. This comment has been removed by the author.

  3. Which metacharacters have special meaning in grep:
    1. The repetition operators ?, +, and {} have no special meaning.
    * has special meaning.

    2. The anchoring operators ^ and $ have special meaning (they lose it when not used for positioning).

    3. Alternation | has no special meaning.

    4. The bracket expression [] has special meaning.

    5. Parentheses (): no special meaning.

    6. The operators . and \ have special meaning.

    ---------------------------------------
    The operators ., *, ^, $, and [] have special meaning, and so does \.

  4. In item 1, the repetition operators: ?, +, {} have no special meaning. * has special meaning.

    grep also has special groups such as [[:digit:]] and \w, and the -w option for matching whole words.

  5. The main differences between vi/sed and grep/gawk:

    In vim/sed, \( and \) store a pattern for later replay in the replacement. grep and gawk patterns are not used this way, although grep does accept back-references such as \1 inside the pattern itself.
    In grep, \( and \) are mainly used to group alternations; gawk uses the extended syntax, so it writes the group as plain ( and ).

    vim and sed difference: vim has \< and \>, indicating word boundaries.

    gawk and grep difference: in grep, { and } are ordinary characters, and the repetition meaning needs \{ and \}. In gawk, { and } are special characters.

  6. On "vim and sed difference: vim has \< and \>, indicating word boundaries": sed has the same thing, so this is not a difference.
