Saturday, 31 October 2015

Regular Expressions in C#

Regular Expressions
Regular expressions are patterns that can be used to match strings; in other words, a regular expression is a formula for matching strings that follow some pattern. Regular expressions can be considered a small language designed for manipulating text.

Regular expressions may be used to find one or more occurrences of a pattern of characters within a string. You may then replace the matched text with other characters or perform other tasks based on the results. These patterns can be simple or very complex. Regular expressions generally consist of two types of characters:
  1. Literal or normal characters such as "abcd123"
  2. Special characters that have a special meaning, such as "." or "$" or "^"
The special characters are what make regular expressions a very powerful means of manipulating strings and text.
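As a minimal sketch of finding and replacing text, assuming the .NET System.Text.RegularExpressions API: the class and method names below (RegexBasics, Contains, ReplaceAll) are illustrative only and not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexBasics
{
    // True if the pattern occurs anywhere in the input.
    public static bool Contains(string input, string pattern) =>
        Regex.IsMatch(input, pattern);

    // Replaces every occurrence of the pattern with the replacement text.
    public static string ReplaceAll(string input, string pattern, string replacement) =>
        Regex.Replace(input, pattern, replacement);

    public static void Main()
    {
        // "abcd123" contains only literal characters, so it matches itself.
        Console.WriteLine(Contains("id: abcd123", "abcd123"));   // True

        // "." is a special character; escape it as "\." to match a literal period.
        Console.WriteLine(ReplaceAll("a.b.c", @"\.", "-"));      // a-b-c
    }
}
```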

Meta-characters and Their Descriptions


.

Matches any single character except a newline. For example, the regular expression s.t matches the strings sat and sit, but not sight.
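The s.t example can be checked with a short sketch, assuming Regex.IsMatch from System.Text.RegularExpressions; DotDemo and MatchesSDotT are hypothetical names chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotDemo
{
    // True if "s", any one character, then "t" occurs somewhere in the input.
    public static bool MatchesSDotT(string input) => Regex.IsMatch(input, "s.t");

    public static void Main()
    {
        foreach (var word in new[] { "sat", "sit", "sight" })
        {
            Console.WriteLine(word + ": " + MatchesSDotT(word));
        }
        // prints sat: True, sit: True, sight: False (one per line)
    }
}
```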

$

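Matches the end of a line. For instance, the regular expression reason$ matches the end of the string "He has a reason" but not the string "He has his reasons". In .NET, $ matches only at the end of the whole string (or just before a final newline) unless RegexOptions.Multiline is specified.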

^

Matches the beginning of a line. For instance, the regular expression ^Where matches the beginning of the string "Where is my cap" but does not match "Do you know Where it is". In .NET, ^ matches only at the start of the whole string unless RegexOptions.Multiline is specified.
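A small sketch of both anchors, assuming the .NET Regex class; AnchorDemo and its methods are illustrative names. The last method uses RegexOptions.Multiline so that ^ matches at the start of every line rather than only at the start of the string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // True if the input ends with "reason".
    public static bool EndsWithReason(string s) => Regex.IsMatch(s, "reason$");

    // True if the input starts with "Where".
    public static bool StartsWithWhere(string s) => Regex.IsMatch(s, "^Where");

    // Counts the lines that start with "Where"; Multiline makes ^ match after each newline.
    public static int CountLinesStartingWithWhere(string text) =>
        Regex.Matches(text, "^Where", RegexOptions.Multiline).Count;

    public static void Main()
    {
        Console.WriteLine(EndsWithReason("He has a reason"));       // True
        Console.WriteLine(EndsWithReason("He has his reasons"));    // False
        Console.WriteLine(StartsWithWhere("Where is my cap"));      // True
        Console.WriteLine(StartsWithWhere("Do you know Where it is")); // False
        Console.WriteLine(CountLinesStartingWithWhere("Where is it?\nWhere is my cap")); // 2
    }
}
```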

*

Matches zero or more occurrences of the character or expression immediately preceding. For example, the regular expression .* matches any number of any characters, including none.
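To illustrate "zero or more", here is a hedged sketch using the .NET Regex class. The pattern ab*c and the helper AllMatches are assumptions introduced for this example only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StarDemo
{
    // Collects the text of every non-overlapping match, left to right.
    public static string[] AllMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // b* allows zero or more b's between a and c.
        Console.WriteLine(string.Join(", ", AllMatches("ac abc abbbc adc", "ab*c")));
        // prints: ac, abc, abbbc

        // .* consumes every character of a single-line input.
        Console.WriteLine(Regex.Match("any text at all", ".*").Value);
        // prints: any text at all
    }
}
```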

[ ]

Matches any one of the characters between the brackets. For example, the regular expression s[ia]t matches sat and sit, but not set.

[c1-c2]

Ranges of characters can be specified by using a hyphen. For example, the regular expression [0-9] matches any digit. Multiple ranges can be specified as well: the regular expression [A-Za-z] matches any uppercase or lowercase letter.

[^c1-c2]

To match any character except those in the range (the complement of the range), use the caret as the first character after the opening bracket. For example, the expression [^123a-z] matches any single character except 1, 2, 3, and lowercase letters.
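The three bracket forms can be tried out with the sketch below, assuming the .NET Regex class. CharClassDemo, MatchesSiaT, and AllMatches are illustrative names, and the sample inputs are made up for the demonstration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // True if the input contains "s", then "i" or "a", then "t".
    public static bool MatchesSiaT(string input) => Regex.IsMatch(input, "s[ia]t");

    // Collects the text of every non-overlapping match, left to right.
    public static string[] AllMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        foreach (var word in new[] { "sat", "sit", "set" })
        {
            Console.WriteLine(word + ": " + MatchesSiaT(word));
        }
        // prints sat: True, sit: True, set: False (one per line)

        // A range: each digit is a separate one-character match.
        Console.WriteLine(string.Join(", ", AllMatches("Room 42B", "[0-9]")));       // 4, 2

        // A complement: everything except 1, 2, 3 and lowercase letters.
        Console.WriteLine(string.Join(", ", AllMatches("a1B2c3D4", "[^123a-z]")));   // B, D, 4
    }
}
```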

|

Alternation: matches either the expression on its left or the expression on its right. For example, (him|her) matches the line "it belongs to him" and the line "it belongs to her" but does not match the line "it belongs to them."
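A brief sketch of alternation, assuming the .NET Regex class; AlternationDemo and its method names are hypothetical. It also shows one caveat worth knowing: without word boundaries, (him|her) will match inside longer words such as "others", so the second helper wraps the group in \b.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Returns the matched alternative, or an empty string when nothing matches.
    public static string FindPronoun(string line) =>
        Regex.Match(line, "(him|her)").Value;

    // Same idea, but only accepts "him" or "her" as whole words.
    public static string FindWholeWordPronoun(string line) =>
        Regex.Match(line, @"\b(him|her)\b").Value;

    public static void Main()
    {
        Console.WriteLine(FindPronoun("it belongs to him"));            // him
        Console.WriteLine(FindPronoun("it belongs to her"));            // her
        Console.WriteLine(FindPronoun("it belongs to them.") == "");    // True

        // "others" contains the letters "her", so the unanchored form matches it.
        Console.WriteLine(FindPronoun("it belongs to the others"));             // her
        Console.WriteLine(FindWholeWordPronoun("it belongs to the others") == ""); // True
    }
}
```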

+

Matches one or more occurrences of the character or regular expression immediately preceding. For example, the regular expression 9+ matches 9, 99, 999.

?

Matches 0 or 1 occurrence of the character or regular expression immediately preceding. For example, the regular expression colou?r matches both color and colour.
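The + and ? quantifiers can be compared side by side in a sketch like the following, assuming the .NET Regex class. The pattern colou?r and the names PlusQuestionDemo, AllMatches, and IsColorWord are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlusQuestionDemo
{
    // Collects the text of every non-overlapping match, left to right.
    public static string[] AllMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    // True if the input contains "color" or "colour" (the "u" is optional).
    public static bool IsColorWord(string input) => Regex.IsMatch(input, "colou?r");

    public static void Main()
    {
        // 9+ needs at least one 9 and takes as many consecutive 9s as it can.
        Console.WriteLine(string.Join(", ", AllMatches("9 99 999 8", "9+")));  // 9, 99, 999

        Console.WriteLine(IsColorWord("color"));    // True
        Console.WriteLine(IsColorWord("colour"));   // True
        Console.WriteLine(IsColorWord("colouur"));  // False: ? allows at most one "u"
    }
}
```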

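{i}

Matches exactly i occurrences of the preceding character or expression. For example, the expression A[0-9]{3} matches "A" followed by exactly 3 digits, such as A123. Given A1234, an unanchored search still finds A123 inside it; to reject A1234, anchor the pattern as ^A[0-9]{3}$.

{i,j}

Matches between i and j occurrences (inclusive) of the preceding character or expression. For example, the expression [0-9]{4,6} matches any sequence of 4, 5, or 6 digits.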
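The difference between an unanchored and an anchored count can be seen in this sketch, which assumes the .NET Regex class. CountDemo and its method names are illustrative, and the anchored pattern ^A[0-9]{3}$ is an addition made for the comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CountDemo
{
    // Returns the first substring that looks like "A" plus exactly three digits.
    public static string FindCode(string input) =>
        Regex.Match(input, "A[0-9]{3}").Value;

    // True only if the whole input is "A" plus exactly three digits.
    public static bool IsExactCode(string input) =>
        Regex.IsMatch(input, "^A[0-9]{3}$");

    // Returns the first run of 4 to 6 digits (greedy, so it prefers 6).
    public static string FindDigitRun(string input) =>
        Regex.Match(input, "[0-9]{4,6}").Value;

    public static void Main()
    {
        Console.WriteLine(FindCode("A1234"));       // A123
        Console.WriteLine(IsExactCode("A123"));     // True
        Console.WriteLine(IsExactCode("A1234"));    // False
        Console.WriteLine(FindDigitRun("1234567")); // 123456
        Console.WriteLine(FindDigitRun("123") == ""); // True: fewer than 4 digits
    }
}
```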

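\d Matches a digit character. Equivalent to [0-9] for ASCII text; in .NET, \d also matches other Unicode decimal digits unless RegexOptions.ECMAScript is specified.

\D Matches a non-digit character. Equivalent to [^0-9], with the same caveat.

\w Matches any word character, including underscore. Equivalent to [A-Za-z0-9_] for ASCII text; in .NET, \w also matches Unicode letters and digits unless RegexOptions.ECMAScript is specified.

\W Matches any non-word character. Equivalent to [^A-Za-z0-9_], with the same caveat.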
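A short sketch of the shorthand classes, assuming the .NET Regex class; ShorthandDemo and its methods are hypothetical names. The last two calls show the .NET-specific behaviour of \d on a non-ASCII digit (U+0663, Arabic-Indic digit three), with and without RegexOptions.ECMAScript.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ShorthandDemo
{
    // Removes every non-digit character.
    public static string DigitsOnly(string s) => Regex.Replace(s, @"\D", "");

    // Removes every non-word character (keeps letters, digits, underscore).
    public static string WordCharsOnly(string s) => Regex.Replace(s, @"\W", "");

    // True if the input contains a character that \d accepts under the given options.
    public static bool HasDigit(string s, RegexOptions options) =>
        Regex.IsMatch(s, @"\d", options);

    public static void Main()
    {
        Console.WriteLine(DigitsOnly("Order 66, item_7!"));     // 667
        Console.WriteLine(WordCharsOnly("Order 66, item_7!"));  // Order66item_7

        Console.WriteLine(HasDigit("\u0663", RegexOptions.None));        // True
        Console.WriteLine(HasDigit("\u0663", RegexOptions.ECMAScript));  // False
    }
}
```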

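\b Matches a word boundary, that is, the position between a word character and a non-word character (such as a space), or the start or end of the string. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".

\B Matches a non-word boundary. For example, in "never early", "ea*r\B" matches the "ear" of "early" but not the "er" at the end of "never".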
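Both boundary examples can be verified with the sketch below, assuming the .NET Regex class; BoundaryDemo and its method names are illustrative. The index returned by the second method shows which part of "never early" was matched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // True if "er" occurs at the end of a word.
    public static bool EndsWordWithEr(string input) => Regex.IsMatch(input, @"er\b");

    // Finds "e", optional a's, then "r" where the "r" is NOT at the end of a word.
    public static Match FindInsideWord(string input) => Regex.Match(input, @"ea*r\B");

    public static void Main()
    {
        Console.WriteLine(EndsWordWithEr("never"));  // True
        Console.WriteLine(EndsWordWithEr("verb"));   // False

        Match m = FindInsideWord("never early");
        Console.WriteLine(m.Value);  // ear
        Console.WriteLine(m.Index);  // 6, the start of "early"
    }
}
```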
