A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs.
You can refer to the tables of character literals, operators, and constructs below. Use them to build effective patterns for text processing:
- Single characters ↓
- Control characters ↓
- Character classes ↓
- Quantifiers ↓
- Anchors ↓
- Groups ↓
- Inline options ↓
- Backreferences ↓
- Alternation ↓
- Substitution ↓
- Comments ↓
- Supported Unicode categories ↓
- Regular expression operations ↓
- a. Pattern matching with Regex objects ↓
- b. Pattern matching with static methods ↓
- c. Finding and replacing matched patterns ↓
1. Single Characters ↑
Use | To match any character |
[set] | In that set |
[^set] | Not in that set |
[a-z] | In the a-z range |
[^a-z] | Not in the a-z range |
. | Any except \n (new line) |
\char | Escaped special character |
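As a rough illustration of the table above, the following sketch counts matches for a set, a negated set, the dot, and an escaped dot. The sample input is made up, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input, chosen only for illustration.
string input = "cat cot cut c.t";

// [ao] matches one character from the set: "cat" and "cot".
int inSet = Regex.Matches(input, "c[ao]t").Count;

// [^ao] matches one character NOT in the set: "cut" and "c.t".
int notInSet = Regex.Matches(input, "c[^ao]t").Count;

// . matches any character except \n, so all four words match.
int anyChar = Regex.Matches(input, "c.t").Count;

// \. is an escaped special character: only the literal "c.t" matches.
int literalDot = Regex.Matches(input, @"c\.t").Count;

Console.WriteLine($"{inSet} {notInSet} {anyChar} {literalDot}"); // prints "2 2 4 1"
```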
2. Control Characters ↑
Use | To match | Unicode |
\t | Horizontal tab | \u0009 |
\v | Vertical tab | \u000B |
\b | Backspace | \u0008 |
\e | Escape | \u001B |
\r | Carriage return | \u000D |
\f | Form feed | \u000C |
\n | New line | \u000A |
\a | Bell (alarm) | \u0007 |
\c char | ASCII control character | - |
3. Character Classes ↑
Use | To match character |
\p{ctgry} | In that Unicode category or block |
\P{ctgry} | Not in that Unicode category or block |
\w | Word character |
\W | Non-word character |
\d | Decimal digit |
\D | Not a decimal digit |
\s | White-space character |
\S | Non-white-space char |
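The character classes above can be tried out with a small sketch like the one below. The input string is an assumption made for the example (the first letter is written as the escape \u00DC, an uppercase U with diaeresis), and the snippet assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input; \u00DC is an uppercase non-ASCII letter.
string input = "\u00DCnal paid 42 EUR";

// \d+ matches a run of decimal digits.
string firstNumber = Regex.Match(input, @"\d+").Value;

// \w+ matches a run of word characters; by default .NET treats
// Unicode letters as word characters, so the first word is matched whole.
string firstWord = Regex.Match(input, @"\w+").Value;

// \p{Lu} matches any character in the Unicode category "Letter, uppercase".
int upperCount = Regex.Matches(input, @"\p{Lu}").Count;

// \s matches white-space characters; here they are removed.
string noSpaces = Regex.Replace(input, @"\s", "");

Console.WriteLine(firstNumber); // prints "42"
Console.WriteLine(upperCount);  // prints "4"
```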
4. Quantifiers ↑
Greedy | Lazy | Matches |
* | *? | 0 or more times |
+ | +? | 1 or more times |
? | ?? | 0 or 1 time |
{n} | {n}? | Exactly n times |
{n,} | {n,}? | At least n times |
{n,m} | {n,m}? | From n to m times |
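The difference between the greedy and lazy columns is easiest to see side by side. This is a minimal sketch with made-up inputs, assuming C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input used only to contrast greedy and lazy matching.
string input = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ takes as much as possible, so the match runs to the last '>'.
string greedy = Regex.Match(input, "<.+>").Value;

// Lazy: .+? takes as little as possible, so the match stops at the first '>'.
string lazy = Regex.Match(input, "<.+?>").Value;

// {n}: exactly four digits.
string exactly = Regex.Match("2015-11-21", @"\d{4}").Value;

// {n,m} prefers the upper bound, {n,m}? prefers the lower bound.
string range = Regex.Match("aaaaa", "a{2,3}").Value;
string lazyRange = Regex.Match("aaaaa", "a{2,3}?").Value;

Console.WriteLine(greedy);    // prints "<b>bold</b> and <i>italic</i>"
Console.WriteLine(lazy);      // prints "<b>"
Console.WriteLine(range);     // prints "aaa"
Console.WriteLine(lazyRange); // prints "aa"
```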
5. Anchors ↑
Use | To specify position |
^ | At start of string or line |
\A | At start of string |
\z | At end of string |
\Z | At end (or before \n at end) of string |
$ | At end (or before \n at end) of string or line |
\G | Where previous match ended |
\b | On word boundary |
\B | Not on word boundary |
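The following sketch illustrates a few of the anchors above: how ^ changes under multiline mode, how \Z and \z differ when the input ends with a new line, and how \b restricts a match to whole words. All inputs are invented for the example; C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

string twoLines = "first line\nsecond line";

// Without options, ^ matches only at the start of the string.
int startDefault = Regex.Matches(twoLines, @"^\w+").Count;

// With RegexOptions.Multiline, ^ also matches at the start of each line.
int startMultiline = Regex.Matches(twoLines, @"^\w+", RegexOptions.Multiline).Count;

// \Z matches at the end or before a final \n; \z matches only at the very end.
bool upperZ = Regex.IsMatch("abc\n", @"abc\Z");
bool lowerZ = Regex.IsMatch("abc\n", @"abc\z");

// \b requires a word boundary, so the "cat" inside "concat" is left alone.
string replaced = Regex.Replace("cat concat cat", @"\bcat\b", "dog");

Console.WriteLine($"{startDefault} {startMultiline}"); // prints "1 2"
Console.WriteLine($"{upperZ} {lowerZ}");               // prints "True False"
Console.WriteLine(replaced);                           // prints "dog concat dog"
```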
6. Groups ↑
Use | To define |
(exp) | Indexed group |
(?<name>exp) | Named group |
(?<name1-name2>exp) | Balancing group |
(?:exp) | Noncapturing group |
(?=exp) | Zero-width positive lookahead |
(?!exp) | Zero-width negative lookahead |
(?<=exp) | Zero-width positive lookbehind |
(?<!exp) | Zero-width negative lookbehind |
(?>exp) | Non-backtracking (greedy) |
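A short sketch of several grouping constructs from the table follows. It covers indexed, named, and noncapturing groups plus lookahead and lookbehind; balancing groups and non-backtracking groups are left out to keep it brief. The inputs are illustrative assumptions, and C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// Named groups: read captures by name.
Match date = Regex.Match("Created: 21/11/2015",
    @"(?<day>\d{2})/(?<month>\d{2})/(?<year>\d{4})");
string year = date.Groups["year"].Value;

// Indexed groups: numbered from 1, left to right.
Match pair = Regex.Match("key=value", @"(\w+)=(\w+)");
string key = pair.Groups[1].Value;

// Noncapturing group: (?:...) groups without taking a number,
// so the first numbered group is now the value.
Match nonCapturing = Regex.Match("key=value", @"(?:\w+)=(\w+)");
string value = nonCapturing.Groups[1].Value;

// Positive lookahead: digits that are followed by "em" (the "em" is not consumed).
string beforeEm = Regex.Match("100px 200em", @"\d+(?=em)").Value;

// Positive lookbehind: digits that are preceded by a dollar sign.
string afterDollar = Regex.Match("cost: $45", @"(?<=\$)\d+").Value;

Console.WriteLine($"{year} {key} {value} {beforeEm} {afterDollar}");
// prints "2015 key value 200 45"
```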
7. Inline Options ↑
Option | Effect on match |
i | Case-insensitive |
m | Multiline mode |
n | Explicit (named) |
s | Single-line mode |
x | Ignore white space |
Use | To |
(?imnsx-imnsx) | Set or disable the specified options |
(?imnsx-imnsx:exp) | Set or disable the specified options within the expression |
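Inline options can be sketched as below: one option applied to the rest of the pattern, one scoped to a subexpression, and the s and n options. The strings are made up for the example, and C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// (?i) turns on case-insensitive matching for the rest of the pattern.
bool wholePattern = Regex.IsMatch("HELLO world", "(?i)hello WORLD");

// (?i:...) limits the option to the enclosed expression only.
bool scopedHit = Regex.IsMatch("HELLO world", "(?i:hello) world");
bool scopedMiss = Regex.IsMatch("HELLO WORLD", "(?i:hello) world");

// (?s) single-line mode: . also matches \n.
bool dotDefault = Regex.IsMatch("a\nb", "a.b");
bool dotSingleLine = Regex.IsMatch("a\nb", "(?s)a.b");

// (?n) explicit capture: unnamed parentheses no longer capture,
// so only group 0 (the whole match) remains.
int groupsDefault = Regex.Match("ab", "(a)(b)").Groups.Count;
int groupsExplicit = Regex.Match("ab", "(?n)(a)(b)").Groups.Count;

Console.WriteLine($"{wholePattern} {scopedHit} {scopedMiss}"); // prints "True True False"
Console.WriteLine($"{dotDefault} {dotSingleLine}");            // prints "False True"
Console.WriteLine($"{groupsDefault} {groupsExplicit}");        // prints "3 1"
```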
8. Backreferences ↑
Use | To match |
\n | Indexed group |
\k<name> | Named group |
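A common use of backreferences is finding a repeated word. The sketch below shows both the indexed and the named form on an invented sentence, assuming C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input containing a doubled word.
string input = "this is is a test";

// \1 must match the same text that group 1 captured.
string byIndex = Regex.Match(input, @"\b(\w+)\s+\1\b").Value;

// \k<word> does the same for the group named "word".
string byName = Regex.Match(input, @"\b(?<word>\w+)\s+\k<word>\b").Value;

// Collapse the doubled word to a single occurrence.
string cleaned = Regex.Replace(input, @"\b(?<word>\w+)\s+\k<word>\b", "${word}");

Console.WriteLine(byIndex); // prints "is is"
Console.WriteLine(cleaned); // prints "this is a test"
```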
9. Alternation ↑
Use | To match |
a|b | Either a or b |
(?(exp)yes|no) | yes if exp is matched; no if exp isn't matched |
(?(name)yes|no) | yes if name is matched; no if name isn't matched |
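The conditional forms are less familiar than plain alternation, so here is a minimal sketch of all three. The patterns and inputs are illustrative assumptions, not taken from the reference. In the name-based example the "no" branch is omitted, which means "match nothing" when the group did not participate. C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// a|b: either alternative is accepted.
int spellings = Regex.Matches("gray grey", "gr(a|e)y").Count;

// (?(name)yes): require a closing parenthesis only if the
// group named "open" captured an opening one.
string balanced = @"^(?<open>\()?\d+(?(open)\))$";
bool plain = Regex.IsMatch("123", balanced);
bool wrapped = Regex.IsMatch("(123)", balanced);
bool missingClose = Regex.IsMatch("(123", balanced);
bool strayClose = Regex.IsMatch("123)", balanced);

// (?(exp)yes|no): if a digit comes next, expect three digits,
// otherwise expect two lowercase letters.
string byExpression = @"^(?(\d)\d{3}|[a-z]{2})$";
bool threeDigits = Regex.IsMatch("123", byExpression);
bool twoLetters = Regex.IsMatch("ab", byExpression);
bool twoDigits = Regex.IsMatch("12", byExpression);

Console.WriteLine(spellings);                                           // prints "2"
Console.WriteLine($"{plain} {wrapped} {missingClose} {strayClose}");    // prints "True True False False"
Console.WriteLine($"{threeDigits} {twoLetters} {twoDigits}");           // prints "True True False"
```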
10. Substitution ↑
Use | To substitute |
$n | Substring matched by group number n |
${name} | Substring matched by group name |
$$ | Literal $ character |
$& | Copy of whole match |
$` | Text before the match |
$' | Text after the match |
$+ | Last captured group |
$_ | Entire input string |
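Substitutions are used in the replacement string of Regex.Replace. The sketch below reformats the date shown on this page with numbered and named groups, and shows $& and $$. Variable names are arbitrary; C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// $n: reorder day/month/year using group numbers.
string byNumber = Regex.Replace("21/11/2015",
    @"(\d{2})/(\d{2})/(\d{4})", "$3-$2-$1");

// ${name}: the same reordering using group names.
string byName = Regex.Replace("21/11/2015",
    @"(?<d>\d{2})/(?<m>\d{2})/(?<y>\d{4})", "${y}-${m}-${d}");

// $&: copy of the whole match, here wrapped in brackets.
string bracketed = Regex.Replace("price 45", @"\d+", "[$&]");

// $$: a literal dollar sign, followed by the whole match.
string withDollar = Regex.Replace("price 45", @"\d+", "$$$&");

Console.WriteLine(byNumber);   // prints "2015-11-21"
Console.WriteLine(bracketed);  // prints "price [45]"
Console.WriteLine(withDollar); // prints "price $45"
```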
11. Comments ↑
Use | To |
(?# comment) | Add inline comment |
# | Add x-mode comment |
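Both comment styles can be sketched as follows. The x-mode example turns the option on through RegexOptions.IgnorePatternWhitespace (the API counterpart of the inline x option), so white space in the pattern is ignored and # starts a comment that runs to the end of the line. The pattern itself is an illustrative assumption; C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// x-mode: the pattern can be spread over several lines and annotated.
string pattern = @"
    (?<year>\d{4})   # four-digit year
    -                # literal hyphen
    (?<month>\d{2})  # two-digit month
";
Match m = Regex.Match("2015-11", pattern, RegexOptions.IgnorePatternWhitespace);
string year = m.Groups["year"].Value;
string month = m.Groups["month"].Value;

// (?# ...) inline comments work without any option.
bool inline = Regex.IsMatch("2015-11", @"^\d{4}(?# year)-\d{2}(?# month)$");

Console.WriteLine($"{year} {month} {inline}"); // prints "2015 11 True"
```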
12. Supported Unicode Categories ↑
Category | Description |
Lu | Letter, uppercase |
Ll | Letter, lowercase |
Lt | Letter, title case |
Lm | Letter, modifier |
Lo | Letter, other |
L | Letter, all |
Mn | Mark, nonspacing combining |
Mc | Mark, spacing combining |
Me | Mark, enclosing combining |
M | Mark, all diacritic |
Nd | Number, decimal digit |
Nl | Number, letterlike |
No | Number, other |
N | Number, all |
Pc | Punctuation, connector |
Pd | Punctuation, dash |
Ps | Punctuation, opening mark |
Pe | Punctuation, closing mark |
Pi | Punctuation, initial quote mark |
Pf | Punctuation, final quote mark |
Po | Punctuation, other |
P | Punctuation, all |
Sm | Symbol, math |
Sc | Symbol, currency |
Sk | Symbol, modifier |
So | Symbol, other |
S | Symbol, all |
Zs | Separator, space |
Zl | Separator, line |
Zp | Separator, paragraph |
Z | Separator, all |
Cc | Control code |
Cf | Format control character |
Cs | Surrogate code point |
Co | Private-use character |
Cn | Unassigned |
C | Control characters, all |
13. Regular Expression Operations ↑
a. Pattern matching with Regex objects ↑
To initialize with | Use constructor |
Regular exp | Regex(String) |
+ options | Regex(String, RegexOptions) |
+ time-out | Regex(String, RegexOptions, TimeSpan) |
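The three ways of constructing a Regex object might look like this in practice. The third constructor takes a TimeSpan as its time-out argument. The patterns and the one-second time-out are arbitrary choices for the example; C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern only.
var plain = new Regex(@"\d+");

// Pattern plus options.
var withOptions = new Regex("hello", RegexOptions.IgnoreCase);

// Pattern, options, and a match time-out; a match that runs longer
// throws RegexMatchTimeoutException.
var withTimeout = new Regex(@"\d+", RegexOptions.None, TimeSpan.FromSeconds(1));

Console.WriteLine(plain.Match("abc 123").Value);  // prints "123"
Console.WriteLine(withOptions.IsMatch("HELLO"));  // prints "True"
Console.WriteLine(withTimeout.MatchTimeout);      // prints "00:00:01"
```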
b. Pattern matching with static methods ↑
To | Use method |
Validate match | Regex.IsMatch |
Retrieve single match | Regex.Match (first) |
| Match.NextMatch (next) |
Retrieve all matches | Regex.Matches |
Replace match | Regex.Replace |
Divide text | Regex.Split |
Handle char escapes | Regex.Escape |
| Regex.Unescape |
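Each static method in the table can be exercised in one line. The inputs below are invented for the example; C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "a1 b2 c3";

// Validate: does the whole string consist of four digits?
bool valid = Regex.IsMatch("2015", @"^\d{4}$");

// Retrieve the first match, then step to the next one.
Match first = Regex.Match(input, @"\w\d");
Match second = first.NextMatch();

// Retrieve all matches at once.
int count = Regex.Matches(input, @"\w\d").Count;

// Replace every digit.
string replaced = Regex.Replace(input, @"\d", "#");

// Split on a comma or semicolon followed by optional white space.
string[] parts = Regex.Split("a, b;c", @"[,;]\s*");

// Escape metacharacters so the text is matched literally, and undo it.
string escaped = Regex.Escape("1+1=2");
string unescaped = Regex.Unescape(escaped);

Console.WriteLine($"{first.Value} {second.Value} {count}"); // prints "a1 b2 3"
Console.WriteLine(replaced);                                // prints "a# b# c#"
Console.WriteLine(string.Join("|", parts));                 // prints "a|b|c"
Console.WriteLine(escaped);                                 // prints "1\+1=2"
```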
c. Finding and replacing matched patterns ↑
To get | Use Regex API |
Group names | GetGroupNames |
| GroupNameFromNumber |
Group numbers | GetGroupNumbers |
| GroupNumberFromName |
Expression | ToString |
Options | Options |
Time-out | MatchTimeout |
Cache size | CacheSize |
Direction | RightToLeft |
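The members above can be inspected on a Regex instance, as in the sketch below. Note that in .NET the two lookup methods are named GroupNameFromNumber and GroupNumberFromName, and that unnamed groups are numbered before named ones. The pattern is an arbitrary example; C# 9 or later is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// One named group ("key") and one unnamed group.
var regex = new Regex(@"(?<key>\w+)=(\w+)", RegexOptions.IgnoreCase);

// Group 0 is the whole match; the unnamed group gets number 1,
// and the named group is numbered after it.
string[] names = regex.GetGroupNames();
int[] numbers = regex.GetGroupNumbers();
int keyNumber = regex.GroupNumberFromName("key");
string nameOfTwo = regex.GroupNameFromNumber(2);

// The original pattern, the options, and the search direction.
string expression = regex.ToString();
RegexOptions options = regex.Options;
bool rightToLeft = regex.RightToLeft;

// CacheSize is static: the number of compiled patterns cached
// for the static Regex methods.
int cacheSize = Regex.CacheSize;

Console.WriteLine(string.Join(",", names)); // prints "0,1,key"
Console.WriteLine(keyNumber);               // prints "2"
Console.WriteLine(expression);              // prints "(?<key>\w+)=(\w+)"
```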
Reference: MSDN
Created: 21/11/2015
[C# Guide] Regular Expression Language