author | Vsevolod Stakhov <vsevolod@highsecure.ru> | 2016-04-09 13:28:21 +0100 |
---|---|---|
committer | Vsevolod Stakhov <vsevolod@highsecure.ru> | 2016-04-09 13:28:21 +0100 |
commit | f998a65903c03af833e5947afb67de874abe6969 (patch) | |
tree | 3a80288286bfc655dfac536d85b6917501d22f32 /doc/markdown | |
parent | 2bec49d2580bcda5c7d143b609b88e5076a37a2d (diff) | |
[Doc] Improve regexp module documentation
Diffstat (limited to 'doc/markdown')
-rw-r--r-- | doc/markdown/modules/regexp.md | 12 |
1 files changed, 9 insertions, 3 deletions
diff --git a/doc/markdown/modules/regexp.md b/doc/markdown/modules/regexp.md
index a1a694f33..f08079bff 100644
--- a/doc/markdown/modules/regexp.md
+++ b/doc/markdown/modules/regexp.md
@@ -47,7 +47,8 @@ Rspamd support the following components within expressions:

 In rspamd, regular expressions could match different parts of messages:

-* Headers (should be `Header-Name=/regexp/flags`)
+* Headers (should be `Header-Name=/regexp/flags`), mime headers
+* Full headers string
 * Textual mime parts
 * Raw messages
 * URLs
@@ -55,10 +56,14 @@ In rspamd, regular expressions could match different parts of messages:

 The match type is defined by special flags after the last `/` symbol:

 * `H` - header regexp
+* `X` - undecoded header regexp (e.g. without quoted-printable decoding)
+* `B` - MIME header regexp (applied for headers in MIME parts only)
+* `R` - full headers content (applied for all headers undecoded and for the message only - **not** including MIME headers)
 * `M` - raw message regexp
 * `P` - part regexp
 * `U` - URL regexp

+
 We strongly discourage from using of raw message regexps as they are expensive and should be replaced by [trie](trie.md) rules if possible.

@@ -66,8 +71,9 @@ Each regexp also supports the following flags:

 * `i` - ignore case
 * `u` - use utf8 regexp
-* `m` - multiline regexp
-* `x` - extended regexp
+* `m` - multiline regexp - treat string as multiple lines. That is, change "^" and "$" from matching the start of the string's first line and the end of its last line to matching the start and end of each line within the string
+* `x` - extended regexp - this flag tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class. You can use this to break up your regular expression into (slightly) more readable parts. Also, the # character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line.
+* `s` - dotall regexp - treat string as single line. That is, change `.` to match any character whatsoever, even a newline, which normally it would not match. Used together, as `/ms`, they let the `.` match any character whatsoever, while still allowing `^` and `$` to match, respectively, just after and just before newlines within the string.
 * `O` - do not optimize regexp (rspamd optimizes regexps by default)

 ### Internal functions
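
The `m`, `s` and `x` flag descriptions added by this patch can be tried out with a short sketch. It uses Python's `re` module purely as a stand-in: rspamd is not involved, and its regexp engine may differ in details. The sample text and variable names are made up for illustration; `re.MULTILINE`, `re.DOTALL` and `re.VERBOSE` are assumed to correspond to the `m`, `s` and `x` behaviour described in the patch.

```python
import re

# Hypothetical sample text, not taken from rspamd.
text = "Subject: hello\nFrom: spammer\nbody line"

# Without `m`, ^ matches only at the very start of the string.
no_m = re.findall(r"^\w+", text)                    # ['Subject']
# With `m`, ^ also matches just after each newline.
with_m = re.findall(r"^\w+", text, re.MULTILINE)    # ['Subject', 'From', 'body']

# Without `s`, `.` refuses to match a newline; with `s` it crosses it.
no_s = re.search(r"hello.From", text)               # None
with_s = re.search(r"hello.From", text, re.DOTALL)  # matches 'hello\nFrom'

# `ms` together: ^ anchors at a line start, while `.*` runs across
# newlines to the end of the string.
with_ms = re.findall(r"^From:.*", text, re.MULTILINE | re.DOTALL)

# `x`: unescaped whitespace is ignored and # starts a comment,
# so the pattern can be split into annotated parts.
header_re = re.compile(r"""
    ^ (\w+) :    # header name up to the colon
    \s* (.*) $   # header value up to the end of the line
""", re.VERBOSE | re.MULTILINE)
pairs = header_re.findall(text)  # [('Subject', 'hello'), ('From', 'spammer')]
```

Note that with `ms` the `.*` swallows everything after `From:`, including the following lines, which is usually why `s` is enabled only when a match is meant to span line breaks.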