author | Vsevolod Stakhov <vsevolod@highsecure.ru> | 2016-07-09 11:17:18 +0100 |
---|---|---|
committer | Vsevolod Stakhov <vsevolod@highsecure.ru> | 2016-07-09 11:17:18 +0100 |
commit | 14803e9faeefeee69e97902573f3e367ceaf9744 (patch) | |
tree | a2a27b8032f2e4c96d8801c1436ec25ab465c9e7 /doc/markdown/modules | |
parent | 2d2a741df6954e042ae2f5ce6c1a66c2d61acc11 (diff) | |
[Doc] Documentation now lives in rspamd.com repo
Diffstat (limited to 'doc/markdown/modules')
24 files changed, 0 insertions, 1689 deletions
diff --git a/doc/markdown/modules/chartable.md b/doc/markdown/modules/chartable.md deleted file mode 100644 index 5458427a0..000000000 --- a/doc/markdown/modules/chartable.md +++ /dev/null @@ -1,10 +0,0 @@ -# Chartable module - -This module counts the characters that belong to different [Unicode scripts](http://www.unicode.org/reports/tr24/). It then evaluates the number of script changes divided by the total number of Unicode characters; for example, 'a網絡a' is treated as 2 script changes: from Latin to Chinese and from Chinese back to Latin. If the result of this division is higher than the threshold then a symbol is inserted. By default the threshold is `0.1`, meaning that script changes occur for approximately 10% of characters. - -~~~ucl -chartable { - symbol = "R_CHARSET_MIXED"; - threshold = 0.1; -} -~~~ diff --git a/doc/markdown/modules/dcc.md b/doc/markdown/modules/dcc.md deleted file mode 100644 index 36931ac3a..000000000 --- a/doc/markdown/modules/dcc.md +++ /dev/null @@ -1,40 +0,0 @@ -# DCC module - -This module performs [DCC](http://www.dcc-servers.net/dcc/) lookups to determine -the *bulkiness* of a message (e.g. how many recipients have seen it). - -Identifying bulk messages is very useful in composite rules, e.g. if a message is -from a freemail domain *AND* the message is reported as bulk by DCC then you can -be sure the message is spam and can assign a greater weight to it. - -Please view the License terms on the DCC website before you enable this module. - -## Module configuration - -This module requires that you have the `dccifd` daemon configured, running and -working correctly. To do this you must download and build the -[latest DCC client](https://www.dcc-servers.net/dcc/source/dcc.tar.Z). Once installed, edit -`/var/dcc/dcc_conf`, set `DCCIFD_ENABLE=on`, `DCCM_LOG_AT=NEVER` and -`DCCM_REJECT_AT=MANY`, then start the daemon by running `/var/dcc/libexec/rcDCC start`. 
- -Once the `dccifd` daemon is started it will listen on the UNIX domain socket `/var/dcc/dccifd` -and all you have to do is tell rspamd where `dccifd` is listening: - -~~~ucl -dcc { - host = "/var/dcc/dccifd"; - # Port is only required if `dccifd` listens on a TCP socket - # port = 1234 -} -~~~ - -Once this module is configured it will write the DCC output to the rspamd log as each -message is scanned: - -````` -Apr 5 14:19:53 mail1-ewh rspamd: (normal) lua; dcc.lua:98: sending to dcc: client=217.78.2.204#015DNSERROR helo="003b046f.slimabs.top" envfrom="23SecondAbs@slimabs.top" envrcpt="xxxx@xxxx.com" -Apr 5 14:19:53 mail1-ewh rspamd: (normal) lua; dcc.lua:65: DCC result=R disposition=R header="X-DCC--Metrics: xxxxx.xxxx.com 1282; bulk Body=1 Fuz1=1 Fuz2=many" -````` - -Any messages that DCC returns a *reject* result for (based on the configured `DCCM_REJECT_AT` -value) will cause the symbol `DCC_BULK` to fire. diff --git a/doc/markdown/modules/dkim.md b/doc/markdown/modules/dkim.md deleted file mode 100644 index 48e589386..000000000 --- a/doc/markdown/modules/dkim.md +++ /dev/null @@ -1,32 +0,0 @@ -# DKIM module - -This module checks [DKIM](http://www.dkim.org/) signatures for emails scanned. -DKIM signatures can establish that this specific message has been signed by a trusted -relay. For example, if a message comes from `gmail.com` then a valid DKIM signature -means that this message was definitely signed by `gmail.com` (unless the `gmail.com` private -key has been compromised, which is unlikely). - -## Principles of work - -Rspamd can deal with many types of DKIM signatures and message canonicalisation. -The major difficulty with DKIM is line endings: many MTAs treat them differently, which -leads to broken signatures. Basically, rspamd treats all line endings as `CR+LF`, which -is compatible with most DKIM implementations. 
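The line-ending handling described above can be illustrated with a short Python sketch. This is not rspamd's implementation; the function name `normalize_crlf` is an assumption made for this example, and it only shows the idea of treating every line ending as `CR+LF` before a body is hashed.

```python
import re


def normalize_crlf(body: bytes) -> bytes:
    """Rewrite CRLF, bare CR and bare LF line endings as CRLF.

    Hypothetical helper for illustration only: the alternation tries
    the two-byte CRLF first, so an existing CRLF is left unchanged.
    """
    return re.sub(rb"\r\n|\r|\n", b"\r\n", body)
```

Because an existing `CR+LF` is matched as a unit, applying the function twice gives the same result as applying it once.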
- -## Configuration - -The DKIM module has several useful configuration options: - -- `dkim_cache_size` (or `expire`) - maximum size of the DKIM key cache -- `whitelist` - a map of domains that should not be checked with DKIM (e.g. if those domains have a totally broken DKIM signer) -- `domains` - a map of domains that should have stricter scores for DKIM violations -- `strict_multiplier` - multiply the value of symbols by this value if received from the `domains` map -- `trusted_only` - check DKIM signatures only for domains listed in the `domains` map -- `skip_multi` - skip DKIM check for messages with multiple signatures - -The last option can help in circumstances where rspamd lacks proper support for -multiple DKIM signatures. With some mailing lists or other software, -this option can reduce the false positive rate because rspamd handles -multiple signatures poorly: it checks only the first one. On the other hand, -proper support for multiple DKIM signatures is planned for -upcoming rspamd releases, which will make this option unnecessary.
\ No newline at end of file diff --git a/doc/markdown/modules/dmarc.md b/doc/markdown/modules/dmarc.md deleted file mode 100644 index 7bec587ec..000000000 --- a/doc/markdown/modules/dmarc.md +++ /dev/null @@ -1,48 +0,0 @@ -# DMARC module - -DMARC is a technology leveraging SPF & DKIM which allows domain owners to publish policies regarding how messages bearing -their domain in the RFC5322.From field should be handled (for example to quarantine or reject messages which do not have an -aligned DKIM or SPF identifier) and to elect to receive reporting information about such messages (to help them identify -abuse and/or misconfiguration and make informed decisions about policy application). - -## DMARC in rspamd - -The default configuration for the DMARC module in rspamd is an empty collection: - -~~~ucl -dmarc { -} -~~~ - -This is enough to enable the module and check/apply DMARC policies. - -Symbols added by the module are as follows: - -- `DMARC_POLICY_ALLOW`: Message was authenticated & allowed by DMARC policy -- `DMARC_POLICY_REJECT`: Authentication failed, rejection suggested by DMARC policy -- `DMARC_POLICY_QUARANTINE`: Authentication failed, quarantine suggested by DMARC policy -- `DMARC_POLICY_SOFTFAIL`: Authentication failed, no action suggested by DMARC policy - -Rspamd is able to store records in `redis` which could be used to generate DMARC aggregate reports, but there is not yet a tool available to generate such reports from them. The format of the records stored in `redis` is as follows: - - unixtime,ip,spf_result,dkim_result,dmarc_disposition - -where the SPF and DKIM results are `true` or `false`, indicating whether an aligned SPF/DKIM identifier was found, and `dmarc_disposition` is one of `none`/`quarantine`/`reject`, indicating the policy applied to the message. 
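Since no report generator existed at the time, a consumer of these records has to parse them itself. The following Python sketch splits one record into typed fields following the format above. The `DmarcRecord` class, the `parse_record` function and the sample values used with them are assumptions made for illustration, not part of rspamd.

```python
from dataclasses import dataclass

# Dispositions named in the record format description.
VALID_DISPOSITIONS = ("none", "quarantine", "reject")


@dataclass
class DmarcRecord:
    unixtime: int
    ip: str
    spf_aligned: bool
    dkim_aligned: bool
    disposition: str


def parse_record(line: str) -> DmarcRecord:
    """Parse 'unixtime,ip,spf_result,dkim_result,dmarc_disposition'."""
    unixtime, ip, spf, dkim, disposition = line.strip().split(",")
    if disposition not in VALID_DISPOSITIONS:
        raise ValueError(f"unexpected disposition: {disposition!r}")
    return DmarcRecord(int(unixtime), ip, spf == "true", dkim == "true", disposition)
```

For example, `parse_record("1467973038,192.0.2.1,true,false,quarantine")` yields a record with `spf_aligned=True` and `dkim_aligned=False`. IPv6 addresses contain colons rather than commas, so splitting on commas remains safe for them.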
- -These records are added to a list named `$prefix$domain` where `$domain` is the domain which defined the policy for the message being reported on and `$prefix` is the value of the `key_prefix` setting (or "dmarc_" if this isn't set). - -Keys are inserted into redis servers, with the server selected by a hash of the sender's domain. - -To enable storing of report information, `reporting` must be set to `true`. - -~~~ucl -dmarc { - # Enables storing reporting information to redis - reporting = true; - # If Redis server is not configured below, settings from redis {} will be used - #servers = "127.0.0.1:6379"; # Servers to use for reads and writes (can be a list) - # Alternatively set read_servers / write_servers to split reads and writes - # To set custom prefix for redis keys: - #key_prefix = "dmarc_"; -} -~~~ diff --git a/doc/markdown/modules/emails.md b/doc/markdown/modules/emails.md deleted file mode 100644 index e69de29bb..000000000 --- a/doc/markdown/modules/emails.md +++ /dev/null diff --git a/doc/markdown/modules/fann.md b/doc/markdown/modules/fann.md deleted file mode 100644 index b91e35da3..000000000 --- a/doc/markdown/modules/fann.md +++ /dev/null @@ -1,40 +0,0 @@ -# Neural network module - -The neural network module is an experimental module that performs post-classification of messages based on their current symbols and a training corpus obtained from previous learning. - -To use this module, you need to build rspamd with `libfann` support. It is normally enabled if you use pre-built packages; otherwise, pass `-DENABLE_FANN=ON` to the `cmake` command during the build process. - -The idea behind this module is to learn which symbol combinations are common for spam and which are common for ham. To achieve this goal, the fann module studies log files via the `log_helper` worker until it has gathered a reasonable number of log samples (`1k` by default). 
The neural network is trained on a message as spam when it has the `reject` action (definite spam) and as ham when it has a negative score. You can also use your own criteria for learning. - -Training is performed in the background, and after a number of training iterations (`1k` again) the neural network is updated on disk, allowing scanners to load and update their own data. - -After a number of such iterations (`100` by default), the training process removes the old neural network and starts training a new one. This is done to ensure that old data does not influence current processing. The neural network is also reset when you add or remove rules from rspamd. Once trained, neural network data is saved to a file so that it persists between restarts. The current training epoch, however, is lost upon restart. - -## Configuration - -First of all, you need a special worker called `log_helper` to accept rspamd scan results. This logger has a trivial setup: - -~~~ucl -worker "log_helper" { - count = 1; -} -~~~ - -Then you need to set up the fann plugin: - -~~~ucl -fann_scores { - fann_file = "${DBDIR}/data.fann"; # Used to store ANN file on disk - train { - max_train = 10k; # Number of trains per epoch - max_epoch = 1k; # Number of epochs while ANN data is valid - spam_score = 8; # Score to learn spam - ham_score = -2; # Score to learn ham - } - use_settings = false; # If enabled, then settings-id could switch this module to another FANN -} -~~~ - -## Settings usage - -TODO
\ No newline at end of file diff --git a/doc/markdown/modules/forged_recipients.md b/doc/markdown/modules/forged_recipients.md deleted file mode 100644 index e69de29bb..000000000 --- a/doc/markdown/modules/forged_recipients.md +++ /dev/null diff --git a/doc/markdown/modules/fuzzy_check.md b/doc/markdown/modules/fuzzy_check.md deleted file mode 100644 index 13e8c6878..000000000 --- a/doc/markdown/modules/fuzzy_check.md +++ /dev/null @@ -1,163 +0,0 @@ -# Fuzzy check module - -This module is intended to check messages for specific fuzzy patterns stored in -[fuzzy storage workers](../workers/fuzzy_storage.md). At the same time, this module -is responsible for learning fuzzy storage with message patterns. - -## Fuzzy patterns - -Rspamd uses the `shingles` algorithm to perform fuzzy matching of messages. This algorithm -is probabilistic and uses word chains to detect common patterns and thus -filter spam or ham messages. The shingles algorithm is described in the following -[research paper](http://dl.acm.org/citation.cfm?id=283370). We use 3-grams for this -algorithm and [siphash](https://131002.net/siphash/) as the hash function. Currently, -rspamd uses 32 hashes for shingles. Using siphash allows private storages to be -used, since nobody can generate the same sequence of hashes without a shared -secret called the `shingles key`. By default, rspamd uses the string `rspamd` as the siphash -key; however, it is possible to change this value in the configuration. - -Each shingles set is accompanied by a collision resistant hash, namely a [blake2](https://blake2.net/) hash. -This digest is used as the unique ID of the hash. - -Attachments and images are not currently matched against fuzzy hashes; however, they -are checked by means of blake2 digests using strict matching. - -## Module configuration - -The fuzzy check module has several global options and allows multiple match -storages to be specified. 
Global options include: - -- `symbol`: default symbol to insert (if no flag matches) -- `min_length`: minimum length of text parts in words to perform fuzzy check (default - check all text parts) -- `min_bytes`: minimum length of attachments and images in bytes to check them in fuzzy storage -- `whitelist`: IP list to skip all fuzzy checks -- `timeout`: timeout for waiting for a reply - -Fuzzy rules are defined as a set of `rule` definitions. Each `rule` must have a list of -servers to check or learn, a set of flags and optional parameters. Here is an example of -a rule's settings: - -~~~ucl -fuzzy_check { - rule { - # List of servers, can be an array or multi-value item - servers = "localhost:11335"; - servers = "highsecure.ru:11335"; - - # Default symbol - symbol = "FUZZY_UNKNOWN"; - - # List of additional mime types to be checked in this fuzzy - mime_types = "application/pdf"; - mime_types = ["application/*", "*/octet-stream", "*"]; - - # Maximum global score for all maps - max_score = 20.0; - - # Ignore flags that are not listed in maps for this rule - skip_unknown = yes; - - # If this value is false, then allow learning for this fuzzy rule - read_only = no; - - # Key for strict digests (default: "rspamd") - fuzzy_key = "somebigrandomstring"; - - # Key for fuzzy siphash (default: "rspamd") - fuzzy_shingles_key = "anotherbigrandomstring"; - - # maps - } -} -~~~ - -Each rule can have several maps defined by a `flag` value. For example, a single -fuzzy storage can contain both good and bad hashes that should have different symbols -and thus different weights. Maps are defined inside fuzzy rules as follows: - -~~~ucl -fuzzy_check { - rule { - ... - fuzzy_map = { - FUZZY_DENIED { - # Maximum weight for this list - max_score = 20.0; - # Flag value - flag = 1 - } - FUZZY_PROB { - max_score = 10.0; - flag = 2 - } - FUZZY_WHITE { - max_score = 2.0; - flag = 3 - } - } -} -~~~ - -The meaning of `max_score` can be rather unclear. 
First of all, all hashes in -fuzzy storage have their own weights. For example, if we have a hash `A` and 100 users -marked it as a spam hash, then it will have a weight of `100 * single_vote_weight`. -Therefore, if `single_vote_weight` is `1` then the final weight will be `100`. -`max_score` is the weight required for the rule to add a symbol with the maximum -score 1.0 (which is then multiplied by the metric's weight). In our example, -if the weight of the hash is `100` and `max_score` is `99`, then the symbol will be -added with a weight of `1`. If `max_score` is `200`, then the symbol will be added with a -weight of about `0.2` (the real function is a hyperbolic tangent). In the following configuration: - -~~~ucl -metric { - name = "default"; - ... - symbol { - name = "FUZZY_DENIED"; - weight = "10.0"; - } - ... -} -fuzzy_check { - rule { - ... - fuzzy_map = { - FUZZY_DENIED { - # Maximum weight for this list - max_score = 20.0; - # Flag value - flag = 1 - } - ... - } -} -~~~ - -If a hash has value `10`, then a symbol `FUZZY_DENIED` with weight of `2.0` will be added. -If a hash has value `100500`, then `FUZZY_DENIED` will have weight `10.0`. - -## Learning fuzzy_check - -The `fuzzy_check` module can also learn messages. You can use the `rspamc` command or -connect to the **controller** worker using the HTTP protocol. For learning you must check -the following settings: - -1. Controller worker should be accessible by `rspamc` or HTTP (check `bind_socket`) -2. Controller should allow privileged commands for this client (check `enable_password` or `allow_ip` settings) -3. Controller should have `fuzzy_check` module configured to the servers specified -4. You should know `fuzzy_key` and `fuzzy_shingles_key` to operate with this storage -5. Your `fuzzy_check` module should have `fuzzy_map` configured to the flags used by server -6. Your `fuzzy_check` rule must have the `read_only` option turned off (`read_only = false`) -7. 
Your `fuzzy_storage` worker should allow updates from the controller's host (`allow_update` option) -8. Your controller should be able to communicate with the fuzzy storage over the `UDP` protocol - -If all these conditions are met then you can learn messages with rspamc: - - rspamc -w <weight> -f <flag> fuzzy_add ... - -or delete hashes: - - rspamc -f <flag> fuzzy_del ... - -On learning, rspamd sends commands to **all** servers inside a specific rule. On check, -rspamd selects a server in a round-robin manner. diff --git a/doc/markdown/modules/index.md b/doc/markdown/modules/index.md deleted file mode 100644 index afb440a8e..000000000 --- a/doc/markdown/modules/index.md +++ /dev/null @@ -1,70 +0,0 @@ -# Rspamd modules - -Rspamd ships with a set of modules. Some modules are written in C to speed up -complex procedures while others are written in Lua to reduce code size. -In fact, new modules should preferably be written in Lua, adding any essential -support to the Lua API itself. In practice, Lua modules are very close to -C modules in terms of performance, and Lua modules can be written and loaded -dynamically. - -## C Modules - -C modules provide the core functionality of rspamd and are statically linked -to the main rspamd code. C modules are defined in the `options` section of the rspamd -configuration. If no `filters` attribute is defined then all modules are disabled. -The default configuration enables all modules explicitly: - -~~~ucl -filters = "chartable,dkim,spf,surbl,regexp,fuzzy_check"; -~~~ - -Here is the list of C modules available: - -- [regexp](regexp.md): the core module that allows defining regexp rules, -rspamd internal functions and Lua rules. -- [surbl](surbl.md): this module extracts URLs from messages and checks them against -public DNS blacklists to filter messages with malicious URLs. -- [spf](spf.md): checks SPF records for messages processed. -- [dkim](dkim.md): performs DKIM signature checks. 
-- [dmarc](dmarc.md): performs DMARC policy checks. -- [fuzzy_check](fuzzy_check.md): checks fuzzy hashes of messages against public blacklists. -- [chartable](chartable.md): checks character sets of text parts in messages. - -## Lua modules - -Lua modules are dynamically loaded on rspamd startup and are reloaded on rspamd -reconfiguration. If you want to write a Lua module, consult the -[Lua API documentation](../lua/). Paths to Lua modules are defined in a special section -named `modules` in the rspamd configuration: - -~~~ucl -modules { - path = "/path/to/dir/"; - path = "/path/to/module.lua"; - path = "$PLUGINSDIR/lua"; -} -~~~ - -If a path is a directory then rspamd scans it for the `*.lua` pattern and loads all -matching files. - -Here is the list of Lua modules shipped with rspamd: - -- [multimap](multimap.md) - a complex module that operates with different types -of maps. -- [rbl](rbl.md) - a plugin that checks messages against DNS blacklists based on -either SMTP FROM addresses or on information from `Received` headers. -- [emails](emails.md) - extracts email addresses from a message and checks them against DNS -blacklists. -- [maillist](maillist.md) - determines the common mailing list signatures in a message. -- [once_received](once_received.md) - detects messages with a single `Received` header -and performs some additional checks for such messages. -- [phishing](phishing.md) - detects messages with phished URLs. -- [ratelimit](ratelimit.md) - implements the leaky bucket algorithm for rate limiting and -uses `redis` to store data. -- [trie](trie.md) - uses a suffix trie for extra-fast pattern lookup in messages. 
-- [mime_types](mime_types.md) - applies some rules about mime types found in messages -- [rspamd_update](rspamd_update.md) - loads dynamic rules and other rspamd updates -- [spamassassin](spamassassin.md) - loads spamassassin rules -- [dmarc](dmarc.md) - performs DMARC policy checks -- [dcc](dcc.md) - performs [DCC](http://www.dcc-servers.net/dcc/) lookups to determine message bulkiness diff --git a/doc/markdown/modules/maillist.md b/doc/markdown/modules/maillist.md deleted file mode 100644 index 7563eae24..000000000 --- a/doc/markdown/modules/maillist.md +++ /dev/null @@ -1,15 +0,0 @@ -# Mail list module - -The mailing list module is a simple module that checks whether a message was -sent by popular mailing list software. This module is designed to negate -some rules that are likely to be triggered unnecessarily if a message comes from -a list. - -Here is a list of currently supported mailing list programs: - -- Ezmlm -- Mailman -- Google groups -- Majordomo -- CommuniGate Pro mailing lists -- subscribe.ru mailing list
\ No newline at end of file diff --git a/doc/markdown/modules/mime_types.md b/doc/markdown/modules/mime_types.md deleted file mode 100644 index 4910fcae9..000000000 --- a/doc/markdown/modules/mime_types.md +++ /dev/null @@ -1,30 +0,0 @@ -# Rspamd mime types module - -This module performs sanity checks on mime types, including the following: - -1. Checks whether a mime type is from the `good` list (e.g. `multipart/alternative` or `text/html`) -2. Checks if a mime type is from the `bad` list (e.g. `multipart/form-data`) -3. Checks if an attachment filename extension is different from the intended mime type - -## Configuration - -The `mime_types` module reads the mime types map specified in the `file` option. This map contains bindings of the form: - -``` -type/subtype score -``` - -When the score is greater than `0` the type is considered `bad`; when it is less than `0` it is considered `good` (with the corresponding multiplier). -When a mime type is not listed, the `MIME_UNKNOWN` symbol is inserted. - -The `extension_map` option specifies a map from a known extension to a specific mime type: - -~~~ucl -extension_map = { - html = "text/html"; - txt = "text/plain"; - pdf = "application/pdf"; -} -~~~ - -When an attachment extension matches the left part but the content type does not match the right part, the symbol `MIME_BAD_ATTACHMENT` is inserted. diff --git a/doc/markdown/modules/multimap.md b/doc/markdown/modules/multimap.md deleted file mode 100644 index cede3bc94..000000000 --- a/doc/markdown/modules/multimap.md +++ /dev/null @@ -1,162 +0,0 @@ -# Multimap module - -The multimap module is designed to handle rules that are based on different types of maps. - -## Principles of work - -Maps in rspamd are files or HTTP links that are automatically monitored and reloaded -if changed. 
For example, maps can be defined as follows: - - "http://example.com/file" - "file:///etc/rspamd/file.map" - "/etc/rspamd/file.map" - -Rspamd respects the `304 Not Modified` reply from an HTTP server, which saves traffic -when a map has not actually changed since the last load. For file maps, rspamd uses the normal -`mtime` attribute (time modified). The global map watching settings are defined in the -`options` section of the configuration file: - -* `map_watch_interval`: defines the interval at which all maps are rescanned; the actual check interval is jittered to avoid simultaneous checking (hence, the real interval ranges from this value up to double this value). - -The multimap module allows building rules based on dynamic map content. Rspamd supports the following -map types in this module: - -* `hash map` - a list of domains or `user@domain` -* `regexp map` - a list of regular expressions -* `ip map` - an efficient radix trie of `ip/mask` values (supports both IPv4 and IPv6 addresses) -* `cdb` - constant database format (files only) - -Multimap can check different message attributes against maps. - - -Multimap can also be used for pre-filtering of messages: if a map matches then no further checks are performed. This feature is particularly useful for whitelisting and blacklisting, and it saves scan resources. To enable this mode just add the `action` option to the map configuration (see below). - -## Configuration - -The module itself contains a set of rules in the form: - - symbol { type = type; map = uri; [optional params] } - -### Map types - -The `type` attribute defines what is matched against this map. 
The following types are supported: - -* `ip` - matches source IP of message (radix map) -* `from` - matches envelope from (or header `From` if envelope from is absent) -* `rcpt` - matches any of envelope rcpt or header `To` if envelope info is missing -* `header` - matches any header specified (must have `header = "Header-Name"` configuration attribute) -* `dnsbl` - matches source IP against some DNS blacklist (consider using [RBL](rbl.md) module for this) -* `url` - matches URLs in messages against maps -* `filename` - matches attachment filename against map - -DNS maps are legacy and are discouraged in new projects (use [rbl](rbl.md) for that). - -Maps can also be specified as [CDB](http://www.corpit.ru/mjt/tinycdb.html) databases which might be useful for large maps: - - map = "cdb:///path/to/file.cdb"; - -### Pre-filter maps - -To enable pre-filter support, you should specify the `action` parameter, which can take the -following values: - -* `accept` - accept a message (no action) -* `add header` or `add_header` - adds a header to message -* `rewrite subject` or `rewrite_subject` - change subject -* `greylist` - greylist message -* `reject` - drop message - -No filters will be processed for a message if such a map matches. - -~~~ucl -multimap { - test { type = "ip"; map = "/tmp/ip.map"; symbol = "TESTMAP"; } - spamhaus { type = "dnsbl"; map = "pbl.spamhaus.org"; symbol = "R_IP_PBL"; - description = "PBL dns block list"; } # Better use RBL module instead -} -~~~ - -### Regexp maps - - -All maps but `ip` and `dnsbl` support `regexp` mode. In this mode, all keys in maps are treated as regular expressions, for example: - - /example\d+\.com/i - /other\d+\.com/i test - # Comments are still enabled - -For performance reasons, use only expressions supported by [hyperscan](http://01org.github.io/hyperscan/dev-reference/compilation.html#pattern-support) as this engine provides very fast matching at no additional cost. 
Currently, there is no way to distinguish which particular regexp matched if multiple regexps matched. - -To enable regexp mode, you should set the `regexp` option to `true`: - -~~~ucl -sender_from_whitelist_user { - type = "from"; - map = "file:///tmp/from.map"; - symbol = "SENDER_FROM_WHITELIST"; - regexp = true; -} -~~~ - -### Map filters - -It is also possible to apply a filtering expression before checking a value against a map. This is mainly useful -for `header` rules. Filters are specified with the `filter` option. Rspamd supports the following filters so far: - -* `email` or `email:addr` - parse header value and extract email address from it (`Somebody <user@example.com>` -> `user@example.com`) -* `email:user` - parse header value as email address and extract user name from it (`Somebody <user@example.com>` -> `user`) -* `email:domain` - parse header value as email address and extract domain part from it (`Somebody <user@example.com>` -> `example.com`) -* `email:name` - parse header value as email address and extract displayed name from it (`Somebody <user@example.com>` -> `Somebody`) -* `regexp:/re/` - extracts generic information using the specified regular expression - -URL maps allow another set of filters (by default, url maps are matched using the hostname part): - -* `tld` - matches TLD (top level domain) part of urls -* `full` - matches the complete URL rather than just the hostname -* `is_phished` - matches hostname but if and only if the URL is phished (e.g. 
pretending to be from another domain) -* `regexp:/re/` - extracts generic information using the specified regular expression from the hostname -* `tld:regexp:/re/` - extracts generic information using the specified regular expression from the TLD part -* `full:regexp:/re/` - extracts generic information using the specified regular expression from the full URL text - -Filename maps support this set of filters: - -* `extension` - matches file extension -* `regexp:/re/` - extract data from filename according to some regular expression - -Here are some examples of pre-filter configurations: - -~~~ucl -sender_from_whitelist_user { - type = "from"; - filter = "email:user"; - map = "file:///tmp/from.map"; - symbol = "SENDER_FROM_WHITELIST_USER"; - action = "accept"; # Prefilter mode -} -sender_from_regexp { - type = "header"; - header = "from"; - filter = "regexp:/.*@/"; - map = "file:///tmp/from_re.map"; - symbol = "SENDER_FROM_REGEXP"; -} -url_map { - type = "url"; - filter = "tld"; - map = "file:///tmp/url.map"; - symbol = "URL_MAP"; -} -url_tld_re { - type = "url"; - filter = "tld:regexp:/\.[^.]+$/"; # Extracts the last component of URL - map = "file:///tmp/url.map"; - symbol = "URL_MAP_RE"; -} -filename_blacklist { - type = "filename"; - filter = "extension"; - map = "/${LOCAL_CONFDIR}/filename.map"; - symbol = "FILENAME_BLACKLISTED"; - action = "reject"; -} -~~~ diff --git a/doc/markdown/modules/once_received.md b/doc/markdown/modules/once_received.md deleted file mode 100644 index cb91522a5..000000000 --- a/doc/markdown/modules/once_received.md +++ /dev/null @@ -1,22 +0,0 @@ -# Once received module - -This module is intended to do simple checks for mail with one `Received` header. The idea behind these checks is that legitimate mail likely has more than one `Received` header, and that some bad patterns, such as `dynamic` or `broadband`, are common in spam from hacked users' machines. 
- -## Configuration - -The configuration of this module is pretty straightforward: specify `symbol` for generic mail with a single `Received` header, specify `symbol_strict` for emails with bad patterns or with unresolvable hostnames and add **good** and **bad** patterns. Patterns can be [Lua patterns](http://lua-users.org/wiki/PatternsTutorial). `good_host` lines are used to negate this module for certain hosts, `bad_host` lines are used to specify certain bad patterns. It is also possible to specify `whitelist` to define a list of networks which should be excluded from `once_received` checks. - -## Example - -~~~ucl -once_received { - good_host = "^mail"; - bad_host = "static"; - bad_host = "dynamic"; - symbol_strict = "ONCE_RECEIVED_STRICT"; - symbol = "ONCE_RECEIVED"; - whitelist = "/tmp/ip.map"; -} -~~~ - -As usual, an IP map can contain IP addresses (both v4 and v6), networks (in CIDR notation) and optional comments starting with the `#` symbol. diff --git a/doc/markdown/modules/phishing.md b/doc/markdown/modules/phishing.md deleted file mode 100644 index 55884f287..000000000 --- a/doc/markdown/modules/phishing.md +++ /dev/null @@ -1,114 +0,0 @@ -# Phishing module - -This module is designed to report potentially phished URLs. - -## Principles of phishing detection - -Rspamd tries to detect phished URLs only in HTML text parts. First, -it gets the URL from the `href` or `src` attribute and then tries to find the text enclosed -within this link tag. If a URL is also enclosed in that tag then -rspamd checks whether these two URLs are related, namely whether they -belong to the same top level domain. 
Here are examples of URLs that are considered -to be non-phished: - - <a href="http://sub.example.com/path">http://example.com/other</a> - <a href="https://user:password@sub.example.com/path">http://example.com/</a> - -And the following URLs are considered phished: - - <a href="http://evil.co.uk">http://example.co.uk</a> - <a href="http://t.co/xxx">http://example.com</a> - <a href="http://redir.to/example.com">http://example.com</a> - -## Configuration of phishing module - -Here is an example of full module configuration. - -~~~ucl -phishing { - symbol = "R_PHISHING"; # Default symbol - - # Check only domains from this list - domains = "file:///path/to/map"; - - # Make exclusions for known redirectors - # Entry format: URL/path for map, colon, name of symbol - redirector_domains = [ - "${CONFDIR}/redirectors.map:REDIRECTOR_FALSE" - ]; - # For certain domains from the specified strict maps - # use another symbol for phishing plugin - strict_domains = [ - "${CONFDIR}/paypal.map:PAYPAL_PHISHING" - ]; -} -~~~ - -If an anchoring (actual as opposed to phished) domain is found in a map -referenced by the `redirector_domains` setting then the related symbol is -yielded and the URL is not checked further. This allows making exclusions -for known redirectors, especially ESPs. - -Further to this, if the phished domain is found in a map referenced by -`strict_domains` the related symbol is yielded and the URL is not checked -further. This allows fine-grained control to avoid false positives and -to flag particularly harmful phishing mails, such as bank phishing or other -payment system phishing. - -Finally, the default symbol is yielded; if `domains` is specified, this happens -only if the phished domain is found in the related map. - -Maps for this module can consist of effective second level domain parts (eSLD) -or whole domain parts of the URLs (FQDN) as well. - -## Openphish support - -Since version 1.3, there is [openphish](https://openphish.com) support in rspamd. 
-Now rspamd loads this public feed as a map (using HTTPS) and checks URLs in messages using -openphish list. If any match is found, then rspamd adds symbol `PHISHED_OPENPHISH`. - -If you use research or commercial data feed, rspamd can also use its data and gives -more details about URLs found: their sector (e.g. 'Finance'), brand name (e.g. -'Bank of Zimbabwe') and other useful information. - -There are couple of options available to configure openphish module: - -~~~ucl -phishing { - # URL of feed, default is public url: - openphish_map = "https://www.openphish.com/feed.txt"; - # For premium feed, change that to your personal URL, e.g. - # openphish_map = "https://openphish.com/samples/premium_feed.json"; - - # Change this to true if premium feed is enabled - openphish_premium = false; -} -~~~ - -## Phishtank support - -There is also [phishtank](https://phishtank.com) support in rspamd since 1.3. Unlike -openphish feed, phishtank's one is not enabled by default since it has quite a big size (about 50Mb) so -you might want to setup some reverse proxy (e.g. nginx) to cache that data among rspamd instances: - -~~~nginx -proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=phish:10m; - -server { - listen 8080; - location / { - proxy_pass http://data.phishtank.com:80; - proxy_cache phish; - proxy_cache_lock on; - } -} -~~~ - - -To enable phishtank feed, you can edit `local.d/phishing.conf` file and add the following lines there: - -~~~ucl -phishtank_enabled = true; -# Where nginx is installed -phishtank_map = "http://localhost:8080/data/online-valid.json"; -~~~ diff --git a/doc/markdown/modules/ratelimit.md b/doc/markdown/modules/ratelimit.md deleted file mode 100644 index c36017461..000000000 --- a/doc/markdown/modules/ratelimit.md +++ /dev/null @@ -1,95 +0,0 @@ -# Ratelimit plugin - -Ratelimit plugin is designed to limit messages coming from certain senders, to -certain recipients from certain IP addresses combining these parameters into -a separate limits. 
- -All limits are stored in a [redis](http://redis.io) server (or a cluster of servers) to enable -a shared cache between different scanners. - -## Module configuration - -In the default configuration, there are no cache servers specified, hence, the module won't work unless you add this option to the configuration. - -`Ratelimit` module supports the following configuration options: - -- `servers` - list of servers where ratelimit data is stored -- `whitelisted_rcpts` - comma separated list of whitelisted recipients. By default -the value of this option is 'postmaster, mailer-daemon' -- `whitelisted_ip` - a map of whitelisted IP addresses or networks -- `max_rcpts` - do not apply ratelimit if a message contains more than this number of recipients (5 by default). This -option avoids too much work for setting buckets if there are a lot of recipients in a message. -- `max_delay` - maximum lifetime for any limit bucket (1 day by default) -- `rates` - a table of allowed rates in form: - - type = [burst,leak]; - -Where `type` is one of: - -- `to` -- `to_ip` -- `to_ip_from` -- `bounce_to` -- `bounce_to_ip` - -`burst` is the capacity of a bucket and `leak` is a rate in messages per second. -Both these attributes are floating point values. - -- `symbol` - if this option is specified, then `ratelimit` plugin just adds the corresponding symbol instead of setting pre-result, the value is scaled as $$ 2 * tanh(\frac{bucket}{threshold * 2}) $$, where `tanh` is the hyperbolic tangent function - -## Principles of work - -The basic principle of ratelimiting in rspamd is called `leaky bucket`. It could -be visually represented as a bucket that has some capacity, and a small hole in the bottom. -Messages come to this bucket and leak through the hole over time (it doesn't delay messages, just counts them). If the capacity of -a bucket is exhausted, then a temporary reject is sent.
Rejection continues until the bucket again has enough capacity -to accept more messages (since messages keep leaking, after some -time it becomes possible to process new messages). - -Rspamd uses 3 types of limit buckets: - -- `to` - a bucket based on a recipient only -- `to:ip` - a bucket combining a recipient and a sender's IP -- `to:from:ip` - a bucket combining a recipient, a sender and a sender's IP - -For bounce messages there are special buckets that lack the `from` component and have more -restrictive limits. Rspamd treats the following senders as bounce senders: - -- 'postmaster', -- 'mailer-daemon' -- '' (empty sender) -- 'null' -- 'fetchmail-daemon' -- 'mdaemon' - -Each recipient has its own triple of buckets, hence it is useful -to limit the number of recipients to check. - -Each bucket has two parameters: -- `capacity` - how many messages could go into a bucket before a limit is reached -- `leak` - how many messages per second are leaked from a bucket. - -For example, a bucket with capacity `100` and leak `1` can accept up to 100 messages but then -will accept no more than one message per second.
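The bucket behaviour described above can be sketched in a few lines of Python. This is only an illustration of the counting principle, not rspamd's implementation: the class name `LeakyBucket`, the `accept` method and the explicit `now` timestamp are hypothetical, the sketch assumes that rejected messages are not added to the bucket, and it does not model Redis storage or any special handling of a zero burst value.

```python
class LeakyBucket:
    """Illustrative leaky bucket: it counts messages, it does not delay them."""

    def __init__(self, capacity, leak):
        self.capacity = float(capacity)  # messages allowed before the limit is hit
        self.leak = float(leak)          # messages leaked per second
        self.level = 0.0                 # current fill level
        self.last = None                 # timestamp of the previous check

    def accept(self, now):
        """Return True if a message arriving at time `now` (seconds) fits."""
        if self.last is not None:
            elapsed = max(0.0, now - self.last)
            # Messages leak out through the hole over time.
            self.level = max(0.0, self.level - elapsed * self.leak)
        self.last = now
        if self.level + 1.0 > self.capacity:
            return False  # bucket exhausted: a temporary reject would be sent
        self.level += 1.0
        return True


# Capacity 100, leak 1: a burst of 100 is accepted, then one message per second.
bucket = LeakyBucket(capacity=100, leak=1)
burst = [bucket.accept(now=0.0) for _ in range(101)]
```

With these numbers the first 100 calls at the same instant succeed and the 101st fails; one second later exactly one more message fits.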
- -By default, ratelimit module has the following settings which disable all limits: - -~~~lua --- Default settings for limits, 1-st member is burst, second is rate and the third is numeric type -local settings = { - -- Limit for all mail per recipient (burst 100, rate 2 per minute) - to = {0, 0.033333333}, - -- Limit for all mail per one source ip (burst 30, rate 1.5 per minute) - to_ip = {0, 0.025}, - -- Limit for all mail per one source ip and from address (burst 20, rate 1 per minute) - to_ip_from = {0, 0.01666666667}, - - -- Limit for all bounce mail (burst 10, rate 2 per hour) - bounce_to = {0, 0.000555556}, - -- Limit for bounce mail per one source ip (burst 5, rate 1 per hour) - bounce_to_ip = {0, 0.000277778}, - - -- Limit for all mail per user (authuser) (burst 20, rate 1 per minute) - user = {0, 0.01666666667} -} -~~~ diff --git a/doc/markdown/modules/rbl.md b/doc/markdown/modules/rbl.md deleted file mode 100644 index b7e73a1d1..000000000 --- a/doc/markdown/modules/rbl.md +++ /dev/null @@ -1,116 +0,0 @@ -# RBL module - -The RBL module provides support for checking the IPv4/IPv6 source address of a message's sender against a set of RBLs as well as various less conventional methods of using RBLs: against addresses in Received headers; against the reverse DNS name of the sender and against the parameter used for HELO/EHLO at SMTP time. - -Configuration is structured as follows: - -~~~ucl -rbl { - # default settings defined here - rbls { - # 'rbls' subsection under which the RBL definitions are nested - an_rbl { - # rbl-specific subsection - } - # ... - } -} -~~~ - -The default settings define the ways in which the RBLs are used unless overridden in an RBL-specific subsection. - -Defaults may be set for the following parameters (default values used if these are not set are shown in brackets - note that these may be redefined in the default config): - -- default_ipv4 (true) - -Use this RBL to test IPv4 addresses. 
- -- default_ipv6 (false) - -Use this RBL to test IPv6 addresses. - -- default_received (true) - -Use this RBL to test IPv4/IPv6 addresses found in Received headers. The RBL should also be configured to check one/both of IPv4/IPv6 addresses. - -- default_from (false) - -Use this RBL to test IPv4/IPv6 addresses of message senders. The RBL should also be configured to check one/both of IPv4/IPv6 addresses. - -- default_rdns (false) - -Use this RBL to test reverse DNS names of message senders (hostnames passed to rspamd should have been validated with a forward lookup, particularly if this is to be used to provide whitelisting). - -- default_helo (false) - -Use this RBL to test parameters sent for HELO/EHLO at SMTP time. - -- default_dkim (false) - -Use this RBL to test domains found in validated DKIM signatures. - -- default_dkim_domainonly (true) - -If true test top-level domain only, otherwise test entire domain found in DKIM signature. - -- default_emails (false) - -Use this RBL to test email addresses in form [localpart].[domainpart].[rbl] or if set to "domain_only" uses [domainpart].[rbl]. - -- default_unknown (false) - -If set to false, do not yield a result unless the response received from the RBL is defined in its related returncodes {} subsection, else return the default symbol for the RBL. - -- default_exclude_users (false) - -If set to true, do not use this RBL if the message sender is authenticated. - -- default_exclude_private_ips (true) - -If true, do not use the RBL if the sending host address is in `local_addrs` & do not check received headers baring these addresses. - -- default_exclude_local (true) - -If true & local_exclude_ip_map has been set - do not use the RBL if the sending host address is in the local IP list & do not check received headers baring these addresses. - -- default_is_whitelist (false) - -If true matches on this list should neutralise any listings where this setting is false and ignore_whitelists is not true. 
- -- default_ignore_whitelists (false) - -If true this list should not be neutralised by whitelists. - -Other parameters which can be set here are: - -- local_exclude_ip_map - -Can be set to a URL of a list of IPv4/IPv6 addresses & subnets not to be considered as local exclusions by exclude_local checks. - -RBL-specific subsection is structured as follows: - -~~~ucl -# Descriptive name of RBL or symbol if symbol is not defined. -an_rbl { - # Explicitly defined symbol - symbol = "SOME_SYMBOL"; - # RBL-specific defaults (where different from global defaults) - #The global defaults may be overridden using 'helo' to override 'default_helo' and so on. - ipv6 = true; - ipv4 = false; - # Address used for RBL-testing - rbl = "v6bl.example.net"; - # Possible responses from RBL and symbols to yield - returncodes { - # Name_of_symbol = "address"; - EXAMPLE_ONE = "127.0.0.1"; - EXAMPLE_TWO = "127.0.0.2"; - } -} -~~~ - -The following extra settings are valid in the RBL subsection: - -- whitelist_exception - -(For whitelists) - Symbols named as parameters for this setting will not be used for neutralising blacklists (set this multiple times to add multiple exceptions). diff --git a/doc/markdown/modules/regexp.md b/doc/markdown/modules/regexp.md deleted file mode 100644 index 01d7a0635..000000000 --- a/doc/markdown/modules/regexp.md +++ /dev/null @@ -1,146 +0,0 @@ -# Rspamd regexp module - -This is a core module that deals with regexp expressions to filter messages. - -## Principles of work - -Regexp module operates with `expressions` - a logical sequence of different `atoms`. Atoms -are elements of the expression and could be represented as regular expressions, rspamd -functions and lua functions. 
Rspamd supports the following operators in expressions: - -* `&&` - logical AND (can also be written as `and` or even `&`) -* `||` - logical OR (`or` `|`) -* `!` - logical NOT (`not`) -* `+` - logical PLUS, usually used with comparisons: - - `>` more than - - `<` less than - - `>=` more or equal - - `<=` less or equal - -Whilst the logical operators are easy to understand, PLUS is less obvious. In rspamd, -it is used to join multiple atoms or subexpressions and compare them to a specific number: - - A + B + C + D > 2 - evaluates to `true` if at least 3 operands are true - (A & B) + C + D + E >= 2 - evaluates to `true` if at least 2 operands are true - -Operators have their own priorities: - -1. NOT -2. PLUS -3. COMPARE -4. AND -5. OR - -You can change priorities with parentheses, of course. All operations are *right* associative in rspamd. -While evaluating expressions, rspamd tries to optimize their execution time by reordering and does not evaluate -unnecessary branches. - -## Expressions components - -Rspamd supports the following components within expressions: - -* Regular expressions -* Internal functions -* Lua global functions (not widely used) - -### Regular expressions - -In rspamd, regular expressions could match different parts of messages: - -* Headers (should be `Header-Name=/regexp/flags`), mime headers -* Full headers string -* Textual mime parts -* Raw messages -* URLs - -The match type is defined by special flags after the last `/` symbol: - -* `H` - header regexp -* `X` - undecoded header regexp (e.g.
without quoted-printable decoding) -* `B` - MIME header regexp (applied for headers in MIME parts only) -* `R` - full headers content (applied for all headers undecoded and for the message only - **not** including MIME headers) -* `M` - raw message regexp -* `P` - part regexp without HTML tags -* `Q` - part regexp with HTML tags -* `C` - spamassassin `BODY` regexp analogue(see http://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.txt) -* `D` - spamassassin `RAWBODY` regexp analogue -* `U` - URL regexp - -From 1.3, it is also possible to specify long regexp types for convenience in curly braces: - -* `{header}` - header regexp -* `{raw_header}` - undecoded header regexp (e.g. without quoted-printable decoding) -* `{mime_header}` - MIME header regexp (applied for headers in MIME parts only) -* `{all_header}` - full headers content (applied for all headers undecoded and for the message only - **not** including MIME headers) -* `{body}` - raw message regexp -* `{mime}` - part regexp without HTML tags -* `{raw_mime}` - part regexp with HTML tags -* `{sa_body}` - spamassassin `BODY` regexp analogue(see http://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.txt) -* `{sa_raw_body}` - spamassassin `RAWBODY` regexp analogue -* `{url}` - URL regexp - -Each regexp also supports the following flags: - -* `i` - ignore case -* `u` - use utf8 regexp -* `m` - multiline regexp - treat string as multiple lines. That is, change "^" and "$" from matching the start of the string's first line and the end of its last line to matching the start and end of each line within the string -* `x` - extended regexp - this flag tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class. You can use this to break up your regular expression into (slightly) more readable parts. 
Also, the # character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line. -* `s` - dotall regexp - treat string as single line. That is, change `.` to match any character whatsoever, even a newline, which normally it would not match. Used together, as `/ms`, they let the `.` match any character whatsoever, while still allowing `^` and `$` to match, respectively, just after and just before newlines within the string. -* `O` - do not optimize regexp (rspamd optimizes regexps by default) - -### Internal functions - -Rspamd supports a set of internal functions to do some common spam filtering tasks: - -* `check_smtp_data(type[, str or /re/])` - checks for the specific envelope argument: `from`, `rcpt`, `user`, `subject` -* `compare_encoding(str or /re/)` - compares message encoding with string or regexp -* `compare_parts_distance(inequality_percent)` - if a message is multipart/alternative, compare two parts and return `true` if they differ by more than `inequality_percent` -* `compare_recipients_distance(inequality_percent)` - check how different the recipients of a message are (works for > 5 recipients) -* `compare_transfer_encoding(str or /re/)` - compares message transfer encoding with string or regexp -* `content_type_compare_param(param, str or /re/)` - compare content-type parameter `param` with string or regexp -* `content_type_has_param(param)` - return true if `param` exists in content-type -* `content_type_is_subtype(str or /re/)` - return `true` if subtype of content-type matches string or regexp -* `content_type_is_type(str or /re/)` - return `true` if type of content-type matches string or regexp -* `has_content_part(type)` - return `true` if the part with the specified `type` exists -* `has_content_part_len(type, len)` - return `true` if the part with the specified `type` exists and has a length of at least `len` -* `has_fake_html()` -
check if there is an HTML part in message with no HTML tags -* `has_html_tag(tagname)` - return `true` if html part contains specified tag -* `has_only_html_part()` - return `true` if there is merely a single HTML part -* `header_exists(header)` - return if a specified header exists in the message -* `is_html_balanced()` - check whether HTML part has balanced tags -* `is_recipients_sorted()` - return `true` if there are more than 5 recipients in a message and they are sorted -* `raw_header_exists()` - does the same as `header_exists` - -Many of these functions are just legacy but they are supported in terms of compatibility. - -### Lua atoms - -Lua atoms now can be lua global functions names or callbacks. This is -a compatibility feature for previously written rules. - -### Regexp objects - -From rspamd 1.0, it is possible to add more power to regexp rules by using of -table notation while writing rules. A table can have the following fields: - -- `callback`: lua callback for the rule -- `re`: regular expression (mutually exclusive with `callback` option) -- `condition`: function of task that determines when a rule should be executed -- `score`: default score -- `description`: default description -- `one_shot`: default one shot settings - -Here is an example of table form definition of regexp rule: - -~~~lua -config['regexp']['RE_TEST'] = { - re = '/test/i{mime}', - score = 10.0, - condition = function(task) - if task:get_header('Subject') then - return true - end - return false - end, -} -~~~
\ No newline at end of file diff --git a/doc/markdown/modules/replies.md b/doc/markdown/modules/replies.md deleted file mode 100644 index 6beb59133..000000000 --- a/doc/markdown/modules/replies.md +++ /dev/null @@ -1,48 +0,0 @@ -# Replies module - -This module collects the `message-id` header of messages sent by authenticated users and stores corresponding hashes in Redis, which are set to expire after a configurable amount of time (by default 1 day). Furthermore, it hashes `in-reply-to` headers of all received messages & checks for matches (i.e. messages sent in response to messages our system originated) and yields a symbol which could be used to adjust scoring or to force an action (most likely "no action" to accept) according to configuration. - - -## Configuration - -Settings for the module are described below (default values are indicated in brackets). - -- action (null) - -If set, apply the given action to messages identified as replies (would typically be set to "no action" to accept). - -- expire (86400) - -Time, in seconds, after which to expire records (default is one day). - -- key_prefix (rr) - -String prefixed to keys in Redis. - -- message (Message is reply to one we originated) - -Message passed when action is forced. - -- servers (null) - -Comma separated list of Redis hosts - -- symbol (REPLY) - -Symbol yielded on messages identified as replies.
- -## Example - -~~~ucl -replies { - # This setting is non-default & is required to be set - servers = "localhost"; - # This setting is non-default & may be desirable - action = "no action"; - # These are default settings you may want to change - expire = 86400; - key_prefix = "rr"; - message = "Message is reply to one we originated"; - symbol = "REPLY"; -} -~~~ diff --git a/doc/markdown/modules/rspamd_update.md b/doc/markdown/modules/rspamd_update.md deleted file mode 100644 index 78b004c8c..000000000 --- a/doc/markdown/modules/rspamd_update.md +++ /dev/null @@ -1,90 +0,0 @@ -# Rspamd update module - -This module allows loading rspamd rules and adjusting symbol scores and actions without a full daemon restart. -`rspamd_update` provides a way to backport new rules and score changes without updating rspamd itself. This might be useful, for example, if you want to use the stable version of rspamd but would like to improve filtering quality at the same time. - -## Security considerations - -Rspamd update module can execute lua code, which runs with the scanner's privileges - usually the `_rspamd` or `nobody` user. Therefore, you should not use untrusted sources of updates. -Rspamd supports digital signatures to check the validity of updates downloaded using the [EdDSA](http://ed25519.cr.yp.to/) signature scheme. -For your own updates that are loaded from the filesystem or from some trusted network you might use unsigned files, however, signing is recommended even in this case.
- -To sign a map you can use `rspamadm signtool` and to generate a signing keypair - `rspamadm keypair -s -u`: - -~~~ucl -keypair { - pubkey = "zo4sejrs9e5idqjp8rn6r3ow3x38o8hi5pyngnz6ktdzgmamy48y"; - privkey = "pwq38sby3yi68xyeeuup788z6suqk3fugrbrxieri637bypqejnqbipt1ec9tsm8h14qerhj1bju91xyxamz5yrcrq7in8qpsozywxy"; - id = "bs4zx9tcf1cs5ed5mt4ox8za54984frudpzzny3jwdp8mkt3feh7nz795erfhij16b66piupje4wooa5dmpdzxeh5mi68u688ixu3yd"; - encoding = "base32"; - algorithm = "curve25519"; - type = "sign"; -} -~~~ - -Then you can use `signtool` to edit the map file: - -``` -rspamadm signtool -e --editor=vim -k <keypair_file> <map_file> -``` - -To enforce signing policies you should add the `sign+` string to your map definition: - -~~~ucl -map = "sign+http://example.com/map" -~~~ - -To specify a trusted key you could either put the **public** key from the keypair into the `local.d/options.inc` file as follows: - -``` -trusted_keys = ["<public key string>"]; -``` - -or add it as a `key` definition to the map string: - -~~~ucl -map = "sign+key=<key_string>+http://example.com/map" -~~~ - -## Module configuration - -The module itself has very few parameters: - -* `key`: use this key (base32 encoded) as trusted key - -All other keys are treated as rules to load maps.
By default, rspamd tries to load signed updates from `rspamd.com` site using trusted key `qxuogdh5eghytji1utkkte1dn3n81c3y5twe61uzoddzwqzuxxyb`: - -~~~ucl -rspamd_update { - rules = "sign+http://rspamd.com/update/rspamd-${BRANCH_VERSION}.ucl"; - key = "qxuogdh5eghytji1utkkte1dn3n81c3y5twe61uzoddzwqzuxxyb"; -} -~~~ - -## Updates structure - -Update files are quite simple: they have 3 sections: - -* `symbols` - list of new scores for symbols that are already in rspamd (loaded with `priority = 1` to override default settings) -* `actions` - list of scores for actions (also loaded with `priority = 1`) -* `rules` - list of lua code fragments to load into rspamd, they can use `rspamd_config` global to register new rules - -Here is an example of update file: - -~~~ucl -rules = { - test =<<EOD -rspamd_config.TEST = { - callback = function(task) return true end, - score = 1.0, - description = 'test', -} -EOD -} -actions = { - greylist = 3.4, -} -symbols = { - R_DKIM_ALLOW = -0.5, -} -~~~ diff --git a/doc/markdown/modules/spamassassin.md b/doc/markdown/modules/spamassassin.md deleted file mode 100644 index ca585241f..000000000 --- a/doc/markdown/modules/spamassassin.md +++ /dev/null @@ -1,72 +0,0 @@ -# Spamassassin rules module - -This module is designed to read and adopt spamassassin rules for rspamd. - -## Overview - -Spamassassin provides an excellent set of rules that are useful in some relatively -low volume environments. The goal of this plugin is to re-use the existing set -of spamassassin rules natively within rspamd. 
The configuration of this plugin -is very simple: just glue all your SA rules into a single file and feed it to -spamassassin module: - -~~~ucl -spamassassin { - ruleset = "/path/to/file"; - # Limit search size to 100 kilobytes for all regular expressions - match_limit = 100k; - # Those regexp atoms will not be passed through hyperscan: - pcre_only = ["RULE1", "__RULE2"]; -} -~~~ - -Rspamd can read multiple files containing SA rules, however it doesn't support -glob patterns so far. All rules are parsed to the same structure, so individual -rules might be overwritten if they occurs in multiple times. - -## Limitations and principles of work - -Rspamd tries to optimize SA rules quite aggressively. Some of that optimizations -are described in the following [presentation](http://highsecure.ru/ast-rspamd.pdf). -To achieve this goal, rspamd counts all rules as `expression atoms`. Meta rules are -**real** rspamd rules that can have their symbol and score. Other rules are normally -hidden. However, it is possible to specify some minimum score that is needed for a rule -to be treated as normal rule: - - alpha = 0.1 - -With this setting in `spamassassin` section, all rules whose scores are higher than -`0.1` are treated not as atoms but as the complete rules and evaluated accordingly. - -Currently, rspamd supports the following functions: - -* body, rawbody, meta, header, uri and other rules -* some header functions, such as `exists` -* some eval functions -* some plugins: - + 'Mail::SpamAssassin::Plugin::FreeMail', - + 'Mail::SpamAssassin::Plugin::HeaderEval', - + 'Mail::SpamAssassin::Plugin::ReplaceTags', - + 'Mail::SpamAssassin::Plugin::RelayEval', - + 'Mail::SpamAssassin::Plugin::MIMEEval', - + 'Mail::SpamAssassin::Plugin::BodyEval', - + 'Mail::SpamAssassin::Plugin::MIMEHeader' - -Rspamd does **not** support network plugins, HTML plugins and some other plugins. -This is planned for the next releases of rspamd. 
- -Nevertheless, the vast majority of spamassassin rules can work in rspamd making -the migration process much smoother for those who decide to replace SA with rspamd. - -The overall performance of rspamd, of course, goes down since SA rules contain a lot -of inefficient regular expressions that scan large text bodies. However, the optimizations -performed by rspamd can significantly reduce the amount of work required to process -SA rules. Moreover, if your PCRE library is built with JIT support, rspamd can benefit -from this significantly. On start, rspamd tells if it can use JIT compilation and -warns if it cannot. Some regular expressions might also benefit from `hyperscan` support -that is available on x86_64 platforms starting from rspamd 1.1. - -Spamassassin plugin is written in lua with many functional elements. Hence, to speed -it up you might want to build rspamd with [luajit](http://luajit.org) that performs -blazingly fast and is almost as fast as plain C. Luajit is enabled by default since -rspamd 0.9. diff --git a/doc/markdown/modules/spf.md b/doc/markdown/modules/spf.md deleted file mode 100644 index 281cd91b4..000000000 --- a/doc/markdown/modules/spf.md +++ /dev/null @@ -1,34 +0,0 @@ -# SPF module - -SPF module performs checks of the sender's [SPF](http://www.openspf.org/) policy. -Many mail providers use SPF records to define which hosts are eligible to send email -for a specific domain. In fact, there are many possibilities to create and use -SPF records, however, they all check only the sender's domain and the sender's IP. - -A special case is automated messages from the special mailer daemon address: -`<>`. In this case rspamd uses `HELO` to grab domain information as specified in the -standard. - -## Principles of work - -`SPF` can be a powerful tool when properly used. However, it is very fragile in many -cases, for example when a message is somehow redirected or reconstructed by mailing list software.
- -Moreover, many mail providers have no clear understanding of this technology and -misuse the SPF technique. Hence, the scores for SPF symbols are relatively small -in rspamd. - -SPF uses the DNS service extensively, therefore rspamd maintains a cache of SPF records. -This cache operates on the principle of `least recently used` expiration. The lifetime of each cached item -is also limited by the time to live of the matching DNS record. - -You can manually specify the size of this cache by configuring SPF module: - -~~~ucl -spf { - spf_cache_size = 1k; # cache up to 1000 of the most recent SPF records -} -~~~ - -Currently, rspamd supports the full set of SPF elements and macros, and has internal -protection from DNS recursion. diff --git a/doc/markdown/modules/surbl.md b/doc/markdown/modules/surbl.md deleted file mode 100644 index ec39a6c7d..000000000 --- a/doc/markdown/modules/surbl.md +++ /dev/null @@ -1,184 +0,0 @@ -# SURBL module - -This module performs scanning of URLs found in messages against a list of known -DNS lists. It can add different symbols depending on the DNS replies from a -specific DNS URL list. It is also possible to resolve domains of URLs and then -check the IP addresses against the normal `RBL` style list. - -## Module configuration - -The default configuration defines several public URL lists. However, their terms -of usage normally disallow commercial or very extensive usage without purchasing -a specific sort of license. - -Nonetheless, they can be used by personal services or low volume requests free -of charge.
- -~~~ucl -surbl { - # List of domains that are not checked by surbl - whitelist = "file://$CONFDIR/surbl-whitelist.inc"; - # Additional exceptions for TLD rules - exceptions = "file://$CONFDIR/2tld.inc"; - - rule { - # DNS suffix for this rule - suffix = "multi.surbl.org"; - symbol = "SURBL_MULTI"; - bits { - # List of bits ORed when reply is given - JP_SURBL_MULTI = 64; - AB_SURBL_MULTI = 32; - MW_SURBL_MULTI = 16; - PH_SURBL_MULTI = 8; - WS_SURBL_MULTI = 4; - SC_SURBL_MULTI = 2; - } - } - rule { - suffix = "multi.uribl.com"; - symbol = "URIBL_MULTI"; - bits { - URIBL_BLACK = 2; - URIBL_GREY = 4; - URIBL_RED = 8; - } - } - rule { - suffix = "uribl.rambler.ru"; - # Also check images - images = true; - symbol = "RAMBLER_URIBL"; - } - rule { - suffix = "dbl.spamhaus.org"; - symbol = "DBL"; - # Do not check numeric URL's - noip = true; - } - rule { - suffix = "uribl.spameatingmonkey.net"; - symbol = "SEM_URIBL_UNKNOWN"; - bits { - SEM_URIBL = 2; - } - noip = true; - } - rule { - suffix = "fresh15.spameatingmonkey.net"; - symbol = "SEM_URIBL_FRESH15_UNKNOWN"; - bits { - SEM_URIBL_FRESH15 = 2; - } - noip = true; - } -} -~~~ - -In general, the configuration of `surbl` module is definition of DNS lists. Each -list must have suffix that defines the list itself and optionally for some lists -it is possible to specify either `bit` or `ips` sections. - -Since some URL lists do not accept `IP` addresses, it is also possible to disable sending of URLs with IP address in the host to such lists. That could be done by specifying `noip = true` option: - -~~~ucl - rule { - suffix = "dbl.spamhaus.org"; - symbol = "DBL"; - # Do not check numeric URL's - noip = true; - } -~~~ - -It is also possible to check HTML images URLs using URL blacklists. 
Just specify `images = true` for such a list and you are done: - -~~~ucl - rule { - suffix = "uribl.rambler.ru"; - # Also check images - images = true; - symbol = "RAMBLER_URIBL"; - } -~~~ - -## Principles of operation - -In this section, we define how `surbl` module performs its checks. - -### TLD composition - -By default, we want to check some top level domain, however, many domains contain -two components while others can have 3 or even more components to check against the -list. By default, rspamd takes the top level domain as defined in the [public suffixes](https://publicsuffix.org). -Then one more component is prepended, for example: - - sub.example.com -> [.com] -> example.com - sub.co.uk -> [.co.uk] -> sub.co.uk - -However, sometimes even more levels of domain components are required. In this case, -the `exceptions` map can be used. For example, if we want to check all subdomains of -`example.com` and `example.co.uk`, then we can define the following list: - - example.com - example.co.uk - -Here are the new composition rules: - - sub.example.com -> [.example.com] -> sub.example.com - sub1.sub2.example.co.uk -> [.example.co.uk] -> sub2.example.co.uk - -### DNS composition - -SURBL module composes the DNS request of two parts: - -- TLD component as defined in the previous section; -- DNS list suffix - -For example, to form a request to multi.surbl.org, the following is applied: - - example.com -> example.com.multi.surbl.org - -### Results parsing - -Normally, DNS blacklists encode the reply in an A record from the loopback network -(namely, `127.0.0.0/8`). Encoding varies from one service to another. Some lists -use bits encoding, where a single DNS list or error message is encoded as a bit -in the least significant octet of the IP address.
For example, if bit 1 encodes `LISTA`
-and bit 2 encodes `LISTB`, then we need to test each specific bit with a bitwise `AND`
-to decode the reply:
-
-    127.0.0.3 -> LISTA | LISTB -> both bit symbols are added
-    127.0.0.2 -> LISTB only
-    127.0.0.1 -> LISTA only
-
-This encoding saves DNS requests, since several lists are queried with a single lookup.
-
-Some other lists encode their results directly as specific addresses. In this
-case you should define the decoding rules in an `ips` section rather than `bits`, since
-bitwise rules are not applicable to these lists. In the `ips` section you explicitly
-match each IP returned by a list with its meaning.
-
-## IP lists
-
-From rspamd 1.1 it is also possible to do two-step checks:
-
-1. Resolve IP addresses of each URL
-2. Check each IP resolved against SURBL list
-
-In general, this procedure can be represented as follows:
-
-* Check `A` or `AAAA` records for `example.com`
-* For each IP address, look it up using reversed octets composition: if the IP address of `example.com` is `1.2.3.4`, then the check is made for `4.3.2.1.uribl.tld`
-
-For example, the [SBL list](https://www.spamhaus.org/sbl/) of the `spamhaus` project provides such a function through the `ZEN` multi list. This is included in the rspamd default configuration:
-
-~~~ucl
-    rule {
-        suffix = "zen.spamhaus.org";
-        symbol = "ZEN_URIBL";
-        resolve_ip = true;
-        ips {
-            URIBL_SBL = "127.0.0.2";
-        }
-    }
-~~~
diff --git a/doc/markdown/modules/trie.md b/doc/markdown/modules/trie.md
deleted file mode 100644
index 18e9f6808..000000000
--- a/doc/markdown/modules/trie.md
+++ /dev/null
@@ -1,39 +0,0 @@
-# Trie plugin
-
-The trie plugin is designed to search for multiple strings within raw messages or text parts
-and to do so very quickly. It uses the Aho-Corasick algorithm, which performs
-well even on large texts and many input strings.
-
-This module provides a convenient interface to the search trie structure.
-
-## Configuration
-
-Here is an example of trie configuration:
-
-~~~ucl
-trie {
-    # Each subsection defines a single rule with associated symbol
-    SYMBOL1 {
-        # Define rules in the file (it is *NOT* a map)
-        file = "/some/path";
-        # Raw rules search within the whole undecoded messages
-        raw = true;
-        # If we have multiple occurrences of strings from this rule
-        # then we insert a symbol multiple times
-        multi = true;
-    }
-    SYMBOL2 {
-        patterns = [
-            "pattern1",
-            "pattern2",
-            "pattern3"
-        ]
-    }
-}
-~~~
-
-Although the Aho-Corasick trie is very fast, it supports only plain
-strings. Moreover, it cannot distinguish word boundaries: for example, the string
-`test` will be found in the texts `test`, `tests` or even `123testing`. Therefore, it
-is best used to search for concrete and relatively specific patterns and should
-not be used for whole-word matching.
diff --git a/doc/markdown/modules/whitelist.md b/doc/markdown/modules/whitelist.md
deleted file mode 100644
index 5b2417194..000000000
--- a/doc/markdown/modules/whitelist.md
+++ /dev/null
@@ -1,119 +0,0 @@
-# Whitelist module
-
-The whitelist module is intended to negate or increase scores for some messages that are known to
-come from trusted sources. Due to `SMTP` protocol design flaws, it is quite easy to
-forge the sender. Therefore, rspamd tries to validate the sender based on the following additional
-properties:
-
-- `DKIM`: a message has a valid DKIM signature for this domain
-- `SPF`: a message matches SPF record for the domain
-- `DMARC`: a message also satisfies domain's DMARC policy (usually implies SPF and DKIM)
-
-## Whitelist setup
-
-Whitelist configuration is quite straightforward. You can define a set of rules within
-the `rules` section. Each rule **must** have a `domains` attribute that specifies either
-a map of domains (if specified as a string) or a direct list of domains (if specified as an array).
-
-### Whitelist constraints
-
-The following constraints are allowed:
-
-- `valid_spf`: require a valid SPF policy
-- `valid_dkim`: require DKIM validation
-- `valid_dmarc`: require a valid DMARC policy
-
-### Whitelist rules modes
-
-Each whitelist rule can work in 3 modes:
-
-- `whitelist` (default): add symbol when a domain has been found and one of the constraints defined is satisfied (e.g. `valid_dmarc`)
-- `blacklist`: add symbol when a domain has been found and one of the constraints defined is *NOT* satisfied (e.g. `valid_dmarc`)
-- `strict`: add symbol with negative (ham) score when a domain has been found and one of the constraints defined is satisfied (e.g. `valid_dmarc`) and add symbol with **POSITIVE** (spam) score when some of the constraints defined have failed
-
-If you do not define any constraints, then both `strict` and `whitelist` rules just insert a result for all mail from the specified domains. For `blacklist` rules the result normally has a positive score.
-
-These options are combined using the `AND` operator for `whitelist` rules and using `OR` for `blacklist` and `strict` rules. Therefore, setting both `valid_dkim = true` and
-`valid_spf = true` would require both DKIM and SPF validation to whitelist domains from
-the list. On the contrary, for blacklist and strict rules any violation would cause a positive score symbol to be inserted.
-
-### Optional settings
-
-You can also set the default metric settings using the ordinary attributes, such as:
-
-- `score`: default score
-- `group`: default group (`whitelist` group is used if not specified explicitly)
-- `one_shot`: default one shot mode
-- `description`: default description
-
-Within lists, you can also use an optional `multiplier` argument that defines an additional
-multiplier for the score added by this module.
For example, let's define a score twice as large
-for `github.com`:
-
-    ["github.com", 2.0]
-
-or if using map:
-
-    github.com 2.0
-
-## Configuration example
-
-~~~ucl
-whitelist {
-    rules {
-        WHITELIST_SPF = {
-            valid_spf = true;
-            domains = [
-                "github.com",
-            ];
-            score = -1.0;
-        }
-
-        WHITELIST_DKIM = {
-            valid_dkim = true;
-            domains = [
-                "github.com",
-            ];
-            score = -2.0;
-        }
-
-        WHITELIST_SPF_DKIM = {
-            valid_spf = true;
-            valid_dkim = true;
-            domains = [
-                ["github.com", 2.0],
-            ];
-            score = -3.0;
-        }
-
-        STRICT_SPF_DKIM = {
-            valid_spf = true;
-            valid_dkim = true;
-            strict = true;
-            domains = [
-                ["paypal.com", 2.0],
-            ];
-            score = -3.0; # For strict rules negative score should be defined
-        }
-
-        BLACKLIST_DKIM = {
-            valid_spf = true;
-            valid_dkim = true;
-            blacklist = true;
-            domains = "/some/file/blacklist_dkim.map";
-            score = 3.0; # Mention positive score here
-        }
-
-        WHITELIST_DMARC_DKIM = {
-            valid_dkim = true;
-            valid_dmarc = true;
-            domains = [
-                "github.com",
-            ];
-            score = -7.0;
-        }
-    }
-}
-~~~
-
-Rspamd also comes with a set of pre-defined whitelisted domains that can be useful as a starting point. |
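
The rule modes described in the removed `whitelist.md` above can be sketched in a few lines of Python. This is only an illustration of the documented `AND`/`OR` combination, not rspamd's actual implementation: the `Rule` class, the `evaluate` function and the `checks` dictionary are hypothetical names. Two points the text leaves open are treated as assumptions: a failed `strict` rule yields the absolute value of the configured score, and a `blacklist` rule with no constraints inserts nothing.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical model of one whitelist rule (not rspamd's real structure)."""
    score: float               # default score of the symbol
    domains: dict              # domain -> multiplier (1.0 when none is given)
    constraints: tuple = ()    # e.g. ("valid_spf", "valid_dkim")
    mode: str = "whitelist"    # "whitelist", "blacklist" or "strict"


def evaluate(rule, domain, checks):
    """Return the score the rule would add, or None if no symbol is inserted.

    `checks` maps constraint names to the boolean result of each validation.
    """
    if domain not in rule.domains:
        return None
    multiplier = rule.domains[domain]
    # True when every configured constraint passed (also true with no constraints).
    all_ok = all(checks.get(name, False) for name in rule.constraints)

    if rule.mode == "whitelist":
        # AND semantics: every constraint must hold.
        return rule.score * multiplier if all_ok else None
    if rule.mode == "blacklist":
        # OR semantics: any failed constraint triggers the symbol.
        # Assumption: with no constraints nothing is inserted.
        return None if all_ok else rule.score * multiplier
    if rule.mode == "strict":
        # Assumption: a failure turns the configured negative score positive.
        if all_ok:
            return rule.score * multiplier
        return abs(rule.score) * multiplier
    raise ValueError(f"unknown mode: {rule.mode}")


strict_rule = Rule(
    score=-3.0,
    domains={"paypal.com": 2.0},
    constraints=("valid_spf", "valid_dkim"),
    mode="strict",
)
print(evaluate(strict_rule, "paypal.com", {"valid_spf": True, "valid_dkim": True}))
print(evaluate(strict_rule, "paypal.com", {"valid_spf": True, "valid_dkim": False}))
```

The `strict_rule` mirrors the `STRICT_SPF_DKIM` entry of the configuration example: with both checks passing, the multiplier doubles the negative score, while a single failed check produces a positive score of the same magnitude under the stated assumption.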