author | Vsevolod Stakhov <vsevolod@highsecure.ru> | 2016-06-12 13:11:03 +0100 |
---|---|---|
committer | Vsevolod Stakhov <vsevolod@highsecure.ru> | 2016-06-12 13:11:03 +0100 |
commit | 6f69240fd4ad3eebe8f67fb53fe0cb179c342be5 (patch) | |
tree | e067f3cc62d87828393c2116de802393301b7041 | |
parent | 721b8a2c2281493ea9e60e9e75913c9fa7a8e159 (diff) | |
download | rspamd-6f69240fd4ad3eebe8f67fb53fe0cb179c342be5.tar.gz rspamd-6f69240fd4ad3eebe8f67fb53fe0cb179c342be5.zip |
[Doc] Update documentation
-rw-r--r-- | doc/markdown/architecture/index.md | 11 | ||||
-rw-r--r-- | doc/markdown/architecture/protocol.md | 4 |
2 files changed, 4 insertions, 11 deletions
diff --git a/doc/markdown/architecture/index.md b/doc/markdown/architecture/index.md
index ddb9a7407..f93ce9818 100644
--- a/doc/markdown/architecture/index.md
+++ b/doc/markdown/architecture/index.md
@@ -69,20 +69,13 @@ The weight of rules is not necessarily constant. For example, for statistics rul

 ## Statistics

-rspamd uses statistic algorithms to precisely calculate the final score of a message. Currently, the only algorithm defined is OSB-Bayes. You can find details of this algorithm in the following [paper](http://osbf-lua.luaforge.net/papers/osbf-eddc.pdf). rspamd uses a window size of 5 words in its classification. During the classification procedure, rspamd splits a message into a set of tokens. Tokens are separated by punctuation or whitespace characters. Short tokens (less than 3 symbols) are ignored. For each token, rspamd calculates two non-cryptographic hashes used subsequently as indices. All these tokens are stored in memory-mapped files called `statistic files` (or `statfiles`). Each statfile is a set of token chains, indexed by the first hash. A new token may be inserted into a chain, and if this chain is full then rspamd tries to expire less significant tokens to insert the new one. It is possible to obtain the current state of tokens by running the
-
- rspamc stat
-
-command which outputs statistics for free and used tokens in each statfile. Please note that if a statfile is close to being completely full then during subsequent learning you will lose existing data. Therefore, it is recommended to increase the size of such statfiles.
+rspamd uses statistic algorithms to precisely calculate the final score of a message. Currently, the only algorithm defined is OSB-Bayes. You can find details of this algorithm in the following [paper](http://osbf-lua.luaforge.net/papers/osbf-eddc.pdf). rspamd uses a window size of 5 words in its classification. During the classification procedure, rspamd splits a message into a set of tokens. Tokens are separated by punctuation or whitespace characters. Short tokens (less than 3 symbols) are ignored. For each token, rspamd calculates two non-cryptographic hashes used subsequently as indices. All these tokens are stored in different statistics backends (mmapped files, sqlite3 database or redis server). Currently, the recommended backend for statistics is `redis`.

 ## Running rspamd

 There are several command-line options that can be passed to rspamd. All of them can be displayed by passing the `--help` argument.

-All options are optional: by default rspamd will try to read the `etc/rspamd.conf` config file and run as a daemon. Also there is a test mode that can be turned on by passing the `-t` argument. In test mode, rspamd reads the config file and checks its syntax. If a configuration file is OK, the exit code is zero. Test mode is useful for testing new config files without restarting rspamd. The `--convert-config` option can be used to convert old style (pre 0.6.0) configs to [ucl](../configuration/ucl.md) format:
-
- $ rspamd -c ./rspamd.xml --convert-conf ./rspamd.conf
-
+All options are optional: by default rspamd will try to read the `etc/rspamd.conf` config file and run as a daemon. Also there is a test mode that can be turned on by passing the `-t` argument. In test mode, rspamd reads the config file and checks its syntax. If a configuration file is OK, the exit code is zero. Test mode is useful for testing new config files without restarting rspamd.

 ## Managing rspamd using signals

diff --git a/doc/markdown/architecture/protocol.md b/doc/markdown/architecture/protocol.md
index f16f3380d..b3f94a7e3 100644
--- a/doc/markdown/architecture/protocol.md
+++ b/doc/markdown/architecture/protocol.md
@@ -2,7 +2,7 @@

 ## Protocol basics

-rspamd uses the HTTP protocol, either version 1.0 or 1.1. (There is also a compatibility layer described further in this document.) rspamd defines some headers which allow the passing of extra information about a scanned message, such as envelope data, IP address or SMTP sasl authentication data, etc. rspamd supports normal and chunked encoded HTTP requests; however, form URL encoding is **NOT** supported currently.
+rspamd uses the HTTP protocol, either version 1.0 or 1.1. (There is also a compatibility layer described further in this document.) rspamd defines some headers which allow the passing of extra information about a scanned message, such as envelope data, IP address or SMTP sasl authentication data, etc. rspamd supports normal and chunked encoded HTTP requests.

 ## rspamd HTTP request

@@ -135,7 +135,7 @@ Additional keys which may be in the reply include:

 ## rspamd JSON control block

-Since rspamd version 0.9 it is also possible to pass additional data by prepending a JSON control block to a message. So you can use either headers or a JSON block to pass data from the MTA to rspamd. The advantage of the JSON block is that it can be encrypted using `httpcrypt`. Header encryption is currently unsupported.
+Since rspamd version 0.9 it is also possible to pass additional data by prepending a JSON control block to a message. So you can use either headers or a JSON block to pass data from the MTA to rspamd.

 To use a JSON control block, you need to pass an extra header called `Message-Length` to rspamd. This header should be equal to the size of the message **excluding** the JSON control block. Therefore, the size of the control block is equal to `Content-Length - Message-Length`. rspamd assumes that a message starts immediately after the control block (with no extra CRLF). This method is equally compatible with streaming transfer, however even if you are not specifying `Content-Length` you are still required to specify `Message-Length`.
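
Note on the last context paragraph of the protocol.md hunk: the length arithmetic for the JSON control block can be sketched as follows. This is only an illustration of `Content-Length - Message-Length`, not rspamd's own code. The function names `build_body` and `split_body` and the `ip` key in the control block are hypothetical, chosen for the example; nothing is sent over the network.

```python
import json
from typing import Dict, Tuple


def build_body(control: dict, message: bytes) -> Tuple[Dict[str, str], bytes]:
    """Prepend a JSON control block to a message and derive both length headers."""
    block = json.dumps(control).encode("utf-8")
    # The message starts immediately after the control block: no extra CRLF.
    body = block + message
    headers = {
        "Content-Length": str(len(body)),     # control block + message
        "Message-Length": str(len(message)),  # message only
    }
    return headers, body


def split_body(headers: Dict[str, str], body: bytes) -> Tuple[dict, bytes]:
    """Recover the control block and the message from the two length headers."""
    block_size = int(headers["Content-Length"]) - int(headers["Message-Length"])
    control = json.loads(body[:block_size].decode("utf-8"))
    return control, body[block_size:]


if __name__ == "__main__":
    msg = b"Subject: test\r\n\r\nhello\r\n"
    hdrs, payload = build_body({"ip": "127.0.0.1"}, msg)  # "ip" is a made-up key
    ctrl, recovered = split_body(hdrs, payload)
    print(hdrs, ctrl, recovered == msg)
```

With a streamed (chunked) request there is no `Content-Length`, so a receiver would instead have to take the last `Message-Length` bytes of the stream as the message; the sketch above covers only the non-streamed case.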