author Vsevolod Stakhov <vsevolod@highsecure.ru> 2016-07-09 11:14:02 +0100
committer GitHub <noreply@github.com> 2016-07-09 11:14:02 +0100
commit 2d2a741df6954e042ae2f5ce6c1a66c2d61acc11
tree f31d25c7c005d729ec8487732f964d3719552c10 /doc
parent 00c670b84935624f89371ec97e960b0479946c0b
parent dbe2ad682b6bce20afb3b5c8740d550130e9b9c0
Merge pull request #713 from moisseev/patch-1
[Doc] Correct capitalization in `Rspamd architecture`
Diffstat (limited to 'doc')
-rw-r--r-- doc/markdown/architecture/index.md | 42
-rw-r--r-- doc/markdown/architecture/protocol.md | 16
-rw-r--r-- doc/markdown/configuration/composites.md | 2
-rw-r--r-- doc/markdown/configuration/index.md | 8
-rw-r--r-- doc/markdown/configuration/logging.md | 32
-rw-r--r-- doc/markdown/configuration/options.md | 6
-rw-r--r-- doc/markdown/configuration/settings.md | 2
-rw-r--r-- doc/markdown/configuration/statistic.md | 20
-rw-r--r-- doc/markdown/index.md | 6
9 files changed, 67 insertions, 67 deletions
diff --git a/doc/markdown/architecture/index.md b/doc/markdown/architecture/index.md
index e4959728c..710a21064 100644
--- a/doc/markdown/architecture/index.md
+++ b/doc/markdown/architecture/index.md
@@ -2,7 +2,7 @@
## Introduction
-rspamd is a universal spam filtering system based on an event-driven processing model, which means that Rspamd is not intended to block anywhere in the code. To process messages Rspamd uses a set of `rules`. Each `rule` is a symbolic name associated with a message property. For example, we can define the following rules:
+Rspamd is a universal spam filtering system based on an event-driven processing model, which means that Rspamd is not intended to block anywhere in the code. To process messages Rspamd uses a set of `rules`. Each `rule` is a symbolic name associated with a message property. For example, we can define the following rules:
- `SPF_ALLOW` - means that a message is validated by SPF;
- `BAYES_SPAM` - means that a message is statistically considered as spam;
@@ -14,7 +14,7 @@ Rules are defined by [modules](../modules/). If there is a module, for example,
- `SPF_DENY` - a sender is denied by SPF policy;
- `SPF_SOFTFAIL` - there is no affinity defined by SPF policy.
-rspamd supports two main types of modules: internal modules written in C and external modules written in lua. There is no real difference between the two types with the exception that C modules are embedded and can be enabled in a `filters` attribute in the `options` section of the config:
+Rspamd supports two main types of modules: internal modules written in C and external modules written in Lua. There is no real difference between the two types with the exception that C modules are embedded and can be enabled in a `filters` attribute in the `options` section of the config:
~~~ucl
options {
@@ -25,7 +25,7 @@ options {
## Protocol
-rspamd uses the HTTP protocol for all operations. This protocol is described in the [protocol section](protocol.md).
+Rspamd uses the HTTP protocol for all operations. This protocol is described in the [protocol section](protocol.md).
## Metrics
@@ -69,17 +69,17 @@ The weight of rules is not necessarily constant. For example, for statistics rul
## Statistics
-rspamd uses statistic algorithms to precisely calculate the final score of a message. Currently, the only algorithm defined is OSB-Bayes. You can find details of this algorithm in the following [paper](http://osbf-lua.luaforge.net/papers/osbf-eddc.pdf). Rspamd uses a window size of 5 words in its classification. During the classification procedure, Rspamd splits a message into a set of tokens. Tokens are separated by punctuation or whitespace characters. Short tokens (less than 3 symbols) are ignored. For each token, Rspamd calculates two non-cryptographic hashes used subsequently as indices. All these tokens are stored in different statistics backends (mmapped files, sqlite3 database or redis server). Currently, the recommended backend for statistics is `redis`.
+Rspamd uses statistic algorithms to precisely calculate the final score of a message. Currently, the only algorithm defined is OSB-Bayes. You can find details of this algorithm in the following [paper](http://osbf-lua.luaforge.net/papers/osbf-eddc.pdf). Rspamd uses a window size of 5 words in its classification. During the classification procedure, Rspamd splits a message into a set of tokens. Tokens are separated by punctuation or whitespace characters. Short tokens (less than 3 symbols) are ignored. For each token, Rspamd calculates two non-cryptographic hashes used subsequently as indices. All these tokens are stored in different statistics backends (mmapped files, SQLite3 database or Redis server). Currently, the recommended backend for statistics is `Redis`.
-## Running Rspamd
+## Running rspamd
-There are several command-line options that can be passed to Rspamd. All of them can be displayed by passing the `--help` argument.
+There are several command-line options that can be passed to rspamd. All of them can be displayed by passing the `--help` argument.
-All options are optional: by default Rspamd will try to read the `etc/rspamd.conf` config file and run as a daemon. Also there is a test mode that can be turned on by passing the `-t` argument. In test mode, Rspamd reads the config file and checks its syntax. If a configuration file is OK, the exit code is zero. Test mode is useful for testing new config files without restarting Rspamd.
+All options are optional: by default rspamd will try to read the `etc/rspamd.conf` config file and run as a daemon. Also there is a test mode that can be turned on by passing the `-t` argument. In test mode, rspamd reads the config file and checks its syntax. If a configuration file is OK, the exit code is zero. Test mode is useful for testing new config files without restarting rspamd.
-## Managing Rspamd using signals
+## Managing rspamd using signals
-It is important to note that all user signals should be sent to the Rspamd main process and not to its children (as for child processes these signals can have other meanings). You can identify the main process:
+It is important to note that all user signals should be sent to the rspamd main process and not to its children (as for child processes these signals can have other meanings). You can identify the main process:
- by reading the pidfile:
@@ -87,20 +87,20 @@ It is important to note that all user signals should be sent to the Rspamd main
- by getting process info:
- $ ps auxwww | grep Rspamd
- nobody 28378 0.0 0.2 49744 9424 Rspamd: main process
- nobody 64082 0.0 0.2 50784 9520 Rspamd: worker process
- nobody 64083 0.0 0.3 51792 11036 Rspamd: worker process
- nobody 64084 0.0 2.7 158288 114200 Rspamd: controller process
- nobody 64085 0.0 1.8 116304 75228 Rspamd: fuzzy storage
+ $ ps auxwww | grep rspamd
+ nobody 28378 0.0 0.2 49744 9424 rspamd: main process
+ nobody 64082 0.0 0.2 50784 9520 rspamd: worker process
+ nobody 64083 0.0 0.3 51792 11036 rspamd: worker process
+ nobody 64084 0.0 2.7 158288 114200 rspamd: controller process
+ nobody 64085 0.0 1.8 116304 75228 rspamd: fuzzy storage
- $ ps auxwww | grep Rspamd | grep main
- nobody 28378 0.0 0.2 49744 9424 Rspamd: main process
+ $ ps auxwww | grep rspamd | grep main
+ nobody 28378 0.0 0.2 49744 9424 rspamd: main process
-After getting the pid of the main process it is possible to manage Rspamd with signals, as follows:
+After getting the pid of the main process it is possible to manage rspamd with signals, as follows:
-- `SIGHUP` - restart Rspamd: reread config file, start new workers (as well as controller and other processes), stop accepting connections by old workers, reopen all log files. Note that old workers would be terminated after one minute which should allow processing of all pending requests. All new requests to Rspamd will be processed by the newly started workers.
-- `SIGTERM` - terminate Rspamd.
+- `SIGHUP` - restart rspamd: reread config file, start new workers (as well as controller and other processes), stop accepting connections by old workers, reopen all log files. Note that old workers would be terminated after one minute which should allow processing of all pending requests. All new requests to rspamd will be processed by the newly started workers.
+- `SIGTERM` - terminate rspamd.
- `SIGUSR1` - reopen log files (useful for log file rotation).
-These signals may be used in rc-style scripts. Restarting of Rspamd is performed softly: no connections are dropped and if a new config is incorrect then the old config is used.
+These signals may be used in rc-style scripts. Restarting of rspamd is performed softly: no connections are dropped and if a new config is incorrect then the old config is used.
diff --git a/doc/markdown/architecture/protocol.md b/doc/markdown/architecture/protocol.md
index 09bffd4d1..81d10d67b 100644
--- a/doc/markdown/architecture/protocol.md
+++ b/doc/markdown/architecture/protocol.md
@@ -2,11 +2,11 @@
## Protocol basics
-rspamd uses the HTTP protocol, either version 1.0 or 1.1. (There is also a compatibility layer described further in this document.) Rspamd defines some headers which allow the passing of extra information about a scanned message, such as envelope data, IP address or SMTP sasl authentication data, etc. Rspamd supports normal and chunked encoded HTTP requests.
+Rspamd uses the HTTP protocol, either version 1.0 or 1.1. (There is also a compatibility layer described further in this document.) Rspamd defines some headers which allow the passing of extra information about a scanned message, such as envelope data, IP address or SMTP SASL authentication data, etc. Rspamd supports normal and chunked encoded HTTP requests.
## Rspamd HTTP request
-rspamd encourages the use of the HTTP protocol since it is standard and can be used by every programming language without the use of exotic libraries. A typical HTTP request looks like the following:
+Rspamd encourages the use of the HTTP protocol since it is standard and can be used by every programming language without the use of exotic libraries. A typical HTTP request looks like the following:
POST /check HTTP/1.0
Content-Length: 26969
@@ -52,11 +52,11 @@ Standard HTTP headers, such as `Content-Length`, are also supported.
## Rspamd HTTP reply
-rspamd reply is encoded in `JSON`. Here is a typical HTTP reply:
+Rspamd reply is encoded in `JSON`. Here is a typical HTTP reply:
HTTP/1.1 200 OK
Connection: close
- Server: Rspamd/0.9.0
+ Server: rspamd/0.9.0
Date: Mon, 30 Mar 2015 16:19:35 GMT
Content-Length: 825
Content-Type: application/json
@@ -105,7 +105,7 @@ rspamd reply is encoded in `JSON`. Here is a typical HTTP reply:
}
~~~
-For convenience, the reply is LINTed using [jsonlint](http://jsonlint.com). The actual reply is compressed for speed.
+For convenience, the reply is LINTed using [JSONLint](http://jsonlint.com). The actual reply is compressed for speed.
The reply can be treated as a JSON object where keys are metric names (namely `default`) and values are objects that represent metrics.
@@ -114,7 +114,7 @@ Each metric has the following fields:
* `is_spam` - boolean value that indicates whether a message is spam
* `is_skipped` - boolean flag that is `true` if a message has been skipped due to settings
* `score` - floating point value representing the effective score of message
-* `required_score` - floating point value meaning the treshold value for the metric
+* `required_score` - floating point value meaning the threshold value for the metric
* `action` - recommended action for a message:
- `no action` - message is likely ham;
- `greylist` - message should be greylisted;
@@ -128,7 +128,7 @@ Additionally, metric contains all symbols added during a message's processing, i
Additional keys which may be in the reply include:
* `subject` - if action is `rewrite subject` this value defines the desired subject for a message
-* `urls` - a list of urls found in a message (only hostnames)
+* `urls` - a list of URLs found in a message (only hostnames)
* `emails` - a list of emails found in a message
* `message-id` - ID of message (useful for logging)
* `messages` - array of optional messages added by Rspamd filters (such as `SPF`)
@@ -151,4 +151,4 @@ Here is an example of a JSON control block:
}
~~~
-Moreover, [UCL](https://github.com/vstakhov/libucl) json extensions and syntax conventions are also supported inside the control block.
\ No newline at end of file
+Moreover, [UCL](https://github.com/vstakhov/libucl) JSON extensions and syntax conventions are also supported inside the control block.
diff --git a/doc/markdown/configuration/composites.md b/doc/markdown/configuration/composites.md
index c5e97ed1d..3e4596399 100644
--- a/doc/markdown/configuration/composites.md
+++ b/doc/markdown/configuration/composites.md
@@ -2,7 +2,7 @@
## Introduction
-rspamd composites are used to combine rules and create more complex rules. Composite rules are defined by `composite` keys. The value of the key should be an object that defines the composite's name and value, which is the combination of rules in a joint expression.
+Rspamd composites are used to combine rules and create more complex rules. Composite rules are defined by `composite` keys. The value of the key should be an object that defines the composite's name and value, which is the combination of rules in a joint expression.
For example, you can define a composite that is added when two specific symbols are found:
diff --git a/doc/markdown/configuration/index.md b/doc/markdown/configuration/index.md
index 6cc5e049e..f1c49aa4c 100644
--- a/doc/markdown/configuration/index.md
+++ b/doc/markdown/configuration/index.md
@@ -1,15 +1,15 @@
# Rspamd configuration
-rspamd uses the Universal Configuration Language (UCL) for its configuration. The UCL format is described in detail in this [document](ucl.md). Rspamd defines several variables and macros to extend
+Rspamd uses the Universal Configuration Language (UCL) for its configuration. The UCL format is described in detail in this [document](ucl.md). Rspamd defines several variables and macros to extend
UCL functionality.
## Rspamd variables
- *CONFDIR*: configuration directory for Rspamd, found in `$PREFIX/etc/rspamd/`
-- *RUNDIR*: runtime directory to store pidfiles or unix sockets
+- *RUNDIR*: runtime directory to store pidfiles or UNIX sockets
- *DBDIR*: persistent databases directory (used for statistics or symbols cache).
- *LOGDIR*: a directory to store log files
-- *PLUGINSDIR*: plugins directory for lua plugins
+- *PLUGINSDIR*: plugins directory for Lua plugins
- *PREFIX*: basic installation prefix
- *VERSION*: Rspamd version string (e.g. "0.6.6")
@@ -39,7 +39,7 @@ modules {
}
~~~
-In this file, we read a lua script placed in `$CONFDIR/lua/rspamd.lua` and load lua rules from it. Then we include a global [options](options.md) section followed by [logging](logging.md) logging configuration. The [metrics](metrics.md) section defines metric settings, including rule weights and Rspamd actions. The [workers](../workers/index.md) section specifies Rspamd workers settings. [Composites](composites.md) is a utility section that describes composite symbols. Statistical filters are defined in the [statistic](statistic.md) section. Rspamd stores module configurations (for both lua and internal modules) in the [modules](../modules/index.md) section while modules themselves are loaded from the following portion of the configuration:
+In this file, we read a Lua script placed in `$CONFDIR/lua/rspamd.lua` and load Lua rules from it. Then we include a global [options](options.md) section followed by [logging](logging.md) logging configuration. The [metrics](metrics.md) section defines metric settings, including rule weights and Rspamd actions. The [workers](../workers/index.md) section specifies Rspamd workers settings. [Composites](composites.md) is a utility section that describes composite symbols. Statistical filters are defined in the [statistic](statistic.md) section. Rspamd stores module configurations (for both Lua and internal modules) in the [modules](../modules/index.md) section while modules themselves are loaded from the following portion of the configuration:
~~~ucl
modules {
diff --git a/doc/markdown/configuration/logging.md b/doc/markdown/configuration/logging.md
index 63e0799b7..4ae51d532 100644
--- a/doc/markdown/configuration/logging.md
+++ b/doc/markdown/configuration/logging.md
@@ -1,24 +1,24 @@
# Rspamd logging settings
## Introduction
-rspamd has a number of logging options. Firstly, there are three types of log output that are supported: console logging (just output log messages to console), file logging (output log messages to file) and logging via syslog. It is also possible to restrict logging to a specific level:
+Rspamd has a number of logging options. Firstly, there are three types of log output that are supported: console logging (just output log messages to console), file logging (output log messages to file) and logging via syslog. It is also possible to restrict logging to a specific level:
* `error` - log only critical errors
* `warning` - log errors and warnings
* `info` - log all non-debug messages
* `debug` - log all including debug messages (huge amount of logging)
-It is possible to turn on debug messages for specific ip addresses. This can be useful for testing. For each logging type there are special mandatory parameters: log facility for syslog (read `syslog(3)` man page for details about facilities), log file for file logging. Also, file logging may be buffered for performance. To reduce logging noise, Rspamd detects sequential matching log messages and replaces them with a total number of repeats:
+It is possible to turn on debug messages for specific IP addresses. This can be useful for testing. For each logging type there are special mandatory parameters: log facility for syslog (read `syslog(3)` man page for details about facilities), log file for file logging. Also, file logging may be buffered for performance. To reduce logging noise, Rspamd detects sequential matching log messages and replaces them with a total number of repeats:
- #81123(fuzzy): May 11 19:41:54 Rspamd file_log_function: Last message repeated 155 times
- #81123(fuzzy): May 11 19:41:54 Rspamd process_write_command: fuzzy hash was successfully added
+ #81123(fuzzy): May 11 19:41:54 rspamd file_log_function: Last message repeated 155 times
+ #81123(fuzzy): May 11 19:41:54 rspamd process_write_command: fuzzy hash was successfully added
-## Unique id
+## Unique ID
-From version 1.0, Rspamd logs contain a unique id for each logging message. This allows finding relevant messages quickly. Moreover, there is now a `module` definition: for example, `task` or `cfg` modules. Here is a quick example of how it works: imagine that we have an incoming task for some message. Then you'd see something like this in the logs:
+From version 1.0, Rspamd logs contain a unique ID for each logging message. This allows finding relevant messages quickly. Moreover, there is now a `module` definition: for example, `task` or `cfg` modules. Here is a quick example of how it works: imagine that we have an incoming task for some message. Then you'd see something like this in the logs:
2015-09-02 16:41:59 #45015(normal) <ed2abb>; task; accept_socket: accepted connection from ::1 port 52895
- 2015-09-02 16:41:59 #45015(normal) <ed2abb>; task; Rspamd_message_parse: loaded message; id: <F66099EE-BCAB-4D4F-A4FC-7C15A6686397@FreeBSD.org>; queue-id: <undef>
+ 2015-09-02 16:41:59 #45015(normal) <ed2abb>; task; rspamd_message_parse: loaded message; id: <F66099EE-BCAB-4D4F-A4FC-7C15A6686397@FreeBSD.org>; queue-id: <undef>
So the tag is `ed2abb` in this case. All subsequent processing related to this task will have the same tag. It is enabled not only on the `task` module, but also others, such as the `spf` or `lua` modules. For other modules, such as `cfg`, the tag is generated statically using a specific characteristic, for example the configuration file checksum.
@@ -31,15 +31,15 @@ Here is summary of logging parameters:
+ `facility` - logging facility for syslog
- `level` - Defines logging level (error, warning, info or debug).
- `log_buffer` - For file and console logging defines buffer size that will be used for logging output.
-- `log_urls` - Flag that defines whether all urls in message should be logged. Useful for testing.
-- `debug_ip` - List that contains ip addresses for which debugging should be turned on.
+- `log_urls` - Flag that defines whether all URLs in message should be logged. Useful for testing.
+- `debug_ip` - List that contains IP addresses for which debugging should be turned on.
- `log_color` - Turn on coloring for log messages. Default: `no`.
- `debug_modules` - A list of modules that are enabled for debugging. The following modules are available here:
+ `task` - task messages
+ `cfg` - configuration messages
+ `symcache` - messages from symbols cache
+ `fuzzy_backend` - messages from fuzzy backend
- + `lua` - messages from lua code
+ + `lua` - messages from Lua code
+ `spf` - messages from spf module
+ `dkim` - messages from dkim module
+ `main` - messages from the main process
@@ -49,7 +49,7 @@ Here is summary of logging parameters:
### Log format
-rspamd supports a custom log format when writing information about a message to the log. (This feature is supported since version 1.1.) The format string looks as follows:
+Rspamd supports a custom log format when writing information about a message to the log. (This feature is supported since version 1.1.) The format string looks as follows:
log_format =<< EOD
@@ -61,10 +61,10 @@ rspamd supports a custom log format when writing information about a message to
Newlines are replaced with spaces. Both text and variables are supported in the log format line. Each variable can have an optional `if_` prefix, which will log only if it is triggered. Moreover, each variable can have an optional body value, where `$` is replaced with the variable value (as many times as it is found in the body, e.g. `$var{$$$$}` will be replaced with the variable's name repeated 4 times).
-rspamd supports the following variables:
+Rspamd supports the following variables:
-- `mid` - message id
-- `qid` - queue id
+- `mid` - message ID
+- `qid` - queue ID
- `ip` - from IP
- `user` - authenticated user
- `smtp_from` - envelope from (or MIME from if SMTP from is absent)
@@ -73,14 +73,14 @@ rspamd supports the following variables:
- `mime_rcpt` - MIME rcpt - the first recipient
- `smtp_rcpts` - envelope rcpts - all recipients
- `mime_rcpts` - MIME rcpts - all recipients
-- `len` - length of essage
+- `len` - length of message
- `is_spam` - a one-letter rating of spammyness: `T` for spam, `F` for ham and `S` for skipped messages
- `action` - default metric action
- `symbols` - list of all symbols
- `time_real` - real time of task processing
- `time_virtual` - CPU time of task processing
- `dns_req` - number of DNS requests
-- `lua` - custom lua script, e.g:
+- `lua` - custom Lua script, e.g:
~~~lua
$lua{
diff --git a/doc/markdown/configuration/options.md b/doc/markdown/configuration/options.md
index fcf524288..7f494fd6f 100644
--- a/doc/markdown/configuration/options.md
+++ b/doc/markdown/configuration/options.md
@@ -40,17 +40,17 @@ control_socket = "$DBDIR/rspamd.sock mode=0600";
* `history_file`: this file is automatically created and refreshed on shutdown to preserve the rolling history of operations displayed by the WebUI across restarts
* `temp_dir`: a directory for temporary files (can also be set via the environment variable `TMPDIR`).
* `url_tld`: path to file with top level domain suffixes used by Rspamd to find URLs in messages; by default this file is shipped with Rspamd and should not be touched manually
-* `pid_file`: file used to store pid of the Rspamd main process (not used with systemd)
+* `pid_file`: file used to store PID of the Rspamd main process (not used with systemd)
* `min_word_len`: minimum size in letters (valid for utf-8 as well) for a sequence of characters to be treated as a word; normally Rspamd skips sequences if they are shorter or equal to three symbols
* `control_socket`: path/bind for the control socket
* `classify_headers`: list of headers that are processed by statistics
* `history_rows`: number of rows in the recent history table
* `explicit_modules`: always load modules from the list even if they have no configuration section in the file
-* `disable_hyperscan`: disable hyperscan optimizations (if enabled at compile time)
+* `disable_hyperscan`: disable Hyperscan optimizations (if enabled at compile time)
* `cores_dir`: directory where Rspamd should drop core files
* `max_cores_size`: maximum total size of core files that are placed in `cores_dir`
* `max_cores_count`: maximum number of files in `cores_dir`
-* `local_addrs` or `local_networks`: map or list of ip networks used as local, so certain checks are skipped for them (e.g. SPF checks)
+* `local_addrs` or `local_networks`: map or list of IP networks used as local, so certain checks are skipped for them (e.g. SPF checks)
## DNS options
diff --git a/doc/markdown/configuration/settings.md b/doc/markdown/configuration/settings.md
index 0f4a70af2..c35920368 100644
--- a/doc/markdown/configuration/settings.md
+++ b/doc/markdown/configuration/settings.md
@@ -2,7 +2,7 @@
## Introduction
-rspamd allows exceptional control over the settings which will apply to incoming messages. Each setting can define a set of custom metric weights, symbols or actions. An administrator can also skip spam checks for certain messages completely, if required. Rspamd settings can be loaded as dynamic maps and updated automatically if a corresponding file or URL has changed since its last update.
+Rspamd allows exceptional control over the settings which will apply to incoming messages. Each setting can define a set of custom metric weights, symbols or actions. An administrator can also skip spam checks for certain messages completely, if required. Rspamd settings can be loaded as dynamic maps and updated automatically if a corresponding file or URL has changed since its last update.
To load settings as a dynamic map, you can set 'settings' to a map string:
diff --git a/doc/markdown/configuration/statistic.md b/doc/markdown/configuration/statistic.md
index 26b2b70e7..329832091 100644
--- a/doc/markdown/configuration/statistic.md
+++ b/doc/markdown/configuration/statistic.md
@@ -6,7 +6,7 @@ Statistics is used by Rspamd to define the `class` of message: either spam or ha
that defines probabilities combination. In general, it defines the probability of that a message belongs to the specified class (namely, `spam` or `ham`)
base on the following factors:
-- the probability of a specific token to be spam or ham (which means efficiently count of a token's occurences in spam and ham messages)
+- the probability of a specific token to be spam or ham (which means efficiently count of a token's occurrences in spam and ham messages)
- the probability of a specific token to appear in a message (which efficiently means frequency of a token divided by a number of tokens in a message)
## Statistics Architecture
@@ -30,7 +30,7 @@ Starting from Rspamd 1.0, we propose to use `sqlite3` as backed and `osb` as tok
metainformation in statistics. The following configuration demonstrates the recommended statistics configuration:
~~~ucl
-# Classifier's algorith is BAYES
+# Classifier's algorithm is BAYES
classifier "bayes" {
tokenizer {
name = "osb";
@@ -63,16 +63,16 @@ classifier "bayes" {
}
~~~
-It is also possible to organize per-user statistics using sqlite3 backend. However, you should ensure that Rspamd is called at the
+It is also possible to organize per-user statistics using SQLite3 backend. However, you should ensure that Rspamd is called at the
finally delivery stage (e.g. LDA mode) to avoid multi-recipients messages. In case of a multi-recipient message, Rspamd would just use the
-first recipient for user-based statistics which might be inappropriate for your configuration (however, Rspamd preferes SMTP recipients over MIME ones and prioritize
+first recipient for user-based statistics which might be inappropriate for your configuration (however, Rspamd prefers SMTP recipients over MIME ones and prioritize
the special LDA header called `Deliver-To` that can be appended by `-d` options for `rspamc`). To enable per-user statistics, just add `users_enabled = true` property
-to the **classifier** configuration. You can use per-user and per-language statistics simulataneously. For both types of spearation, Rspamd also
+to the **classifier** configuration. You can use per-user and per-language statistics simultaneously. For both types of statistics, Rspamd also
looks to the default language and default user's statistics allowing to have the common set of tokens shared for all users/languages.
-## Using lua scripts for `per_user` classifier
+## Using Lua scripts for `per_user` classifier
-It is also possible to create custom lua scripts to use customized user or language for a specific task. Here is an example
+It is also possible to create custom Lua scripts to use customized user or language for a specific task. Here is an example
of such a script for extracting domain names from recipients organizing thus per-domain statistics:
~~~ucl
@@ -178,7 +178,7 @@ To learn specific classifier, you can use `-c` option for `rspamc` (or `Classifi
## Redis statistics
-From version 1.1, it is also possible to specify redis as a backend for statistics and cache of learned messages. Redis is recommended for clustered configurations as it allows simultaneous learn and checks and, besides, is very fast. To setup redis, you could use `redis` backend for a classifier (cache is set to the same servers accordingly).
+From version 1.1, it is also possible to specify Redis as a backend for statistics and cache of learned messages. Redis is recommended for clustered configurations as it allows simultaneous learn and checks and, besides, is very fast. To setup Redis, you could use `redis` backend for a classifier (cache is set to the same servers accordingly).
~~~ucl
classifier "bayes" {
@@ -205,7 +205,7 @@ From version 1.1, it is also possible to specify redis as a backend for statisti
}
~~~
-`per_languages` is not supported by redis - it just stores everything in the same place. `write_servers` are used in the
+`per_languages` is not supported by Redis - it just stores everything in the same place. `write_servers` are used in the
`master-slave` rotation by default and used for learning, whilst `servers` are selected randomly each time:
write_servers = "master.example.com:6379:10, slave.example.com:6379:1"
@@ -221,6 +221,6 @@ There are 3 possibilities to specify autolearning:
* `autolearn = true`: autolearning is performing as spam if a message has `reject` action and as ham if a message has **negative** score
* `autolearn = [1, 10]`: autolearn as ham if score is less than minimum of 2 numbers (< `1` here) and as spam if score is more than maximum of 2 numbers (> `10` in this case)
-* `autolearn = "return function(task) ... end"`: use the following lua function to detect if autolearn is needed (function should return 'ham' if learn as ham is needed and string 'spam' if learn as spam is needed, if no learn is needed then a function can return anything including `nil`)
+* `autolearn = "return function(task) ... end"`: use the following Lua function to detect if autolearn is needed (function should return 'ham' if learn as ham is needed and string 'spam' if learn as spam is needed, if no learn is needed then a function can return anything including `nil`)
Redis backend is highly recommended for autolearning purposes since it's the only backend with high concurrency level when multiple writers are properly synchronized.
diff --git a/doc/markdown/index.md b/doc/markdown/index.md
index 68955f8e3..7b8ea49b3 100644
--- a/doc/markdown/index.md
+++ b/doc/markdown/index.md
@@ -13,8 +13,8 @@ Here are the main introduction documents that are recommended for reading if you
### Rspamd and Dovecot Antispam integration
-* [Training Rspamd with dovecot antispam plugin, part 1](https://kaworu.ch/blog/2014/03/25/dovecot-antispam-with-Rspamd/) - this tutorial describes how to train Rspamd automatically using the `antispam` pluging of the `dovecot` IMAP server
-* [Training Rspamd with dovecot antispam plugin, part 2](https://kaworu.ch/blog/2015/10/12/dovecot-antispam-with-Rspamd-part2/) - continuation of the previous tutorial
+* [Training Rspamd with Dovecot antispam plugin, part 1](https://kaworu.ch/blog/2014/03/25/dovecot-antispam-with-rspamd/) - this tutorial describes how to train Rspamd automatically using the `antispam` plugin of the `Dovecot` IMAP server
+* [Training Rspamd with Dovecot antispam plugin, part 2](https://kaworu.ch/blog/2015/10/12/dovecot-antispam-with-rspamd-part2/) - continuation of the previous tutorial
## Configuration
@@ -38,4 +38,4 @@ These documents are useful if you need to know details about Rspamd internals.
This section contains documents about writing new rules for Rspamd and, in particular, Rspamd Lua API.
* **[Writing Rspamd rules](./tutorials/writing_rules.md)** is a step-by-step guide that describes how to write rules for Rspamd
-* **[LUA API reference](./lua/)** provides the extensive information about all LUA modules available in Rspamd
+* **[Lua API reference](./lua/)** provides the extensive information about all Lua modules available in Rspamd