[CritFix] Restore the intended pre-filters behaviour
Previously, filters and post-filters were still checked even if a
pre-filter had already set a result. Now a pre-result effectively acts as
a trapdoor straight to writing the reply (as it was before 1.0).
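
The restored behaviour can be pictured with a minimal Python sketch. This is not rspamd code: `process`, the stage callables and the `"no action"` default are hypothetical names used only to show the short-circuit.

```python
from typing import Callable, List, Optional

# Hypothetical stage type: inspects the task and may return a final action
# (for example "reject"), or None to let processing continue.
Stage = Callable[[dict], Optional[str]]


def process(task: dict, pre_filters: List[Stage], filters: List[Stage],
            post_filters: List[Stage]) -> str:
    """Run pre-filters first; the first pre-result skips all later stages."""
    for pre in pre_filters:
        result = pre(task)
        if result is not None:
            # Trapdoor: go straight to writing the reply.
            return result
    # No pre-result was set, so filters and post-filters run as usual.
    for stage in list(filters) + list(post_filters):
        stage(task)
    return task.get("action", "no action")
```

With this shape, a pre-filter that returns a result guarantees that no filter or post-filter is called for that task.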
* Fix parsing of URLs in texts
* Fix creating of URLs from LUA
* Fix some more URL detector issues
* Fix unit tests
* Fix JIT compilation for PCRE2 expressions
* Fix JIT usage for PCRE2
* Fix UTF8 mode in PCRE2
* Add workaround for pre-historic compilers (#605)
* Fix and rescore R_PARTS_DIFFER logic
* Properly set lua paths for tests
* Fix SA rawbody processing - exclude top part
* Store text parts content with newlines stripped
* Properly support SA body regexps
* Fix body rules in SA plugin
* Fix setting of score for parts differ
* More fixes to parts distance calculations
- Use hashed words instead of full words for speed
- Improve Levenshtein distance calculations and penalise replaces
- Always return number from 0 to 1
- Use g_malloc instead of alloca
* Fix percents output in R_PARTS_DIFFER
* Plug memory leak in dkim module
* Plug minor memory leak in regexps creation
* Implement new multipattern matcher that uses hyperscan if possible
* Use multipattern for lua_trie code
* Add utility methods for multipattern
* Use multipattern in url matcher
* Add escape functions for hyperscan
* Allow to optimize lua -> C transition by flattening table args
* Optimize hot paths in SA plugin
* Optimize rspamd_re_cache_type_from_string
* Allow empty tries
* Fix extraction of URLs from Subject
* Allow to have different flags for different patterns in multipattern
* Add common directory for hyperscan cache to config
* Implement caching for hyperscan multipattern
* Attach domain part to `R_SUSPICIOUS_URL` (by @moisseev)
* Allow multipattern scans to be nested for the case of hyperscan
* Simplify SURBL redirector search code and avoid ac_trie
* Add two way substring search algorithm
* Avoid acism usage to find gtube pattern
* Fix processing of empty headers
* Allow to disable pthread mutexes on broken platforms
* Make web interface not send password in query strings (#585) by @fatalbanana
* Add maximum delay to ratelimit module
* Backport fix for empty files inclusion from libucl
* Fix settings id setup
* Add min_learns option to classifiers
* Use a smarter strategy for conversion to UTF-8
* Fix disabling of virtual symbols in the settings
* Rework settings to work properly in metric-less configuration
* Set the default limit for classifier
* Fix ttl based expiration from LRU cache
* Rework DKIM module to use OpenSSL for digests
* Fix mailto urls parsing with hyperscan
* Do not set obscured flag for urls starting with spaces
* Fix crash on redis learn
* Fix ratelimit ctime setting
* New DCC module (by @smfreegard)
* Rework whitelist module:
- Now we check different elements for different checks:
  - MIME from for DMARC
  - DKIM signature domain for DKIM
  - SMTP from or HELO for SPF
* Fix regexps results combination (*critical*)
* Fix issue with expressions processing (*critical*)
* Optimize strlcpy for aligned input
* Add support of half-closed connection in lua_tcp
* Allow to print compact json in client
* Save required score in history (#581)
* Allow to attach file descriptors to control commands
* Allow to send descriptors from workers to main
* Allow to attach fd when broadcasting to workers
* Implement log pipe feature for rspamd logs analysis
* Add `log_helper` worker
* Add `URIBL_SBL_CSS` (by @smfreegard)
* Add worker scripts functionality
* Add on load hooks for rspamd_config
* Add lua scripts for log_helper worker
* Add generic maillist detector (#584)
* Implement FANN autolearn using log_helper worker
* Rework metrics configuration to allow includes
* Change default value of forced removal in composite rules
* Allow to use assembly version of blake2b on x86 cpu
* Use less precise (but faster) clock if possible
* Insert redirected URL to the urls list
* Allow to get and set callback data for rspamd symbols
* Add binary heap implementation
* Use binary heap for expire algorithms in the hash
* Use `least frequently used` expiration strategy
* Allow to get mime headers from a task
* Add support for mime headers in `regexp` module
* Update Exim patches (by @fatalbanana)
* Allow building rspamd with jemalloc
* Save multipart boundaries
* SA plugin changes:
- Properly handle MIME headers
- Fix eval:check_for_missing_to_header rule
- Implement SA compatible body regexps
- Use sabody rules in SA plugin
* LUA API changes:
- Add util.get_ticks function
- Add util.stat function
- Add task:get_symbols_numeric method
- Add method to get number of symbols in the cache
- Add lua methods to get redirected urls
- Allow to get callbacks for lua symbols
- Add config:set_symbol_callback function
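
The parts distance changes listed above (hashed words, Levenshtein distance with penalised replaces, a result from 0 to 1) can be sketched in Python as follows. This is an illustration under assumptions, not the actual implementation: the CRC32 hash, the replace cost of 2 and the normalisation by the combined word count are all choices made for the example.

```python
import zlib
from typing import List

REPLACE_COST = 2  # assumption: a replace costs more than an insert or delete


def hash_words(text: str) -> List[int]:
    """Hash each word once so later comparisons are integer comparisons."""
    return [zlib.crc32(word.encode("utf-8")) for word in text.split()]


def parts_distance(a: str, b: str) -> float:
    """Word-level Levenshtein distance, normalised to the range 0..1."""
    x, y = hash_words(a), hash_words(b)
    if not x and not y:
        return 0.0
    prev = list(range(len(y) + 1))  # distance from an empty prefix of x
    for i, hx in enumerate(x, start=1):
        cur = [i]
        for j, hy in enumerate(y, start=1):
            cost = 0 if hx == hy else REPLACE_COST
            cur.append(min(prev[j] + 1,          # delete a word from x
                           cur[j - 1] + 1,       # insert a word from y
                           prev[j - 1] + cost))  # keep or replace
        prev = cur
    # Deleting all of x and inserting all of y costs len(x) + len(y),
    # so the raw distance never exceeds that sum.
    return prev[-1] / (len(x) + len(y))
```

Identical parts give 0.0 and parts with no words in common give 1.0, so the value can be compared directly against a fixed threshold.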
If the first rule in an expression such as `A + B + C + D > X` matched, it
was counted as `1 + 1` and not as `0 + 1`, because the accumulator was
handled incorrectly in that case.
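
A small Python sketch of the difference, assuming one plausible way such a bug can arise (the accumulator is seeded with the first operand and the first operand is then added again). The function names are hypothetical and the buggy variant is a reconstruction for illustration only.

```python
from typing import Callable, Sequence


def count_matches_buggy(matches: Sequence[bool]) -> int:
    # Hypothetical reconstruction: seed the accumulator with the first
    # operand, then add every operand again, including the first one.
    acc = 1 if matches and matches[0] else 0
    for matched in matches:
        acc += 1 if matched else 0
    return acc


def count_matches(matches: Sequence[bool]) -> int:
    acc = 0  # start from zero so a first match is counted as 0 + 1
    for matched in matches:
        acc += 1 if matched else 0
    return acc


def exceeds(matches: Sequence[bool], threshold: int,
            counter: Callable[[Sequence[bool]], int] = count_matches) -> bool:
    """Evaluate `A + B + C + D > X` for the given match flags."""
    return counter(matches) > threshold
```

With only `A` matched and `X = 1`, the corrected counter yields `1 > 1`, which is false, while the buggy one yields `2 > 1` and fires the rule.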
When converting to vectored mode we need to remember results between
consecutive calls of the regexp match engine. Prior to this patch this
behaviour was broken and caused regexp rules to be matched incorrectly.
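
As a rough illustration, the following Python sketch assumes that "vectored mode" means the match engine is called once per input chunk. `match_vectored` and its arguments are hypothetical. What matters is that the result table is created once and kept across all calls.

```python
import re
from typing import Dict, Iterable


def match_vectored(patterns: Dict[str, str],
                   chunks: Iterable[str]) -> Dict[str, bool]:
    """Match every pattern against a sequence of chunks, keeping results."""
    compiled = {name: re.compile(expr) for name, expr in patterns.items()}
    # Created once, outside the loop: results survive between calls.
    results = {name: False for name in compiled}
    for chunk in chunks:
        for name, rx in compiled.items():
            # A match recorded for an earlier chunk is never overwritten.
            if not results[name] and rx.search(chunk):
                results[name] = True
    return results
```

If the result table were reset for every chunk, a rule that matched only an early chunk would appear unmatched by the end of the scan.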