path: root/src/classifiers/winnow.c
Commit message | Author | Age | Files | Lines
* Reorganize statfiles and classifiers into libstat. | Vsevolod Stakhov | 2015-01-16 | 1 | -694/+0
* Refactor function names. | Vsevolod Stakhov | 2014-09-23 | 1 | -1/+1
* Rework lua function names to avoid ambiguity. | Vsevolod Stakhov | 2014-08-17 | 1 | -2/+2
* Unify style without sorting headers. | Vsevolod Stakhov | 2014-07-23 | 1 | -146/+236
* Revert "Unify code style." | Vsevolod Stakhov | 2014-07-23 | 1 | -239/+149
    This reverts commit e0483657ff6cf1adc828ccce457814d61fe90a0d.
* Unify code style. | Vsevolod Stakhov | 2014-07-23 | 1 | -149/+239
* Refactor config API. | Vsevolod Stakhov | 2014-04-30 | 1 | -4/+4
* Refactor worker task structure and API. | Vsevolod Stakhov | 2014-04-21 | 1 | -3/+3
* Refactor memory pool naming. | Vsevolod Stakhov | 2014-04-20 | 1 | -6/+6
* Another debian license fix. | Vsevolod Stakhov | 2012-09-10 | 1 | -1/+1
    Add apache license for regexp that were delivered from SpamAssassin project.
    Fix debian/copyright for src/dns.c.
* Update copyright (required by debian). | Vsevolod Stakhov | 2012-09-04 | 1 | -3/+3
* * Rework thread pools locking logic to avoid global lua mutex usage. | Vsevolod Stakhov | 2012-08-22 | 1 | -4/+4
    Fixed several memory leaks with modern glib.
    Fixed memory leak in dkim code.
    Fixed a problem with static global variables in shared libraries.
* * Fix build under CentOS 5 with old glib 2.12 | Vsevolod Stakhov | 2011-07-29 | 1 | -7/+5
    * Fix build of rspamd with CMAKE_BINARY_DIR differs from CMAKE_SOURCE_DIR
    Rework include style.
* * First commit to implement multi-statfile filter system with new learning mechanizm (untested yet) | Vsevolod Stakhov | 2011-07-12 | 1 | -13/+20
* Fixes in classifying for small messages. | Vsevolod Stakhov | 2011-01-25 | 1 | -3/+12
* Remove G_INLINE_FUNC definitions as I misunderstood its purposes. | Vsevolod Stakhov | 2010-10-15 | 1 | -1/+1
* Fixes bugs found with clang-static analyser. | Vsevolod Stakhov | 2010-10-11 | 1 | -2/+13
    Strictly follow c99 standart.
    Turn on pedantic c99 checks.
* * Fix races in fuzzy storage | Vsevolod Stakhov | 2010-08-26 | 1 | -1/+3
* * Fix normalization for systems that have not tanhl function | Vsevolod Stakhov | 2010-08-18 | 1 | -2/+2
* * Remove normalizer as it is winnow specific thing, so all statistic algorithms now returns value from 0 to 1 | Vsevolod Stakhov | 2010-08-13 | 1 | -4/+18
* * Add bayesian classifier (initial version) | Vsevolod Stakhov | 2010-08-13 | 1 | -10/+10
* * One more try to improve accuracy of winnow algorithm | Vsevolod Stakhov | 2010-08-06 | 1 | -10/+19
* * Fixes to winnow learning | Vsevolod Stakhov | 2010-08-05 | 1 | -41/+107
* * Fix some logic errors in learning | Vsevolod Stakhov | 2010-08-03 | 1 | -6/+8
* * Improve logic of learning messages: do not learn more than specific threshold | Vsevolod Stakhov | 2010-08-02 | 1 | -8/+82
    * Fix inserting results for symbols that were incorrectly (for example more than 1 time) defined in config file
* * Change metric logic | Vsevolod Stakhov | 2010-06-16 | 1 | -1/+1
    * Completely remove lex/yacc readers for config
    * Make common sense of metric/action and symbols
    * Sync changes with all plugins
    TODO: add this to documentation
* * Fix strict aliasing while compiling with optimization | Vsevolod Stakhov | 2010-05-31 | 1 | -1/+1
    * Fix tanhl detection for platforms that have not implementation of it
    * Remove several compile warnings
* * Fix order | Vsevolod Stakhov | 2010-05-27 | 1 | -3/+3
* * In classify normalize result after comparing, not before | Vsevolod Stakhov | 2010-05-27 | 1 | -3/+3
* * Convert statistic sums to use long double for counters | Vsevolod Stakhov | 2010-05-27 | 1 | -23/+17
    * Use hyperbolic tangent for internal normalizer
* * Implement new learning system, now rspamd should be much more intelligent while learning messages | Vsevolod Stakhov | 2010-05-27 | 1 | -32/+125
* * Fix awfull bug in classifying when first statfile has twice weight than second... | Vsevolod Stakhov | 2010-05-14 | 1 | -2/+2
    * Fix undisclosed recipients detection
* * Bugfixes: | Vsevolod Stakhov | 2010-04-20 | 1 | -1/+1
    - handle '\' characters in lua strings correctly
    - fix lua initialization
    - avoid of using global lua state (global L)
    - fix listen sockets hash to allow multiply workers of same type but on different listen sockets
    - fix modules options inserting to allow multiply options of the same name
    - fix parsing of lua options
    - fix lua rules
* * Add option min_tokens to classifier that allows to skip too short messages from statistic check, format: min_tokens = "10"; (for 10 words minimum) | Vsevolod Stakhov | 2010-03-22 | 1 | -1/+34
* * Fix bugs from previous commit | cebka@lenovo-laptop | 2010-03-01 | 1 | -6/+3
* * Add weights command for getting weights of each message by each statfile | cebka@lenovo-laptop | 2010-03-01 | 1 | -2/+65
    * Add ability to specify multiplier when learning
    * Add statistics about spam and ham messages
* * Forgotten call of normalizer function | cebka@lenovo-laptop | 2010-01-14 | 1 | -0/+3
* * Introduce new logging system: | Vsevolod Stakhov | 2009-12-22 | 1 | -1/+1
    - independent and customizeable buffering
    - line buffering
    - errors handling support
    - custom (ip based) debug
    - append function name automaticaly (based on __FUNCTION__)
    - add some logic to logs system
* * Implement pre and post classify callbacks for checking specific statfiles for this task | Vsevolod Stakhov | 2009-12-16 | 1 | -3/+21
    TODO:
    - add properties to get all parameters of input task
    - add properties to statfile object
    - add some normalization function for calling from classify process
    - document changes
* * Fix symbols cache (init lua filters before symbols cache initialization) | Vsevolod Stakhov | 2009-12-14 | 1 | -3/+21
    * Remove LRU expiration logic from statfiles and replace it with random/lowerest value expiration logic: expire random block or block with lowerest value
    ! statfiles are incompatible again
* * Many major fixes to statfiles: | Vsevolod Stakhov | 2009-12-03 | 1 | -16/+29
    - fix bug with mmapping files: new addresses must NOT be allocated in shared memory by themselves
    - fix bug with winnow classifier that totally brokes it down
    - fix bug with too much grow of values
    * Use double precission values in statistics
    * Add statistics for statfiles
    * Add more informative data to output of LEARN command (weight of incoming message)
    * Add weight to output of classifier as well
* * Write revision and revision time to statfile | Vsevolod Stakhov | 2009-11-12 | 1 | -22/+2
    * Make some improvements to API (trying to make it more clear)
* * Add binlog API implementation | Vsevolod Stakhov | 2009-11-06 | 1 | -1/+3
* * Add ability to change statfile size limit in config and allow reindexing of statfiles | Vsevolod Stakhov | 2009-10-16 | 1 | -4/+4
* * Retab, no functional changes | Vsevolod Stakhov | 2009-10-02 | 1 | -44/+44
* * Fix race between learn and classify | Vsevolod Stakhov | 2009-09-28 | 1 | -1/+5
* * Fix learning | Vsevolod Stakhov | 2009-09-25 | 1 | -3/+5
* * Remove assert | Vsevolod Stakhov | 2009-09-16 | 1 | -3/+5
    * Fix build WITH_LUA
    * Fix calling of classifier
    * Fix autolearn
* * New system of classifiers interface and statfiles processing | Vsevolod Stakhov | 2009-09-14 | 1 | -57/+56
    * Fix sample config
    * Fix compile warnings
    * Fix building without lua support
    * Fix bugs with nrcpt header parsing and symbols cache loading (by Anton Nekhoroshikh)
* * Rework structure and API of statfiles functions to improve performance and avoid missusage of hash table | Vsevolod Stakhov | 2009-07-02 | 1 | -13/+11
    * Correct url length calculation in urls command