index : rspamd.git
Rapid spam filtering system: https://github.com/rspamd/rspamd
path: root/src/libmime/lang_detection.c
| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| [Minor] Fix multipattern usage | Vsevolod Stakhov | 2020-08-04 | 1 | -8/+6 |
| [Rework] Refactor libraries structure | Vsevolod Stakhov | 2020-02-10 | 1 | -1/+1 |
| [Minor] Add some more heuristics for stop words detection | Vsevolod Stakhov | 2020-02-08 | 1 | -1/+39 |
| [Minor] Oops, fix format string | Vsevolod Stakhov | 2020-02-07 | 1 | -1/+1 |
| [Minor] Further fixes in stop words detection | Vsevolod Stakhov | 2020-02-07 | 1 | -14/+15 |
| [Fix] Ignore diacritics in chartable module for specific languages | Vsevolod Stakhov | 2020-02-04 | 1 | -1/+1 |
| [Minor] Add diacritics flag for language detector | Vsevolod Stakhov | 2020-02-04 | 1 | -9/+35 |
| [Minor] Langdet: Add threshold for stop words | Vsevolod Stakhov | 2019-08-02 | 1 | -0/+6 |
| [Minor] Langdet: Exclude exceptions (e.g. urls) | Vsevolod Stakhov | 2019-08-02 | 1 | -0/+1 |
| [Minor] Show stop words found | Vsevolod Stakhov | 2019-08-02 | 1 | -1/+10 |
| [Feature] Langdet: Limit number of stop words to be checked | Vsevolod Stakhov | 2019-07-25 | 1 | -0/+5 |
| [Minor] Another try to plug a leak | Vsevolod Stakhov | 2019-06-26 | 1 | -0/+4 |
| [Minor] Plug more leaks | Vsevolod Stakhov | 2019-06-26 | 1 | -7/+2 |
| [Minor] Langdet: Improve debugging slightly | Vsevolod Stakhov | 2019-06-05 | 1 | -0/+3 |
| [CritFix] Langdet: Fix language detection where no stop words found | Vsevolod Stakhov | 2019-06-05 | 1 | -3/+20 |
| [Minor] Langdet: Increase cut-off limit | Vsevolod Stakhov | 2019-06-05 | 1 | -1/+1 |
| [Fix] Lang_det: Try better to distinguish Chinese and Japanese | Vsevolod Stakhov | 2019-06-05 | 1 | -13/+47 |
| [Fix] Fix memory leak in language detector during reloads | Vsevolod Stakhov | 2019-05-03 | 1 | -0/+2 |
| [Minor] Fix leak | Vsevolod Stakhov | 2019-02-26 | 1 | -1/+4 |
| [Minor] Fix loading of unicode multipatterns | Vsevolod Stakhov | 2019-02-14 | 1 | -0/+14 |
| [Rework] Slashing: Distinguish lualibdir, pluginsdir and sharedir | Vsevolod Stakhov | 2018-12-26 | 1 | -1/+1 |
| [Minor] Count words based on text words | Vsevolod Stakhov | 2018-11-30 | 1 | -3/+3 |
| [Fix] Fix double free | Vsevolod Stakhov | 2018-11-29 | 1 | -4/+0 |
| [Minor] Another fail-safety check | Vsevolod Stakhov | 2018-11-27 | 1 | -2/+5 |
| [Minor] Fix indefinite loop in language detector | Vsevolod Stakhov | 2018-11-26 | 1 | -5/+4 |
| [Project] Finish basic tasks in new unicode project | Vsevolod Stakhov | 2018-11-25 | 1 | -5/+5 |
| [Project] Rework language detector to work with ucs32 | Vsevolod Stakhov | 2018-11-25 | 1 | -30/+37 |
| [Project] Various unicode fixes in language detector | Vsevolod Stakhov | 2018-11-25 | 1 | -40/+18 |
| [Project] Rework stemming | Vsevolod Stakhov | 2018-11-24 | 1 | -16/+17 |
| [Project] Add function to normalize unicode on per words basis | Vsevolod Stakhov | 2018-11-24 | 1 | -1/+1 |
| [Fix] Properly escape utf8 regexps in hyperscan mode | Vsevolod Stakhov | 2018-11-20 | 1 | -2/+3 |
| [Minor] Reduce startup noise | Vsevolod Stakhov | 2018-11-19 | 1 | -1/+1 |
| [Feature] Store stop words and allow to query them | Vsevolod Stakhov | 2018-11-15 | 1 | -1/+76 |
| [Minor] Fix format strings | Vsevolod Stakhov | 2018-10-27 | 1 | -1/+1 |
| [Fix] Fix boundaries detection and rework stop words algorithm | Vsevolod Stakhov | 2018-10-06 | 1 | -16/+39 |
| [Fix] Plug memory leak in language detector (affects reloads) | Vsevolod Stakhov | 2018-09-28 | 1 | -0/+4 |
| [Minor] Reduce severity of warnings | Vsevolod Stakhov | 2018-09-13 | 1 | -2/+2 |
| [Minor] Initialise candidates even in shortage of words case | Vsevolod Stakhov | 2018-09-09 | 1 | -0/+1 |
| [Minor] Do not apply ngramms detection for short texts | Vsevolod Stakhov | 2018-09-08 | 1 | -47/+56 |
| [Fix] Fix various corner cases for language detection | Vsevolod Stakhov | 2018-09-08 | 1 | -24/+30 |
| [Minor] Do not use too recent additions to libicu | Vsevolod Stakhov | 2018-09-07 | 1 | -7/+0 |
| [Fix] Fix stop words detection and loading logic | Vsevolod Stakhov | 2018-09-07 | 1 | -5/+17 |
| [Feature] Add preliminary stop words detection support | Vsevolod Stakhov | 2018-09-07 | 1 | -3/+198 |
| [Rework] Rework language detector | Vsevolod Stakhov | 2018-09-07 | 1 | -412/+472 |
| [Rework] Rework utf content processing in text parts | Vsevolod Stakhov | 2018-09-05 | 1 | -1/+1 |
| [Fix] Free language detector structures | Vsevolod Stakhov | 2018-06-14 | 1 | -0/+48 |
| [Feature] Further optimization of the lang_detection | Vsevolod Stakhov | 2018-04-17 | 1 | -94/+125 |
| [Feature] Further improvements of language detector by using khash | Vsevolod Stakhov | 2018-04-17 | 1 | -43/+60 |
| [Minor] Improve performance of language detector | Vsevolod Stakhov | 2018-04-17 | 1 | -0/+4 |
| [Fix] Fix some typos | Andrew Lewis | 2018-03-18 | 1 | -2/+2 |