path: root/src/libmime/lang_detection.c
Commit message | Author | Age | Files | Lines
* [Minor] Fix multipattern usage | Vsevolod Stakhov | 2020-08-04 | 1 | -8/+6
* [Rework] Refactor libraries structure | Vsevolod Stakhov | 2020-02-10 | 1 | -1/+1
* [Minor] Add some more heuristics for stop words detection | Vsevolod Stakhov | 2020-02-08 | 1 | -1/+39
* [Minor] Oops, fix format string | Vsevolod Stakhov | 2020-02-07 | 1 | -1/+1
* [Minor] Further fixes in stop words detection | Vsevolod Stakhov | 2020-02-07 | 1 | -14/+15
* [Fix] Ignore diacritics in chartable module for specific languages | Vsevolod Stakhov | 2020-02-04 | 1 | -1/+1
* [Minor] Add diacritics flag for language detector | Vsevolod Stakhov | 2020-02-04 | 1 | -9/+35
* [Minor] Langdet: Add threshold for stop words | Vsevolod Stakhov | 2019-08-02 | 1 | -0/+6
* [Minor] Langdet: Exclude exceptions (e.g. urls) | Vsevolod Stakhov | 2019-08-02 | 1 | -0/+1
* [Minor] Show stop words found | Vsevolod Stakhov | 2019-08-02 | 1 | -1/+10
* [Feature] Langdet: Limit number of stop words to be checked | Vsevolod Stakhov | 2019-07-25 | 1 | -0/+5
* [Minor] Another try to plug a leak | Vsevolod Stakhov | 2019-06-26 | 1 | -0/+4
* [Minor] Plug more leaks | Vsevolod Stakhov | 2019-06-26 | 1 | -7/+2
* [Minor] Langdet: Improve debugging slightly | Vsevolod Stakhov | 2019-06-05 | 1 | -0/+3
* [CritFix] Langdet: Fix language detection where no stop words found | Vsevolod Stakhov | 2019-06-05 | 1 | -3/+20
* [Minor] Langdet: Increase cut-off limit | Vsevolod Stakhov | 2019-06-05 | 1 | -1/+1
* [Fix] Lang_det: Try better to distinguish Chinese and Japanese | Vsevolod Stakhov | 2019-06-05 | 1 | -13/+47
* [Fix] Fix memory leak in language detector during reloads | Vsevolod Stakhov | 2019-05-03 | 1 | -0/+2
* [Minor] Fix leak | Vsevolod Stakhov | 2019-02-26 | 1 | -1/+4
* [Minor] Fix loading of unicode multipatterns | Vsevolod Stakhov | 2019-02-14 | 1 | -0/+14
* [Rework] Slashing: Distinguish lualibdir, pluginsdir and sharedir | Vsevolod Stakhov | 2018-12-26 | 1 | -1/+1
* [Minor] Count words based on text words | Vsevolod Stakhov | 2018-11-30 | 1 | -3/+3
* [Fix] Fix double free | Vsevolod Stakhov | 2018-11-29 | 1 | -4/+0
* [Minor] Another fail-safety check | Vsevolod Stakhov | 2018-11-27 | 1 | -2/+5
* [Minor] Fix indefinite loop in language detector | Vsevolod Stakhov | 2018-11-26 | 1 | -5/+4
* [Project] Finish basic tasks in new unicode project | Vsevolod Stakhov | 2018-11-25 | 1 | -5/+5
* [Project] Rework language detector to work with ucs32 | Vsevolod Stakhov | 2018-11-25 | 1 | -30/+37
* [Project] Various unicode fixes in language detector | Vsevolod Stakhov | 2018-11-25 | 1 | -40/+18
* [Project] Rework stemming | Vsevolod Stakhov | 2018-11-24 | 1 | -16/+17
* [Project] Add function to normalize unicode on per words basis | Vsevolod Stakhov | 2018-11-24 | 1 | -1/+1
* [Fix] Properly escape utf8 regexps in hyperscan mode | Vsevolod Stakhov | 2018-11-20 | 1 | -2/+3
* [Minor] Reduce startup noise | Vsevolod Stakhov | 2018-11-19 | 1 | -1/+1
* [Feature] Store stop words and allow to query them | Vsevolod Stakhov | 2018-11-15 | 1 | -1/+76
* [Minor] Fix format strings | Vsevolod Stakhov | 2018-10-27 | 1 | -1/+1
* [Fix] Fix boundaries detection and rework stop words algorithm | Vsevolod Stakhov | 2018-10-06 | 1 | -16/+39
* [Fix] Plug memory leak in language detector (affects reloads) | Vsevolod Stakhov | 2018-09-28 | 1 | -0/+4
* [Minor] Reduce severity of warnings | Vsevolod Stakhov | 2018-09-13 | 1 | -2/+2
* [Minor] Initialise candidates even in shortage of words case | Vsevolod Stakhov | 2018-09-09 | 1 | -0/+1
* [Minor] Do not apply ngramms detection for short texts | Vsevolod Stakhov | 2018-09-08 | 1 | -47/+56
* [Fix] Fix various corner cases for language detection | Vsevolod Stakhov | 2018-09-08 | 1 | -24/+30
* [Minor] Do not use too recent additions to libicu | Vsevolod Stakhov | 2018-09-07 | 1 | -7/+0
* [Fix] Fix stop words detection and loading logic | Vsevolod Stakhov | 2018-09-07 | 1 | -5/+17
* [Feature] Add preliminary stop words detection support | Vsevolod Stakhov | 2018-09-07 | 1 | -3/+198
* [Rework] Rework language detector | Vsevolod Stakhov | 2018-09-07 | 1 | -412/+472
* [Rework] Rework utf content processing in text parts | Vsevolod Stakhov | 2018-09-05 | 1 | -1/+1
* [Fix] Free language detector structures | Vsevolod Stakhov | 2018-06-14 | 1 | -0/+48
* [Feature] Further optimization of the lang_detection | Vsevolod Stakhov | 2018-04-17 | 1 | -94/+125
* [Feature] Further improvements of language detector by using khash | Vsevolod Stakhov | 2018-04-17 | 1 | -43/+60
* [Minor] Improve performance of language detector | Vsevolod Stakhov | 2018-04-17 | 1 | -0/+4
* [Fix] Fix some typos | Andrew Lewis | 2018-03-18 | 1 | -2/+2