path: root/src/libmime/lang_detection.c
Commit message | Author | Date | Files | Lines
* [Rework] Change the logic of skipping symbols | Vsevolod Stakhov | 2024-09-04 | 1 | -1/+1
* [Fix] Fix another corner case that allows candidates to be freed without init | Vsevolod Stakhov | 2024-04-29 | 1 | -8/+9
* [Fix] Apply detection phase if fasttext could not detect language | Vsevolod Stakhov | 2024-04-28 | 1 | -71/+93
* [Rework] Further types conversion (no functional changes) | Vsevolod Stakhov | 2024-03-18 | 1 | -122/+122
* [Rework] Remove some of the GLib types in lieu of standard ones | Vsevolod Stakhov | 2024-03-18 | 1 | -5/+5
* [Fix] Do not save multipatterns to FS in certain cases | Vsevolod Stakhov | 2024-03-15 | 1 | -1/+1
* [Minor] Print some more stats | Vsevolod Stakhov | 2024-01-19 | 1 | -1/+5
* [Fix] Really fix the language detector statistical heuristic | Vsevolod Stakhov | 2024-01-18 | 1 | -12/+26
* [Fix] Make words selection random deterministic upon content | Vsevolod Stakhov | 2024-01-18 | 1 | -12/+19
* [Rework] More steps to do refactoring | Vsevolod Stakhov | 2023-08-16 | 1 | -4/+4
* [Rework] Use clang-format to unify formatting in all sources | Vsevolod Stakhov | 2023-07-26 | 1 | -606/+612
* [Minor] Add some more debug to the fasttext classifier | Vsevolod Stakhov | 2023-05-03 | 1 | -2/+2
* [Feature] Allow to use other methods when fasttext detection is enabled | Vsevolod Stakhov | 2023-05-02 | 1 | -1/+9
* [Fix] Feed fasttext language model with the pre-tokenized words | Vsevolod Stakhov | 2023-05-02 | 1 | -2/+1
* [Project] Some further fixes (ref: vstakhov-fasttext-langdet) | Vsevolod Stakhov | 2023-04-29 | 1 | -16/+33
* [Fix] Ignore non-unique stop words | Vsevolod Stakhov | 2023-04-29 | 1 | -8/+30
* [Project] Implement fasttext language detection | Vsevolod Stakhov | 2023-04-29 | 1 | -61/+108
* [Project] Show fasttext info | Vsevolod Stakhov | 2023-04-29 | 1 | -2/+9
* Spelling (#4086) | Josh Soref | 2022-02-22 | 1 | -32/+32
* [Minor] More divisions by zero | Vsevolod Stakhov | 2021-12-25 | 1 | -0/+4
* [Minor] Fix multipattern usage | Vsevolod Stakhov | 2020-08-04 | 1 | -8/+6
* [Rework] Refactor libraries structure | Vsevolod Stakhov | 2020-02-10 | 1 | -1/+1
* [Minor] Add some more heuristics for stop words detection | Vsevolod Stakhov | 2020-02-08 | 1 | -1/+39
* [Minor] Oops, fix format string | Vsevolod Stakhov | 2020-02-07 | 1 | -1/+1
* [Minor] Further fixes in stop words detection | Vsevolod Stakhov | 2020-02-07 | 1 | -14/+15
* [Fix] Ignore diacritics in chartable module for specific languages | Vsevolod Stakhov | 2020-02-04 | 1 | -1/+1
* [Minor] Add diacritics flag for language detector | Vsevolod Stakhov | 2020-02-04 | 1 | -9/+35
* [Minor] Langdet: Add threshold for stop words | Vsevolod Stakhov | 2019-08-02 | 1 | -0/+6
* [Minor] Langdet: Exclude exceptions (e.g. urls) | Vsevolod Stakhov | 2019-08-02 | 1 | -0/+1
* [Minor] Show stop words found | Vsevolod Stakhov | 2019-08-02 | 1 | -1/+10
* [Feature] Langdet: Limit number of stop words to be checked | Vsevolod Stakhov | 2019-07-25 | 1 | -0/+5
* [Minor] Another try to plug a leak | Vsevolod Stakhov | 2019-06-26 | 1 | -0/+4
* [Minor] Plug more leaks | Vsevolod Stakhov | 2019-06-26 | 1 | -7/+2
* [Minor] Langdet: Improve debugging slightly | Vsevolod Stakhov | 2019-06-05 | 1 | -0/+3
* [CritFix] Langdet: Fix language detection where no stop words found | Vsevolod Stakhov | 2019-06-05 | 1 | -3/+20
* [Minor] Langdet: Increase cut-off limit | Vsevolod Stakhov | 2019-06-05 | 1 | -1/+1
* [Fix] Lang_det: Try better to distinguish Chinese and Japanese | Vsevolod Stakhov | 2019-06-05 | 1 | -13/+47
* [Fix] Fix memory leak in language detector during reloads | Vsevolod Stakhov | 2019-05-03 | 1 | -0/+2
* [Minor] Fix leak | Vsevolod Stakhov | 2019-02-26 | 1 | -1/+4
* [Minor] Fix loading of unicode multipatterns | Vsevolod Stakhov | 2019-02-14 | 1 | -0/+14
* [Rework] Slashing: Distinguish lualibdir, pluginsdir and sharedir | Vsevolod Stakhov | 2018-12-26 | 1 | -1/+1
* [Minor] Count words based on text words | Vsevolod Stakhov | 2018-11-30 | 1 | -3/+3
* [Fix] Fix double free | Vsevolod Stakhov | 2018-11-29 | 1 | -4/+0
* [Minor] Another fail-safety check | Vsevolod Stakhov | 2018-11-27 | 1 | -2/+5
* [Minor] Fix indefinite loop in language detector | Vsevolod Stakhov | 2018-11-26 | 1 | -5/+4
* [Project] Finish basic tasks in new unicode project | Vsevolod Stakhov | 2018-11-25 | 1 | -5/+5
* [Project] Rework language detector to work with ucs32 | Vsevolod Stakhov | 2018-11-25 | 1 | -30/+37
* [Project] Various unicode fixes in language detector | Vsevolod Stakhov | 2018-11-25 | 1 | -40/+18
* [Project] Rework stemming | Vsevolod Stakhov | 2018-11-24 | 1 | -16/+17
* [Project] Add function to normalize unicode on per words basis | Vsevolod Stakhov | 2018-11-24 | 1 | -1/+1