path: root/src/libstat/tokenizers/tokenizers.h
Commit message | Author | Age | Files | Lines
* [Rework] Further types conversion (no functional changes) | Vsevolod Stakhov | 2024-03-18 | 1 | -16/+16
* [Rework] Remove some of the GLib types in lieu of standard ones | Vsevolod Stakhov | 2024-03-18 | 1 | -1/+1
* [Fix] Fix format string and some length issues | Vsevolod Stakhov | 2023-09-26 | 1 | -1/+17
* [Rework] Use clang-format to unify formatting in all sources | Vsevolod Stakhov | 2023-07-26 | 1 | -34/+34
* [Minor] Add safety check when using icu ubrk iterators | Vsevolod Stakhov | 2019-10-24 | 1 | -1/+2
* [Rework] Add C++ guards to all headers | Vsevolod Stakhov | 2019-07-08 | 1 | -16/+27
* [Project] Use more generalised API to produce meta words | Vsevolod Stakhov | 2018-11-26 | 1 | -2/+3
* [Project] Another try to normalize unicode properly | Vsevolod Stakhov | 2018-11-25 | 1 | -0/+1
* [Project] Rework stemming | Vsevolod Stakhov | 2018-11-24 | 1 | -4/+5
* [Project] Add function to normalize unicode on per words basis | Vsevolod Stakhov | 2018-11-24 | 1 | -0/+4
* [Feature] Skip stop words in statistics | Vsevolod Stakhov | 2018-11-15 | 1 | -11/+11
* [Minor] Move subject tokenisation to a separate routine | Vsevolod Stakhov | 2018-11-08 | 1 | -0/+2
* [Feature] Implement new text tokenizer based on libicu | Vsevolod Stakhov | 2018-09-06 | 1 | -0/+3
* [Rework] Rework utf content processing in text parts | Vsevolod Stakhov | 2018-09-05 | 1 | -1/+1
* [Project] Start unicode rework | Vsevolod Stakhov | 2018-08-23 | 1 | -3/+11
* [Rework] Use a special structure for stats tokens | Vsevolod Stakhov | 2017-02-14 | 1 | -1/+1
* Fix tokenization | Vsevolod Stakhov | 2016-01-05 | 1 | -25/+10
* Implement words decaying for text parts. | Vsevolod Stakhov | 2015-11-12 | 1 | -2/+2
* Fix statistics. | Vsevolod Stakhov | 2015-10-06 | 1 | -1/+1
* Rename main.h and main.c to `rspamd.X` | Vsevolod Stakhov | 2015-09-22 | 1 | -1/+1
* Fix tokenizers and mmapped file. | Vsevolod Stakhov | 2015-07-27 | 1 | -4/+8
* Fix stat processing. | Vsevolod Stakhov | 2015-07-27 | 1 | -0/+4
* More changes to tokenization. | Vsevolod Stakhov | 2015-07-27 | 1 | -2/+4
* Start tokenizers rework. | Vsevolod Stakhov | 2015-07-27 | 1 | -4/+8
* Allow adding of prefix for tokenizers. | Vsevolod Stakhov | 2015-07-26 | 1 | -2/+4
* Implement skipping of signatures in text messages. | Vsevolod Stakhov | 2015-07-14 | 1 | -1/+2
* Add new UTF8 tokenizer. | Vsevolod Stakhov | 2015-04-01 | 1 | -1/+1
* Add compatibility layer for tokenization. | Vsevolod Stakhov | 2015-04-01 | 1 | -2/+12
* Save classifier configuration inside statfile config. | Vsevolod Stakhov | 2015-04-01 | 1 | -3/+0
* Rework tokenization: | Vsevolod Stakhov | 2015-02-23 | 1 | -1/+1
* Allow configurable tokenizers. | Vsevolod Stakhov | 2015-02-22 | 1 | -2/+2
* Rework tokenization invocation. | Vsevolod Stakhov | 2015-01-23 | 1 | -3/+0
* Rework types for tokenizers functions. | Vsevolod Stakhov | 2015-01-23 | 1 | -12/+9
* Reorganize libstat API. | Vsevolod Stakhov | 2015-01-23 | 1 | -0/+49