source.dussan.org Git - rspamd.git/commit

author	Vsevolod Stakhov <vsevolod@highsecure.ru>
	Mon, 23 Mar 2020 14:50:24 +0000 (14:50 +0000)
committer	Vsevolod Stakhov <vsevolod@highsecure.ru>
	Mon, 23 Mar 2020 14:50:24 +0000 (14:50 +0000)
commit	f605d670505baad46b8ef4cdfa3dc32f48d4150e
tree	04f959e423960adb60c99dc0c35cb3a10e5d8595
parent	7299efb5eeddde80511be8a6285d94c333fc8ea3

[Rework] URL: Another update for urls extraction logic

URL extraction from HTML parts should look like this:
1. Extract href links
2. Convert HTML to plain text and extract:
   a) (http|https|ftp)://foo.bar and www.foo
   b) email-like strings: \bfoo@bar.baz\b
For every extracted string, check whether its host has a domain from the public suffix list.
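
The steps above can be sketched roughly as follows. This is an illustrative Python outline only, not the C code in src/libserver/url.c: the names `extract_urls`, `host_of`, and `has_public_suffix`, the two regular expressions, and the tiny `PUBLIC_SUFFIXES` set are all assumptions made for the example, and a real filter would consult the full public suffix list.

```python
import re
from html.parser import HTMLParser

# Assumption: a tiny stand-in for the real public suffix list.
PUBLIC_SUFFIXES = {"com", "org", "net", "ru", "co.uk"}

# Step 2a: scheme-prefixed URLs and bare "www." hosts in plain text.
URL_RE = re.compile(r"\b(?:(?:https?|ftp)://|www\.)[^\s<>\"']+", re.IGNORECASE)
# Step 2b: email-like strings delimited by word boundaries.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


class _Collector(HTMLParser):
    """Collects href attributes and the text content of an HTML part."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

    def handle_data(self, data):
        self.text.append(data)


def host_of(candidate):
    """Return the lowercase host part of a URL or email-like string."""
    s = candidate.strip()
    if s.lower().startswith("mailto:"):
        s = s[len("mailto:"):]
    elif "://" in s:
        s = s.split("://", 1)[1]
    s = re.split(r"[/?#]", s, maxsplit=1)[0]  # drop path, query, fragment
    s = s.rsplit("@", 1)[-1]                  # drop userinfo or mailbox name
    s = s.split(":", 1)[0]                    # drop port
    return s.lower().rstrip(".")


def has_public_suffix(host):
    """True if the host is a known public suffix with at least one label before it."""
    labels = host.split(".")
    return any(".".join(labels[i:]) in PUBLIC_SUFFIXES
               for i in range(1, len(labels)))


def extract_urls(html):
    parser = _Collector()
    parser.feed(html)
    parser.close()
    candidates = list(parser.hrefs)                 # step 1: href links
    text = " ".join(parser.text)                    # step 2: HTML to plain text
    candidates += [m.rstrip(".,;:!?)") for m in URL_RE.findall(text)]
    candidates += EMAIL_RE.findall(text)
    # Final check: keep only strings whose host ends in a public suffix.
    return [c for c in candidates if has_public_suffix(host_of(c))]
```

With this sketch, a bare string such as www.foo is matched by the text pattern but dropped by the final check, because "foo" is not in the assumed suffix set.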

src/libmime/message.c
src/libserver/url.c

Rapid spam filtering system: https://github.com/rspamd/rspamd
