author | Alexander Scheel <alexander.m.scheel@gmail.com> | 2020-04-29 07:34:59 -0400 |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-04-29 12:34:59 +0100 |
commit | 1bf9e44bda5c8cd1fd72622cffce8ec291db79c5 (patch) | |
tree | 7baebecfcb0367f41306cd37945053bf7519226d | |
parent | 6b6f20b6d43b6263320ee872799373f33a751304 (diff) | |
download | gitea-1bf9e44bda5c8cd1fd72622cffce8ec291db79c5.tar.gz gitea-1bf9e44bda5c8cd1fd72622cffce8ec291db79c5.zip |
Fix sanitizer config - multiple rules (#11133)
In #9888, it was reported that my earlier pull request #9075 didn't quite function as expected. I was quite hopeful that `ValueWithShadows()` worked as expected (and I thought my testing showed it did), but I guess not. @zeripath proposed an alternative syntax, which I like:
```ini
[markup.sanitizer.1]
ELEMENT=a
ALLOW_ATTR=target
REGEXP=something
[markup.sanitizer.2]
ELEMENT=a
ALLOW_ATTR=target
REGEXP=something
```
This was quite easy to adopt into the existing code. I've done so in a semi-backwards-compatible manner:
- The value from `.Value()` is used for each element.
- We parse `[markup.sanitizer]` and all `[markup.sanitizer.*]` sections and add them as rules.
This means that existing configs will load one rule (not all rules). It also means people can use string identifiers (`[markup.sanitizer.KaTeX]`) if they prefer, instead of numbered ones.
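As a rough illustration of the section matching described above, the following standalone sketch applies the same condition that this commit adds to `newMarkup()`. The helper name `isSanitizerSection` and the list of section names are hypothetical and exist only for this example; they are not part of Gitea.

```go
package main

import (
	"fmt"
	"strings"
)

// isSanitizerSection reports whether a [markup.*] subsection name (with the
// "markup." prefix already trimmed) would be read as a sanitizer rule.
// Hypothetical helper: it only mirrors the condition added in this commit.
func isSanitizerSection(name string) bool {
	return name == "sanitizer" || strings.HasPrefix(name, "sanitizer.")
}

func main() {
	// Hypothetical section names, as they might appear in app.ini.
	sections := []string{
		"markup.sanitizer",
		"markup.sanitizer.1",
		"markup.sanitizer.KaTeX",
		"markup.sanitiser.KaTeX", // British spelling: does not match the prefix
		"markup.asciidoc",
	}
	for _, sec := range sections {
		name := strings.TrimPrefix(sec, "markup.")
		fmt.Printf("%-24s sanitizer rule: %v\n", sec, isSanitizerSection(name))
	}
}
```

Because the check is a plain prefix match on `sanitizer.`, any suffix works as an identifier, while a section spelled `sanitiser` would fall through to the renderer branch.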
Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
-rw-r--r-- | custom/conf/app.ini.sample | 6 |
-rw-r--r-- | docs/content/doc/advanced/config-cheat-sheet.en-us.md | 4 |
-rw-r--r-- | docs/content/doc/advanced/external-renderers.en-us.md | 9 |
-rw-r--r-- | modules/setting/markup.go | 58 |
4 files changed, 38 insertions, 39 deletions
````diff
diff --git a/custom/conf/app.ini.sample b/custom/conf/app.ini.sample
index 646274c766..8900a58342 100644
--- a/custom/conf/app.ini.sample
+++ b/custom/conf/app.ini.sample
@@ -976,8 +976,10 @@ SHOW_FOOTER_VERSION = true
 ; Show template execution time in the footer
 SHOW_FOOTER_TEMPLATE_LOAD_TIME = true
 
-[markup.sanitizer]
-; The following keys can be used multiple times to define sanitation policy rules.
+[markup.sanitizer.1]
+; The following keys can appear once to define a sanitation policy rule.
+; This section can appear multiple times by adding a unique alphanumeric suffix to define multiple rules.
+; e.g., [markup.sanitizer.1] -> [markup.sanitizer.2] -> [markup.sanitizer.TeX]
 ;ELEMENT = span
 ;ALLOW_ATTR = class
 ;REGEXP = ^(info|warning|error)$
diff --git a/docs/content/doc/advanced/config-cheat-sheet.en-us.md b/docs/content/doc/advanced/config-cheat-sheet.en-us.md
index 3f0eca308a..000b65f5a1 100644
--- a/docs/content/doc/advanced/config-cheat-sheet.en-us.md
+++ b/docs/content/doc/advanced/config-cheat-sheet.en-us.md
@@ -658,7 +658,7 @@ Two special environment variables are passed to the render command:
 Gitea supports customizing the sanitization policy for rendered HTML. The example below will support KaTeX output from pandoc.
 
 ```ini
-[markup.sanitizer]
+[markup.sanitizer.TeX]
 ; Pandoc renders TeX segments as <span>s with the "math" class, optionally
 ; with "inline" or "display" classes depending on context.
 ELEMENT = span
@@ -670,7 +670,7 @@ REGEXP = ^\s*((math(\s+|$)|inline(\s+|$)|display(\s+|$)))+
 - `ALLOW_ATTR`: The attribute this policy allows. Must be non-empty.
 - `REGEXP`: A regex to match the contents of the attribute against. Must be present but may be empty for unconditional whitelisting of this attribute.
 
-You may redefine `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` multiple times; each time all three are defined is a single policy entry.
+Multiple sanitisation rules can be defined by adding unique subsections, e.g. `[markup.sanitizer.TeX-2]`.
 
 ## Time (`time`)
 
diff --git a/docs/content/doc/advanced/external-renderers.en-us.md b/docs/content/doc/advanced/external-renderers.en-us.md
index 4851fabc75..db5baf6060 100644
--- a/docs/content/doc/advanced/external-renderers.en-us.md
+++ b/docs/content/doc/advanced/external-renderers.en-us.md
@@ -73,7 +73,7 @@ IS_INPUT_FILE = false
 If your external markup relies on additional classes and attributes on the generated HTML elements, you might need to enable custom sanitizer policies. Gitea uses the [`bluemonday`](https://godoc.org/github.com/microcosm-cc/bluemonday) package as our HTML sanitizier. The example below will support [KaTeX](https://katex.org/) output from [`pandoc`](https://pandoc.org/).
 
 ```ini
-[markup.sanitizer]
+[markup.sanitizer.TeX]
 ; Pandoc renders TeX segments as <span>s with the "math" class, optionally
 ; with "inline" or "display" classes depending on context.
 ELEMENT = span
@@ -86,6 +86,11 @@ FILE_EXTENSIONS = .md,.markdown
 RENDER_COMMAND = pandoc -f markdown -t html --katex
 ```
 
-You may redefine `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` multiple times; each time all three are defined is a single policy entry. All three must be defined, but `REGEXP` may be blank to allow unconditional whitelisting of that attribute.
+You must define `ELEMENT`, `ALLOW_ATTR`, and `REGEXP` in each section.
+
+To define multiple entries, add a unique alphanumeric suffix (e.g., `[markup.sanitizer.1]` and `[markup.sanitizer.something]`).
 
 Once your configuration changes have been made, restart Gitea to have changes take effect.
+
+**Note**: Prior to Gitea 1.12 there was a single `markup.sanitiser` section with keys that were redefined for multiple rules, however,
+there were significant problems with this method of configuration necessitating configuration through multiple sections.
\ No newline at end of file
diff --git a/modules/setting/markup.go b/modules/setting/markup.go
index 75e6d651bd..1dd76243e6 100644
--- a/modules/setting/markup.go
+++ b/modules/setting/markup.go
@@ -44,7 +44,7 @@ func newMarkup() {
 			continue
 		}
 
-		if name == "sanitizer" {
+		if name == "sanitizer" || strings.HasPrefix(name, "sanitizer.") {
 			newMarkupSanitizer(name, sec)
 		} else {
 			newMarkupRenderer(name, sec)
@@ -67,44 +67,36 @@ func newMarkupSanitizer(name string, sec *ini.Section) {
 		return
 	}
 
-	elements := sec.Key("ELEMENT").ValueWithShadows()
-	allowAttrs := sec.Key("ALLOW_ATTR").ValueWithShadows()
-	regexps := sec.Key("REGEXP").ValueWithShadows()
+	elements := sec.Key("ELEMENT").Value()
+	allowAttrs := sec.Key("ALLOW_ATTR").Value()
+	regexpStr := sec.Key("REGEXP").Value()
 
-	if len(elements) != len(allowAttrs) ||
-		len(elements) != len(regexps) {
-		log.Error("All three keys in markup.%s (ELEMENT, ALLOW_ATTR, REGEXP) must be defined the same number of times! Got %d, %d, and %d respectively.", name, len(elements), len(allowAttrs), len(regexps))
+	if regexpStr == "" {
+		rule := MarkupSanitizerRule{
+			Element:   elements,
+			AllowAttr: allowAttrs,
+			Regexp:    nil,
+		}
+
+		ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
 		return
 	}
 
-	ExternalSanitizerRules = make([]MarkupSanitizerRule, 0, len(elements))
-
-	for index, pattern := range regexps {
-		if pattern == "" {
-			rule := MarkupSanitizerRule{
-				Element:   elements[index],
-				AllowAttr: allowAttrs[index],
-				Regexp:    nil,
-			}
-			ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
-			continue
-		}
-
-		// Validate when parsing the config that this is a valid regular
-		// expression. Then we can use regexp.MustCompile(...) later.
-		compiled, err := regexp.Compile(pattern)
-		if err != nil {
-			log.Error("In module.%s: REGEXP at definition %d failed to compile: %v", name, index+1, err)
-			continue
-		}
+	// Validate when parsing the config that this is a valid regular
+	// expression. Then we can use regexp.MustCompile(...) later.
+	compiled, err := regexp.Compile(regexpStr)
+	if err != nil {
+		log.Error("In module.%s: REGEXP (%s) at definition %d failed to compile: %v", regexpStr, name, err)
+		return
+	}
 
-		rule := MarkupSanitizerRule{
-			Element:   elements[index],
-			AllowAttr: allowAttrs[index],
-			Regexp:    compiled,
-		}
-		ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
+	rule := MarkupSanitizerRule{
+		Element:   elements,
+		AllowAttr: allowAttrs,
+		Regexp:    compiled,
 	}
+
+	ExternalSanitizerRules = append(ExternalSanitizerRules, rule)
 }
 
 func newMarkupRenderer(name string, sec *ini.Section) {
````
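The per-section logic in `newMarkupSanitizer` can be sketched outside of Gitea roughly as follows. This is a minimal standalone approximation, not the project's code: `sanitizerRule` stands in for `MarkupSanitizerRule`, and `parseRule` is a hypothetical helper that returns an error instead of logging. One detail worth noting when reading the diff: the `log.Error` call added for the compile failure has four format verbs (`%s`, `%s`, `%d`, `%v`) but receives three arguments, in the order `regexpStr, name, err`; the sketch uses a format string whose verbs match its arguments.

```go
package main

import (
	"fmt"
	"regexp"
)

// sanitizerRule is a stand-in for Gitea's MarkupSanitizerRule (assumption:
// only the three fields visible in the diff are modelled here).
type sanitizerRule struct {
	Element   string
	AllowAttr string
	Regexp    *regexp.Regexp // nil means the attribute is allowed unconditionally
}

// parseRule builds one rule from the three values of a single section.
// name is the section name and is used only in the error message.
func parseRule(name, element, allowAttr, pattern string) (sanitizerRule, error) {
	rule := sanitizerRule{Element: element, AllowAttr: allowAttr}
	if pattern == "" {
		// An empty REGEXP keeps Regexp nil, as in the diff.
		return rule, nil
	}
	// Validate the pattern while the configuration is being parsed.
	compiled, err := regexp.Compile(pattern)
	if err != nil {
		return sanitizerRule{}, fmt.Errorf("in markup.%s: REGEXP (%s) failed to compile: %w", name, pattern, err)
	}
	rule.Regexp = compiled
	return rule, nil
}

func main() {
	// Hypothetical sections: name, ELEMENT, ALLOW_ATTR, REGEXP.
	inputs := [][4]string{
		{"sanitizer.1", "span", "class", `^(info|warning|error)$`},
		{"sanitizer.2", "a", "target", ""},
		{"sanitizer.bad", "span", "class", `(`},
	}
	var rules []sanitizerRule
	for _, in := range inputs {
		rule, err := parseRule(in[0], in[1], in[2], in[3])
		if err != nil {
			fmt.Println("skipped:", err)
			continue
		}
		rules = append(rules, rule)
	}
	fmt.Println("rules loaded:", len(rules))
}
```

Each section contributes at most one rule, and a section whose pattern fails to compile is skipped without affecting the others.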