utils/html: Remove the hard-coded list of allowed elements and attributes

These changes provide full control over the management of "allowed-elements" and "allowed-attributes" through the configuration file.

Fixes #751
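
To make the behaviour change concrete, here is a minimal sketch of the old and new semantics for `allowed-elements`. The helper names `allowed_before` and `allowed_after` are hypothetical and do not exist in Isso; only the element list is taken from the code removed in this commit.

```python
# The list that was hard-coded in isso/utils/html.py before this commit.
OLD_BUILTIN_ELEMENTS = [
    "a", "p", "hr", "br", "ol", "ul", "li",
    "pre", "code", "blockquote",
    "del", "ins", "strong", "em",
    "h1", "h2", "h3", "h4", "h5", "h6", "sub", "sup",
    "table", "thead", "tbody", "th", "td",
]


def allowed_before(configured):
    """Old behaviour (hypothetical helper): configured elements extend the built-in list."""
    return OLD_BUILTIN_ELEMENTS + list(configured)


def allowed_after(configured):
    """New behaviour (hypothetical helper): the configured list is the complete allow-list."""
    return list(configured)


configured = ["a", "blockquote"]
print(allowed_before(configured))  # still contains "code", "br", ...
print(allowed_after(configured))   # contains only what was configured
```

With the old semantics a short list could never restrict the output; with the new semantics it does, which is why the change is listed under Breaking Changes.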
pkvach committed Apr 16, 2024
1 parent eb35b17 commit fd92ccb
Showing 6 changed files with 31 additions and 50 deletions.
8 changes: 8 additions & 0 deletions CHANGES.rst
@@ -16,6 +16,11 @@ New Features
 Breaking Changes
 ^^^^^^^^^^^^^^^^

+- Provide full control of allowed-elements and allowed-attributes via the configuration
+  file (`#1007`_, pkvach)
+  Now the configuration options ``allowed-elements`` and ``allowed-attributes`` are not additional.
+  That is, they specify the full list of allowed elements and attributes.
+
 - TBD

 Bugfixes & Improvements
@@ -27,13 +32,16 @@ Bugfixes & Improvements
 - Prevent auto creation of invalid links in comments (`#995`_, pkvach)
 - Fix W3C Validation issues (`#999`_, pkvach)
 - Handle deleted comments in Disqus migration (`#994`_, pkvach)
+- Provide full control of allowed-elements and allowed-attributes via the configuration
+  file (`#1007`_, pkvach)

 .. _#951: https://github.com/posativ/isso/pull/951
 .. _#967: https://github.com/posativ/isso/pull/967
 .. _#983: https://github.com/posativ/isso/pull/983
 .. _#995: https://github.com/isso-comments/isso/pull/995
 .. _#999: https://github.com/isso-comments/isso/pull/999
 .. _#994: https://github.com/isso-comments/isso/pull/994
+.. _#1007: https://github.com/isso-comments/isso/pull/1007

 0.13.1.dev0 (2023-02-05)
 ------------------------
5 changes: 3 additions & 2 deletions contrib/isso-dev.cfg
@@ -37,8 +37,9 @@ reply-to-self = true
 [markup]
 options = autolink, fenced-code, no-intra-emphasis, strikethrough, superscript
 flags =
-allowed-elements =
-allowed-attributes =
+allowed-elements = a, p, hr, br, ol, ul, li, pre, code, blockquote, del, ins,
+                   strong, em, h1, h2, h3, h4, h5, h6, sub, sup, table, thead, tbody, th, td
+allowed-attributes = align, href

 [hash]
 salt = Eech7co8Ohloopo9Ol6baimi
26 changes: 4 additions & 22 deletions docs/docs/reference/server-config.rst
@@ -428,37 +428,19 @@ flags
     .. versionadded:: 0.12.4

 allowed-elements
-    **Additional** HTML tags to allow in the generated output, comma-separated.
-
-    By default, only ``a``, ``blockquote``, ``br``, ``code``, ``del``, ``em``,
-    ``h1``, ``h2``, ``h3``, ``h4``, ``h5``, ``h6``, ``hr``, ``ins``, ``li``,
-    ``ol``, ``p``, ``pre``, ``strong``, ``table``, ``tbody``, ``td``, ``th``,
-    ``thead`` and ``ul`` are allowed.
+    HTML tags to allow in the generated output, comma-separated.

     For a more detailed explanation, see :doc:`/docs/reference/markdown-config`.

-    .. warning::
-
-       This option (together with ``allowed-attributes``) is frequently
-       misunderstood. Setting e.g. this list to only ``a, blockquote`` will
-       mean that ``br, code, del, ...`` and all other default allowed tags are
-       still allowed. You can only add *additional* elements here.
-
-       It is planned to change this behavior, see
-       `this issue <https://github.com/isso-comments/isso/issues/751>`_.
-
-    Default: (empty)
+    Default: ``a, p, hr, br, ol, ul, li, pre, code, blockquote, del, ins, strong, em, h1, h2, h3, h4, h5, h6, sub, sup, table, thead, tbody, th, td``

 allowed-attributes
-    **Additional** HTML attributes (independent from elements) to allow in the
+    HTML attributes (independent from elements) to allow in the
     generated output, comma-separated.

-    By default, only ``align`` and ``href`` are allowed (same caveats as for
-    ``allowed-elements`` above apply)
-
     For a more detailed explanation, see :doc:`/docs/reference/markdown-config`.

-    Default: (empty)
+    Default: ``align, href``

 .. note:: To allow images in comments, you need to add
    ``allowed-elements = img`` and *also* ``allowed-attributes = src``.
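
Because both options now hold the complete allow-list, enabling images presumably means appending `img` and `src` to the full default lists rather than setting them as the only values. The sketch below illustrates this under stated assumptions: it uses the standard-library `configparser` instead of Isso's own config loader, and `get_list` is a hypothetical helper, not an Isso function.

```python
import configparser

# Hypothetical user config: the defaults from this commit, with img/src appended.
CONFIG_TEXT = """
[markup]
allowed-elements = a, p, hr, br, ol, ul, li, pre, code, blockquote, del, ins,
    strong, em, h1, h2, h3, h4, h5, h6, sub, sup, table, thead, tbody, th, td,
    img
allowed-attributes = align, href, src
"""


def get_list(parser, section, option):
    """Split a comma-separated (possibly multi-line) option into clean items."""
    raw = parser.get(section, option)
    return [item.strip() for item in raw.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

elements = get_list(parser, "markup", "allowed-elements")
attributes = get_list(parser, "markup", "allowed-attributes")
print(elements)
print(attributes)
```

Note that the continuation lines of a multi-line value must be indented, otherwise an INI parser treats them as new keys.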
16 changes: 7 additions & 9 deletions isso/isso.cfg
@@ -211,15 +211,13 @@ options = autolink, fenced-code, no-intra-emphasis, strikethrough, superscript
 # Per Misaka's defaults, no flags are set.
 flags =

-# Additional HTML tags to allow in the generated output, comma-separated. By
-# default, only a, blockquote, br, code, del, em, h1, h2, h3, h4, h5, h6, hr,
-# ins, li, ol, p, pre, strong, table, tbody, td, th, thead and ul are allowed.
-allowed-elements =
-
-# Additional HTML attributes (independent from elements) to allow in the
-# generated output, comma-separated. By default, only align and href are
-# allowed.
-allowed-attributes =
+# HTML tags to allow in the generated output, comma-separated.
+allowed-elements = a, p, hr, br, ol, ul, li, pre, code, blockquote, del, ins,
+                   strong, em, h1, h2, h3, h4, h5, h6, sub, sup, table, thead, tbody, th, td
+
+# HTML attributes (independent from elements) to allow in the generated output,
+# comma-separated.
+allowed-attributes = align, href


 [hash]
8 changes: 4 additions & 4 deletions isso/tests/test_html.py
@@ -60,7 +60,7 @@ def test_github_flavoured_markdown(self):
 </code></pre>""")

     def test_sanitizer(self):
-        sanitizer = html.Sanitizer(elements=[], attributes=[])
+        sanitizer = html.Sanitizer(elements=["p", "a", "code"], attributes=["href"])
         examples = [
             ('Look: <img src="..." />', 'Look: '),
             ('<a href="http://example.org/">Ha</a>',
@@ -94,8 +94,8 @@ def test_render(self):
             "markup": {
                 "options": "autolink",
                 "flags": "",
-                "allowed-elements": "",
-                "allowed-attributes": ""
+                "allowed-elements": "a, p",
+                "allowed-attributes": "href"
             }
         })
         renderer = html.Markup(conf.section("markup")).render
@@ -109,7 +109,7 @@ def test_sanitized_render_extensions(self):
             "markup": {
                 "options": "no_intra_emphasis", # Deliberately snake_case
                 "flags": "",
-                "allowed-elements": "",
+                "allowed-elements": "p",
                 "allowed-attributes": ""
             }
         })
18 changes: 5 additions & 13 deletions isso/utils/html.py
@@ -17,25 +17,17 @@ def allow_attribute_class(tag, name, value):
         return name == "class" and bool(Sanitizer.code_language_pattern.match(value))

     def __init__(self, elements, attributes):
-        # attributes found in Sundown's HTML serializer [1]
-        # - except for <img> tag, because images are not generated anyways.
-        # - sub and sup added
-        #
-        # [1] https://github.com/vmg/sundown/blob/master/html/html.c
-        self.elements = ["a", "p", "hr", "br", "ol", "ul", "li",
-                         "pre", "code", "blockquote",
-                         "del", "ins", "strong", "em",
-                         "h1", "h2", "h3", "h4", "h5", "h6", "sub", "sup",
-                         "table", "thead", "tbody", "th", "td"] + elements
+        self.elements = elements

         # allowed attributes for tags
         self.attributes = {
-            "table": ["align"],
-            "a": ["href"],
-            "code": Sanitizer.allow_attribute_class,
             "*": attributes
         }

+        # If "code" elements are allowed, allow "language-*" CSS classes for syntax highlighting
+        if "code" in self.elements:
+            self.attributes["code"] = Sanitizer.allow_attribute_class
+
     def sanitize(self, text):
         clean_html = bleach.clean(text, tags=self.elements, attributes=self.attributes, strip=True)
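
Going by this hunk's counts (5 additions, 13 deletions), the constructor after the commit appears to keep only the wildcard `"*"` attribute entry and to register the `class` filter for `code` only when `code` is itself allowed. The following is a minimal sketch of that logic without the `bleach` dependency. `build_policy` is a hypothetical helper, and `CODE_LANGUAGE_PATTERN` is an assumed stand-in for `Sanitizer.code_language_pattern`, whose real definition is not shown in this diff.

```python
import re

# Assumed stand-in for Sanitizer.code_language_pattern (not shown in the diff).
CODE_LANGUAGE_PATTERN = re.compile(r"^language-[a-zA-Z0-9]{1,20}$")


def allow_attribute_class(tag, name, value):
    """Accept only class="language-..." values, as used for syntax highlighting."""
    return name == "class" and bool(CODE_LANGUAGE_PATTERN.match(value))


def build_policy(elements, attributes):
    """Hypothetical helper mirroring the new constructor logic.

    Returns (tags, attrs): the element allow-list and the per-tag attribute
    mapping, in the shape the commit passes to bleach.clean().
    """
    attrs = {"*": list(attributes)}
    # Only register the class filter when <code> itself is allowed.
    if "code" in elements:
        attrs["code"] = allow_attribute_class
    return list(elements), attrs


tags, attrs = build_policy(["p", "a", "code"], ["href"])
print(tags)
print(sorted(attrs))
```

The design choice visible here is that nothing is allowed implicitly any more: `align` and `href` are permitted only because the shipped configuration lists them.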
