nti.contentfragments.html module

Converters and utilities for dealing with HTML content fragments. In particular, sanitization.

class nti.contentfragments.html.FakeRe

Bases: object

match(regex, val)

nti.contentfragments.html.may_contain_html_like_markup(*args, **kwargs)

nti.contentfragments.html.sanitize_user_html(user_input, method='html')

Given a user input string of plain text, HTML, or an HTML fragment, sanitize it by removing unsupported or dangerous elements and normalizing the result. If the result can be represented as plain text, it is returned as plain text.

Parameters: method (string) – One of the method values accepted by lxml.etree.tostring(). The default value, html, causes this function to produce either HTML or plain text, whichever is more appropriate. Passing the value text causes it to produce only plain text, captured by traversing the elements with lxml. Note: this is legacy functionality; callers should generally convert by calling the interfaces instead.
Returns: Something that implements frg_interfaces.IUnicodeContentFragment, typically either frg_interfaces.IPlainTextContentFragment or frg_interfaces.ISanitizedHTMLContentFragment.
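
The behaviour described above (drop dangerous elements, keep a small set of safe ones, and fall back to plain text when no markup survives) can be illustrated with a standalone sketch. This is not the implementation of sanitize_user_html, which is built on lxml and returns content fragment objects. The sketch uses only the standard library, and the names sketch_sanitize, ALLOWED_TAGS and DROP_WITH_CONTENT, the tag lists, and the (kind, value) tuple return are all assumptions made for illustration.

```python
import html
from html.parser import HTMLParser

# Hypothetical allow-list and drop-list, chosen only for this illustration.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a", "ul", "ol", "li", "br"}
DROP_WITH_CONTENT = {"script", "style"}


class _SketchSanitizer(HTMLParser):
    """Collects sanitized markup and the plain text of the input."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []             # pieces of sanitized markup
        self.text = []            # text content only
        self.kept_markup = False  # True once any allowed tag is emitted
        self._skip_depth = 0      # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
        elif self._skip_depth == 0 and tag in ALLOWED_TAGS:
            # Keep the tag but discard every attribute (onclick, style, ...).
            self.out.append("<%s>" % tag)
            self.kept_markup = True

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            if self._skip_depth:
                self._skip_depth -= 1
        elif self._skip_depth == 0 and tag in ALLOWED_TAGS and tag != "br":
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.out.append(html.escape(data, quote=False))
            self.text.append(data)


def sketch_sanitize(user_input, method="html"):
    """Return ("text", str) or ("html", str); a stand-in for fragment types."""
    parser = _SketchSanitizer()
    parser.feed(user_input)
    parser.close()  # flush any buffered trailing text
    plain = "".join(parser.text)
    if method == "text" or not parser.kept_markup:
        return ("text", plain)
    return ("html", "".join(parser.out))


if __name__ == "__main__":
    print(sketch_sanitize("<script>alert(1)</script>hi"))
    print(sketch_sanitize('<p onclick="x()">a <b>b</b></p>'))
    print(sketch_sanitize('<p onclick="x()">a <b>b</b></p>', method="text"))
```

In the sketch, the "text" and "html" tags on the returned tuple play the role that IPlainTextContentFragment and ISanitizedHTMLContentFragment play in the real return value. Passing method="text" always yields the plain-text form, mirroring the legacy option documented above.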