nti.contentfragments¶
Contents:
nti.contentfragments package¶
Submodules¶
nti.contentfragments.interfaces module¶
Content-related interfaces.
-
interface
nti.contentfragments.interfaces.
IAllowedAttributeProvider
[source]¶ A way to provide a whitelist of additional attribute names that would be allowed while parsing a content fragment, thus extending the attributes already allowed.
New in version 1.4.0.
-
allowed_attributes
¶ An iterable of attribute names allowed in a particular context
Implementation: nti.schema.field.IndexedIterable
Read Only: False Required: False Default Value: () Allowed Type: _abcoll.Sequence
Value Type
The attribute name
Implementation: zope.schema.NativeStringLine
Read Only: False Required: True Default Value: None Allowed Type: str
-
-
interface
nti.contentfragments.interfaces.
ICensoredContentEvent
[source]¶ -
content_fragment
¶ The content that was censored
-
name
¶ The name of the attribute under which the censored content will be assigned.
-
context
¶ The context object to which the content will be assigned.
-
censored_content
¶ The censored content
-
-
interface
nti.contentfragments.interfaces.
ICensoredContentPolicy
[source]¶ A top-level policy puts together detection of content ranges to censor with a strategy to censor them
-
censor
(content_fragment, context)¶ Censors the content fragment appropriately and returns the censored value.
Parameters: - content_fragment – The fragment being censored.
- context – The object that this content fragment should be censored with regard to. For example, the fragment’s container or composite object that will hold the fragment.
Returns: The censored content fragment, if any censoring was done to it. May also raise a ValueError if censoring is not allowed and the content should be thrown away.
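As a rough illustration of how a policy might pair detection with a reaction, the following standalone sketch wires a scanning callable to a strategy callable. It does not use this package: `SketchPolicy`, `find_terms`, `mask`, and the word list are all hypothetical names invented for the example.

```python
import re


def find_terms(content_fragment, terms=("darn", "heck")):
    """Detection: return (start, end) ranges of unwanted terms (hypothetical list)."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [match.span() for match in pattern.finditer(content_fragment)]


def mask(content_fragment, censored_ranges):
    """Reaction: overwrite each range with asterisks of the same length."""
    chars = list(content_fragment)
    for start, end in censored_ranges:
        chars[start:end] = "*" * (end - start)
    return "".join(chars)


class SketchPolicy:
    """Hypothetical policy that pairs a scanner with a strategy."""

    def __init__(self, scanner=find_terms, strategy=mask, allow_censoring=True):
        self.scanner = scanner
        self.strategy = strategy
        self.allow_censoring = allow_censoring

    def censor(self, content_fragment, context):
        ranges = self.scanner(content_fragment)
        if not ranges:
            # Nothing detected: hand back the fragment untouched.
            return content_fragment
        if not self.allow_censoring:
            # Censoring is not allowed here, so the content is rejected.
            raise ValueError("content is not allowed in this context")
        return self.strategy(content_fragment, ranges)
```

The `context` argument is unused in this sketch; a real policy would presumably consult it to decide which scanner and strategy apply.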
-
-
interface
nti.contentfragments.interfaces.
ICensoredContentScanner
[source]¶ Something that can perform censoring.
Variations of censoring scanners will be registered as named utilities. Particular censoring solutions (the adapters discussed in ICensoredUnicodeContentFragment) will put together a combination of these utilities to produce the desired result.
The censoring process can further be broken down into two parts: detection of unwanted content, and reacting to unwanted content. For example, reacting might consist of replacing the content with asterisks in plain text, or a special span in HTML, or it might throw an exception to disallow the content altogether. This object performs the first part.
The names may be something like MPAA ratings, or they may follow other categories.
-
scan
(content_fragment)¶ Scan the given content fragment for censored terms and return their positions as a sequence (iterator) of two-tuples (start, end). The returned tuples should be non-overlapping.
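A minimal standalone sketch of the non-overlapping requirement follows. It is not this package's scanner: `scan_terms` is a hypothetical function that finds every occurrence of each term and then merges overlapping hits so the returned ranges never overlap.

```python
def scan_terms(content_fragment, terms):
    """Return sorted, non-overlapping (start, end) ranges for the given terms."""
    # Assumption: lowercasing keeps string length unchanged (true for ASCII).
    lowered = content_fragment.lower()
    found = []
    for term in terms:
        term = term.lower()
        if not term:
            continue
        start = lowered.find(term)
        while start != -1:
            found.append((start, start + len(term)))
            start = lowered.find(term, start + 1)

    # Merge ranges that overlap or touch, so each character is reported once.
    merged = []
    for start, end in sorted(found):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, scanning `"a badword and bad"` for both `"bad"` and `"badword"` reports the first word as one range rather than two overlapping ones.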
-
-
interface
nti.contentfragments.interfaces.
ICensoredContentStrategy
[source]¶ The other half of the content censoring process explained in ICensoredContentScanner, responsible for taking action on censored content.
-
censor_ranges
(content_fragment, censored_ranges)¶ Censors the content fragment appropriately and returns the censored value.
Parameters: - content_fragment – The fragment being censored.
- censored_ranges – The ranges of illicit content as produced by ICensoredContentScanner.scan(); they are not guaranteed to be in any particular order, so you may need to sort them with sorted() (in reverse).
Returns: The censored content fragment, if any censoring was done to it. May also raise a ValueError if censoring is not allowed and the content should be thrown away.
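The advice to sort in reverse matters when the replacement has a different length than the text it replaces. The sketch below is a hypothetical standalone function, not this package's strategy; it works from the end of the string backwards so that earlier offsets stay valid after each substitution.

```python
def censor_ranges(content_fragment, censored_ranges, replacement="[censored]"):
    """Replace each (start, end) range; ranges may arrive in any order."""
    result = content_fragment
    # Process the rightmost range first: replacing it cannot shift the
    # offsets of ranges that lie to its left.
    for start, end in sorted(censored_ranges, reverse=True):
        result = result[:start] + replacement + result[end:]
    return result
```

Processing the ranges left to right would require adjusting every later offset by the length difference of each replacement.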
-
-
interface
nti.contentfragments.interfaces.
ICensoredHTMLContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IHTMLContentFragment
,nti.contentfragments.interfaces.ICensoredUnicodeContentFragment
-
interface
nti.contentfragments.interfaces.
ICensoredPlainTextContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IPlainTextContentFragment
,nti.contentfragments.interfaces.ICensoredUnicodeContentFragment
-
interface
nti.contentfragments.interfaces.
ICensoredSanitizedHTMLContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.ISanitizedHTMLContentFragment
,nti.contentfragments.interfaces.ICensoredHTMLContentFragment
-
interface
nti.contentfragments.interfaces.
ICensoredTerm
[source]¶ Extends:
zope.schema.interfaces.ITokenizedTerm
Base interface for a censored term
-
interface
nti.contentfragments.interfaces.
ICensoredUnicodeContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IUnicodeContentFragment
A content fragment that has passed through a censoring process to attempt to ensure it is safe for display to its intended audience (e.g., profanity has been removed if the expected audience is underage/sensitive to that).
The rules for censoring content will be very context specific. In particular, it will depend on who you are, and where you are adding/editing content. The who is important to differentiate between, e.g., students and teachers. The where is important to differentiate between, say, a public forum, and your private notes, or between your Human Sexuality textbook and your Calculus textbook.
For this reason, the censoring process will typically utilize multi-adapters registered on (creator, content_unit). Contrast this with sanitizing HTML, which always follows the same process.
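To make the (creator, content_unit) idea concrete, here is a deliberately simplified standalone imitation of a multi-adapter lookup. It uses a plain dictionary rather than the Zope component registry, and every class and function name in it is hypothetical.

```python
class Student: pass
class Teacher: pass
class PublicForum: pass
class PrivateNotes: pass


def strict_policy(text):
    return text.replace("darn", "****")


def lenient_policy(text):
    return text


# Hypothetical registry keyed on (creator type, content unit type).
_REGISTRY = {
    (Student, PublicForum): strict_policy,
    (object, object): lenient_policy,  # fallback for any other pair
}


def lookup_policy(creator, content_unit):
    """Find the most specific policy registered for this pair of objects."""
    for creator_type in type(creator).__mro__:
        for unit_type in type(content_unit).__mro__:
            policy = _REGISTRY.get((creator_type, unit_type))
            if policy is not None:
                return policy
    raise LookupError("no censoring policy registered")
```

A student posting in a public forum gets the strict policy, while any other combination falls through to the lenient one.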
-
interface
nti.contentfragments.interfaces.
IContentFragment
[source]¶ Base interface representing different formats that content can be in.
-
interface
nti.contentfragments.interfaces.
IHTMLContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IUnicodeContentFragment
,zope.mimetype.mtypes.IContentTypeTextHtml
Interface representing content in HTML format.
-
interface
nti.contentfragments.interfaces.
IHTMLContentFragmentField
[source]¶ Extends:
nti.contentfragments.interfaces.ITextUnicodeContentFragmentField
A Text type that also requires the object implement an interface descending from IHTMLContentFragment.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
IHyperlinkFormatter
[source]¶ -
find_links
(text)¶ Given a string of text, look through it for hyperlinks and find them.
Returns: A sequence of strings and lxml.etree.Element objects representing the plain text and detected links, in order, within the given text.
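The shape of that return value can be sketched without lxml. In the hypothetical standalone function below, a `Link` named tuple stands in for the lxml.etree.Element objects, and the regular expression is a deliberately simple assumption, not the pattern this package uses.

```python
import re
from collections import namedtuple

# Stand-in for an lxml element describing an <a> tag (hypothetical).
Link = namedtuple("Link", "href text")

# Simplistic URL pattern, assumed for illustration only.
_URL = re.compile(r"https?://[^\s<>\"']+")


def find_links(text):
    """Split text into plain strings and Link objects, preserving order."""
    parts = []
    position = 0
    for match in _URL.finditer(text):
        if match.start() > position:
            parts.append(text[position:match.start()])
        url = match.group(0)
        parts.append(Link(href=url, text=url))
        position = match.end()
    if position < len(text):
        parts.append(text[position:])
    return parts
```

Joining the strings and the `text` of each link reproduces the original input, which is a handy property to check in any implementation.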
-
format
(html_fragment)¶ Process the specified IHTMLContentFragment, scanning it for any plain text links recognized by this object and converting them into new <a> elements.
-
-
interface
nti.contentfragments.interfaces.
ILatexContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IUnicodeContentFragment
,zope.mimetype.mtypes.IContentTypeTextLatex
Interface representing content in LaTeX format.
-
interface
nti.contentfragments.interfaces.
ILatexFragmentTextLineField
[source]¶ Extends:
nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField
A TextLine that requires content to be in LaTeX format.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
IPlainTextContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IUnicodeContentFragment
,zope.mimetype.mtypes.IContentTypeTextPlain
Interface representing content in plain text format.
-
interface
nti.contentfragments.interfaces.
IPlainTextField
[source]¶ Extends:
nti.contentfragments.interfaces.ITextUnicodeContentFragmentField
A zope.schema.Text that requires content to be plain text.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
IPlainTextLineField
[source]¶ Extends:
nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField
A
TextLine
that requires content to be plain text.
-
interface
nti.contentfragments.interfaces.
IProfanityTerm
[source]¶ Extends:
nti.contentfragments.interfaces.ICensoredTerm
Base interface for a profanity term
-
nti.contentfragments.interfaces.
IPunctuationCharExpression
¶ alias of
nti.contentfragments.interfaces.IPunctuationMarkExpression
-
nti.contentfragments.interfaces.
IPunctuationCharExpressionPlus
¶ alias of
nti.contentfragments.interfaces.IPunctuationMarkExpressionPlus
-
nti.contentfragments.interfaces.
IPunctuationCharPattern
¶ alias of
nti.contentfragments.interfaces.IPunctuationMarkPattern
-
nti.contentfragments.interfaces.
IPunctuationCharPatternPlus
¶ alias of
nti.contentfragments.interfaces.IPunctuationMarkPatternPlus
-
interface
nti.contentfragments.interfaces.
IPunctuationMarkExpression
[source]¶ marker interface for punctuation regular expression
-
interface
nti.contentfragments.interfaces.
IPunctuationMarkExpressionPlus
[source]¶ marker interface for punctuation + space regular expression
-
interface
nti.contentfragments.interfaces.
IPunctuationMarkPattern
[source]¶ marker interface for punctuation regular expression pattern
-
interface
nti.contentfragments.interfaces.
IPunctuationMarkPatternPlus
[source]¶ marker interface for punctuation + space regular expression pattern
-
interface
nti.contentfragments.interfaces.
IRstContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IUnicodeContentFragment
,zope.mimetype.mtypes.IContentTypeTextRst
Interface representing content in RST format.
-
interface
nti.contentfragments.interfaces.
IRstContentFragmentField
[source]¶ Extends:
nti.contentfragments.interfaces.ITextUnicodeContentFragmentField
A Text type that also requires the object implement an interface descending from IRstContentFragment.
New in version 1.6.0.
-
interface
nti.contentfragments.interfaces.
ISanitizedHTMLContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IHTMLContentFragment
HTML content, typically of unknown or untrusted provenance, that has been sanitized for “safe” presentation in a generic, also unknown browsing context. Typically this will mean that certain unsafe constructs, such as <script> tags have been removed.
-
interface
nti.contentfragments.interfaces.
ISanitizedHTMLContentFragmentField
[source]¶ Extends:
nti.contentfragments.interfaces.IHTMLContentFragmentField
A Text type that also requires the object implement an interface descending from ISanitizedHTMLContentFragment.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
ITagField
[source]¶ Extends:
nti.contentfragments.interfaces.IPlainTextLineField
Requires its content to be only one plain text word that is lowercased.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
ITextLatexEscaper
[source]¶ -
_ITextLatexEscaper__call_
(text)¶ Escape the specified text.
-
-
interface
nti.contentfragments.interfaces.
ITextLineUnicodeContentFragmentField
[source]¶ Extends:
zope.schema.interfaces.IObject
,zope.schema.interfaces.ITextLine
A zope.schema.TextLine type that also requires the object implement an interface descending from IUnicodeContentFragment.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
ITextUnicodeContentFragmentField
[source]¶ Extends:
zope.schema.interfaces.IObject
,zope.schema.interfaces.IText
A zope.schema.Text type that also requires the object implement an interface descending from IUnicodeContentFragment.
New in version 1.2.0.
-
interface
nti.contentfragments.interfaces.
IUnicodeContentFragment
[source]¶ Extends:
nti.contentfragments.interfaces.IContentFragment
,zope.interface.common.collections.ISequence
Content represented as a unicode string.
Although it is simplest to subclass unicode, that is not required. At a minimum, the __getitem__ method (and the others declared by IReadSequence) and the encode method are required.
Changed in version 1.3.0: Extend zope.interface.common.collections.ISequence instead of the semi-deprecated zope.interface.common.sequence.IReadSequence, except on PyPy2, where ISequence cannot validate against unicode objects.
-
class
nti.contentfragments.interfaces.
CensoredContentEvent
(content_fragment, censored_content, name=None, context=None)[source]¶ Bases:
object
-
class
nti.contentfragments.interfaces.
CensoredHTMLContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.HTMLContentFragment
-
censored
(n)¶
-
-
class
nti.contentfragments.interfaces.
CensoredPlainTextContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.PlainTextContentFragment
-
censored
(n)¶
-
-
class
nti.contentfragments.interfaces.
CensoredSanitizedHTMLContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.CensoredHTMLContentFragment
-
censored
(n)¶
-
-
class
nti.contentfragments.interfaces.
CensoredUnicodeContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces._AddMixin
,nti.contentfragments.interfaces.UnicodeContentFragment
-
class
nti.contentfragments.interfaces.
HTMLContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces._AddMixin
,nti.contentfragments.interfaces.UnicodeContentFragment
-
censored
(n)¶
-
-
class
nti.contentfragments.interfaces.
LatexContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.UnicodeContentFragment
-
class
nti.contentfragments.interfaces.
PlainTextContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.UnicodeContentFragment
-
censored
(n)¶
-
-
class
nti.contentfragments.interfaces.
RstContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.UnicodeContentFragment
-
class
nti.contentfragments.interfaces.
SanitizedHTMLContentFragment
[source]¶ Bases:
nti.contentfragments.interfaces.HTMLContentFragment
-
censored
(n)¶
-
-
class
nti.contentfragments.interfaces.
UnicodeContentFragment
[source]¶ Bases:
unicode
Subclasses should override the __add__() method to return objects that implement the appropriate (most derived, generally) interface.
This object DOES NOT add a dictionary to the unicode type. In particular, it should not be weak referenced. Subclasses that do not expect to be persisted in the ZODB may add additional attributes by adding to the __slots__ field (not the instance value).
-
censored
(n)¶
-
translate
(table) → unicode[source]¶ Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.
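The notes above about __add__() and __slots__ can be illustrated with a small standalone sketch. It uses Python 3's `str` where the documentation says `unicode`, and the class names `Fragment` and `HTMLFragment` are hypothetical stand-ins, not this package's classes.

```python
import weakref


class Fragment(str):
    """A string subclass with no per-instance dictionary."""
    __slots__ = ()

    def __add__(self, other):
        # Return the most derived type instead of a plain str.
        return type(self)(str.__add__(self, other))


class HTMLFragment(Fragment):
    __slots__ = ()


combined = HTMLFragment("<b>bold</b>") + " and more"

# With empty __slots__ all the way down, instances carry neither a
# __dict__ nor support for weak references.
has_dict = hasattr(combined, "__dict__")
try:
    weakref.ref(combined)
    weakly_referenceable = True
except TypeError:
    weakly_referenceable = False
```

Without the `__add__` override, concatenation would silently return a plain `str` and lose the information about what kind of content the string holds.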
-
nti.contentfragments.censor module¶
Algorithms for content censoring.
The algorithms contained in here are trivially simple. We could do much better, for example, with prefix trees. See https://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/ and http://pypi.python.org/pypi/trie/0.1.1
If efficiency really matters, and we have many different filters we are applying, we would need to do a better job of pipelining to avoid copies.
-
class
nti.contentfragments.censor.
BasicScanner
[source]¶ Bases:
object
-
class
nti.contentfragments.censor.
DefaultCensoredContentPolicy
(fragment=None, target=None)[source]¶ Bases:
object
A content censoring policy that looks up the default scanner and strategy utilities and uses them.
This package does not register this policy as an adapter for anything; you must do that yourself, on (content-fragment, target-object). It can also be registered as a utility or instantiated directly with no arguments.
-
class
nti.contentfragments.censor.
NoOpCensoredContentPolicy
(*args, **kwargs)[source]¶ Bases:
object
A content censoring policy that does no censoring whatsoever.
This package does not register this policy as an adapter for anything; you must do that yourself, on (content-fragment, target-object). It can also be registered as a utility or instantiated directly with no arguments.
-
class
nti.contentfragments.censor.
SimpleReplacementCensoredContentStrategy
(replacement_char=u'*')[source]¶ Bases:
object
-
class
nti.contentfragments.censor.
WordMatchScanner
(white_words=(), prohibited_words=())[source]¶
-
nti.contentfragments.censor.
censor_assign
(fragment, target, field_name)[source]¶ Manually perform censoring when assigning an object to a field.
-
nti.contentfragments.censor.
censor_before_assign_components_of_sequence
(sequence, target, event)[source]¶ Register this adapter for (usually any) sequence, some specific interface target, and the nti.schema.interfaces.IBeforeSequenceAssignedEvent, and it will iterate across the fields and attempt to censor each of them.
This package DOES NOT register this event.
-
nti.contentfragments.censor.
censor_before_text_assigned
(fragment, target, event)[source]¶ Watches for field values to be assigned, and looks for specific policies for the given object and field name to handle censoring. If such a policy is found and returns something that is not the original fragment, the event is updated (and so the value assigned to the target is also updated).
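The flow described above can be sketched in isolation. Everything in this block is a hypothetical stand-in: the event class only mimics the `object`, `name`, and `context` attributes of a before-assignment event, and the `POLICIES` dictionary replaces the real policy lookup.

```python
class SketchBeforeAssignedEvent:
    """Stand-in for a before-assignment event (hypothetical)."""

    def __init__(self, obj, name, context):
        self.object = obj      # the value about to be assigned
        self.name = name       # the field name
        self.context = context # the object being assigned to


class Note:
    """Hypothetical target object with a 'body' field."""


# Hypothetical policy table keyed on (target type, field name).
POLICIES = {
    (Note, "body"): lambda fragment, target: fragment.replace("darn", "****"),
}


def censor_before_assigned(fragment, target, event):
    policy = POLICIES.get((type(target), event.name))
    if policy is None:
        return  # no policy for this object and field: leave the event alone
    censored = policy(fragment, target)
    if censored is not fragment:
        # Updating the event changes the value that will be assigned.
        event.object = censored
```

Because the subscriber mutates the event rather than the target, the field still performs the assignment itself, just with the censored value.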
nti.contentfragments.html module¶
Converters and utilities for dealing with HTML content fragments. In particular, sanitization.
-
nti.contentfragments.html.
sanitize_user_html
(user_input, method='html')[source]¶ Given a user input string of plain text, HTML or HTML fragment, sanitize by removing unsupported/dangerous elements and doing some normalization. If it can be represented in plain text, do so.
Parameters: method (string) – One of the method values acceptable to lxml.etree.tostring(). The default value, html, causes this method to produce either HTML or plain text, whichever is most appropriate. Passing the value text causes this method to produce only plain text captured by traversing the elements with lxml. Note: this is legacy functionality, and callers should generally convert via calling the interfaces.
Returns: Something that implements frg_interfaces.IUnicodeContentFragment, typically either frg_interfaces.IPlainTextContentFragment or frg_interfaces.ISanitizedHTMLContentFragment.
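To show the general idea of allowlist-based sanitizing, here is a small standalone sketch built on the standard library's `html.parser` instead of the lxml-based machinery this module uses. The allowed tag and attribute sets are invented for the example, `sanitize_sketch` is a hypothetical name, and the `extra_attrs` parameter only loosely mirrors the role of IAllowedAttributeProvider.

```python
from html import escape, unescape
from html.parser import HTMLParser

ALLOWED_TAGS = {"a", "b", "i", "p", "span"}  # assumed for illustration
ALLOWED_ATTRS = {"href"}                      # assumed for illustration
DROP_WITH_CONTENT = {"script", "style"}


class _Sanitizer(HTMLParser):
    def __init__(self, allowed_attrs):
        super().__init__(convert_charrefs=True)
        self.allowed_attrs = allowed_attrs
        self.out = []
        self.kept_markup = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
        elif not self._skip_depth and tag in ALLOWED_TAGS:
            kept = "".join(
                ' %s="%s"' % (name, escape(value or "", quote=True))
                for name, value in attrs if name in self.allowed_attrs
            )
            self.out.append("<%s%s>" % (tag, kept))
            self.kept_markup = True

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif not self._skip_depth and tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize_sketch(user_input, extra_attrs=()):
    """Return (text, kind) where kind is 'html' or 'plain'."""
    parser = _Sanitizer(ALLOWED_ATTRS | set(extra_attrs))
    parser.feed(user_input)
    parser.close()
    text = "".join(parser.out)
    if parser.kept_markup:
        return text, "html"
    # No markup survived, so the result can be represented as plain text.
    return unescape(text), "plain"
```

This sketch does not attempt the normalization or link handling that the real function performs; it only shows dropping unsupported elements and attributes and falling back to plain text.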
nti.contentfragments.latex module¶
Implementations of content fragment transformers for latex.
-
nti.contentfragments.latex.
PlainTextToLatexFragmentConverter
(plain_text, text_scaper=u'')[source]¶ Attempt to convert plain-text strings into LaTeX strings by detecting equations/expressions that could be rendered in latex markup.
nti.contentfragments.punctuation module¶
nti.contentfragments.schema module¶
Helper classes to use content fragments in zope.interface or zope.schema declarations.
-
class
nti.contentfragments.schema.
HTMLContentFragment
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.TextUnicodeContentFragment
A Text type that also requires the object implement an interface descending from IHTMLContentFragment.
Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.
Note
If you provide a default string that does not already provide IHTMLContentFragment, one will be created simply by copying; no validation or transformation will occur.
-
class
nti.contentfragments.schema.
LatexFragmentTextLine
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.TextLineUnicodeContentFragment
A TextLine that requires content to be in LaTeX format.
Pass the keyword arguments for TextLine to the constructor; the schema argument for Object is already handled.
Note
If you provide a default string that does not already provide ILatexContentFragment, one will be created simply by copying; no validation or transformation will occur.
-
class
nti.contentfragments.schema.
PlainText
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.TextUnicodeContentFragment
A zope.schema.Text that requires content to be plain text.
Pass the keyword arguments for Text to the constructor; the schema argument for Object is already handled.
Note
If you provide a default string that does not already provide IPlainTextContentFragment, one will be created simply by copying; no validation or transformation will occur.
Caution
This will perform conversions on the input data, stripping things that “look like” HTML, if it does not already implement the required interface.
-
class
nti.contentfragments.schema.
PlainTextLine
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.TextLineUnicodeContentFragment
A TextLine that requires content to be plain text.
Pass the keyword arguments for TextLine to the constructor; the schema argument for Object is already handled.
Note
If you provide a default string that does not already provide IPlainTextContentFragment, one will be created simply by copying; no validation or transformation will occur.
Caution
This will perform conversions on the input data, stripping things that “look like” HTML, if it does not already implement the required interface.
-
class
nti.contentfragments.schema.
RstContentFragment
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.TextUnicodeContentFragment
A zope.schema.Text type that also requires the object implement an interface descending from IRstContentFragment. Note that currently this does no validation of the content to ensure it is valid reStructuredText.
Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.
Note
If you provide a default string that does not already provide IRstContentFragment, one will be created simply by copying; no validation or transformation will occur.
-
class
nti.contentfragments.schema.
SanitizedHTMLContentFragment
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.HTMLContentFragment
A Text type that also requires the object implement an interface descending from ISanitizedHTMLContentFragment. Note that the default adapter for this can actually produce IPlainTextContentFragment if there is no HTML present in the input.
Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.
Note
If you provide a default string that does not already provide ISanitizedHTMLContentFragment, one will be created simply by copying; no validation or transformation will occur.
-
class
nti.contentfragments.schema.
Tag
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.PlainTextLine
Requires its content to be only one plain text word that is lowercased.
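The constraint can be pictured with a tiny standalone helper. This is only a sketch: `normalize_tag` is a hypothetical function, and it assumes that input is lowercased on the way in while anything containing whitespace is rejected, which may differ from how the field actually behaves.

```python
def normalize_tag(value):
    """Lowercase a candidate tag and insist that it is a single word."""
    tag = value.lower()
    if not tag or any(ch.isspace() for ch in tag):
        raise ValueError("a tag must be exactly one word")
    return tag
```

Normalizing case at assignment time means that lookups by tag never need to worry about `Python` and `python` being stored as different values.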
-
class
nti.contentfragments.schema.
TextLineUnicodeContentFragment
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema._FromUnicodeMixin
,nti.schema.field.Object
,nti.schema.field.ValidTextLine
A zope.schema.TextLine type that also requires the object implement an interface descending from IUnicodeContentFragment.
Pass the keyword arguments for zope.schema.TextLine to the constructor; the schema argument for Object is already handled.
If you pass neither a default nor defaultFactory argument, a defaultFactory argument will be provided to construct an empty content fragment.
Caution
This will perform conversions on the input data, stripping or adjusting things that “look like” HTML, if it does not already implement the required interface; the actual value is likely to be an ISanitizedHTMLContentFragment or an IPlainTextContentFragment.
-
class
nti.contentfragments.schema.
TextUnicodeContentFragment
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema._FromUnicodeMixin
,nti.schema.field.Object
,nti.schema.field.ValidText
A zope.schema.Text type that also requires the object implement an interface descending from IUnicodeContentFragment.
Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.
Caution
This will perform conversions on the input data, stripping or adjusting things that “look like” HTML, if it does not already implement the required interface; the actual value is likely to be an ISanitizedHTMLContentFragment or an IPlainTextContentFragment.
-
class
nti.contentfragments.schema.
VerbatimPlainText
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.PlainText
Like PlainText, except that instead of running a conversion on the input data (stripping HTML), this field simply assumes that the input data is already meant to be plain text and preserves markup as-is.
-
class
nti.contentfragments.schema.
VerbatimPlainTextLine
(*args, **kwargs)[source]¶ Bases:
nti.contentfragments.schema.PlainTextLine
Like PlainTextLine, except that instead of running a conversion on the input data (stripping HTML), this field simply assumes that the input data is already meant to be plain text and preserves markup as-is.
nti.contentfragments.urlmatcher module¶
An Improved Liberal, Accurate Regex Pattern for Matching URLs
Changes¶
1.9.0 (2021-10-26)¶
- Fix adapting base string input to plain text to behave more like 1.7 by only running the HTML to plain text algorithm if the input looks like it may contain HTML markup. Note that in some instances where characters like ‘<’ were previously escaped to ‘&lt;’, this will no longer happen if the rest of the string doesn’t look like HTML. See issue 44.
- Add schema fields VerbatimPlainText and VerbatimPlainTextLine to assume any incoming unicode value already represents a plain text content fragment, instead of (possibly) passing it through the HTML to plain text algorithm.
1.8.0 (2021-10-06)¶
- Add support for Python 3.9 and 3.10.
- Move to Github Actions from Travis CI.
- The algorithm for converting HTML to plain text has been changed and produces higher quality output. For example, links are preserved in a human-readable fashion. See issue 39.
- Fix an error getting link text when there was no link formatter utility installed. See PR 42.
1.7.0 (2020-10-07)¶
- Allow conversion of reStructuredText fragments to plain text.
1.6.0 (2020-09-02)¶
- Add support for reStructuredText content fragments and corresponding fields.
1.5.0 (2020-07-23)¶
- When sanitizing html, disable link creation when already under an anchor.
1.4.0 (2020-06-17)¶
- Allow IAllowedAttributeProvider to be registered to provide additional attributes that would be allowed in sanitized content fragments.
1.3.0 (2020-04-06)¶
- Add support for Python 3.8.
- Depend on zope.interface 5.0.
- Update the datrie dependency. See https://github.com/NextThought/nti.contentfragments/issues/24
- Make IUnicodeContentFragment extend zope.interface.common.collections.ISequence instead of the semi-deprecated zope.interface.common.sequence.IReadSequence.
- Replace custom interfaces IString, IUnicode and IBytes with aliases for INativeString, ITextString and IByteString from zope.interface.common.builtins. These custom aliases are now deprecated. See https://github.com/NextThought/nti.contentfragments/issues/23.
- Fix unicode normalization breaking schema fields with zope.schema 6.0. See https://github.com/NextThought/nti.contentfragments/issues/26
- Ensure all objects have consistent resolution orders.
1.2.1 (2019-11-07)¶
- Remove a word from the censored word list. See issue https://github.com/NextThought/nti.contentfragments/issues/22.
1.2.0 (2018-10-15)¶
- Add support for Python 3.7. Note that datrie is not yet available for Python 3.7.
- Add support for PyPy3.
- Add interfaces for all schema fields defined in nti.contentfragments.schema and make the respective classes implement them.
1.1.1 (2018-06-29)¶
- Packaging: Do not use html5lib[datrie] and instead copy that dependency into our own dependencies to work around a buildout error. See https://github.com/NextThought/nti.contentfragments/issues/17
1.1.0 (2017-06-14)¶
- Remove the dependency on dolmen.builtins. The interfaces IUnicode, IBytes and IString are now always defined by this package.
- Add support for Python 3.6.
1.0.0 (2016-08-19)¶
- Add support for Python 3.
- Stop configuring plone.i18n. It’s a big dependency and doesn’t work on Python 3.
- Introduce our own interfaces for IUnicode and IString, subclassing dolmen.builtins.IUnicode and IString, respectively, if possible.
- The word lists used in censoring are cached in memory.
- nti.contentfragments.html._Serializer has been renamed and is no longer public.
- Depend on zope.mimetype >= 2.1.0 for better support of Python 3.
nti.contentfragments¶
Support for working with string-based content in a Zope3/ZTK environment.
Overview¶
In a client/server environment dealing with various types of content from users, it’s important to know not just the Python type of a particular string, but also its semantic type: HTML, plain text, LaTeX, etc.
This package defines interfaces and classes to be able to record this information. It also features a framework for transforming between the various supported semantic types (e.g., HTML to plain text).
Other features:
- Support for making arbitrary incoming HTML safe (sanitizing it).
- Support for very configurable (optionally) event-based profanity censoring that integrates with nti.schema/zope.schema.
See the documentation for more details.