nti.contentfragments

Contents:

nti.contentfragments package

Submodules

nti.contentfragments.interfaces module

Content-related interfaces.

interface nti.contentfragments.interfaces.IAllowedAttributeProvider[source]

A way to provide a whitelist of additional attribute names that would be allowed while parsing a content fragment, thus extending the attributes already allowed.

New in version 1.4.0.

allowed_attributes

An iterable of attribute names allowed in a particular context

Implementation:nti.schema.field.IndexedIterable
Read Only:False
Required:False
Default Value:()
Allowed Type:_abcoll.Sequence

Value Type

The attribute name

Implementation:zope.schema.NativeStringLine
Read Only:False
Required:True
Default Value:None
Allowed Type:str
interface nti.contentfragments.interfaces.ICensoredContentEvent[source]
content_fragment

The content that was censored

name

The name of the attribute under which the censored content will be assigned.

context

The context object to which the object will be assigned.

censored_content

The censored content

interface nti.contentfragments.interfaces.ICensoredContentPolicy[source]

A top-level policy that combines detection of content ranges to censor with a strategy for censoring them.

censor(content_fragment, context)

Censors the content fragment appropriately and returns the censored value.

Parameters:
  • content_fragment – The fragment being censored.
  • context – The object that this content fragment should be censored with regard to. For example, the fragment’s container or composite object that will hold the fragment.
Returns:

The censored content fragment, if any censoring was done to it. May also raise a ValueError if censoring is not allowed and the content should be thrown away.
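The contract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not this package's implementation: `SketchPolicy`, `scan`, `stars` and the `allow_censoring` flag are hypothetical names, and the sketch ignores `context` apart from reporting it in the error message.

```python
class SketchPolicy:
    """Hypothetical policy that pairs a scan function with a censoring function."""

    def __init__(self, scan, censor_ranges, allow_censoring=True):
        self.scan = scan
        self.censor_ranges = censor_ranges
        self.allow_censoring = allow_censoring

    def censor(self, content_fragment, context):
        ranges = list(self.scan(content_fragment))
        if not ranges:
            # Nothing was detected, so the fragment is returned unchanged.
            return content_fragment
        if not self.allow_censoring:
            # The content should be thrown away rather than altered.
            raise ValueError("content not allowed in %r" % (context,))
        return self.censor_ranges(content_fragment, ranges)


def scan(text):
    # Hypothetical detector: finds the first occurrence of one fixed word.
    idx = text.lower().find("heck")
    return [(idx, idx + 4)] if idx != -1 else []


def stars(text, ranges):
    # Hypothetical strategy: overwrite each range with asterisks.
    for start, end in sorted(ranges, reverse=True):
        text = text[:start] + "*" * (end - start) + text[end:]
    return text


policy = SketchPolicy(scan, stars)
result = policy.censor("What the heck", context=None)
```

Constructing the same class with `allow_censoring=False` makes `censor` raise `ValueError` for the same input, which corresponds to the "content should be thrown away" case.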

interface nti.contentfragments.interfaces.ICensoredContentScanner[source]

Something that can perform censoring.

Variations of censoring scanners will be registered as named utilities. Particular censoring solutions (the adapters discussed in ICensoredUnicodeContentFragment) will put together a combination of these utilities to produce the desired result.

The censoring process can further be broken down into two parts: detection of unwanted content, and reacting to unwanted content. For example, reacting might consist of replacing the content with asterisks in plain text, or a special span in HTML, or it might throw an exception to disallow the content altogether. This object performs the first part.

The names may be something like MPAA ratings, or they may follow other categories.

scan(content_fragment)

Scan the given content fragment for censored terms and return their positions as a sequence (iterator) of two-tuples (start, end). The returned tuples should be non-overlapping.
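As a rough illustration of the return value described above, here is a minimal scanner sketch in plain Python. It does not use this package: `find_ranges` and its word list are hypothetical, and case-insensitive substring matching is an assumption made for the example.

```python
def find_ranges(content_fragment, prohibited=("darn", "heck")):
    """Yield non-overlapping (start, end) pairs for prohibited substrings.

    Hypothetical stand-in for an ICensoredContentScanner.scan() implementation.
    """
    lowered = content_fragment.lower()
    found = []
    for word in prohibited:
        start = lowered.find(word)
        while start != -1:
            found.append((start, start + len(word)))
            start = lowered.find(word, start + len(word))
    # Drop any range that overlaps one that was already accepted.
    accepted = []
    for new in sorted(found):
        if all(new[0] >= old[1] or new[1] <= old[0] for old in accepted):
            accepted.append(new)
            yield new


ranges = list(find_ranges("Darn it, what the heck"))
```

Each tuple is a half-open slice into the original fragment, so `fragment[start:end]` is the matched text.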

interface nti.contentfragments.interfaces.ICensoredContentStrategy[source]

The other half of the content censoring process explained in ICensoredContentScanner, responsible for taking action on censored content.

censor_ranges(content_fragment, censored_ranges)

Censors the content fragment appropriately and returns the censored value.

Parameters:
  • content_fragment – The fragment being censored.
  • censored_ranges – The ranges of illicit content as produced by ICensoredContentScanner.scan(). They are not guaranteed to be in any particular order, so you may need to sort them with sorted(), in reverse.
Returns:

The censored content fragment, if any censoring was done to it. May also raise a ValueError if censoring is not allowed and the content should be thrown away.
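A minimal sketch of such a strategy in plain Python follows. The function below is a hypothetical stand-in rather than this package's code, and the default `replacement_char` is an assumption for the example. Processing the ranges in reverse order means earlier indices stay valid even if a replacement changes the length of the text.

```python
def censor_ranges(content_fragment, censored_ranges, replacement_char="*"):
    """Replace each (start, end) range with replacement characters.

    Hypothetical stand-in for ICensoredContentStrategy.censor_ranges().
    """
    chars = list(content_fragment)
    # Ranges may arrive in any order; work from the end of the text backward.
    for start, end in sorted(censored_ranges, reverse=True):
        chars[start:end] = replacement_char * (end - start)
    return "".join(chars)


censored = censor_ranges("Darn it, what the heck", [(18, 22), (0, 4)])
```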

interface nti.contentfragments.interfaces.ICensoredHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.IHTMLContentFragment, nti.contentfragments.interfaces.ICensoredUnicodeContentFragment

interface nti.contentfragments.interfaces.ICensoredPlainTextContentFragment[source]

Extends: nti.contentfragments.interfaces.IPlainTextContentFragment, nti.contentfragments.interfaces.ICensoredUnicodeContentFragment

interface nti.contentfragments.interfaces.ICensoredSanitizedHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.ISanitizedHTMLContentFragment, nti.contentfragments.interfaces.ICensoredHTMLContentFragment

interface nti.contentfragments.interfaces.ICensoredTerm[source]

Extends: zope.schema.interfaces.ITokenizedTerm

Base interface for a censored term

interface nti.contentfragments.interfaces.ICensoredUnicodeContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment

A content fragment that has passed through a censoring process to attempt to ensure it is safe for display to its intended audience (e.g., profanity has been removed if the expected audience is underage/sensitive to that).

The rules for censoring content will be very context specific. In particular, it will depend on who you are, and where you are adding/editing content. The who is important to differentiate between, e.g., students and teachers. The where is important to differentiate between, say, a public forum, and your private notes, or between your Human Sexuality textbook and your Calculus textbook.

For this reason, the censoring process will typically utilize multi-adapters registered on (creator, content_unit). Contrast this with sanitizing HTML, which always follows the same process.
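The idea of choosing a policy from both the "who" and the "where" can be sketched with a plain dictionary. This is only an analogy for multi-adapter lookup: it does not use zope.component, it matches exact types rather than interfaces, and every class name and policy label below is hypothetical.

```python
class Student:
    pass


class Teacher:
    pass


class PublicForum:
    pass


class PrivateNotes:
    pass


# Hypothetical registry keyed on (creator type, content unit type).
POLICIES = {
    (Student, PublicForum): "strict",
    (Student, PrivateNotes): "none",
    (Teacher, PublicForum): "lenient",
}


def lookup_policy(creator, content_unit, default="strict"):
    """Pick a policy label from who is writing and where they are writing."""
    return POLICIES.get((type(creator), type(content_unit)), default)


forum_policy = lookup_policy(Student(), PublicForum())
notes_policy = lookup_policy(Student(), PrivateNotes())
```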

interface nti.contentfragments.interfaces.IContentFragment[source]

Base interface representing different formats that content can be in.

interface nti.contentfragments.interfaces.IHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextHtml

Interface representing content in HTML format.

interface nti.contentfragments.interfaces.IHTMLContentFragmentField[source]

Extends: nti.contentfragments.interfaces.ITextUnicodeContentFragmentField

A Text type that also requires the object implement an interface descending from IHTMLContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IHyperlinkFormatter[source]

Given a string of text, look through it for hyperlinks and find them.

Returns:A sequence of strings and lxml.etree.Element objects representing the plain text and detected links, in order, within the given text.
format(html_fragment)

Process the specified IHTMLContentFragment, scanning for any plain-text links recognized by this object and converting them into new <a> elements.
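To illustrate the "strings and link objects, in order" shape of the result, here is a small sketch using only the standard library. It is not this package's formatter: `split_links` is a hypothetical helper, the pattern is deliberately simplistic, and plain `("a", href)` tuples stand in for lxml.etree.Element objects.

```python
import re

# Deliberately simple pattern for illustration; NOT the pattern this package uses.
URL_RE = re.compile(r"https?://[^\s<>]+")


def split_links(text):
    """Return plain-text pieces and ('a', href) tuples, in document order."""
    pieces = []
    pos = 0
    for match in URL_RE.finditer(text):
        if match.start() > pos:
            pieces.append(text[pos:match.start()])
        pieces.append(("a", match.group(0)))
        pos = match.end()
    if pos < len(text):
        pieces.append(text[pos:])
    return pieces


pieces = split_links("See http://example.com/docs for details")
```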

interface nti.contentfragments.interfaces.ILatexContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextLatex

Interface representing content in LaTeX format.

interface nti.contentfragments.interfaces.ILatexFragmentTextLineField[source]

Extends: nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField

A TextLine that requires content to be in LaTeX format.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IPlainTextContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextPlain

Interface representing content in plain text format.

interface nti.contentfragments.interfaces.IPlainTextField[source]

Extends: nti.contentfragments.interfaces.ITextUnicodeContentFragmentField

A zope.schema.Text that requires content to be plain text.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IPlainTextLineField[source]

Extends: nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField

A TextLine that requires content to be plain text.

interface nti.contentfragments.interfaces.IProfanityTerm[source]

Extends: nti.contentfragments.interfaces.ICensoredTerm

Base interface for a profanity term

nti.contentfragments.interfaces.IPunctuationCharExpression

alias of nti.contentfragments.interfaces.IPunctuationMarkExpression

nti.contentfragments.interfaces.IPunctuationCharExpressionPlus

alias of nti.contentfragments.interfaces.IPunctuationMarkExpressionPlus

nti.contentfragments.interfaces.IPunctuationCharPattern

alias of nti.contentfragments.interfaces.IPunctuationMarkPattern

nti.contentfragments.interfaces.IPunctuationCharPatternPlus

alias of nti.contentfragments.interfaces.IPunctuationMarkPatternPlus

interface nti.contentfragments.interfaces.IPunctuationMarkExpression[source]

Marker interface for a punctuation regular expression.

interface nti.contentfragments.interfaces.IPunctuationMarkExpressionPlus[source]

Marker interface for a punctuation + space regular expression.

interface nti.contentfragments.interfaces.IPunctuationMarkPattern[source]

Marker interface for a punctuation regular expression pattern.

interface nti.contentfragments.interfaces.IPunctuationMarkPatternPlus[source]

Marker interface for a punctuation + space regular expression pattern.

interface nti.contentfragments.interfaces.IRstContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextRst

Interface representing content in RST format.

interface nti.contentfragments.interfaces.IRstContentFragmentField[source]

Extends: nti.contentfragments.interfaces.ITextUnicodeContentFragmentField

A Text type that also requires the object implement an interface descending from IRstContentFragment.

New in version 1.6.0.

interface nti.contentfragments.interfaces.ISanitizedHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.IHTMLContentFragment

HTML content, typically of unknown or untrusted provenance, that has been sanitized for “safe” presentation in a generic, also unknown browsing context. Typically this will mean that certain unsafe constructs, such as <script> tags have been removed.

interface nti.contentfragments.interfaces.ISanitizedHTMLContentFragmentField[source]

Extends: nti.contentfragments.interfaces.IHTMLContentFragmentField

A Text type that also requires the object implement an interface descending from ISanitizedHTMLContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.ITagField[source]

Extends: nti.contentfragments.interfaces.IPlainTextLineField

Requires its content to be only one plain text word that is lowercased.

New in version 1.2.0.
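The "one lowercased plain text word" requirement could be checked along these lines. This is a sketch only: `normalize_tag` is a hypothetical helper, and whether the real field lowercases input or rejects it is an assumption here, not something this page states.

```python
def normalize_tag(value):
    """Lowercase a candidate tag and reject values that are not one word."""
    tag = value.strip().lower()
    if not tag or len(tag.split()) != 1:
        raise ValueError("a tag must be exactly one word: %r" % (value,))
    return tag


tag = normalize_tag("Python")
```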

interface nti.contentfragments.interfaces.ITextLatexEscaper[source]
_ITextLatexEscaper__call_(text)

Escape the specified text.

interface nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField[source]

Extends: zope.schema.interfaces.IObject, zope.schema.interfaces.ITextLine

A zope.schema.TextLine type that also requires the object implement an interface descending from IUnicodeContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.ITextUnicodeContentFragmentField[source]

Extends: zope.schema.interfaces.IObject, zope.schema.interfaces.IText

A zope.schema.Text type that also requires the object implement an interface descending from IUnicodeContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IUnicodeContentFragment[source]

Extends: nti.contentfragments.interfaces.IContentFragment, zope.interface.common.collections.ISequence

Content represented as a unicode string.

Although it is simplest to subclass unicode, that is not required. At a minimum, what is required are the __getitem__ method (and others declared by IReadSequence), plus the encode method.

Changed in version 1.3.0: Extend zope.interface.common.collections.ISequence instead of the semi-deprecated zope.interface.common.sequence.IReadSequence, except on PyPy2, where ISequence cannot validate against unicode objects.

class nti.contentfragments.interfaces.CensoredContentEvent(content_fragment, censored_content, name=None, context=None)[source]

Bases: object

class nti.contentfragments.interfaces.CensoredHTMLContentFragment[source]

Bases: nti.contentfragments.interfaces.HTMLContentFragment

censored(n)
class nti.contentfragments.interfaces.CensoredPlainTextContentFragment[source]

Bases: nti.contentfragments.interfaces.PlainTextContentFragment

censored(n)
class nti.contentfragments.interfaces.CensoredSanitizedHTMLContentFragment[source]

Bases: nti.contentfragments.interfaces.CensoredHTMLContentFragment

censored(n)
class nti.contentfragments.interfaces.CensoredUnicodeContentFragment[source]

Bases: nti.contentfragments.interfaces._AddMixin, nti.contentfragments.interfaces.UnicodeContentFragment

class nti.contentfragments.interfaces.HTMLContentFragment[source]

Bases: nti.contentfragments.interfaces._AddMixin, nti.contentfragments.interfaces.UnicodeContentFragment

censored(n)
class nti.contentfragments.interfaces.LatexContentFragment[source]

Bases: nti.contentfragments.interfaces.UnicodeContentFragment

class nti.contentfragments.interfaces.PlainTextContentFragment[source]

Bases: nti.contentfragments.interfaces.UnicodeContentFragment

censored(n)
class nti.contentfragments.interfaces.RstContentFragment[source]

Bases: nti.contentfragments.interfaces.UnicodeContentFragment

class nti.contentfragments.interfaces.SanitizedHTMLContentFragment[source]

Bases: nti.contentfragments.interfaces.HTMLContentFragment

censored(n)
class nti.contentfragments.interfaces.UnicodeContentFragment[source]

Bases: unicode

Subclasses should override the __add__() method to return objects that implement the appropriate (most derived, generally) interface.

This object DOES NOT add a dictionary to the unicode type. In particular, it should not be weak referenced. Subclasses that do not expect to be persisted in the ZODB may add additional attributes by adding to the __slots__ field (not the instance value).
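The two points above, overriding `__add__` and keeping instances free of a dictionary, can be sketched in Python 3 as follows. The class names are hypothetical and `str` stands in for `unicode`; this illustrates the pattern and is not the package's own class.

```python
class Fragment(str):
    """Hypothetical str subclass that carries no per-instance __dict__."""

    __slots__ = ()  # no __dict__ and no __weakref__ slot on instances

    def __add__(self, other):
        # Keep the most derived type when concatenating.
        return type(self)(str.__add__(self, other))


class PlainFragment(Fragment):
    __slots__ = ()


joined = PlainFragment("Hello, ") + "world"
```

Without the `__add__` override, the concatenation would return a plain `str` and lose the information about which kind of fragment it is.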

censored(n)
lower() → unicode[source]

Return a copy of the string S converted to lowercase.

translate(table) → unicode[source]

Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.

upper() → unicode[source]

Return a copy of S converted to uppercase.

nti.contentfragments.censor module

Algorithms for content censoring.

The algorithms contained in here are trivially simple. We could do much better, for example, with prefix trees. See https://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/ and http://pypi.python.org/pypi/trie/0.1.1

If efficiency really matters and we have many different filters to apply, we would need to do a better job of pipelining to avoid copies.

class nti.contentfragments.censor.BasicScanner[source]

Bases: object

do_scan(fragment, ranges)[source]

do_scan is passed a fragment that is guaranteed to be unicode and lower case.

scan(content_fragment)[source]
test_range(new_range, yielded)[source]
class nti.contentfragments.censor.DefaultCensoredContentPolicy(fragment=None, target=None)[source]

Bases: object

A content censoring policy that looks up the default scanner and strategy utilities and uses them.

This package does not register this policy as an adapter for anything; you must do that yourself, on (content-fragment, target-object). It can also be registered as a utility or instantiated directly with no arguments.

censor(fragment, target)[source]
censor_html(fragment, target)[source]
censor_text(fragment, target)[source]
class nti.contentfragments.censor.NoOpCensoredContentPolicy(*args, **kwargs)[source]

Bases: object

A content censoring policy that does no censoring whatsoever.

This package does not register this policy as an adapter for anything; you must do that yourself, on (content-fragment, target-object). It can also be registered as a utility or instantiated directly with no arguments.

censor(fragment, _target)[source]
class nti.contentfragments.censor.PipeLineMatchScanner(scanners=())[source]

Bases: nti.contentfragments.censor.BasicScanner

do_scan(content_fragment, yielded)[source]

do_scan is passed a fragment that is guaranteed to be unicode and lower case.

class nti.contentfragments.censor.SimpleReplacementCensoredContentStrategy(replacement_char=u'*')[source]

Bases: object

censor_ranges(content_fragment, censored_ranges)[source]
class nti.contentfragments.censor.TrivialMatchScanner(prohibited_values=())[source]

Bases: nti.contentfragments.censor.BasicScanner

do_scan(content_fragment, yielded)[source]

do_scan is passed a fragment that is guaranteed to be unicode and lower case.

class nti.contentfragments.censor.WordMatchScanner(white_words=(), prohibited_words=())[source]

Bases: nti.contentfragments.censor.BasicScanner

do_scan(content_fragment, yielded)[source]

do_scan is passed a fragment that is guaranteed to be unicode and lower case.

char_tester[source]
nti.contentfragments.censor.censor_assign(fragment, target, field_name)[source]

Perform manual censoring of assigning an object to a field.

nti.contentfragments.censor.censor_before_assign_components_of_sequence(sequence, target, event)[source]

Register this adapter for (usually any) sequence, some specific interface target, and the nti.schema.interfaces.IBeforeSequenceAssignedEvent and it will iterate across the fields and attempt to censor each of them.

This package DOES NOT register this event.

nti.contentfragments.censor.censor_before_text_assigned(fragment, target, event)[source]

Watches for field values to be assigned, and looks for specific policies for the given object and field name to handle censoring. If such a policy is found and returns something that is not the original fragment, the event is updated (and so the value assigned to the target is also updated).

nti.contentfragments.censor.punkt_re_char(lang='en')[source]

nti.contentfragments.html module

Converters and utilities for dealing with HTML content fragments. In particular, sanitization.

class nti.contentfragments.html.FakeRe[source]

Bases: object

match(regex, val)[source]
nti.contentfragments.html.may_contain_html_like_markup(*args, **kwargs)[source]
nti.contentfragments.html.sanitize_user_html(user_input, method='html')[source]

Given a user input string of plain text, HTML or HTML fragment, sanitize by removing unsupported/dangerous elements and doing some normalization. If it can be represented in plain text, do so.

Parameters:method (string) – One of the method values acceptable to lxml.etree.tostring(). The default value, html, causes this method to produce either HTML or plain text, whatever is most appropriate. Passing the value text causes this method to produce only plain text captured by traversing the elements with lxml. Note: this is legacy functionality, and callers should generally convert via calling the interfaces.
Returns:Something that implements frg_interfaces.IUnicodeContentFragment, typically either frg_interfaces.IPlainTextContentFragment or frg_interfaces.ISanitizedHTMLContentFragment.

nti.contentfragments.latex module

Implementations of content fragment transformers for latex.

nti.contentfragments.latex.PlainTextToLatexFragmentConverter(plain_text, text_scaper=u'')[source]

Attempt to convert plain-text strings into LaTeX strings by detecting equations/expressions that could be rendered in latex markup.

nti.contentfragments.latex.cleanup_equation_tokens(tokens)[source]

Perform cleanups on the individual tokens that make up an equation before converting it to string form.

Returns:A 3-tuple: (before string, tokens, after_string)
nti.contentfragments.latex.escape_tex(text, name=u'')[source]
nti.contentfragments.latex.is_equation_component(token)[source]

nti.contentfragments.punctuation module

nti.contentfragments.schema module

Helper classes to use content fragments in zope.interface or zope.schema declarations.

class nti.contentfragments.schema.HTMLContentFragment(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.TextUnicodeContentFragment

A Text type that also requires the object implement an interface descending from IHTMLContentFragment.

Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.

Note

If you provide a default string that does not already provide IHTMLContentFragment, one will be created simply by copying; no validation or transformation will occur.

class nti.contentfragments.schema.LatexFragmentTextLine(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.TextLineUnicodeContentFragment

A TextLine that requires content to be in LaTeX format.

Pass the keyword arguments for TextLine to the constructor; the schema argument for Object is already handled.

Note

If you provide a default string that does not already provide ILatexContentFragment, one will be created simply by copying; no validation or transformation will occur.

class nti.contentfragments.schema.PlainText(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.TextUnicodeContentFragment

A zope.schema.Text that requires content to be plain text.

Pass the keyword arguments for Text to the constructor; the schema argument for Object is already handled.

Note

If you provide a default string that does not already provide IPlainTextContentFragment, one will be created simply by copying; no validation or transformation will occur.

Caution

This will perform conversions on the input data, stripping things that “look like” HTML, if it does not already implement the required interface.

class nti.contentfragments.schema.PlainTextLine(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.TextLineUnicodeContentFragment

A TextLine that requires content to be plain text.

Pass the keyword arguments for TextLine to the constructor; the schema argument for Object is already handled.

Note

If you provide a default string that does not already provide IPlainTextContentFragment, one will be created simply by copying; no validation or transformation will occur.

Caution

This will perform conversions on the input data, stripping things that “look like” HTML, if it does not already implement the required interface.

class nti.contentfragments.schema.RstContentFragment(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.TextUnicodeContentFragment

A zope.schema.Text type that also requires the object implement an interface descending from IRstContentFragment. Note that currently this does no validation of the content to ensure it is valid reStructuredText.

Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.

Note

If you provide a default string that does not already provide IRstContentFragment, one will be created simply by copying; no validation or transformation will occur.

fromUnicode(value)[source]

We implement IFromUnicode by adapting the given object to our text schema.

This happens after unicode normalization.

class nti.contentfragments.schema.SanitizedHTMLContentFragment(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.HTMLContentFragment

A Text type that also requires the object implement an interface descending from ISanitizedHTMLContentFragment. Note that the default adapter for this can actually produce IPlainTextContentFragment if there is no HTML present in the input.

Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.

Note

If you provide a default string that does not already provide ISanitizedHTMLContentFragment, one will be created simply by copying; no validation or transformation will occur.

class nti.contentfragments.schema.Tag(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.PlainTextLine

Requires its content to be only one plain text word that is lowercased.

fromUnicode(value)[source]

We implement IFromUnicode by adapting the given object to our text schema.

This happens after unicode normalization.

class nti.contentfragments.schema.TextLineUnicodeContentFragment(*args, **kwargs)[source]

Bases: nti.contentfragments.schema._FromUnicodeMixin, nti.schema.field.Object, nti.schema.field.ValidTextLine

A zope.schema.TextLine type that also requires the object implement an interface descending from IUnicodeContentFragment.

Pass the keyword arguments for zope.schema.TextLine to the constructor; the schema argument for Object is already handled.

If you pass neither a default nor defaultFactory argument, a defaultFactory argument will be provided to construct an empty content fragment.

Caution

This will perform conversions on the input data, stripping or adjusting things that “look like” HTML, if it does not already implement the required interface; the actual value is likely to be a ISanitizedHTMLContentFragment or a IPlainTextContentFragment.

class nti.contentfragments.schema.TextUnicodeContentFragment(*args, **kwargs)[source]

Bases: nti.contentfragments.schema._FromUnicodeMixin, nti.schema.field.Object, nti.schema.field.ValidText

A zope.schema.Text type that also requires the object implement an interface descending from IUnicodeContentFragment.

Pass the keyword arguments for zope.schema.Text to the constructor; the schema argument for Object is already handled.

Caution

This will perform conversions on the input data, stripping or adjusting things that “look like” HTML, if it does not already implement the required interface; the actual value is likely to be a ISanitizedHTMLContentFragment or a IPlainTextContentFragment.

class nti.contentfragments.schema.VerbatimPlainText(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.PlainText

Like PlainText, except that instead of running a conversion on the input data (stripping HTML), this field simply assumes that the input data is already meant to be plain text and preserves markup as-is.

class nti.contentfragments.schema.VerbatimPlainTextLine(*args, **kwargs)[source]

Bases: nti.contentfragments.schema.PlainTextLine

Like PlainTextLine, except that instead of running a conversion on the input data (stripping HTML), this field simply assumes that the input data is already meant to be plain text and preserves markup as-is.

nti.contentfragments.schema.Title()[source]

Return a zope.schema.interfaces.IField representing the standard title of some object. This should be stored in the title field.

nti.contentfragments.urlmatcher module

An Improved Liberal, Accurate Regex Pattern for Matching URLs

class nti.contentfragments.urlmatcher.GrubberHyperlinkFormatter[source]

Bases: object

format(html_fragment)[source]
grubber_v1_pattern = <_sre.SRE_Pattern object>

Changes

1.9.0 (2021-10-26)

  • Fix adapting base string input to plain text to behave more like 1.7 by only running the HTML to plain text algorithm if the input looks like it may contain HTML markup. Note that in some instances where characters like ‘<’ were previously escaped to ‘&lt;’, this will no longer happen if the rest of the string doesn’t look like HTML. See issue 44.
  • Add schema fields VerbatimPlainText and VerbatimPlainTextLine to assume any incoming unicode value already represents a plain text content fragment, instead of (possibly) passing it through the HTML to plain text algorithm.

1.8.0 (2021-10-06)

  • Add support for Python 3.9 and 3.10.
  • Move to GitHub Actions from Travis CI.
  • The algorithm for converting HTML to plain text has been changed and produces higher quality output. For example, links are preserved in a human-readable fashion. See issue 39.
  • Fix an error getting link text when there was no link formatter utility installed. See PR 42.

1.7.0 (2020-10-07)

  • Allow conversion of reStructuredText fragments to plain text.

1.6.1 (2020-09-14)

  • Ensure disallowed tags nested within anchors do not raise. See issue 34.

1.6.0 (2020-09-02)

  • Add support for reStructuredText content fragments and corresponding fields.

1.5.0 (2020-07-23)

  • When sanitizing HTML, disable link creation when already under an anchor.

1.4.0 (2020-06-17)

  • Allow IAllowedAttributeProvider to be registered to provide additional attributes that would be allowed in sanitized content fragments.

1.3.0 (2020-04-06)

1.2.1 (2019-11-07)

1.2.0 (2018-10-15)

  • Add support for Python 3.7. Note that datrie is not yet available for Python 3.7.
  • Add support for PyPy3.
  • Add interfaces for all schema fields defined in nti.contentfragments.schema and make the respective classes implement them.

1.1.1 (2018-06-29)

1.1.0 (2017-06-14)

  • Remove dependency of dolmen.builtins. The interfaces IUnicode, IBytes and IString are now always defined by this package.
  • Add support for Python 3.6.

1.0.0 (2016-08-19)

  • Add support for Python 3.
  • Stop configuring plone.i18n. It’s a big dependency and doesn’t work on Python 3.
  • Introduce our own interfaces for IUnicode and IString, subclassing dolmen.builtins.IUnicode and IString, respectively, if possible.
  • The word lists used in censoring are cached in memory.
  • nti.contentfragments.html._Serializer has been renamed and is no longer public.
  • Depend on zope.mimetype >= 2.1.0 for better support of Python 3.

nti.contentfragments

Support for working with string-based content in a Zope3/ZTK environment.

Overview

In a client/server environment dealing with various types of content from users, it’s important to know not just the Python type of a particular string, but also its semantic type: HTML, plain text, LaTeX, etc.

This package defines interfaces and classes to be able to record this information. It also features a framework for transforming between the various supported semantic types (e.g., HTML to plain text).

Other features:

  • Support for making arbitrary incoming HTML safe (sanitizing it).
  • Support for highly configurable, optionally event-based profanity censoring that integrates with nti.schema/zope.schema.

See the documentation for more details.
