nti.contentfragments.interfaces module

Content-related interfaces.

interface nti.contentfragments.interfaces.IAllowedAttributeProvider[source]

A way to provide a whitelist of additional attribute names that would be allowed while parsing a content fragment, thus extending the attributes already allowed.

New in version 1.4.0.

allowed_attributes

An iterable of attribute names allowed in a particular context

Implementation:nti.schema.field.IndexedIterable
Read Only:False
Required:False
Default Value:()
Allowed Type:_abcoll.Sequence

Value Type

The attribute name

Implementation:zope.schema.NativeStringLine
Read Only:False
Required:True
Default Value:None
Allowed Type:str
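
As a rough illustration of how such a provider might be consumed, the sketch below merges a provider's extra names into a base set. It uses plain Python rather than zope.interface, and `BASE_ALLOWED`, `ExtraAttributes`, and `effective_attributes` are hypothetical names that do not come from nti.contentfragments.

```python
# Hypothetical stand-in for the attribute names a sanitizer already allows.
BASE_ALLOWED = frozenset({"href", "src", "alt"})


class ExtraAttributes(object):
    """Duck-typed object exposing ``allowed_attributes`` (assumed shape)."""

    # Mirrors the documented default value: an empty sequence.
    allowed_attributes = ()

    def __init__(self, allowed_attributes=()):
        self.allowed_attributes = tuple(allowed_attributes)


def effective_attributes(provider=None):
    """Return the base names extended by the provider's names, if any."""
    extra = provider.allowed_attributes if provider is not None else ()
    return BASE_ALLOWED | frozenset(extra)
```

With no provider the base set is returned unchanged; with one, its names are added rather than replacing the defaults.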
interface nti.contentfragments.interfaces.ICensoredContentEvent[source]
content_fragment

The content that was censored

name

The name of the attribute under which the censored content will be assigned.

context

The context object to which the censored content will be assigned.

censored_content

The censored content

interface nti.contentfragments.interfaces.ICensoredContentPolicy[source]

A top-level policy combines detection of the content ranges to censor with a strategy for censoring them.

censor(content_fragment, context)

Censors the content fragment appropriately and returns the censored value.

Parameters:
  • content_fragment – The fragment being censored.
  • context – The object that this content fragment should be censored with regard to. For example, the fragment’s container or composite object that will hold the fragment.
Returns:

The censored content fragment, if any censoring was done to it. May also raise a ValueError if censoring is not allowed and the content should be thrown away.
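
A minimal sketch of how a policy might combine the two halves, assuming duck-typed objects with `scan` and `censor_ranges` methods. `SubstringScanner`, `RejectingStrategy`, and `SimplePolicy` are hypothetical names for illustration, not classes shipped by the package.

```python
class SubstringScanner(object):
    """Hypothetical scanner: reports each occurrence of one word."""

    def __init__(self, word):
        if not word:
            raise ValueError("word must be non-empty")
        self.word = word

    def scan(self, content_fragment):
        start = content_fragment.find(self.word)
        while start != -1:
            end = start + len(self.word)
            yield (start, end)
            # Resume after this match so ranges never overlap.
            start = content_fragment.find(self.word, end)


class RejectingStrategy(object):
    """Hypothetical strategy: refuses the content instead of masking it."""

    def censor_ranges(self, content_fragment, censored_ranges):
        raise ValueError("content is not allowed")


class SimplePolicy(object):
    """Runs the scanner, then hands any hits to the strategy."""

    def __init__(self, scanner, strategy):
        self.scanner = scanner
        self.strategy = strategy

    def censor(self, content_fragment, context):
        # ``context`` is unused in this sketch; a real policy may consult it.
        ranges = list(self.scanner.scan(content_fragment))
        if not ranges:
            return content_fragment
        return self.strategy.censor_ranges(content_fragment, ranges)
```

Clean content passes through untouched, while flagged content raises `ValueError`, matching the "thrown away" case described above.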

interface nti.contentfragments.interfaces.ICensoredContentScanner[source]

Something that can perform censoring.

Variations of censoring scanners will be registered as named utilities. Particular censoring solutions (the adapters discussed in ICensoredUnicodeContentFragment) will put together a combination of these utilities to produce the desired result.

The censoring process can further be broken down into two parts: detection of unwanted content, and reacting to unwanted content. For example, reacting might consist of replacing the content with asterisks in plain text, or a special span in HTML, or it might throw an exception to disallow the content altogether. This object performs the first part.

The names may be something like MPAA ratings, or they may follow other categories.

scan(content_fragment)

Scan the given content fragment for censored terms and return their positions as a sequence (iterator) of two-tuples (start, end). The returned tuples should be non-overlapping.
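
One way a scanner could produce such ranges is with a compiled regular expression. This is only a sketch: `RegexScanner` and its word-list constructor are assumptions made for illustration and are not the package's own scanners.

```python
import re


class RegexScanner(object):
    """Hypothetical scanner matching any of a list of terms."""

    def __init__(self, words):
        # Try longer terms first so they win over their own prefixes.
        ordered = sorted(words, key=len, reverse=True)
        if ordered:
            pattern = "|".join(re.escape(w) for w in ordered)
            self._regex = re.compile(pattern, re.IGNORECASE)
        else:
            self._regex = None

    def scan(self, content_fragment):
        if self._regex is None:
            return
        # finditer yields non-overlapping matches from left to right.
        for match in self._regex.finditer(content_fragment):
            yield (match.start(), match.end())
```

Because `re.finditer` never returns overlapping matches, the non-overlap requirement is met without extra bookkeeping.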

interface nti.contentfragments.interfaces.ICensoredContentStrategy[source]

The other half of the content censoring process explained in ICensoredContentScanner, responsible for taking action on the content flagged for censoring.

censor_ranges(content_fragment, censored_ranges)

Censors the content fragment appropriately and returns the censored value.

Parameters:
  • content_fragment – The fragment being censored.
  • censored_ranges – The ranges of illicit content as produced by ICensoredContentScanner.scan(). They are not guaranteed to be in any particular order, so you may need to sort them with sorted() (in reverse).
Returns:

The censored content fragment, if any censoring was done to it. May also raise a ValueError if censoring is not allowed and the content should be thrown away.
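
The asterisk replacement mentioned for plain text can be sketched as follows. `AsteriskStrategy` is a hypothetical name, and the example assumes the ranges are non-overlapping `(start, end)` tuples.

```python
class AsteriskStrategy(object):
    """Hypothetical strategy: masks each censored range with asterisks."""

    def censor_ranges(self, content_fragment, censored_ranges):
        result = content_fragment
        # Process ranges from the end of the text backwards. The mask here
        # has the same length as the text it replaces, but reverse order
        # would also keep earlier offsets valid if the lengths differed.
        for start, end in sorted(censored_ranges, reverse=True):
            result = result[:start] + "*" * (end - start) + result[end:]
        return result
```

Sorting inside the method means callers may pass the ranges in any order.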

interface nti.contentfragments.interfaces.ICensoredHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.IHTMLContentFragment, nti.contentfragments.interfaces.ICensoredUnicodeContentFragment

interface nti.contentfragments.interfaces.ICensoredPlainTextContentFragment[source]

Extends: nti.contentfragments.interfaces.IPlainTextContentFragment, nti.contentfragments.interfaces.ICensoredUnicodeContentFragment

interface nti.contentfragments.interfaces.ICensoredSanitizedHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.ISanitizedHTMLContentFragment, nti.contentfragments.interfaces.ICensoredHTMLContentFragment

interface nti.contentfragments.interfaces.ICensoredTerm[source]

Extends: zope.schema.interfaces.ITokenizedTerm

Base interface for a censored term

interface nti.contentfragments.interfaces.ICensoredUnicodeContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment

A content fragment that has passed through a censoring process to attempt to ensure it is safe for display to its intended audience (e.g., profanity has been removed if the expected audience is underage/sensitive to that).

The rules for censoring content will be very context specific. In particular, it will depend on who you are, and where you are adding/editing content. The who is important to differentiate between, e.g., students and teachers. The where is important to differentiate between, say, a public forum, and your private notes, or between your Human Sexuality textbook and your Calculus textbook.

For this reason, the censoring process will typically utilize multi-adapters registered on (creator, content_unit). Contrast this with sanitizing HTML, which always follows the same process.
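
The shape of a lookup keyed on both the creator and the content unit can be imitated with a plain dictionary. This is not how zope.component resolves multi-adapters (which works on interfaces, not exact classes); every name below is hypothetical and exists only to show why two keys are needed.

```python
class Student(object):
    pass


class Teacher(object):
    pass


class PublicForum(object):
    pass


class PrivateNotes(object):
    pass


def strict(text):
    """Stand-in policy that masks one example word."""
    return text.replace("darn", "****")


def lenient(text):
    """Stand-in policy that leaves the text alone."""
    return text


# Registry keyed on (creator class, content unit class).
_REGISTRY = {}


def register(creator_type, unit_type, policy):
    _REGISTRY[(creator_type, unit_type)] = policy


def lookup_policy(creator, content_unit, default=None):
    """Pick a policy from who is writing and where they are writing."""
    return _REGISTRY.get((type(creator), type(content_unit)), default)


register(Student, PublicForum, strict)
register(Student, PrivateNotes, lenient)
```

The same student gets a different policy in a public forum than in private notes, which is the distinction the paragraph above draws.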

interface nti.contentfragments.interfaces.IContentFragment[source]

Base interface representing different formats that content can be in.

interface nti.contentfragments.interfaces.IHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextHtml

Interface representing content in HTML format.

interface nti.contentfragments.interfaces.IHTMLContentFragmentField[source]

Extends: nti.contentfragments.interfaces.ITextUnicodeContentFragmentField

A Text type that also requires the object implement an interface descending from IHTMLContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IHyperlinkFormatter[source]

Given a string of text, find the hyperlinks in it.

Returns:A sequence of strings and lxml.etree.Element objects representing the plain text and detected links, in order, within the given text.
format(html_fragment)

Process the specified IHTMLContentFragment, scanning it and converting any plain-text links recognized by this object into new <a> elements.
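
The link-finding behavior described above, which returns text and elements in order, might look roughly like this. The sketch substitutes the standard library's `xml.etree.ElementTree` for lxml, and `split_links` and the URL pattern are assumptions for illustration only.

```python
import re
from xml.etree import ElementTree as ET

# Deliberately simple pattern; real link detection is more involved.
_URL = re.compile(r"https?://[^\s<>\"']+")


def split_links(text):
    """Return plain-text strings and <a> elements in document order."""
    parts = []
    pos = 0
    for match in _URL.finditer(text):
        if match.start() > pos:
            # Plain text between the previous link and this one.
            parts.append(text[pos:match.start()])
        link = ET.Element("a", {"href": match.group(0)})
        link.text = match.group(0)
        parts.append(link)
        pos = match.end()
    if pos < len(text):
        parts.append(text[pos:])
    return parts
```

Callers can tell the two kinds of item apart with `isinstance(part, str)`.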

interface nti.contentfragments.interfaces.ILatexContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextLatex

Interface representing content in LaTeX format.

interface nti.contentfragments.interfaces.ILatexFragmentTextLineField[source]

Extends: nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField

A TextLine that requires content to be in LaTeX format.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IPlainTextContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextPlain

Interface representing content in plain text format.

interface nti.contentfragments.interfaces.IPlainTextField[source]

Extends: nti.contentfragments.interfaces.ITextUnicodeContentFragmentField

A zope.schema.Text that requires content to be plain text.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IPlainTextLineField[source]

Extends: nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField

A TextLine that requires content to be plain text.

interface nti.contentfragments.interfaces.IProfanityTerm[source]

Extends: nti.contentfragments.interfaces.ICensoredTerm

Base interface for a profanity term

nti.contentfragments.interfaces.IPunctuationCharExpression

alias of nti.contentfragments.interfaces.IPunctuationMarkExpression

nti.contentfragments.interfaces.IPunctuationCharExpressionPlus

alias of nti.contentfragments.interfaces.IPunctuationMarkExpressionPlus

nti.contentfragments.interfaces.IPunctuationCharPattern

alias of nti.contentfragments.interfaces.IPunctuationMarkPattern

nti.contentfragments.interfaces.IPunctuationCharPatternPlus

alias of nti.contentfragments.interfaces.IPunctuationMarkPatternPlus

interface nti.contentfragments.interfaces.IPunctuationMarkExpression[source]

Marker interface for a punctuation regular expression.

interface nti.contentfragments.interfaces.IPunctuationMarkExpressionPlus[source]

Marker interface for a punctuation + space regular expression.

interface nti.contentfragments.interfaces.IPunctuationMarkPattern[source]

Marker interface for a punctuation regular expression pattern.

interface nti.contentfragments.interfaces.IPunctuationMarkPatternPlus[source]

Marker interface for a punctuation + space regular expression pattern.

interface nti.contentfragments.interfaces.IRstContentFragment[source]

Extends: nti.contentfragments.interfaces.IUnicodeContentFragment, zope.mimetype.mtypes.IContentTypeTextRst

Interface representing content in RST format.

interface nti.contentfragments.interfaces.IRstContentFragmentField[source]

Extends: nti.contentfragments.interfaces.ITextUnicodeContentFragmentField

A Text type that also requires the object implement an interface descending from IRstContentFragment.

New in version 1.6.0.

interface nti.contentfragments.interfaces.ISanitizedHTMLContentFragment[source]

Extends: nti.contentfragments.interfaces.IHTMLContentFragment

HTML content, typically of unknown or untrusted provenance, that has been sanitized for “safe” presentation in a generic, also unknown browsing context. Typically this will mean that certain unsafe constructs, such as <script> tags, have been removed.

interface nti.contentfragments.interfaces.ISanitizedHTMLContentFragmentField[source]

Extends: nti.contentfragments.interfaces.IHTMLContentFragmentField

A Text type that also requires the object implement an interface descending from ISanitizedHTMLContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.ITagField[source]

Extends: nti.contentfragments.interfaces.IPlainTextLineField

Requires its content to be only one plain text word that is lowercased.

New in version 1.2.0.
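
The constraint can be expressed as a small check. `validate_tag` is a hypothetical helper, not the field's actual validation code, and it assumes "one word" means no whitespace anywhere in the value.

```python
def validate_tag(value):
    """Return ``value`` if it is a single lowercase word, else raise."""
    if not value:
        raise ValueError("a tag must not be empty")
    if value != value.strip() or len(value.split()) != 1:
        raise ValueError("a tag must be exactly one word")
    if value != value.lower():
        raise ValueError("a tag must be lowercase")
    return value
```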

interface nti.contentfragments.interfaces.ITextLatexEscaper[source]
_ITextLatexEscaper__call_(text)

Escape the specified text.
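
The documentation does not say which characters are escaped. As an assumption-laden sketch, the function below handles the ten characters that are special in ordinary LaTeX text; `escape_latex` and its table are hypothetical.

```python
# Assumed escape table covering LaTeX's usual special characters.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text):
    """Escape ``text`` one character at a time.

    A single pass avoids re-escaping the backslashes and braces that
    earlier replacements introduce.
    """
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)
```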

interface nti.contentfragments.interfaces.ITextLineUnicodeContentFragmentField[source]

Extends: zope.schema.interfaces.IObject, zope.schema.interfaces.ITextLine

A zope.schema.TextLine type that also requires the object implement an interface descending from IUnicodeContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.ITextUnicodeContentFragmentField[source]

Extends: zope.schema.interfaces.IObject, zope.schema.interfaces.IText

A zope.schema.Text type that also requires the object implement an interface descending from IUnicodeContentFragment.

New in version 1.2.0.

interface nti.contentfragments.interfaces.IUnicodeContentFragment[source]

Extends: nti.contentfragments.interfaces.IContentFragment, zope.interface.common.collections.ISequence

Content represented as a unicode string.

Although it is simplest to subclass unicode, that is not required. At a minimum, what is required are the __getitem__ method (and others declared by IReadSequence), plus the encode method.

Changed in version 1.3.0: Extend zope.interface.common.collections.ISequence instead of the semi-deprecated zope.interface.common.sequence.IReadSequence, except on PyPy2, where ISequence cannot validate against unicode objects.

class nti.contentfragments.interfaces.CensoredContentEvent(content_fragment, censored_content, name=None, context=None)[source]

Bases: object

class nti.contentfragments.interfaces.CensoredHTMLContentFragment[source]

Bases: nti.contentfragments.interfaces.HTMLContentFragment

censored(n)
class nti.contentfragments.interfaces.CensoredPlainTextContentFragment[source]

Bases: nti.contentfragments.interfaces.PlainTextContentFragment

censored(n)
class nti.contentfragments.interfaces.CensoredSanitizedHTMLContentFragment[source]

Bases: nti.contentfragments.interfaces.CensoredHTMLContentFragment

censored(n)
class nti.contentfragments.interfaces.CensoredUnicodeContentFragment[source]

Bases: nti.contentfragments.interfaces._AddMixin, nti.contentfragments.interfaces.UnicodeContentFragment

class nti.contentfragments.interfaces.HTMLContentFragment[source]

Bases: nti.contentfragments.interfaces._AddMixin, nti.contentfragments.interfaces.UnicodeContentFragment

censored(n)
class nti.contentfragments.interfaces.LatexContentFragment[source]

Bases: nti.contentfragments.interfaces.UnicodeContentFragment

class nti.contentfragments.interfaces.PlainTextContentFragment[source]

Bases: nti.contentfragments.interfaces.UnicodeContentFragment

censored(n)
class nti.contentfragments.interfaces.RstContentFragment[source]

Bases: nti.contentfragments.interfaces.UnicodeContentFragment

class nti.contentfragments.interfaces.SanitizedHTMLContentFragment[source]

Bases: nti.contentfragments.interfaces.HTMLContentFragment

censored(n)
class nti.contentfragments.interfaces.UnicodeContentFragment[source]

Bases: unicode

Subclasses should override the __add__() method to return objects that implement the appropriate (most derived, generally) interface.

This object DOES NOT add a dictionary to the unicode type. In particular, it should not be weak referenced. Subclasses that do not expect to be persisted in the ZODB may add additional attributes by adding to the __slots__ field (not the instance value).
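
The two points above, overriding `__add__` and keeping instances free of a dictionary, can be sketched with a string subclass. The example assumes Python 3, where `str` stands in for `unicode`, and `PlainFragment` is a hypothetical class rather than one of the package's own.

```python
class PlainFragment(str):
    """Hypothetical fragment type that survives concatenation."""

    # Empty __slots__ means instances get no __dict__ and no __weakref__.
    __slots__ = ()

    def __add__(self, other):
        result = str.__add__(self, other)
        if result is NotImplemented:
            # Let Python try the other operand or raise TypeError.
            return result
        # Re-wrap so the sum keeps the most derived type.
        return type(self)(result)
```

Without the override, `PlainFragment("a") + "b"` would return a plain `str` and lose the fragment type.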

censored(n)
lower() → unicode[source]

Return a copy of the string S converted to lowercase.

translate(table) → unicode[source]

Return a copy of the string S, where all characters have been mapped through the given translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.
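
A short illustration of the three kinds of table entry, using the built-in `str.translate` on Python 3. The table contents are arbitrary example values.

```python
table = {
    ord("l"): "L",        # ordinal mapped to a string
    ord("o"): None,       # ordinal mapped to None: character is deleted
    ord(" "): ord("_"),   # ordinal mapped to another ordinal
}

translated = "hello world".translate(table)
# Characters absent from the table (h, e, w, r, d) are left untouched.
```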

upper() → unicode[source]

Return a copy of S converted to uppercase.