Module `eyekit.text`

Defines the TextBlock and InterestArea objects for handling texts.

Classes

class Box

Representation of a bounding box, which provides an underlying framework for Character, InterestArea, and TextBlock.

Subclasses

Character
InterestArea
TextBlock

Instance variables

var x : float: X-coordinate of the center of the bounding box

var y : float: Y-coordinate of the center of the bounding box

var x_tl : float: X-coordinate of the top-left corner of the bounding box

var y_tl : float: Y-coordinate of the top-left corner of the bounding box

var x_br : float: X-coordinate of the bottom-right corner of the bounding box

var y_br : float: Y-coordinate of the bottom-right corner of the bounding box

var width : float: Width of the bounding box

var height : float: Height of the bounding box

var box : tuple: The bounding box represented as x_tl, y_tl, width, and height

var center : tuple: XY-coordinates of the center of the bounding box
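
The relationships between these properties can be sketched as follows. This is an illustrative stand-in rather than Eyekit's own Box class: the name BoxSketch is hypothetical, and the sketch assumes that the width, height, and center are all derived from the top-left and bottom-right corners.

```python
from dataclasses import dataclass


@dataclass
class BoxSketch:
    """Hypothetical stand-in for a bounding box defined by two corners."""

    x_tl: float
    y_tl: float
    x_br: float
    y_br: float

    @property
    def width(self) -> float:
        return self.x_br - self.x_tl

    @property
    def height(self) -> float:
        return self.y_br - self.y_tl

    @property
    def x(self) -> float:
        # The center lies halfway between the left and right edges
        return self.x_tl + self.width / 2

    @property
    def y(self) -> float:
        # The center lies halfway between the top and bottom edges
        return self.y_tl + self.height / 2

    @property
    def box(self) -> tuple:
        return self.x_tl, self.y_tl, self.width, self.height

    @property
    def center(self) -> tuple:
        return self.x, self.y


b = BoxSketch(x_tl=100.0, y_tl=50.0, x_br=300.0, y_br=90.0)
print(b.box)     # (100.0, 50.0, 200.0, 40.0)
print(b.center)  # (200.0, 70.0)
```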

class Character (char, x_tl, y_tl, x_br, y_br, baseline, log_pos)

Representation of a single character of text. A Character object is essentially a one-letter string that occupies a position in space and has a bounding box. It is not usually necessary to create Character objects manually; they are created automatically during the instantiation of a TextBlock.

Ancestors

Box

Instance variables

var baseline : float: The y position of the character baseline

var midline : float: The y position of the character midline

Methods

def serialize(self) ‑> list

Inherited members

Box:
- box
- center
- height
- width
- x
- x_br
- x_tl
- y
- y_br
- y_tl

class InterestArea (chars, location, padding, right_to_left, id=None)

Representation of an interest area – a portion of a TextBlock object that is of potential interest. It is not usually necessary to create InterestArea objects manually; they are created automatically when you extract areas of interest from a TextBlock.

Ancestors

Box

Instance variables

var location : tuple: Location of the interest area in its parent TextBlock (row, start, end)

var id : str: Interest area ID. By default, these IDs have the form 1:5:10, which represents the line number and column indices of the InterestArea in its parent TextBlock. However, an ID can also be changed to any arbitrary string.
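
The default ID format can be illustrated with a small sketch. The helper names below are hypothetical and not part of Eyekit, and the sketch assumes that the three colon-separated fields correspond to the (row, start, end) values of location.

```python
def default_id(location: tuple) -> str:
    """Format a (row, start, end) tuple as a colon-separated ID string."""
    row, start, end = location
    return f"{row}:{start}:{end}"


def parse_default_id(ia_id: str) -> tuple:
    """Recover (row, start, end) from an ID in the default format."""
    row, start, end = (int(part) for part in ia_id.split(":"))
    return row, start, end


print(default_id((1, 5, 10)))      # 1:5:10
print(parse_default_id("1:5:10"))  # (1, 5, 10)
```

Parsing in this way only works for IDs that are still in the default form; an ID that has been changed to an arbitrary string would raise a ValueError here.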

var right_to_left : bool: True if interest area represents right-to-left text

var text : str: String representation of the interest area

var display_text : str: Same as text except right-to-left text is output in display form

var baseline : float: The y position of the text baseline

var midline : float: The y position of the text midline

var onset : float: The x position of the onset of the interest area. The onset is the left edge of the interest area text without any bounding box padding (or the right edge in the case of right-to-left text).

var padding : tuple: Bounding box padding on the top, bottom, left, and right edges
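
To illustrate how onset relates to padding, here is a minimal sketch. The function onset_x is hypothetical, and it assumes that x_tl and x_br are the padded edges of the bounding box, so that removing the padding on the relevant side gives the edge of the text itself.

```python
def onset_x(x_tl: float, x_br: float, padding: tuple, right_to_left: bool = False) -> float:
    """Return the x position where the text itself begins, ignoring padding.

    `padding` is (top, bottom, left, right), the order given for the
    padding property. Assumes x_tl and x_br are the padded box edges.
    """
    _top, _bottom, left, right = padding
    if right_to_left:
        # Right-to-left text begins at its right edge
        return x_br - right
    return x_tl + left


print(onset_x(96.0, 154.0, (10, 10, 4, 4)))                      # 100.0
print(onset_x(96.0, 154.0, (10, 10, 4, 4), right_to_left=True))  # 150.0
```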

Methods

def set_padding(self, *, top: float = None, bottom: float = None, left: float = None, right: float = None): Set the amount of bounding box padding on the top, bottom, left and/or right edges.

def adjust_padding(self, *, top: float = None, bottom: float = None, left: float = None, right: float = None): Adjust the current amount of bounding box padding on the top, bottom, left, and/or right edges. Positive values increase the padding, and negative values decrease the padding.
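
The difference between setting and adjusting padding can be sketched as follows. PaddingSketch is a hypothetical class, not Eyekit's implementation, and it assumes that any edge left as None is unchanged by either method.

```python
class PaddingSketch:
    """Hypothetical holder for padding on the four edges of a bounding box."""

    EDGES = ("top", "bottom", "left", "right")

    def __init__(self):
        self.values = dict.fromkeys(self.EDGES, 0.0)

    def set_padding(self, *, top=None, bottom=None, left=None, right=None):
        # Overwrite only the edges that were passed explicitly
        for edge, value in zip(self.EDGES, (top, bottom, left, right)):
            if value is not None:
                self.values[edge] = float(value)

    def adjust_padding(self, *, top=None, bottom=None, left=None, right=None):
        # Add to the current padding; negative values shrink it
        for edge, value in zip(self.EDGES, (top, bottom, left, right)):
            if value is not None:
                self.values[edge] += float(value)


pad = PaddingSketch()
pad.set_padding(top=10, bottom=10)
pad.adjust_padding(top=-3, left=2)
print(pad.values)  # {'top': 7.0, 'bottom': 10.0, 'left': 2.0, 'right': 0.0}
```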

def is_left_of(self, fixation) ‑> bool: Returns True if the interest area is to the left of the fixation.

def is_right_of(self, fixation) ‑> bool: Returns True if the interest area is to the right of the fixation.

def is_before(self, fixation) ‑> bool: Returns True if the interest area is before the fixation. An interest area comes before a fixation if it is to the left of that fixation (or to the right in the case of right-to-left text).

def is_after(self, fixation) ‑> bool: Returns True if the interest area is after the fixation. An interest area comes after a fixation if it is to the right of that fixation (or to the left in the case of right-to-left text).
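
The way reading direction flips the meaning of "before" and "after" can be sketched as below. AreaSketch is hypothetical and takes a bare x-coordinate instead of a fixation object. It also assumes that "left of" means the area's right edge falls short of the fixation, and that "right of" means its left edge lies beyond the fixation.

```python
from dataclasses import dataclass


@dataclass
class AreaSketch:
    """Hypothetical interest area reduced to its horizontal extent."""

    x_tl: float
    x_br: float
    right_to_left: bool = False

    def is_left_of(self, fixation_x: float) -> bool:
        return self.x_br < fixation_x

    def is_right_of(self, fixation_x: float) -> bool:
        return self.x_tl > fixation_x

    def is_before(self, fixation_x: float) -> bool:
        # Reading direction decides which side counts as "before"
        if self.right_to_left:
            return self.is_right_of(fixation_x)
        return self.is_left_of(fixation_x)

    def is_after(self, fixation_x: float) -> bool:
        if self.right_to_left:
            return self.is_left_of(fixation_x)
        return self.is_right_of(fixation_x)


ltr = AreaSketch(x_tl=100, x_br=150)
rtl = AreaSketch(x_tl=100, x_br=150, right_to_left=True)
print(ltr.is_before(200), ltr.is_after(200))  # True False
print(rtl.is_before(200), rtl.is_after(200))  # False True
```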

def serialize(self) ‑> dict: Returns the InterestArea's initialization arguments as a dictionary for serialization.

Inherited members

Box:
- box
- center
- height
- width
- x
- x_br
- x_tl
- y
- y_br
- y_tl

class TextBlock (text: list, *, position: tuple = None, font_face: str = None, font_size: float = None, line_height: float = None, align: str = None, anchor: str = None, right_to_left: bool = None, alphabet: str = None, autopad: bool = None)

Representation of a piece of text, which may be a word, sentence, or entire multiline passage.

Initialized with:

- text: The line of text (string) or lines of text (list of strings). Optionally, these can be marked up with arbitrary interest areas; for example, "The quick brown fox jump[ed]{past-suffix} over the lazy dog."
- position: XY-coordinates describing the position of the TextBlock on the screen. The x-coordinate should be either the left edge, right edge, or center point of the TextBlock, depending on how the anchor argument has been set (see below). The y-coordinate should always correspond to the baseline of the first (or only) line of text.
- font_face: Name of a font available on your system. The keywords italic and/or bold can also be included to select the desired style, e.g., "Minion Pro bold italic".
- font_size: Font size in pixels. At 72dpi, this is equivalent to the font size in points. To convert a font size from some other dpi, use font_size_at_72dpi().
- line_height: Distance between lines of text in pixels. In general, for single line spacing, the line height is equal to the font size; for double line spacing, the line height is equal to 2 × the font size, etc. By default, the line height is assumed to be the same as the font size (single line spacing). If autopad is set to True (see below), the line height also effectively determines the height of the bounding boxes around interest areas.
- align: Alignment of the text within the TextBlock. Must be set to left, center, or right, and defaults to left (unless right_to_left is set to True, in which case align defaults to right).
- anchor: Anchor point of the TextBlock. This determines the interpretation of the position argument (see above). Must be set to left, center, or right, and defaults to the same as the align argument. For example, if position is set to the center of the screen, anchor determines whether the left edge, right edge, or center point of the TextBlock is placed at that point, while align determines how the lines of text are aligned within the TextBlock.
- right_to_left: Set to True if the text is in a right-to-left script (Arabic, Hebrew, Urdu, etc.). If you are working with the Arabic script, you should reshape the text prior to passing it into Eyekit by using, for example, the Arabic-reshaper package.
- alphabet: A string of characters that are to be considered alphabetical, which determines what counts as a word. By default, this includes any character defined as a letter or number in Unicode, plus the underscore character. However, if you need to modify Eyekit's default behavior, you can set a specific alphabet here. For example, if you wanted to treat apostrophes and hyphens as alphabetical, you might use alphabet="A-Za-z'-". This would allow a sentence like "Where's the orang-utan?" to be treated as three words rather than five.
- autopad: If True (the default), padding is automatically added to InterestArea bounding boxes to avoid horizontal gaps between words and vertical gaps between lines. Horizontal padding (half of the width of a space character) is added to the left and right edges, unless the character to the left or right of the interest area is alphabetical (e.g. if the interest area is word-internal). Vertical padding is added to the top and bottom edges, such that bounding box heights will be equal to the line_height (see above).
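
The effect of the alphabet argument on word counts can be checked with plain regular expressions. This sketch does not call Eyekit; it assumes that the alphabet string behaves like the contents of a regex character class, and it uses \w as an approximation of the default alphabet (letters, numbers, and the underscore).

```python
import re

sentence = "Where's the orang-utan?"

# Default-like behavior: letters, digits, and underscore count as alphabetical
default_words = re.findall(r"\w+", sentence)

# Custom alphabet from the example above, used as a regex character class
custom_words = re.findall(r"[A-Za-z'-]+", sentence)

print(default_words)  # ['Where', 's', 'the', 'orang', 'utan']
print(custom_words)   # ["Where's", 'the', 'orang-utan']
```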

Ancestors

Box

Static methods

def defaults(*, position: tuple = None, font_face: str = None, font_size: float = None, line_height: float = None, align: str = None, anchor: str = None, right_to_left: bool = None, alphabet: str = None, autopad: bool = None)

Set default TextBlock parameters. If you plan to create several TextBlocks with the same parameters, it may be useful to set the default parameters at the top of your script or at the start of your session:

import eyekit
eyekit.TextBlock.defaults(font_face='Helvetica')
txt = eyekit.TextBlock('The quick brown fox')
print(txt.font_face) # 'Helvetica'

Instance variables

var text : list: Original input text

var position : tuple: Position of the TextBlock

var font_face : str: Name of the font

var font_size : float: Font size in pixels (equivalent to points at 72dpi)

var line_height : float: Line height in pixels (equivalent to points at 72dpi)

var align : str: Alignment of the text (either left, center, or right)

var anchor : str: Anchor point of the text (either left, center, or right)

var right_to_left : bool: Right-to-left text

var alphabet : str: Characters that are considered alphabetical

var autopad : bool: Whether or not automatic padding is switched on

var n_rows : int: Number of rows in the text (i.e. the number of lines)

var n_cols : int: Number of columns in the text (i.e. the number of characters in the widest line)

var n_lines : int: Number of lines in the text (alias of n_rows)

var baselines : list: Y-coordinate of the baseline of each line of text
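
Since the y-coordinate of position is the baseline of the first line and line_height is the distance between lines, the baselines can be worked out with simple arithmetic. The function below is a hypothetical sketch of that calculation, not a call into Eyekit.

```python
def baselines(position: tuple, line_height: float, n_lines: int) -> list:
    """Baseline y-coordinate of each line, from the first line downward."""
    _x, first_baseline = position
    return [first_baseline + i * line_height for i in range(n_lines)]


print(baselines((100, 500), line_height=36, n_lines=3))  # [500, 536, 572]
```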

var midlines : list: Y-coordinate of the midline of each line of text

Methods

def interest_areas(self): Iterate over each interest area that was manually marked up in the raw text. To mark up an interest area, use brackets to mark the area itself followed immediately by braces to provide an ID (e.g., TextBlock("The quick [brown]{word_id} fox.")).
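
The bracket-and-brace markup can be illustrated with a short parser. This is a sketch under stated assumptions and not Eyekit's parser: the regex and the parse_markup function are hypothetical, and the column indices are reported as a start (inclusive) and end (exclusive) position in the text with the markup removed.

```python
import re

# [area]{id}: brackets mark the area, braces immediately after give the ID
MARKUP = re.compile(r"\[(.+?)\]\{(.+?)\}")


def parse_markup(line: str):
    """Return the clean text plus an (id, start, end) triple for each marked area."""
    clean = ""
    areas = []
    cursor = 0
    for match in MARKUP.finditer(line):
        clean += line[cursor:match.start()]  # text before the markup
        start = len(clean)
        clean += match.group(1)              # the marked-up text itself
        areas.append((match.group(2), start, len(clean)))
        cursor = match.end()
    clean += line[cursor:]                   # text after the last markup
    return clean, areas


text, areas = parse_markup("The quick [brown]{word_id} fox.")
print(text)   # The quick brown fox.
print(areas)  # [('word_id', 10, 15)]
```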

def lines(self): Iterate over each line as an InterestArea.

def words(self, pattern: str = None, *, line_n: int = None, alphabetical_only: bool = True): Iterate over each word as an InterestArea. Optionally, you can supply a regex pattern to pick out specific words. For example, '(?i)the' gives you case-insensitive occurrences of the word "the", and '[a-z]+ing' gives you lower-case words ending in -ing. line_n limits the iteration to a specific line number. If alphabetical_only is set to True, a word is defined as a string of consecutive alphabetical characters (as defined by the TextBlock's alphabet property); if False, a word is defined as a string of consecutive non-whitespace characters.
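
The filtering behavior described above can be approximated on a plain string. The iter_words function is a hypothetical sketch, not Eyekit code. It assumes that the pattern must match a whole word, which is consistent with the two examples given, and it uses \w as a stand-in for the default alphabet.

```python
import re


def iter_words(line: str, pattern: str = None, alphabetical_only: bool = True):
    """Yield (start, end, word) for each word in a single line of text."""
    # \w+ approximates the default alphabet; \S+ is any run of non-whitespace
    word_regex = re.compile(r"\w+" if alphabetical_only else r"\S+")
    filter_regex = re.compile(pattern) if pattern else None
    for match in word_regex.finditer(line):
        word = match.group()
        # Assumption: the pattern must match the whole word
        if filter_regex is None or filter_regex.fullmatch(word):
            yield match.start(), match.end(), word


line = "The dog was chasing the other dog, barking."
print([w for _, _, w in iter_words(line, "(?i)the")])
# ['The', 'the']
print([w for _, _, w in iter_words(line, "[a-z]+ing")])
# ['chasing', 'barking']
print([w for _, _, w in iter_words(line, alphabetical_only=False)][-2:])
# ['dog,', 'barking.']
```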

def characters(self, *, line_n: int = None, alphabetical_only: bool = True): Iterate over each character as an InterestArea. line_n limits the iteration to a specific line number. If alphabetical_only is set to True, the iterator will only yield alphabetical characters (as defined by the TextBlock's alphabet property).

def ngrams(self, ngram_width: int, *, line_n: int = None, alphabetical_only: bool = True): Iterate over each n-gram of width ngram_width as an InterestArea. line_n limits the iteration to a specific line number. If alphabetical_only is set to True, an n-gram is defined as a string of ngram_width consecutive alphabetical characters (as defined by the TextBlock's alphabet property).
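
A sliding window over a single line shows what this iteration amounts to. The iter_ngrams function is a hypothetical sketch operating on a plain string. It approximates the default alphabet with str.isalnum() plus the underscore, and it assumes that windows containing any non-alphabetical character are skipped when alphabetical_only is True.

```python
def iter_ngrams(line: str, ngram_width: int, alphabetical_only: bool = True):
    """Yield (start, ngram) for every window of ngram_width characters."""
    for start in range(len(line) - ngram_width + 1):
        ngram = line[start:start + ngram_width]
        # Skip windows that contain a space, punctuation mark, etc.
        if alphabetical_only and not all(ch.isalnum() or ch == "_" for ch in ngram):
            continue
        yield start, ngram


print([g for _, g in iter_ngrams("the fox", 2)])
# ['th', 'he', 'fo', 'ox']
print([g for _, g in iter_ngrams("the fox", 2, alphabetical_only=False)])
# ['th', 'he', 'e ', ' f', 'fo', 'ox']
```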

def zones(self): Deprecated in 0.4.1 and removed in 0.7. Use TextBlock.interest_areas() instead.

def which_line(self, fixation) ‑> InterestArea: Deprecated in 0.6 and removed in 0.7.

def which_word(self, fixation, pattern: str = None, *, line_n: int = None, alphabetical_only: bool = True) ‑> InterestArea: Deprecated in 0.6 and removed in 0.7.

def which_character(self, fixation, *, line_n: int = None, alphabetical_only: bool = True) ‑> InterestArea: Deprecated in 0.6 and removed in 0.7.

def serialize(self) ‑> dict: Returns the TextBlock's initialization arguments as a dictionary for serialization.

Inherited members

Box:
- box
- center
- height
- width
- x
- x_br
- x_tl
- y
- y_br
- y_tl