The full Pygments API

This page describes the Pygments API.

High-level API

Functions from the pygments module:

pygments.lex(code, lexer)

Lex code with the lexer (must be a Lexer instance) and return an iterable of tokens. Currently, this only calls lexer.get_tokens().

pygments.format(tokens, formatter, outfile=None)

Format tokens (an iterable of tokens) with the formatter formatter (a Formatter instance).

If outfile is given and is a valid file object (an object with a write method), the result is written to it; otherwise it is returned as a string.

pygments.highlight(code, lexer, formatter, outfile=None)

This is the most high-level highlighting function. It combines lex and format in one function.
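As a minimal sketch of how the three functions relate (assuming Pygments is installed; the sample string and the choice of PythonLexer and HtmlFormatter are illustrative only):

```python
import io

import pygments
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

code = 'print("hello")\n'
lexer = PythonLexer()
formatter = HtmlFormatter()

# Two steps: lex() yields tokens, format() turns them into markup.
tokens = pygments.lex(code, lexer)
two_step = pygments.format(tokens, formatter)

# One step: highlight() combines both calls.
one_step = pygments.highlight(code, lexer, formatter)

# With outfile, the result is written to the file object instead of returned.
buffer = io.StringIO()
returned = pygments.highlight(code, lexer, formatter, outfile=buffer)

print(one_step == two_step)
print(returned is None and buffer.getvalue() == one_step)
```

The functions are accessed through the pygments module here so that the built-in format() is not shadowed.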

Functions from pygments.lexers:

pygments.lexers.get_lexer_by_name(_alias, **options)

Return an instance of a Lexer subclass that has alias in its aliases list. The lexer is given the options at its instantiation.

Will raise pygments.util.ClassNotFound if no lexer with that alias is found.

pygments.lexers.get_lexer_for_filename(_fn, code=None, **options)

Get a lexer for a filename.

Return a Lexer subclass instance that has a filename pattern matching fn. The lexer is given the options at its instantiation.

Raise pygments.util.ClassNotFound if no lexer for that filename is found.

If multiple lexers match the filename pattern, use their analyse_text() methods to figure out which one is more appropriate.

pygments.lexers.get_lexer_for_mimetype(_mime, **options)

Return a Lexer subclass instance that has mime in its mimetype list. The lexer is given the options at its instantiation.

Will raise pygments.util.ClassNotFound if no lexer for that mimetype is found.
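The three lookup functions above can be compared side by side. This is a small sketch; the alias, filename and MIME type are example inputs, and the exact lexer returned may differ between Pygments versions:

```python
from pygments.lexers import (
    get_lexer_by_name,
    get_lexer_for_filename,
    get_lexer_for_mimetype,
)

# Options such as stripall are passed on to the lexer's constructor.
by_alias = get_lexer_by_name("python", stripall=True)
by_filename = get_lexer_for_filename("setup.py")
by_mimetype = get_lexer_for_mimetype("text/x-python")

for lexer in (by_alias, by_filename, by_mimetype):
    print(type(lexer).__name__, lexer.aliases)
```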

pygments.lexers.load_lexer_from_file(filename, lexername='CustomLexer', **options)

Load a lexer from a file.

This method expects a file located relative to the current working directory, which contains a Lexer class. By default, it expects the Lexer to be named CustomLexer; you can specify your own class name as the second argument to this function.

Users should be very careful with the input, because this method is equivalent to running eval on the input file.

Raises ClassNotFound if there are any problems importing the Lexer.

New in version 2.2.
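A hedged sketch of loading a lexer from a file follows. The lexer source, the file name demo_lexer.py and the temporary directory are all hypothetical, and an absolute path is used on the assumption that the file is simply opened by path. Only load files you trust.

```python
import os
import tempfile

from pygments.lexers import load_lexer_from_file

# Source of a hypothetical custom lexer, written to disk for the demo.
LEXER_SOURCE = r'''
from pygments.lexer import RegexLexer
from pygments.token import Text

class CustomLexer(RegexLexer):
    name = "Demo"
    tokens = {"root": [(r"[^\n]+", Text), (r"\n", Text)]}
'''

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "demo_lexer.py")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(LEXER_SOURCE)
    # The second argument names the class to load from the file.
    lexer = load_lexer_from_file(path, "CustomLexer", stripall=True)

tokens = list(lexer.get_tokens("hello"))
print(lexer.name, tokens)
```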

pygments.lexers.guess_lexer(_text, **options)

Return a Lexer subclass instance that’s guessed from the text in text. For that, the analyse_text() method of every known lexer class is called with the text as argument, and the lexer which returned the highest value will be instantiated and returned.

pygments.util.ClassNotFound is raised if no lexer thinks it can handle the content.

pygments.lexers.guess_lexer_for_filename(_fn, _text, **options)

As guess_lexer(), but only lexers that have a pattern in filenames or alias_filenames matching the filename are taken into consideration.

pygments.util.ClassNotFound is raised if no lexer thinks it can handle the content.
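Both guessing functions can be tried on the same input. In this sketch the sample source and the file name script.py are made up, and no particular result is asserted because guessing depends on the installed lexers:

```python
from pygments.lexers import guess_lexer, guess_lexer_for_filename
from pygments.util import ClassNotFound

source = "#!/usr/bin/env python\nimport os\nprint(os.getcwd())\n"

try:
    # Every known lexer is asked to rate the text.
    by_content = guess_lexer(source)
    # Only lexers whose filename patterns match are asked.
    by_name_and_content = guess_lexer_for_filename("script.py", source)
except ClassNotFound:
    by_content = by_name_and_content = None
else:
    print(by_content.name, by_name_and_content.name)
```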

pygments.lexers.get_all_lexers(plugins=True)

Return a generator of tuples in the form (name, aliases, filenames, mimetypes) of all known lexers.

If plugins is true (the default), plugin lexers supplied by entrypoints are also returned. Otherwise, only builtin ones are considered.
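For illustration, the generator can be used to build a lookup table from alias to lexer name. The variable alias_to_name is introduced only for this sketch:

```python
from pygments.lexers import get_all_lexers

alias_to_name = {}
# plugins=False restricts the listing to builtin lexers.
for name, aliases, filenames, mimetypes in get_all_lexers(plugins=False):
    for alias in aliases:
        alias_to_name[alias] = name

print(len(alias_to_name), "aliases known")
print(alias_to_name.get("python"))
```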

pygments.lexers.find_lexer_class_by_name(_alias)

Return the Lexer subclass that has alias in its aliases list, without instantiating it.

Like get_lexer_by_name, but does not instantiate the class.

Will raise pygments.util.ClassNotFound if no lexer with that alias is found.

New in version 2.2.

pygments.lexers.find_lexer_class(name)

Return the Lexer subclass whose name attribute matches the name argument.
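The two class lookups differ only in the key they use. A short sketch, using the Python lexer as an example (its alias is "python" and its name is "Python" in current Pygments releases):

```python
from pygments.lexers import (
    find_lexer_class,
    find_lexer_class_by_name,
    get_lexer_by_name,
)

by_alias = find_lexer_class_by_name("python")  # looked up via aliases
by_name = find_lexer_class("Python")           # looked up via the name attribute

# Both return the class itself; instantiate it when needed.
lexer = by_alias(stripnl=False)
print(by_alias is by_name, isinstance(get_lexer_by_name("python"), by_alias))
```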

Functions from pygments.formatters:

pygments.formatters.get_formatter_by_name(_alias, **options)

Return an instance of a Formatter subclass that has alias in its aliases list. The formatter is given the options at its instantiation.

Will raise pygments.util.ClassNotFound if no formatter with that alias is found.

pygments.formatters.get_formatter_for_filename(fn, **options)

Return a Formatter subclass instance that has a filename pattern matching fn. The formatter is given the options at its instantiation.

Will raise pygments.util.ClassNotFound if no formatter for that filename is found.
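Formatter lookup mirrors lexer lookup. In this sketch the alias "html" and the file name output.html are example inputs:

```python
from pygments.formatters import get_formatter_by_name, get_formatter_for_filename

# Options are passed on to the formatter's constructor.
by_alias = get_formatter_by_name("html", full=True, title="Demo")
by_filename = get_formatter_for_filename("output.html")

print(type(by_alias).__name__, type(by_filename).__name__)
print(by_alias.full, by_alias.title)
```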

pygments.formatters.load_formatter_from_file(filename, formattername='CustomFormatter', **options)

Return a Formatter subclass instance loaded from the provided file, relative to the current directory.

The file is expected to contain a Formatter class named formattername (by default, CustomFormatter). Users should be very careful with the input, because this method is equivalent to running eval() on the input file. The formatter is given the options at its instantiation.

pygments.util.ClassNotFound is raised if there are any errors loading the formatter.

New in version 2.2.

Functions from pygments.styles:

pygments.styles.get_style_by_name(name)

Return a style class by its short name. The names of the builtin styles are listed in pygments.styles.STYLE_MAP.

Will raise pygments.util.ClassNotFound if no style of that name is found.

pygments.styles.get_all_styles()

Return a generator for all styles by name, both builtin and plugin.
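A brief sketch of both style functions; "monokai" is one of the builtin styles, and "no_such_style_demo" is a deliberately invalid name chosen for this example:

```python
from pygments.style import Style
from pygments.styles import get_all_styles, get_style_by_name
from pygments.util import ClassNotFound

style_names = sorted(get_all_styles())
monokai = get_style_by_name("monokai")  # a class, not an instance
print(len(style_names), issubclass(monokai, Style))

try:
    get_style_by_name("no_such_style_demo")
except ClassNotFound as exc:
    print("lookup failed:", exc)
```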

pygments.styles.STYLE_MAP = {'abap': 'abap::AbapStyle', 'algol': 'algol::AlgolStyle', 'algol_nu': 'algol_nu::Algol_NuStyle', 'arduino': 'arduino::ArduinoStyle', 'autumn': 'autumn::AutumnStyle', 'borland': 'borland::BorlandStyle', 'bw': 'bw::BlackWhiteStyle', 'coffee': 'coffee::CoffeeStyle', 'colorful': 'colorful::ColorfulStyle', 'default': 'default::DefaultStyle', 'dracula': 'dracula::DraculaStyle', 'emacs': 'emacs::EmacsStyle', 'friendly': 'friendly::FriendlyStyle', 'friendly_grayscale': 'friendly_grayscale::FriendlyGrayscaleStyle', 'fruity': 'fruity::FruityStyle', 'github-dark': 'gh_dark::GhDarkStyle', 'gruvbox-dark': 'gruvbox::GruvboxDarkStyle', 'gruvbox-light': 'gruvbox::GruvboxLightStyle', 'igor': 'igor::IgorStyle', 'inkpot': 'inkpot::InkPotStyle', 'lightbulb': 'lightbulb::LightbulbStyle', 'lilypond': 'lilypond::LilyPondStyle', 'lovelace': 'lovelace::LovelaceStyle', 'manni': 'manni::ManniStyle', 'material': 'material::MaterialStyle', 'monokai': 'monokai::MonokaiStyle', 'murphy': 'murphy::MurphyStyle', 'native': 'native::NativeStyle', 'nord': 'nord::NordStyle', 'nord-darker': 'nord::NordDarkerStyle', 'one-dark': 'onedark::OneDarkStyle', 'paraiso-dark': 'paraiso_dark::ParaisoDarkStyle', 'paraiso-light': 'paraiso_light::ParaisoLightStyle', 'pastie': 'pastie::PastieStyle', 'perldoc': 'perldoc::PerldocStyle', 'rainbow_dash': 'rainbow_dash::RainbowDashStyle', 'rrt': 'rrt::RrtStyle', 'sas': 'sas::SasStyle', 'solarized-dark': 'solarized::SolarizedDarkStyle', 'solarized-light': 'solarized::SolarizedLightStyle', 'staroffice': 'staroffice::StarofficeStyle', 'stata-dark': 'stata_dark::StataDarkStyle', 'stata-light': 'stata_light::StataLightStyle', 'tango': 'tango::TangoStyle', 'trac': 'trac::TracStyle', 'vim': 'vim::VimStyle', 'vs': 'vs::VisualStudioStyle', 'xcode': 'xcode::XcodeStyle', 'zenburn': 'zenburn::ZenburnStyle'}

A dictionary of built-in styles, mapping style names to 'submodule::classname' strings. This dictionary is deprecated; use pygments.styles.STYLES instead.

Lexers

The base lexer class from which all lexers are derived is:

class pygments.lexer.Lexer(**options)

Lexer for a specific language.

See also Write your own lexer, a high-level guide to writing lexers.

Lexer classes have attributes used for choosing the most appropriate lexer based on various criteria.

name

Full name of the lexer, in human-readable form.

aliases

A list of short, unique identifiers that can be used to look up the lexer from a list, e.g., using get_lexer_by_name().

filenames

A list of fnmatch patterns that match filenames which contain content for this lexer. The patterns in this list should be unique among all lexers.

alias_filenames = []

A list of fnmatch patterns that match filenames which may or may not contain content for this lexer. This list is used by the guess_lexer_for_filename() function, to determine which lexers are then included in guessing the correct one. That means that e.g. every lexer for HTML and a template language should include *.html in this list.

mimetypes

A list of MIME types for content that can be lexed with this lexer.

priority = 0

Priority of the lexer, used when multiple lexers match and no content is provided.

Lexers included in Pygments should have two additional attributes:

url

URL of the language specification/definition. Used in the Pygments documentation. Set to an empty string to disable.

version_added

Version of Pygments in which the lexer was added.

Lexers included in Pygments may have additional attributes:

_example

Example file name. Relative to the tests/examplefiles directory. This is used by the documentation generator to show an example.

You can pass options to the constructor. The basic options recognized by all lexers and processed by the base Lexer class are:

stripnl

Strip leading and trailing newlines from the input (default: True).

stripall

Strip all leading and trailing whitespace from the input (default: False).

ensurenl

Make sure that the input ends with a newline (default: True). This is required for some lexers that consume input linewise.

New in version 1.3.

tabsize

If given and greater than 0, expand tabs in the input (default: 0).

encoding

If given, must be an encoding name. This encoding will be used to convert the input string to Unicode, if it is not already a Unicode string (default: 'guess', which uses a simple UTF-8 / Locale / Latin1 detection). Can also be 'chardet' to use the chardet library, if it is installed.

inencoding

Overrides the encoding if given.
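The effect of the basic options can be observed by joining the token values back together. The helper lexed_text below is a hypothetical name used only for this sketch:

```python
from pygments.lexers import PythonLexer


def lexed_text(source, **options):
    """Return the text the lexer actually tokenized, after option processing."""
    lexer = PythonLexer(**options)
    return "".join(value for _, value in lexer.get_tokens(source))


source = "\n\n\tx = 1"

defaults = lexed_text(source)                    # stripnl and ensurenl apply
stripped = lexed_text(source, stripall=True)     # leading tab removed as well
expanded = lexed_text(source, tabsize=4)         # tab expanded to spaces
no_newline = lexed_text(source, ensurenl=False)  # no trailing newline added

print(repr(defaults), repr(stripped), repr(expanded), repr(no_newline))
```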

__init__(**options)

This constructor takes arbitrary options as keyword arguments. Every subclass must first process its own options and then call the Lexer constructor, since it processes the basic options like stripnl.

An example looks like this:

def __init__(self, **options):
    self.compress = options.get('compress', '')
    Lexer.__init__(self, **options)

As these options must all be specifiable as strings (due to the command line usage), there are various utility functions available to help with that, see Utilities.

add_filter(filter_, **options)

Add a new stream filter to this lexer.

static analyse_text(text)

A static method which is called for lexer guessing.

It should analyse the text and return a float in the range from 0.0 to 1.0. If it returns 0.0, the lexer will not be selected as the most probable one; if it returns 1.0, it will be selected immediately. This is used by guess_lexer.

The LexerMeta metaclass automatically wraps this function so that it works like a static method (no self or cls parameter) and the return value is automatically converted to float. If the return value is an object that is boolean False, it’s the same as if the return value were 0.0.
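A minimal sketch of an analyse_text() implementation. MarkerLexer and the "%%marker" convention are hypothetical and exist only to show the return-value conversion:

```python
from pygments.lexer import Lexer


class MarkerLexer(Lexer):
    """Hypothetical lexer for files that start with a '%%marker' line."""

    name = "Marker (demo)"
    aliases = ["marker-demo"]

    # No self or cls parameter: LexerMeta wraps this as a static method.
    def analyse_text(text):
        if text.startswith("%%marker"):
            return True   # converted to 1.0
        if "%%" in text:
            return 0.3
        return None       # falsy, treated as 0.0


print(MarkerLexer.analyse_text("%%marker\nbody"))
print(MarkerLexer.analyse_text("plain text"))
```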

get_tokens(text, unfiltered=False)

This method is the basic interface of a lexer. It is called by the highlight() function. It must process the text and return an iterable of (tokentype, value) pairs from text.

Normally, you don’t need to override this method. The default implementation processes the options recognized by all lexers (stripnl, stripall and so on), and then yields all tokens from get_tokens_unprocessed(), with the index dropped.

If unfiltered is set to True, the filtering mechanism is bypassed even if filters are defined.
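The interaction between add_filter() and the unfiltered argument can be sketched as follows. The "keywordcase" filter ships with Pygments; the sample source is illustrative:

```python
from pygments.lexers import PythonLexer

lexer = PythonLexer()
# A filter can be given by name; options go to the filter's constructor.
lexer.add_filter("keywordcase", case="upper")

source = "def f(): pass\n"
filtered = [value for _, value in lexer.get_tokens(source)]
unfiltered = [value for _, value in lexer.get_tokens(source, unfiltered=True)]

print(filtered)
print(unfiltered)
```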

get_tokens_unprocessed(text)

This method should process the text and return an iterable of (index, tokentype, value) tuples where index is the starting position of the token within the input text.

It must be overridden by subclasses. It is recommended to implement it as a generator to maximize effectiveness.
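To make the difference between the two methods concrete, here is a sketch of a lexer that overrides only get_tokens_unprocessed(). WordLexer is a hypothetical class written for this example:

```python
import re

from pygments.lexer import Lexer
from pygments.token import Text, Whitespace


class WordLexer(Lexer):
    """Hypothetical lexer that splits input into words and whitespace."""

    name = "Words (demo)"
    aliases = ["words-demo"]

    def get_tokens_unprocessed(self, text):
        # A generator of (index, tokentype, value) tuples.
        for match in re.finditer(r"\S+|\s+", text):
            value = match.group()
            ttype = Whitespace if value.isspace() else Text
            yield match.start(), ttype, value


lexer = WordLexer()
raw = list(lexer.get_tokens_unprocessed("a bc"))
processed = list(lexer.get_tokens("a bc"))
print(raw)
print(processed)
```

Calling get_tokens() applies the basic options first (here ensurenl appends a newline) and drops the index from each tuple.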

There are several base classes derived from Lexer you can use to build your lexer from:

class pygments.lexer.RegexLexer(*args, **kwds)

Base for simple stateful regular expression-based lexers. Simplifies the lexing process so that you need only provide a list of states and regular expressions.
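A small sketch of a RegexLexer subclass. States and rules are declared in a class attribute named tokens; IniLikeLexer and its rules are hypothetical and cover only section headers and comments:

```python
from pygments.lexer import RegexLexer
from pygments.token import Comment, Keyword, Name, Text, Whitespace


class IniLikeLexer(RegexLexer):
    """Hypothetical lexer for a tiny INI-like format."""

    name = "IniLike (demo)"
    aliases = ["inilike-demo"]

    tokens = {
        "root": [
            (r"\s+", Whitespace),
            (r";.*?$", Comment.Single),
            (r"\[", Keyword, "section"),  # enter the "section" state
            (r"[^\s\[;]+", Text),
        ],
        "section": [
            (r"[^\]\n]+", Name.Namespace),
            (r"\]", Keyword, "#pop"),     # return to the previous state
        ],
    }


tokens = list(IniLikeLexer().get_tokens("[core]\n; note\n"))
for ttype, value in tokens:
    print(ttype, repr(value))
```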

class pygments.lexer.ExtendedRegexLexer(*args, **kwds)

A RegexLexer that uses a context object to store its state.

class pygments.lexer.DelegatingLexer(_root_lexer, _language_lexer, _needle=('Other',), **options)

This lexer takes two lexers as arguments: a root lexer and a language lexer. First, everything is scanned using the language lexer; afterwards, all Other tokens are lexed using the root lexer.

The lexers from the template lexer package use this base lexer.
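As a sketch of the delegation order, the following combines the builtin HtmlLexer (root) and PhpLexer (language). HtmlWithPhpLexer is a hypothetical name for this example; Pygments already ships a comparable lexer:

```python
from pygments.lexer import DelegatingLexer
from pygments.lexers import HtmlLexer, PhpLexer
from pygments.token import Comment, Name, Other


class HtmlWithPhpLexer(DelegatingLexer):
    """Hypothetical lexer: PHP blocks embedded in HTML."""

    name = "HTML+PHP (demo)"
    aliases = ["html-php-demo"]

    def __init__(self, **options):
        # Root lexer first, language lexer second.
        super().__init__(HtmlLexer, PhpLexer, **options)


source = "<b><?php echo 1; ?></b>\n"
tokens = list(HtmlWithPhpLexer().get_tokens(source))
types = {ttype for ttype, _ in tokens}

# PHP delimiters come from the language lexer, tags from the root lexer.
print(Comment.Preproc in types, Name.Tag in types, Other in types)
```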

Formatters

A formatter is derived from this class:

class pygments.formatter.Formatter(**options)

Converts a token stream to text.

Formatters should have attributes to help selecting them. These are similar to the corresponding Lexer attributes.

name

Full name for the formatter, in human-readable form.

aliases

A list of short, unique identifiers that can be used to look up the formatter from a list, e.g. using get_formatter_by_name().

filenames

A list of fnmatch patterns that match filenames for which this formatter can produce output. The patterns in this list should be unique among all formatters.

You can pass options as keyword arguments to the constructor. All formatters accept these basic options:

style

The style to use; it can be a string or a Style subclass (default: “default”). Not used by e.g. the TerminalFormatter.

full

Tells the formatter to output a “full” document, i.e. a complete self-contained document. This doesn’t have any effect for some formatters (default: false).

title

If full is true, the title that should be used to caption the document (default: ‘’).

encoding

If given, must be an encoding name. This will be used to convert the Unicode token strings to byte strings in the output. If it is “” or None, Unicode strings will be written to the output file, which most file-like objects do not support (default: None).

outencoding

Overrides encoding if given.
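The basic formatter options can be tried with HtmlFormatter. This is a sketch; the title "Demo" and the style "monokai" are arbitrary choices:

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

code = "x = 1\n"

# full=True produces a complete document that uses the given title.
page = highlight(
    code, PythonLexer(), HtmlFormatter(full=True, title="Demo", style="monokai")
)

# With an encoding, the result is a byte string instead of a Unicode string.
encoded = highlight(code, PythonLexer(), HtmlFormatter(encoding="utf-8"))

print(type(page).__name__, type(encoded).__name__)
```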

__init__(**options)

As with lexers, this constructor takes arbitrary optional arguments, and if you override it, you should first process your own options, then call the base class implementation.

format(tokensource, outfile)

This method must format the tokens from the tokensource iterable and write the formatted version to the file object outfile.

Formatter options can control how exactly the tokens are converted.
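A minimal sketch of a custom formatter that follows the constructor and format() conventions described above. TokenDumpFormatter and its skip_blank option are hypothetical, and the sketch assumes the default encoding (None) so that outfile accepts strings:

```python
from pygments import highlight
from pygments.formatter import Formatter
from pygments.lexers import PythonLexer
from pygments.util import get_bool_opt


class TokenDumpFormatter(Formatter):
    """Hypothetical formatter that writes one token per line."""

    name = "Token dump (demo)"
    aliases = ["tokendump-demo"]

    def __init__(self, **options):
        # Process this formatter's own option first, then the basic ones.
        self.skip_blank = get_bool_opt(options, "skip_blank", False)
        super().__init__(**options)

    def format(self, tokensource, outfile):
        for ttype, value in tokensource:
            if self.skip_blank and not value.strip():
                continue
            outfile.write(f"{ttype}\t{value!r}\n")


dump = highlight("x = 1", PythonLexer(), TokenDumpFormatter(skip_blank="yes"))
print(dump)
```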

get_style_defs(arg='')

This method must return statements or declarations suitable to define the current style for subsequent highlighted text (e.g. CSS classes in the HTMLFormatter).

The optional argument arg can be used to modify the generation and is formatter dependent (it is standardized because it can be given on the command line).

This method is called by the -S command-line option, the arg is then given by the -a option.
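For example, HtmlFormatter returns CSS rules from this method. In the sketch below the argument ".highlight" is an assumption about how that particular formatter interprets arg (as a selector prefix):

```python
from pygments.formatters import HtmlFormatter

formatter = HtmlFormatter(style="default")
css = formatter.get_style_defs(".highlight")

# Show the first few generated rules.
for line in css.splitlines()[:3]:
    print(line)
```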

Utilities

The pygments.util module has some utility functions usable for processing command line options. All of the following functions get values from a dictionary of options. If the value is already in the type expected by the option, it is returned as-is. Otherwise, if the value is a string, it is first converted to the expected type if possible.

exception pygments.util.OptionError

This exception will be raised by all option processing functions if the type or value of the argument is not correct.

pygments.util.get_bool_opt(options, optname, default=None)

Intuitively, this is options.get(optname, default), but restricted to Boolean values. The Booleans can be represented as strings, in order to accept Boolean values from command line arguments. If the key optname is present in the dictionary options and is not associated with a Boolean, raise an OptionError. If it is absent, default is returned instead.

The valid string values for True are 1, yes, true and on, the ones for False are 0, no, false and off (matched case-insensitively).

pygments.util.get_int_opt(options, optname, default=None)

As get_bool_opt(), but interpret the value as an integer.

pygments.util.get_list_opt(options, optname, default=None)

If the key optname from the dictionary options is a string, split it at whitespace and return it. If it is already a list or a tuple, it is returned as a list.

pygments.util.get_choice_opt(options, optname, allowed, default=None, normcase=False)

If the value of the key optname from the dictionary options is not in the sequence allowed, raise an OptionError; otherwise return it.
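The option helpers can be exercised on a dictionary of strings, as it might arrive from the command line. The option names and values in this sketch are made up:

```python
from pygments.util import (
    OptionError,
    get_bool_opt,
    get_choice_opt,
    get_int_opt,
    get_list_opt,
)

options = {"linenos": "yes", "tabsize": "4", "tags": "a b c", "mode": "fast"}

linenos = get_bool_opt(options, "linenos", False)
tabsize = get_int_opt(options, "tabsize", 0)
tags = get_list_opt(options, "tags")
mode = get_choice_opt(options, "mode", ["fast", "slow"], "slow")
missing = get_bool_opt(options, "not_set", True)  # falls back to the default

try:
    get_bool_opt({"linenos": "maybe"}, "linenos")
    rejected = False
except OptionError:
    rejected = True

print(linenos, tabsize, tags, mode, missing, rejected)
```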

It also defines an exception:

exception pygments.util.ClassNotFound

Raised if one of the lookup functions didn’t find a matching class.
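A common pattern is to catch this exception and fall back to a plain-text lexer. The helper lexer_or_plain is a hypothetical name used for this sketch:

```python
from pygments.lexers import TextLexer, get_lexer_by_name
from pygments.util import ClassNotFound


def lexer_or_plain(alias):
    """Return the lexer for alias, or a plain-text lexer if none is found."""
    try:
        return get_lexer_by_name(alias)
    except ClassNotFound:
        return TextLexer()


print(type(lexer_or_plain("python")).__name__)
print(type(lexer_or_plain("no-such-language-demo")).__name__)
```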