Cataloger API

The cataloger package provides a programmatic API that lets Python scripts build catalogs and check a directory structure against a previously built catalog.


Exceptions

exception cataloger.processor.CatalogError

Raised when there is an error within the catalog file itself (i.e. a formatting issue).

exception cataloger.processor.ConfigError

Raised when there is an error within the config file.
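A minimal sketch of handling both exceptions around a check run. The helper name check_or_report is hypothetical, and the sketch assumes that check_catalog() lets these exceptions propagate to the caller, which this page does not state explicitly.

```python
import sys


def check_or_report(**kwargs):
    """Run check_catalog() and report the two documented exceptions on stderr.

    Hypothetical helper: returns the Cataloger instance, or None on error.
    """
    # Imported lazily so this module loads even where cataloger is not installed.
    from cataloger import commands
    from cataloger.processor import CatalogError, ConfigError

    try:
        return commands.check_catalog(**kwargs)
    except ConfigError as exc:
        # Assumption: a problem in the config file surfaces here.
        print(f"config file error: {exc}", file=sys.stderr)
    except CatalogError as exc:
        # Assumption: a problem in the catalog file surfaces here.
        print(f"catalog file error: {exc}", file=sys.stderr)
    return None
```

Catching the two exceptions separately keeps the message specific about which file needs fixing.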

Command Methods

cataloger.commands.create_catalog(verbose=0, **kwargs)
cataloger.commands.check_catalog(verbose=0, **kwargs)

Either create a catalog from the directory structure (create_catalog) or compare the directory structure against an existing catalog (check_catalog). By default, both functions use the settings defined in catalog.cfg if that file exists. Any setting defined in the config file can be overridden by arguments passed to the function.

Returns a Cataloger instance.

Parameters:
  • no_config (Boolean) – True if the config file should be ignored. Defaults to False.
  • config (str) – The name of the config file to use. Defaults to ‘’. Unless a specific config file is provided, the processor attempts to read defaults.DEFAULT_CONFIG_FILE as the config file.
  • catalog (str) – The name of the catalog file to use. Defaults to catalog.cat.
  • hash (str) – The name of the hash algorithm to use. Defaults to sha224.
  • root (str) – The root directory at which to start the creation or check process. Can be either an absolute path or a path relative to the current working directory. Defaults to ‘.’.
  • extensions (set) – The file extensions to catalog. Defaults to .py, .html, .txt, .css, .js, .gif, .png, .jpg, .jpeg.
  • rm_extension (set) – A set of extensions to remove from the catalog. No default.
  • add_extension (set) – A set of extensions to add to the catalog. No default.
  • ignore_directory (set) – A set of directories under root which are to be ignored and not catalogued. Defaults to static, htmlcov, media, build, dist, docs.
  • rm_directory (set) – A set of directories to be removed from the ignore_directory set. No default.
  • add_directory (set) – A list or set of directories to be added to the ignore_directory set. No default.
  • include_filter (list) – A glob file-matching filter of files to catalogue. By default, all files with an extension in the extensions set are catalogued.
  • exclude_filter (list) – A glob file-matching filter of files to exclude from the catalog. By default, no file with an extension in the extensions set is excluded.
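As an illustration of how these parameters can be combined, the sketch below collects a few overrides in a dictionary and passes them to check_catalog(). The names CHECK_OPTIONS and check_project are hypothetical, and the specific values (the .md extension, the node_modules directory, the *_test.py pattern) are only examples. It assumes extensions are written with a leading dot, as in the defaults listed above, and that exclude_filter takes a list of glob strings.

```python
# Example overrides; every key is one of the documented parameters.
CHECK_OPTIONS = {
    "no_config": True,                  # ignore any config file
    "root": ".",                        # relative to the current working directory
    "catalog": "catalog.cat",           # the documented default, written out
    "hash": "sha224",                   # the documented default, written out
    "add_extension": {".md"},           # also catalog Markdown files (example value)
    "add_directory": {"node_modules"},  # also ignore this directory (example value)
    "exclude_filter": ["*_test.py"],    # assumed: a list of glob patterns
}


def check_project(**overrides):
    """Call check_catalog() with CHECK_OPTIONS, letting callers override any key."""
    # Imported lazily; requires the cataloger package to be installed.
    from cataloger import commands

    options = {**CHECK_OPTIONS, **overrides}
    return commands.check_catalog(**options)
```

Calling check_project(root="src") would run the same check against a different root directory.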
Raises:

Config file processing:

All arguments (and, by extension, the command-line arguments) are processed after the config file, if any, so they override its settings.

If the no_config flag is True, then all config files are ignored and only the parameters passed to the functions are used.

If no_config is False (the default), and config is ‘’ or None, then the default config file is used only if it exists; no error is raised if the file doesn’t exist.

If no_config is False, and config is provided (even if it is the default name), then the config file is used if it exists. If the file doesn’t exist, a warning is generated and execution continues as if the config file were empty.
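The three rules above can be sketched as a small decision function. This is not the package's implementation: resolve_config is a hypothetical name, the constant stands in for defaults.DEFAULT_CONFIG_FILE using the catalog.cfg name mentioned earlier, and the sketch assumes the file is looked up relative to the current working directory.

```python
import os
import warnings

# Stand-in for defaults.DEFAULT_CONFIG_FILE (assumed to be catalog.cfg).
DEFAULT_CONFIG_FILE = "catalog.cfg"


def resolve_config(no_config=False, config=""):
    """Return the config file to read, or None if no file should be read."""
    if no_config:
        # Rule 1: all config files are ignored.
        return None
    if not config:
        # Rule 2: no name given; use the default only if it exists, silently.
        if os.path.isfile(DEFAULT_CONFIG_FILE):
            return DEFAULT_CONFIG_FILE
        return None
    # Rule 3: a name was given; warn if the file is missing.
    if os.path.isfile(config):
        return config
    warnings.warn(f"config file {config!r} not found; continuing without it")
    return None
```

Note that passing the default name explicitly falls under the third rule, so a missing file produces a warning in that case but not in the second.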

Cataloger class

class cataloger.commands.Cataloger

An instance of the Cataloger class is returned by both create_catalog() and check_catalog(). The Cataloger class is not intended to be instantiated on its own.

The Cataloger class contains a number of attributes and methods to enable programmatic access to the results of the catalog creation or catalog checking tasks.

catalog_file_name

The read-only name of the catalog file created or being checked.

processed_count

A read-only count of the number of files in the catalog.

extension_counts

A read-only dictionary of file extensions and the count for each extension.

  • key : file extension (with leading dot)
  • value : A count of the files within the catalog with this extension

excluded_files

A read-only list of the paths of all excluded files. All file paths are relative to the root path parameter.

mismatched_files

A read-only list of the paths of all mismatched files. All file paths are relative to the root path parameter.

extra_files

A read-only list of the paths of all extra files. All file paths are relative to the root path parameter.

missing_files

A read-only list of the paths of all missing files. All file paths are relative to the root path parameter.
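The attributes above can be combined into a short text report. The sketch below uses only the attribute names documented here; the helper names has_differences and format_report are hypothetical, and result stands for the instance returned by create_catalog() or check_catalog().

```python
def has_differences(result):
    """True if the check found any mismatched, missing or extra files."""
    return bool(
        result.mismatched_files or result.missing_files or result.extra_files
    )


def format_report(result):
    """Build a plain-text summary from the documented Cataloger attributes."""
    lines = [
        f"catalog: {result.catalog_file_name}",
        f"files processed: {result.processed_count}",
    ]
    # One indented line per extension, sorted for stable output.
    for extension, count in sorted(result.extension_counts.items()):
        lines.append(f"  {extension}: {count}")
    # One line per file list, showing how many paths it holds.
    for label in ("excluded", "mismatched", "extra", "missing"):
        paths = getattr(result, f"{label}_files")
        lines.append(f"{label}: {len(paths)}")
    return "\n".join(lines)
```

A script could, for instance, print format_report(result) and exit with a non-zero status when has_differences(result) is true.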

catalog_summary_by_directory

A generator method which yields a dictionary for each directory within the catalog. Each dictionary has the following keys:
  • path: The relative path of the directory being reported on
  • processed: The count of all files in the catalog in this directory
  • excluded: The count of all excluded files from this directory
  • mismatched: The count of the mismatched files from this directory. Will always be zero after a create_catalog() call.
  • missing: The count of the missing files from this directory. Will always be zero after a create_catalog() call.
  • extra: The count of the extra files from this directory. Will always be zero after a create_catalog() call.
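A minimal sketch of consuming this generator: it collects the paths of directories where a check found at least one mismatched, missing or extra file. The helper name directories_with_differences is hypothetical; the dictionary keys are the ones listed above, and the sketch assumes the method takes no arguments.

```python
def directories_with_differences(result):
    """Return the paths of directories with mismatched, missing or extra files."""
    difference_keys = ("mismatched", "missing", "extra")
    return [
        entry["path"]
        for entry in result.catalog_summary_by_directory()
        if any(entry[key] for key in difference_keys)
    ]
```

After a create_catalog() call the three counts are always zero, so this list is empty.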

is_file_in_catalog(file_path)

Returns True if this file exists in the catalog.

is_directory_in_catalog(directory)

Returns True if this directory exists in the catalog.
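The two membership methods can be used to test a batch of paths. In the sketch below the helper name not_in_catalog is hypothetical, and it assumes that paths are given relative to the root parameter, as for the list attributes; the page does not state this for these methods.

```python
def not_in_catalog(result, file_paths):
    """Return the given file paths that the catalog does not contain."""
    return [
        path for path in file_paths if not result.is_file_in_catalog(path)
    ]
```

The same pattern applies to is_directory_in_catalog() for directory paths.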