chemdataextractor

.config

Config file reader/writer.

chemdataextractor.config.construct_yaml_str(self, node)[source]

Override the default string handling function to always return unicode objects.

class chemdataextractor.config.Config(path=None)[source]

Bases: collections.abc.MutableMapping

Read and write to config file.

A config object is essentially a string key-value store that can be treated like a dictionary:

c = Config()
c['foo'] = 'bar'
print(c['foo'])

The file location may be specified:

c = Config('~/matt/anotherconfig.yml')
c['where'] = 'in a different file'

If no location is specified, the environment variable CHEMDATAEXTRACTOR_CONFIG is checked and used if available. Otherwise, a standard config location is used, which varies depending on the operating system. You can check the location using the path property. For more information see https://github.com/ActiveState/appdirs
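The lookup order described above can be sketched as follows. This is only an illustration of the precedence, not the library's code: `resolve_config_path`, its `environ` argument and `FALLBACK_PATH` are hypothetical names, and the real default location is computed per operating system.

```python
import os

# Hypothetical fallback; the real default location depends on the operating system.
FALLBACK_PATH = "~/example-config.yml"


def resolve_config_path(path=None, environ=None):
    """Return the config path: explicit argument, then environment variable, then fallback."""
    environ = os.environ if environ is None else environ
    if path is None:
        # No explicit path given, so consult CHEMDATAEXTRACTOR_CONFIG before falling back.
        path = environ.get("CHEMDATAEXTRACTOR_CONFIG", FALLBACK_PATH)
    # Expand a leading "~" so paths like '~/matt/anotherconfig.yml' become absolute.
    return os.path.expanduser(path)
```

Passing `environ` explicitly makes the precedence easy to check without touching the real process environment.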

It is possible to edit the file by hand with a text editor. It is in YAML format.

Warning: multiple instances of Config() pointing to the same file will not see each other's changes, and each instance overwrites the entire file whenever any of its keys is changed.
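The hazard in this warning can be reproduced with a small stand-in class. The sketch below is not the Config implementation: `FileBackedStore` and `demo_lost_update` are hypothetical names, and JSON is used in place of YAML so that the example needs only the standard library. It assumes the same behaviour the warning describes, namely that each instance reads the file once and rewrites it in full on every change.

```python
import json
import os
import tempfile
from collections.abc import MutableMapping


class FileBackedStore(MutableMapping):
    """Hypothetical stand-in: loads the file once, rewrites the whole file on every change."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        if os.path.exists(path):
            with open(path) as f:
                self._data = json.load(f)

    def _flush(self):
        # The entire in-memory mapping replaces whatever is on disk.
        with open(self.path, "w") as f:
            json.dump(self._data, f)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        self._flush()

    def __delitem__(self, key):
        del self._data[key]
        self._flush()

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def demo_lost_update():
    """Return the keys left on disk after two instances write to the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "config.json")
        a = FileBackedStore(path)
        b = FileBackedStore(path)   # created before a writes anything
        a["foo"] = "bar"            # file now holds only "foo"
        b["where"] = "elsewhere"    # b rewrites the file from its own stale copy
        return sorted(FileBackedStore(path))
```

In this sketch the key written through `a` is lost once `b` saves, which is why a single shared instance is the safer pattern.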

__init__(path=None)[source]
Parameters:

path (string) – (Optional) Path to config file location.

path

The path to the config file.

clear()[source]

Clear all values from config.

chemdataextractor.config.config = <Config: /home/docs/.config/ChemDataExtractor/chem...

Global config instance.

.data

Tools for loading and caching data files.

class chemdataextractor.data.Package(path, server_root=None, remote_path=None, unzip=False, untar=False, custom_download=None)[source]

Bases: object

Data package.

__init__(path, server_root=None, remote_path=None, unzip=False, untar=False, custom_download=None)[source]
Parameters:
  • path (str) – The path to where this package will be located under ChemDataExtractor’s default data directory.

  • (optional) server_root (str) – The root path for the server. If you do not supply the remote_path parameter, this will be used to find the remote path for the package.

  • (optional) remote_path (str) – The remote path for the package.

  • (optional) unzip (bool) – Whether the package should be unzipped after download. Set at most one of unzip and untar to True.

  • (optional) untar (bool) – Whether the package should be untarred after download. Set at most one of untar and unzip to True.

remote_path
local_path
remote_exists()[source]
local_exists()[source]
download(force=False)[source]
default_download(force=False)[source]
chemdataextractor.data.get_data_dir()[source]

Return path to the data directory.

chemdataextractor.data.find_data(path, warn=True, get_data=True)[source]

Return the absolute path to a data file within the data directory.

chemdataextractor.data.load_model(path)[source]

Load a model from a pickle file in the data directory. The result is cached, so each model is loaded only once.
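The load-once behaviour can be illustrated with a cached unpickling helper. This is a sketch under assumptions, not the body of load_model: `load_pickle_cached` and `demo_cached_load` are hypothetical names, `functools.lru_cache` stands in for whatever caching the library uses, and the path is taken as given instead of being resolved inside the data directory.

```python
import functools
import os
import pickle
import tempfile


@functools.lru_cache(maxsize=None)
def load_pickle_cached(path):
    """Unpickle the file at `path`; repeated calls with the same path reuse the first result."""
    with open(path, "rb") as f:
        return pickle.load(f)


def demo_cached_load():
    """Write a small pickle, load it twice, and report whether the same object came back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.pickle")
        with open(path, "wb") as f:
            pickle.dump({"weights": [1, 2, 3]}, f)
        first_load = load_pickle_cached(path)
        second_load = load_pickle_cached(path)  # served from the cache, file is not reopened
        return first_load is second_load
```

Only unpickle files from sources you trust, since unpickling can execute arbitrary code.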

chemdataextractor.data.PACKAGES = [<Package: models/cem_crf-1.0.pickle>, <Package: m...

Currently active data packages.

.errors

Error classes for ChemDataExtractor.

exception chemdataextractor.errors.ChemDataExtractorError[source]

Bases: Exception

Base ChemDataExtractor exception.

exception chemdataextractor.errors.ReaderError[source]

Bases: chemdataextractor.errors.ChemDataExtractorError

Raised when a reader is unable to read a document.

exception chemdataextractor.errors.ModelNotFoundError[source]

Bases: chemdataextractor.errors.ChemDataExtractorError

Raised when a model file could not be found.
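Because both specific exceptions derive from the base class, a caller can handle one case specially and still catch everything else from the package with a single clause. The sketch below defines local stand-ins that mirror the documented hierarchy so that it runs without the package installed; in real code the classes would be imported from chemdataextractor.errors. `run`, `raiser` and the returned strings are hypothetical.

```python
class ChemDataExtractorError(Exception):
    """Stand-in for the documented base exception."""


class ReaderError(ChemDataExtractorError):
    """Stand-in: raised when a reader is unable to read a document."""


class ModelNotFoundError(ChemDataExtractorError):
    """Stand-in: raised when a model file could not be found."""


def raiser(exc):
    """Return a zero-argument callable that raises `exc` (helper for the demo)."""
    def step():
        raise exc
    return step


def run(step):
    """Call `step` and report which handler caught its failure."""
    try:
        step()
    except ModelNotFoundError:
        # Most specific handler first.
        return "model missing"
    except ChemDataExtractorError:
        # Catches ReaderError and any other subclass of the base exception.
        return "other ChemDataExtractor failure"
    return "ok"
```

Handlers are tried in order, so the subclass clause has to come before the base-class clause.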

.utils

Miscellaneous utility functions.

chemdataextractor.utils.memoized_property(fget)[source]

Decorator to create memoized properties.
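A memoized property computes its value on first access and returns the stored value afterwards. The following is one common way to write such a decorator, shown as a sketch; it is not taken from the library's source, and `memoized_property_sketch` and `Circle` are hypothetical names.

```python
import functools


def memoized_property_sketch(fget):
    """Turn a method into a property whose value is computed once per instance."""
    attr = "_memo_" + fget.__name__

    @functools.wraps(fget)
    def getter(self):
        # Compute and store on first access; later accesses reuse the stored value.
        if not hasattr(self, attr):
            setattr(self, attr, fget(self))
        return getattr(self, attr)

    return property(getter)


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often the area is actually computed

    @memoized_property_sketch
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2
```

Accessing `Circle(2).area` repeatedly on the same instance increments `calls` only once.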

chemdataextractor.utils.memoize(obj)[source]

Decorator to create memoized functions, methods or classes.
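Memoization stores the result of a call keyed by its arguments, so a repeated call with the same arguments skips the computation. A minimal sketch for plain functions follows; it assumes all arguments are hashable and is not the library's implementation. `memoize_sketch`, `square` and `square_calls` are hypothetical names.

```python
import functools


def memoize_sketch(func):
    """Cache results of `func` by its (hashable) positional and keyword arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


square_calls = []  # records every real invocation of square


@memoize_sketch
def square(n):
    square_calls.append(n)
    return n * n
```

Calling `square(3)` twice runs the function body once; the second call is answered from the cache.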

chemdataextractor.utils.python_2_unicode_compatible(klass)[source]

Fix __str__, __unicode__ and __repr__ methods under Python 2.

class chemdataextractor.utils.Singleton[source]

Bases: type

Singleton metaclass.
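A singleton metaclass intercepts instantiation so that each class using it yields one shared instance. The sketch below shows a common form of the pattern and is not copied from the library; `SingletonSketch` and `Registry` are hypothetical names.

```python
class SingletonSketch(type):
    """Metaclass that creates at most one instance per class."""

    _instances = {}

    def __call__(cls, *args, **kwargs):
        # Build the instance on the first call; return the stored one afterwards.
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class Registry(metaclass=SingletonSketch):
    def __init__(self):
        self.items = []
```

With this sketch, constructor arguments passed after the first call are ignored, because `__init__` runs only once.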

chemdataextractor.utils.flatten(x)[source]

Return a single flat list containing elements from nested lists.
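The flattening behaviour can be illustrated with a short recursive sketch. It assumes that only `list` instances count as nested containers; how the library treats tuples, strings or other iterables is not stated here. `flatten_sketch` is a hypothetical name.

```python
def flatten_sketch(x):
    """Return one flat list with the elements of arbitrarily nested lists, in order."""
    result = []
    for item in x:
        if isinstance(item, list):
            # Recurse into nested lists and splice their elements in place.
            result.extend(flatten_sketch(item))
        else:
            result.append(item)
    return result
```

For example, `flatten_sketch([1, [2, [3, 4]], 5])` walks the nesting depth-first and preserves the left-to-right order of the elements.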

chemdataextractor.utils.first(el)[source]
chemdataextractor.utils.ensure_dir(path)[source]

Ensure a directory exists.
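In Python 3 the same effect is available through the standard library. The sketch below is an equivalent written for illustration, not the library's code; `ensure_dir_sketch` and `demo_ensure_dir` are hypothetical names.

```python
import os
import tempfile


def ensure_dir_sketch(path):
    """Create `path` and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)


def demo_ensure_dir():
    """Create a nested directory twice and report whether it exists afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "cache", "models")
        ensure_dir_sketch(target)
        ensure_dir_sketch(target)  # second call is a no-op, no error raised
        return os.path.isdir(target)
```

`exist_ok=True` also avoids the race between checking for the directory and creating it.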