.model¶

Models for storing relationships extracted using ChemDataExtractor. The model hierarchy was substantially rewritten in 2.0, which introduced breaking changes for older scripts that use ChemDataExtractor. Refer to the migration guide and the examples for an overview of the changes.

Classes for representing chemical models.

.model.base¶

Data model for extracted information.

class chemdataextractor.model.base.BaseType(default=None, null=False, required=False, requiredness=1.0, contextual=False, contextual_range=<chemdataextractor.model.contextual_range.DocumentRange object>, parse_expression=None, updatable=False, binding=False, ignore_when_merging=False, never_merge=False)[source]¶

Bases: object

name = None¶

__init__(default=None, null=False, required=False, requiredness=1.0, contextual=False, contextual_range=<chemdataextractor.model.contextual_range.DocumentRange object>, parse_expression=None, updatable=False, binding=False, ignore_when_merging=False, never_merge=False)[source]¶

Parameters:

default – (Optional) The default value for this field if none is set.
null (bool) – (Optional) Include in serialized output even if value is None. Default False.
required (bool) – (Optional) Whether a value is required. Default False.
contextual (bool) – (Optional) Whether this value is contextual. Default False.
contextual_range (ContextualRange) – (Optional) The maximum range within which contextual merging can occur if the value is contextual. Default DocumentRange. (i.e. merges across the entire document)
parse_expression (BaseParserElement) – (Optional) Expression for parsing, instance of a subclass of BaseParserElement. Default None.
updatable (bool) – (Optional) Whether the parse_expression can be changed by the document as parsing occurs. Default False.
binding (bool) – (Optional) If this option is set to True, any submodels that have an attribute with the same name must have the same value for this attribute. Default False.
ignore_when_merging (bool) – (Optional) If this option is set to True, any records with a different value for this field are treated as corresponding to the same physical record. Default False.
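The effect of the null option on serialized output can be pictured with a small stand-alone sketch. It uses plain dictionaries rather than ChemDataExtractor classes, and the names serialize_sketch and null_fields are hypothetical, chosen only for illustration.

```python
def serialize_sketch(values, null_fields=()):
    """Hypothetical serializer: skip None values unless the field opts in."""
    # Assumption: `null_fields` plays the role of fields declared with null=True.
    return {
        name: value
        for name, value in values.items()
        if value is not None or name in null_fields
    }


record = {"solvent": "ethanol", "temperature": None}

# With the default (null=False for every field) the None value is dropped.
print(serialize_sketch(record))

# Opting "temperature" in keeps the key, with None as its value.
print(serialize_sketch(record, null_fields=("temperature",)))
```

This is only a model of the documented behaviour, not the library's serialization code.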

reset()[source]¶: Reset the parse expression to the initial value.

process(value)[source]¶: Convert an assigned value into the desired data format for this field.

serialize(value, primitive=False)[source]¶: Serialize this field.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.base.StringType(default=None, null=False, required=False, requiredness=1.0, contextual=False, contextual_range=<chemdataextractor.model.contextual_range.DocumentRange object>, parse_expression=None, updatable=False, binding=False, ignore_when_merging=False, never_merge=False)[source]¶

Bases: chemdataextractor.model.base.BaseType

process(value)[source]¶: Convert value to a unicode string. Useful in case lxml _ElementUnicodeResult are passed from parser.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.base.FloatType(default=None, null=False, required=False, requiredness=1.0, contextual=False, contextual_range=<chemdataextractor.model.contextual_range.DocumentRange object>, parse_expression=None, updatable=False, binding=False, ignore_when_merging=False, never_merge=False)[source]¶

Bases: chemdataextractor.model.base.BaseType

A floating-point number field.

process(value)[source]¶: Convert value to a float.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.
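The process and is_empty methods shared by the field types above can be sketched with stand-in classes. This is a simplified illustration, not the library's code: SketchField and SketchFloatField are hypothetical names, and the rule that None and the empty string count as empty is an assumption.

```python
class SketchField:
    """Hypothetical stand-in for a field type; not ChemDataExtractor's BaseType."""

    def __init__(self, default=None, null=False):
        self.default = default
        self.null = null

    def process(self, value):
        # Base behaviour in this sketch: store the assigned value unchanged.
        return value

    def is_empty(self, value):
        # Assumption: None and the empty string are treated as empty.
        return value is None or value == ""


class SketchFloatField(SketchField):
    """Stand-in for a float field: converts assigned values to float."""

    def process(self, value):
        # Keep None as "no value"; convert everything else with float().
        return None if value is None else float(value)


field = SketchFloatField()
print(field.process("3.5"))
print(field.is_empty(None))
```

Subclasses override process to coerce values, which mirrors how StringType and FloatType are described as converting to a string and a float respectively.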

class chemdataextractor.model.base.ModelType(model, **kwargs)[source]¶

Bases: chemdataextractor.model.base.BaseType

__init__(model, **kwargs)[source]¶

Parameters:

default – (Optional) The default value for this field if none is set.
null (bool) – (Optional) Include in serialized output even if value is None. Default False.
required (bool) – (Optional) Whether a value is required. Default False.
contextual (bool) – (Optional) Whether this value is contextual. Default False.
contextual_range (ContextualRange) – (Optional) The maximum range within which contextual merging can occur if the value is contextual. Default DocumentRange. (i.e. merges across the entire document)
parse_expression (BaseParserElement) – (Optional) Expression for parsing, instance of a subclass of BaseParserElement. Default None.
updatable (bool) – (Optional) Whether the parse_expression can be changed by the document as parsing occurs. Default False.
binding (bool) – (Optional) If this option is set to True, any submodels that have an attribute with the same name must have the same value for this attribute. Default False.
ignore_when_merging (bool) – (Optional) If this option is set to True, any records with a different value for this field are treated as corresponding to the same physical record. Default False.

process(value)[source]¶: Convert an assigned value into the desired data format for this field.

serialize(value, primitive=False)[source]¶: Serialize this field.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.base.ListType(field, default=None, sorted_=False, **kwargs)[source]¶

Bases: chemdataextractor.model.base.BaseType

__init__(field, default=None, sorted_=False, **kwargs)[source]¶

Parameters:

default – (Optional) The default value for this field if none is set.
null (bool) – (Optional) Include in serialized output even if value is None. Default False.
required (bool) – (Optional) Whether a value is required. Default False.
contextual (bool) – (Optional) Whether this value is contextual. Default False.
contextual_range (ContextualRange) – (Optional) The maximum range within which contextual merging can occur if the value is contextual. Default DocumentRange. (i.e. merges across the entire document)
parse_expression (BaseParserElement) – (Optional) Expression for parsing, instance of a subclass of BaseParserElement. Default None.
updatable (bool) – (Optional) Whether the parse_expression can be changed by the document as parsing occurs. Default False.
binding (bool) – (Optional) If this option is set to True, any submodels that have an attribute with the same name must have the same value for this attribute. Default False.
ignore_when_merging (bool) – (Optional) If this option is set to True, any records with a different value for this field are treated as corresponding to the same physical record. Default False.

serialize(value, primitive=False)[source]¶: Serialize this field.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.base.InferredProperty(field, origin_field, inferrer, **kwargs)[source]¶

Bases: chemdataextractor.model.base.BaseType

A property that is inferred from the value of another property via an inferrer function. An example is processing the raw value extracted from a document into a list of floats, as seen in QuantityModel, where value is inferred from raw_value.

__init__(field, origin_field, inferrer, **kwargs)[source]¶

Parameters:

field (BaseType) – The type expected as a result of inference.
origin_field (str) – The name of the field from which to infer the value. This can be a keypath, as detailed in BaseModel
inferrer (function) – The function used to infer the value of the field. It takes two arguments, the value of the origin field (object) and the instance for which the value is being inferred (BaseModel), and returns the value that the inferred field should have (object).
default – (Optional) The default value for this field if none is set.
null (bool) – (Optional) Include in serialized output even if value is None. Default False.
required (bool) – (Optional) Whether a value is required. Default False.
contextual (bool) – (Optional) Whether this value is contextual. Default False.
parse_expression (BaseParserElement) – (Optional) Expression for parsing, instance of a subclass of BaseParserElement. Default None.
updatable (bool) – (Optional) Whether the parse_expression can be changed by the document as parsing occurs. Default False.
binding (bool) – (Optional) If this option is set to True, any submodels that have an attribute with the same name must have the same value for this attribute. Default False.
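An inferrer is a two-argument function that receives the origin field's value and the model instance. The sketch below shows one possible shape for such a function, turning a raw string into a list of floats. The name infer_value and the parsing rule (every unsigned number in the string is extracted) are assumptions for illustration and are not the inferrer used by QuantityModel.

```python
import re


def infer_value(raw_value, instance):
    """Hypothetical inferrer: (origin field value, model instance) -> inferred value."""
    if raw_value is None:
        return None
    # Assumption: only unsigned numbers matter, so "100-110" is read as a range
    # with two endpoints rather than 100 and minus 110.
    numbers = re.findall(r"\d+(?:\.\d+)?", raw_value)
    return [float(number) for number in numbers]


# The instance argument is unused in this sketch, so None is passed.
print(infer_value("100-110", None))
```

The instance argument is part of the signature so that an inferrer can consult other fields of the record when needed.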

process(value)[source]¶: Convert an assigned value into the desired data format for this field.

serialize(value, primitive=False)[source]¶: Serialize this field.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.base.SetType(field, default=None, **kwargs)[source]¶

Bases: chemdataextractor.model.base.BaseType

__init__(field, default=None, **kwargs)[source]¶

Parameters:

default – (Optional) The default value for this field if none is set.
null (bool) – (Optional) Include in serialized output even if value is None. Default False.
required (bool) – (Optional) Whether a value is required. Default False.
contextual (bool) – (Optional) Whether this value is contextual. Default False.
contextual_range (ContextualRange) – (Optional) The maximum range within which contextual merging can occur if the value is contextual. Default DocumentRange. (i.e. merges across the entire document)
parse_expression (BaseParserElement) – (Optional) Expression for parsing, instance of a subclass of BaseParserElement. Default None.
updatable (bool) – (Optional) Whether the parse_expression can be changed by the document as parsing occurs. Default False.
binding (bool) – (Optional) If this option is set to True, any submodels that have an attribute with the same name must have the same value for this attribute. Default False.
ignore_when_merging (bool) – (Optional) If this option is set to True, any records with a different value for this field are treated as corresponding to the same physical record. Default False.

serialize(value, primitive=False)[source]¶: Serialize this field.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.base.ModelMeta[source]¶

Bases: abc.ABCMeta

required_fields¶

class chemdataextractor.model.base.BaseModel(**raw_data)[source]¶

Bases: object

A base class for representing a model within ChemDataExtractor. Each model can have a number of fields that are declared with the class:

class ExampleModel(BaseModel):
    string_field = StringType()
    number_field = FloatType()

See the documentation for BaseType for more information. These fields are required for ChemDataExtractor to correctly identify what to extract and for merging different records for the same model.

The attributes in the models can then be accessed via either dot notation:

example_record.string_field

or dictionary notation:

example_record["string_field"]

You can have nested models, as in the example below, where a new class, ExampleModel2, can contain an ExampleModel:

class ExampleModel2(BaseModel):
    model_field = ModelType(ExampleModel)

Keypath notation can be used to find the nested properties:

example_record2["model_field.string_field"]
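To make the keypath idea concrete, here is a minimal sketch of resolving a dotted keypath. It works on nested dictionaries rather than BaseModel instances, and get_keypath is a hypothetical helper, not part of ChemDataExtractor.

```python
def get_keypath(record, keypath):
    """Follow a dotted keypath through nested dictionaries."""
    current = record
    # Each segment between dots selects one level of nesting.
    for key in keypath.split("."):
        current = current[key]
    return current


# A plain-dictionary stand-in for an ExampleModel2 record.
example_record2 = {
    "model_field": {"string_field": "benzene", "number_field": 80.1},
}

print(get_keypath(example_record2, "model_field.string_field"))
```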

fields = {}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

specifier = None¶

__init__(**raw_data)[source]¶

classmethod deserialize(serialized)[source]¶

get_confidence(key, default_confidence=None, pooling_method=<function min_value>)[source]¶

set_confidence(key, value)[source]¶

total_confidence(pooling_method=<function min_value>, _account_for_merging=False)[source]¶

is_unidentified¶: True if there is no ‘compound’ field associated with the model but the compound is contextual.

classmethod reset_updatables()[source]¶: Reset all updatable parse_expressions of properties associated with the class.

classmethod update(definitions, strict=True)[source]¶

Update this Element’s updatable attributes with new information from definitions.

Parameters:: definitions (list) – List of definitions found in this element.

updated¶: Whether a specifier within the model was updated.

keys()[source]¶

items()[source]¶

values()[source]¶

get(key, default=None)[source]¶

contextual_fulfilled¶

Whether all the contextual fields have been extracted.

Returns:: True if all fields have been found, False if not.
Return type:: bool

required_fulfilled¶

Whether all the required fields have been extracted.

Returns:: True if all fields have been found, False if not.
Return type:: bool

noncontextual_required_fulfilled¶

Whether all the non-contextual required fields have been extracted.

Returns:: True if all fields have been found, False if not.
Return type:: bool

serialize(primitive=False)[source]¶: Convert Model to python dictionary.

to_json(*args, **kwargs)[source]¶: Convert Model to JSON.

is_superset(other)[source]¶

Whether this model instance is a ‘superset’ of the other model instance.

A model instance is a ‘superset’ of another if it satisfies the following conditions:

The model instances are of the same type
For each of the attributes of the model instances, either:
- This instance has more information, or
- Both instances have the same information

Parameters:: other (BaseModel) – The other model instance to compare with this model instance
Returns:: Whether this model instance is a superset of the other model instance
Return type:: bool
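The superset conditions can be illustrated with records held as plain dictionaries. This is a sketch under stated assumptions, not the library's implementation: is_superset_sketch is a hypothetical name, and "more information" is modelled as having a value where the other record has None.

```python
def is_superset_sketch(this, other):
    """Hypothetical check that `this` holds at least the information in `other`."""
    # Condition 1: both records must be of the same type.
    if type(this) is not type(other):
        return False
    # Condition 2: every value the other record has must be matched here.
    for key, theirs in other.items():
        if theirs is None:
            continue  # the other record has no information for this attribute
        if this.get(key) != theirs:
            return False
    return True


full = {"value": "100", "units": "K"}
partial = {"value": "100", "units": None}

print(is_superset_sketch(full, partial))
print(is_superset_sketch(partial, full))
```

Under this model a record is a superset of itself, which matches the "both instances have the same information" case.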

is_subset(other)[source]¶

Whether this model instance is a ‘subset’ of the other model instance.

A model instance is a ‘subset’ of another if it satisfies the following conditions:

The model instances are of the same type
For each of the attributes of the model instances, either:
- The other instance has more information, or
- Both instances have the same information

Parameters:: other (BaseModel) – The other model instance to compare with this model instance
Returns:: Whether this model instance is a subset of the other model instance
Return type:: bool

merge_contextual(other, distance=<chemdataextractor.model.contextual_range.SentenceRange object>)[source]¶

Merges any fields marked contextual with additional information from other provided that:

other is of the same type and they don’t have any conflicting fields

other is a model type that is part of this model and that field is currently set to be the default value or the field can be merged with the other.

Note

This method mutates the model it’s called on and returns it.

Parameters:: other (BaseModel) – The other model to merge into this model
Returns:: A merged model
Return type:: BaseModel

contextual_range(field_name)[source]¶

The contextual range for a field. Override this method to allow for contextual ranges to change with time.

Parameters:: field_name (str) – The name of the field for which to calculate the contextual range
Returns:: The contextual range for the field given the current record
Return type:: ContextualRange

merge_all(other, strict=True)[source]¶

Merges any properties between other and self, regardless of whether that field is contextual. Checks to make sure that there are no conflicts between the values contained in self and those in other.

Note

This method mutates the model it’s called on and returns it.

Parameters:: other (BaseModel) – The other model to merge into this model
Returns:: A merged model
Return type:: BaseModel
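The mutate-and-return behaviour noted above can be sketched with dictionaries standing in for models. merge_all_sketch is a hypothetical name, and two details are assumptions rather than documented behaviour: a conflict is taken to mean two different non-None values for the same key, and a conflicting merge leaves the record unchanged.

```python
def merge_all_sketch(this, other):
    """Hypothetical stand-in for merging two records held as dictionaries."""
    # Assumption: any disagreement between filled-in values blocks the merge.
    for key, theirs in other.items():
        mine = this.get(key)
        if mine is not None and theirs is not None and mine != theirs:
            return this
    # No conflicts: copy over every value that `this` is missing.
    for key, theirs in other.items():
        if this.get(key) is None and theirs is not None:
            this[key] = theirs
    return this


record = {"value": "1234", "units": None}
merged = merge_all_sketch(record, {"value": "1234", "units": "K"})

# The same object is returned, now carrying the merged values.
print(merged is record, merged)
```

Because the record is modified in place, callers that need the original should copy it before merging.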

merge_confidence(other, field_name)[source]¶

classmethod flatten(include_inferred=True)[source]¶

A set of all models that are associated with this model. For example, if we have a model like the following with multiple submodels:

class A(BaseModel):
    pass

class B(BaseModel):
    a = ModelType(A)

class C(BaseModel):
    b = ModelType(B)

then C.flatten() would give the result:

{C, B, A}

Returns:: The set of all models associated with this model.
Return type:: set(BaseModel)
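The recursive collection of submodels can be sketched with stand-in classes. SketchModel and its submodels attribute are hypothetical and replace the ModelType fields of the real classes; the include_inferred option is not modelled.

```python
class SketchModel:
    """Hypothetical stand-in for BaseModel that lists its submodels directly."""

    submodels = ()

    @classmethod
    def flatten(cls):
        # Start with this model, then add everything reachable from each submodel.
        found = {cls}
        for submodel in cls.submodels:
            found |= submodel.flatten()
        return found


class A(SketchModel):
    pass


class B(SketchModel):
    submodels = (A,)


class C(SketchModel):
    submodels = (B,)


# Print the class names in sorted order so the output is deterministic.
print(sorted(model.__name__ for model in C.flatten()))
```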

binding_properties¶

A dictionary of all binding properties in this model, and their values.

Note

This function only returns those properties that are immediately binding for this model, and not for any submodels.

Returns:: A dictionary with the names of all binding fields as the keys and their values as the values.
Return type:: {str: Any}

record_method¶: Description (string) of which method was used to create this record.

is_empty¶

class chemdataextractor.model.base.ModelList(*models)[source]¶

Bases: collections.abc.MutableSequence

Wrapper around a list of Model objects to facilitate operations on all at once.

__init__(*models)[source]¶: Initialize self. See help(type(self)) for accurate signature.

insert(index, value)[source]¶: S.insert(index, value) – insert value before index

serialize()[source]¶: Serialize to a list of python dictionaries.

to_json(*args, **kwargs)[source]¶: Convert ModelList to JSON.

remove_subsets(strict=False)[source]¶

Remove any subsets contained within the ModelList.

Parameters:: strict (bool) – Default False. Whether only strict subsets are removed. When this is False, duplicates are removed too.
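The difference between the two settings of strict can be sketched on a list of dictionaries. This is an illustration under assumptions, not the ModelList implementation: the function names are hypothetical, a None value means "no information", and when duplicates are removed the first occurrence is the one kept.

```python
def is_subset_sketch(a, b):
    """True if every value that `a` holds is also held by `b`."""
    return all(value is None or b.get(key) == value for key, value in a.items())


def remove_subsets_sketch(records, strict=False):
    """Hypothetical subset removal over a list of dictionary records."""
    kept = []
    for i, record in enumerate(records):
        redundant = False
        for j, other in enumerate(records):
            if i == j or not is_subset_sketch(record, other):
                continue
            same_information = is_subset_sketch(other, record)
            if not same_information:
                redundant = True  # strict subset: always removed
            elif not strict and j < i:
                redundant = True  # duplicate: removed only when strict is False
            if redundant:
                break
        if not redundant:
            kept.append(record)
    return kept


records = [
    {"value": "100", "units": "K"},
    {"value": "100", "units": None},
    {"value": "100", "units": "K"},
]

print(remove_subsets_sketch(records))
print(remove_subsets_sketch(records, strict=True))
```

With strict set to True the two identical records both survive, because neither holds strictly less information than the other.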

chemdataextractor.model.base.sort_merge_candidates(merge_candidates, adjust_by_confidence=True)[source]¶

.model.model¶

Model classes for physical properties.

class chemdataextractor.model.model.Compound(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

names¶

labels¶

roles¶

parsers = [<chemdataextractor.parse.cem.CompoundParser object>, <chemdataextractor.parse.cem.CompoundHeadingParser object>, <chemdataextractor.parse.cem.ChemicalLabelParser object>, <chemdataextractor.parse.cem.CompoundTableParser object>]¶

merge(other)[source]¶: Merge data from another Compound into this Compound.

is_unidentified¶: True if there is no ‘compound’ field associated with the model but the compound is contextual.

is_id_only¶: Return True if identifier information only.

classmethod update(definitions, strict=True)[source]¶

Update the Compound labels parse expression.

Parameters:: definitions (list) – List of definitions found in this element.

construct_label_expression(label)[source]¶

fields = {'labels': <chemdataextractor.model.base.SetType object>, 'names': <chemdataextractor.model.base.SetType object>, 'roles': <chemdataextractor.model.base.SetType object>}¶

class chemdataextractor.model.model.Apparatus(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

name¶

parsers = [<chemdataextractor.parse.apparatus.ApparatusParser object>]¶

fields = {'name': <chemdataextractor.model.base.StringType object>}¶

class chemdataextractor.model.model.UvvisPeak(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

value¶: Peak value, i.e. wavelength

units¶: Peak value units

extinction¶

extinction_units¶

shape¶

fields = {'extinction': <chemdataextractor.model.base.StringType object>, 'extinction_units': <chemdataextractor.model.base.StringType object>, 'shape': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.StringType object>, 'value': <chemdataextractor.model.base.StringType object>}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.UvvisSpectrum(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

solvent¶

temperature¶

temperature_units¶

concentration¶

concentration_units¶

apparatus¶

peaks¶

compound¶

parsers = [<chemdataextractor.parse.uvvis.UvvisParser object>]¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'compound': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'peaks': <chemdataextractor.model.base.ListType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'temperature': <chemdataextractor.model.base.StringType object>, 'temperature_units': <chemdataextractor.model.base.StringType object>}¶

class chemdataextractor.model.model.IrPeak(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

value¶

units¶

strength¶

bond¶

fields = {'bond': <chemdataextractor.model.base.StringType object>, 'strength': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.StringType object>, 'value': <chemdataextractor.model.base.StringType object>}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.IrSpectrum(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

solvent¶

temperature¶

temperature_units¶

concentration¶

concentration_units¶

apparatus¶

peaks¶

compound¶

parsers = [<chemdataextractor.parse.ir.IrParser object>]¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'compound': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'peaks': <chemdataextractor.model.base.ListType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'temperature': <chemdataextractor.model.base.StringType object>, 'temperature_units': <chemdataextractor.model.base.StringType object>}¶

class chemdataextractor.model.model.NmrPeak(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

shift¶

intensity¶

multiplicity¶

coupling¶

coupling_units¶

number¶

assignment¶

fields = {'assignment': <chemdataextractor.model.base.StringType object>, 'coupling': <chemdataextractor.model.base.StringType object>, 'coupling_units': <chemdataextractor.model.base.StringType object>, 'intensity': <chemdataextractor.model.base.StringType object>, 'multiplicity': <chemdataextractor.model.base.StringType object>, 'number': <chemdataextractor.model.base.StringType object>, 'shift': <chemdataextractor.model.base.StringType object>}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.NmrSpectrum(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

nucleus¶

solvent¶

frequency¶

frequency_units¶

standard¶

temperature¶

temperature_units¶

concentration¶

concentration_units¶

apparatus¶

peaks¶

compound¶

parsers = [<chemdataextractor.parse.nmr.NmrParser object>]¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'compound': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'frequency': <chemdataextractor.model.base.StringType object>, 'frequency_units': <chemdataextractor.model.base.StringType object>, 'nucleus': <chemdataextractor.model.base.StringType object>, 'peaks': <chemdataextractor.model.base.ListType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'standard': <chemdataextractor.model.base.StringType object>, 'temperature': <chemdataextractor.model.base.StringType object>, 'temperature_units': <chemdataextractor.model.base.StringType object>}¶

class chemdataextractor.model.model.MeltingPoint(**raw_data)[source]¶

Bases: chemdataextractor.model.units.temperature.TemperatureModel

solvent¶

concentration¶

concentration_units¶

apparatus¶

compound¶

parsers = [<chemdataextractor.parse.mp_new.MpParser object>]¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'compound': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

class chemdataextractor.model.model.GlassTransition(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

A glass transition temperature.

value¶

units¶

method¶

concentration¶

concentration_units¶

compound¶

parsers = [<chemdataextractor.parse.tg.TgParser object>]¶

fields = {'compound': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'method': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.StringType object>, 'value': <chemdataextractor.model.base.StringType object>}¶

class chemdataextractor.model.model.QuantumYield(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

A quantum yield measurement.

value¶

units¶

solvent¶

type¶

standard¶

standard_value¶

standard_solvent¶

concentration¶

concentration_units¶

temperature¶

temperature_units¶

apparatus¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'standard': <chemdataextractor.model.base.StringType object>, 'standard_solvent': <chemdataextractor.model.base.StringType object>, 'standard_value': <chemdataextractor.model.base.StringType object>, 'temperature': <chemdataextractor.model.base.StringType object>, 'temperature_units': <chemdataextractor.model.base.StringType object>, 'type': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.StringType object>, 'value': <chemdataextractor.model.base.StringType object>}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.FluorescenceLifetime(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

A fluorescence lifetime measurement.

value¶

units¶

solvent¶

concentration¶

concentration_units¶

temperature¶

temperature_units¶

apparatus¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'temperature': <chemdataextractor.model.base.StringType object>, 'temperature_units': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.StringType object>, 'value': <chemdataextractor.model.base.StringType object>}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.ElectrochemicalPotential(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

An oxidation or reduction potential, from cyclic voltammetry.

value¶

units¶

type¶

solvent¶

concentration¶

concentration_units¶

temperature¶

temperature_units¶

apparatus¶

fields = {'apparatus': <chemdataextractor.model.base.ModelType object>, 'concentration': <chemdataextractor.model.base.StringType object>, 'concentration_units': <chemdataextractor.model.base.StringType object>, 'solvent': <chemdataextractor.model.base.StringType object>, 'temperature': <chemdataextractor.model.base.StringType object>, 'temperature_units': <chemdataextractor.model.base.StringType object>, 'type': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.StringType object>, 'value': <chemdataextractor.model.base.StringType object>}¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.NeelTemperature(**raw_data)[source]¶

Bases: chemdataextractor.model.units.temperature.TemperatureModel

expression = <chemdataextractor.parse.elements.IWord object>¶

specifier¶

compound¶

fields = {'compound': <chemdataextractor.model.base.ModelType object>, 'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.CurieTemperature(**raw_data)[source]¶

Bases: chemdataextractor.model.units.temperature.TemperatureModel

expression = <chemdataextractor.parse.elements.First object>¶

specifier¶

compound¶

fields = {'compound': <chemdataextractor.model.base.ModelType object>, 'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.InteratomicDistance(**raw_data)[source]¶

Bases: chemdataextractor.model.units.length.LengthModel

specifier_expression = <chemdataextractor.parse.elements.And object>¶

specifier¶

rij_label = <chemdataextractor.parse.elements.Regex object>¶

species¶

compound¶

another_label¶

fields = {'another_label': <chemdataextractor.model.base.StringType object>, 'compound': <chemdataextractor.model.base.ModelType object>, 'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'species': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.CoordinationNumber(**raw_data)[source]¶

Bases: chemdataextractor.model.units.quantity_model.DimensionlessModel

coordination_number_label = <chemdataextractor.parse.elements.Regex object>¶

specifier_expression = <chemdataextractor.parse.elements.Regex object>¶

specifier¶

cn_label¶

compound¶

fields = {'cn_label': <chemdataextractor.model.base.StringType object>, 'compound': <chemdataextractor.model.base.ModelType object>, 'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.model.CNLabel(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

coordination_number_label = <chemdataextractor.parse.elements.Regex object>¶

specifier = <chemdataextractor.parse.elements.And object>¶

label_Juraj¶

compound¶

parsers = [<chemdataextractor.parse.auto.AutoSentenceParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

fields = {'compound': <chemdataextractor.model.base.ModelType object>, 'label_Juraj': <chemdataextractor.model.base.StringType object>}¶

.model.units¶

Types for representing quantities, dimensions, and units.

codeauthor:: Taketomo Isazawa (ti250@cam.ac.uk)

chemdataextractor.model.units.standard_units¶

.model.units.unit¶

Base types for making units. Refer to the example on creating new units and dimensions for more detail on how to create your own units.

class chemdataextractor.model.units.unit.UnitType(default=None, null=False, required=False, requiredness=1.0, contextual=False, contextual_range=<chemdataextractor.model.contextual_range.DocumentRange object>, parse_expression=None, updatable=False, binding=False, ignore_when_merging=False, never_merge=False)[source]¶

Bases: chemdataextractor.model.base.BaseType

A field containing a Unit of some type.

process(value)[source]¶: Convert an assigned value into the desired data format for this field.

serialize(value, primitive=False)[source]¶: Serialize this field.

is_empty(value)[source]¶: Return whether a value is considered empty for the case of this field.

class chemdataextractor.model.units.unit.MetaUnit[source]¶

Bases: type

Metaclass to ensure that all subclasses of Unit take the magnitude into account when converting to standard units.

class chemdataextractor.model.units.unit.Unit(dimensions, magnitude=0.0, powers=None)[source]¶

Bases: object

Object representing units. Implement subclasses of this for basic units. Units like meters, seconds, and Kelvins are already implemented in ChemDataExtractor. These can then be combined by simply dividing or multiplying them to create more complex units. Alternatively, one can create these by subclassing Unit and setting the powers parameter as desired. For example, a speed could be represented as either:

speedunit = Meter() / Second()

class SpeedUnit(Unit):

    def __init__(self, magnitude=0.0):
        # Pass the magnitude through so that prefixed speed units keep their prefix.
        super(SpeedUnit, self).__init__(Length()/Time(),
                                        magnitude=magnitude,
                                        powers={Meter():1.0, Second():-1.0})

speedunit = SpeedUnit()

and either method should produce the same results.

Any subclass of Unit which represents a real unit should implement the following methods:

convert_value_to_standard
convert_value_from_standard
convert_error_to_standard
convert_error_from_standard

These methods ensure that Units can be seamlessly converted to other ones. Any magnitudes placed in front of the units, e.g. kilometers, are handled automatically. Care must be taken that the ‘standard’ unit chosen is obvious, consistent, and documented; otherwise another user may implement new units with the same dimensions but a different standard unit, resulting in unexpected errors. To ensure correct behaviour, one should also define the standard unit in code by setting the corresponding dimension’s standard_units, unless the dimension is a composite one, in which case the standard unit can often be inferred from the constituent units’ standard units.
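As a rough illustration of the four conversion methods, the following standalone sketch mimics the idea without importing ChemDataExtractor. ToyUnit, ToyMeter and ToyMile are hypothetical names, and treating the magnitude as a power of ten applied on top of the unit-specific factor is an assumption inferred from the description above, not the library's implementation.

```python
class ToyUnit:
    """Hypothetical stand-in for a unit; magnitude is a power of ten (3 -> kilo)."""

    factor_to_standard = 1.0  # size of one unprefixed unit, in standard units

    def __init__(self, magnitude=0.0):
        self.magnitude = magnitude

    def convert_value_to_standard(self, value):
        # Apply the prefix, then the unit-specific factor.
        return value * 10 ** self.magnitude * self.factor_to_standard

    def convert_value_from_standard(self, value):
        return value / (10 ** self.magnitude * self.factor_to_standard)

    # For purely multiplicative units, errors scale exactly like values.
    convert_error_to_standard = convert_value_to_standard
    convert_error_from_standard = convert_value_from_standard


class ToyMeter(ToyUnit):
    factor_to_standard = 1.0  # the meter is the standard unit in this sketch


class ToyMile(ToyUnit):
    factor_to_standard = 1609.344  # one international mile in meters


kilometers = ToyMeter(magnitude=3.0)
meters_in_5_km = kilometers.convert_value_to_standard(5.0)
miles_in_5_km = ToyMile().convert_value_from_standard(meters_in_5_km)
```

Because every unit converts through the same standard unit, any pair of units can be converted by chaining one "to standard" call with one "from standard" call.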

base_magnitude = 0.0¶

constituent_units = None¶

Unit instance for showing constituent units. Used for creating more complex models. An example would be:

class Newton(Unit):
    constituent_units = Gram(magnitude=3.0) * Meter() * (Second()) ** (-2.0)

__init__(dimensions, magnitude=0.0, powers=None)[source]¶

Creates a unit object. Subclass Unit to create concrete units. For examples, see length.py and time.py.

Parameters:

dimensions (Dimension) – The dimensions this unit is for, e.g. Temperature
magnitude (float) – (Optional) The magnitude of the unit, e.g. km would be meters with a magnitude of 3
powers (dict[Unit : float]) – (Optional) For representing any more complicated units, e.g. m/s may have this parameter set to {Meter():1.0, Second():-1.0}

convert_value_to_standard(value)[source]¶

convert_value_from_standard(value)[source]¶

convert_error_to_standard(value)[source]¶

convert_error_from_standard(value)[source]¶

class chemdataextractor.model.units.unit.DimensionlessUnit(magnitude=0.0)[source]¶

Bases: chemdataextractor.model.units.unit.Unit

Special case to handle dimensionless quantities.

__init__(magnitude=0.0)[source]¶

Parameters: magnitude (float) – The magnitude of the unit.

convert_to_standard(value)[source]¶

convert_error_from_standard(value)¶

convert_error_to_standard(value)¶

convert_from_standard(value)[source]¶

convert_value_from_standard(value)¶

convert_value_to_standard(value)¶

.model.units.dimension¶

Base types for dimensions. Refer to the example on creating new units and dimensions for more detail on how to create your own dimensions.

chemdataextractor.model.units.dimension.standard_units¶

class chemdataextractor.model.units.dimension.Dimension[source]¶

Bases: object

Class for representing physical dimensions.

constituent_dimensions = None¶

Used for creating composite dimensions. It is of type Dimension. An example would be speed, in which case we would have:

class Speed(Dimension):
    constituent_dimensions = Length() / Time()
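One way to picture how composite dimensions combine is to store each dimension as a mapping from base dimension to power. The sketch below is standalone and does not use ChemDataExtractor; ToyDimension and its powers attribute are hypothetical, and the library's own representation may differ.

```python
class ToyDimension:
    """Hypothetical dimension stored as a mapping of base name -> power."""

    def __init__(self, powers):
        # Drop zero powers so that cancelled dimensions compare equal.
        self.powers = {name: p for name, p in powers.items() if p != 0}

    def __mul__(self, other):
        combined = dict(self.powers)
        for name, p in other.powers.items():
            combined[name] = combined.get(name, 0) + p
        return ToyDimension(combined)

    def __truediv__(self, other):
        inverse = ToyDimension({name: -p for name, p in other.powers.items()})
        return self * inverse

    def __eq__(self, other):
        return isinstance(other, ToyDimension) and self.powers == other.powers

    def __hash__(self):
        return hash(frozenset(self.powers.items()))


length = ToyDimension({"length": 1})
time = ToyDimension({"time": 1})
speed = length / time
acceleration = speed / time
```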

units_dict = {}¶

Used for extracting units with these dimensions. It is of type dictionary{chemdataextractor.parse.element : Unit or None}.

An element is the key for None when an element is needed for autoparsing to work correctly, but one does not want to take account of this when extracting a unit from a merged string.

An example of this is °C, which is always split into two tokens, so we need to be able to capture ° and C separately using elements from the units_dict, but we do not want this to affect extract_units(), to which the single string ‘°C’ is passed in. As a solution, we have the following units_dict:

units_dict = {R(r'°?(((K|k)elvin(s)?)|K)\.?', group=0): Kelvin,
              R(r'(°C|((C|c)elsius))\.?', group=0): Celsius,
              R(r'°?((F|f)ahrenheit|F)\.?', group=0): Fahrenheit,
              R(r'°|C', group=0): None}

Note

The units_dict has been extensively tested using regex elements. While it may in theory work with other parse elements, using a regex element is strongly recommended. If a regex element is specified, it should:

Not have a $ symbol at the end. Units can be passed in with numbers or other symbols after them, and the same elements are used by the autoparser to find candidate tokens which may contain units; a trailing $ would stop both from working.
Have the group attribute set to 0. Unless this is set, the regex element returns the whole token in which the match was found by default. This is unhelpful for extracting units, where only the exact characters that matched the unit are wanted.
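The two recommendations can be illustrated with Python's standard re module, used here as a stand-in for the regex parse element. This is only a sketch: how the parse element applies the pattern to tokens internally is not shown, and the token string is an invented example.

```python
import re

# The Celsius pattern from the units_dict above.
CELSIUS = re.compile(r'(°C|((C|c)elsius))\.?')
# The same pattern with a trailing '$', which the note advises against.
CELSIUS_ANCHORED = re.compile(r'(°C|((C|c)elsius))\.?$')

token = '°C),'  # a unit followed by stray punctuation in the same token

match = CELSIUS.search(token)
matched_text = match.group(0)  # only the characters that matched the unit

# With '$', the pattern fails because the token does not end with the unit.
anchored_match = CELSIUS_ANCHORED.search(token)
```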

standard_units¶

The standard units for this dimension. Of type Unit.

Set this attribute when creating a new dimension to make converting to the standard units easy via convert_to_standard(), and to make it clear in the code what the standard units are.

When dimensions are multiplied together or combined into composite dimensions, the class handles the standard units automatically.

class chemdataextractor.model.units.dimension.Dimensionless[source]¶

Bases: chemdataextractor.model.units.dimension.Dimension

Special case to handle dimensionless quantities.

standard_units¶

.model.units.quantity_model¶

Base types for making quantity models.

codeauthor:: Taketomo Isazawa (ti250@cam.ac.uk)

class chemdataextractor.model.units.quantity_model.QuantityModel(**raw_data)[source]¶

Bases: chemdataextractor.model.base.BaseModel

Class for modelling quantities. Subclasses of this model can be used in conjunction with Autoparsers to extract properties with zero human intervention. However, they must be constructed in a certain way for them to work optimally with autoparsers. Namely, they should have:

A specifier field with an associated parse expression (Optional, only required if autoparsers are desired). These parse expressions will be updated automatically using forward-looking Interdependency Resolution if the updatable flag is set to True.
These specifiers should also have required set to True so that spurious matches are not found.
If applicable, a compound field, named compound.

Any parse_expressions set in the model should have an added action to ensure that the results are a single word. An example would be to call add_action(join) on each parse expression.

raw_value¶

raw_units¶

value¶: A property that is inferred from the value of another property via an inferrer function. An example is processing the raw value extracted from a document into a list of floats, as in QuantityModel, where value is inferred from raw_value.

units¶: A property that is inferred from the value of another property via an inferrer function.

error¶: A property that is inferred from the value of another property via an inferrer function.

dimensions = None¶

specifier¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

convert_to(unit)[source]¶

Convert from current units to the given units. Raises AttributeError if the current unit is not set.

Note

This method both modifies the current model and returns the modified model.

Parameters: unit (Unit) – The Unit to convert to
Returns: The quantity in the given units.
Return type: QuantityModel
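The "modifies and returns" behaviour noted above can be sketched with a small standalone class. ToyQuantity and its conversion table are hypothetical and only mimic the documented contract; they are not the library's code.

```python
class ToyQuantity:
    """Hypothetical quantity holding a value and a unit name."""

    # Size of each unit in meters; the meter is the standard unit here.
    METERS_PER_UNIT = {"meter": 1.0, "angstrom": 1e-10, "micron": 1e-6}

    def __init__(self, value, units=None):
        self.value = value
        self.units = units

    def convert_to(self, unit):
        if self.units is None:
            # Mirrors the documented behaviour of raising AttributeError.
            raise AttributeError("current units are not set")
        in_meters = self.value * self.METERS_PER_UNIT[self.units]
        self.value = in_meters / self.METERS_PER_UNIT[unit]
        self.units = unit
        return self  # the same object, now modified


bond = ToyQuantity(1.5, "angstrom")
converted = bond.convert_to("meter")
```

Since the returned object is the original one, callers who need to keep the unconverted quantity should copy it before converting.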

convert_to_standard()[source]¶

Convert from current units to the standard units. Raises AttributeError if the current unit has not been set or the dimensions do not have standard units.

Note

This method both modifies the current model and returns the modified model.

Returns: The quantity in the standard units.
Return type: QuantityModel

convert_value(from_unit, to_unit)[source]¶

Convert between the given units. If no units have been set for this model, assumes that it’s in standard units.

Parameters:

from_unit (Unit) – The Unit to convert from
to_unit (Unit) – The Unit to convert to

Returns:

The value as expressed in the new unit

Return type:

float

convert_error(from_unit, to_unit)[source]¶

Convert the error between the given units. If no units have been set for this model, assumes that it’s in standard units.

Parameters:

from_unit (Unit) – The Unit to convert from
to_unit (Unit) – The Unit to convert to

Returns:

The error as expressed in the new unit

Return type:

float

is_equal(other)[source]¶

Tests whether the two quantities are physically equal, i.e. whether they represent the same value just in different units.

Parameters: other (QuantityModel) – The quantity being compared with
Returns: Whether the two quantities are equal
Return type: bool
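Physical equality can be pictured as converting both quantities to the standard unit before comparing. The standalone sketch below uses a hypothetical toy_is_equal function; the use of a relative tolerance is an assumption made for the sketch, not something the documentation states about is_equal.

```python
import math

# Size of each unit in meters (the standard unit in this sketch).
METERS_PER_UNIT = {"meter": 1.0, "kilometer": 1000.0, "mile": 1609.344}


def toy_is_equal(value_a, unit_a, value_b, unit_b):
    """Compare two lengths after expressing both in the standard unit."""
    a_standard = value_a * METERS_PER_UNIT[unit_a]
    b_standard = value_b * METERS_PER_UNIT[unit_b]
    # Floating-point conversions rarely match exactly, so use a tolerance.
    return math.isclose(a_standard, b_standard, rel_tol=1e-9)


same = toy_is_equal(1.0, "kilometer", 1000.0, "meter")
different = toy_is_equal(1.0, "mile", 1.0, "kilometer")
```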

is_superset(other)[source]¶

Whether this model instance is a ‘superset’ of the other model instance.

A model instance is a ‘superset’ of another if it satisfies the following conditions:

The model instances are of the same type
For each of the attributes of the model instances, either:
- This instance has more information, or
- Both instances have the same information

Parameters: other (BaseModel) – The other model instance to compare with this model instance
Returns: Whether this model instance is a superset of the other model instance
Return type: bool
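The superset conditions above can be sketched with plain dictionaries. Everything here is hypothetical: records are modelled as (type name, fields) pairs, None stands for "no information", and the model names are invented for illustration.

```python
def toy_is_superset(this, other):
    """Return True if `this` carries at least the information in `other`.

    Both arguments are (type_name, fields) pairs, where fields maps
    attribute names to values and None means "no information".
    """
    this_type, this_fields = this
    other_type, other_fields = other
    if this_type != other_type:
        return False
    for name, other_value in other_fields.items():
        if other_value is None:
            continue  # the other record knows nothing here
        if this_fields.get(name) != other_value:
            return False  # missing or conflicting information
    return True


full = ("MeltingPoint", {"raw_value": "100", "raw_units": "K", "specifier": "Tm"})
partial = ("MeltingPoint", {"raw_value": "100", "raw_units": None, "specifier": None})
```

In this sketch full is a superset of partial but not the other way round, and a record is always a superset of itself.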

fields = {'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

class chemdataextractor.model.units.quantity_model.DimensionlessModel(**raw_data)[source]¶

Bases: chemdataextractor.model.units.quantity_model.QuantityModel

Special case to handle dimensionless quantities.

dimensions = <chemdataextractor.model.units.dimension.Dimensionless object>¶

raw_units¶

fields = {'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

.model.units.length¶

Units and models for lengths.

codeauthor:: Taketomo Isazawa (ti250@cam.ac.uk)

class chemdataextractor.model.units.length.Length[source]¶

Bases: chemdataextractor.model.units.dimension.Dimension

Dimension subclass for lengths.

standard_units¶

units_dict = {<chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.length.Meter'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.length.Mile'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.length.Angstrom'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.length.Micron'>}¶

class chemdataextractor.model.units.length.LengthModel(**raw_data)[source]¶

Bases: chemdataextractor.model.units.quantity_model.QuantityModel

Model for lengths.

dimensions = <chemdataextractor.model.units.length.Length object>¶

fields = {'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.units.length.LengthUnit(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.unit.Unit

Base class for units with dimensions of length. The standard value for length is defined to be a meter, implemented in the Meter class.

__init__(magnitude=0.0, powers=None)[source]¶

Creates a unit object. Subclass Unit to create concrete units. For examples, see length.py and time.py.

Parameters:

dimensions (Dimension) – The dimensions this unit is for, e.g. Temperature
magnitude (float) – (Optional) The magnitude of the unit, e.g. km would be meters with a magnitude of 3
powers (dict[Unit : float]) – (Optional) For representing any more complicated units, e.g. m/s may have this parameter set to {Meter():1.0, Second():-1.0}

convert_error_from_standard(value)¶

convert_error_to_standard(value)¶

convert_value_from_standard(value)¶

convert_value_to_standard(value)¶

class chemdataextractor.model.units.length.Meter(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.length.LengthUnit

Class for meters.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.length.Mile(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.length.LengthUnit

Class for miles.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.length.Angstrom(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.length.LengthUnit

Class for Angstroms.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.length.Micron(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.length.LengthUnit

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

.model.units.mass¶

Units and models for masses.

codeauthor:: Taketomo Isazawa (ti250@cam.ac.uk)

class chemdataextractor.model.units.mass.Mass[source]¶

Bases: chemdataextractor.model.units.dimension.Dimension

Dimension subclass for masses.

standard_units¶

units_dict = {<chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.mass.Gram'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.mass.Pound'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.mass.Pound'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.mass.Tonne'>}¶

class chemdataextractor.model.units.mass.MassModel(**raw_data)[source]¶

Bases: chemdataextractor.model.units.quantity_model.QuantityModel

Model for mass.

dimensions = <chemdataextractor.model.units.mass.Mass object>¶

fields = {'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.units.mass.MassUnit(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.unit.Unit

Base class for units with dimensions of mass. The standard value for mass is defined to be a kilogram, which can be created with Gram(magnitude=3.0)
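Because the standard mass unit carries a prefix, converting grams to the standard unit involves an offset in the power of ten. The function below is a hypothetical standalone sketch of that arithmetic, assuming the magnitude is a power of ten; it does not show how MassUnit handles this internally.

```python
STANDARD_MAGNITUDE = 3.0  # kilogram = gram with a magnitude of 3


def grams_to_standard(value, magnitude=0.0):
    """Convert a value in (prefixed) grams to kilograms."""
    return value * 10 ** (magnitude - STANDARD_MAGNITUDE)


half_kilogram = grams_to_standard(500.0)                  # plain grams
two_kilograms = grams_to_standard(2.0, magnitude=3.0)     # already kilograms
five_milligrams = grams_to_standard(5.0, magnitude=-3.0)  # milligrams
```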

__init__(magnitude=0.0, powers=None)[source]¶

Creates a unit object. Subclass Unit to create concrete units. For examples, see length.py and time.py.

Parameters:

dimensions (Dimension) – The dimensions this unit is for, e.g. Temperature
magnitude (float) – (Optional) The magnitude of the unit, e.g. km would be meters with a magnitude of 3
powers (dict[Unit : float]) – (Optional) For representing any more complicated units, e.g. m/s may have this parameter set to {Meter():1.0, Second():-1.0}

convert_error_from_standard(value)¶

convert_error_to_standard(value)¶

convert_value_from_standard(value)¶

convert_value_to_standard(value)¶

class chemdataextractor.model.units.mass.Gram(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.mass.MassUnit

Class for grams.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.mass.Pound(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.mass.MassUnit

Class for pounds.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.mass.Tonne(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.mass.MassUnit

Class for tonnes, i.e. metric tons.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

.model.units.time¶

Units and models for times.

codeauthor:: Taketomo Isazawa (ti250@cam.ac.uk)

class chemdataextractor.model.units.time.Time[source]¶

Bases: chemdataextractor.model.units.dimension.Dimension

Dimension subclass for times.

standard_units¶

units_dict = {<chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.time.Day'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.time.Year'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.time.Hour'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.time.Minute'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.time.Second'>}¶

class chemdataextractor.model.units.time.TimeModel(**raw_data)[source]¶

Bases: chemdataextractor.model.units.quantity_model.QuantityModel

Model for times. These models should strictly be used for time intervals, never absolute times, as peculiarities of calendars are not supported, e.g. a minute is always defined as 60 seconds.
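Treating every unit as a fixed number of seconds is what makes interval conversion simple. The sketch below is standalone and hypothetical; it omits years because the length of a year used by the library is not stated here.

```python
# Fixed durations in seconds; calendar effects such as leap seconds or
# daylight saving changes are deliberately ignored.
SECONDS_PER_UNIT = {"second": 1.0, "minute": 60.0, "hour": 3600.0, "day": 86400.0}


def convert_interval(value, from_unit, to_unit):
    """Convert a time interval between two fixed-length units."""
    return value * SECONDS_PER_UNIT[from_unit] / SECONDS_PER_UNIT[to_unit]


minutes_in_two_hours = convert_interval(2.0, "hour", "minute")
days_in_36_hours = convert_interval(36.0, "hour", "day")
```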

dimensions = <chemdataextractor.model.units.time.Time object>¶

fields = {'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.units.time.TimeUnit(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.unit.Unit

Base class for units with dimensions of time. The standard value for time is defined to be a second, implemented in the Second class.

__init__(magnitude=0.0, powers=None)[source]¶

convert_error_from_standard(value)¶

convert_error_to_standard(value)¶

convert_value_from_standard(value)¶

convert_value_to_standard(value)¶

class chemdataextractor.model.units.time.Second(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.time.TimeUnit

Class for seconds.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.time.Hour(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.time.TimeUnit

Class for hours.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.time.Minute(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.time.TimeUnit

Class for minutes.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.time.Year(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.time.TimeUnit

Class for years.

convert_to_standard(value)[source]¶

convert_from_standard(value)[source]¶

convert_error_from_standard(value)¶

convert_error_to_standard(value)¶

convert_value_from_standard(value)¶

convert_value_to_standard(value)¶

class chemdataextractor.model.units.time.Day(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.time.TimeUnit

Class for days.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

.model.units.temperature¶

Units and models for temperatures.

codeauthor:: Taketomo Isazawa (ti250@cam.ac.uk)

class chemdataextractor.model.units.temperature.Temperature[source]¶

Bases: chemdataextractor.model.units.dimension.Dimension

Dimension subclass for temperatures.

standard_units¶

units_dict = {<chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.temperature.Kelvin'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.temperature.Celsius'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.temperature.Celsius'>, <chemdataextractor.parse.elements.Regex object>: <class 'chemdataextractor.model.units.temperature.Fahrenheit'>, <chemdataextractor.parse.elements.Regex object>: None}¶

class chemdataextractor.model.units.temperature.TemperatureModel(**raw_data)[source]¶

Bases: chemdataextractor.model.units.quantity_model.QuantityModel

Model for temperatures.

dimensions = <chemdataextractor.model.units.temperature.Temperature object>¶

fields = {'error': <chemdataextractor.model.base.InferredProperty object>, 'raw_units': <chemdataextractor.model.base.StringType object>, 'raw_value': <chemdataextractor.model.base.StringType object>, 'specifier': <chemdataextractor.model.base.StringType object>, 'units': <chemdataextractor.model.base.InferredProperty object>, 'value': <chemdataextractor.model.base.InferredProperty object>}¶

parsers = [<chemdataextractor.parse.template.MultiQuantityModelTemplateParser object>, <chemdataextractor.parse.template.QuantityModelTemplateParser object>, <chemdataextractor.parse.auto.AutoTableParser object>]¶

class chemdataextractor.model.units.temperature.TemperatureUnit(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.unit.Unit

Base class for units with dimensions of temperature. The standard value for temperature is defined to be a Kelvin, implemented in the Kelvin class.
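Temperature shows why values and errors have separate conversion methods: a value in Celsius or Fahrenheit is shifted by an offset, whereas an uncertainty is a difference and only scales. The functions below are a standalone sketch with hypothetical names, not the library's implementation.

```python
def celsius_value_to_kelvin(value):
    return value + 273.15  # values are shifted by the offset


def celsius_error_to_kelvin(error):
    return error  # an uncertainty is a difference, so the offset cancels


def fahrenheit_value_to_kelvin(value):
    return (value - 32.0) * 5.0 / 9.0 + 273.15


def fahrenheit_error_to_kelvin(error):
    return error * 5.0 / 9.0  # only the scale factor applies


boiling_k = celsius_value_to_kelvin(100.0)
boiling_err_k = celsius_error_to_kelvin(0.5)
freezing_k = fahrenheit_value_to_kelvin(32.0)
fahrenheit_err_k = fahrenheit_error_to_kelvin(1.8)
```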

__init__(magnitude=0.0, powers=None)[source]¶

Creates a unit object. Subclass Unit to create concrete units. For examples, see length.py and time.py.

Parameters:

dimensions (Dimension) – The dimensions this unit is for, e.g. Temperature
magnitude (float) – (Optional) The magnitude of the unit, e.g. km would be meters with a magnitude of 3
powers (dict[Unit : float]) – (Optional) For representing any more complicated units, e.g. m/s may have this parameter set to {Meter():1.0, Second():-1.0}

convert_error_from_standard(value)¶

convert_error_to_standard(value)¶

convert_value_from_standard(value)¶

convert_value_to_standard(value)¶

class chemdataextractor.model.units.temperature.Kelvin(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.temperature.TemperatureUnit

Class for Kelvins.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.temperature.Celsius(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.temperature.TemperatureUnit

Class for Celsius.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

class chemdataextractor.model.units.temperature.Fahrenheit(magnitude=0.0, powers=None)[source]¶

Bases: chemdataextractor.model.units.temperature.TemperatureUnit

Class for Fahrenheit.

convert_value_to_standard(value)¶

convert_value_from_standard(value)¶

convert_error_to_standard(value)¶

convert_error_from_standard(value)¶

.eval

.nlp