API Reference
Note
Private methods are not included in the documentation!
Config file reader/writer.
Tools for loading and caching data files.
Error classes for ChemDataExtractor.
Miscellaneous utility functions.

Tools for dealing with bibliographic information.
BibTeX parser.
Tools for parsing people’s names from strings into various name components.
Parse metadata stored as XMP (Extensible Metadata Platform).

ChemDataExtractor command line interface.
Chemical entity mention (CEM) commands.
Command line tools for dealing with the CHEMDNER corpus.
Word clusters command-line interface.
Commands for managing ChemDataExtractor configuration.
Data and model management interface.
Commands for building a dictionary-based chemical named entity recognizer.
Commands for running evaluations.
Part of speech tagging commands.
Tokenizer command line interface.

Document processing.
Document model.
Document elements.
Figure document elements.
Metadata document elements.
Table document elements.
Text-based document elements.

Evaluation of extraction results.

Classes for representing chemical models.
Data model for extracted information.
Model classes for physical properties.
Types for representing quantities, dimensions, and units.
Base types for making units.
Base types for dimensions.
Base types for making quantity models.
Units and models for lengths.
Units and models for masses.
Units and models for times.
Units and models for temperatures.

Chemistry-aware natural language processing framework.
Abbreviation detection.
New and improved named entity recognition (NER) for chemical entity mentions (CEM).
Tagger wrappers that wrap AllenNLP functionality.
Named entity recognition (NER) for chemical entity mentions (CEM).
Tools for reading and writing text corpora.
Cache features of previously seen words.
Part-of-speech tagging.
Tagger implementations.
Word and sentence tokenizers.

Parse text using rule-based grammars.
Actions to perform during parsing.
Parser for automatic parsing, without user-written parsing rules.
Base classes for parsing sentences and tables.
Chemical entity mention parser elements.
Common parser elements.
Parser elements.
IR spectrum text parser.
NMR text parser.
NMR text parser.
Glass transition temperature parser.
UV-vis text parser.

Reader classes that read a file and produce a ChemDataExtractor Document object.
Readers for documents from the ACS.
Abstract base classes for document readers.
Readers for ChemSpider SyntheticPages.
XML and HTML readers based on lxml.
Readers for NLM Journal Archiving and Interchange DTD XML files.
PDF document reader.
Plain text document reader.
Readers for documents from the RSC.
Readers for USPTO patents.
Elsevier XML reader.
Readers for documents from Springer.

Cluster of phrase objects and associated cluster dictionaries.
Extraction pattern object.
Extraction pattern object.
Phrase object.
Classes for defining new chemical relationships.
Various utility functions.

Declarative scraping framework for extracting structured data from HTML and XML documents.
Abstract base classes that define the interface for Scrapers, Fields, Crawlers, etc.
Clean HTML or XML by removing tags completely or replacing with their contents.
Extend cssselect to improve handling of pseudo-elements.
An entity to extract.
Fields to define on an entity.
Concrete classes for scraping and searching.
Tool for selecting content from HTML or XML using CSS or XPath expressions.
Scraping tools for specific publishers.
Tools for scraping documents from NLM Journal Archiving and Interchange DTD XML files.
Tools for scraping documents from The Royal Society of Chemistry.
Tools for scraping documents from Springer, BioMed Central and Chemistry Central XML files.
Tools for scraping documents from Elsevier.

Tools for processing text.
Chemistry text handling tools.
Tools for converting LaTeX to Unicode.
Tools for normalizing text.
Text processors.