.cli

Command line interface tools

ChemDataExtractor command line interface.

Once installed, ChemDataExtractor provides a command-line tool that can be used by typing ‘cde’ in a terminal.

chemdataextractor.cli.cli(ctx, verbose)

ChemDataExtractor command line interface.

chemdataextractor.cli.extract(ctx, input, output)

Run ChemDataExtractor on a document.

chemdataextractor.cli.read(ctx, input, output)

Output processed document elements.
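As a rough illustration of driving the `cde` tool from a script, the sketch below wraps it in a small helper. The `run_cde` function is a hypothetical name introduced here, not part of ChemDataExtractor, and the example assumes the tool accepts the conventional `--help` option. The helper returns `None` when no `cde` executable is found on the PATH, so it is safe to run on a machine without ChemDataExtractor installed.

```python
import shutil
import subprocess


def run_cde(*args, timeout=60):
    """Run the `cde` tool with the given arguments and return its stdout.

    Hypothetical helper: returns None when no `cde` executable is on PATH.
    """
    executable = shutil.which("cde")
    if executable is None:
        return None
    completed = subprocess.run(
        [executable, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # leave exit-code handling to the caller
    )
    return completed.stdout


if __name__ == "__main__":
    # Assumes the conventional --help option is available.
    help_text = run_cde("--help")
    print(help_text if help_text is not None else "cde is not installed")
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems with file paths that contain spaces.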

.cli.cem

Chemical entity mention (CEM) commands.

chemdataextractor.cli.cem.cem(ctx)

Chemical NER commands.

chemdataextractor.cli.cem.train_crf(ctx, input, output, clusters)

Train CRF CEM recognizer.

.cli.chemdner

Command line tools for dealing with CHEMDNER corpus.

chemdataextractor.cli.chemdner.chemdner_cli(ctx)

CHEMDNER commands.

chemdataextractor.cli.chemdner.prepare_gold(ctx, annotations, gout)

Prepare bc-evaluate gold file from annotations supplied by CHEMDNER.

chemdataextractor.cli.chemdner.prepare_tokens(ctx, input, annotations, tout, lout)

Prepare tokenized and tagged corpus file from those supplied by CHEMDNER.

chemdataextractor.cli.chemdner.tag(ctx, corpus, output)

Tag chemical entities and write a CHEMDNER annotation predictions file.

.cli.cluster

Word clusters command-line interface.

chemdataextractor.cli.cluster.cluster_cli(ctx)

Word clusters commands.

chemdataextractor.cli.cluster.load(ctx, input, output)

Read clusters from file and save to model file.

.cli.config

Commands for managing ChemDataExtractor configuration.

chemdataextractor.cli.config.config_cli(ctx)

Manage configuration.

chemdataextractor.cli.config.list(ctx)

List all config values.

chemdataextractor.cli.config.get(ctx)

Get the config value for a key.

chemdataextractor.cli.config.set(ctx, key, value)

Set the config value for a key.

chemdataextractor.cli.config.remove(ctx, key)

Remove the config value for a key.

chemdataextractor.cli.config.clear(ctx)

Clear all config values.
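The five operations above map naturally onto a key/value store. The following is a minimal in-memory sketch of those semantics, not ChemDataExtractor's actual implementation: the `SketchConfig` class and the `example_key` entry are hypothetical names introduced for illustration, and a real configuration would presumably be persisted to disk.

```python
class SketchConfig:
    """Hypothetical in-memory stand-in for a key/value configuration store."""

    def __init__(self):
        self._data = {}

    def list(self):
        """Return all (key, value) pairs, sorted by key."""
        return sorted(self._data.items())

    def get(self, key):
        """Return the value for a key, or None if the key is not set."""
        return self._data.get(key)

    def set(self, key, value):
        """Set the value for a key, overwriting any existing value."""
        self._data[key] = value

    def remove(self, key):
        """Remove a key; removing a missing key is a no-op."""
        self._data.pop(key, None)

    def clear(self):
        """Remove every key."""
        self._data.clear()


config = SketchConfig()
config.set("example_key", "example_value")  # hypothetical key and value
value_after_set = config.get("example_key")
config.remove("example_key")
value_after_remove = config.get("example_key")
config.set("a", "1")
config.set("b", "2")
listed = config.list()
config.clear()
listed_after_clear = config.list()
```

Treating removal of a missing key as a no-op is a design choice of this sketch; a command-line tool might instead prefer to report an error.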

.cli.data

Data and model management interface.

chemdataextractor.cli.data.data_cli(ctx)

Data and model management commands.

chemdataextractor.cli.data.where(ctx)

Print path to data directory.

chemdataextractor.cli.data.list(ctx)

List active data packages.

chemdataextractor.cli.data.download(ctx)

Download data.

chemdataextractor.cli.data.clean(ctx)

Prune data that is no longer required.

.cli.dict

Commands for building a dictionary-based chemical named entity recognizer.

chemdataextractor.cli.dict.dict_cli(ctx)

Chemical dictionary commands.

chemdataextractor.cli.dict.prepare_jochem(ctx, jochem, output, csoutput)

Process and filter jochem file to produce list of names for dictionary.

chemdataextractor.cli.dict.prepare_include(ctx, include, output)

Process and filter include file to produce list of names for dictionary.

chemdataextractor.cli.dict.build(ctx, inputs, output, cs)

Build chemical name dictionary.

chemdataextractor.cli.dict.tag(ctx, model, cs, corpus, output)

Tag chemical entities and write a CHEMDNER annotation predictions file.

.cli.evaluate

Commands for running evaluations.

chemdataextractor.cli.evaluate.evaluate(ctx)

Evaluation commands.

chemdataextractor.cli.evaluate.run(input)

chemdataextractor.cli.evaluate.compare()

chemdataextractor.cli.evaluate.eval_document(gold, out, transform=None)

chemdataextractor.cli.evaluate.get_names(cs)

Return list of every name.

chemdataextractor.cli.evaluate.get_labels(cs)

Return list of every label.

chemdataextractor.cli.evaluate.get_ids(cs)

Return chemical identifier records.
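The helpers above each flatten one field across a collection of records. A minimal sketch of that pattern follows, assuming each record in `cs` is a dictionary with optional `names` and `labels` lists. That record shape and the `collect_names` and `collect_labels` function names are assumptions made for illustration, not the library's code.

```python
def collect_names(records):
    """Return a flat list of every name found in the records."""
    names = []
    for record in records:
        # Records without a "names" entry contribute nothing.
        names.extend(record.get("names", []))
    return names


def collect_labels(records):
    """Return a flat list of every label found in the records."""
    labels = []
    for record in records:
        labels.extend(record.get("labels", []))
    return labels


# Hypothetical example records; the field layout is an assumption.
example_records = [
    {"names": ["benzene"], "labels": ["1"]},
    {"names": ["toluene", "methylbenzene"]},
]

all_names = collect_names(example_records)
all_labels = collect_labels(example_records)
```

Here `all_names` preserves the order in which names appear across records, and the second record, which has no labels, is simply skipped by `collect_labels`.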

chemdataextractor.cli.evaluate.get_spectra_type(cs)

chemdataextractor.cli.evaluate.get_spectra_subject(cs)

chemdataextractor.cli.evaluate.get_spectra_peaks(cs)

chemdataextractor.cli.evaluate.get_spectra_solvent(cs)

chemdataextractor.cli.evaluate.get_spectra_core(cs)

chemdataextractor.cli.evaluate.get_spectra_temp(cs)

chemdataextractor.cli.evaluate.get_spectra_apparatus(cs)

chemdataextractor.cli.evaluate.get_spectra_full(cs)

chemdataextractor.cli.evaluate.get_property_value(cs)

chemdataextractor.cli.evaluate.get_property_units(cs)

chemdataextractor.cli.evaluate.get_property_subject(cs)

chemdataextractor.cli.evaluate.get_property_solvent(cs)

chemdataextractor.cli.evaluate.get_property_temperature(cs)

chemdataextractor.cli.evaluate.get_property_apparatus(cs)

chemdataextractor.cli.evaluate.get_property_core(cs)

chemdataextractor.cli.evaluate.get_property_full(cs)

.cli.pos

Part of speech tagging commands.

chemdataextractor.cli.pos.pos_cli(ctx)

POS tagger commands.

chemdataextractor.cli.pos.train_all(ctx, output)

Train POS taggers on WSJ, on GENIA, and on both combined, each with and without cluster features.

chemdataextractor.cli.pos.evaluate_all(ctx, model)

Evaluate POS taggers on WSJ and GENIA.

chemdataextractor.cli.pos.train(ctx, output, corpus, clusters)

Train POS Tagger.

chemdataextractor.cli.pos.evaluate(ctx, model, corpus, clusters)

Evaluate performance of POS Tagger.

chemdataextractor.cli.pos.train_perceptron(ctx, output, corpus, clusters)

Train Averaged Perceptron POS Tagger.

chemdataextractor.cli.pos.evaluate_perceptron(ctx, model, corpus)

Evaluate performance of Averaged Perceptron POS Tagger.

chemdataextractor.cli.pos.tag(ctx, input, output)

Output POS-tagged tokens.

.cli.tokenize

Tokenizer command line interface.

chemdataextractor.cli.tokenize.tokenize_cli(ctx)

Tokenizer commands.

chemdataextractor.cli.tokenize.train_punkt(ctx, input, output, abbr, colloc)

Train Punkt sentence splitter using sentences in input.

chemdataextractor.cli.tokenize.sentences(ctx, input, output)

Read input document, and output sentences.

chemdataextractor.cli.tokenize.words(ctx, input, output)

Read input document, and output words.