.cli
Command line interface tools
ChemDataExtractor command line interface.
Once installed, ChemDataExtractor provides a command-line tool that can be used by typing ‘cde’ in a terminal.
- chemdataextractor.cli.cli(ctx, verbose)
  ChemDataExtractor command line interface.
- chemdataextractor.cli.extract(ctx, input, output)
  Run ChemDataExtractor on a document.
- chemdataextractor.cli.read(ctx, input, output)
  Output processed document elements.
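The dotted names on this page are Python functions, but they are normally reached through the cde tool. The sketch below shows one plausible mapping from a dotted name to a command line. It assumes, without confirmation from this page, that each submodule under chemdataextractor.cli is exposed as a command group named after the module and that each function is a command named after the function. `to_cde_argv` is a hypothetical helper, not part of ChemDataExtractor.

```python
PREFIX = "chemdataextractor.cli."


def to_cde_argv(dotted_name, dashes=False):
    """Map a dotted function name from this page to an assumed `cde` argv.

    Assumption: submodules are command groups and functions are commands.
    With dashes=True, underscores in the command name become hyphens, which
    is how Click 7.0 and later name commands by default.
    """
    if not dotted_name.startswith(PREFIX):
        raise ValueError("not a chemdataextractor.cli name: " + dotted_name)
    parts = dotted_name[len(PREFIX):].split(".")
    if dashes:
        parts[-1] = parts[-1].replace("_", "-")
    return ["cde"] + parts


print(to_cde_argv("chemdataextractor.cli.extract"))
print(to_cde_argv("chemdataextractor.cli.tokenize.sentences"))
print(to_cde_argv("chemdataextractor.cli.cem.train_crf", dashes=True))
```

Group functions such as cli(ctx, verbose) or config_cli(ctx) are the groups themselves, so the mapping does not apply to them. Running cde with --help is the reliable way to confirm the actual command names in an installed version.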
.cli.cem
Chemical entity mention (CEM) commands.
- chemdataextractor.cli.cem.cem(ctx)
  Chemical NER commands.
- chemdataextractor.cli.cem.train_crf(ctx, input, output, clusters)
  Train CRF CEM recognizer.
.cli.chemdner
Command line tools for dealing with the CHEMDNER corpus.
- chemdataextractor.cli.chemdner.chemdner_cli(ctx)
  CHEMDNER commands.
- chemdataextractor.cli.chemdner.prepare_gold(ctx, annotations, gout)
  Prepare bc-evaluate gold file from annotations supplied by CHEMDNER.
- chemdataextractor.cli.chemdner.prepare_tokens(ctx, input, annotations, tout, lout)
  Prepare tokenized and tagged corpus file from the corpus files supplied by CHEMDNER.
- chemdataextractor.cli.chemdner.tag(ctx, corpus, output)
  Tag chemical entities and write a CHEMDNER annotations predictions file.
.cli.cluster
Word clusters command-line interface.
- chemdataextractor.cli.cluster.cluster_cli(ctx)
  Word clusters commands.
- chemdataextractor.cli.cluster.load(ctx, input, output)
  Read clusters from file and save to model file.
.cli.config
Commands for managing ChemDataExtractor configuration.
- chemdataextractor.cli.config.config_cli(ctx)
  Manage configuration.
- chemdataextractor.cli.config.list(ctx)
  List all config values.
- chemdataextractor.cli.config.get(ctx)
  Get the config value for a key.
- chemdataextractor.cli.config.set(ctx, key, value)
  Set the config value for a key.
- chemdataextractor.cli.config.remove(ctx, key)
  Remove the config value for a key.
- chemdataextractor.cli.config.clear(ctx)
  Clear all config values.
.cli.data
Data and model management interface.
- chemdataextractor.cli.data.data_cli(ctx)
  Data and model management commands.
- chemdataextractor.cli.data.where(ctx)
  Print path to data directory.
- chemdataextractor.cli.data.list(ctx)
  List active data packages.
- chemdataextractor.cli.data.download(ctx)
  Download data.
- chemdataextractor.cli.data.clean(ctx)
  Prune data that is no longer required.
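A script that depends on the downloaded models might check the data directory before doing anything else. The sketch below shells out to the tool. It assumes the where command above is invoked as `cde data where`, and it does nothing when cde is not on the PATH. `run_cde` is a hypothetical helper, not part of ChemDataExtractor.

```python
import shutil
import subprocess


def run_cde(*args):
    """Run `cde` with the given arguments and return its stdout.

    Returns None when the `cde` executable is not installed or the command
    exits with a non-zero status, so the sketch is safe to run on a machine
    without ChemDataExtractor.
    """
    if shutil.which("cde") is None:
        return None
    result = subprocess.run(["cde", *args], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout


# Assumed invocation: the `where` command inside the `data` group.
data_dir = run_cde("data", "where")
if data_dir is None:
    print("cde is not available, or the command failed")
else:
    print("data directory:", data_dir.strip())
```

The same helper could be called with "data", "download" to fetch the data packages, again assuming that invocation matches the installed version.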
.cli.dict
Commands for building a dictionary-based chemical named entity recognizer.
- chemdataextractor.cli.dict.dict_cli(ctx)
  Chemical dictionary commands.
- chemdataextractor.cli.dict.prepare_jochem(ctx, jochem, output, csoutput)
  Process and filter a Jochem file to produce a list of names for the dictionary.
- chemdataextractor.cli.dict.prepare_include(ctx, include, output)
  Process and filter an include file to produce a list of names for the dictionary.
- chemdataextractor.cli.dict.build(ctx, inputs, output, cs)
  Build chemical name dictionary.
- chemdataextractor.cli.dict.tag(ctx, model, cs, corpus, output)
  Tag chemical entities and write a CHEMDNER annotations predictions file.
.cli.evaluate
Commands for running evaluations.
- chemdataextractor.cli.evaluate.evaluate(ctx)
  Evaluation commands.
- chemdataextractor.cli.evaluate.run(input)
- chemdataextractor.cli.evaluate.compare()
.cli.pos
Part of speech tagging commands.
- chemdataextractor.cli.pos.pos_cli(ctx)
  POS tagger commands.
- chemdataextractor.cli.pos.train_all(ctx, output)
  Train POS taggers on WSJ, on GENIA, and on both combined, each with and without cluster features.
- chemdataextractor.cli.pos.evaluate_all(ctx, model)
  Evaluate POS taggers on WSJ and GENIA.
- chemdataextractor.cli.pos.train(ctx, output, corpus, clusters)
  Train POS tagger.
- chemdataextractor.cli.pos.evaluate(ctx, model, corpus, clusters)
  Evaluate performance of POS tagger.
- chemdataextractor.cli.pos.train_perceptron(ctx, output, corpus, clusters)
  Train averaged perceptron POS tagger.
- chemdataextractor.cli.pos.evaluate_perceptron(ctx, model, corpus)
  Evaluate performance of averaged perceptron POS tagger.
- chemdataextractor.cli.pos.tag(ctx, input, output)
  Output POS-tagged tokens.
.cli.tokenize
Tokenizer command line interface.
- chemdataextractor.cli.tokenize.tokenize_cli(ctx)
  Tokenizer commands.
- chemdataextractor.cli.tokenize.train_punkt(ctx, input, output, abbr, colloc)
  Train Punkt sentence splitter using sentences in input.
- chemdataextractor.cli.tokenize.sentences(ctx, input, output)
  Read input document, and output sentences.
- chemdataextractor.cli.tokenize.words(ctx, input, output)
  Read input document, and output words.
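Output from the sentences and words commands is text, so it can be post-processed with a few lines of Python. The sketch below assumes one sentence per line and, for words, tokens separated by whitespace within each line. Neither detail is stated on this page, so check the real output before relying on it. The function names and the sample string are invented for illustration.

```python
def parse_sentences(text):
    """Split captured `sentences` output into a list, assuming one per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_words(text):
    """Split captured `words` output into one token list per non-empty line."""
    return [line.split() for line in text.splitlines() if line.strip()]


# Invented sample standing in for captured command output.
sample = "The product was obtained in 85 % yield .\nIt melts at 120 °C .\n"

for tokens in parse_words(sample):
    print(len(tokens), tokens)
```

Keeping the parsing separate from the code that runs the command makes it easy to adjust if the real output format turns out to differ from these assumptions.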