Welcome to ChemDataExtractor!

ChemDataExtractor is a toolkit for extracting chemical information from the scientific literature. Check out the Online Demo!

Features:

  • HTML, XML and PDF document readers
  • Chemistry-aware natural language processing pipeline
  • Chemical named entity recognition
  • Rule-based parsing grammars for property and spectra extraction
  • Table parser for extracting tabulated data
  • Document processing to resolve data interdependencies
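
To make "rule-based parsing" concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not ChemDataExtractor's API or grammar: the pattern MELTING_POINT, the function extract_melting_points, and the record fields are hypothetical names invented for this illustration, and a real grammar would handle far more phrasings than this single toy rule.

```python
import re

# Toy rule (hypothetical, for illustration only):
# "<Name> melts at <number> °C", where Name starts with a capital letter.
MELTING_POINT = re.compile(
    r"(?P<name>[A-Z][\w\-,()]*)\s+melts at\s+(?P<value>\d+(?:\.\d+)?)\s*°C"
)


def extract_melting_points(text):
    """Return one record (a dict) per match of the toy melting-point rule."""
    return [
        {
            "name": match.group("name"),
            "melting_point": float(match.group("value")),
            "units": "°C",
        }
        for match in MELTING_POINT.finditer(text)
    ]


records = extract_melting_points(
    "Benzophenone melts at 48.5 °C. Naphthalene melts at 80 °C."
)
for record in records:
    print(record)
```

Each matched sentence yields one structured record. Keeping the rule as a single named pattern makes it easy to add further rules later without changing the function that applies them.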
