Published August 21, 2023 | Version cdk-2.9
Software Open

cdk/cdk: CDK 2.9

  • 1. NextMove Software ltd
  • 2. @BiGCAT-UM
  • 3. EMBL-EBI
  • 4. @vernalis
  • 5. @ncats
  • 6. Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences
  • 7. @pendingai
  • 8. National University of Singapore
  • 9. University of Plovdiv
  • 10. Molecular Materials Informatics, Inc.
  • 11. Aalto University
  • 12. Ideaconsult Ltd. @ideaconsult
  • 13. Lhasa Limited
  • 14. ACENET / Memorial University / Digital Research Alliance of Canada
  • 15. Friedrich-Schiller-Universität
  • 16. Friedrich-Schiller University, Jena

Description

Summary

Improved abbreviation handling - #991

Abbreviation handling has been tweaked with more and cleaner options:

Abbreviations abbreviations = new Abbreviations();
// abbreviations.setContractToSingleLabel(true); // old (still supported)
abbreviations.with(Abbreviations.Option.ALLOW_SINGLETON); // new
// abbreviations.setContractOnHetero(true); // old (still supported)
abbreviations.with(Abbreviations.Option.AUTO_CONTRACT_HETERO); // new

The full set of options is described in Abbreviations.Option.

More arrow types

Now includes NoGo/Equilibrium/RetroSynthetic arrows - #927. See IReaction.Direction.

Multi-step Reaction SMILES

https://github.com/cdk/cdk/pull/986

A new entry point to the SMILES parser has been added to parse into a "multi-step" reaction, whereby the product of one step is the reactant of the next. The basic idea is to allow more than two '>' characters. Parts at even positions are reactants/products and parts at odd positions are agents/catalysts/solvents.

Basic idea:

SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
IReactionSet rset = sp.parseReactionSetSmiles("[Pb]>>[Ag]>>[Au] lead-to-silver-to-gold");
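
The even/odd rule above can be illustrated without CDK. The following is a minimal sketch in plain Java, not the parser's actual implementation: the class name MultiStepSmilesSketch and the method splitSteps are hypothetical, and the sketch assumes that any title or CXSMILES layer follows the first whitespace and that '>' appears only as a step separator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not part of CDK.
class MultiStepSmilesSketch {

    /**
     * Splits a multi-step reaction SMILES into steps.
     * Each step is a String[3]: {reactants, agents, products}.
     */
    static List<String[]> splitSteps(String input) {
        // Assumption: the title / CXSMILES layer starts at the first whitespace.
        String smiles = input.trim().split("\\s+", 2)[0];
        // Limit -1 keeps empty parts, e.g. the empty agent part in "A>>B".
        String[] parts = smiles.split(">", -1);
        // A valid input has an odd number of parts and at least three of them.
        if (parts.length < 3 || parts.length % 2 == 0) {
            throw new IllegalArgumentException("Expected an even number of '>' separators");
        }
        List<String[]> steps = new ArrayList<>();
        // Even positions are reactants/products, odd positions are agents.
        // The product of one step (parts[i + 2]) is the reactant of the next.
        for (int i = 0; i + 2 < parts.length; i += 2) {
            steps.add(new String[] {parts[i], parts[i + 1], parts[i + 2]});
        }
        return steps;
    }

    public static void main(String[] args) {
        for (String[] step : splitSteps("[Pb]>>[Ag]>>[Au] lead-to-silver-to-gold")) {
            System.out.println(step[0] + " >" + step[1] + "> " + step[2]);
        }
    }
}
```

For the lead-to-silver-to-gold input this yields two steps that share the middle part, [Ag], as the product of the first step and the reactant of the second.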

Real example (see the next section for depiction):

ClC1=NC=2N(C(=C1)N(CC3=CC=CC=C3)CC4=CC=CC=C4)N=CC2C(OCC)=O>C1(=CC(=CC(=N1)C)N)N2C[C@H](CCC2)O.O1CCOCC1.CC1(C2=C(C(=CC=C2)P(C3=CC=CC=C3)C4=CC=CC=C4)OC5=C(C=CC=C15)P(C6=CC=CC=C6)C7=CC=CC=C7)C.C=1C=CC(=CC1)\C=C\C(=O)\C=C\C2=CC=CC=C2.C=1C=CC(=CC1)\C=C\C(=O)\C=C\C2=CC=CC=C2.C=1C=CC(=CC1)\C=C\C(=O)\C=C\C2=CC=CC=C2.[Pd].[Pd].[Cs]OC(=O)O[Cs]>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N(CC4=CC=CC=C4)CC5=CC=CC=C5)N=CC3C(OCC)=O)N6C[C@H](CCC6)O>CO.C1CCOC1.O.O[Li]>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N(CC4=CC=CC=C4)CC5=CC=CC=C5)N=CC3C(O)=O)N6C[C@H](CCC6)O>CN(C)C(=[N+](C)C)ON1C2=C(C=CC=N2)N=N1.F[P-](F)(F)(F)(F)F.[NH4+].[Cl-].CN(C)C=O.CCN(C(C)C)C(C)C>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N(CC4=CC=CC=C4)CC5=CC=CC=C5)N=CC3C(N)=O)N6C[C@H](CCC6)O>>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N)N=CC3C(N)=O)N4C[C@H](CCC4)O |f:4.5.6.7.8,16.17,18.19|  US20190241576A1

Reaction Set and Multi-step depiction

https://github.com/cdk/cdk/pull/986

The DepictionGenerator has been extended to depict reaction sets. If the product of the previous reaction is the same object as the reactant of the next (object identity), it is omitted for a terser depiction.
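
The role of object identity can be sketched in plain Java. This is only an illustration of the rule described above, not the DepictionGenerator's code: the types Mol and Step and the method layout are hypothetical stand-ins, and each step is simplified to a single reactant and a single product.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for illustration only; none of these types are CDK classes.
class TerseStepLayoutSketch {

    /** Stand-in for a molecule object. */
    static final class Mol {
        final String label;
        Mol(String label) { this.label = label; }
    }

    /** Stand-in for one reaction step with one reactant and one product. */
    static final class Step {
        final Mol reactant;
        final Mol product;
        Step(Mol reactant, Mol product) {
            this.reactant = reactant;
            this.product = product;
        }
    }

    /** Returns the sequence of items that would be drawn, left to right. */
    static List<String> layout(List<Step> steps) {
        List<String> drawn = new ArrayList<>();
        Mol previousProduct = null;
        for (Step step : steps) {
            // Reference comparison (!=), not equals(): the reactant is skipped
            // only when it is the very same object as the previous product.
            if (step.reactant != previousProduct) {
                drawn.add(step.reactant.label);
            }
            drawn.add("->");
            drawn.add(step.product.label);
            previousProduct = step.product;
        }
        return drawn;
    }

    public static void main(String[] args) {
        Mol pb = new Mol("Pb"), ag = new Mol("Ag"), au = new Mol("Au");
        System.out.println(layout(List.of(new Step(pb, ag), new Step(ag, au))));
    }
}
```

In this sketch, two steps that share one intermediate object are laid out as a single chain, whereas two distinct objects with the same label would both be drawn.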

More correct PubChemFingerprinter

Explicit hydrogens are no longer required, and there is an option to use a more correct ring set definition that more closely matches the original CACTVS substructure keys. This is now on by default:

IChemObjectBuilder builder = SilentChemObjectBuilder.getInstance();
new PubchemFingerprinter(builder); // new - default is to use the "ESSSR-like" ring set
new PubchemFingerprinter(builder, false); // old - for backwards compatibility with fingerprints generated by older CDK versions

Universal (InChI) SMILES for large molecules - #979

InChI now supports structures with more than 999 atoms. Since we have the option to generate a SMILES using the InChI canonical labelling, it makes sense to use the large molecules flag and support these larger structures as well.

Authors
  75 John Mayfield
  17 Egon Willighagen
   6 Uli Fechner
   4 Mark J. Williamson
   3 Mark Williamson
   1 Parit Bansal
   1 Matthias Mailänder
Full Changelog

Files

cdk/cdk-cdk-2.9.zip (25.9 MB)
md5:941e6d315db935a94c580dc195b3b51b
