cdk/cdk: CDK 2.9
Creators
- John Mayfield (1)
- Egon Willighagen (2)
- Rajarshi Guha
- gilleain torrance
- Kazuya Ujihara
- Syed Asad Rahman (3)
- Jonathan Alvarsson
- Mark J. Williamson (4)
- Saulius Gražulis
- Danny Katzel (5)
- Tomáš Pluskal (6)
- Uli (7)
- Xavier Linn
- Yap Chun Wei (8)
- Daniel Szisz
- Nikolay Kochev (9)
- Alex Clark (10)
- Arvid Berg
- Eric Bach (11)
- Nina Jeliazkova (12)
- Ralf Stephan
- Jeffrey Plante (13)
- Klas Jönsson
- Krishna Dole
- Oliver Stueker (14)
- Nicolas Alfonso De Pineda Gutierrez
- Michael Wenk (15)
- kaibioinfo (16)
- AndyHowlettGitHub
- Dmitry Katsubo
- 1. NextMove Software ltd
- 2. @BiGCAT-UM
- 3. EMBL-EBI
- 4. @vernalis
- 5. @ncats
- 6. Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences
- 7. @pendingai
- 8. National University of Singapore
- 9. University of Plovdiv
- 10. Molecular Materials Informatics, Inc.
- 11. Aalto University
- 12. Ideaconsult Ltd. @ideaconsult
- 13. Lhasa Limited
- 14. ACENET / Memorial University / Digital Research Alliance of Canada
- 15. Friedrich-Schiller-Universität
- 16. Friedrich-Schiller University, Jena
Description
Summary
- Improved abbreviation handling
- More arrow types
- Multi-step Reaction SMILES
- Reaction Set and Multi-step depiction
- More correct PubChemFingerprinter
- Universal (InChI) SMILES for large molecules
- Dependency updates and stability improvements
Improved abbreviation handling
Abbreviations abbreviations = new Abbreviations();
// abbreviations.setContractToSingleLabel(true); // old (still supported)
abbreviations.with(Abbreviations.Option.ALLOW_SINGLETON); // new
// abbreviations.setContractOnHetero(true); // old (still supported)
abbreviations.with(Abbreviations.Option.AUTO_CONTRACT_HETERO); // new
The full set of options is described in Abbreviations.Option.
More arrow types
Now includes NoGo/Equilibrium/RetroSynthetic arrows (#927). See IReaction.Direction.
Multi-step Reaction SMILES
https://github.com/cdk/cdk/pull/986
A new entry point to the SMILES parser has been added to parse into a "multi-step" reaction, whereby the product of one step is the reactant of the next. The basic idea is to allow more than two '>'. Parts at even positions are reactants/products and parts at odd positions are agents/catalysts/solvents.
Basic idea:
SmilesParser sp = new SmilesParser(SilentChemObjectBuilder.getInstance());
IReactionSet rset = sp.parseReactionSetSmiles("[Pb]>>[Ag]>>[Au] lead-to-silver-to-gold");
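The even/odd rule can be illustrated without the CDK at all. The following is only a sketch in plain Java, not the CDK parser: the class `MultiStepSmilesSketch` and its `split` method are hypothetical names, the parts are kept as raw strings, and anything after the first space (title, CXSMILES layer) is simply discarded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not part of the CDK.
final class MultiStepSmilesSketch {

    /** One step, kept as raw SMILES strings: reactants > agents > products. */
    static final class Step {
        final String reactants;
        final String agents;
        final String products;

        Step(String reactants, String agents, String products) {
            this.reactants = reactants;
            this.agents = agents;
            this.products = products;
        }

        @Override
        public String toString() {
            return reactants + ">" + agents + ">" + products;
        }
    }

    /** Splits a multi-step reaction SMILES into its individual steps. */
    static List<Step> split(String input) {
        // Assumption: everything after the first space is a title or
        // CXSMILES layer, which this sketch ignores.
        String smiles = input.trim();
        int space = smiles.indexOf(' ');
        if (space >= 0)
            smiles = smiles.substring(0, space);

        // Limit -1 keeps empty parts, e.g. the empty agents in "[Pb]>>[Ag]".
        String[] parts = smiles.split(">", -1);
        if (parts.length < 3 || parts.length % 2 == 0)
            throw new IllegalArgumentException("Expected an even number of '>': " + smiles);

        // Even positions are reactants/products, odd positions are agents.
        // The product of step n is reused as the reactant of step n+1.
        List<Step> steps = new ArrayList<>();
        for (int i = 0; i + 2 < parts.length; i += 2)
            steps.add(new Step(parts[i], parts[i + 1], parts[i + 2]));
        return steps;
    }

    public static void main(String[] args) {
        for (Step step : split("[Pb]>>[Ag]>>[Au] lead-to-silver-to-gold"))
            System.out.println(step);
    }
}
```

Running `main` prints the two single-step reactions `[Pb]>>[Ag]` and `[Ag]>>[Au]`, with silver appearing once as a product and once as a reactant.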
Real example (see next bullet for depiction):
ClC1=NC=2N(C(=C1)N(CC3=CC=CC=C3)CC4=CC=CC=C4)N=CC2C(OCC)=O>C1(=CC(=CC(=N1)C)N)N2C[C@H](CCC2)O.O1CCOCC1.CC1(C2=C(C(=CC=C2)P(C3=CC=CC=C3)C4=CC=CC=C4)OC5=C(C=CC=C15)P(C6=CC=CC=C6)C7=CC=CC=C7)C.C=1C=CC(=CC1)\C=C\C(=O)\C=C\C2=CC=CC=C2.C=1C=CC(=CC1)\C=C\C(=O)\C=C\C2=CC=CC=C2.C=1C=CC(=CC1)\C=C\C(=O)\C=C\C2=CC=CC=C2.[Pd].[Pd].[Cs]OC(=O)O[Cs]>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N(CC4=CC=CC=C4)CC5=CC=CC=C5)N=CC3C(OCC)=O)N6C[C@H](CCC6)O>CO.C1CCOC1.O.O[Li]>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N(CC4=CC=CC=C4)CC5=CC=CC=C5)N=CC3C(O)=O)N6C[C@H](CCC6)O>CN(C)C(=[N+](C)C)ON1C2=C(C=CC=N2)N=N1.F[P-](F)(F)(F)(F)F.[NH4+].[Cl-].CN(C)C=O.CCN(C(C)C)C(C)C>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N(CC4=CC=CC=C4)CC5=CC=CC=C5)N=CC3C(N)=O)N6C[C@H](CCC6)O>>C1(=CC(=CC(=N1)C)NC2=NC=3N(C(=C2)N)N=CC3C(N)=O)N4C[C@H](CCC4)O |f:4.5.6.7.8,16.17,18.19| US20190241576A1
Reaction Set and Multi-step depiction
https://github.com/cdk/cdk/pull/986
The DepictionGenerator has been extended to depict reaction sets. If the product of one reaction is the same object as the reactant of the next (object identity), it is omitted for a terser depiction.
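The identity check can be sketched in plain Java. This is not the CDK depiction code: `TerseSequenceSketch`, `Step` and `layout` are hypothetical names, and plain objects stand in for molecules. The point is only that the reactant is skipped when it is the very same object (`==`) as the previous product, not merely an equal one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the real layout logic lives in the CDK.
final class TerseSequenceSketch {

    /** A simplified step with a single reactant and a single product. */
    static final class Step {
        final Object reactant;
        final Object product;

        Step(Object reactant, Object product) {
            this.reactant = reactant;
            this.product = product;
        }
    }

    /** Returns the cells to draw, left to right, with "->" marking an arrow. */
    static List<String> layout(List<Step> steps) {
        List<String> cells = new ArrayList<>();
        Object previousProduct = null;
        for (Step step : steps) {
            // Identity (==) rather than equals(): only the same object is omitted.
            if (step.reactant != previousProduct)
                cells.add(String.valueOf(step.reactant));
            cells.add("->");
            cells.add(String.valueOf(step.product));
            previousProduct = step.product;
        }
        return cells;
    }

    public static void main(String[] args) {
        String lead = "Pb", silver = "Ag", gold = "Au";
        List<Step> steps = new ArrayList<>();
        steps.add(new Step(lead, silver));
        steps.add(new Step(silver, gold)); // same object as the previous product
        System.out.println(String.join(" ", layout(steps)));
    }
}
```

With the shared `silver` object the intermediate is drawn once. If the second step instead held a separate copy of the intermediate, the sketch would draw it twice, which is why parsing a sequence into shared objects matters for the terser layout.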
More correct PubChemFingerprinter
Explicit hydrogens are no longer required, and there is an option to use a more correct ring set definition that more closely matches the original CACTVS substructure keys. This is now on by default:
IChemObjectBuilder builder = SilentChemObjectBuilder.getInstance();
new PubchemFingerprinter(builder); // new - default is to use "ESSSR-like" ring set
new PubchemFingerprinter(builder, false); // old - for backwards compatibility with fingerprints generated by older CDK versions
Universal (InChI) SMILES for large molecules
#979.
InChI now supports more than 999 atoms. Since we have the option to generate a SMILES using the InChI canonical labelling, it makes sense to turn on the large molecules flag and support larger structures.
Authors
75 John Mayfield
17 Egon Willighagen
6 Uli Fechner
4 Mark J. Williamson
3 Mark Williamson
1 Parit Bansal
1 Matthias Mailänder
Full Changelog
- added test cases to SeedGeneratorTest for a negative and a positive return value from AtomEncoder, respectively Uli Fechner on 2022-09-19
- Update the Version tag and the sonatype URL. John Mayfield on 2022-09-19
- nexusUrl also needs updating John Mayfield on 2022-09-19
- New CDK Development Version (2.9-SNAPSHOT) John Mayfield on 2022-09-19
- added javadoc for some methods, re-factored a few methods to avoid code duplication Uli Fechner on 2022-10-20
- Additional reaction arrow types. John Mayfield on 2022-11-03
- Slight x adjustment to the resonance arrow. John Mayfield on 2022-11-12
- Add a default implementation for IAtomColor John Mayfield on 2022-11-12
- CXSMILES needs to escape ":", add some special semantics if we see a Data-Sgroup starting with "cdk." treat this as "cdk:" which is what we use in the CDKConstants. John Mayfield on 2022-11-12
- Add a new property to capture the arrow type coming from CXSMILES. We could also store this in CTFile and on CXSMILES write John Mayfield on 2022-11-12
- Correct dimension calculations now we have nudged the side components/agents up slightly John Mayfield on 2022-11-12
- Remove redundant convention from the CML, there is a CDKConvention which is not used but it does nothing with the substructureList. John Mayfield on 2022-11-13
- Generate OSGi metadata. Matthias Mailänder on 2022-11-25
- A new contributor Egon Willighagen on 2022-11-26
- InChI mass delta (isotopic shift) is based on a table rather than major isotopes. Mostly the same when there is a major isotope, but not when an element does not have a major isotope (e.g. Tc) John Mayfield on 2022-11-26
- BFS paths, when we update our paths here we need to overwrite the number of paths as well. John Mayfield on 2022-11-26
- Change SonarCloud key John Mayfield on 2022-11-27
- Update README.md John Mayfield on 2022-11-27
- added test cases to BasicMoleculeHashGeneratorTest; added javadoc to MoleculeHashGenerator Uli Fechner on 2022-11-30
- Support non-sequential atom indices in MDL V3000 inputs, fixes #943. John Mayfield on 2022-12-01
- Newer XOM version Egon Willighagen on 2022-12-02
- Removed the Xerces and Xalan dependencies Egon Willighagen on 2022-12-02
- CMLXOM 4.4 Egon Willighagen on 2022-12-02
- Set environment as SonarCloud John Mayfield on 2022-12-05
- added maven plugin to generate checksum for bundle uber-jar Uli Fechner on 2023-01-11
- added test case for MDLV2000Reader that gives rise to ArrayIndexOutOfBoundsException; some basic code clean up in MDLV2000ReaderTest Uli Fechner on 2023-01-17
- Remove the SonarCloud environment - it did not fix the issue of access to the SONAR_TOKEN when we approved it. John Mayfield on 2023-01-17
- Bounds check before access the atoms for a bond, this is a fatal error we can not recover from and should throw/stop parsing even in relaxed mode, handleError can not capture that yet. John Mayfield on 2023-01-18
- added verification of valid ranges for atom indices when reading bond block in MDLV2000Reader; added test case with invalid atom indices above valid threshold Uli Fechner on 2023-01-19
- Bump log4j to 2.19.0 Mark J. Williamson on 2023-01-28
- Bump JUnit to 5.9.2 Mark J. Williamson on 2023-01-28
- Bump log4j to 2.19.0 Mark J. Williamson on 2023-01-28
- Bump JUnit to 5.9.2 Mark J. Williamson on 2023-01-28
- New commits, new copyright statement Egon Willighagen on 2023-01-28
- Bumped the run version in the example to match the latest stable release Egon Willighagen on 2023-01-28
- Update mockito-core to 4.11.0 Mark Williamson on 2023-01-28
- Update slf4j to 2.0.6 Mark Williamson on 2023-01-28
- Update javacc-maven-plugin to 3.0.1 Mark Williamson on 2023-01-28
- Fix IteratingSdfReader - Ensure extra empty lines don't affect the property reading of the SDF property fields. John Mayfield on 2023-03-27
- Log4j 2.20.0 and CMLXOM 4.5 Egon Willighagen on 2023-04-10
- JNA InChI 1.2 Egon Willighagen on 2023-04-10
- Modern (safer) way of using codecov Egon Willighagen on 2023-04-10
- Updated xml-apis to 2.0.2 Egon Willighagen on 2023-04-10
- Build deps updates Egon Willighagen on 2023-04-10
- Upgrade to BEAM 1.3.5 - Fix aromatic phosphorus "https://github.com/johnmay/beam/commit/aafd244f2f83ba0983dd5810398cf4cd2ea13b16" John Mayfield on 2023-04-13
- Improved PubChem Fingerprinter through comparison to the CACTVS_SUBSTRUCTURE_KEYS field in PubChem SDfiles. - Add an option to use the more correct "ESSSR" ring set - now we have this option - Count implicit hydrogens, SMARTS patterns are already implicit/explicit ambivalent - Use integers for atomic number count lookups - Tweak the "isSaturated" definition based on checked examples John Mayfield on 2023-04-27
- propagating molecule id in svg to the corresponding title element as a class to connect the title and molecule Parit Bansal on 2023-06-22
- Turn on the large molecules flag in InChI Universal SMILES numbers. John Mayfield on 2023-06-23
- Smaller test example size John Mayfield on 2023-06-23
- Larger stack size John Mayfield on 2023-06-23
- Set the stack size in the argLine. John Mayfield on 2023-06-23
- Rewrite of the test because on my machine (today) the bounding box was '0 0 28 34'; I relaxed the expectations a bit Egon Willighagen on 2023-06-23
- Update documentation on PubChem fingerprinter. John Mayfield on 2023-06-28
- Updated the AUTHORS for recent work Egon Willighagen on 2023-06-28
- XOM 1.3.9 Egon Willighagen on 2023-07-08
- JUnit 5.10.0-RC1 Egon Willighagen on 2023-07-08
- Apache Felix OSGi Core 1.4.0 Egon Willighagen on 2023-07-09
- SMILES extension to store a multi-step reaction. John Mayfield on 2023-08-01
- Use ReactionManipulator to simplify the CXSMILES layer application John Mayfield on 2023-08-01
- Some refactoring to make it easy to lay out multiple reactions. John Mayfield on 2023-08-01
- Simplify the layout of reactions with a common method for both vector graphics and raster images. John Mayfield on 2023-08-01
- Allow the ReactionBounds to determine the final arrow size. Padding is also now scaled like bond length. John Mayfield on 2023-08-01
- Preliminary depiction for reaction sets. John Mayfield on 2023-08-01
- Minor formatting change, wrap the parameters. John Mayfield on 2023-08-01
- Detect a sequence of reactions and lay them out without replicating the intermediate compounds. John Mayfield on 2023-08-01
- Fix a corner case now we have a null arrow. John Mayfield on 2023-08-01
- Handle CXSMILES on multistep reaction SMILES. John Mayfield on 2023-08-01
- Fix typo, we need to grab all the product labels not just the first one John Mayfield on 2023-08-01
- Fix an issue with PDF rescaling, MM_TO_POINT when dimensions are automatic. John Mayfield on 2023-08-01
- AtomNumber in reactions has been wrong for a while, track the number with a counter class and ensure in sequence reactions they are numbered correctly (i.e. no gaps). The highlighting should also be cleared after all the bounds are generated John Mayfield on 2023-08-01
- Fix a corner case with empty reactants/products. John Mayfield on 2023-08-01
- Improved positioning of the parts below the arrow (currently just the conditions) John Mayfield on 2023-08-01
- Fix SVG,PX unit spacing. Now padding is scaled we need to unscale it first. John Mayfield on 2023-08-01
- Cleanup some code smells. John Mayfield on 2023-08-01
- Ensure scaling works correctly for raster images. John Mayfield on 2023-08-01
- Fix a slight issue with the bounds/resizing calculations of reaction depiction layouts. We now factor in the side component padding at a earlier stage so don't need to shuffle things along later. John Mayfield on 2023-08-06
- Slight improvement in selecting how we rotate a structure during layout. Instead of using all bonds in the molecule we only use non-hidden (in abbreviations) bonds and non-terminal. John Mayfield on 2023-08-06
- For n=1 one we want bonds to be left to right and not aligned at 30 deg, e.g. "Cl-Ph". John Mayfield on 2023-08-06
- sonarcloud wants J17 now Egon Willighagen on 2023-08-06
- Respect the contractSingleFragments option. John Mayfield on 2023-08-08
- Use an enum set, we will offer more options for contraction. John Mayfield on 2023-08-08
- Allow contraction on "terminal" carbons. John Mayfield on 2023-08-08
- Support contraction on a single bond, Ph2 and (NEt2)2 for example. John Mayfield on 2023-08-08
- Allow full contraction when there are multiple connected groups. This can still be controlled with the appropriate option. John Mayfield on 2023-08-08
- Add a helper class AdjacentGroup to record information about the abbreviation we are attempting to make. This allow us to automatically generate better abbreviations like Et3P instead of PEt3 (carbons come first). Also add in the option to turn on/off linker contractions John Mayfield on 2023-08-08
- Additional unit tests for abbreviations. John Mayfield on 2023-08-08
- Allow PhCl to be generated John Mayfield on 2023-08-08
- Improve the logic for when an abbreviation is applied or not. John Mayfield on 2023-08-08
- Simplify the logic here. John Mayfield on 2023-08-08
- Minor optimisation, only substitutes with more than one atoms should be considered. John Mayfield on 2023-08-08
- Add the option to keep atoms together, for now this simply blocks any abbreviations on those atoms. This is useful for highlighting. John Mayfield on 2023-08-08
- Instead of a set, using a mapping to indicate what things should be grouped or not. John Mayfield on 2023-08-08
- Do not allow contraction across stereochemistry central atom/bonds. John Mayfield on 2023-08-09
- Better evasion of ambiguity with cases like NiPr being N iPr or Ni Pr. John Mayfield on 2023-08-09
- Another minor tweak, -COEt is better than -CEtO John Mayfield on 2023-08-10
- Blocking stereochemistry atoms, should happen after we have done the complete fragment lookup John Mayfield on 2023-08-10
- Update README.md John Mayfield on 2023-08-10
- These are not tests - there are no assertions. SonarCloud recommended @Nested but the correct fix is to remove the @Test John Mayfield on 2023-08-10
- Fix possible divide by 0? John Mayfield on 2023-08-10
- Other occurrences of (possible?) divide by 0. John Mayfield on 2023-08-10
- More redundant test annotations John Mayfield on 2023-08-10
- Just avoid the division and use a constant. John Mayfield on 2023-08-10
- Final 3 "bugs" from sonarcloud John Mayfield on 2023-08-10
- Suppress bundle plugin about 'pom' projects. John Mayfield on 2023-08-21
- Small tweak to depiction orientation on small molecules, if all the bonds are aligned (aligned == bonds.size()) this is a better selection even if the value is small. John Mayfield on 2023-08-21
- Fix MayGen test which hard coded exact coordinates, these now changed. John Mayfield on 2023-08-21
- CDK 2.9 John Mayfield on 2023-08-21
Files (25.9 MB)
Name | Size |
---|---|
cdk/cdk-cdk-2.9.zip (md5:941e6d315db935a94c580dc195b3b51b) | 25.9 MB |
Additional details
Related works
- Is supplement to
- https://github.com/cdk/cdk/tree/cdk-2.9 (URL)