Chem
The synkit.Chem module provides tools for reaction SMILES processing, including atom‐map canonicalization, equivalence validation, and SMILES standardization.
Canonicalization
The class CanonRSMI
standardizes reaction SMILES and atom-map indices by computing a canonical relabeling of mapped atoms. It employs a Weisfeiler–Lehman colour-refinement back-end (default: 3 iterations) to ensure that each atom-map assignment is uniquely and consistently ordered across isomorphic reactions [1].
from synkit.Chem.Reaction import CanonRSMI
canon = CanonRSMI(backend='wl', wl_iterations=3)
canon.canonicalise(
'[CH3:1][CH:2]=[O:3].[CH:4]([H:7])([H:8])[CH:5]=[O:6]'
'>>'
'[CH3:1][CH:2]=[CH:4][CH:5]=[O:6].[O:3]([H:7])([H:8])'
)
print(canon.canonical_rsmi)
>> '[CH:3]([CH3:7])=[O:8].[H:1][CH:4]([H:2])[CH:6]=[O:5]>>[CH:3](=[CH:4][CH:6]=[O:5])[CH3:7].[H:1][O:8][H:2]'
AAM comparison
The class AAMValidator
verifies atom‐map equivalence by constructing an Imaginary Transition State (ITS) graph for each reaction and testing graph isomorphism via NetworkX’s VF2 algorithm. This approach ensures that two atom‐mapped reactions produce identical ITS topologies before and after mapping [2].
from synkit.Chem.Reaction import AAMValidator
validator = AAMValidator()
rsmi_1 = (
'[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]'
'>>'
'[CH3:1][C:2](=[O:3])[O:6][CH3:5].[OH2:4]'
)
rsmi_2 = (
'[CH3:5][C:1](=[O:2])[OH:3].[CH3:6][OH:4]'
'>>'
'[CH3:5][C:1](=[O:2])[O:4][CH3:6].[OH2:3]'
)
is_eq = validator.smiles_check(rsmi_1, rsmi_2, check_method='ITS')
print(is_eq)
>> True
Standardization
The class Standardize
cleans and normalizes reaction SMILES by applying RDKit sanitization, removing atom‐map annotations, and stripping stereochemical labels. Its configurable options allow you to toggle atom‐map removal and stereo‐ignoring treatments to produce a minimal, canonical SMILES representation suitable for downstream processing.```
from synkit.Chem.Reaction.standardize import Standardize
std = Standardize()
rsmi = (
'[CH3:1][CH:2]=[O:3].[CH:4]([H:7])([H:8])[CH:5]=[O:6]'
'>>'
'[CH3:1][CH:2]=[CH:4][CH:5]=[O:6].[O:3]([H:7])([H:8])'
)
std_rsmi = std.fit(rsmi, remove_aam=True, ignore_stereo=True)
print(std_rsmi)
>> 'CC=O.CC=O>>CC=CC=O.O'
See Also
synkit.Graph
— graph modeling