Chemical Data Processing Library Python API - Version 1.4.0
CDPL.Descr.CircularFingerprintGenerator Class Reference

Generation of atom-centered circular substructure fingerprints in the spirit of SciTegic's Extended Connectivity Fingerprints (ECFP).


Classes

class  DefAtomIdentifierFunctor
 The default functor for the generation of ECFP atom identifiers.
 
class  DefBondIdentifierFunctor
 The default functor for the generation of bond identifiers.
 

Public Member Functions

None __init__ ()
 Constructs the CircularFingerprintGenerator instance.
 
None __init__ (CircularFingerprintGenerator gen)
 Initializes a copy of the CircularFingerprintGenerator instance gen.
 
None __init__ (Chem.MolecularGraph molgraph)
 Constructs the CircularFingerprintGenerator instance and generates the atom-centered circular substructure fingerprint of the molecular graph molgraph.
 
int getObjectID ()
 Returns the numeric identifier (ID) of the wrapped C++ class instance.
 
None setAtomIdentifierFunction (Chem.SizeTypeAtomMolecularGraphFunctor func)
 Specifies a customized function for the generation of initial atom identifiers.
 
None setBondIdentifierFunction (Chem.UInt64BondFunctor func)
 Specifies a customized function for the generation of initial bond identifiers.
 
None setNumIterations (int num_iter)
 Specifies the desired number of feature substructure growing iterations.
 
int getNumIterations ()
 Returns the number of feature substructure growing iterations.
 
None includeHydrogens (bool include)
 Specifies whether hydrogens shall be included in the generated fingerprint.
 
bool hydrogensIncluded ()
 Tells whether hydrogens are considered during fingerprint generation.
 
None includeChirality (bool include)
 Specifies whether atom stereo configurations shall be incorporated into atom identifiers.
 
bool chiralityIncluded ()
 Tells whether atom chirality is considered during fingerprint generation.
 
None generate (Chem.MolecularGraph molgraph)
 Generates the atom-centered circular substructure fingerprint of the molecular graph molgraph.
 
None setFeatureBits (Util.BitSet bs, bool reset=True)
 Maps previously generated feature identifiers to bit indices and sets the corresponding bits of bs.
 
None setFeatureBits (int atom_idx, Util.BitSet bs, bool reset=True)
 Maps previously generated identifiers of structural features involving the atom specified by atom_idx to bit indices and sets the corresponding bits of bs.
 
int getNumFeatures ()
 Returns the number of features generated by the most recent call to generate().
 
int getFeatureIdentifier (int ftr_idx)
 Returns the identifier of the feature at index ftr_idx.
 
Util.BitSet getFeatureSubstructure (int ftr_idx)
 Returns the atom-bit mask describing the substructure covered by the feature at index ftr_idx.
 
None getFeatureSubstructure (int ftr_idx, Chem.Fragment frag, bool clear=True)
 Extracts the substructure covered by the feature at index ftr_idx into frag.
 
None getFeatureSubstructures (int bit_idx, int bs_size, Chem.FragmentList frags, bool clear=True)
 Extracts the substructures of every feature that, when folded into a bitset of size bs_size, maps to the bit index bit_idx.
 
CircularFingerprintGenerator assign (CircularFingerprintGenerator gen)
 Replaces the current state of self with a copy of the state of the CircularFingerprintGenerator instance gen.
 

Static Public Attributes

int DEF_ATOM_PROPERTY_FLAGS = 3166
 Specifies the default set of atomic properties considered in the generation of atom identifiers by DefAtomIdentifierFunctor.
 
int DEF_BOND_PROPERTY_FLAGS = 10
 Specifies the default set of bond properties considered in the generation of bond identifiers by DefBondIdentifierFunctor.
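
The two constants above read as integer bitmasks. As a minimal sketch, assuming that each set bit stands for one property flag (the mapping of bit positions to named flags is not given on this page), the values can be decomposed like this. The helper set_bit_positions is a hypothetical name introduced only for illustration.

```python
def set_bit_positions(flags):
    """Return the zero-based positions of all set bits in an integer bitmask."""
    return [pos for pos in range(flags.bit_length()) if (flags >> pos) & 1]


# Values copied from the attribute list above.
DEF_ATOM_PROPERTY_FLAGS = 3166
DEF_BOND_PROPERTY_FLAGS = 10

atom_bits = set_bit_positions(DEF_ATOM_PROPERTY_FLAGS)  # [1, 2, 3, 4, 6, 10, 11]
bond_bits = set_bit_positions(DEF_BOND_PROPERTY_FLAGS)  # [1, 3]
```

Under that assumption, the atom default combines seven individual flags and the bond default combines two.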
 

Properties

 objectID = property(getObjectID)
 
 numFeatures = property(getNumFeatures)
 
 numIterations = property(getNumIterations, setNumIterations)
 
 incHydrogens = property(hydrogensIncluded, includeHydrogens)
 
 incChirality = property(chiralityIncluded, includeChirality)
 

Detailed Description

Generation of atom-centered circular substructure fingerprints in the spirit of SciTegic's Extended Connectivity Fingerprints (ECFP).

Starting from initial atom and bond identifiers (generated either by the built-in DefAtomIdentifierFunctor / DefBondIdentifierFunctor or by user-supplied functions), the generator runs a configurable number of growing iterations (see setNumIterations()). Each iteration derives a new set of feature identifiers from the identifiers of the previous iteration and the connecting bonds, capturing circular substructures of increasing radius. The resulting feature identifiers can be folded into a bitset of any size via setFeatureBits().
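
The iteration scheme described above can be illustrated with a toy, pure-Python sketch. It is not the implementation used by CircularFingerprintGenerator: the input format, the use of zlib.crc32 as hash and the function name toy_circular_identifiers are assumptions made for illustration, and the removal of duplicate substructures performed by real ECFP implementations is omitted.

```python
import zlib


def toy_circular_identifiers(atom_ids, bonds, num_iter=2):
    """Return one list of per-atom identifiers for each iteration 0..num_iter.

    atom_ids: initial integer identifier of each atom (list index = atom index)
    bonds:    dict mapping an atom index pair (i, j) to an integer bond identifier
    """
    neighbors = {i: [] for i in range(len(atom_ids))}
    for (i, j), bond_id in bonds.items():
        neighbors[i].append((bond_id, j))
        neighbors[j].append((bond_id, i))

    layers = [list(atom_ids)]  # iteration 0: the initial atom identifiers
    for iteration in range(1, num_iter + 1):
        previous = layers[-1]
        current = []
        for i in range(len(previous)):
            # Sort the (bond id, neighbor id) pairs so that the result
            # does not depend on the order in which bonds were listed.
            environment = sorted((bond_id, previous[j]) for bond_id, j in neighbors[i])
            payload = repr((iteration, previous[i], environment)).encode("ascii")
            current.append(zlib.crc32(payload))  # arbitrary hash, illustration only
        layers.append(current)
    return layers


# Three-atom chain with identical atoms and bonds:
# both terminal atoms see the same environment in every iteration.
layers = toy_circular_identifiers([6, 6, 6], {(0, 1): 1, (1, 2): 1}, num_iter=2)
```

In this toy model the two terminal atoms receive equal identifiers in every iteration, while the central atom receives a different one, which is the behavior expected from a circular, atom-centered scheme.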

See also
[STECFP]

Constructor & Destructor Documentation

◆ __init__() [1/2]

None CDPL.Descr.CircularFingerprintGenerator.__init__ ( CircularFingerprintGenerator  gen)

Initializes a copy of the CircularFingerprintGenerator instance gen.

Parameters
gen: The CircularFingerprintGenerator instance to copy.

◆ __init__() [2/2]

None CDPL.Descr.CircularFingerprintGenerator.__init__ ( Chem.MolecularGraph  molgraph)

Constructs the CircularFingerprintGenerator instance and generates the atom-centered circular substructure fingerprint of the molecular graph molgraph.

Parameters
molgraph: The molecular graph to process.

Member Function Documentation

◆ getObjectID()

int CDPL.Descr.CircularFingerprintGenerator.getObjectID ( )

Returns the numeric identifier (ID) of the wrapped C++ class instance.

Different Python CircularFingerprintGenerator instances may reference the same underlying C++ class instance. The commonly used Python expression a is not b therefore cannot reliably tell whether two CircularFingerprintGenerator instances a and b reference different C++ objects. The numeric identifier returned by this method makes it possible to implement such an identity test correctly via the simple expression a.getObjectID() != b.getObjectID().

Returns
The numeric ID of the internally referenced C++ class instance.
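
A minimal sketch of the identity test described above. The helper name same_wrapped_object is hypothetical; it relies only on the getObjectID() method documented here.

```python
def same_wrapped_object(a, b):
    """Return True if a and b reference the same underlying C++ instance.

    Both arguments are expected to provide getObjectID(), as
    CircularFingerprintGenerator instances do.
    """
    return a.getObjectID() == b.getObjectID()
```

Such a helper would be used in place of a is b wherever two Python wrappers may point to one C++ object.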

◆ setAtomIdentifierFunction()

None CDPL.Descr.CircularFingerprintGenerator.setAtomIdentifierFunction ( Chem.SizeTypeAtomMolecularGraphFunctor  func)

Specifies a customized function for the generation of initial atom identifiers.

Parameters
func: A Chem.SizeTypeAtomMolecularGraphFunctor instance that wraps the target function.
Note
By default, atom identifiers are generated by a CircularFingerprintGenerator.DefAtomIdentifierFunctor instance. If the generated initial identifier for an atom is 0, the atom is regarded as not being present in the processed molecular graph.
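
Because an identifier of 0 marks an atom as absent, a custom function can be used to hide selected atoms. The following is a sketch under several assumptions: the wrapped function is called with (atom, molgraph), the molecular graph offers getAtomIndex(atom), and base_func is any callable with the same call signature (for example a DefAtomIdentifierFunctor instance, if it is callable in that way). The name make_masking_atom_id_func is hypothetical.

```python
def make_masking_atom_id_func(base_func, excluded_indices):
    """Build an atom identifier function that hides the atoms in excluded_indices.

    base_func:        callable (atom, molgraph) -> int providing the regular identifiers
    excluded_indices: iterable of atom indices that shall be treated as absent
    """
    excluded = frozenset(excluded_indices)

    def atom_id(atom, molgraph):
        # getAtomIndex() is assumed to be available on the molecular graph.
        if molgraph.getAtomIndex(atom) in excluded:
            return 0  # identifier 0: atom is regarded as not present
        return base_func(atom, molgraph)

    return atom_id

# The returned callable would still have to be wrapped in a
# Chem.SizeTypeAtomMolecularGraphFunctor before it is passed to
# setAtomIdentifierFunction(); how that wrapper is constructed is not
# covered on this page.
```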

◆ setBondIdentifierFunction()

None CDPL.Descr.CircularFingerprintGenerator.setBondIdentifierFunction ( Chem.UInt64BondFunctor  func)

Specifies a customized function for the generation of initial bond identifiers.

Parameters
func: A Chem.UInt64BondFunctor instance that wraps the target function.
Note
By default, bond identifiers are generated by a CircularFingerprintGenerator.DefBondIdentifierFunctor instance. If the generated initial identifier for a bond is 0, the bond is regarded as not being present in the processed molecular graph.

◆ setNumIterations()

None CDPL.Descr.CircularFingerprintGenerator.setNumIterations ( int  num_iter)

Specifies the desired number of feature substructure growing iterations.

Parameters
num_iter: The number of iterations.
Note
The default number of iterations is 2.
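
ECFP variants are conventionally named after the diameter of the largest substructure, which is twice the number of growing iterations, so the default of 2 corresponds to an ECFP4-style fingerprint. A small sketch of that conversion, assuming that one iteration of this generator extends the radius by one bond; the helper name is hypothetical.

```python
def iterations_for_ecfp_diameter(diameter):
    """Convert an ECFP-style diameter (0, 2, 4, 6, ...) into a number of iterations."""
    if diameter < 0 or diameter % 2 != 0:
        raise ValueError("ECFP diameters are non-negative even numbers")
    return diameter // 2


ecfp4_iterations = iterations_for_ecfp_diameter(4)  # 2, the documented default
ecfp6_iterations = iterations_for_ecfp_diameter(6)  # 3
```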

◆ getNumIterations()

int CDPL.Descr.CircularFingerprintGenerator.getNumIterations ( )

Returns the number of feature substructure growing iterations.

Returns
The number of iterations.

◆ includeHydrogens()

None CDPL.Descr.CircularFingerprintGenerator.includeHydrogens ( bool  include)

Specifies whether hydrogens shall be included in the generated fingerprint.

Parameters
include: If True, hydrogens are considered as regular atoms during fingerprint generation.

◆ hydrogensIncluded()

bool CDPL.Descr.CircularFingerprintGenerator.hydrogensIncluded ( )

Tells whether hydrogens are considered during fingerprint generation.

Returns
True if hydrogens are considered, and False otherwise.

◆ includeChirality()

None CDPL.Descr.CircularFingerprintGenerator.includeChirality ( bool  include)

Specifies whether atom stereo configurations shall be incorporated into atom identifiers.

Parameters
include: If True, atom chirality is considered during fingerprint generation.

◆ chiralityIncluded()

bool CDPL.Descr.CircularFingerprintGenerator.chiralityIncluded ( )

Tells whether atom chirality is considered during fingerprint generation.

Returns
True if atom chirality is considered, and False otherwise.

◆ generate()

None CDPL.Descr.CircularFingerprintGenerator.generate ( Chem.MolecularGraph  molgraph)

Generates the atom-centered circular substructure fingerprint of the molecular graph molgraph.

Parameters
molgraph: The molecular graph to process.
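
A typical call sequence around generate() can be sketched as follows. The function uses only methods documented on this page; the name compute_fingerprint is hypothetical, and the creation and preparation of the molecular graph and of the bitset are left to the caller because they are not covered here.

```python
def compute_fingerprint(gen, molgraph, bitset, num_iter=2,
                        hydrogens=False, chirality=False):
    """Configure gen, process molgraph and fold the features into bitset.

    gen:      a CircularFingerprintGenerator instance
    molgraph: the Chem.MolecularGraph to process
    bitset:   a Util.BitSet whose size determines the fingerprint length
    """
    gen.setNumIterations(num_iter)
    gen.includeHydrogens(hydrogens)
    gen.includeChirality(chirality)
    gen.generate(molgraph)      # computes the feature identifiers
    gen.setFeatureBits(bitset)  # reset defaults to True, so bitset is cleared first
    return bitset

# With CDPL installed, the generator itself would be created along these lines
# (the module alias is a convention, not something this page prescribes):
#   import CDPL.Descr as Descr
#   gen = Descr.CircularFingerprintGenerator()
```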

◆ setFeatureBits() [1/2]

None CDPL.Descr.CircularFingerprintGenerator.setFeatureBits ( Util.BitSet  bs, bool  reset = True )

Maps previously generated feature identifiers to bit indices and sets the corresponding bits of bs.

Parameters
bs: The target bitset.
reset: If True, bs will be cleared before any feature bits are set.
Note
The binary fingerprint size is specified implicitly via the size of bs.
See also
generate()
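
Since the fingerprint size follows from the size of bs, feature identifiers that are larger than the bitset have to share bit positions. The toy sketch below illustrates this with a plain remainder operation. That mapping is an assumption chosen for illustration; the mapping used internally by setFeatureBits() may differ. The name fold_identifiers is hypothetical.

```python
def fold_identifiers(feature_ids, num_bits):
    """Map each identifier to a bit index in [0, num_bits) by taking the remainder."""
    if num_bits <= 0:
        raise ValueError("num_bits must be positive")
    return sorted({feature_id % num_bits for feature_id in feature_ids})


# Three identifiers, but only two bits are set: 5 and 1029 collide at index 5.
bits = fold_identifiers([5, 1029, 7], 1024)  # [5, 7]
```

Smaller bitsets therefore produce more collisions, which is the usual trade-off between fingerprint length and resolution.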

◆ setFeatureBits() [2/2]

None CDPL.Descr.CircularFingerprintGenerator.setFeatureBits ( int  atom_idx, Util.BitSet  bs, bool  reset = True )

Maps previously generated identifiers of structural features involving the atom specified by atom_idx to bit indices and sets the corresponding bits of bs.

Parameters
atom_idx: The index of the atom that has to be involved in the structural features.
bs: The target bitset.
reset: If True, bs will be cleared before any feature bits are set.
Note
The binary fingerprint size is specified implicitly via the size of bs.
See also
generate()
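
The atom-specific overload can be used to collect one fingerprint per atom after a single generate() call. A minimal sketch: the helper name per_atom_fingerprints and the new_bitset factory are hypothetical, and the factory is expected to return a fresh bitset of the desired size on every call.

```python
def per_atom_fingerprints(gen, num_atoms, new_bitset):
    """Return a list with one bitset per atom index in range(num_atoms).

    gen:        a CircularFingerprintGenerator on which generate() was already called
    new_bitset: zero-argument callable returning an empty bitset of the desired size
    """
    result = []
    for atom_idx in range(num_atoms):
        bs = new_bitset()
        gen.setFeatureBits(atom_idx, bs)  # only features involving atom_idx
        result.append(bs)
    return result
```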

◆ getNumFeatures()

int CDPL.Descr.CircularFingerprintGenerator.getNumFeatures ( )

Returns the number of features generated by the most recent call to generate().

Returns
The number of features.

◆ getFeatureIdentifier()

int CDPL.Descr.CircularFingerprintGenerator.getFeatureIdentifier ( int  ftr_idx)

Returns the identifier of the feature at index ftr_idx.

Parameters
ftr_idx: The zero-based feature index.
Returns
The feature identifier.
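
Together with getNumFeatures(), this method allows all feature identifiers of the most recent generate() call to be collected. A minimal sketch; the helper name feature_identifiers is hypothetical.

```python
def feature_identifiers(gen):
    """Return the identifiers of all features currently held by gen as a list."""
    return [gen.getFeatureIdentifier(i) for i in range(gen.getNumFeatures())]
```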

◆ getFeatureSubstructure() [1/2]

Util.BitSet CDPL.Descr.CircularFingerprintGenerator.getFeatureSubstructure ( int  ftr_idx)

Returns the atom-bit mask describing the substructure covered by the feature at index ftr_idx.

In the returned bitset, the bit at position i is set if the atom with index i is part of the feature substructure.

Parameters
ftr_idx: The zero-based feature index.
Returns
A reference to the atom-bit mask.
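
The returned mask can be turned into a list of atom indices. The sketch below assumes that the bitset offers getSize() and test(i); both names are assumptions about Util.BitSet that are not documented on this page. The helper name feature_atom_indices is hypothetical.

```python
def feature_atom_indices(gen, ftr_idx):
    """Return the indices of all atoms covered by the feature at index ftr_idx."""
    mask = gen.getFeatureSubstructure(ftr_idx)
    # getSize() and test() are assumed Util.BitSet methods.
    return [i for i in range(mask.getSize()) if mask.test(i)]
```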

◆ getFeatureSubstructure() [2/2]

None CDPL.Descr.CircularFingerprintGenerator.getFeatureSubstructure ( int  ftr_idx, Chem.Fragment  frag, bool  clear = True )

Extracts the substructure covered by the feature at index ftr_idx into frag.

Parameters
ftr_idx: The zero-based feature index.
frag: The output fragment.
clear: If True, frag is cleared before atoms and bonds are added.

◆ getFeatureSubstructures()

None CDPL.Descr.CircularFingerprintGenerator.getFeatureSubstructures ( int  bit_idx, int  bs_size, Chem.FragmentList  frags, bool  clear = True )

Extracts the substructures of every feature that, when folded into a bitset of size bs_size, maps to the bit index bit_idx.

Parameters
bit_idx: The target bit index.
bs_size: The bitset size used for the folding.
frags: The output fragment list.
clear: If True, frags is cleared before any fragments are appended.
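
This method is useful for interpreting individual fingerprint bits. The sketch below collects the substructures behind several bits; the helper name substructures_by_bit and the new_fragment_list factory are hypothetical. For the result to match a given fingerprint, bs_size presumably has to equal the size of the bitset that was passed to setFeatureBits().

```python
def substructures_by_bit(gen, bit_indices, bs_size, new_fragment_list):
    """Return a dict mapping each bit index to the fragment list of its features.

    gen:               a CircularFingerprintGenerator on which generate() was called
    bit_indices:       iterable of bit indices to explain
    bs_size:           size of the bitset used for folding
    new_fragment_list: zero-argument callable returning an empty fragment list
    """
    result = {}
    for bit_idx in bit_indices:
        frags = new_fragment_list()
        gen.getFeatureSubstructures(bit_idx, bs_size, frags)  # clear defaults to True
        result[bit_idx] = frags
    return result
```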

◆ assign()

CircularFingerprintGenerator CDPL.Descr.CircularFingerprintGenerator.assign ( CircularFingerprintGenerator  gen)

Replaces the current state of self with a copy of the state of the CircularFingerprintGenerator instance gen.

Parameters
gen: The CircularFingerprintGenerator instance to copy.
Returns
self