Chemical Data Processing Library C++ API - Version 1.4.0
Generation of atom-centered circular substructure fingerprints in the spirit of SciTegic's Extended Connectivity Fingerprints (ECFP).
#include <CircularFingerprintGenerator.hpp>
Classes | |
| class | DefAtomIdentifierFunctor |
| The functor for the generation of ECFP atom identifiers. | |
| class | DefBondIdentifierFunctor |
| The default functor for the generation of bond identifiers. | |
Public Types | |
| typedef std::function< std::uint64_t(const Chem::Atom &, const Chem::MolecularGraph &)> | AtomIdentifierFunction |
| Type of the generic functor class used to store user-defined functions or function objects for the generation of atom identifiers. | |
| typedef std::function< std::uint64_t(const Chem::Bond &)> | BondIdentifierFunction |
| Type of the generic functor class used to store user-defined functions or function objects for the generation of bond identifiers. | |
Public Member Functions | |
| CircularFingerprintGenerator () | |
| Constructs the CircularFingerprintGenerator instance. | |
| CircularFingerprintGenerator (const Chem::MolecularGraph &molgraph) | |
| Constructs the CircularFingerprintGenerator instance and generates the atom-centered circular substructure fingerprint of the molecular graph molgraph. | |
| void | setAtomIdentifierFunction (const AtomIdentifierFunction &func) |
| Specifies a customized function for the generation of initial atom identifiers. | |
| void | setBondIdentifierFunction (const BondIdentifierFunction &func) |
| Specifies a customized function for the generation of initial bond identifiers. | |
| void | setNumIterations (std::size_t num_iter) |
| Specifies the desired number of feature substructure growing iterations. | |
| std::size_t | getNumIterations () const |
| Returns the number of feature substructure growing iterations. | |
| void | includeHydrogens (bool include) |
| Specifies whether hydrogens shall be included in the generated fingerprint. | |
| bool | hydrogensIncluded () const |
| Tells whether hydrogens are considered during fingerprint generation. | |
| void | includeChirality (bool include) |
| Specifies whether atom stereo configurations shall be incorporated into atom identifiers. | |
| bool | chiralityIncluded () const |
| Tells whether atom chirality is considered during fingerprint generation. | |
| void | generate (const Chem::MolecularGraph &molgraph) |
| Generates the atom-centered circular substructure fingerprint of the molecular graph molgraph. | |
| void | setFeatureBits (Util::BitSet &bs, bool reset=true) const |
| Maps previously generated feature identifiers to bit indices and sets the corresponding bits of bs. | |
| void | setFeatureBits (std::size_t atom_idx, Util::BitSet &bs, bool reset=true) const |
| Maps previously generated identifiers of structural features involving the atom specified by atom_idx to bit indices and sets the corresponding bits of bs. | |
| std::size_t | getNumFeatures () const |
| Returns the number of features generated by the most recent call to generate(). | |
| std::uint64_t | getFeatureIdentifier (std::size_t ftr_idx) const |
| Returns the identifier of the feature at index ftr_idx. | |
| const Util::BitSet & | getFeatureSubstructure (std::size_t ftr_idx) const |
| Returns the atom-bit mask describing the substructure covered by the feature at index ftr_idx. | |
| void | getFeatureSubstructure (std::size_t ftr_idx, Chem::Fragment &frag, bool clear=true) const |
| Extracts the substructure covered by the feature at index ftr_idx into frag. | |
| void | getFeatureSubstructures (std::size_t bit_idx, std::size_t bs_size, Chem::FragmentList &frags, bool clear=true) const |
| Extracts the substructures of every feature that, when folded into a bitset of size bs_size, maps to the bit index bit_idx. | |
Static Public Attributes | |
| static constexpr unsigned int | DEF_ATOM_PROPERTY_FLAGS |
| Specifies the default set of atomic properties considered in the generation of atom identifiers by DefAtomIdentifierFunctor. | |
| static constexpr unsigned int | DEF_BOND_PROPERTY_FLAGS |
| Specifies the default set of bond properties considered in the generation of bond identifiers by DefBondIdentifierFunctor. | |
Generation of atom-centered circular substructure fingerprints in the spirit of SciTegic's Extended Connectivity Fingerprints (ECFP).
Starting from initial atom and bond identifiers (generated either by the built-in DefAtomIdentifierFunctor / DefBondIdentifierFunctor or by user-supplied functions), the generator runs a configurable number of growing iterations (see setNumIterations()). Each iteration derives a new set of feature identifiers from the identifiers of the previous iteration and the connecting bonds, capturing circular substructures of increasing radius. The resulting feature identifiers can be folded into a bitset of any size via setFeatureBits().
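The growing iterations described above can be illustrated with a small, library-independent sketch. It is not the implementation used by CircularFingerprintGenerator: the `ToyGraph` adjacency list, the `mix()` hash and the `growFeatures()` function are hypothetical stand-ins, and the removal of duplicate substructures that ECFP-style algorithms usually perform is omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a molecular graph: graph[i] lists
// (neighbor atom index, bond identifier) pairs of atom i.
using ToyGraph = std::vector<std::vector<std::pair<std::size_t, std::uint64_t>>>;

// Hypothetical hash-combine step (an assumption, not the library's hash).
inline std::uint64_t mix(std::uint64_t seed, std::uint64_t value)
{
    return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

// Runs num_iter growing iterations. The result holds the initial atom
// identifiers first, followed by one batch of identifiers per iteration.
std::vector<std::uint64_t> growFeatures(const ToyGraph& graph,
                                        std::vector<std::uint64_t> atom_ids,
                                        std::size_t num_iter)
{
    assert(graph.size() == atom_ids.size());

    std::vector<std::uint64_t> features(atom_ids);

    for (std::size_t iter = 1; iter <= num_iter; ++iter) {
        std::vector<std::uint64_t> next_ids(atom_ids.size());

        for (std::size_t i = 0; i < graph.size(); ++i) {
            // Collect (bond identifier, neighbor identifier) pairs and sort
            // them so that the result does not depend on neighbor order.
            std::vector<std::pair<std::uint64_t, std::uint64_t>> env;

            for (const auto& nbr : graph[i])
                env.emplace_back(nbr.second, atom_ids[nbr.first]);

            std::sort(env.begin(), env.end());

            // Combine the iteration number, the atom's previous identifier
            // and the sorted environment into the new identifier.
            std::uint64_t id = mix(iter, atom_ids[i]);

            for (const auto& entry : env)
                id = mix(mix(id, entry.first), entry.second);

            next_ids[i] = id;
        }

        features.insert(features.end(), next_ids.begin(), next_ids.end());
        atom_ids = std::move(next_ids);
    }

    return features;
}
```

For a graph with N atoms, the sketch returns N * (num_iter + 1) identifiers, because each iteration adds one identifier per atom and covers an environment one bond larger than the previous one.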
| typedef std::function<std::uint64_t(const Chem::Atom&, const Chem::MolecularGraph&)> CDPL::Descr::CircularFingerprintGenerator::AtomIdentifierFunction |
Type of the generic functor class used to store user-defined functions or function objects for the generation of atom identifiers.
Functions or function objects for the generation of atom identifiers are required to take the atom (as a const reference to Chem::Atom) and the containing molecular graph (as a const reference to Chem::MolecularGraph) as arguments and return the identifier as an integer of type std::uint64_t (see [FUNWRP]).
| typedef std::function<std::uint64_t(const Chem::Bond&)> CDPL::Descr::CircularFingerprintGenerator::BondIdentifierFunction |
Type of the generic functor class used to store user-defined functions or function objects for the generation of bond identifiers.
Functions or function objects for the generation of bond identifiers are required to take the bond (as a const reference to Chem::Bond) as argument and return the identifier as an integer of type std::uint64_t (see [FUNWRP]).
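The following sketch shows the kinds of callables that fit the two function types described above. To stay independent of the library, it substitutes hypothetical stand-in types (`ToyAtom`, `ToyBond`, `ToyMolGraph`) for Chem::Atom, Chem::Bond and Chem::MolecularGraph. Only the shape of the signatures is taken from the typedefs; the identifier schemes themselves are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for Chem::Atom, Chem::Bond and Chem::MolecularGraph.
struct ToyAtom { unsigned atomic_no; int formal_charge; };
struct ToyBond { unsigned order; bool aromatic; };
struct ToyMolGraph {};

// Same shapes as the two typedefs, with the stand-in types substituted.
using ToyAtomIdentifierFunction =
    std::function<std::uint64_t(const ToyAtom&, const ToyMolGraph&)>;
using ToyBondIdentifierFunction =
    std::function<std::uint64_t(const ToyBond&)>;

// A function object that identifies atoms by element and formal charge only.
struct ElementChargeIdentifier
{
    std::uint64_t operator()(const ToyAtom& atom, const ToyMolGraph&) const
    {
        assert(atom.formal_charge >= -128 && atom.formal_charge < 128);

        // Offset the charge so that negative values do not wrap around.
        return (std::uint64_t(atom.atomic_no) << 8) |
               std::uint64_t(atom.formal_charge + 128);
    }
};

// A plain function: aromatic bonds get a dedicated identifier.
std::uint64_t bondOrderIdentifier(const ToyBond& bond)
{
    return bond.aromatic ? 4u : bond.order;
}
```

A function object, a function pointer or a lambda with a matching signature can each be assigned to the corresponding std::function type, which is what makes setAtomIdentifierFunction() and setBondIdentifierFunction() accept any of them.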
| CDPL::Descr::CircularFingerprintGenerator::CircularFingerprintGenerator | ( | ) |
Constructs the CircularFingerprintGenerator instance.
| CDPL::Descr::CircularFingerprintGenerator::CircularFingerprintGenerator | ( | const Chem::MolecularGraph & | molgraph | ) |
Constructs the CircularFingerprintGenerator instance and generates the atom-centered circular substructure fingerprint of the molecular graph molgraph.
| molgraph | The molecular graph to process. |
| void CDPL::Descr::CircularFingerprintGenerator::setAtomIdentifierFunction | ( | const AtomIdentifierFunction & | func | ) |
Specifies a customized function for the generation of initial atom identifiers.
| func | A CircularFingerprintGenerator::AtomIdentifierFunction instance that wraps the target function. |
| void CDPL::Descr::CircularFingerprintGenerator::setBondIdentifierFunction | ( | const BondIdentifierFunction & | func | ) |
Specifies a customized function for the generation of initial bond identifiers.
| func | A CircularFingerprintGenerator::BondIdentifierFunction instance that wraps the target function. |
| void CDPL::Descr::CircularFingerprintGenerator::setNumIterations | ( | std::size_t | num_iter | ) |
Specifies the desired number of feature substructure growing iterations.
| num_iter | The number of iterations. |
| std::size_t CDPL::Descr::CircularFingerprintGenerator::getNumIterations | ( | ) | const |
Returns the number of feature substructure growing iterations.
| void CDPL::Descr::CircularFingerprintGenerator::includeHydrogens | ( | bool | include | ) |
Specifies whether hydrogens shall be included in the generated fingerprint.
| include | If true, hydrogens are considered as regular atoms during fingerprint generation. |
| bool CDPL::Descr::CircularFingerprintGenerator::hydrogensIncluded | ( | ) | const |
Tells whether hydrogens are considered during fingerprint generation.
Returns true if hydrogens are considered, and false otherwise.
| void CDPL::Descr::CircularFingerprintGenerator::includeChirality | ( | bool | include | ) |
Specifies whether atom stereo configurations shall be incorporated into atom identifiers.
| include | If true, atom chirality is considered during fingerprint generation. |
| bool CDPL::Descr::CircularFingerprintGenerator::chiralityIncluded | ( | ) | const |
Tells whether atom chirality is considered during fingerprint generation.
Returns true if atom chirality is considered, and false otherwise.
| void CDPL::Descr::CircularFingerprintGenerator::generate | ( | const Chem::MolecularGraph & | molgraph | ) |
Generates the atom-centered circular substructure fingerprint of the molecular graph molgraph.
| molgraph | The molecular graph to process. |
| void CDPL::Descr::CircularFingerprintGenerator::setFeatureBits | ( | Util::BitSet & | bs, | bool | reset = true | ) | const |
Maps previously generated feature identifiers to bit indices and sets the corresponding bits of bs.
| bs | The target bitset. |
| reset | If true, bs will be cleared before any feature bits are set. |
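As an illustration of what folding identifiers into a fixed-size bitset means, the sketch below maps each 64-bit identifier to a bit index by taking it modulo the bitset size. The modulo mapping is an assumption made for this example; the mapping actually used by setFeatureBits() is not specified here and may differ. `foldFeatureBits()` is a hypothetical helper, and std::vector<bool> stands in for Util::BitSet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sets one bit per feature identifier, using identifier modulo bitset size
// as the bit index (an assumed mapping, chosen for illustration only).
void foldFeatureBits(const std::vector<std::uint64_t>& feature_ids,
                     std::vector<bool>& bits, bool reset = true)
{
    assert(!bits.empty());

    // Mirrors the documented reset flag: clear all bits before setting any.
    if (reset)
        bits.assign(bits.size(), false);

    for (std::uint64_t id : feature_ids)
        bits[id % bits.size()] = true;
}
```

Because several identifiers can land on the same bit, a smaller bitset gives a more compact fingerprint at the cost of more collisions. Passing `reset = false` lets the bits of several fingerprints accumulate in one bitset.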
| void CDPL::Descr::CircularFingerprintGenerator::setFeatureBits | ( | std::size_t | atom_idx, | Util::BitSet & | bs, | bool | reset = true | ) | const |
Maps previously generated identifiers of structural features involving the atom specified by atom_idx to bit indices and sets the corresponding bits of bs.
| atom_idx | The index of the atom that has to be involved in the structural features. |
| bs | The target bitset. |
| reset | If true, bs will be cleared before any feature bits are set. |
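The atom-restricted variant can be pictured as the same folding step applied only to features whose substructure contains the given atom. The sketch below is a hypothetical, library-independent illustration: `ToyFeature` pairs an identifier with an atom-bit mask, std::vector<bool> stands in for Util::BitSet, and the modulo mapping is again an assumption rather than the documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record pairing a feature identifier with its atom-bit mask.
struct ToyFeature
{
    std::uint64_t     id;
    std::vector<bool> atom_mask; // bit i set: atom i belongs to the feature
};

// Sets the bits of all features whose substructure contains atom_idx.
void foldFeatureBitsForAtom(const std::vector<ToyFeature>& features,
                            std::size_t atom_idx,
                            std::vector<bool>& bits, bool reset = true)
{
    assert(!bits.empty());

    if (reset)
        bits.assign(bits.size(), false);

    for (const ToyFeature& ftr : features) {
        // Skip features whose substructure does not contain the atom.
        if (atom_idx >= ftr.atom_mask.size() || !ftr.atom_mask[atom_idx])
            continue;

        bits[ftr.id % bits.size()] = true; // assumed modulo folding
    }
}
```

Comparing the bits obtained for one atom with the full fingerprint shows which bits that atom contributes to.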
| std::size_t CDPL::Descr::CircularFingerprintGenerator::getNumFeatures | ( | ) | const |
Returns the number of features generated by the most recent call to generate().
| std::uint64_t CDPL::Descr::CircularFingerprintGenerator::getFeatureIdentifier | ( | std::size_t | ftr_idx | ) | const |
Returns the identifier of the feature at index ftr_idx.
| ftr_idx | The zero-based feature index. |
| const Util::BitSet& CDPL::Descr::CircularFingerprintGenerator::getFeatureSubstructure | ( | std::size_t | ftr_idx | ) | const |
Returns the atom-bit mask describing the substructure covered by the feature at index ftr_idx.
In the returned bitset, the bit at position i is set if the atom with index i is part of the feature substructure.
| ftr_idx | The zero-based feature index. |
Returns a const reference to the atom-bit mask.
| void CDPL::Descr::CircularFingerprintGenerator::getFeatureSubstructure | ( | std::size_t | ftr_idx, | Chem::Fragment & | frag, | bool | clear = true | ) | const |
Extracts the substructure covered by the feature at index ftr_idx into frag.
| ftr_idx | The zero-based feature index. |
| frag | The output fragment. |
| clear | If true, frag is cleared before atoms and bonds are added. |
| void CDPL::Descr::CircularFingerprintGenerator::getFeatureSubstructures | ( | std::size_t | bit_idx, | std::size_t | bs_size, | Chem::FragmentList & | frags, | bool | clear = true | ) | const |
Extracts the substructures of every feature that, when folded into a bitset of size bs_size, maps to the bit index bit_idx.
| bit_idx | The target bit index. |
| bs_size | The bitset size used for the folding. |
| frags | The output fragment list. |
| clear | If true, frags is cleared before any fragments are appended. |
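The lookup performed here is the inverse of folding: given a bit index and a bitset size, find every feature that would set that bit. A minimal sketch of this lookup follows. It assumes a modulo mapping from identifier to bit index, which is an illustrative choice and not necessarily the one the library uses, and `featuresAtBit()` is a hypothetical helper that returns feature indices instead of fragments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the indices of all features whose identifier folds to bit_idx
// in a bitset of size bs_size (assumed mapping: identifier modulo size).
std::vector<std::size_t> featuresAtBit(const std::vector<std::uint64_t>& feature_ids,
                                       std::size_t bit_idx, std::size_t bs_size)
{
    assert(bs_size > 0 && bit_idx < bs_size);

    std::vector<std::size_t> hits;

    for (std::size_t i = 0; i < feature_ids.size(); ++i)
        if (feature_ids[i] % bs_size == bit_idx)
            hits.push_back(i);

    return hits;
}
```

More than one feature can be returned for a single bit, which is why the documented function fills a fragment list rather than a single fragment. The value passed as bs_size has to match the size of the bitset that was used for folding, otherwise the bit indices do not correspond.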
| static constexpr unsigned int CDPL::Descr::CircularFingerprintGenerator::DEF_ATOM_PROPERTY_FLAGS |
Specifies the default set of atomic properties considered in the generation of atom identifiers by DefAtomIdentifierFunctor.
| static constexpr unsigned int CDPL::Descr::CircularFingerprintGenerator::DEF_BOND_PROPERTY_FLAGS |
Specifies the default set of bond properties considered in the generation of bond identifiers by DefBondIdentifierFunctor.