Chemical Data Processing Library C++ API - Version 1.4.0
CDPL::Descr::PathFingerprintGenerator Class Reference

Generation of Daylight-style path fingerprints of molecular graphs.

#include <PathFingerprintGenerator.hpp>

Classes

class  DefAtomDescriptorFunctor
 The default functor for the generation of atom descriptors.
 
class  DefBondDescriptorFunctor
 The default functor for the generation of bond descriptors.
 

Public Types

typedef std::function< std::uint64_t(const Chem::Atom &)> AtomDescriptorFunction
 Type of the generic functor class used to store user-defined functions or function objects for the generation of atom descriptors.
 
typedef std::function< std::uint64_t(const Chem::Bond &)> BondDescriptorFunction
 Type of the generic functor class used to store user-defined functions or function objects for the generation of bond descriptors.
 

Public Member Functions

 PathFingerprintGenerator ()
 Constructs the PathFingerprintGenerator instance.
 
 PathFingerprintGenerator (const Chem::MolecularGraph &molgraph, Util::BitSet &fp)
 Constructs the PathFingerprintGenerator instance and generates the fingerprint of the molecular graph molgraph.
 
void setAtomDescriptorFunction (const AtomDescriptorFunction &func)
 Specifies a custom function for the generation of atom descriptors.
 
void setBondDescriptorFunction (const BondDescriptorFunction &func)
 Specifies a custom function for the generation of bond descriptors.
 
void setMinPathLength (std::size_t min_length)
 Specifies the minimum length a path must have to contribute to the generated fingerprint.
 
std::size_t getMinPathLength () const
 Returns the minimum length a path must have to contribute to the generated fingerprint.
 
void setMaxPathLength (std::size_t max_length)
 Specifies the maximum considered path length.
 
std::size_t getMaxPathLength () const
 Returns the maximum considered path length.
 
void includeHydrogens (bool include)
 Specifies whether hydrogens shall be considered during path enumeration.
 
bool hydrogensIncluded () const
 Tells whether hydrogens are considered during path enumeration.
 
void generate (const Chem::MolecularGraph &molgraph, Util::BitSet &fp)
 Generates the fingerprint of the molecular graph molgraph.
 

Static Public Attributes

static constexpr unsigned int DEF_ATOM_PROPERTY_FLAGS
 Specifies the default set of atomic properties considered in the generation of atom descriptors by PathFingerprintGenerator::DefAtomDescriptorFunctor.
 
static constexpr unsigned int DEF_BOND_PROPERTY_FLAGS
 Specifies the default set of bond properties considered in the generation of bond descriptors by PathFingerprintGenerator::DefBondDescriptorFunctor.
 

Detailed Description

Generation of Daylight-style path fingerprints of molecular graphs.

The generator enumerates linear paths of alternating atoms and bonds whose length (in number of bonds) lies between the configured minimum and maximum, derives a hash for each path from per-atom and per-bond descriptors (provided by DefAtomDescriptorFunctor and DefBondDescriptorFunctor by default, or by user-supplied functions), and folds the hashes into the output bitset fp, which may have any size.

See also
[DTPFP]
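
The enumerate, hash, and fold steps described above can be sketched with standard-library types only. This is an illustration of the general Daylight-style technique, not CDPL's implementation: ToyGraph, ToyBond, mix() and toyPathFingerprint() are hypothetical names, the hash is an FNV-1a-style combiner picked for brevity, and each path is hashed once per direction instead of being canonicalized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins for a molecular graph (not CDPL types):
// every atom and bond is reduced to its precomputed descriptor.
struct ToyBond { std::size_t a, b; std::uint64_t descr; };
struct ToyGraph {
    std::vector<std::uint64_t> atomDescr;
    std::vector<ToyBond>       bonds;
};

// Order-dependent hash step in the style of FNV-1a (illustrative only).
inline std::uint64_t mix(std::uint64_t h, std::uint64_t v) {
    return (h ^ v) * 1099511628211ULL;
}

// Walks every simple path from every start atom, hashes the alternating
// atom/bond descriptors, and sets bit (hash % fp.size()) for each path
// whose bond count lies in [minLen, maxLen].
void toyPathFingerprint(const ToyGraph& g, std::size_t minLen,
                        std::size_t maxLen, std::vector<bool>& fp) {
    assert(!fp.empty());
    std::vector<bool> visited(g.atomDescr.size(), false);

    std::function<void(std::size_t, std::size_t, std::uint64_t)> walk =
        [&](std::size_t atom, std::size_t len, std::uint64_t hash) {
            hash = mix(hash, g.atomDescr[atom]);
            if (len >= minLen)
                fp[hash % fp.size()] = true;   // fold the hash into the bitset
            if (len == maxLen)
                return;                        // do not extend past the maximum
            visited[atom] = true;
            for (const ToyBond& b : g.bonds) {
                std::size_t next;
                if (b.a == atom)      next = b.b;
                else if (b.b == atom) next = b.a;
                else                  continue;
                if (visited[next])
                    continue;                  // keep the path simple
                walk(next, len + 1, mix(hash, b.descr));
            }
            visited[atom] = false;             // backtrack
        };

    for (std::size_t i = 0; i < g.atomDescr.size(); ++i)
        walk(i, 0, 14695981039346656037ULL);
}
```

Calling toyPathFingerprint() twice on the same graph with the same settings yields the same bitset, and a window with minLen greater than maxLen sets no bits at all. A smaller bitset raises the chance that two different paths fold onto the same bit.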

Member Typedef Documentation

◆ AtomDescriptorFunction

typedef std::function<std::uint64_t(const Chem::Atom&)> CDPL::Descr::PathFingerprintGenerator::AtomDescriptorFunction

Type of the generic functor class used to store user-defined functions or function objects for the generation of atom descriptors.

Functions or function objects for the generation of atom descriptors are required to take the atom (as a const reference to Chem::Atom) as argument and return the descriptor as an integer of type std::uint64_t (see [FUNWRP]).

◆ BondDescriptorFunction

typedef std::function<std::uint64_t(const Chem::Bond&)> CDPL::Descr::PathFingerprintGenerator::BondDescriptorFunction

Type of the generic functor class used to store user-defined functions or function objects for the generation of bond descriptors.

Functions or function objects for the generation of bond descriptors are required to take the bond (as a const reference to Chem::Bond) as argument and return the descriptor as an integer of type std::uint64_t (see [FUNWRP]).
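
Because both typedefs are std::function specializations, any callable with a matching signature can be stored: a free function, a stateful function object, or a lambda. The sketch below shows this with a hypothetical ToyAtom struct and a hypothetical ToyGenerator class standing in for Chem::Atom and PathFingerprintGenerator, so that it compiles without the library. The descriptor layouts are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for Chem::Atom (not a CDPL type).
struct ToyAtom { unsigned type; int formalCharge; bool aromatic; };

// Same shape as AtomDescriptorFunction, but over the stand-in atom type.
using ToyAtomDescriptorFunction = std::function<std::uint64_t(const ToyAtom&)>;

// 1) A free function: the descriptor is just the element type.
std::uint64_t elementOnly(const ToyAtom& atom) { return atom.type; }

// 2) A function object with state: optionally packs the formal charge.
struct PackedDescriptor {
    bool useCharge;

    std::uint64_t operator()(const ToyAtom& atom) const {
        std::uint64_t d = atom.type;
        d = (d << 1) | (atom.aromatic ? 1u : 0u);
        if (useCharge)
            d = (d << 8) | static_cast<std::uint8_t>(atom.formalCharge);
        return d;
    }
};

// Hypothetical consumer that stores whichever callable it is given.
class ToyGenerator {
public:
    void setAtomDescriptorFunction(const ToyAtomDescriptorFunction& func) {
        assert(func != nullptr);   // an empty std::function would throw on call
        func_ = func;
    }

    std::uint64_t describe(const ToyAtom& atom) const { return func_(atom); }

private:
    ToyAtomDescriptorFunction func_ = elementOnly;
};
```

A lambda such as `[](const ToyAtom& a) -> std::uint64_t { return a.aromatic ? 1 : 0; }` can be passed to setAtomDescriptorFunction() in the same way. Coarser descriptors make more atoms indistinguishable, which produces fewer distinct path hashes.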

Constructor & Destructor Documentation

◆ PathFingerprintGenerator() [1/2]

CDPL::Descr::PathFingerprintGenerator::PathFingerprintGenerator ( )

Constructs the PathFingerprintGenerator instance.

◆ PathFingerprintGenerator() [2/2]

CDPL::Descr::PathFingerprintGenerator::PathFingerprintGenerator ( const Chem::MolecularGraph & molgraph,
Util::BitSet & fp 
)

Constructs the PathFingerprintGenerator instance and generates the fingerprint of the molecular graph molgraph.

Parameters
molgraph  The molecular graph for which to generate the fingerprint.
fp  The generated fingerprint.

Member Function Documentation

◆ setAtomDescriptorFunction()

void CDPL::Descr::PathFingerprintGenerator::setAtomDescriptorFunction ( const AtomDescriptorFunction & func)

Specifies a custom function for the generation of atom descriptors.

Parameters
func  A PathFingerprintGenerator::AtomDescriptorFunction instance that wraps the target function.
Note
By default, atom descriptors are generated by PathFingerprintGenerator::DefAtomDescriptorFunctor.

◆ setBondDescriptorFunction()

void CDPL::Descr::PathFingerprintGenerator::setBondDescriptorFunction ( const BondDescriptorFunction & func)

Specifies a custom function for the generation of bond descriptors.

Parameters
func  A PathFingerprintGenerator::BondDescriptorFunction instance that wraps the target function.
Note
By default, bond descriptors are generated by PathFingerprintGenerator::DefBondDescriptorFunctor.

◆ setMinPathLength()

void CDPL::Descr::PathFingerprintGenerator::setMinPathLength ( std::size_t  min_length)

Specifies the minimum length a path must have to contribute to the generated fingerprint.

Any path whose length (in number of bonds) is lower than the specified minimum length will not be represented by a corresponding bit in the generated fingerprint.

Parameters
min_length  The minimum path length in number of bonds.
Note
By default, the minimum path length is set to 0.

◆ getMinPathLength()

std::size_t CDPL::Descr::PathFingerprintGenerator::getMinPathLength ( ) const

Returns the minimum length a path must have to contribute to the generated fingerprint.

Returns
The minimum path length in number of bonds.
See also
setMinPathLength()

◆ setMaxPathLength()

void CDPL::Descr::PathFingerprintGenerator::setMaxPathLength ( std::size_t  max_length)

Specifies the maximum considered path length.

Any path whose length (in number of bonds) is greater than the specified maximum length will not be represented by a corresponding bit in the generated fingerprint.

Parameters
max_length  The maximum path length in number of bonds.
Note
By default, the maximum considered path length is 5.
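
Taken together, the minimum and maximum settings define an inclusive window on the bond count of a path. The sketch below works this out for an unbranched chain, assuming that a path of length 0 is a single atom and that each path is counted once regardless of direction. PathLengthWindow, countChainPaths() and checkFourAtomChain() are hypothetical helpers, not part of the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the documented min/max settings.
struct PathLengthWindow {
    std::size_t minLength = 0;   // documented default minimum
    std::size_t maxLength = 5;   // documented default maximum

    // A path contributes when its bond count lies in [minLength, maxLength].
    bool contributes(std::size_t numBonds) const {
        return numBonds >= minLength && numBonds <= maxLength;
    }
};

// Counts the undirected linear paths of an unbranched chain of numAtoms
// atoms that fall inside the window. Such a chain contains
// (numAtoms - k) paths with k bonds; k = 0 denotes a single atom.
std::size_t countChainPaths(std::size_t numAtoms, const PathLengthWindow& w) {
    std::size_t total = 0;
    for (std::size_t len = 0; len < numAtoms; ++len)
        if (w.contributes(len))
            total += numAtoms - len;
    return total;
}

// Worked example: a four-atom chain, such as the heavy atoms of butane.
void checkFourAtomChain() {
    PathLengthWindow w;                    // window 0..5
    assert(countChainPaths(4, w) == 10);   // 4 + 3 + 2 + 1
    w.minLength = 1;                       // drop single-atom paths
    assert(countChainPaths(4, w) == 6);    // 3 + 2 + 1
    w.maxLength = 2;                       // drop the three-bond path
    assert(countChainPaths(4, w) == 5);    // 3 + 2
}
```

Raising the minimum removes short, unspecific paths from the fingerprint, while lowering the maximum bounds the enumeration cost, which grows quickly with path length in branched or ring-rich graphs.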

◆ getMaxPathLength()

std::size_t CDPL::Descr::PathFingerprintGenerator::getMaxPathLength ( ) const

Returns the maximum considered path length.

Returns
The maximum path length in number of bonds.
See also
setMaxPathLength()

◆ includeHydrogens()

void CDPL::Descr::PathFingerprintGenerator::includeHydrogens ( bool  include)

Specifies whether hydrogens shall be considered during path enumeration.

Parameters
include  If true, hydrogens are considered as regular atoms during fingerprint generation.
Since
1.3

◆ hydrogensIncluded()

bool CDPL::Descr::PathFingerprintGenerator::hydrogensIncluded ( ) const

Tells whether hydrogens are considered during path enumeration.

Returns
true if hydrogens are considered, and false otherwise.
Since
1.3

◆ generate()

void CDPL::Descr::PathFingerprintGenerator::generate ( const Chem::MolecularGraph & molgraph,
Util::BitSet & fp 
)

Generates the fingerprint of the molecular graph molgraph.

Parameters
molgraph  The molecular graph for which to generate the fingerprint.
fp  The generated fingerprint.

Member Data Documentation

◆ DEF_ATOM_PROPERTY_FLAGS

constexpr unsigned int CDPL::Descr::PathFingerprintGenerator::DEF_ATOM_PROPERTY_FLAGS
static constexpr

Initial value: a combination of the following atom property flags:

FORMAL_CHARGE  Specifies the formal charge of an atom. (Definition: Chem/AtomPropertyFlag.hpp:73)
AROMATICITY  Specifies the membership of an atom in aromatic rings. (Definition: Chem/AtomPropertyFlag.hpp:93)
ISOTOPE  Specifies the isotopic mass of an atom. (Definition: Chem/AtomPropertyFlag.hpp:68)
TYPE  Specifies the generic type or element of an atom. (Definition: Chem/AtomPropertyFlag.hpp:63)

Specifies the default set of atomic properties considered in the generation of atom descriptors by PathFingerprintGenerator::DefAtomDescriptorFunctor.

◆ DEF_BOND_PROPERTY_FLAGS

constexpr unsigned int CDPL::Descr::PathFingerprintGenerator::DEF_BOND_PROPERTY_FLAGS
static constexpr

Initial value: a combination of the following bond property flags:

AROMATICITY  Specifies the membership of a bond in aromatic rings. (Definition: BondPropertyFlag.hpp:73)
ORDER  Specifies the order of a bond. (Definition: BondPropertyFlag.hpp:63)
TOPOLOGY  Specifies the ring/chain topology of a bond. (Definition: BondPropertyFlag.hpp:68)

Specifies the default set of bond properties considered in the generation of bond descriptors by PathFingerprintGenerator::DefBondDescriptorFunctor.
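
Both constants hold a set of property flags in a single unsigned int. Assuming the usual convention of one bit per flag combined with bitwise OR, the effect of such a set on a descriptor can be sketched as follows. The numeric flag values, the ToyAtomProps struct, and the describeAtom() packing scheme are all hypothetical; the real constants are defined in the headers listed above and the default functors may compute descriptors differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration; the real constants are
// defined in Chem/AtomPropertyFlag.hpp and their values may differ.
namespace ToyAtomPropertyFlag {
    constexpr unsigned int TYPE          = 0x1;
    constexpr unsigned int ISOTOPE       = 0x2;
    constexpr unsigned int FORMAL_CHARGE = 0x4;
    constexpr unsigned int AROMATICITY   = 0x8;
}

// Same four properties as listed for DEF_ATOM_PROPERTY_FLAGS.
constexpr unsigned int TOY_DEF_ATOM_PROPERTY_FLAGS =
    ToyAtomPropertyFlag::TYPE | ToyAtomPropertyFlag::ISOTOPE |
    ToyAtomPropertyFlag::FORMAL_CHARGE | ToyAtomPropertyFlag::AROMATICITY;

// Hypothetical stand-in for the atom properties (not a CDPL type).
struct ToyAtomProps { unsigned type; unsigned isotope; int formalCharge; bool aromatic; };

// Packs only those properties whose flag is set into the descriptor, so
// atoms differing in an unselected property get the same descriptor.
std::uint64_t describeAtom(const ToyAtomProps& atom,
                           unsigned int flags = TOY_DEF_ATOM_PROPERTY_FLAGS) {
    assert(flags != 0);   // at least one property must be selected
    std::uint64_t d = 0;
    if (flags & ToyAtomPropertyFlag::TYPE)
        d = (d << 8) | (atom.type & 0xFFu);
    if (flags & ToyAtomPropertyFlag::ISOTOPE)
        d = (d << 8) | (atom.isotope & 0xFFu);
    if (flags & ToyAtomPropertyFlag::FORMAL_CHARGE)
        d = (d << 8) | static_cast<std::uint8_t>(atom.formalCharge);
    if (flags & ToyAtomPropertyFlag::AROMATICITY)
        d = (d << 1) | (atom.aromatic ? 1u : 0u);
    return d;
}
```

With the full flag set, a carbon-12 and a carbon-13 atom receive different descriptors. Clearing the isotope bit with `TOY_DEF_ATOM_PROPERTY_FLAGS & ~ToyAtomPropertyFlag::ISOTOPE` makes them identical, so paths through either atom would hash to the same value.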


The documentation for this class was generated from the following file: