Chemical Data Processing Library C++ API - Version 1.4.0
CDPL::Descr Namespace Reference

Contains classes and functions related to the generation and processing of pharmacophore and molecule descriptors. More...

Classes

class  AtomAutoCorrelation3DVectorCalculator
 AtomAutoCorrelation3DVectorCalculator. More...
 
class  AtomRDFCodeCalculator
 RDFCodeCalculator implementation for the calculation of atom-centered radial distribution function (RDF) codes of chemical structures. More...
 
class  AutoCorrelation2DVectorCalculator
 AutoCorrelation2DVectorCalculator. More...
 
class  AutoCorrelation3DVectorCalculator
 AutoCorrelation3DVectorCalculator. More...
 
class  BCUTDescriptorCalculator
 BCUTDescriptorCalculator. More...
 
class  BulkSimilarityCalculator
 
class  BurdenMatrixGenerator
 BurdenMatrixGenerator. More...
 
class  CircularFingerprintGenerator
 CircularFingerprintGenerator. More...
 
class  FeatureAutoCorrelation3DVectorCalculator
 FeatureAutoCorrelation3DVectorCalculator. More...
 
class  FeatureRDFCodeCalculator
 RDFCodeCalculator implementation for the calculation of feature-centered radial distribution function (RDF) codes of pharmacophores. More...
 
class  KuvekPocketDescriptorCalculator
 Implements the algorithm devised by Kuvek et al. [KBPD] for the calculation of receptor binding pocket shape and surface electrostatics descriptors. More...
 
class  MACCSFingerprintGenerator
 Generation of 166-bit MACCS key fingerprints. More...
 
class  MolecularComplexityCalculator
 MolecularComplexityCalculator. More...
 
class  MoleculeAutoCorr2DDescriptorCalculator
 MoleculeAutoCorr2DDescriptorCalculator. More...
 
class  MoleculeAutoCorr3DDescriptorCalculator
 MoleculeAutoCorr3DDescriptorCalculator. More...
 
class  MoleculeRDFDescriptorCalculator
 MoleculeRDFDescriptorCalculator. More...
 
class  NPoint2DPharmacophoreFingerprintGenerator
 NPoint2DPharmacophoreFingerprintGenerator. More...
 
class  NPoint3DPharmacophoreFingerprintGenerator
 NPoint3DPharmacophoreFingerprintGenerator. More...
 
class  NPointPharmacophoreFingerprintGenerator
 NPointPharmacophoreFingerprintGenerator. More...
 
class  PathFingerprintGenerator
 PathFingerprintGenerator. More...
 
class  PharmacophoreAutoCorr3DDescriptorCalculator
 PharmacophoreAutoCorr3DDescriptorCalculator. More...
 
class  PharmacophoreRDFDescriptorCalculator
 PharmacophoreRDFDescriptorCalculator. More...
 
class  PubChemFingerprintGenerator
 Generation of 881-bit PubChem fingerprints. More...
 
class  RDFCodeCalculator
 Generic implementation of the radial distribution function (RDF) code calculation for sequences of entities of arbitrary type. More...
 
class  TanimotoSimilarity
 Functor class for calculating Tanimoto Similarities [CITB] of bitsets and vectors. More...
 
class  CosineSimilarity
 Functor class for calculating Cosine Similarities [WCOS] of bitsets and vectors. More...
 
class  EuclideanSimilarity
 Functor class for calculating the Euclidean Similarity [GSIM] of bitsets. More...
 
class  ManhattanSimilarity
 Functor class for calculating the Manhattan Similarity [GSIM] of bitsets. More...
 
class  DiceSimilarity
 Functor class for calculating the Dice Similarity [GSIM] of bitsets. More...
 
class  TverskySimilarity
 Functor class for calculating the Tversky Similarity [GSIM] of bitsets. More...
 
class  HammingDistance
 Functor class for calculating the Hamming Distance [WHAM, CITB] between bitsets. More...
 
class  ManhattanDistance
 Functor class for calculating the Manhattan Distance [MADI] between bitsets and vectors. More...
 
class  EuclideanDistance
 Functor class for calculating the Euclidean Distance [CITB] between bitsets and vectors. More...
 

Functions

CDPL_DESCR_API double calcGeometricalRadius (const Chem::AtomContainer &cntnr, const Chem::Atom3DCoordinatesFunction &coords_func)
 Calculates the geometrical radius of the atoms in cntnr. More...
 
CDPL_DESCR_API double calcGeometricalDiameter (const Chem::AtomContainer &cntnr, const Chem::Atom3DCoordinatesFunction &coords_func)
 Calculates the geometrical diameter of the atoms in cntnr. More...
 
CDPL_DESCR_API double calcGeometricalRadius (const Chem::Entity3DContainer &cntnr)
 
CDPL_DESCR_API double calcGeometricalDiameter (const Chem::Entity3DContainer &cntnr)
 
CDPL_DESCR_API std::size_t calcTopologicalRadius (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API std::size_t calcTopologicalDiameter (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcRingComplexity (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcMolecularComplexity (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcKierShape1 (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcKierShape2 (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcKierShape3 (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API std::size_t calcWienerIndex (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcRandicIndex (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API std::size_t calcZagrebIndex1 (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API std::size_t calcZagrebIndex2 (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API std::size_t calcTotalWalkCount (const Chem::MolecularGraph &molgraph)
 
CDPL_DESCR_API double calcTanimotoSimilarity (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Tanimoto Similarity [CITB] of the bitsets bs1 and bs2. More...
 
template<typename V >
double calcTanimotoSimilarity (const V &v1, const V &v2)
 Calculates the Tanimoto Similarity [CITB] of the vectors v1 and v2. More...
 
CDPL_DESCR_API double calcCosineSimilarity (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Cosine Similarity [WCOS] of the bitsets bs1 and bs2. More...
 
template<typename V >
double calcCosineSimilarity (const V &v1, const V &v2)
 Calculates the Cosine Similarity [WCOS] of the vectors v1 and v2. More...
 
CDPL_DESCR_API double calcEuclideanSimilarity (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Euclidean Similarity [GSIM] of the bitsets bs1 and bs2. More...
 
CDPL_DESCR_API double calcManhattanSimilarity (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Manhattan Similarity [GSIM] of the bitsets bs1 and bs2. More...
 
CDPL_DESCR_API double calcDiceSimilarity (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Dice Similarity [GSIM] of the bitsets bs1 and bs2. More...
 
CDPL_DESCR_API double calcTverskySimilarity (const Util::BitSet &bs1, const Util::BitSet &bs2, double a, double b)
 Calculates the Tversky Similarity [GSIM] of the bitsets bs1 and bs2. More...
 
CDPL_DESCR_API std::size_t calcHammingDistance (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Hamming Distance [WHAM, CITB] between the bitsets bs1 and bs2. More...
 
template<typename V >
double calcManhattanDistance (const V &v1, const V &v2)
 Calculates the Manhattan Distance [MADI] between the vectors v1 and v2. More...
 
CDPL_DESCR_API double calcEuclideanDistance (const Util::BitSet &bs1, const Util::BitSet &bs2)
 Calculates the Euclidean Distance [CITB] between the bitsets bs1 and bs2. More...
 
template<typename V >
double calcEuclideanDistance (const V &v1, const V &v2)
 Calculates the Euclidean Distance [CITB] between the vectors v1 and v2. More...
 

Detailed Description

Contains classes and functions related to the generation and processing of pharmacophore and molecule descriptors.

Function Documentation

◆ calcGeometricalRadius() [1/2]

CDPL_DESCR_API double CDPL::Descr::calcGeometricalRadius ( const Chem::AtomContainer &  cntnr,
const Chem::Atom3DCoordinatesFunction &  coords_func 
)

Calculates the geometrical radius of the atoms in cntnr.

The geometrical radius is the minimum, taken over all atoms, of the maximum distance from a given atom to any other atom in the container. If cntnr contains at most one atom, 0 is returned.

Parameters
cntnr  The container with the atoms for which to calculate the geometrical radius.
coords_func  A function that provides the 3D coordinates of an atom.
Returns
The calculated geometrical radius.
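
The definition above can be sketched in plain C++ as follows. This is a minimal illustration and not the library's implementation: `Point`, `dist3d` and `geometricalRadiusSketch` are hypothetical names, and the coordinates are passed directly instead of being looked up through coords_func.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical point type for this sketch; the library reads
// atom coordinates through coords_func instead.
using Point = std::array<double, 3>;

// Euclidean distance between two 3D points.
double dist3d(const Point& p, const Point& q)
{
    const double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Minimum, over all points, of the largest distance to any other point.
double geometricalRadiusSketch(const std::vector<Point>& pts)
{
    if (pts.size() < 2)
        return 0.0; // at most one point: the radius is 0

    double radius = std::numeric_limits<double>::max();

    for (std::size_t i = 0; i < pts.size(); ++i) {
        double farthest = 0.0; // largest distance from point i to any other point

        for (std::size_t j = 0; j < pts.size(); ++j)
            if (i != j)
                farthest = std::max(farthest, dist3d(pts[i], pts[j]));

        radius = std::min(radius, farthest);
    }

    return radius;
}
```

For three collinear points at x = 0, 1 and 3, the per-point maxima are 3, 2 and 3, so the sketch yields a radius of 2.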

◆ calcGeometricalDiameter() [1/2]

CDPL_DESCR_API double CDPL::Descr::calcGeometricalDiameter ( const Chem::AtomContainer &  cntnr,
const Chem::Atom3DCoordinatesFunction &  coords_func 
)

Calculates the geometrical diameter of the atoms in cntnr.

The geometrical diameter is the maximum distance between any pair of atoms in the container. If cntnr contains at most one atom, 0 is returned.

Parameters
cntnr  The container with the atoms for which to calculate the geometrical diameter.
coords_func  A function that provides the 3D coordinates of an atom.
Returns
The calculated geometrical diameter.
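
As a rough illustration of the definition above, the diameter can be computed with a loop over all unordered pairs. The names `Point` and `geometricalDiameterSketch` are hypothetical and the points are supplied directly; the library's actual implementation may differ.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type for this sketch.
using Point = std::array<double, 3>;

// Largest Euclidean distance between any pair of points.
double geometricalDiameterSketch(const std::vector<Point>& pts)
{
    double diameter = 0.0; // remains 0 for zero or one point

    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (std::size_t j = i + 1; j < pts.size(); ++j) {
            const double dx = pts[i][0] - pts[j][0];
            const double dy = pts[i][1] - pts[j][1];
            const double dz = pts[i][2] - pts[j][2];

            diameter = std::max(diameter, std::sqrt(dx * dx + dy * dy + dz * dz));
        }
    }

    return diameter;
}
```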

◆ calcGeometricalRadius() [2/2]

CDPL_DESCR_API double CDPL::Descr::calcGeometricalRadius ( const Chem::Entity3DContainer &  cntnr)

◆ calcGeometricalDiameter() [2/2]

CDPL_DESCR_API double CDPL::Descr::calcGeometricalDiameter ( const Chem::Entity3DContainer &  cntnr)

◆ calcTopologicalRadius()

CDPL_DESCR_API std::size_t CDPL::Descr::calcTopologicalRadius ( const Chem::MolecularGraph &  molgraph)

◆ calcTopologicalDiameter()

CDPL_DESCR_API std::size_t CDPL::Descr::calcTopologicalDiameter ( const Chem::MolecularGraph &  molgraph)

◆ calcRingComplexity()

CDPL_DESCR_API double CDPL::Descr::calcRingComplexity ( const Chem::MolecularGraph &  molgraph)

◆ calcMolecularComplexity()

CDPL_DESCR_API double CDPL::Descr::calcMolecularComplexity ( const Chem::MolecularGraph &  molgraph)

◆ calcKierShape1()

CDPL_DESCR_API double CDPL::Descr::calcKierShape1 ( const Chem::MolecularGraph &  molgraph)

◆ calcKierShape2()

CDPL_DESCR_API double CDPL::Descr::calcKierShape2 ( const Chem::MolecularGraph &  molgraph)

◆ calcKierShape3()

CDPL_DESCR_API double CDPL::Descr::calcKierShape3 ( const Chem::MolecularGraph &  molgraph)

◆ calcWienerIndex()

CDPL_DESCR_API std::size_t CDPL::Descr::calcWienerIndex ( const Chem::MolecularGraph &  molgraph)

◆ calcRandicIndex()

CDPL_DESCR_API double CDPL::Descr::calcRandicIndex ( const Chem::MolecularGraph &  molgraph)

◆ calcZagrebIndex1()

CDPL_DESCR_API std::size_t CDPL::Descr::calcZagrebIndex1 ( const Chem::MolecularGraph &  molgraph)

◆ calcZagrebIndex2()

CDPL_DESCR_API std::size_t CDPL::Descr::calcZagrebIndex2 ( const Chem::MolecularGraph &  molgraph)

◆ calcTotalWalkCount()

CDPL_DESCR_API std::size_t CDPL::Descr::calcTotalWalkCount ( const Chem::MolecularGraph &  molgraph)

◆ calcTanimotoSimilarity() [1/2]

CDPL_DESCR_API double CDPL::Descr::calcTanimotoSimilarity ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Tanimoto Similarity [CITB] of the bitsets bs1 and bs2.

The Tanimoto Similarity \( S_{ab} \) is calculated by:

\[ S_{ab} = \frac{N_{ab}}{N_a + N_b - N_{ab}} \]

where \( N_{ab} \) is the number of bits that are set in both bitsets, \( N_a \) is the number of bits that are set in the first bitset and \( N_b \) is the number of bits that are set in the second bitset.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated similarity measure.
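
The formula and the zero-padding rule above can be sketched as follows. The sketch uses `std::vector<bool>` as a hypothetical stand-in for Util::BitSet, and `tanimotoBitsSketch` is an illustrative name. Returning 0 when neither bitset has a set bit is an assumption of the sketch, not documented library behavior.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Util::BitSet, used only in this sketch.
using Bits = std::vector<bool>;

double tanimotoBitsSketch(const Bits& a, const Bits& b)
{
    std::size_t n_a = 0, n_b = 0, n_ab = 0;

    for (std::size_t i = 0, n = std::max(a.size(), b.size()); i < n; ++i) {
        // Bits past the end of the shorter bitset are read as zero.
        const bool in_a = i < a.size() && a[i];
        const bool in_b = i < b.size() && b[i];

        n_a += in_a;            // bits set in the first bitset
        n_b += in_b;            // bits set in the second bitset
        n_ab += (in_a && in_b); // bits set in both
    }

    const std::size_t denom = n_a + n_b - n_ab;

    // Assumption: two bitsets without any set bit yield 0.
    return denom == 0 ? 0.0 : static_cast<double>(n_ab) / static_cast<double>(denom);
}
```

For the bitsets 1101 and 100110 (written from index 0), N_ab = 2 and N_a = N_b = 3, which gives 2 / (3 + 3 - 2) = 0.5.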

◆ calcTanimotoSimilarity() [2/2]

template<typename V >
double CDPL::Descr::calcTanimotoSimilarity ( const V &  v1,
const V &  v2 
)
inline

Calculates the Tanimoto Similarity [CITB] of the vectors v1 and v2.

The Tanimoto Similarity \( S_{12} \) is calculated by:

\[ S_{12} = \frac{\vec{v}_1 \cdot \vec{v}_2}{{\left \| \vec{v}_1 \right \|}^2 + {\left \| \vec{v}_2 \right \|}^2 - \vec{v}_1 \cdot \vec{v}_2} \]

Parameters
v1  The first vector.
v2  The second vector.
Returns
The calculated similarity measure.
Since
1.2.3

◆ calcCosineSimilarity() [1/2]

CDPL_DESCR_API double CDPL::Descr::calcCosineSimilarity ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Cosine Similarity [WCOS] of the bitsets bs1 and bs2.

The Cosine Similarity \( S_{ab} \) is calculated by:

\[ S_{ab} = \frac{N_{ab}}{\sqrt{N_a * N_b}} \]

where \( N_{ab} \) is the number of bits that are set in both bitsets, \( N_a \) is the number of bits that are set in the first bitset and \( N_b \) is the number of bits that are set in the second bitset.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated similarity measure.
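
The bitset form of the cosine measure can be illustrated with the following sketch. `Bits` and `cosineBitsSketch` are hypothetical names, and returning 0 when either bitset has no set bit is an assumption rather than documented behavior.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Util::BitSet, used only in this sketch.
using Bits = std::vector<bool>;

double cosineBitsSketch(const Bits& a, const Bits& b)
{
    std::size_t n_a = 0, n_b = 0, n_ab = 0;

    for (std::size_t i = 0, n = std::max(a.size(), b.size()); i < n; ++i) {
        // Bits past the end of the shorter bitset are read as zero.
        const bool in_a = i < a.size() && a[i];
        const bool in_b = i < b.size() && b[i];

        n_a += in_a;
        n_b += in_b;
        n_ab += (in_a && in_b);
    }

    // Assumption: a bitset without set bits yields a similarity of 0.
    if (n_a == 0 || n_b == 0)
        return 0.0;

    return static_cast<double>(n_ab) / std::sqrt(static_cast<double>(n_a) * static_cast<double>(n_b));
}
```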

◆ calcCosineSimilarity() [2/2]

template<typename V >
double CDPL::Descr::calcCosineSimilarity ( const V &  v1,
const V &  v2 
)
inline

Calculates the Cosine Similarity [WCOS] of the vectors v1 and v2.

The Cosine Similarity \( S_{12} \) is calculated by:

\[ S_{12} = \frac{\vec{v}_1 \cdot \vec{v}_2}{{\left \| \vec{v}_1 \right \|}{\left \| \vec{v}_2 \right \|}} \]

Parameters
v1  The first vector.
v2  The second vector.
Returns
The calculated similarity measure.
Since
1.2.3
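
The vector formula above could be written as follows for `std::vector<double>` inputs of equal length. `cosineVectorSketch` is a hypothetical name, and returning 0 when either vector has zero length is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

double cosineVectorSketch(const std::vector<double>& v1, const std::vector<double>& v2)
{
    assert(v1.size() == v2.size()); // equal length is an assumption of this sketch

    const double dot = std::inner_product(v1.begin(), v1.end(), v2.begin(), 0.0);
    const double sq1 = std::inner_product(v1.begin(), v1.end(), v1.begin(), 0.0);
    const double sq2 = std::inner_product(v2.begin(), v2.end(), v2.begin(), 0.0);
    const double denom = std::sqrt(sq1) * std::sqrt(sq2); // product of the Euclidean norms

    // Assumption: a zero-length vector yields a similarity of 0.
    return denom == 0.0 ? 0.0 : dot / denom;
}
```

Orthogonal vectors such as (1, 0) and (0, 1) give 0, and any vector compared with itself gives 1.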

◆ calcEuclideanSimilarity()

CDPL_DESCR_API double CDPL::Descr::calcEuclideanSimilarity ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Euclidean Similarity [GSIM] of the bitsets bs1 and bs2.

The Euclidean Similarity \( S_{ab} \) is calculated by:

\[ S_{ab} = \sqrt{\frac{N_{ab} + N_{!ab}}{N_a + N_b + N_{ab} + N_{!ab}}} \]

where \( N_{ab} \) is the number of bits that are set in both bitsets, \( N_a \) is the number of bits that are set only in the first bitset, \( N_b \) is the number of bits that are set only in the second bitset and \( N_{!ab} \) is the number of bits that are set in neither bitset.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated similarity measure.
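
Because the four counts in the denominator cover every bit position exactly once, the denominator equals the length of the longer bitset, and the numerator is the number of positions where both bitsets agree. The sketch below relies on that observation. `Bits` and `euclideanSimilaritySketch` are hypothetical names, and the value returned for two empty bitsets is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Util::BitSet, used only in this sketch.
using Bits = std::vector<bool>;

double euclideanSimilaritySketch(const Bits& a, const Bits& b)
{
    const std::size_t total = std::max(a.size(), b.size());
    std::size_t agreeing = 0; // positions set in both or set in neither

    for (std::size_t i = 0; i < total; ++i) {
        // Bits past the end of the shorter bitset are read as zero.
        const bool in_a = i < a.size() && a[i];
        const bool in_b = i < b.size() && b[i];

        agreeing += (in_a == in_b);
    }

    // Assumption: two empty bitsets yield 0.
    return total == 0 ? 0.0 : std::sqrt(static_cast<double>(agreeing) / static_cast<double>(total));
}
```

For 1101 and 100110 (written from index 0), four of the six positions agree, so the result is the square root of 4/6.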

◆ calcManhattanSimilarity()

CDPL_DESCR_API double CDPL::Descr::calcManhattanSimilarity ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Manhattan Similarity [GSIM] of the bitsets bs1 and bs2.

The Manhattan Similarity \( S_{ab} \) is calculated by:

\[ S_{ab} = 1 - \frac{N_a + N_b}{N_a + N_b + N_{ab} + N_{!ab}} \]

where \( N_{ab} \) is the number of bits that are set in both bitsets, \( N_a \) is the number of bits that are set only in the first bitset, \( N_b \) is the number of bits that are set only in the second bitset and \( N_{!ab} \) is the number of bits that are set in neither bitset.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated similarity measure.

◆ calcDiceSimilarity()

CDPL_DESCR_API double CDPL::Descr::calcDiceSimilarity ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Dice Similarity [GSIM] of the bitsets bs1 and bs2.

The Dice Similarity \( S_{ab} \) is calculated by:

\[ S_{ab} = \frac{2 * N_{ab}}{N_a + N_b + 2 * N_{ab}} \]

where \( N_{ab} \) is the number of bits that are set in both bitsets, \( N_a \) is the number of bits that are only set in the first bitset and \( N_b \) is the number of bits that are only set in the second bitset.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated similarity measure.

◆ calcTverskySimilarity()

CDPL_DESCR_API double CDPL::Descr::calcTverskySimilarity ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2,
double  a,
double  b 
)

Calculates the Tversky Similarity [GSIM] of the bitsets bs1 and bs2.

The Tversky Similarity \( S_{ab} \) is calculated by:

\[ S_{ab} = \frac{N_{ab}}{a * N_a + b * N_b + N_{ab}} \]

where \( N_{ab} \) is the number of bits that are set in both bitsets, \( N_a \) is the number of bits that are only set in the first bitset and \( N_b \) is the number of bits that are only set in the second bitset. \( a \) and \( b \) are bitset contribution weighting factors.

The Tversky measure is asymmetric whenever \( a \neq b \). Setting \( a = b = 1.0 \) makes it identical to the Tanimoto measure, and setting \( a = b = 0.5 \) makes it identical to the Dice measure.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
a  Weights the contribution of the first bitset.
b  Weights the contribution of the second bitset.
Returns
The calculated similarity measure.

◆ calcHammingDistance()

CDPL_DESCR_API std::size_t CDPL::Descr::calcHammingDistance ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Hamming Distance [WHAM, CITB] between the bitsets bs1 and bs2.

The Hamming Distance \( D_{ab} \) is calculated by:

\[ D_{ab} = N_a + N_b \]

where \( N_a \) is the number of bits that are set in the first bitset but not in the second bitset and \( N_b \) is the number of bits that are set in the second bitset but not in the first one.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated distance.
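
Equivalently, the Hamming distance is the number of positions at which the two bitsets differ. A minimal sketch, with `Bits` and `hammingSketch` as hypothetical names:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Util::BitSet, used only in this sketch.
using Bits = std::vector<bool>;

std::size_t hammingSketch(const Bits& a, const Bits& b)
{
    std::size_t differing = 0;

    for (std::size_t i = 0, n = std::max(a.size(), b.size()); i < n; ++i) {
        // Bits past the end of the shorter bitset are read as zero.
        const bool in_a = i < a.size() && a[i];
        const bool in_b = i < b.size() && b[i];

        differing += (in_a != in_b); // set in exactly one of the bitsets
    }

    return differing;
}
```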

◆ calcManhattanDistance()

template<typename V >
double CDPL::Descr::calcManhattanDistance ( const V &  v1,
const V &  v2 
)
inline

Calculates the Manhattan Distance [MADI] between the vectors v1 and v2.

The Manhattan Distance \( D_{12} \) is calculated by:

\[ D_{12} = {\left \| \vec{v}_1 - \vec{v}_2 \right \|}_1 \]

Parameters
v1  The first vector.
v2  The second vector.
Returns
The calculated distance measure.
Since
1.2.3
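
The 1-norm of the difference vector is the sum of the absolute element-wise differences. A small sketch for `std::vector<double>` inputs of equal length, with `manhattanDistanceSketch` as a hypothetical name:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

double manhattanDistanceSketch(const std::vector<double>& v1, const std::vector<double>& v2)
{
    assert(v1.size() == v2.size()); // equal length is an assumption of this sketch

    double sum = 0.0;

    for (std::size_t i = 0; i < v1.size(); ++i)
        sum += std::fabs(v1[i] - v2[i]); // absolute difference per element

    return sum;
}
```

For (1, 2, 3) and (4, 0, 3) the element-wise differences are 3, 2 and 0, giving a distance of 5.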

◆ calcEuclideanDistance() [1/2]

CDPL_DESCR_API double CDPL::Descr::calcEuclideanDistance ( const Util::BitSet &  bs1,
const Util::BitSet &  bs2 
)

Calculates the Euclidean Distance [CITB] between the bitsets bs1 and bs2.

The Euclidean Distance \( D_{ab} \) is calculated by:

\[ D_{ab} = \sqrt{N_a + N_b} \]

where \( N_a \) is the number of bits that are set in the first bitset but not in the second bitset and \( N_b \) is the number of bits that are set in the second bitset but not in the first one.

If the specified bitsets bs1 and bs2 are of different size, missing bits at the end of the smaller bitset are assumed to be zero.

Parameters
bs1  The first bitset.
bs2  The second bitset.
Returns
The calculated distance.
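
Since \( N_a + N_b \) is the number of differing bit positions, this distance is the square root of the Hamming distance of the two bitsets. A minimal sketch, with `Bits` and `euclideanBitsDistanceSketch` as hypothetical names:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Util::BitSet, used only in this sketch.
using Bits = std::vector<bool>;

double euclideanBitsDistanceSketch(const Bits& a, const Bits& b)
{
    std::size_t differing = 0; // N_a + N_b

    for (std::size_t i = 0, n = std::max(a.size(), b.size()); i < n; ++i) {
        // Bits past the end of the shorter bitset are read as zero.
        const bool in_a = i < a.size() && a[i];
        const bool in_b = i < b.size() && b[i];

        differing += (in_a != in_b);
    }

    return std::sqrt(static_cast<double>(differing));
}
```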

◆ calcEuclideanDistance() [2/2]

template<typename V >
double CDPL::Descr::calcEuclideanDistance ( const V &  v1,
const V &  v2 
)
inline

Calculates the Euclidean Distance [CITB] between the vectors v1 and v2.

The Euclidean Distance \( D_{12} \) is calculated by:

\[ D_{12} = {\left \| \vec{v}_1 - \vec{v}_2 \right \|} \]

Parameters
v1  The first vector.
v2  The second vector.
Returns
The calculated distance measure.
Since
1.2.3
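
For reference, the Euclidean norm of the difference vector can be sketched as follows for `std::vector<double>` inputs of equal length. `euclideanDistanceSketch` is a hypothetical name and not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

double euclideanDistanceSketch(const std::vector<double>& v1, const std::vector<double>& v2)
{
    assert(v1.size() == v2.size()); // equal length is an assumption of this sketch

    double sum_sq = 0.0;

    for (std::size_t i = 0; i < v1.size(); ++i) {
        const double d = v1[i] - v2[i];
        sum_sq += d * d; // accumulate squared differences
    }

    return std::sqrt(sum_sq);
}
```

For (0, 0) and (3, 4) the squared differences sum to 25, giving a distance of 5.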