Chemical Data Processing Library C++ API - Version 1.4.0
CDPL::Chem::SubstructureHistogramCalculator Class Reference

Counts occurrences of registered SMARTS substructure queries in a molecular graph, emitting the per-pattern hit counts into a user-supplied histogram container.

#include <SubstructureHistogramCalculator.hpp>

Classes

class  Pattern
 Holds a single SMARTS query pattern, its histogram ID, its priority and match-handling flags.
 

Public Types

typedef std::shared_ptr< SubstructureHistogramCalculator > SharedPointer
 A reference-counted smart pointer [SHPTR] for dynamically allocated SubstructureHistogramCalculator instances.

typedef PatternList::const_iterator ConstPatternIterator
 A constant iterator over the registered patterns.

typedef PatternList::iterator PatternIterator
 A mutable iterator over the registered patterns.
 

Public Member Functions

 SubstructureHistogramCalculator ()
 Constructs an empty SubstructureHistogramCalculator instance.

 SubstructureHistogramCalculator (const SubstructureHistogramCalculator &gen)
 Constructs a copy of the SubstructureHistogramCalculator instance gen.

void addPattern (const MolecularGraph::SharedPointer &molgraph, std::size_t id, std::size_t priority=0, bool all_matches=true, bool unique_matches=true)
 Registers a new pattern by its query molecular graph and per-pattern settings.

void addPattern (const Pattern &pattern)
 Appends a copy of the pre-built pattern pattern.
 
const Pattern & getPattern (std::size_t idx) const
 Returns the registered pattern at index idx.
 
void removePattern (std::size_t idx)
 Removes the registered pattern at index idx.

void removePattern (const PatternIterator &it)
 Removes the registered pattern referenced by it.

void clear ()
 Removes all registered patterns.

std::size_t getNumPatterns () const
 Returns the number of registered patterns.

ConstPatternIterator getPatternsBegin () const
 Returns a constant iterator pointing to the first registered pattern.

ConstPatternIterator getPatternsEnd () const
 Returns a constant iterator pointing one past the last registered pattern.

PatternIterator getPatternsBegin ()
 Returns a mutable iterator pointing to the first registered pattern.

PatternIterator getPatternsEnd ()
 Returns a mutable iterator pointing one past the last registered pattern.

ConstPatternIterator begin () const
 Returns a constant iterator pointing to the first registered pattern (range-based for support).

ConstPatternIterator end () const
 Returns a constant iterator pointing one past the last registered pattern (range-based for support).

PatternIterator begin ()
 Returns a mutable iterator pointing to the first registered pattern (range-based for support).

PatternIterator end ()
 Returns a mutable iterator pointing one past the last registered pattern (range-based for support).

template<typename T >
void calculate (const MolecularGraph &molgraph, T &histo)
 Counts substructure occurrences in molgraph and writes the per-pattern hit counts to histo.
 
SubstructureHistogramCalculator & operator= (const SubstructureHistogramCalculator &gen)
 Replaces the state of this calculator by a copy of the state of gen.
 

Detailed Description

Counts occurrences of registered SMARTS substructure queries in a molecular graph, emitting the per-pattern hit counts into a user-supplied histogram container.

Patterns are registered via addPattern(); each pattern carries a numeric histogram-bin ID, a priority and match-handling flags. When calculate() is called, the registered patterns are matched against the input molecular graph in priority order, highest priority first. Matched atom/bond regions are masked so that subsequent lower-priority patterns cannot re-count overlapping substructures. Every accepted match increments the histogram bin of its pattern by one (histo[id] += 1).
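
The priority ordering and masking described above can be illustrated with a small stand-alone model. It does not use CDPL: ToyPattern, countWithMasking and demoHistogram are hypothetical names, and matches are supplied as precomputed atom-index lists rather than found by a SMARTS search. The masking rule used here is an assumption, since the exact overlap criterion is not specified in this reference: a match is rejected if any of its atoms was claimed by an earlier pattern, and matches of the same pattern do not mask each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a registered pattern. The matches are given
// up front as atom-index lists; no substructure search is performed.
struct ToyPattern {
    std::size_t id;                                 // histogram-bin ID
    std::size_t priority;                           // higher value = evaluated earlier
    std::vector<std::vector<std::size_t>> matches;  // atom indices of each match
};

// Processes patterns in descending priority and skips every match that
// touches an atom claimed by a previously processed pattern (assumed rule).
template <typename Histogram>
void countWithMasking(std::vector<ToyPattern> patterns, Histogram& histo)
{
    std::stable_sort(patterns.begin(), patterns.end(),
                     [](const ToyPattern& a, const ToyPattern& b) {
                         return a.priority > b.priority;
                     });

    std::set<std::size_t> masked;  // atoms claimed by earlier patterns

    for (const ToyPattern& p : patterns) {
        std::set<std::size_t> claimed;  // atoms claimed by this pattern

        for (const std::vector<std::size_t>& match : p.matches) {
            bool overlaps = std::any_of(match.begin(), match.end(),
                                        [&masked](std::size_t atom) {
                                            return masked.count(atom) != 0;
                                        });
            if (overlaps)
                continue;

            histo[p.id] += 1;  // one increment per accepted match
            claimed.insert(match.begin(), match.end());
        }

        masked.insert(claimed.begin(), claimed.end());
    }
}

// Worked example: a high-priority three-atom group (bin 0) claims atoms
// 0..2, so the low-priority single-atom pattern (bin 1) only counts atom 3.
std::map<std::size_t, std::size_t> demoHistogram()
{
    std::vector<ToyPattern> patterns = {
        {1, 0, {{0}, {3}}},
        {0, 10, {{0, 1, 2}}},
    };

    std::map<std::size_t, std::size_t> histo;
    countWithMasking(patterns, histo);

    assert(histo[0] == 1 && histo[1] == 1);
    return histo;
}
```

Without masking, the single-atom pattern would be counted twice, once for atom 0 and once for atom 3; the priority decides which pattern gets to claim the shared atom.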

Member Typedef Documentation

◆ SharedPointer

A reference-counted smart pointer [SHPTR] for dynamically allocated SubstructureHistogramCalculator instances.

◆ ConstPatternIterator

A constant iterator over the registered patterns.

◆ PatternIterator

A mutable iterator over the registered patterns.

Constructor & Destructor Documentation

◆ SubstructureHistogramCalculator() [1/2]

CDPL::Chem::SubstructureHistogramCalculator::SubstructureHistogramCalculator ( )

Constructs an empty SubstructureHistogramCalculator instance.

◆ SubstructureHistogramCalculator() [2/2]

CDPL::Chem::SubstructureHistogramCalculator::SubstructureHistogramCalculator ( const SubstructureHistogramCalculator &  gen)

Constructs a copy of the SubstructureHistogramCalculator instance gen.

Parameters
gen  The SubstructureHistogramCalculator to copy.

Member Function Documentation

◆ addPattern() [1/2]

void CDPL::Chem::SubstructureHistogramCalculator::addPattern ( const MolecularGraph::SharedPointer &  molgraph,
std::size_t  id,
std::size_t  priority = 0,
bool  all_matches = true,
bool  unique_matches = true 
)

Registers a new pattern by its query molecular graph and per-pattern settings.

Parameters
molgraph  The SMARTS query molecular graph.
id  The histogram-bin ID to which matches of this pattern contribute.
priority  The pattern's priority; higher-priority patterns are evaluated first.
all_matches  If true, every match of the query is processed; otherwise only the first.
unique_matches  If true, only one of multiple equivalent substructure mappings is processed per match.
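
As a rough illustration of how the two flags interact, the following stand-alone sketch counts precomputed atom mappings. It is not CDPL code: Mapping, countMatches, propaneMappings and demoFlags are hypothetical names. Two mappings are treated as equivalent when they cover the same set of target atoms, and all_matches=false is modeled as stopping after the first accepted mapping; both are assumptions made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// mapping[i] is the target atom index assigned to query atom i.
typedef std::vector<std::size_t> Mapping;

// Counts the mappings that would be processed under the given flags.
std::size_t countMatches(const std::vector<Mapping>& mappings,
                         bool all_matches, bool unique_matches)
{
    std::set<Mapping> seen;  // sorted atom sets already counted
    std::size_t count = 0;

    for (Mapping m : mappings) {  // copied on purpose: sorted below
        if (unique_matches) {
            std::sort(m.begin(), m.end());
            if (!seen.insert(m).second)
                continue;  // same atom set was counted before
        }

        ++count;

        if (!all_matches)
            break;  // only the first accepted mapping is processed
    }

    return count;
}

// A symmetric two-atom query on a three-atom chain 0-1-2 maps onto each
// bond in both directions, giving four mappings over two atom sets.
std::vector<Mapping> propaneMappings()
{
    return {{0, 1}, {1, 0}, {1, 2}, {2, 1}};
}

void demoFlags()
{
    const std::vector<Mapping> mappings = propaneMappings();

    assert(countMatches(mappings, true, true) == 2);   // one per bond
    assert(countMatches(mappings, true, false) == 4);  // every mapping
    assert(countMatches(mappings, false, true) == 1);  // first match only
}
```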

◆ addPattern() [2/2]

void CDPL::Chem::SubstructureHistogramCalculator::addPattern ( const Pattern &  pattern)

Appends a copy of the pre-built pattern pattern.

Parameters
pattern  The pattern to copy and register.

◆ getPattern()

const Pattern& CDPL::Chem::SubstructureHistogramCalculator::getPattern ( std::size_t  idx) const

Returns the registered pattern at index idx.

Parameters
idx  The zero-based pattern index.
Returns
A const reference to the pattern.
Exceptions
Base::IndexError  if the number of patterns is zero or idx is not in the range [0, getNumPatterns() - 1].

◆ removePattern() [1/2]

void CDPL::Chem::SubstructureHistogramCalculator::removePattern ( std::size_t  idx)

Removes the registered pattern at index idx.

Parameters
idx  The zero-based pattern index.
Exceptions
Base::IndexError  if the number of patterns is zero or idx is not in the range [0, getNumPatterns() - 1].

◆ removePattern() [2/2]

void CDPL::Chem::SubstructureHistogramCalculator::removePattern ( const PatternIterator &  it)

Removes the registered pattern referenced by it.

Parameters
it  Iterator referencing the pattern to remove.
Exceptions
Base::IndexError  if it is not in the range [getPatternsBegin(), getPatternsEnd() - 1].

◆ clear()

void CDPL::Chem::SubstructureHistogramCalculator::clear ( )

Removes all registered patterns.

◆ getNumPatterns()

std::size_t CDPL::Chem::SubstructureHistogramCalculator::getNumPatterns ( ) const

Returns the number of registered patterns.

Returns
The pattern count.

◆ getPatternsBegin() [1/2]

ConstPatternIterator CDPL::Chem::SubstructureHistogramCalculator::getPatternsBegin ( ) const

Returns a constant iterator pointing to the first registered pattern.

Returns
A constant iterator pointing to the first pattern.

◆ getPatternsEnd() [1/2]

ConstPatternIterator CDPL::Chem::SubstructureHistogramCalculator::getPatternsEnd ( ) const

Returns a constant iterator pointing one past the last registered pattern.

Returns
A constant iterator pointing one past the last pattern.

◆ getPatternsBegin() [2/2]

PatternIterator CDPL::Chem::SubstructureHistogramCalculator::getPatternsBegin ( )

Returns a mutable iterator pointing to the first registered pattern.

Returns
A mutable iterator pointing to the first pattern.

◆ getPatternsEnd() [2/2]

PatternIterator CDPL::Chem::SubstructureHistogramCalculator::getPatternsEnd ( )

Returns a mutable iterator pointing one past the last registered pattern.

Returns
A mutable iterator pointing one past the last pattern.

◆ begin() [1/2]

ConstPatternIterator CDPL::Chem::SubstructureHistogramCalculator::begin ( ) const

Returns a constant iterator pointing to the first registered pattern (range-based for support).

Returns
A constant iterator pointing to the first pattern.

◆ end() [1/2]

ConstPatternIterator CDPL::Chem::SubstructureHistogramCalculator::end ( ) const

Returns a constant iterator pointing one past the last registered pattern (range-based for support).

Returns
A constant iterator pointing one past the last pattern.

◆ begin() [2/2]

PatternIterator CDPL::Chem::SubstructureHistogramCalculator::begin ( )

Returns a mutable iterator pointing to the first registered pattern (range-based for support).

Returns
A mutable iterator pointing to the first pattern.

◆ end() [2/2]

PatternIterator CDPL::Chem::SubstructureHistogramCalculator::end ( )

Returns a mutable iterator pointing one past the last registered pattern (range-based for support).

Returns
A mutable iterator pointing one past the last pattern.
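
The two accessor families differ only in name: getPatternsBegin()/getPatternsEnd() are used with explicit iterator loops, while begin()/end() are the names a range-based for statement looks up. The sketch below shows this with a stand-alone stub; StubPattern, StubRegistry and the helper functions are hypothetical and only mirror the accessor names listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pattern; only the histogram-bin ID is modeled.
struct StubPattern {
    std::size_t id;
};

// Minimal container exposing both iterator-accessor styles.
class StubRegistry {
public:
    typedef std::vector<StubPattern> PatternList;
    typedef PatternList::const_iterator ConstPatternIterator;
    typedef PatternList::iterator PatternIterator;

    void addPattern(const StubPattern& p) { patterns.push_back(p); }

    ConstPatternIterator getPatternsBegin() const { return patterns.begin(); }
    ConstPatternIterator getPatternsEnd() const { return patterns.end(); }

    // begin()/end() are what the range-based for statement calls.
    ConstPatternIterator begin() const { return patterns.begin(); }
    ConstPatternIterator end() const { return patterns.end(); }
    PatternIterator begin() { return patterns.begin(); }
    PatternIterator end() { return patterns.end(); }

private:
    PatternList patterns;
};

// Explicit iterator loop over the const accessors.
std::size_t sumIdsExplicit(const StubRegistry& reg)
{
    std::size_t sum = 0;
    for (StubRegistry::ConstPatternIterator it = reg.getPatternsBegin();
         it != reg.getPatternsEnd(); ++it)
        sum += it->id;
    return sum;
}

// Same traversal written as a range-based for loop.
std::size_t sumIdsRangeFor(const StubRegistry& reg)
{
    std::size_t sum = 0;
    for (const StubPattern& p : reg)
        sum += p.id;
    return sum;
}

// On a non-const object the mutable overloads are selected, so the loop
// variable can be bound by non-const reference and modified in place.
std::size_t demoIteration()
{
    StubRegistry reg;
    reg.addPattern(StubPattern{1});
    reg.addPattern(StubPattern{2});
    reg.addPattern(StubPattern{3});

    assert(sumIdsExplicit(reg) == sumIdsRangeFor(reg));

    for (StubPattern& p : reg)
        p.id *= 10;

    return sumIdsRangeFor(reg);
}
```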

◆ calculate()

template<typename T >
void CDPL::Chem::SubstructureHistogramCalculator::calculate ( const MolecularGraph &  molgraph,
T &  histo 
)

Counts substructure occurrences in molgraph and writes the per-pattern hit counts to histo.

For every accepted match, the histogram is updated via histo[id] += 1 with the ID being the histogram-bin ID of the matching pattern.

Template Parameters
T  The histogram type (must support operator[] returning an arithmetic value).
Parameters
molgraph  The molecular graph to be analyzed.
histo  The histogram receiving the hit counts.
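
A stand-alone sketch of what the requirement on T means in practice, assuming the documented update expression histo[id] += 1 is the only operation applied to the histogram. recordHit, sparseExample and denseExample are hypothetical helpers, not CDPL functions. Whether calculate() resizes or clears histo is not stated in this reference, so the dense variant sizes and zeroes the container itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Mirrors the documented update expression; not part of the library.
template <typename T>
void recordHit(T& histo, std::size_t id)
{
    histo[id] += 1;
}

// Sparse histogram: std::map::operator[] creates a zero-initialized bin
// on first access, so arbitrary IDs can be used without preparation.
std::map<std::size_t, unsigned> sparseExample()
{
    std::map<std::size_t, unsigned> histo;

    recordHit(histo, 1000);
    recordHit(histo, 1000);
    recordHit(histo, 7);

    return histo;
}

// Dense histogram: std::vector::operator[] performs no bounds checking,
// so the vector must cover the largest pattern ID before any update.
std::vector<unsigned> denseExample(std::size_t max_id)
{
    assert(max_id >= 2);  // the IDs used below must be valid indices

    std::vector<unsigned> histo(max_id + 1, 0);

    recordHit(histo, 2);
    recordHit(histo, 0);
    recordHit(histo, 2);

    return histo;
}
```

A map-like histogram suits sparse or large ID ranges, while a pre-sized vector gives contiguous bins when the IDs are small consecutive integers.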

◆ operator=()

SubstructureHistogramCalculator& CDPL::Chem::SubstructureHistogramCalculator::operator= ( const SubstructureHistogramCalculator &  gen)

Replaces the state of this calculator by a copy of the state of gen.

Parameters
gen  The source SubstructureHistogramCalculator.
Returns
A reference to itself.

The documentation for this class was generated from the following file: