RecordLinkageRule
- class RecordLinkageRule
RecordLinkageRule class. Describes a rule in the record linkage process.
- get_field_rule_type(self: pyhelayers.RecordLinkageRule, field_name: str) -> pyhelayers.RecordLinkageRuleType
Get the rule type of a given field.
- Parameters:
field_name – name of the field
- get_field_shingles_size(self: pyhelayers.RecordLinkageRule, field_name: str) -> int
Get the size of the shingles generated for a given field.
- Parameters:
field_name – name of the field
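The shingles size can be pictured with a small sketch. It assumes character-level shingles, meaning every contiguous substring of a fixed length, and a plain Jaccard overlap as the similarity measure. Neither assumption is confirmed by this reference, and `char_shingles` and `jaccard` are hypothetical helpers, not part of pyhelayers.

```python
from typing import Set


def char_shingles(value: str, size: int = 5) -> Set[str]:
    """Return the set of contiguous substrings of `value` with length `size`.

    Assumption: character-level shingling; the library may tokenize differently.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if len(value) < size:
        # A value shorter than one shingle is kept whole (a choice made here).
        return {value} if value else set()
    return {value[i:i + size] for i in range(len(value) - size + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two shingle sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Smaller shingles tolerate a single-character difference better.
sim_size_5 = jaccard(char_shingles("jonathan", 5), char_shingles("jonathon", 5))
sim_size_3 = jaccard(char_shingles("jonathan", 3), char_shingles("jonathon", 3))
print(sim_size_5, sim_size_3)
```

In this sketch the two spellings share more of their shingles at size 3 than at size 5, which is the kind of trade-off the size setting controls.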
- get_field_shingles_weight(self: pyhelayers.RecordLinkageRule, field_name: str) -> int
Get the weight of the shingles generated for a given field.
- Parameters:
field_name – name of the field
- set_field(self: pyhelayers.RecordLinkageRule, field_name: str, type: pyhelayers.RecordLinkageRuleType, shingles_weight: int = 1, shingles_size: int = 5) -> None
Set a rule for a specific field.
- Parameters:
field_name – the name of the field to be set. The field must appear in the RecordLinkageConfig object given to the constructor
type – rule type to apply for the given field
shingles_weight – if the given rule type is RL_RULE_SIMILAR, sets the weight of the shingles generated by that rule (default 1)
shingles_size – if the given rule type is RL_RULE_SIMILAR, sets the size of the shingles generated by that rule (default 5)
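The call pattern of the setter and the three getters can be sketched without the library. `RuleSketch` and `RuleTypeSketch` below are hypothetical stand-ins that only mirror the documented method names and defaults; they are not pyhelayers classes. The list of field names stands in for a RecordLinkageConfig, and raising `RuntimeError` for an unknown field is an assumption based on the `runtime_error` listed in the C++ reference.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class RuleTypeSketch(Enum):
    """Hypothetical stand-in for pyhelayers.RecordLinkageRuleType."""
    RL_RULE_SIMILAR = 1      # the only rule type named in this reference
    RL_RULE_PLACEHOLDER = 2  # invented here just to have a second value


@dataclass
class _FieldRule:
    rule_type: RuleTypeSketch
    shingles_weight: int
    shingles_size: int


class RuleSketch:
    """Mirrors the documented method names; not the real implementation."""

    def __init__(self, config_fields: Iterable[str]) -> None:
        # Stand-in for the fields declared in a RecordLinkageConfig.
        self._known_fields = set(config_fields)
        self._rules: Dict[str, _FieldRule] = {}

    def set_field(self, field_name, type, shingles_weight=1, shingles_size=5):
        if field_name not in self._known_fields:
            # Assumption: an unknown field surfaces as RuntimeError.
            raise RuntimeError(f"field not in config: {field_name}")
        self._rules[field_name] = _FieldRule(type, shingles_weight, shingles_size)

    def get_field_rule_type(self, field_name):
        return self._rules[field_name].rule_type

    def get_field_shingles_weight(self, field_name):
        return self._rules[field_name].shingles_weight

    def get_field_shingles_size(self, field_name):
        return self._rules[field_name].shingles_size


rule = RuleSketch(["name", "zip_code"])
# Defaults apply when the shingle arguments are omitted.
rule.set_field("name", RuleTypeSketch.RL_RULE_SIMILAR)
# Explicit weight and size for a second field.
rule.set_field("zip_code", RuleTypeSketch.RL_RULE_SIMILAR,
               shingles_weight=2, shingles_size=3)
print(rule.get_field_shingles_size("name"), rule.get_field_shingles_weight("zip_code"))
```

With the real binding, the same `set_field` and getter calls would be made on a `pyhelayers.RecordLinkageRule` object; how that object and its configuration are constructed is outside this section.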
-
class RecordLinkageRule : public helayers::SaveableBasic
RecordLinkageRule class.
Describes a rule in the record linkage process.
Public Functions
-
inline RecordLinkageRule()
Construct an empty RecordLinkageRule object, to be populated later with the load function.
-
inline RecordLinkageRule(const RecordLinkageConfig &config)
Construct a new RecordLinkageRule object.
- Parameters:
config – The Record-Linkage configuration with record field definitions and various tunings of the Record-Linkage algorithm and heuristics.
-
void setField(const std::string &fieldName, RecordLinkageRuleType type, int shinglesWeight = 1, int shinglesSize = 5)
Set a rule for a specific field.
- Parameters:
fieldName – the name of the field to be set. The field must appear in the RecordLinkageConfig object given to the constructor
type – rule type to apply for the given field
shinglesWeight – if the given rule type is RL_RULE_SIMILAR, sets the weight of the shingles generated by that rule (default 1)
shinglesSize – if the given rule type is RL_RULE_SIMILAR, sets the size of the shingles generated by that rule (default 5)
- Throws:
runtime_error – if the given field does not appear in the RecordLinkageConfig object given to the constructor
-
RecordLinkageRuleType getFieldRuleType(const std::string &fieldName) const
Get the rule type of a given field.
- Parameters:
fieldName – name of the field
-
int getFieldShinglesWeight(const std::string &fieldName) const
Get the weight of the shingles generated for a given field.
- Parameters:
fieldName – name of the field
-
int getFieldShinglesSize(const std::string &fieldName) const
Get the size of the shingles generated for a given field.
- Parameters:
fieldName – name of the field
-
virtual std::streamoff save(std::ostream &stream) const override
Saves this object to a stream in binary form.
Returns the number of bytes written to the output stream.
- Parameters:
stream – [in] output stream to write to
-
virtual std::streamoff load(std::istream &stream) override
Loads this object from the given stream.
Returns the number of bytes read from the input stream.
- Parameters:
stream – [in] input stream to read from
-
virtual void debugPrint(const std::string &title = "", Verbosity verbosity = VERBOSITY_REGULAR, std::ostream &out = std::cout) const override
Prints the content of this object.
- Parameters:
title – Text to add to the print
verbosity – Verbosity level
out – Output stream
Friends
-
friend bool operator==(const RecordLinkageRule &r1, const RecordLinkageRule &r2)