src.llm.pattern_detection.buffered_processor_normalized.AhoCorasickBufferedProcessorNormalized
Bases: BaseBufferedProcessor
A buffered processor that performs normalized pattern matching ignoring whitespace.
This class implements pattern matching that is insensitive to whitespace variations by normalizing both patterns and input text. It uses the Aho-Corasick algorithm for efficient multiple pattern matching.
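The idea of whitespace-insensitive multi-pattern matching can be illustrated with a small, self-contained sketch. It is not the project's `AhoCorasickAutomatonNormalized`; the function names (`normalize`, `build_automaton`, `find_matches`), the pattern name `tool`, and the `(name, start, end)` result shape are all assumptions made for illustration. The sketch strips whitespace from patterns and text, runs a compact Aho-Corasick automaton over the stripped text, and maps each hit back to offsets in the original string.

```python
from collections import deque


def normalize(text):
    """Drop all whitespace; return the stripped text and, for each kept
    character, its index in the original string."""
    kept, index_map = [], []
    for i, ch in enumerate(text):
        if not ch.isspace():
            kept.append(ch)
            index_map.append(i)
    return "".join(kept), index_map


def build_automaton(patterns):
    """Build Aho-Corasick goto/fail/output tables from {name: pattern}."""
    goto, fail, out = [{}], [0], [[]]
    for name, raw in patterns.items():
        pattern, _ = normalize(raw)
        if not pattern:
            continue  # a pattern made only of whitespace can never match
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append((name, len(pattern)))
    queue = deque(goto[0].values())  # depth-1 states fail back to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit suffix matches
    return goto, fail, out


def find_matches(text, automaton):
    """Return (pattern_name, start, end) spans in original-text offsets."""
    goto, fail, out = automaton
    normalized, index_map = normalize(text)
    state, matches = 0, []
    for pos, ch in enumerate(normalized):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for name, length in out[state]:
            start = index_map[pos - length + 1]
            end = index_map[pos] + 1
            matches.append((name, start, end))
    return matches


automaton = build_automaton({"tool": "<tool_call>"})
print(find_matches("hi < tool_call\n> bye", automaton))  # [('tool', 3, 16)]
```

Keeping an index map alongside the normalized text is what lets a match found in the stripped text be reported against the original, whitespace-bearing input.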
Attributes:
Name | Type | Description |
---|---|---|
automaton | | An instance of AhoCorasickAutomatonNormalized for pattern matching. |
max_pattern_len | | The length of the longest pattern in the normalized patterns. |
tool_call_message | | Message to include when a tool call is detected. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
yaml_path | str | Path to the YAML file containing pattern definitions. | required |
tool_call_message | str | Optional message to use when a tool call is detected. Defaults to "Tool call detected." | 'Tool call detected.' |
Source code in src/llm/pattern_detection/buffered_processor_normalized.py (lines 10-97)
process_chunk_impl(combined_original)
Processes a chunk of text to find pattern matches while ignoring whitespace.
This method normalizes the input text, performs pattern matching, and returns the earliest match found along with any safe text that can be output.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`combined_original` | | The original text chunk to process. | required |
Returns:
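Type | Description |
---|---|
 | A tuple containing two items, shown in the example as `result` (with `matched` and `pattern_name` attributes) and `trailing`. |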
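Example:
from src.llm.pattern_detection.buffered_processor_normalized import AhoCorasickBufferedProcessorNormalized

# Load the pattern definitions from a YAML file, then process one chunk
processor = AhoCorasickBufferedProcessorNormalized('patterns.yaml')
result, trailing = processor.process_chunk_impl('some text')
print(result.matched, result.pattern_name)  # False None when no pattern matches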
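The notion of "safe text" can be sketched as follows. This is a guess at the general technique, not the project's implementation: it assumes that a streaming matcher must hold back the last `max_pattern_len - 1` non-whitespace characters, since they could still be the start of a pattern that completes in the next chunk. The helper name `split_safe_text` is hypothetical.

```python
def split_safe_text(buffer, max_pattern_len):
    """Split buffer into (safe, held_back).

    held_back starts at the (max_pattern_len - 1)-th non-whitespace character
    counted from the end, so it may still begin a match. Everything before it
    cannot belong to a future match and is safe to emit.
    Assumption: whitespace is ignored when measuring pattern length.
    """
    keep = max_pattern_len - 1
    if keep <= 0:
        return buffer, ""
    seen = 0
    for i in range(len(buffer) - 1, -1, -1):
        if not buffer[i].isspace():
            seen += 1
            if seen == keep:
                return buffer[:i], buffer[i:]
    return "", buffer  # too few characters so far: hold everything back


max_pattern_len = len("<tool_call>")  # 11 characters, none of them whitespace
safe, held = split_safe_text("The answer is 42. <to ol", max_pattern_len)
print(repr(safe))  # 'The answer '
print(repr(held))  # 'is 42. <to ol'
```

Counting only non-whitespace characters matters here: a fixed character count could cut a pattern in half whenever the stream happens to contain extra spaces or newlines inside it.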
Source code in src/llm/pattern_detection/buffered_processor_normalized.py (lines 35-97)