Package net.loomchild.segment.srx
Class RuleManager
java.lang.Object
net.loomchild.segment.srx.RuleManager
Represents segmentation rules manager.
Responsible for constructing and storing break and exception rules.
Field Summary
Fields
Constructor Summary
Constructors
Constructor    Description
RuleManager(SrxDocument document, List<LanguageRule> languageRuleList, int maxLookbehindConstructLength)    Constructor.
Method Summary
Modifier and Type    Method    Description
private String    createExceptionPatternString    Creates an exception pattern string that can be matched at the place where a break rule was matched.
    getExceptionPattern(Rule breakRule)
Field Details
document
maxLookbehindConstructLength
private int maxLookbehindConstructLength
breakRuleList
exceptionPatternMap
Constructor Details
RuleManager
public RuleManager(SrxDocument document, List<LanguageRule> languageRuleList, int maxLookbehindConstructLength)
Constructor. Responsible for retrieving rules from the SRX document for the given language code, constructing patterns and storing them in a quickly accessible format. Adds break rules to breakRuleList and constructs the corresponding exception patterns in exceptionPatternMap. Uses the document cache to store rules and patterns.
Parameters:
document - SRX document
languageRuleList - list of language rules
maxLookbehindConstructLength - maximum length of a regular expression in lookbehind (see Util.finitize(String, int)).
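The bookkeeping the constructor describes (break rules collected in a list, each mapped to an exception pattern) might look roughly like the sketch below. It is not the library's code: RuleStoreSketch and its nested Rule class are hypothetical stand-ins, and the sketch assumes that the exception rules listed before a break rule are the ones that apply to it, and that their before patterns are already finite. Document caching is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class RuleStoreSketch {

    // Hypothetical stand-in for the library's Rule type.
    static class Rule {
        final boolean isBreak;
        final String before;
        final String after;

        Rule(boolean isBreak, String before, String after) {
            this.isBreak = isBreak;
            this.before = before;
            this.after = after;
        }
    }

    final List<Rule> breakRuleList = new ArrayList<>();
    final Map<Rule, Pattern> exceptionPatternMap = new LinkedHashMap<>();

    // Assumption: rules arrive in document order and an exception rule
    // only affects the break rules that follow it.
    RuleStoreSketch(List<Rule> rules) {
        StringBuilder exceptions = new StringBuilder();
        for (Rule rule : rules) {
            if (rule.isBreak) {
                breakRuleList.add(rule);
                // null means "no exception rule precedes this break rule".
                Pattern pattern = exceptions.length() == 0
                        ? null
                        : Pattern.compile(exceptions.toString());
                exceptionPatternMap.put(rule, pattern);
            } else {
                // Append this exception as one more alternative.
                if (exceptions.length() > 0) {
                    exceptions.append('|');
                }
                exceptions.append("(?:(?<=").append(rule.before)
                        .append(")(?=").append(rule.after).append("))");
            }
        }
    }

    public static void main(String[] args) {
        Rule exception = new Rule(false, "Mr\\.", "\\s");
        Rule breakRule = new Rule(true, "\\.", "\\s");
        RuleStoreSketch store = new RuleStoreSketch(List.of(exception, breakRule));
        System.out.println(store.exceptionPatternMap.get(breakRule));
    }
}
```

Keying the map by break rule means that, once a break rule matches, its exceptions can be checked with a single lookup and a single pattern match.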
Method Details
getBreakRuleList
Returns:
break rule list
getExceptionPattern
Parameters:
breakRule
Returns:
exception pattern corresponding to the given break rule
createExceptionPatternString
Creates an exception pattern string that can be matched at the place where a break rule was matched. Both parts of the rule (beforePattern and afterPattern) are incorporated into one pattern. beforePattern is used in a lookbehind, so it must be modified to match only strings of finite length (that is, it must contain no *, + or {n,}).
Parameters:
rule - exception rule
Returns:
string containing exception pattern
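As a rough illustration of the description above, the sketch below combines the two parts of an exception rule into a single zero-width pattern and bounds the quantifiers of the before part so that it can sit in a lookbehind. The finitize method here is a simplified, hypothetical stand-in for Util.finitize(String, int), not the library's implementation: it handles escapes and character classes but ignores possessive quantifiers and other corner cases. The isException helper is likewise an assumption about how such a pattern could be tested at a break position.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExceptionPatternSketch {

    // Simplified stand-in for Util.finitize: replaces unbounded
    // quantifiers with bounded ones so the pattern has a maximum length.
    static String finitize(String pattern, int max) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // inside [...], where * and + are literals
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\' && i + 1 < pattern.length()) {
                // Copy an escaped character unchanged.
                out.append(c).append(pattern.charAt(++i));
            } else if (inClass) {
                if (c == ']') {
                    inClass = false;
                }
                out.append(c);
            } else if (c == '[') {
                inClass = true;
                out.append(c);
            } else if (c == '*') {
                out.append("{0,").append(max).append('}');
            } else if (c == '+') {
                out.append("{1,").append(max).append('}');
            } else {
                out.append(c);
            }
        }
        // Close open-ended ranges: {n,} becomes {n,max}.
        return out.toString().replaceAll("\\{(\\d+),\\}", "{$1," + max + "}");
    }

    // Builds (?<=before)(?=after); an empty part is skipped.
    static String createExceptionPatternString(String before, String after, int max) {
        StringBuilder sb = new StringBuilder();
        if (!before.isEmpty()) {
            sb.append("(?<=").append(finitize(before, max)).append(')');
        }
        if (!after.isEmpty()) {
            sb.append("(?=").append(after).append(')');
        }
        return sb.toString();
    }

    // Tests the exception pattern exactly at a candidate break position.
    static boolean isException(String exceptionPattern, CharSequence text, int pos) {
        Matcher m = Pattern.compile(exceptionPattern).matcher(text);
        m.useTransparentBounds(true); // let the lookbehind see text before pos
        m.region(pos, text.length());
        return m.lookingAt();
    }

    public static void main(String[] args) {
        String text = "See e. g. this. Done.";
        String exception = createExceptionPatternString("e\\.\\s*g\\.", "\\s", 100);
        int afterAbbreviation = text.indexOf("g.") + 2;
        int afterSentence = text.indexOf("this.") + 5;
        System.out.println(exception);
        System.out.println(isException(exception, text, afterAbbreviation)); // true
        System.out.println(isException(exception, text, afterSentence));     // false
    }
}
```

Because the combined pattern consumes no characters, it can be checked at the exact offset where a break rule matched. The transparent bounds setting is what lets the lookbehind inspect text before that offset.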