artifician.processors.normalizer module
Copyright 2021 Plato Solutions, Inc.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
- class artifician.processors.normalizer.KeyValuesNormalizer
Bases:
NormalizerStrategy
Split by delimiter into a format that preserves the association between each label and its value.
- normalize(feature_raw, delimiter)
Split by delimiter into a format that preserves the association between each label and its value.
- Args:
feature_raw (string): raw feature text to normalize
delimiter: delimiter used to split the string
- Return:
feature_normalized (list): list of tuples of normalized feature text values
- static normalize_key_values(key_values, assignment)
Break the text down into key-value pairs using the assignment string.
- Args:
key_values (list): list of strings
assignment (string): string that separates keys from values
- Return:
feature_normalized (list): list of tuples of normalized feature text values
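The two-step split described above (first by delimiter, then by assignment) can be sketched roughly as follows. This is an illustration only, not the library's implementation: the function names, the plain-string delimiter, the extra `assignment` parameter on the outer function, and the `(key, value)` tuple layout are all assumptions.

```python
def sketch_normalize_key_values(key_values, assignment):
    """Split each 'key<assignment>value' string into a (key, value) tuple."""
    pairs = []
    for item in key_values:
        key, sep, value = item.partition(assignment)
        if sep:  # skip fragments that do not contain the assignment string
            pairs.append((key.strip(), value.strip()))
    return pairs


def sketch_normalize(feature_raw, delimiter, assignment="="):
    # Hypothetical signature: the documented method only takes `delimiter`.
    parts = [part for part in feature_raw.split(delimiter) if part]
    return sketch_normalize_key_values(parts, assignment)


result = sketch_normalize("id=5&name=foo", "&")
```

Under these assumptions, `result` holds one tuple per key-value fragment, with the label and its value kept together.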
- class artifician.processors.normalizer.Normalizer(strategy=None, delimiter=None)
Bases:
Processor
Normalize the given string value
- Attributes:
strategy (NormalizerStrategy): strategy for normalizing string
delimiter (dictionary): delimiter for splitting the string
- process(publisher, feature_raw)
Normalize the feature_raw value.
Note: because the normalizer is a processor, it updates publisher.feature_value instead of returning the normalized value.
- Args:
publisher (object): instance of the publisher
feature_raw (string): feature value
- Returns:
None
- subscribe(publisher, pool_scheduler=None)
Defines logic for subscribing to an event in publisher
- Args:
publisher (object): instance of the publisher
pool_scheduler (rx.scheduler.ThreadPoolScheduler): scheduler instance for concurrency
- Returns:
None
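The note on process (the result is written to the publisher rather than returned) may be easier to follow with a small stand-in. The sketch below does not import artifician; `FakePublisher`, `TupleSplitStrategy`, and `SketchNormalizer` are hypothetical names, the delimiter is assumed to be a plain string, and subscribe is left out entirely.

```python
class FakePublisher:
    """Hypothetical stand-in for a publisher; only feature_value is modeled."""

    def __init__(self):
        self.feature_value = None


class TupleSplitStrategy:
    """Hypothetical strategy: split on the delimiter, one 1-tuple per value."""

    def normalize(self, feature_raw, delimiter):
        return [(value,) for value in feature_raw.split(delimiter)]


class SketchNormalizer:
    """Holds a strategy and a delimiter, mirroring the documented attributes."""

    def __init__(self, strategy=None, delimiter=None):
        self.strategy = strategy
        self.delimiter = delimiter

    def process(self, publisher, feature_raw):
        # The result is stored on the publisher; nothing is returned.
        publisher.feature_value = self.strategy.normalize(feature_raw, self.delimiter)


publisher = FakePublisher()
returned = SketchNormalizer(TupleSplitStrategy(), ",").process(publisher, "a,b,c")
```

After the call, `returned` is `None` and the normalized value is available as `publisher.feature_value`.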
- class artifician.processors.normalizer.NormalizerStrategy
Bases:
ABC
interface for normalizer strategies
- abstract normalize(feature_raw, delimiter)
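Since NormalizerStrategy is an ABC with an abstract normalize method, a custom strategy would presumably subclass it and override that method. A minimal sketch of the pattern using only the standard library; `SketchStrategy` and `UpperCaseStrategy` are hypothetical names and do not come from artifician.

```python
from abc import ABC, abstractmethod


class SketchStrategy(ABC):
    """Stand-in interface with the documented method signature."""

    @abstractmethod
    def normalize(self, feature_raw, delimiter):
        """Return the normalized form of feature_raw."""


class UpperCaseStrategy(SketchStrategy):
    """Hypothetical concrete strategy: split, then upper-case each value."""

    def normalize(self, feature_raw, delimiter):
        return [(value.upper(),) for value in feature_raw.split(delimiter)]


# The abstract interface itself cannot be instantiated.
try:
    SketchStrategy()
    interface_instantiated = True
except TypeError:
    interface_instantiated = False

tokens = UpperCaseStrategy().normalize("a|b", "|")
```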
- class artifician.processors.normalizer.PathsNormalizer
Bases:
NormalizerStrategy
Split by delimiter into a format that preserves the position of each value within the tree.
- static get_path_values(feature_raw_values, delimiter)
Get the path values sequentially.
- Args:
feature_raw_values (list): list of strings
delimiter (string): delimiter used to split the string
- Return:
feature_normalized (list): list of tuples of normalized feature text values
- normalize(feature_raw, delimiter)
Split by delimiter into a format that preserves the position of each value within the tree.
- Args:
feature_raw (string): feature text
delimiter (dict): delimiter used to split the string
- Return:
feature_normalized (list): list of tuples of normalized feature text values
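One way to read "preserves the position of each value within the tree" is to pair every path segment with its path index and depth. The sketch below follows that reading; the tuple layout `(segment, (path_index, depth))`, the function name, and the plain-string delimiter are assumptions, and the library's actual output format may differ.

```python
def sketch_get_path_values(feature_raw_values, delimiter):
    """Walk each path in order and record where every segment sits."""
    values = []
    for path_index, path in enumerate(feature_raw_values):
        # Drop empty segments produced by leading or doubled delimiters.
        segments = [segment for segment in path.split(delimiter) if segment]
        for depth, segment in enumerate(segments):
            values.append((segment, (path_index, depth)))
    return values


paths = sketch_get_path_values(["usr/local/bin", "etc/hosts"], "/")
```

Here `paths` lists the segments of the first path at depths 0 to 2, followed by the segments of the second path at depths 0 and 1.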
- class artifician.processors.normalizer.PropertiesNormalizer
Bases:
NormalizerStrategy
Split by delimiter into a format that preserves the sequential position of each value found.
- normalize(feature_raw, delimiter)
Split by delimiter into a format that preserves the sequential position of each value found in the feature text.
- Args:
delimiter: delimiter used to split the string
feature_raw (string): raw feature text
- Return:
feature_normalized (list): list of tuples of normalized feature text values
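As a rough illustration of keeping the sequential position, each value can be paired with its index in the split result. This is a sketch under assumptions, not the library's code: the function name, the plain-string delimiter, and the `(value, position)` tuple layout are hypothetical.

```python
def sketch_normalize_properties(feature_raw, delimiter):
    """Split on the delimiter and pair each value with its position."""
    values = [value.strip() for value in feature_raw.split(delimiter)]
    # Positions refer to the original split, so empty slots leave gaps.
    return [(value, position) for position, value in enumerate(values) if value]


properties = sketch_normalize_properties("red, large,cotton", ",")
```

Empty values are dropped after positions are assigned, so a skipped slot still shows up as a gap in the numbering.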
- class artifician.processors.normalizer.StrategySelector
Bases:
object
Selects the appropriate normalizer strategy based on the input text.
- get_key_values_delimiter(texts)
Identify whether the given texts are key-values strings; if so, return the appropriate delimiter for normalizing the text.
- Args:
texts (list): list of strings
- Returns:
Bool (True/False): True if the given texts are identified as key-values text, otherwise False
- get_paths_delimiter(texts)
Identify whether the given texts are paths strings; if so, return the appropriate delimiter for normalizing the text.
- Args:
texts (list): list of strings
- Returns:
Bool (True/False): True if the given texts are identified as paths text
- get_properties_delimiter(texts)
Identify whether the given texts are properties strings; if so, return the appropriate delimiter for normalizing the text.
- Args:
texts (list): list of strings
- Returns:
delimiter (dict): delimiter to normalize the string
- select(texts)
- Args:
texts (list): list of strings
- Returns:
strategy (NormalizerStrategy): NormalizerStrategy instance
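The selection step could look something like the sketch below, which tries each text shape in turn. The heuristics are invented for illustration (an `=` in every text means key-values, a `/` in every text means paths, anything else falls back to properties), and the function returns a label string rather than a NormalizerStrategy instance; none of this is taken from the library.

```python
def sketch_select(texts):
    """Pick a strategy label from the shape of the input texts."""
    # Hypothetical heuristic: every text carries an assignment character.
    if texts and all("=" in text for text in texts):
        return "key_values"
    # Hypothetical heuristic: every text carries a path separator.
    if texts and all("/" in text for text in texts):
        return "paths"
    # Fallback when neither shape is detected.
    return "properties"


choice = sketch_select(["id=5&name=foo", "id=6&name=bar"])
```

Checking key-values first matters in this sketch, since a key-value string such as `path=/usr/bin` also contains a path separator.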