anonymity.tools package
Subpackages
Module contents
- anonymity.tools.data_fly(table: DataFrame, ident: List | ndarray, qi: List | ndarray, k: int, supp_threshold: int, hierarchies: dict = {}) DataFrame
Data-fly generalization algorithm for k-anonymity.
- Parameters:
table (pandas dataframe) – dataframe with the data under study.
ident (list of strings) – list with the names of the dataframe columns that are identifiers.
qi (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
k (int) – desired level of k-anonymity.
supp_threshold (int) – level of suppression allowed.
hierarchies (dictionary) – hierarchies for generalization of columns.
- Returns:
anonymized table.
- Return type:
pandas dataframe
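The data-fly idea can be illustrated without the library. The following is a minimal sketch of the Datafly heuristic, not the code behind `anonymity.tools.data_fly`: it repeatedly generalizes the quasi-identifier with the most distinct values, and once the rows that still violate k are within the suppression allowance, it drops them. Plain lists of dictionaries stand in for the dataframe. The hierarchy format, the function names, the sample data, and the reading of `supp_threshold` as a number of rows are all assumptions made for the example.

```python
from collections import Counter


def class_sizes(rows, qi):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[col] for col in qi) for row in rows)


def datafly_sketch(rows, qi, k, supp_threshold, hierarchies):
    """Generalize until at most `supp_threshold` rows violate k, then drop them.

    `hierarchies` uses a hypothetical format for this sketch only:
    {column: [mapping to level 1, mapping from level 1 to level 2, ...]}.
    """
    rows = [dict(row) for row in rows]  # work on a copy
    level = {col: 0 for col in qi}
    while True:
        sizes = class_sizes(rows, qi)
        violating = sum(n for n in sizes.values() if n < k)
        if violating <= supp_threshold:
            # Suppress the few rows left in classes smaller than k.
            return [r for r in rows if sizes[tuple(r[c] for c in qi)] >= k]
        candidates = [c for c in qi if level[c] < len(hierarchies.get(c, []))]
        if not candidates:
            raise ValueError("hierarchies exhausted before reaching k-anonymity")
        # Datafly heuristic: generalize the column with the most distinct values.
        col = max(candidates, key=lambda c: len({r[c] for r in rows}))
        mapping = hierarchies[col][level[col]]
        for r in rows:
            r[col] = mapping[r[col]]
        level[col] += 1


# Illustrative data; identifier columns are assumed to be removed already.
rows = [
    {"age": 23, "zip": "39001", "disease": "flu"},
    {"age": 27, "zip": "39001", "disease": "asthma"},
    {"age": 31, "zip": "39002", "disease": "flu"},
    {"age": 36, "zip": "39002", "disease": "diabetes"},
    {"age": 52, "zip": "39003", "disease": "flu"},
]
hierarchies = {
    "age": [{23: "20-29", 27: "20-29", 31: "30-39", 36: "30-39", 52: "50-59"}],
    "zip": [{"39001": "3900*", "39002": "3900*", "39003": "3900*"}],
}

anonymized = datafly_sketch(rows, ["age", "zip"], k=2, supp_threshold=1,
                            hierarchies=hierarchies)
smallest_class = min(class_sizes(anonymized, ["age", "zip"]).values())
```

In this example only the `age` column needs to be generalized; the single row with age 52 remains alone in its class and is suppressed because one suppressed row is allowed.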
- anonymity.tools.incognito(table: DataFrame, ident: List | ndarray, qi: List | ndarray, k: int, supp_threshold: int, hierarchies: dict) DataFrame
Incognito generalization algorithm for k-anonymity.
- Parameters:
table (pandas dataframe) – dataframe with the data under study.
ident (list of strings) – list with the names of the dataframe columns that are identifiers.
qi (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
k (int) – desired level of k-anonymity.
supp_threshold (int) – level of suppression allowed.
hierarchies (dictionary) – hierarchies for generalization of columns.
- Returns:
anonymized table.
- Return type:
pandas dataframe
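Incognito-style algorithms search the lattice of generalization levels, where each node assigns one level per quasi-identifier. The sketch below is a simplified walk over that lattice, not the library's implementation: it visits nodes from least to most generalized and skips any node above an already accepted one, since further generalization cannot break k-anonymity. It omits Incognito's pruning over subsets of attributes and any suppression. The hierarchy format (a list of callables per column, level 0 being the identity) and all names are assumptions for illustration.

```python
from collections import Counter
from itertools import product


def generalize(rows, qi, hierarchies, levels):
    """Apply the chosen generalization level to every quasi-identifier."""
    out = []
    for row in rows:
        new_row = dict(row)
        for col, lvl in zip(qi, levels):
            new_row[col] = hierarchies[col][lvl](row[col])
        out.append(new_row)
    return out


def is_k_anonymous(rows, qi, k):
    """True if every equivalence class over `qi` has at least k rows."""
    sizes = Counter(tuple(r[c] for c in qi) for r in rows)
    return min(sizes.values()) >= k


def minimal_generalizations(rows, qi, k, hierarchies):
    """Return the minimal lattice nodes (level tuples) that give k-anonymity."""
    heights = [range(len(hierarchies[c])) for c in qi]
    found = []
    # Sorting by total height guarantees less generalized nodes come first.
    for node in sorted(product(*heights), key=sum):
        # Skip nodes that only generalize further than an accepted node.
        if any(all(a >= b for a, b in zip(node, f)) for f in found):
            continue
        if is_k_anonymous(generalize(rows, qi, hierarchies, node), qi, k):
            found.append(node)
    return found


rows = [
    {"age": 23, "zip": "39001"},
    {"age": 27, "zip": "39001"},
    {"age": 31, "zip": "39002"},
    {"age": 36, "zip": "39002"},
]
# Hypothetical hierarchies: level 0 is the raw value.
hierarchies = {
    "age": [
        lambda a: a,
        lambda a: f"{a // 10 * 10}-{a // 10 * 10 + 9}",
        lambda a: "*",
    ],
    "zip": [lambda z: z, lambda z: z[:3] + "**"],
}

nodes = minimal_generalizations(rows, ["age", "zip"], 2, hierarchies)
```

Here `nodes` holds the level tuples in the order of `qi`; a tuple such as `(1, 0)` means "age at level 1, zip left unchanged".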
- anonymity.tools.k_anonymity(table: DataFrame, hierarchies: dict, k: int, qi: List | ndarray, supp_threshold: int, ident: List | ndarray, method: str) DataFrame
Generalization algorithm for k-anonymity. Falls back to data-fly if the method is not specified correctly.
- Parameters:
table (pandas dataframe) – dataframe with the data under study.
ident (list of strings) – list with the names of the dataframe columns that are identifiers.
qi (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
k (int) – desired level of k-anonymity.
supp_threshold (int) – level of suppression allowed.
hierarchies (dictionary) – hierarchies for generalization of columns.
method (string) – name of the anonymization method that we want to use.
- Returns:
anonymized table.
- Return type:
pandas dataframe
- anonymity.tools.l_diversity(table: DataFrame, sa: List | ndarray, qi: List | ndarray, k_method: str, l: int, ident: List | ndarray, supp_threshold: int, hierarchies: dict, k: int) DataFrame
Apply l-diversity to an anonymized dataset.
- Parameters:
table (pandas dataframe) – dataframe with the data under study.
sa (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
ident (list of strings) – list with the names of the dataframe columns that are identifiers.
qi (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
k (int) – desired level of k-anonymity.
k_method (string) – desired algorithm for anonymization.
l (int) – desired level of l-diversity.
supp_threshold (int) – level of suppression allowed.
hierarchies (dictionary) – hierarchies for generalization of columns.
- Returns:
list containing the value of l-diversity of the new table and the anonymized table that satisfies l-diversity.
- Return type:
list
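To make "the value of l-diversity" concrete, here is a minimal sketch of distinct l-diversity: the smallest number of distinct sensitive values found in any equivalence class. It is an illustration in plain Python, not the library's code; the helper name `l_of` and the sample rows are hypothetical, and taking the minimum across several sensitive attributes is an assumption.

```python
from collections import defaultdict


def l_of(rows, qi, sa):
    """Smallest count of distinct sensitive values over all equivalence classes.

    `sa` is a list of sensitive columns; the minimum over all of them is taken
    (an assumption of this sketch).
    """
    worst = None
    for column in sa:
        values_per_class = defaultdict(set)
        for row in rows:
            values_per_class[tuple(row[c] for c in qi)].add(row[column])
        l_column = min(len(v) for v in values_per_class.values())
        worst = l_column if worst is None else min(worst, l_column)
    return worst


# A table that is already 2-anonymous on (age, zip).
rows = [
    {"age": "20-29", "zip": "39001", "disease": "flu"},
    {"age": "20-29", "zip": "39001", "disease": "asthma"},
    {"age": "30-39", "zip": "39002", "disease": "flu"},
    {"age": "30-39", "zip": "39002", "disease": "flu"},
]
l_before = l_of(rows, ["age", "zip"], ["disease"])

# Generalizing both columns to "*" merges the two classes into one.
merged = [{**r, "age": "*", "zip": "*"} for r in rows]
l_after = l_of(merged, ["age", "zip"], ["disease"])
```

The second class contains only `flu`, so the table is 2-anonymous but only 1-diverse; further generalization raises the value to 2 at the cost of less precise quasi-identifiers.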
- anonymity.tools.t_closeness(table: DataFrame, sa: List | ndarray, qi: List | ndarray, t: float, k_method: str, ident: List | ndarray, supp_threshold: int, hierarchies: dict) DataFrame
Apply t-closeness to an anonymized dataset.
- Parameters:
table (pandas dataframe) – dataframe with the data under study.
sa (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
qi (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
t (float) – threshold for t-closeness.
k_method (string) – name of the k-anonymization method to use.
ident (list of strings) – list with the names of the dataframe columns that are identifiers.
supp_threshold (int) – level of suppression allowed.
hierarchies (dictionary) – hierarchies for generalization of columns.
- Returns:
list containing the value of t for the anonymized table, the table obtained after applying t-closeness, and a boolean indicating whether t-closeness is actually satisfied.
- Return type:
list
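The value of t reported for a table can be understood as the largest distance between the distribution of a sensitive attribute inside an equivalence class and its distribution in the whole table. The sketch below computes that for a categorical attribute, using half the L1 distance between the two distributions (the Earth Mover's Distance when all categories are equally far apart). This choice of distance, the helper name `t_of`, and the data are assumptions; the library may measure distance differently, especially for numerical attributes.

```python
from collections import Counter, defaultdict


def t_of(rows, qi, sa):
    """Largest class-versus-table distance for the sensitive column `sa`."""
    overall = Counter(r[sa] for r in rows)
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[c] for c in qi)].append(r[sa])
    worst = 0.0
    for values in groups.values():
        local = Counter(values)  # missing categories count as 0
        distance = 0.5 * sum(
            abs(local[v] / len(values) - n / len(rows))
            for v, n in overall.items()
        )
        worst = max(worst, distance)
    return worst


rows = [
    {"age": "20-29", "zip": "39001", "disease": "flu"},
    {"age": "20-29", "zip": "39001", "disease": "asthma"},
    {"age": "30-39", "zip": "39002", "disease": "flu"},
    {"age": "30-39", "zip": "39002", "disease": "flu"},
]
t_value = t_of(rows, ["age", "zip"], "disease")
satisfied_loose = t_value <= 0.3   # threshold t = 0.3
satisfied_strict = t_value <= 0.2  # threshold t = 0.2
```

Comparing the computed value with the requested threshold gives the true/false flag described in the return value.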
- anonymity.tools.t_closeness_supp(table: DataFrame, sa: List | ndarray, qi: List | ndarray, t: float, supp_lim: float = 1) DataFrame
Apply t-closeness to an anonymized dataset by suppressing rows, up to the percentage given as input.
- Parameters:
table (pandas dataframe) – dataframe with the data under study.
sa (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
qi (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
t (float) – threshold for t-closeness.
supp_lim (float) – percentage of suppressed rows allowed.
- Returns:
table that satisfies t-closeness.
- Return type:
pandas dataframe
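Suppression-based t-closeness can be pictured as removing the equivalence classes that deviate most from the overall distribution until the threshold is met or the suppression budget runs out. The following greedy loop is one plausible strategy, not necessarily what `t_closeness_supp` does. It assumes a single categorical sensitive column, the equal-distance measure (half the L1 distance), and `supp_lim` read as a fraction between 0 and 1; all names and data are hypothetical.

```python
from collections import Counter, defaultdict


def class_distances(rows, qi, sa):
    """Distance between each class's distribution of `sa` and the overall one."""
    overall = Counter(r[sa] for r in rows)
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[c] for c in qi)].append(r[sa])
    distances = {}
    for key, values in groups.items():
        local = Counter(values)
        distances[key] = 0.5 * sum(
            abs(local[v] / len(values) - n / len(rows))
            for v, n in overall.items()
        )
    return distances


def suppress_for_t(rows, qi, sa, t, supp_lim):
    """Greedily drop the worst class while the suppression budget allows it."""
    rows = list(rows)
    budget = int(supp_lim * len(rows))  # assumption: supp_lim is a fraction
    removed = 0
    while rows:
        distances = class_distances(rows, qi, sa)
        worst_key, worst = max(distances.items(), key=lambda item: item[1])
        if worst <= t:
            break  # t-closeness reached
        size = sum(1 for r in rows if tuple(r[c] for c in qi) == worst_key)
        if removed + size > budget:
            break  # cannot suppress more rows; t may remain unsatisfied
        rows = [r for r in rows if tuple(r[c] for c in qi) != worst_key]
        removed += size
    return rows


rows = [
    {"zip": "39001", "disease": "flu"},
    {"zip": "39001", "disease": "asthma"},
    {"zip": "39002", "disease": "flu"},
    {"zip": "39002", "disease": "asthma"},
    {"zip": "39003", "disease": "diabetes"},
    {"zip": "39003", "disease": "diabetes"},
]
kept = suppress_for_t(rows, ["zip"], "disease", t=0.4, supp_lim=0.5)
kept_tight = suppress_for_t(rows, ["zip"], "disease", t=0.4, supp_lim=0.2)
```

Because the overall distribution changes whenever a class is removed, the distances are recomputed on every pass. With a tight budget the loop stops early and returns a table that may not satisfy the threshold.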