abacusai.feature_group

Module Contents

Classes

FeatureGroup

class abacusai.feature_group.FeatureGroup(client, modificationLock=None, featureGroupId=None, name=None, featureGroupSourceType=None, tableName=None, sql=None, datasetId=None, functionSourceCode=None, functionName=None, sourceTables=None, createdAt=None, description=None, featureGroupType=None, sqlError=None, latestVersionOutdated=None, tags=None, primaryKey=None, updateTimestampKey=None, lookupKeys=None, streamingEnabled=None, featureGroupUse=None, incremental=None, mergeConfig=None, transformConfig=None, samplingConfig=None, cpuSize=None, memory=None, streamingReady=None, featureTags=None, moduleName=None, templateBindings=None, featureExpression=None, features={}, duplicateFeatures={}, pointInTimeGroups={}, concatenationConfig={}, indexingConfig={}, codeSource={}, featureGroupTemplate={}, latestFeatureGroupVersion={})

Bases: abacusai.return_class.AbstractApiClass

Parameters

client (ApiClient) – An authenticated API Client instance

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns

The dict value representation of the class parameters

Return type

dict

add_to_project(project_id, feature_group_type='CUSTOM_TABLE', feature_group_use=None)

Adds a feature group to a project.

Parameters
  • project_id (str) – The unique ID associated with the project.

  • feature_group_type (str) – The feature group type of the feature group. The type is based on the use case under which the feature group is being created. For example, Catalog Attributes can be a feature group type under personalized recommendation use case.

  • feature_group_use (str) – The user-assigned feature group use, which allows for organizing project feature groups: DATA_WRANGLING, TRAINING_INPUT, or BATCH_PREDICTION_INPUT.

remove_from_project(project_id)

Removes a feature group from a project.

Parameters

project_id (str) – The unique ID associated with the project.

set_type(project_id, feature_group_type='CUSTOM_TABLE')

Update the feature group type in a project. The feature group must already be added to the project.

Parameters
  • project_id (str) – The unique ID associated with the project.

  • feature_group_type (str) – The feature group type to set the feature group as. The type is based on the use case under which the feature group is being created. For example, Catalog Attributes can be a feature group type under personalized recommendation use case.

use_for_training(project_id, use_for_training=True)

Uses the feature group as model training input.

Parameters
  • project_id (str) – The unique ID associated with the project.

  • use_for_training (bool) – Boolean variable to include or exclude a feature group from a model’s training. Only one feature group per type can be used for training
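
The project-level calls documented above (add_to_project and use_for_training) are often used together. The following is a minimal sketch, assuming feature_group is an already-retrieved FeatureGroup object and that adding the group to the project must come first. RecordingFeatureGroup is a hypothetical stand-in that only records calls so the snippet runs without network access, and proj_123 is a placeholder ID.

```python
def attach_for_training(feature_group, project_id, feature_group_type="CUSTOM_TABLE"):
    """Add a feature group to a project, then mark it as training input."""
    # Assumed ordering: add the group to the project before changing its use.
    feature_group.add_to_project(project_id, feature_group_type=feature_group_type)
    feature_group.use_for_training(project_id, use_for_training=True)


class RecordingFeatureGroup:
    """Hypothetical stand-in that records calls instead of contacting the API."""

    def __init__(self):
        self.calls = []

    def add_to_project(self, project_id, feature_group_type="CUSTOM_TABLE",
                       feature_group_use=None):
        self.calls.append(("add_to_project", project_id, feature_group_type))

    def use_for_training(self, project_id, use_for_training=True):
        self.calls.append(("use_for_training", project_id, use_for_training))


demo = RecordingFeatureGroup()
attach_for_training(demo, "proj_123")  # "proj_123" is a placeholder project ID
print(demo.calls)
```

With a real FeatureGroup in place of the stand-in, the same helper would issue the two API calls in that order.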

create_sampling(table_name, sampling_config, description=None)

Creates a new feature group defined as a sample of rows from another feature group.

For efficiency, sampling is approximate unless otherwise specified (e.g., the number of rows may vary slightly from what was requested).

Parameters
  • table_name (str) – The unique name to be given to this sampling feature group.

  • sampling_config (dict) – JSON object (aka map) defining the sampling method and its parameters.

  • description (str) – A human-readable description of this feature group.

Returns

The created feature group.

Return type

FeatureGroup

set_sampling_config(sampling_config)

Set a FeatureGroup’s sampling to the config values provided, so that the rows the FeatureGroup returns will be a sample of those it would otherwise have returned.

Currently, sampling is only for Sampling FeatureGroups, so this API only allows calling on that kind of FeatureGroup.

Parameters

sampling_config (dict) – A json object string specifying the sampling method and parameters specific to that sampling method. Empty sampling_config means no sampling.

Returns

The updated feature group.

Return type

FeatureGroup

set_merge_config(merge_config)

Set a MergeFeatureGroup’s merge config to the values provided, so that the feature group only returns a bounded range of an incremental dataset.

Parameters

merge_config (dict) – A json object string specifying the merge rule. An empty mergeConfig will default to only including the latest Dataset Version.

set_transform_config(transform_config)

Set a TransformFeatureGroup’s transform config to the values provided.

Parameters

transform_config (dict) – A json object string specifying the pre-defined transformation.

set_schema(schema)

Creates a new schema and points the feature group to the new feature group schema id.

Parameters

schema (list) – An array of json objects with ‘name’ and ‘dataType’ properties.
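
The schema argument is a list of objects with name and dataType properties. As a rough sketch of how such a list might be assembled, the hypothetical helper below converts a plain mapping into that shape. The type names STRING and INTEGER are placeholders, not confirmed dataType values.

```python
def build_schema(columns):
    """Turn a {column_name: data_type} mapping into a list of schema entries."""
    return [{"name": name, "dataType": data_type} for name, data_type in columns.items()]


# "STRING" and "INTEGER" are placeholder type names, not confirmed API values.
schema = build_schema({"user_id": "STRING", "age": "INTEGER"})
print(schema)
```

The resulting list could then be passed as feature_group.set_schema(schema).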

get_schema(project_id=None)

Returns a schema given a specific FeatureGroup in a project.

Parameters

project_id (str) – The unique ID associated with the project.

Returns

An array of objects for each column in the specified feature group.

Return type

Feature

create_feature(name, select_expression)

Creates a new feature in a Feature Group from a SQL select statement

Parameters
  • name (str) – The name of the feature to add

  • select_expression (str) – SQL select expression to create the feature

Returns

A feature group object with the newly added feature.

Return type

FeatureGroup

add_tag(tag)

Adds a tag to the feature group

Parameters

tag (str) – The tag to add to the feature group

remove_tag(tag)

Removes a tag from the feature group

Parameters

tag (str) – The tag to remove from the feature group

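add_feature_tag(feature, tag)

Adds a tag to a feature in the feature group.

Parameters
  • feature (str) – The name of the feature.

  • tag (str) – The tag to add to the feature.

remove_feature_tag(feature, tag)

Removes a tag from a feature in the feature group.

Parameters
  • feature (str) – The name of the feature.

  • tag (str) – The tag to remove from the feature.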

create_nested_feature(nested_feature_name, table_name, using_clause, where_clause=None, order_clause=None)

Creates a new nested feature in a feature group from SQL clauses.

Parameters
  • nested_feature_name (str) – The name of the feature.

  • table_name (str) – The table name of the feature group to nest

  • using_clause (str) – The SQL join column or logic to join the nested table with the parent

  • where_clause (str) – A SQL where statement to filter the nested rows

  • order_clause (str) – A SQL clause to order the nested rows

Returns

A feature group object with the newly added nested feature.

Return type

FeatureGroup

update_nested_feature(nested_feature_name, table_name=None, using_clause=None, where_clause=None, order_clause=None, new_nested_feature_name=None)

Updates a previously existing nested feature in a feature group.

Parameters
  • nested_feature_name (str) – The name of the feature to be updated.

  • table_name (str) – The name of the table.

  • using_clause (str) – The SQL join column or logic to join the nested table with the parent

  • where_clause (str) – A SQL where statement to filter the nested rows

  • order_clause (str) – A SQL clause to order the nested rows

  • new_nested_feature_name (str) – New name for the nested feature.

Returns

A feature group object with the updated nested feature.

Return type

FeatureGroup

delete_nested_feature(nested_feature_name)

Delete a nested feature.

Parameters

nested_feature_name (str) – The name of the feature to be deleted.

Returns

A feature group object without the deleted nested feature.

Return type

FeatureGroup

create_point_in_time_feature(feature_name, history_table_name, aggregation_keys, timestamp_key, historical_timestamp_key, expression, lookback_window_seconds=None, lookback_window_lag_seconds=0, lookback_count=None, lookback_until_position=0)

Creates a new point in time feature in a feature group using another historical feature group, window spec and aggregate expression.

We use the aggregation keys, and either the lookbackWindowSeconds or the lookbackCount values, to perform the window aggregation for every row in the current feature group. If the window is specified in seconds, then all rows in the history table which match the aggregation keys and have historicalTimeFeature >= lookbackStartCount and < the value of the current row’s timeFeature are considered. An optional lookbackWindowLagSeconds (positive or negative) can be used to offset the current value of the timeFeature. If this value is negative, we will look at future rows in the history table, so care must be taken to make sure that these rows are available in the online context when performing a lookup on this feature group. If the window is specified in counts, then we order the historical table rows by time and consider rows from the window where the rank order is >= lookbackCount, including the row just prior to the current one. In that case the lag is specified in terms of positions using lookbackUntilPosition.

Parameters
  • feature_name (str) – The name of the feature to create

  • history_table_name (str) – The table name of the history table.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestamp_key (str) – Name of feature which contains the timestamp value for the point in time feature

  • historical_timestamp_key (str) – Name of feature which contains the historical timestamp.

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

  • lookback_window_seconds (float) – If window is specified in terms of time, number of seconds in the past from the current time for start of the window.

  • lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

Returns

A feature group object with the newly added point in time feature.

Return type

FeatureGroup
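
To make the time-based window concrete, here is a small pure-Python sketch of one reading of the semantics described above: the lag shifts the reference time, and the window covers lookback_window_seconds before that shifted time, excluding the end point. This is an illustration under that assumption, not the service's implementation. The ts and amount field names and the sample rows are made up.

```python
def rows_in_time_window(history, current_ts, lookback_window_seconds,
                        lookback_window_lag_seconds=0.0, ts_key="ts"):
    """Select history rows with window_start <= timestamp < window_end."""
    # Assumed interpretation: the lag offsets the current timestamp, and the
    # window extends lookback_window_seconds into the past from that point.
    window_end = current_ts - lookback_window_lag_seconds
    window_start = window_end - lookback_window_seconds
    return [row for row in history if window_start <= row[ts_key] < window_end]


# Hypothetical history rows (timestamps in seconds).
history = [{"ts": t, "amount": a}
           for t, a in [(0, 5), (50, 7), (90, 2), (100, 9), (110, 4)]]

past_only = rows_in_time_window(history, current_ts=100, lookback_window_seconds=60)
with_future = rows_in_time_window(history, 100, 60, lookback_window_lag_seconds=-20)

# An aggregate expression such as SUM(amount) would reduce the window to a scalar.
total_past = sum(row["amount"] for row in past_only)
print(total_past)
```

In this sketch a negative lag pulls rows at or after the current timestamp into the window, which is why such rows must also be available at online lookup time.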

update_point_in_time_feature(feature_name, history_table_name=None, aggregation_keys=None, timestamp_key=None, historical_timestamp_key=None, expression=None, lookback_window_seconds=None, lookback_window_lag_seconds=None, lookback_count=None, lookback_until_position=None, new_feature_name=None)

Updates an existing point in time feature in a feature group. See createPointInTimeFeature for detailed semantics.

Parameters
  • feature_name (str) – The name of the feature.

  • history_table_name (str) – The table name of the history table. If not specified, we use the current table to do a self join.

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • timestamp_key (str) – Name of feature which contains the timestamp value for the point in time feature

  • historical_timestamp_key (str) – Name of feature which contains the historical timestamp.

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

  • lookback_window_seconds (float) – If window is specified in terms of time, number of seconds in the past from the current time for start of the window.

  • lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

  • new_feature_name (str) – New name for the point in time feature.

Returns

A feature group object with the updated point in time feature.

Return type

FeatureGroup

create_point_in_time_group(group_name, window_key, aggregation_keys, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=0, lookback_count=None, lookback_until_position=0)

Create point in time group

Parameters
  • group_name (str) – The name of the point in time group

  • window_key (str) – Name of feature to use for ordering the rows on the source table

  • aggregation_keys (list) – List of keys on the source table to use for the window aggregation.

  • history_table_name (str) – The table to use for aggregating. If not provided, the source table will be used.

  • history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • lookback_window (float) – Number of seconds in the past from the current time for start of the window. If 0, the lookback will include all rows.

  • lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

Returns

The feature group after the point in time group has been created

Return type

FeatureGroup

update_point_in_time_group(group_name, window_key=None, aggregation_keys=None, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=None, lookback_count=None, lookback_until_position=None)

Update point in time group

Parameters
  • group_name (str) – The name of the point in time group

  • window_key (str) – Name of feature which contains the timestamp value for the point in time feature

  • aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.

  • history_table_name (str) – The table to use for aggregating. If not provided, the source table will be used.

  • history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used.

  • history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.

  • lookback_window (float) – Number of seconds in the past from the current time for start of the window.

  • lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.

  • lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)

  • lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.

Returns

The feature group after the update has been applied

Return type

FeatureGroup

delete_point_in_time_group(group_name)

Delete point in time group

Parameters

group_name (str) – The name of the point in time group

Returns

The feature group after the point in time group has been deleted

Return type

FeatureGroup

create_point_in_time_group_feature(group_name, name, expression)

Create point in time group feature

Parameters
  • group_name (str) – The name of the point in time group

  • name (str) – The name of the feature to add to the point in time group

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

Returns

The feature group after the update has been applied

Return type

FeatureGroup

update_point_in_time_group_feature(group_name, name, expression)

Update a feature’s SQL expression in a point in time group

Parameters
  • group_name (str) – The name of the point in time group

  • name (str) – The name of the feature to update in the point in time group

  • expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.

Returns

The feature group after the update has been applied

Return type

FeatureGroup

set_feature_type(feature, feature_type)

Set a feature’s type in a feature group. Specify the feature group ID, feature name and feature type, and the method will return the new column with the resulting changes reflected.

Parameters
  • feature (str) – The name of the feature.

  • feature_type (str) – The machine learning type of the data in the feature. One of CATEGORICAL, CATEGORICAL_LIST, NUMERICAL, TIMESTAMP, TEXT, EMAIL, LABEL_LIST, JSON, OBJECT_REFERENCE. Refer to the guide on feature types (https://api.abacus.ai/app/help/class/FeatureType) for more information. Note: Some FeatureMappings will restrict the options or explicitly set the FeatureType.

Returns

The feature group after the data_type is applied

Return type

Schema
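
Because feature_type is passed as a plain string, a typo is only caught once the request reaches the server. The sketch below shows one way to check a value locally against the types listed above before calling set_feature_type. The helper name normalize_feature_type is hypothetical, and the set only mirrors the values listed on this page, so it may be incomplete.

```python
# Feature types as listed in the parameter description above.
FEATURE_TYPES = frozenset({
    "CATEGORICAL", "CATEGORICAL_LIST", "NUMERICAL", "TIMESTAMP", "TEXT",
    "EMAIL", "LABEL_LIST", "JSON", "OBJECT_REFERENCE",
})


def normalize_feature_type(feature_type):
    """Upper-case and validate a feature type string; raise ValueError if unknown."""
    candidate = feature_type.strip().upper()
    if candidate not in FEATURE_TYPES:
        raise ValueError(f"unknown feature type: {feature_type!r}")
    return candidate


print(normalize_feature_type(" categorical "))
```

A call would then look like feature_group.set_feature_type("country", normalize_feature_type("categorical")), where country is a placeholder feature name.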

invalidate_streaming_data(invalid_before_timestamp)

Invalidates all streaming data with timestamp before invalidBeforeTimestamp

Parameters

invalid_before_timestamp (int) – The unix timestamp, any data which has a timestamp before this time will be deleted

concatenate_data(source_feature_group_id, merge_type='UNION', replace_until_timestamp=None, skip_materialize=False)

Concatenates data from one feature group to another. Feature groups can be merged if their schemas are compatible and they have the special updateTimestampKey column and, if set, the primaryKey column. The second operand in the concatenate operation will be appended to the first operand (merge target).

Parameters
  • source_feature_group_id (str) – The feature group to concatenate with the destination feature group.

  • merge_type (str) – UNION or INTERSECTION

  • replace_until_timestamp (int) – The unix timestamp to specify the point till which we will replace data from the source feature group.

  • skip_materialize (bool) – If true, will not materialize the concatenated feature group
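
The sketch below assembles keyword arguments for concatenate_data, checking merge_type against the two documented values and converting a timezone-aware datetime into an integer unix timestamp. It assumes the timestamp is expected in seconds, which this page does not state. The helper names and the fg_source_placeholder ID are hypothetical.

```python
from datetime import datetime, timezone

MERGE_TYPES = ("UNION", "INTERSECTION")


def to_unix_seconds(moment):
    """Convert a timezone-aware datetime to whole unix seconds (assumed unit)."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time ambiguity")
    return int(moment.timestamp())


def concatenate_kwargs(source_feature_group_id, merge_type="UNION",
                       replace_until=None, skip_materialize=False):
    """Build keyword arguments matching the concatenate_data signature."""
    if merge_type not in MERGE_TYPES:
        raise ValueError(f"merge_type must be one of {MERGE_TYPES}")
    return {
        "source_feature_group_id": source_feature_group_id,
        "merge_type": merge_type,
        "replace_until_timestamp": (
            None if replace_until is None else to_unix_seconds(replace_until)
        ),
        "skip_materialize": skip_materialize,
    }


kwargs = concatenate_kwargs(
    "fg_source_placeholder",  # placeholder feature group ID
    replace_until=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(kwargs)
```

The result could be passed as feature_group.concatenate_data(**kwargs) on the destination feature group.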

remove_concatenation_config()

Removes the concatenation config on a destination feature group.

Parameters

feature_group_id (str) – Removes the concatenation configuration on a destination feature group

refresh()

Calls describe and refreshes the current object’s fields

Returns

The current object

Return type

FeatureGroup

describe()

Describe a Feature Group.

Parameters

feature_group_id (str) – The unique ID associated with the feature group.

Returns

The feature group object.

Return type

FeatureGroup

set_indexing_config(primary_key=None, update_timestamp_key=None, lookup_keys=None)

Sets various attributes of the feature group used for deployment lookups and streaming updates.

Parameters
  • primary_key (str) – Name of feature which defines the primary key of the feature group.

  • update_timestamp_key (str) – Name of feature which defines the update timestamp of the feature group - used in concatenation and primary key deduplication.

  • lookup_keys (list) – List of feature names which can be used in the lookup api to restrict the computation to a set of dataset rows. These feature names have to correspond to underlying dataset columns.

update(description=None)

Modifies an existing feature group

Parameters

description (str) – The description about the feature group.

Returns

The updated feature group object.

Return type

FeatureGroup

detach_from_template()

Update a feature group to detach it from a template.

Currently, this converts the feature group into a SQL feature group rather than a template feature group.

Parameters

feature_group_id (str) – The unique ID associated with the feature group.

Returns

The updated feature group

Return type

FeatureGroup

update_template_bindings(template_bindings=None)

Update the feature group template bindings for a template feature group.

Parameters

template_bindings (list) – Values in these bindings override values set in the template.

Returns

The updated feature group

Return type

FeatureGroup

update_sql_definition(sql)

Updates the SQL statement for a feature group.

Parameters

sql (str) – Input SQL statement for the feature group.

Returns

The updated feature group

Return type

FeatureGroup

update_dataset_feature_expression(feature_expression)

Updates the SQL feature expression for a dataset feature group’s custom features

Parameters

feature_expression (str) – Input SQL statement for the feature group.

Returns

The updated feature group

Return type

FeatureGroup

update_function_definition(function_source_code=None, function_name=None, input_feature_groups=None, cpu_size=None, memory=None, package_requirements=None)

Updates the function definition for a feature group created using createFeatureGroupFromFunction

Parameters
  • function_source_code (str) – Contents of a valid source code file in a supported Feature Group specification language (currently only Python). The source code should contain a function called function_name. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.

  • input_feature_groups (list) – List of feature groups that are supplied to the function as parameters. Each parameter is a materialized DataFrame (the same type as the function’s return value).

  • cpu_size (str) – Size of the cpu for the feature group function

  • memory (int) – Memory (in GB) for the feature group function

  • package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency

Returns

The updated feature group

Return type

FeatureGroup
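
Since function_source_code must contain a function called function_name, it can help to check this locally before sending the update. The following sketch uses the standard library ast module for that check. The sample source, the add_total function and the pinned pandas version are illustrative placeholders, not values taken from the library.

```python
import ast


def defines_function(source_code, function_name):
    """Return True if source_code defines a top-level function with this name."""
    tree = ast.parse(source_code)  # raises SyntaxError for invalid Python
    return any(
        isinstance(node, ast.FunctionDef) and node.name == function_name
        for node in tree.body
    )


# Hypothetical feature group function: takes and returns a DataFrame.
SOURCE = '''
def add_total(orders):
    orders = orders.copy()
    orders["total"] = orders["price"] * orders["quantity"]
    return orders
'''

# Placeholder package: version mapping for package_requirements.
PACKAGE_REQUIREMENTS = {"pandas": "1.5.3"}

print(defines_function(SOURCE, "add_total"))
```

If the check passes, the values could be supplied as function_source_code=SOURCE, function_name="add_total" and package_requirements=PACKAGE_REQUIREMENTS.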

update_zip(function_name, module_name, input_feature_groups=None, cpu_size=None, memory=None, package_requirements=None)

Updates the zip for a feature group created using createFeatureGroupFromZip

Parameters
  • function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.

  • module_name (str) – Path to the file with the feature group function.

  • input_feature_groups (list) – List of feature groups that are supplied to the function as parameters. Each parameter is a materialized DataFrame (the same type as the function’s return value).

  • cpu_size (str) – Size of the cpu for the feature group function

  • memory (int) – Memory (in GB) for the feature group function

  • package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency

Returns

The Upload to upload the zip file to

Return type

Upload

update_git(application_connector_id=None, branch_name=None, python_root=None, function_name=None, module_name=None, input_feature_groups=None, cpu_size=None, memory=None, package_requirements=None)

Updates a feature group created using createFeatureGroupFromGit

Parameters
  • application_connector_id (str) – The unique ID associated with the git application connector.

  • branch_name (str) – Name of the branch in the git repository to be used for training.

  • python_root (str) – Path from the top level of the git repository to the directory containing the Python source code. If not provided, the default is the root of the git repository.

  • function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.

  • module_name (str) – Path to the file with the feature group function.

  • input_feature_groups (list) – List of feature groups that are supplied to the function as parameters. Each parameter is a materialized DataFrame (the same type as the function’s return value).

  • cpu_size (str) – Size of the cpu for the feature group function

  • memory (int) – Memory (in GB) for the feature group function

  • package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency

Returns

The updated FeatureGroup

Return type

FeatureGroup

update_feature(name, select_expression=None, new_name=None)

Modifies an existing feature in a feature group. A user needs to specify the name and feature group ID and either a SQL statement or a new name to update the feature.

Parameters
  • name (str) – The name of the feature to be updated.

  • select_expression (str) – Input SQL statement for modifying the feature.

  • new_name (str) – The new name of the feature.

Returns

The updated feature group object.

Return type

FeatureGroup

list_exports()

Lists all of the feature group exports for a given feature group

Parameters

feature_group_id (str) – The ID of the feature group

Returns

The feature group exports

Return type

FeatureGroupExport

set_modifier_lock(locked=True)

Locks a feature group to prevent it from being modified.

Parameters

locked (bool) – True or False to disable or enable feature group modification.

list_modifiers()

Lists the users who can modify a feature group.

Parameters

feature_group_id (str) – The unique ID associated with the feature group.

Returns

Modification lock status and groups and organizations added to the feature group.

Return type

ModificationLockInfo

add_user_to_modifiers(email)

Adds a user to a feature group.

Parameters

email (str) – The email address of the user to be added.

add_organization_group_to_modifiers(organization_group_id)

Adds an organization group to a feature group.

Parameters

organization_group_id (str) – The unique ID associated with the organization group.

remove_user_from_modifiers(email)

Removes a user from a feature group.

Parameters

email (str) – The email address of the user to be removed.

remove_organization_group_from_modifiers(organization_group_id)

Removes an organization group from a feature group.

Parameters

organization_group_id (str) – The unique ID associated with the organization group.

delete_feature(name)

Removes an existing feature from a feature group. A user needs to specify the name of the feature to be deleted and the feature group ID.

Parameters

name (str) – The name of the feature to be deleted.

Returns

The updated feature group object.

Return type

FeatureGroup

delete()

Removes an existing feature group.

Parameters

feature_group_id (str) – The unique ID associated with the feature group.

create_version(variable_bindings=None)

Creates a snapshot for a specified feature group.

Parameters

variable_bindings (dict) – JSON object (aka map) defining variable bindings that override parent feature group values.

Returns

A feature group version.

Return type

FeatureGroupVersion

list_versions(limit=100, start_after_version=None)

Retrieves a list of all feature group versions for the specified feature group.

Parameters
  • limit (int) – The max length of the returned versions

  • start_after_version (str) – Results will start after this version

Returns

An array of feature group versions.

Return type

FeatureGroupVersion
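
The limit and start_after_version parameters suggest cursor-style paging. Below is a minimal sketch of walking through all versions page by page. It assumes each returned object exposes its version ID as a feature_group_version attribute, an assumption not confirmed on this page, which is why the accessor is a parameter. FakeFeatureGroup is a hypothetical stand-in so the snippet runs offline.

```python
from types import SimpleNamespace


def iter_versions(feature_group, page_size=100,
                  version_id=lambda v: v.feature_group_version):
    """Yield every version by paging with limit / start_after_version."""
    start_after = None
    while True:
        page = feature_group.list_versions(limit=page_size,
                                           start_after_version=start_after)
        if not page:
            return
        yield from page
        if len(page) < page_size:  # a short page means there is nothing left
            return
        start_after = version_id(page[-1])


class FakeFeatureGroup:
    """Hypothetical stand-in that pages over an in-memory list of versions."""

    def __init__(self, ids):
        self._versions = [SimpleNamespace(feature_group_version=i) for i in ids]

    def list_versions(self, limit=100, start_after_version=None):
        start = 0
        if start_after_version is not None:
            ids = [v.feature_group_version for v in self._versions]
            start = ids.index(start_after_version) + 1
        return self._versions[start:start + limit]


fake = FakeFeatureGroup(["v1", "v2", "v3", "v4", "v5"])
all_ids = [v.feature_group_version for v in iter_versions(fake, page_size=2)]
print(all_ids)
```

Passing a different version_id callable adapts the helper if the version objects name their ID differently.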

create_template(name, template_sql, template_variables, description=None, template_bindings=None, should_attach_feature_group_to_template=False)

Create a feature group template.

Parameters
  • name (str) – The user-friendly name for this feature group template.

  • template_sql (str) – The template sql that will be resolved by applying values from the template variables to generate sql for a feature group.

  • template_variables (list) – The template variables for resolving the template.

  • description (str) – A description of this feature group template

  • template_bindings (list) – If the feature group will be attached to the newly created template, set these variable bindings on that feature group.

  • should_attach_feature_group_to_template (bool) – Set to True to convert the feature group to a template feature group and attach it to the newly created template.

Returns

The created feature group template

Return type

FeatureGroupTemplate

suggest_template_for()

Suggest values for a feature group template, based on a feature group.

Parameters

feature_group_id (str) – The unique ID associated with the feature group to use for suggesting values to use for the template.

Returns

None

Return type

FeatureGroupTemplate

get_recent_streamed_data()

Returns recently streamed data to a streaming feature group.

Parameters

feature_group_id (str) – The unique ID associated with the feature group.

create_prediction_metric(prediction_metric_config, project_id=None)

Create a prediction metric job description for the given prediction and actual-labels data.

Parameters
  • prediction_metric_config (dict) – Specification for prediction metric to run in this job.

  • project_id (str) – Project to use for the prediction metrics. Defaults to the project for the input feature_group, if the feature_group has exactly one project.

Returns

The Prediction Metric job description.

Return type

PredictionMetric

list_prediction_metrics(limit=100, should_include_latest_version_description=True, start_after_id=None)

List the prediction metrics for a feature group.

Parameters
  • limit (int) – The number of prediction metrics to be retrieved.

  • should_include_latest_version_description (bool) – Whether to include the description of the latest prediction metric version for each prediction metric.

  • start_after_id (str) – An offset parameter to exclude all prediction metrics till the specified prediction metric ID.

Returns

The prediction metrics for this feature group.

Return type

PredictionMetric

query_prediction_metrics(project_id=None, limit=100, should_include_latest_version_description=True, start_after_id=None)

Query and return prediction metrics and extra data needed by the UI, constrained by the parameters provided.

feature_group_id (Unique String Identifier): [optional] The feature group used as input to the prediction metrics.

Parameters
  • project_id (str) – [optional] The project_id of the prediction metrics.

  • limit (int) – The number of prediction metrics to be retrieved.

  • should_include_latest_version_description (bool) – Include the description of the latest prediction metric version for each prediction metric.

  • start_after_id (str) – An offset parameter to exclude all prediction metrics till the specified prediction metric ID.

Returns

The prediction metrics for this feature group.

Return type

PredictionMetric

upsert_data(streaming_token, data)

Updates the existing data in the feature group for a given lookup key recordId if the recordId is found; otherwise inserts the data into the feature group as new.

Parameters
  • streaming_token (str) – The streaming token for authenticating requests

  • data (dict) – The data to record

append_data(streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters
  • streaming_token (str) – The streaming token for authenticating requests

  • data (dict) – The data to record

upsert_multiple_data(streaming_token, data)

Updates the existing data in the feature group for a given lookup key recordId if the recordId is found; otherwise inserts the data into the feature group as new.

Parameters
  • streaming_token (str) – The streaming token for authenticating requests

  • data (dict) – The data to record, as an array of JSON Objects

append_multiple_data(streaming_token, data)

Appends new data into the feature group for a given lookup key recordId.

Parameters
  • streaming_token (str) – The streaming token for authenticating requests

  • data (list) – The data to record, as an array of JSON objects
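
When many records need to be streamed, sending them in fixed-size batches through append_multiple_data keeps each request bounded. The sketch below shows that pattern. The batch size of 500 is an arbitrary illustration, not a documented limit, and FakeStreamingGroup and the tok_placeholder token are hypothetical stand-ins so the example runs offline.

```python
def chunked(records, batch_size):
    """Yield consecutive slices of records, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def append_in_batches(feature_group, streaming_token, records, batch_size=500):
    """Send records via append_multiple_data in batches; return the count sent."""
    sent = 0
    for batch in chunked(records, batch_size):
        feature_group.append_multiple_data(streaming_token, batch)
        sent += len(batch)
    return sent


class FakeStreamingGroup:
    """Hypothetical stand-in that stores the batches it receives."""

    def __init__(self):
        self.batches = []

    def append_multiple_data(self, streaming_token, data):
        self.batches.append(list(data))


fake_group = FakeStreamingGroup()
records = [{"recordId": str(i), "value": i} for i in range(5)]
sent_count = append_in_batches(fake_group, "tok_placeholder", records, batch_size=2)
print(sent_count, [len(b) for b in fake_group.batches])
```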

wait_for_dataset(timeout=7200)

Waits until the feature group’s dataset, if any, is ready for use.

Parameters

timeout (int, optional) – The maximum time, in seconds, to wait for the call to finish. If it doesn’t finish within this time, the call times out. Defaults to 7200 seconds.

wait_for_upload(timeout=7200)

Waits for a feature group created from a dataframe to be ready for materialization and version creation.

Parameters

timeout (int, optional) – The maximum time, in seconds, to wait for the call to finish. If it doesn’t finish within this time, the call times out. Defaults to 7200 seconds.

wait_for_materialization(timeout=7200)

Waits until the feature group is materialized.

Parameters

timeout (int, optional) – The maximum time, in seconds, to wait for the call to finish. If it doesn’t finish within this time, the call times out. Defaults to 7200 seconds.

wait_for_streaming_ready(timeout=600)

Waits for the feature group indexing config to be applied for streaming

Parameters

timeout (int, optional) – The maximum time, in seconds, to wait for the call to finish. If it doesn’t finish within this time, the call times out. Defaults to 600 seconds.

get_status(streaming_status=False)

Gets the status of the feature group.

Returns

A string describing the status of a feature group (pending, complete, etc.).

Return type

str

Parameters

streaming_status (bool) –
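
The wait_for_* methods and get_status together describe a poll-until-done pattern. The sketch below shows a generic version of that pattern. It is not the library's implementation, and the PENDING and COMPLETE status strings are assumed spellings of the statuses mentioned above. The sleep and clock arguments are injectable only so the loop can be exercised without real delays.

```python
import time


def wait_until_status(get_status, done_statuses, timeout=600, poll_interval=5,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a done status or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in done_statuses:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(poll_interval)


# Simulated status sequence; a real caller might pass feature_group.get_status.
statuses = iter(["PENDING", "PENDING", "COMPLETE"])
result = wait_until_status(lambda: next(statuses), {"COMPLETE"},
                           sleep=lambda seconds: None)
print(result)
```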

load_as_pandas()

Loads the feature group into a pandas DataFrame.

Returns

A pandas dataframe with annotations and text_snippet columns.

Return type

DataFrame

describe_dataset()

Displays the dataset attached to a feature group.

Returns

A dataset object with all the relevant information about the dataset.

Return type

Dataset