
Module h2o_wave_ml.ml

Functions

build_model

def build_model(*, target_column: str, train_file_path: str = '', train_df: Optional[pandas.core.frame.DataFrame] = None, model_metric: ModelMetric = ModelMetric.AUTO, task_type: Optional[TaskType] = None, model_type: Optional[ModelType] = None, categorical_columns: Optional[List[str]] = None, feature_columns: Optional[List[str]] = None, drop_columns: Optional[List[str]] = None, validation_file_path: str = '', validation_df: Optional[pandas.core.frame.DataFrame] = None, access_token: str = '', refresh_token: str = '', **kwargs) -> Model

Trains a model.

At a minimum, the function must be called with target_column and either train_file_path or train_df. If model_type is not specified, it is inferred from the current environment, defaulting to an H2O-3 model.
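As a rough illustration of the input rule above, the following sketch checks for a target column and at least one training source before delegating to build_model. The names `has_training_source` and `train` are hypothetical wrappers, not part of h2o_wave_ml, and the import is deferred so the check itself runs without the package installed.

```python
from typing import Any, Optional


def has_training_source(train_file_path: str = '', train_df: Optional[Any] = None) -> bool:
    """Return True when a training file path or a DataFrame is supplied."""
    # `is not None` avoids evaluating a DataFrame in a boolean context.
    return bool(train_file_path) or train_df is not None


def train(target_column: str, train_file_path: str = '', train_df: Optional[Any] = None):
    """Hypothetical wrapper: validate the inputs, then call build_model."""
    if not target_column:
        raise ValueError('target_column is required')
    if not has_training_source(train_file_path, train_df):
        raise ValueError('pass train_file_path or train_df')

    # Deferred import: only needed once the inputs are known to be usable.
    from h2o_wave_ml.ml import build_model

    # model_type is left unset, so it is inferred from the environment.
    return build_model(
        target_column=target_column,
        train_file_path=train_file_path,
        train_df=train_df,
    )
```

A call such as `train(target_column='label', train_file_path='./train.csv')` would then train on the file, where the column name and path are placeholders.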

Args
target_column
The name of the target column (the column to be predicted).
train_file_path
The path to the training dataset.
train_df
A pandas DataFrame to use as the training set instead of a file.
model_metric
Optional evaluation metric to be used for modeling.
task_type
Optional task type. Will be automatically determined if it's not specified.
model_type
Optional model type.
categorical_columns
Optional list of column names to be converted (from numeric) to categorical.
feature_columns
Optional list of column names to be used for modeling.
drop_columns
Optional list of column names to be dropped before modeling.
validation_file_path
Optional path to a validation dataset.
validation_df
Optional Pandas DataFrame as a validation dataset.
access_token
Optional access token if the engine needs to be authenticated.
refresh_token
Optional refresh token if the model needs to be authenticated.
kwargs
Optional parameters to be passed to the model builder, Steam or MLOps.
Kwargs

The list of supported Driverless AI (DAI) parameters. Parameter descriptions can be found in the Driverless AI documentation.

_dai_accuracy
_dai_time
_dai_interpretability
_dai_scorer
_dai_models
_dai_transformers
_dai_weight_column
_dai_fold_column
_dai_time_column
_dai_time_groups_columns
_dai_unavailable_at_prediction_time_columns
_dai_enable_gpus
_dai_reproducible
_dai_time_period_in_seconds
_dai_num_prediction_periods
_dai_num_gap_periods
_dai_config_overrides

The list of supported H2O-3 parameters. Parameter descriptions can be found in the H2O-3 documentation.

_h2o3_max_runtime_secs
_h2o3_max_models
_h2o3_nfolds
_h2o3_balance_classes
_h2o3_class_sampling_factors
_h2o3_max_after_balance_size
_h2o3_max_runtime_secs_per_model
_h2o3_stopping_metric
_h2o3_stopping_tolerance
_h2o3_stopping_rounds
_h2o3_seed
_h2o3_exclude_algos
_h2o3_include_algos
_h2o3_modeling_plan
_h2o3_preprocessing
_h2o3_exploitation_ratio
_h2o3_monotone_constraints
_h2o3_keep_cross_validation_predictions
_h2o3_keep_cross_validation_models
_h2o3_keep_cross_validation_fold_assignment
_h2o3_verbosity
_h2o3_export_checkpoints_dir

The list of the supported Steam options.

_steam_dai_instance_name
_steam_dai_multinode_name

The list of the supported MLOps options.

_mlops_deployment_env
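The option names in the lists above all follow a `_<prefix>_<name>` pattern. A minimal sketch of building such keyword arguments programmatically is shown below; `engine_kwargs` and `train_h2o3` are hypothetical helpers, not part of h2o_wave_ml, and the limit values are illustrative only.

```python
from typing import Any, Dict

# Prefixes taken from the option lists above.
KNOWN_PREFIXES = ('dai', 'h2o3', 'steam', 'mlops')


def engine_kwargs(prefix: str, **options: Any) -> Dict[str, Any]:
    """Hypothetical helper: turn max_models=5 into {'_h2o3_max_models': 5}."""
    if prefix not in KNOWN_PREFIXES:
        raise ValueError(f'unknown prefix: {prefix!r}')
    return {f'_{prefix}_{name}': value for name, value in options.items()}


def train_h2o3(train_file_path: str, target_column: str):
    """Train with a few H2O-3 limits passed through **kwargs."""
    # Deferred import so engine_kwargs works without h2o_wave_ml installed.
    from h2o_wave_ml.ml import build_model

    limits = engine_kwargs('h2o3', max_runtime_secs=60, max_models=5, seed=42)
    return build_model(
        train_file_path=train_file_path,
        target_column=target_column,
        **limits,
    )
```

The helper does not check that an option name appears in the supported lists, so a misspelled name would still be forwarded to build_model.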

Returns

The Wave model.

get_model

def get_model(model_id: str = '', endpoint_url: str = '', model_type: Optional[ModelType] = None, access_token: str = '', refresh_token: str = '') -> Optional[Model]

Retrieves a remote model using its ID or URL.

Args
model_id
The unique ID of the model.
endpoint_url
The endpoint URL of the deployed model.
model_type
Optional type of the model.
access_token
Optional access token if the model needs to be authenticated.
refresh_token
Optional refresh token if the model needs to be authenticated.
Returns

The Wave model.
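Because the signature returns Optional[Model], callers may want to handle a missing result explicitly. The sketch below assumes that None means no model could be retrieved, which the text above does not state; `fetch_model` is a hypothetical wrapper, not part of h2o_wave_ml.

```python
def fetch_model(model_id: str = '', endpoint_url: str = ''):
    """Hypothetical wrapper: look up a remote model and fail loudly if absent."""
    if not model_id and not endpoint_url:
        raise ValueError('pass model_id or endpoint_url')

    # Deferred import so the argument check runs without h2o_wave_ml installed.
    from h2o_wave_ml.ml import get_model

    model = get_model(model_id=model_id, endpoint_url=endpoint_url)
    if model is None:
        # Assumption: None signals that nothing was found for the given ID or URL.
        raise LookupError('no model found for the given ID or URL')
    return model
```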

load_model

def load_model(file_path: str) -> Model

Loads a saved model from the given location.

Args
file_path
Path to the saved model.
Returns

The Wave model.

save_model

def save_model(model: Model, *, output_dir_path: str) -> str

Saves a model to the given location.

Args
model
The model to store.
output_dir_path
A directory where the model will be saved.
Returns

The file path to the saved model.
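Since save_model returns the file path and load_model accepts one, the two can be chained into a round trip. The following is a minimal sketch under that reading; `save_and_reload` is a hypothetical helper, not part of h2o_wave_ml.

```python
import os
from typing import Any, Tuple


def save_and_reload(model: Any, output_dir_path: str) -> Tuple[Any, str]:
    """Hypothetical helper: save a model, then load it back from the returned path."""
    # Deferred import so this module can be imported without h2o_wave_ml installed.
    from h2o_wave_ml.ml import load_model, save_model

    # Precaution only: whether save_model creates the directory is not stated above.
    os.makedirs(output_dir_path, exist_ok=True)

    file_path = save_model(model, output_dir_path=output_dir_path)
    return load_model(file_path), file_path
```

Note that output_dir_path is keyword-only in the save_model signature, so it has to be passed by name.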