pipelinex.mlflow_on_kedro.hooks.mlflow package
Submodules
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_artifacts_logger module
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_artifacts_logger.MLflowArtifactsLoggerHook(filepaths_before_pipeline_run=None, filepaths_after_pipeline_run=None, datasets_after_node_run=None, enable_mlflow=True)[source]
Bases: object
Logs artifacts of specified file paths and dataset names to MLflow.
- __init__(filepaths_before_pipeline_run=None, filepaths_after_pipeline_run=None, datasets_after_node_run=None, enable_mlflow=True)[source]
- Parameters:
filepaths_before_pipeline_run (List[str]) – The file paths of artifacts to log before the pipeline is run.
filepaths_after_pipeline_run (List[str]) – The file paths of artifacts to log after the pipeline is run.
datasets_after_node_run (List[str]) – The dataset names to log after the node is run.
enable_mlflow (bool) – Enable logging to MLflow.
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_basic_logger module
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_basic_logger.MLflowBasicLoggerHook(uri=None, experiment_name=None, artifact_location=None, run_name=None, run_id=None, nested=False, tags=None, offset_hours=0, enable_logging_time_begin=True, enable_logging_time_end=True, enable_logging_time=True, logging_kedro_run_params=[], enable_mlflow=True)[source]
Bases: object
Configures MLflow and logs the duration of the pipeline run to MLflow.
- __init__(uri=None, experiment_name=None, artifact_location=None, run_name=None, run_id=None, nested=False, tags=None, offset_hours=0, enable_logging_time_begin=True, enable_logging_time_end=True, enable_logging_time=True, logging_kedro_run_params=[], enable_mlflow=True)[source]
- Parameters:
uri (str) – The MLflow tracking server URI. uri arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_tracking_uri
experiment_name (str) – The experiment name. name arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment
artifact_location (str) – artifact_location arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment
run_name (str) – Shown as ‘Run Name’ in the MLflow UI.
run_id (str) – The UUID of an existing MLflow experiment run to use instead of letting MLflow create a new run under experiment_name. run_id arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run
nested (bool) – nested arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run
tags (Optional[Dict[str, Any]]) – tags arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run
offset_hours (float) – The offset in hours (e.g. 0 for UTC+00:00) applied to the times logged to MLflow. Defaults to 0.
enable_logging_time_begin (bool) – Enable logging the time the Kedro pipeline began. Defaults to True.
enable_logging_time_end (bool) – Enable logging the time the Kedro pipeline ended. Defaults to True.
enable_logging_time (bool) – Enable logging how long the Kedro pipeline ran. Defaults to True.
logging_kedro_run_params (Union[List[str], str]) – List of Kedro run params to log to MLflow, or “__ALL__” to log all of them. Defaults to [] (empty).
enable_mlflow (bool) – Enable configuring and logging to MLflow.
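The logging_kedro_run_params option accepts either a list of names or the string “__ALL__”. A minimal sketch of how such a selection could work, assuming the run params arrive as a plain dict. select_run_params is a hypothetical helper written for illustration, not part of pipelinex, and treating any other single string as one name is also an assumption.

```python
from typing import Any, Dict, List, Optional, Union


def select_run_params(
    run_params: Dict[str, Any],
    logging_kedro_run_params: Optional[Union[List[str], str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: pick which Kedro run params would be logged."""
    if logging_kedro_run_params is None:
        return {}  # mirrors the documented default of [] (log nothing)
    if isinstance(logging_kedro_run_params, str):
        if logging_kedro_run_params == "__ALL__":
            return dict(run_params)
        # Assumption: any other single string is treated as one name.
        names = [logging_kedro_run_params]
    else:
        names = list(logging_kedro_run_params)
    # Keep only the requested names that are actually present.
    return {k: v for k, v in run_params.items() if k in names}


run_params = {"run_id": "2020-01-01", "env": "local", "pipeline_name": None}
selected = select_run_params(run_params, ["env"])
everything = select_run_params(run_params, "__ALL__")
```

Here selected holds only the env entry, while everything is a copy of the full dict.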
- pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_basic_logger.get_timestamp(dt=None, offset_hours=0, fmt='%Y-%m-%dT%H:%M:%S')[source]
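The signature above suggests a formatter that shifts a datetime by offset_hours and renders it with fmt. The following is a sketch of that idea only, written under the assumption that the input is a UTC time. timestamp_sketch is a hypothetical stand-in and may not match what get_timestamp does internally.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def timestamp_sketch(
    dt: Optional[datetime] = None,
    offset_hours: float = 0,
    fmt: str = "%Y-%m-%dT%H:%M:%S",
) -> str:
    """Hypothetical stand-in: format dt (assumed UTC) shifted by offset_hours."""
    if dt is None:
        dt = datetime.now(timezone.utc)
    return (dt + timedelta(hours=offset_hours)).strftime(fmt)


utc_noon = datetime(2020, 1, 1, 12, 0, 0)
print(timestamp_sketch(utc_noon, offset_hours=9))  # 2020-01-01T21:00:00
```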
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_catalog_logger module
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_catalog_logger.MLflowCatalogLoggerHook(auto=True, mlflow_catalog={}, enable_mlflow=True)[source]
Bases: object
Logs datasets to MLflow.
- __init__(auto=True, mlflow_catalog={}, enable_mlflow=True)[source]
- Parameters:
auto (bool) – If True, each dataset (Python func input/output) not listed in the catalog will be logged following the same rule as the “a” option below.
mlflow_catalog (Dict[str, Union[str, AbstractDataset]]) – [Deprecated in favor of MLflowDataSet] Specify how to log each dataset (Python func input/output).
If set to “p”, the value will be saved/loaded as an MLflow parameter (string).
If set to “m”, the value will be saved/loaded as an MLflow metric (numeric).
If set to “a”, the value will be saved/loaded based on the data type.
If the data type is either {float, int}, the value will be saved/loaded as an MLflow metric.
If the data type is either {str, list, tuple, set}, the value will be saved/loaded as an MLflow parameter.
If the data type is dict, the value will be flattened with dot (“.”) as the separator and then saved/loaded as either an MLflow metric or parameter based on each data type as explained above.
If set to either {“json”, “csv”, “xls”, “parquet”, “png”, “jpg”, “jpeg”, “img”, “pkl”, “txt”, “yml”, “yaml”}, the backend dataset instance will be created accordingly to save/load as an MLflow artifact.
If set to a Kedro DataSet object or a dictionary, it will be used as the backend dataset to save/load as an MLflow artifact.
If set to None (default), MLflow logging will be skipped.
enable_mlflow (bool) – Enable logging to MLflow.
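The type-based rule of the “a” option can be illustrated in plain Python. This is a sketch of the rule as described above, not the hook's actual code. flatten_dict and split_auto are hypothetical names, and prefixing flattened keys with the dataset name is an assumption.

```python
from typing import Any, Dict, Tuple


def flatten_dict(d: Dict[str, Any], parent: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with sep."""
    flat: Dict[str, Any] = {}
    for key, value in d.items():
        name = f"{parent}{sep}{key}" if parent else str(key)
        if isinstance(value, dict):
            flat.update(flatten_dict(value, name, sep))
        else:
            flat[name] = value
    return flat


def split_auto(name: str, value: Any) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (metrics, params) for one dataset under the "a" rule."""
    metrics: Dict[str, Any] = {}
    params: Dict[str, Any] = {}
    # Assumption: flattened keys are prefixed with the dataset name.
    items = flatten_dict({name: value}) if isinstance(value, dict) else {name: value}
    for key, val in items.items():
        if isinstance(val, (float, int)):
            # Note: bool is a subclass of int in Python, so it lands here too.
            metrics[key] = val
        elif isinstance(val, (str, list, tuple, set)):
            params[key] = val
        # Any other type is left out of both groups in this sketch.
    return metrics, params


metrics, params = split_auto("scores", {"train": {"acc": 0.9}, "note": "ok"})
print(metrics)  # {'scores.train.acc': 0.9}
print(params)   # {'scores.note': 'ok'}
```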
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger module
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger.MLflowDataSetsLoggerHook(enable_mlflow=True)[source]
Bases: object
Logs datasets of (list of) float/int and str classes to MLflow.
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger.MLflowOutputsLoggerHook(enable_mlflow=True)[source]
Bases: MLflowDataSetsLoggerHook
Deprecated alias for MLflowDataSetsLoggerHook.
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_env_vars_logger module
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_env_vars_logger.MLflowEnvVarsLoggerHook(param_env_vars=None, metric_env_vars=None, prefix=None, enable_mlflow=True)[source]
Bases: object
Logs environment variables to MLflow.
- __init__(param_env_vars=None, metric_env_vars=None, prefix=None, enable_mlflow=True)[source]
- Parameters:
param_env_vars (List[str]) – Environment variables to log to MLflow as parameters.
metric_env_vars (List[str]) – Environment variables to log to MLflow as metrics.
prefix (str) – Prefix to add to each name of MLflow parameters and metrics (“env..” by default).
enable_mlflow (bool) – Enable logging to MLflow.
- pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_env_vars_logger.env_vars_to_dict(env_vars=[], prefix='')[source]
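Judging by its name and signature, this helper collects the listed environment variables into a dict whose keys carry the prefix. A rough sketch under that assumption follows. env_vars_to_dict_sketch is a hypothetical stand-in, and skipping unset variables is a choice made here, not a documented behavior.

```python
import os
from typing import Dict, Iterable


def env_vars_to_dict_sketch(env_vars: Iterable[str] = (), prefix: str = "") -> Dict[str, str]:
    """Hypothetical stand-in: map prefixed names to the values of set variables."""
    # Assumption: variables that are not set are silently skipped.
    return {prefix + name: os.environ[name] for name in env_vars if name in os.environ}


os.environ["DEMO_BATCH_SIZE"] = "32"       # set only for this demonstration
os.environ.pop("DEMO_UNSET_VAR", None)     # make sure this one is absent
result = env_vars_to_dict_sketch(["DEMO_BATCH_SIZE", "DEMO_UNSET_VAR"], prefix="env..")
print(result)  # {'env..DEMO_BATCH_SIZE': '32'}
```

Values read from the environment are always strings, so a hook logging them as metrics would need to convert them to numbers first.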
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_time_logger module
- class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_time_logger.MLflowTimeLoggerHook(gantt_filepath=None, gantt_params={}, metric_name_prefix='_time_to_run ', task_name_func=<function _get_task_name>, time_log_filepath=None, enable_plotly=True, enable_mlflow=True)[source]
Bases: object
Logs the time taken to run each node (task) to MLflow. Optionally, the execution logs can be visualized as a Gantt chart by plotly.figure_factory.create_gantt (https://plotly.github.io/plotly.py-docs/generated/plotly.figure_factory.create_gantt.html) if plotly is installed.
- __init__(gantt_filepath=None, gantt_params={}, metric_name_prefix='_time_to_run ', task_name_func=<function _get_task_name>, time_log_filepath=None, enable_plotly=True, enable_mlflow=True)[source]
- Parameters:
gantt_filepath (str) – File path to save the generated Gantt chart.
gantt_params (Dict[str, Any]) – Args fed to: https://plotly.github.io/plotly.py-docs/generated/plotly.figure_factory.create_gantt.html
metric_name_prefix (str) – Prefix for the metric names. The metric names are metric_name_prefix concatenated with the string returned by task_name_func.
task_name_func (Callable[[Node], str]) – Callable to return the task name using a kedro.pipeline.node.Node object.
time_log_filepath (str) – File path to save the time log in JSON format.
enable_plotly (bool) – Enable visualization of logged time as a Gantt chart.
enable_mlflow (bool) – Enable logging to MLflow.
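The way metric names are built from metric_name_prefix and task_name_func can be sketched without Kedro or MLflow. FakeNode, default_task_name and time_node below are hypothetical stand-ins for illustration. The real hook receives kedro.pipeline.node.Node objects and sends the result to MLflow.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class FakeNode:
    """Stand-in for kedro.pipeline.node.Node; only a name is needed here."""
    name: str


def default_task_name(node: FakeNode) -> str:
    """Hypothetical task_name_func: use the node name as the task name."""
    return node.name


def time_node(
    node: FakeNode,
    func: Callable[[], object],
    metric_name_prefix: str = "_time_to_run ",
    task_name_func: Callable[[FakeNode], str] = default_task_name,
) -> Dict[str, float]:
    """Run func and return {metric name: elapsed seconds}."""
    start = time.perf_counter()
    func()
    elapsed = time.perf_counter() - start
    # Metric name = prefix concatenated with the task name.
    return {metric_name_prefix + task_name_func(node): elapsed}


node_metrics = time_node(FakeNode("train_model"), lambda: sum(range(1000)))
print(list(node_metrics))  # ['_time_to_run train_model']
```

Passing a different task_name_func changes only the part of the metric name after the prefix.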
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils module
- pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils.mlflow_log_artifacts(paths, artifact_path=None, enable_mlflow=True)[source]
- pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils.mlflow_log_metrics(metrics, step=None, enable_mlflow=True)[source]
- pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils.mlflow_log_params(params, enable_mlflow=True)[source]