pipelinex.mlflow_on_kedro.hooks.mlflow package
Submodules
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_artifacts_logger module
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_artifacts_logger.MLflowArtifactsLoggerHook(filepaths_before_pipeline_run=None, filepaths_after_pipeline_run=None, datasets_after_node_run=None, enable_mlflow=True)
Bases: object
Logs artifacts of specified file paths and dataset names to MLflow.
__init__(filepaths_before_pipeline_run=None, filepaths_after_pipeline_run=None, datasets_after_node_run=None, enable_mlflow=True)
Parameters:
filepaths_before_pipeline_run (Optional[List[str]]) – The file paths of artifacts to log before the pipeline is run.
filepaths_after_pipeline_run (Optional[List[str]]) – The file paths of artifacts to log after the pipeline is run.
datasets_after_node_run (Optional[List[str]]) – The dataset names to log after the node is run.
enable_mlflow (bool) – Enable logging to MLflow.
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_basic_logger module
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_basic_logger.MLflowBasicLoggerHook(uri=None, experiment_name=None, artifact_location=None, run_name=None, run_id=None, nested=False, tags=None, offset_hours=0, enable_logging_time_begin=True, enable_logging_time_end=True, enable_logging_time=True, logging_kedro_run_params=[], enable_mlflow=True)
Bases: object
Configures MLflow and logs the duration of the pipeline run to MLflow.
__init__(uri=None, experiment_name=None, artifact_location=None, run_name=None, run_id=None, nested=False, tags=None, offset_hours=0, enable_logging_time_begin=True, enable_logging_time_end=True, enable_logging_time=True, logging_kedro_run_params=[], enable_mlflow=True)
Parameters:
uri (Optional[str]) – The MLflow tracking server URI. uri arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.set_tracking_uri
experiment_name (Optional[str]) – The experiment name. name arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment
artifact_location (Optional[str]) – artifact_location arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.create_experiment
run_name (Optional[str]) – Shown as ‘Run Name’ in the MLflow UI.
run_id (Optional[str]) – The UUID of an existing MLflow run to use, instead of letting MLflow create a new run under experiment_name. run_id arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run
nested (bool) – nested arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run
tags (Optional[Dict[str, Any]]) – tags arg fed to: https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.start_run
offset_hours (float) – The offset in hours (e.g. 0 for UTC+00:00) applied to the times logged to MLflow. 0 by default.
enable_logging_time_begin (bool) – Enable logging the time the Kedro pipeline began. True by default.
enable_logging_time_end (bool) – Enable logging the time the Kedro pipeline ended. True by default.
enable_logging_time (bool) – Enable logging how long the Kedro pipeline ran. True by default.
logging_kedro_run_params (Union[List[str], str]) – List of Kedro run params to log to MLflow, or “__ALL__” to log all. [] (empty) by default.
enable_mlflow (bool) – Enable configuring and logging to MLflow.
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_basic_logger.get_timestamp(dt=None, offset_hours=0, fmt='%Y-%m-%dT%H:%M:%S')
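The signature above suggests that get_timestamp formats a datetime after shifting it by offset_hours. The following is a minimal sketch of that idea, not the library's actual implementation: the function name get_timestamp_sketch is hypothetical, and the choice of the current UTC time when dt is None is an assumption.

```python
from datetime import datetime, timedelta, timezone


def get_timestamp_sketch(dt=None, offset_hours=0, fmt="%Y-%m-%dT%H:%M:%S"):
    """Hypothetical stand-in mirroring the documented signature."""
    # Assumption: when no datetime is given, start from the current UTC time.
    if dt is None:
        dt = datetime.now(timezone.utc)
    # Shift by the requested number of hours, then format as a string.
    return (dt + timedelta(hours=offset_hours)).strftime(fmt)


# Midnight UTC shifted by 9 hours (e.g. UTC+09:00)
example = get_timestamp_sketch(datetime(2020, 1, 1, 0, 0, 0), offset_hours=9)
```

Under these assumptions, an offset_hours of 9 would render times in UTC+09:00 while the stored format stays free of any timezone suffix.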
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_catalog_logger module
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_catalog_logger.MLflowCatalogLoggerHook(auto=True, mlflow_catalog={}, enable_mlflow=True)
Bases: object
Logs datasets to MLflow.
__init__(auto=True, mlflow_catalog={}, enable_mlflow=True)
Parameters:
auto (bool) – If True, each dataset (Python func input/output) not listed in the catalog will be logged following the same rule as the "a" option below.
mlflow_catalog (Dict[str, Union[str, AbstractDataSet]]) – [Deprecated in favor of MLflowDataSet] Specify how to log each dataset (Python func input/output).
If set to “p”, the value will be saved/loaded as an MLflow parameter (string).
If set to “m”, the value will be saved/loaded as an MLflow metric (numeric).
If set to “a”, the value will be saved/loaded based on the data type.
  If the data type is either {float, int}, the value will be saved/loaded as an MLflow metric.
  If the data type is either {str, list, tuple, set}, the value will be saved/loaded as an MLflow parameter.
  If the data type is dict, the value will be flattened with dot (“.”) as the separator and then saved/loaded as either an MLflow metric or parameter based on each data type as explained above.
If set to either {“json”, “csv”, “xls”, “parquet”, “png”, “jpg”, “jpeg”, “img”, “pkl”, “txt”, “yml”, “yaml”}, the backend dataset instance will be created accordingly to save/load as an MLflow artifact.
If set to a Kedro DataSet object or a dictionary, it will be used as the backend dataset to save/load as an MLflow artifact.
If set to None (default), MLflow logging will be skipped.
enable_mlflow (bool) – Enable logging to MLflow.
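The type-based rule described for the “a” option can be illustrated with a small stand-alone sketch. This is not the hook's actual code: the helper names flatten_dict and split_auto are hypothetical, prefixing flattened keys with the dataset name is an assumption, and so is the decision to exclude bool values from metrics and to skip values of other types.

```python
def flatten_dict(d, parent_key="", sep="."):
    """Flatten nested dicts, joining keys with the separator."""
    items = {}
    for key, value in d.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            items.update(flatten_dict(value, full_key, sep))
        else:
            items[full_key] = value
    return items


def split_auto(name, value):
    """Sort a dataset value into metrics and params by data type."""
    metrics, params = {}, {}
    # Assumption: dict values are flattened under the dataset name.
    flat = flatten_dict(value, name) if isinstance(value, dict) else {name: value}
    for key, item in flat.items():
        # Assumption: bool is excluded even though it subclasses int.
        if isinstance(item, (float, int)) and not isinstance(item, bool):
            metrics[key] = float(item)
        elif isinstance(item, (str, list, tuple, set)):
            params[key] = str(item)
        # Values of any other type are skipped in this sketch.
    return metrics, params


metrics, params = split_auto(
    "model", {"lr": 0.1, "epochs": 10, "opt": {"name": "adam"}}
)
```

In this sketch the numeric entries end up as metrics keyed "model.lr" and "model.epochs", and the nested string ends up as a parameter keyed "model.opt.name".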
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger module
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger.MLflowDataSetsLoggerHook(enable_mlflow=True)
Bases: object
Logs datasets of (list of) float/int and str classes to MLflow.
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger.MLflowOutputsLoggerHook(enable_mlflow=True)
Bases: pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_datasets_logger.MLflowDataSetsLoggerHook
Deprecated alias for MLflowDataSetsLoggerHook.
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_env_vars_logger module
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_env_vars_logger.MLflowEnvVarsLoggerHook(param_env_vars=None, metric_env_vars=None, prefix=None, enable_mlflow=True)
Bases: object
Logs environment variables to MLflow.
__init__(param_env_vars=None, metric_env_vars=None, prefix=None, enable_mlflow=True)
Parameters:
param_env_vars (Optional[List[str]]) – Environment variables to log to MLflow as parameters.
metric_env_vars (Optional[List[str]]) – Environment variables to log to MLflow as metrics.
prefix (Optional[str]) – Prefix to add to each name of MLflow parameters and metrics (“env..” by default).
enable_mlflow (bool) – Enable logging to MLflow.
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_env_vars_logger.env_vars_to_dict(env_vars=[], prefix='')
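Judging only from its name and signature, env_vars_to_dict appears to map environment variable names to their values, with prefix prepended to each key. A minimal sketch under that assumption follows; the function name env_vars_to_dict_sketch and the variable HYPOTHETICAL_BATCH_SIZE are made up for illustration, and returning None for unset variables is also an assumption.

```python
import os


def env_vars_to_dict_sketch(env_vars=(), prefix=""):
    """Hypothetical stand-in: map prefixed names to environment values."""
    # Assumption: unset variables map to None rather than raising an error.
    return {prefix + name: os.environ.get(name) for name in env_vars}


# Illustrative variable, set here only so the example is reproducible.
os.environ["HYPOTHETICAL_BATCH_SIZE"] = "32"
logged = env_vars_to_dict_sketch(["HYPOTHETICAL_BATCH_SIZE"], prefix="env..")
```

With the documented default prefix of “env..”, the resulting key in this sketch would be "env..HYPOTHETICAL_BATCH_SIZE".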
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_time_logger module
class pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_time_logger.MLflowTimeLoggerHook(gantt_filepath=None, gantt_params={}, metric_name_prefix='_time_to_run ', task_name_func=<function _get_task_name>, time_log_filepath=None, enable_plotly=True, enable_mlflow=True)
Bases: object
Logs the time taken to run each node (task) to MLflow. Optionally, the execution logs can be visualized as a Gantt chart by plotly.figure_factory.create_gantt (https://plotly.github.io/plotly.py-docs/generated/plotly.figure_factory.create_gantt.html) if plotly is installed.
__init__(gantt_filepath=None, gantt_params={}, metric_name_prefix='_time_to_run ', task_name_func=<function _get_task_name>, time_log_filepath=None, enable_plotly=True, enable_mlflow=True)
Parameters:
gantt_filepath (Optional[str]) – File path to save the generated Gantt chart.
gantt_params (Dict[str, Any]) – Args fed to: https://plotly.github.io/plotly.py-docs/generated/plotly.figure_factory.create_gantt.html
metric_name_prefix (str) – Prefix for the metric names. The metric names are metric_name_prefix concatenated with the string returned by task_name_func.
task_name_func (Callable[[Node], str]) – Callable that returns the task name from a kedro.pipeline.node.Node object.
time_log_filepath (Optional[str]) – File path to save the time log in JSON format.
enable_plotly (bool) – Enable visualization of the logged time as a Gantt chart.
enable_mlflow (bool) – Enable logging to MLflow.
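To make the naming rule for metric_name_prefix and task_name_func concrete, here is a small sketch of how a custom task_name_func might shape the metric names. FakeNode is a hypothetical stand-in for kedro.pipeline.node.Node that carries only a name attribute, and short_task_name and metric_name_for are illustrative helpers rather than part of the library.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Hypothetical stand-in for kedro.pipeline.node.Node (name only)."""
    name: str


def short_task_name(node):
    """Illustrative task_name_func: keep the text before the first '('."""
    return node.name.split("(")[0]


def metric_name_for(node, metric_name_prefix="_time_to_run ",
                    task_name_func=short_task_name):
    # Per the parameter description: prefix + string from task_name_func.
    return metric_name_prefix + task_name_func(node)


name = metric_name_for(FakeNode(name="train_model([x, y]) -> [model]"))
```

A callable with the same shape as short_task_name could presumably be passed as task_name_func to keep metric names short when node names include their input and output lists.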
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils module
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils.mlflow_log_artifacts(paths, artifact_path=None, enable_mlflow=True)
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils.mlflow_log_metrics(metrics, step=None, enable_mlflow=True)
pipelinex.mlflow_on_kedro.hooks.mlflow.mlflow_utils.mlflow_log_params(params, enable_mlflow=True)