pipelinex.flex_kedro.pipeline package
Submodules
pipelinex.flex_kedro.pipeline.pipeline module
- class pipelinex.flex_kedro.pipeline.pipeline.FlexiblePipeline(nodes, *, parameters_in_inputs=False, module='', decorator=[], **kwargs)
Bases: Pipeline
- __init__(nodes, *, parameters_in_inputs=False, module='', decorator=[], **kwargs)
Initialise Pipeline with a list of Node instances.
- Parameters:
nodes – The iterable of nodes the Pipeline will be made of. If you provide pipelines among the list of nodes, those pipelines will be expanded and all their nodes will become part of this new pipeline.
inputs – A name or collection of input names to be exposed as connection points to other pipelines upstream. This is optional; if not provided, the pipeline inputs are automatically inferred from the pipeline structure. When str or set[str] is provided, the listed input names will stay the same as they are named in the provided pipeline. When dict[str, str] is provided, current input names will be mapped to new names. Must only refer to the pipeline’s free inputs.
outputs – A name or collection of names to be exposed as connection points to other pipelines downstream. This is optional; if not provided, the pipeline outputs are automatically inferred from the pipeline structure. When str or set[str] is provided, the listed output names will stay the same as they are named in the provided pipeline. When dict[str, str] is provided, current output names will be mapped to new names. Can refer to both the pipeline’s free outputs, as well as intermediate results that need to be exposed.
parameters – A name or collection of parameters to namespace. When str or set[str] are provided, the listed parameter names will stay the same as they are named in the provided pipeline. When dict[str, str] is provided, current parameter names will be mapped to new names. The parameters can be specified without the params: prefix.
tags – Optional set of tags to be applied to all the pipeline nodes.
namespace – A prefix to give to all dataset names, except those explicitly named with the inputs/outputs arguments, and parameter references (params: and parameters).
prefix_datasets_with_namespace – A flag to specify if the inputs, outputs, and parameters of the nodes should be prefixed with the namespace. It is set to True by default. It is useful to turn off when namespacing is used for grouping nodes for deployment purposes.
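The renaming rules described for inputs, outputs and namespace can be illustrated with a small standalone sketch. This is not Kedro's or pipelinex's implementation: `resolve_name` is a hypothetical helper, the dataset names are made up, and the `.` separator between namespace and name is an assumption.

```python
from typing import Dict, Optional


def resolve_name(name: str, namespace: Optional[str], explicit: Dict[str, str]) -> str:
    """Hypothetical helper mimicking the documented renaming rules for one name."""
    # Names given explicitly via inputs/outputs are mapped as specified.
    if name in explicit:
        return explicit[name]
    # Parameter references are not touched by the namespace prefix.
    if name == "parameters" or name.startswith("params:"):
        return name
    # Every other dataset name receives the namespace prefix, if one is set.
    # The "." separator is an assumption of this sketch.
    return f"{namespace}.{name}" if namespace else name


explicit_inputs = {"raw": "shared_raw"}  # dict form: current name -> new name
names = ["raw", "features", "params:lr", "parameters"]
resolved = [resolve_name(n, "train", explicit_inputs) for n in names]
print(resolved)
```

Here only `features` picks up the `train` prefix; the explicitly mapped input and the two parameter references are left outside the namespace.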
- Raises:
ValueError – When an empty list of nodes is provided, or when not all nodes have unique names.
CircularDependencyError – When visiting all the nodes is not possible due to the existence of a circular dependency.
OutputNotUniqueError – When multiple Node instances produce the same output.
ConfirmNotUniqueError – When multiple Node instances attempt to confirm the same dataset.
PipelineError – When inputs, outputs or parameters are incorrectly specified, or they do not exist on the original pipeline.
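To make the uniqueness and circular-dependency conditions above concrete, here is a minimal standalone sketch of the same kinds of checks using only the standard library (Python 3.9+ for `graphlib`). It is an illustration under assumptions, not the library's code: the toy node table and the `execution_order` function are hypothetical, and it raises `ValueError` and `graphlib.CycleError` rather than the exception classes listed above.

```python
from collections import Counter
from graphlib import CycleError, TopologicalSorter

# Hypothetical toy nodes: node name -> (input names, output names)
toy_nodes = {
    "a": (["x"], ["y"]),
    "b": (["y"], ["z"]),
}


def execution_order(nodes):
    """Return node names in dependency order, or raise on invalid structure."""
    # Each output may be produced by one node only.
    counts = Counter(out for _, outs in nodes.values() for out in outs)
    duplicated = sorted(out for out, n in counts.items() if n > 1)
    if duplicated:
        raise ValueError(f"outputs produced more than once: {duplicated}")

    # Map each node to the nodes that produce its inputs.
    producer = {out: name for name, (_, outs) in nodes.items() for out in outs}
    graph = {
        name: {producer[i] for i in ins if i in producer}
        for name, (ins, _) in nodes.items()
    }
    # static_order() raises CycleError if the nodes cannot all be visited.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(toy_nodes))

try:
    # "a" needs "z" from "b", while "b" needs "y" from "a".
    execution_order({"a": (["z"], ["y"]), "b": (["y"], ["z"])})
except CycleError:
    print("circular dependency detected")
```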
Example:
```python
from kedro.pipeline import Pipeline
from kedro.pipeline import node

# In the following scenario first_ds and second_ds
# are datasets provided by io. Pipeline will pass these
# datasets to first_node function and provides the result
# to the second_node as input.

def first_node(first_ds, second_ds):
    return dict(third_ds=first_ds + second_ds)

def second_node(third_ds):
    return third_ds

pipeline = Pipeline(
    [
        node(first_node, ["first_ds", "second_ds"], ["third_ds"]),
        node(second_node, dict(third_ds="third_ds"), "fourth_ds"),
    ]
)
```
pipelinex.flex_kedro.pipeline.sub_pipeline module
- class pipelinex.flex_kedro.pipeline.sub_pipeline.SubPipeline(inputs=None, outputs=None, func=None, module='', decorator=None, intermediate_node_name_fmt='{}__{:03d}', **kwargs)
Bases: Pipeline
- __init__(inputs=None, outputs=None, func=None, module='', decorator=None, intermediate_node_name_fmt='{}__{:03d}', **kwargs)
Initialise Pipeline with a list of Node instances.
- Parameters:
nodes – The iterable of nodes the Pipeline will be made of. If you provide pipelines among the list of nodes, those pipelines will be expanded and all their nodes will become part of this new pipeline.
inputs (Union[str, List[str], Dict[str, str]]) – A name or collection of input names to be exposed as connection points to other pipelines upstream. This is optional; if not provided, the pipeline inputs are automatically inferred from the pipeline structure. When str or set[str] is provided, the listed input names will stay the same as they are named in the provided pipeline. When dict[str, str] is provided, current input names will be mapped to new names. Must only refer to the pipeline’s free inputs.
outputs (Union[str, List[str], Dict[str, str]]) – A name or collection of names to be exposed as connection points to other pipelines downstream. This is optional; if not provided, the pipeline outputs are automatically inferred from the pipeline structure. When str or set[str] is provided, the listed output names will stay the same as they are named in the provided pipeline. When dict[str, str] is provided, current output names will be mapped to new names. Can refer to both the pipeline’s free outputs, as well as intermediate results that need to be exposed.
parameters – A name or collection of parameters to namespace. When str or set[str] are provided, the listed parameter names will stay the same as they are named in the provided pipeline. When dict[str, str] is provided, current parameter names will be mapped to new names. The parameters can be specified without the params: prefix.
tags – Optional set of tags to be applied to all the pipeline nodes.
namespace – A prefix to give to all dataset names, except those explicitly named with the inputs/outputs arguments, and parameter references (params: and parameters).
prefix_datasets_with_namespace – A flag to specify if the inputs, outputs, and parameters of the nodes should be prefixed with the namespace. It is set to True by default. It is useful to turn off when namespacing is used for grouping nodes for deployment purposes.
- Raises:
ValueError – When an empty list of nodes is provided, or when not all nodes have unique names.
CircularDependencyError – When visiting all the nodes is not possible due to the existence of a circular dependency.
OutputNotUniqueError – When multiple Node instances produce the same output.
ConfirmNotUniqueError – When multiple Node instances attempt to confirm the same dataset.
PipelineError – When inputs, outputs or parameters are incorrectly specified, or they do not exist on the original pipeline.
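The `SubPipeline` signature defaults `intermediate_node_name_fmt` to `'{}__{:03d}'`, which is an ordinary `str.format` template with two positional fields. The sketch below only shows how such a template expands. That it is filled with a base name and an integer index is an assumption (this page does not say), and `"preprocess"` and the starting index are made up for illustration.

```python
# Default template taken from the SubPipeline signature.
fmt = "{}__{:03d}"

# Assumption: first field is a base name, second is an integer index.
base_name = "preprocess"  # hypothetical name, for illustration only
generated = [fmt.format(base_name, i) for i in range(3)]
print(generated)  # ['preprocess__000', 'preprocess__001', 'preprocess__002']
```

The `:03d` field zero-pads the integer to three digits, so generated names sort lexicographically in index order up to 999.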
Example:
```python
from kedro.pipeline import Pipeline
from kedro.pipeline import node

# In the following scenario first_ds and second_ds
# are datasets provided by io. Pipeline will pass these
# datasets to first_node function and provides the result
# to the second_node as input.

def first_node(first_ds, second_ds):
    return dict(third_ds=first_ds + second_ds)

def second_node(third_ds):
    return third_ds

pipeline = Pipeline(
    [
        node(first_node, ["first_ds", "second_ds"], ["third_ds"]),
        node(second_node, dict(third_ds="third_ds"), "fourth_ds"),
    ]
)
```