pipelinex.extras.datasets.requests package

Submodules

pipelinex.extras.datasets.requests.api_dataset module

APIDataSet loads data from HTTP(S) APIs and returns it either as a string or as a JSON dict. It uses the Python requests library: https://requests.readthedocs.io/en/master/

class pipelinex.extras.datasets.requests.api_dataset.APIDataSet(url=None, method='GET', data=None, params=None, headers=None, auth=None, timeout=60, attribute='', skip_errors=False, transforms=[], session_config={}, pool_config={'http://': {'max_retries': 0, 'pool_block': False, 'pool_connections': 10, 'pool_maxsize': 10}, 'https://': {'max_retries': 0, 'pool_block': False, 'pool_connections': 10, 'pool_maxsize': 10}})

Bases: AbstractDataset

APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/master/

Example:

from pipelinex.extras.datasets.requests.api_dataset import APIDataSet


data_set = APIDataSet(
    url="https://quickstats.nass.usda.gov",
    params={
        "key": "SOME_TOKEN",
        "format": "JSON",
        "commodity_desc": "CORN",
        "statisticcat_des": "YIELD",
        "agg_level_desc": "STATE",
        "year": 2000
    }
)
data = data_set.load()

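The keys of the default pool_config in the signature above (max_retries, pool_block, pool_connections, pool_maxsize) match the keyword arguments of requests.adapters.HTTPAdapter. The following is a minimal sketch of how such a mapping could be mounted on a requests session, one adapter per URL prefix. The helper name build_session is hypothetical, and the sketch is an assumption about the intent of pool_config, not the actual pipelinex implementation.

```python
import requests
from requests.adapters import HTTPAdapter

# Default pool_config, copied from the APIDataSet signature.
pool_config = {
    "http://": {
        "max_retries": 0,
        "pool_block": False,
        "pool_connections": 10,
        "pool_maxsize": 10,
    },
    "https://": {
        "max_retries": 0,
        "pool_block": False,
        "pool_connections": 10,
        "pool_maxsize": 10,
    },
}


def build_session(pool_config):
    """Hypothetical helper: mount one HTTPAdapter per URL prefix.

    Assumes each inner dict holds HTTPAdapter keyword arguments.
    """
    session = requests.Session()
    for prefix, adapter_kwargs in pool_config.items():
        session.mount(prefix, HTTPAdapter(**adapter_kwargs))
    return session


# No request is sent here; the session is only configured.
session = build_session(pool_config)
```

Under this reading, raising pool_maxsize for a prefix would allow more simultaneous connections per host, and max_retries would control how often a failed connection is retried.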
__init__(url=None, method='GET', data=None, params=None, headers=None, auth=None, timeout=60, attribute='', skip_errors=False, transforms=[], session_config={}, pool_config={'http://': {'max_retries': 0, 'pool_block': False, 'pool_connections': 10, 'pool_maxsize': 10}, 'https://': {'max_retries': 0, 'pool_block': False, 'pool_connections': 10, 'pool_maxsize': 10}})

Creates a new instance of APIDataSet to fetch data from an API endpoint.

Parameters:

load()

Loads data by delegation to the provided load method.

Return type:

Any

Returns:

Data returned by the provided load method.

Raises:

DatasetError – When the underlying load method raises an error.
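The delegation described for load() can be sketched as follows. This is a simplified stand-in, not the real AbstractDataset: the classes SketchDataset, FailingDataset and WorkingDataset are hypothetical, DatasetError is redefined locally so the sketch runs on its own, and the hook name _load is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any


class DatasetError(Exception):
    """Local stand-in for the DatasetError named above."""


class SketchDataset(ABC):
    """Hypothetical base class illustrating load by delegation."""

    @abstractmethod
    def _load(self) -> Any:
        """The 'provided load method' that a subclass implements."""

    def load(self) -> Any:
        try:
            return self._load()
        except DatasetError:
            # Already the documented error type: pass it through unchanged.
            raise
        except Exception as exc:
            # Any other failure is wrapped so callers handle one error type.
            name = type(self).__name__
            raise DatasetError(f"Failed while loading data from {name}: {exc}") from exc


class FailingDataset(SketchDataset):
    def _load(self) -> Any:
        raise ConnectionError("endpoint unreachable")


class WorkingDataset(SketchDataset):
    def _load(self) -> Any:
        return {"status": "ok"}
```

With this shape, a caller only needs to catch DatasetError, and the original exception remains available through the __cause__ attribute.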

save(data)

Saves data by delegation to the provided save method.

Parameters:

data (Any) – The value to be saved by the provided save method.

Raises:
  • DatasetError – When the underlying save method raises an error.

  • FileNotFoundError – When the save method receives a file instead of a directory, on Windows.

  • NotADirectoryError – When the save method receives a file instead of a directory, on Unix.

Return type:

None