dkutils.datakitchen_api.order_run_monitor module

class dkutils.datakitchen_api.order_run_monitor.EventInfoProvider(dk_client: 'DataKitchenClient', customer_code: 'str', pipeline_key: 'str', order_run_id: 'str', run_name: 'str' = None)

Bases: object

get_event_info(**kwargs) → dict
classmethod init(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, pipeline_key: str, order_run_id: str, run_name: str) → dkutils.datakitchen_api.order_run_monitor.EventInfoProvider
customer_code: str
dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient
order_run_id: str
pipeline_key: str
run_name: str = None
class dkutils.datakitchen_api.order_run_monitor.Node(events_api_client: 'EventsApi', event_info_provider: 'EventInfoProvider', name: 'str', info: 'dict' = None, started_event_published: 'bool' = False, status: 'str' = None, start_time: 'int' = None, end_time: 'int' = None)

Bases: object

publish_tests() → None
update(info: dict) → None
end_time: int = None
event_info_provider: dkutils.datakitchen_api.order_run_monitor.EventInfoProvider
events_api_client: events_ingestion_client.api.events_api.EventsApi
property failed: bool
info: dict = None
name: str
property running: bool
start_time: int = None
started_event_published: bool = False
status: str = None
property stopped: bool
property succeeded: bool
class dkutils.datakitchen_api.order_run_monitor.OrderRunMonitor(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, events_api_key: str, pipeline_name: str, order_run_id: str, run_name: Optional[str] = None, nodes_to_ignore: Optional[list] = None, sleep_time_secs: int = 10, host: str = 'https://dev-api.datakitchen.io')

Bases: object

Monitors a DataKitchen Order Run and reports its status to the Events Ingestion API of DataKitchen’s Observability module. It reports the start and stop of each node as an individual task and closes the run when finished.

Parameters
  • dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.

  • events_api_key (str) – Events Ingestion API key.

  • pipeline_name (str) – Name of the pipeline being monitored.

  • order_run_id (str) – Id of the Order Run being monitored.

  • run_name (str, optional) – Human-readable name for the pipeline execution being monitored (default: None).

  • nodes_to_ignore (list or None, optional) – List of nodes to ignore. If the monitor node is named Order_Run_Monitor, it is added to the ignore list by default and there is no need to add it here (default: None).

  • sleep_time_secs (int, optional) – Polling interval for monitoring the run in seconds (default: 10).

  • host (str, optional) – URL of the Events Ingestion API (default: 'https://dev-api.datakitchen.io').
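A minimal usage sketch based on the constructor signature above. The helper run_monitor and its monitor_cls parameter are hypothetical, introduced only so the example runs without a live DataKitchen connection; in real use monitor_cls would be OrderRunMonitor and dk_client an authenticated DataKitchenClient. The node name passed to nodes_to_ignore is also made up.

```python
from typing import Any, Optional


def run_monitor(
    monitor_cls: Any,
    dk_client: Any,
    events_api_key: str,
    pipeline_name: str,
    order_run_id: str,
    run_name: Optional[str] = None,
) -> bool:
    """Build a monitor, run it to completion, and report whether no node failed.

    monitor_cls must accept the documented OrderRunMonitor keyword arguments
    and expose monitor(), which returns (succeeded, failed).
    """
    monitor = monitor_cls(
        dk_client=dk_client,
        events_api_key=events_api_key,
        pipeline_name=pipeline_name,
        order_run_id=order_run_id,
        run_name=run_name,
        nodes_to_ignore=["Notify_Node"],  # hypothetical node name
        sleep_time_secs=10,
    )
    succeeded, failed = monitor.monitor()
    print(f"succeeded: {succeeded}")
    print(f"failed: {failed}")
    return not failed
```

Here host is left at its default; pass it explicitly to target a different Events Ingestion API URL.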

__init__(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, events_api_key: str, pipeline_name: str, order_run_id: str, run_name: Optional[str] = None, nodes_to_ignore: Optional[list] = None, sleep_time_secs: int = 10, host: str = 'https://dev-api.datakitchen.io')

get_conditional_nodes() → list

Retrieve a list of the conditional node names present in this Order Run.

Returns

List of conditional node names.

Return type

list

get_nodes_info() → dict

Extract and return the node information from the order run details, excluding the nodes that should be ignored (e.g. conditional nodes and the Order_Run_Monitor node itself).

Returns

Dictionary keyed by node name and valued by a dictionary of node details.

Return type

dict
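The exclusion step can be illustrated with a small sketch. The function filter_nodes, the node names, and the shape of the per-node details are hypothetical; the real method reads its input from the order run details, whose structure is not shown here.

```python
def filter_nodes(all_nodes: dict, nodes_to_ignore: list) -> dict:
    """Return a copy of all_nodes without the entries named in nodes_to_ignore."""
    ignored = set(nodes_to_ignore)
    return {
        name: details
        for name, details in all_nodes.items()
        if name not in ignored
    }


# Hypothetical node details keyed by node name.
all_nodes = {
    "Load_Data": {"status": "done"},
    "Check_Rows": {"status": "done"},  # pretend this is a conditional node
    "Order_Run_Monitor": {"status": "active"},
}
monitored = filter_nodes(all_nodes, ["Check_Rows", "Order_Run_Monitor"])
```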

get_order_run_details(**kwargs) → dict

Retrieve order run details for the associated order run. The provided kwargs may be used to augment the returned value with more granular details.

Parameters

kwargs – Optional keyword arguments as found in DataKitchenClient’s get_order_run_details()

Returns

Dictionary of order run details

Return type

dict

monitor() → tuple

Poll the DataKitchen platform API for the status of the associated Order Run. Report the status of each node until all nodes have completed, or until the run has failed and its nodes have stopped processing. If this order run is for an ingredient, monitoring is disabled.

Returns

Tuple of two lists: the first contains the names of the nodes that succeeded, the second the names of the nodes that failed. Both are empty if this is an ingredient order run.

Return type

tuple
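The polling behaviour described above can be sketched as a simple loop. This is an illustration under stated assumptions, not the library's implementation: poll_until_done, the fetch_statuses callable, and the four status strings are all hypothetical stand-ins for the calls the monitor makes to the platform API.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical status strings for illustration; the platform's real values may differ.
PENDING, RUNNING, SUCCEEDED, FAILED = "pending", "running", "succeeded", "failed"


def poll_until_done(
    fetch_statuses: Callable[[], Dict[str, str]],
    sleep_time_secs: float = 10,
) -> Tuple[List[str], List[str]]:
    """Poll node statuses until the run is finished, then return (succeeded, failed)."""
    while True:
        statuses = fetch_statuses()  # one snapshot: node name -> status
        values = list(statuses.values())
        all_finished = all(s in (SUCCEEDED, FAILED) for s in values)
        # A failed run can leave downstream nodes pending forever, so also stop
        # once something has failed and nothing is running any more.
        stalled_after_failure = FAILED in values and RUNNING not in values
        if all_finished or stalled_after_failure:
            break
        time.sleep(sleep_time_secs)

    succeeded = [name for name, s in statuses.items() if s == SUCCEEDED]
    failed = [name for name, s in statuses.items() if s == FAILED]
    return succeeded, failed
```

Passing the status source in as a callable keeps the loop testable: a test can feed it a fixed sequence of snapshots and a zero sleep interval.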

static parse_log_entry(log_entry: dict) → dict

Parse a log entry dictionary to derive the fields required for the MessageLogEventApiSchema.

Parameters

log_entry (dict) –

Dictionary of log details for a single log entry of the form:

{
    'datetime': '2022-08-16T14:38:58.611000',
    'disk_used': '5.8984375 MB',
    'exc_desc': None,
    'exc_type': None,
    'hostname': '0d4e31d0-1d9b-11ed-971d-621c363ef06a-lqq9t',
    'mem_usage': '127.33 MB',
    'message': 'Test Fail: DKDataTestFailed',
    'node': 'Fail_Node',
    'order_run_id': '115d3e42-1d9b-11ed-b495-c216e4cc8e61',
    'pid': 22,
    'priority': 27,
    'record_type': 'ERROR',
    'syslogts': '2022-08-16T14:38:58-05:00',
    'thread_name': 'NodeExecutorThread:0',
    'traceback': None
}

Returns

Dictionary of required and optional fields for the MessageLogEventApiSchema

Return type

dict
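As a rough illustration of this kind of mapping, the sketch below turns the sample log entry into a flat event dictionary. The output field names (log_level, message, event_timestamp, task_key) are assumptions made for the example and are not confirmed against the MessageLogEventApiSchema; the function name parse_log_entry_sketch is likewise hypothetical.

```python
from datetime import datetime


def parse_log_entry_sketch(log_entry: dict) -> dict:
    """Map a platform log entry to an event payload (output keys are assumed)."""
    # Parsing the timestamp validates it before it is passed on.
    timestamp = datetime.fromisoformat(log_entry["datetime"])
    return {
        "log_level": log_entry["record_type"],       # assumed field name
        "message": log_entry["message"],             # assumed field name
        "event_timestamp": timestamp.isoformat(),    # assumed field name
        "task_key": log_entry["node"],               # assumed field name
    }


sample_entry = {
    "datetime": "2022-08-16T14:38:58.611000",
    "message": "Test Fail: DKDataTestFailed",
    "node": "Fail_Node",
    "record_type": "ERROR",
}
event = parse_log_entry_sketch(sample_entry)
```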

process_log_entries() → None

Send MessageLog events for WARNING and ERROR log messages.
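The selection of messages to report can be sketched as a filter on the record_type field seen in the sample log entry above. The helper select_reportable is hypothetical, and treating 'WARNING' as a record_type value alongside 'ERROR' is an assumption drawn from the description.

```python
REPORTABLE_LEVELS = {"WARNING", "ERROR"}


def select_reportable(log_entries: list) -> list:
    """Keep only the log entries whose record_type is WARNING or ERROR."""
    return [
        entry
        for entry in log_entries
        if entry.get("record_type") in REPORTABLE_LEVELS
    ]


entries = [
    {"record_type": "INFO", "message": "Node started"},
    {"record_type": "WARNING", "message": "Row count lower than expected"},
    {"record_type": "ERROR", "message": "Test Fail: DKDataTestFailed"},
]
reportable = select_reportable(entries)
```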

class dkutils.datakitchen_api.order_run_monitor.RunStatus(value)

Bases: enum.Enum

An enumeration.

COMPLETED = 'COMPLETED'
COMPLETED_WITH_WARNINGS = 'COMPLETED_WITH_WARNINGS'
FAILED = 'FAILED'
RUNNING = 'RUNNING'
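A short sketch of working with these members. The enum is re-declared locally, mirroring the documented values, so that the example runs without dkutils installed; in real code it would be imported from this module. The is_finished check is an illustrative assumption about how a caller might use the status.

```python
from enum import Enum


class RunStatus(Enum):
    """Local mirror of the documented members, for illustration only."""

    COMPLETED = "COMPLETED"
    COMPLETED_WITH_WARNINGS = "COMPLETED_WITH_WARNINGS"
    FAILED = "FAILED"
    RUNNING = "RUNNING"


# Look a member up by its string value, as when parsing an API response.
status = RunStatus("FAILED")

# Every member other than RUNNING describes a run that has ended.
is_finished = status is not RunStatus.RUNNING
```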
dkutils.datakitchen_api.order_run_monitor.get_customer_code(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient) → str

Retrieve the customer code from the authenticated user associated with the provided DataKitchen client.

Parameters

dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.

Returns

Customer code - typically two or three letters.

Return type

str

dkutils.datakitchen_api.order_run_monitor.get_ingredient_owner_order_run_id(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient)

If this order run is for an ingredient, then return the parent order run id. Otherwise, return None.

Parameters

dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.

Returns

Parent order run id if the current order run is for an ingredient, otherwise None.

Return type

str or None

dkutils.datakitchen_api.order_run_monitor.get_order_run_url(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, customer_code: str, order_run_id: str) → str

Retrieve the URL for navigating to the Order Run Details page in the DataKitchen platform for the provided order_run_id.

Parameters
  • dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API

  • customer_code (str) – Customer code required for constructing the URL

  • order_run_id (str) – Order run id the URL will link to

Returns

URL for navigating to the Order Run Details page in the DataKitchen platform for the provided order_run_id.

Return type

str