dkutils.datakitchen_api.order_run_monitor module
- class dkutils.datakitchen_api.order_run_monitor.EventInfoProvider(dk_client: 'DataKitchenClient', customer_code: 'str', pipeline_key: 'str', order_run_id: 'str', run_name: 'str' = None)
Bases: object
- classmethod init(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, pipeline_key: str, order_run_id: str, run_name: str) → dkutils.datakitchen_api.order_run_monitor.EventInfoProvider
- customer_code: str
- order_run_id: str
- pipeline_key: str
- run_name: str = None
- class dkutils.datakitchen_api.order_run_monitor.Node(events_api_client: 'EventsApi', event_info_provider: 'EventInfoProvider', name: 'str', info: 'dict' = None, started_event_published: 'bool' = False, status: 'str' = None, start_time: 'int' = None, end_time: 'int' = None)
Bases: object
- end_time: int = None
- event_info_provider: dkutils.datakitchen_api.order_run_monitor.EventInfoProvider
- events_api_client: events_ingestion_client.api.events_api.EventsApi
- property failed: bool
- info: dict = None
- name: str
- property running: bool
- start_time: int = None
- started_event_published: bool = False
- status: str = None
- property stopped: bool
- property succeeded: bool
- class dkutils.datakitchen_api.order_run_monitor.OrderRunMonitor(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, events_api_key: str, pipeline_name: str, order_run_id: str, run_name: Optional[str] = None, nodes_to_ignore: Optional[list] = None, sleep_time_secs: int = 10, host: str = 'https://dev-api.datakitchen.io')
Bases: object
Monitors a DataKitchen Order Run and reports its status to the Events Ingestion API of DataKitchen’s Observability module. The start and stop of each node are reported as an individual task, and the run is closed when it finishes.
- Parameters
dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.
events_api_key (str) – Events Ingestion API key.
pipeline_name (str) – Name of the pipeline being monitored.
order_run_id (str) – ID of the Order Run being monitored.
run_name (str, optional) – Human-readable name for the pipeline execution being monitored (default: None).
nodes_to_ignore (list or None, optional) – List of nodes to ignore. A monitor node named Order_Run_Monitor is ignored by default and does not need to be listed here (default: None).
sleep_time_secs (int, optional) – Polling interval, in seconds, for monitoring the run (default: 10).
host (str, optional) – URL of the Events Ingestion API (default: 'https://dev-api.datakitchen.io').
- __init__(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, events_api_key: str, pipeline_name: str, order_run_id: str, run_name: Optional[str] = None, nodes_to_ignore: Optional[list] = None, sleep_time_secs: int = 10, host: str = 'https://dev-api.datakitchen.io')
- get_conditional_nodes() → list
Retrieve a list of the conditional node names present in this Order Run.
- Returns
List of conditional node names.
- Return type
list
- get_nodes_info() → dict
Extract and return the node information from the order run details, excluding the nodes that should be ignored (e.g. conditional nodes and the Order_Run_Monitor node itself).
- Returns
Dictionary keyed by node name and valued by a dictionary of node details.
- Return type
dict
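The filtering described above can be pictured with a small stand-alone sketch. It does not call dkutils: the function name `filter_nodes`, the sample node names, and the shape of the per-node details are all assumptions made for illustration, not the library's implementation.

```python
from typing import Dict, Iterable


def filter_nodes(nodes: Dict[str, dict], ignored: Iterable[str]) -> Dict[str, dict]:
    """Return a copy of `nodes` without the entries named in `ignored`.

    Hypothetical helper: it mirrors the documented result (a dict keyed by
    node name, minus ignored nodes), not the library's actual code.
    """
    ignored_names = set(ignored)
    return {
        name: details
        for name, details in nodes.items()
        if name not in ignored_names
    }


# Made-up order run content; node names and details are illustrative only.
all_nodes = {
    "Load_Data": {"status": "COMPLETED"},
    "Check_Condition": {"status": "COMPLETED"},
    "Order_Run_Monitor": {"status": "RUNNING"},
}
monitored = filter_nodes(all_nodes, ignored=["Check_Condition", "Order_Run_Monitor"])
print(sorted(monitored))
```

Here the conditional node and the monitor node are dropped, leaving only the nodes whose progress would be reported.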
- get_order_run_details(**kwargs) → dict
Retrieve the details of the associated order run. The provided kwargs may be used to augment the returned value with more granular details.
- Parameters
kwargs – Optional keyword arguments, as accepted by DataKitchenClient’s get_order_run_details().
- Returns
Dictionary of order run details.
- Return type
dict
- monitor() → tuple
Poll the DataKitchen platform API for the status of the associated Order Run. The status of each node is reported until all nodes have completed, or until the run has failed and its nodes have stopped processing. If this order run is for an ingredient, monitoring is disabled.
- Returns
Tuple of two lists: the names of the nodes that succeeded, followed by the names of the nodes that failed. Both lists are empty if this is an ingredient order run.
- Return type
tuple
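The polling behavior described above can be sketched without the DataKitchen platform. This is only an illustration of the pattern: `poll_nodes`, the `fetch_statuses` callable, and the status strings are hypothetical, and the sketch covers only the "all nodes completed" exit, not the failed-run case.

```python
import time
from typing import Callable, Dict, List, Tuple

# Assumed terminal status strings; the real values used by dkutils may differ.
TERMINAL = ("SUCCEEDED", "FAILED")


def poll_nodes(
    fetch_statuses: Callable[[], Dict[str, str]],
    sleep_time_secs: float = 10,
) -> Tuple[List[str], List[str]]:
    """Poll until every node reports a terminal status.

    Hypothetical sketch: `fetch_statuses` stands in for a platform API call.
    """
    while True:
        statuses = fetch_statuses()
        if all(status in TERMINAL for status in statuses.values()):
            succeeded = [n for n, s in statuses.items() if s == "SUCCEEDED"]
            failed = [n for n, s in statuses.items() if s == "FAILED"]
            return succeeded, failed
        time.sleep(sleep_time_secs)


# Fake status source: the second poll shows both nodes finished.
snapshots = iter([
    {"Load_Data": "RUNNING", "Fail_Node": "RUNNING"},
    {"Load_Data": "SUCCEEDED", "Fail_Node": "FAILED"},
])
succeeded, failed = poll_nodes(lambda: next(snapshots), sleep_time_secs=0)
print(succeeded, failed)
```

The return value has the same shape as the documented one: a tuple whose first list holds the succeeded node names and whose second holds the failed ones.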
- static parse_log_entry(log_entry: dict) → dict
Parse a log entry dictionary to derive the fields required for the MessageLogEventApiSchema.
- Parameters
log_entry (dict) – Dictionary of log details for a single log entry, of the form:
{
    'datetime': '2022-08-16T14:38:58.611000',
    'disk_used': '5.8984375 MB',
    'exc_desc': None,
    'exc_type': None,
    'hostname': '0d4e31d0-1d9b-11ed-971d-621c363ef06a-lqq9t',
    'mem_usage': '127.33 MB',
    'message': 'Test Fail: DKDataTestFailed',
    'node': 'Fail_Node',
    'order_run_id': '115d3e42-1d9b-11ed-b495-c216e4cc8e61',
    'pid': 22,
    'priority': 27,
    'record_type': 'ERROR',
    'syslogts': '2022-08-16T14:38:58-05:00',
    'thread_name': 'NodeExecutorThread:0',
    'traceback': None
}
- Returns
Dictionary of required and optional fields for the MessageLogEventApiSchema
- Return type
dict
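As a rough illustration of working with a log entry of the form shown above, the sketch below pulls a few fields out of an abbreviated copy of that entry. The helper `summarize_log_entry` and its output keys are hypothetical; they are not the fields of MessageLogEventApiSchema, which this page does not list.

```python
from datetime import datetime

# Abbreviated copy of the sample log entry from the parameter description.
log_entry = {
    "datetime": "2022-08-16T14:38:58.611000",
    "message": "Test Fail: DKDataTestFailed",
    "node": "Fail_Node",
    "record_type": "ERROR",
    "syslogts": "2022-08-16T14:38:58-05:00",
}


def summarize_log_entry(entry: dict) -> dict:
    """Pick out a few fields from a log entry.

    Hypothetical helper: the output keys are chosen for illustration and do
    not reproduce what parse_log_entry() returns.
    """
    return {
        "level": entry["record_type"],
        "message": entry["message"],
        "node": entry["node"],
        # The 'datetime' field is an ISO 8601 string without a UTC offset.
        "logged_at": datetime.fromisoformat(entry["datetime"]),
    }


summary = summarize_log_entry(log_entry)
print(summary["level"], summary["node"], summary["logged_at"].isoformat())
```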
- class dkutils.datakitchen_api.order_run_monitor.RunStatus(value)
Bases: enum.Enum
Enumeration of run status values.
- COMPLETED = 'COMPLETED'
- COMPLETED_WITH_WARNINGS = 'COMPLETED_WITH_WARNINGS'
- FAILED = 'FAILED'
- RUNNING = 'RUNNING'
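Because RunStatus is a standard enum.Enum whose values are strings, a member can be looked up from its value. The sketch below uses a local stand-in class that mirrors the four members listed above so that it runs without dkutils installed; in real code the class would be imported from this module instead.

```python
from enum import Enum


class RunStatus(Enum):
    """Local stand-in mirroring the members listed above."""

    COMPLETED = "COMPLETED"
    COMPLETED_WITH_WARNINGS = "COMPLETED_WITH_WARNINGS"
    FAILED = "FAILED"
    RUNNING = "RUNNING"


# Look up a member by its value, as RunStatus(value) in the signature suggests.
status = RunStatus("FAILED")
print(status is RunStatus.FAILED)
print(status.name, status.value)

# A value that matches no member raises ValueError.
try:
    RunStatus("UNKNOWN")
except ValueError as exc:
    print(f"rejected: {exc}")
```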
- dkutils.datakitchen_api.order_run_monitor.get_customer_code(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient) → str
Retrieve the customer code from the authenticated user associated with the provided DataKitchen client.
- Parameters
dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.
- Returns
Customer code, typically two or three letters.
- Return type
str
- dkutils.datakitchen_api.order_run_monitor.get_ingredient_owner_order_run_id(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient)
If this order run is for an ingredient, then return the parent order run id. Otherwise, return None.
- Parameters
dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.
- Returns
Return the parent order run id if the current order run is for an ingredient, otherwise return None.
- Return type
str or None
- dkutils.datakitchen_api.order_run_monitor.get_order_run_url(dk_client: dkutils.datakitchen_api.datakitchen_client.DataKitchenClient, customer_code: str, order_run_id: str) → str
Retrieve the URL for navigating to the Order Run Details page in the DataKitchen platform for the provided order_run_id.
- Parameters
dk_client (DataKitchenClient) – Client for making requests to the DataKitchen platform API.
customer_code (str) – Customer code required for constructing the URL.
order_run_id (str) – ID of the order run that the URL will link to.
- Returns
URL for navigating to the Order Run Details page in the DataKitchen platform for the provided order_run_id.
- Return type
str