A data engineering team manages a complex data platform with interdependent jobs. Job B must run only after Job A completes, and Job C must run only after Job B completes. If Job B fails, it should be retried up to three times before an alert is sent. Which tool is best suited to managing this entire process?
-
A
A stream processing framework like Apache Flink
-
B
A distributed query engine like Trino
-
C
A set of cron jobs on a Linux server
-
D
A workflow orchestration engine like Apache Airflow
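
To make the requirements in the scenario concrete, here is a minimal plain-Python sketch of the behavior being asked for: jobs run strictly in order, a failing job is retried up to three times, and an alert is sent if it still fails. The names `run_pipeline`, `max_retries`, and `alert` are hypothetical and chosen only for illustration; this is not the API of any of the tools listed above.

```python
from typing import Callable, List, Tuple

# A job is a (name, callable) pair; the callable raises an exception on failure.
Job = Tuple[str, Callable[[], None]]


def run_pipeline(
    jobs: List[Job],
    max_retries: int = 3,
    alert: Callable[[str], None] = print,
) -> List[str]:
    """Run jobs in order and return the names of the jobs that completed.

    A failing job is retried up to max_retries times. If it still fails,
    an alert is sent and the downstream jobs are not run.
    """
    completed: List[str] = []
    for name, job in jobs:
        retries_used = 0
        while True:
            try:
                job()
                completed.append(name)
                break  # this job succeeded, move on to the next one
            except Exception as exc:
                if retries_used >= max_retries:
                    alert(f"{name} failed after {max_retries} retries: {exc}")
                    return completed  # stop: downstream jobs depend on this one
                retries_used += 1
    return completed


if __name__ == "__main__":
    def job_a() -> None:
        print("Job A done")

    def job_b() -> None:
        raise RuntimeError("simulated failure")  # always fails in this demo

    def job_c() -> None:
        print("Job C done")

    run_pipeline([("A", job_a), ("B", job_b), ("C", job_c)])
```

In this sketch, Job C never runs when Job B exhausts its retries, which is the dependency behavior the question describes. A real deployment would also need scheduling, logging, and delays between retries, none of which are modeled here.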