A financial services company needs to process massive volumes of historical transaction data for end-of-day reporting. The processing involves complex aggregations and joins. The key requirements are cost-effectiveness for handling petabyte-scale data and reliability, even with hardware failures. Which technology is most suitable for this large-scale batch processing task?
- A. Apache Kafka
- B. Apache Spark
- C. Hadoop MapReduce
- D. Redis
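For context, the batch workload described (aggregating transactions by key) follows the classic map-shuffle-reduce pattern that both Hadoop MapReduce and Spark distribute across a cluster. A minimal in-memory Python sketch, using hypothetical transaction records purely for illustration:

```python
from collections import defaultdict

# Hypothetical end-of-day transaction records: (account_id, amount)
transactions = [
    ("acct-1", 100.0),
    ("acct-2", 250.0),
    ("acct-1", 50.0),
    ("acct-2", -75.0),
]

# Map phase: emit (key, value) pairs, one per record.
mapped = [(account, amount) for account, amount in transactions]

# Shuffle phase: group all values under their key.
grouped = defaultdict(list)
for key, value in mapped:
    grouped[key].append(value)

# Reduce phase: aggregate each group (here, a per-account daily total).
totals = {key: sum(values) for key, values in grouped.items()}

print(totals)  # {'acct-1': 150.0, 'acct-2': 175.0}
```

At petabyte scale, a distributed framework performs the same three phases across many machines, persisting intermediate results so that individual hardware failures only require re-running the affected tasks.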