Apache Spark Test 2

Question 1

Spark supports which cluster managers?

Accepted Answer

All of the above

Answer

Apache Mesos

Answer

YARN

Answer

Standalone Cluster Manager

Question 2

Which of the following statements about Spark MLlib is correct?

Accepted Answer

It is Spark's scalable machine learning library

Answer

It enables powerful interactive and analytical applications across live streaming data

Answer

It provides an execution platform for all Spark applications

Answer

All of the above

Question 3

RDDs are immutable and fault-tolerant.

Accepted Answer

True

Answer

False
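
To illustrate the statement in Question 3, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `ToyRDD`, its `map`, and its `collect` are hypothetical names invented for this example. It shows the two ideas only in spirit: a transformation returns a new dataset instead of changing the old one (immutability), and a result can be rebuilt by replaying the recorded transformations on the original input (the lineage idea behind fault tolerance).

```python
from typing import Callable, List, Tuple


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical; not Spark's API)."""

    def __init__(
        self,
        source: Tuple[int, ...],
        lineage: Tuple[Callable[[int], int], ...] = (),
    ) -> None:
        self._source = source    # original input, never modified
        self._lineage = lineage  # recorded transformations, in order

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # A transformation returns a new ToyRDD; self is left unchanged.
        return ToyRDD(self._source, self._lineage + (fn,))

    def collect(self) -> List[int]:
        # Rebuild the result from the source by replaying the lineage.
        out = list(self._source)
        for fn in self._lineage:
            out = [fn(x) for x in out]
        return out


base = ToyRDD((1, 2, 3))
doubled = base.map(lambda x: x * 2)
```

Calling `collect()` on `base` after the `map` still yields the original values, and `doubled` can be recomputed any number of times because nothing is stored except the source and the list of transformations.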

Question 4

Which of the following algorithms is not used to solve regression problems?

Accepted Answer

Logistic Regression

Answer

Gradient-Boosted Trees

Answer

Decision Trees

Answer

Ridge Regression
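
The accepted answer to Question 4 may look surprising given the name, so here is a minimal sketch in plain Python of why logistic regression is usually treated as a classification method. This is not MLlib code: the function names, weights, and the 0.5 threshold are assumptions chosen for illustration. The model squashes a linear score into a probability between 0 and 1, and the prediction is a class label rather than a continuous value.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    # Map any real-valued score to the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    # Linear score followed by the sigmoid, as in binary logistic regression.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_label(
    features: Sequence[float],
    weights: Sequence[float],
    bias: float,
    threshold: float = 0.5,  # assumed decision threshold
) -> int:
    # The final output is a discrete class (0 or 1), not a continuous target.
    return 1 if predict_probability(features, weights, bias) >= threshold else 0
```

With a single weight of 1.0 and no bias, a positive feature value is assigned class 1 and a negative one class 0, whereas a regression model such as ridge regression would return the numeric score itself.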

Question 5

Which of the following statements about SparkR is correct?

Accepted Answer

It allows data scientists to analyze large datasets and interactively run jobs

Answer

It enables users to run SQL/HQL queries on top of Spark

Answer

It is the kernel of Spark

Answer

It is Spark's scalable machine learning library

Question 6

Which of the following statements regarding DataFrames is correct?

Accepted Answer

DataFrames provide a more user-friendly API than RDDs

Answer

The DataFrame API provides compile-time type safety

Answer

Both of the above

Answer

None of the above

Question 7

Which of the following statements about the Spark shell is correct?

Accepted Answer

All of the above

Answer

It allows reading from many types of data sources

Answer

It lets Spark applications run easily from the system command line

Answer

It runs and tests application code interactively

Question 8

Is MLlib a deprecated library?

Accepted Answer

No

Answer

Yes

Question 9

On an RDD, the read operation is:

Accepted Answer

Either fine-grained or coarse-grained

Answer

Coarse-grained

Answer

Fine-grained

Answer

Neither fine-grained nor coarse-grained