Deep Learning (Data Scientist) Practice Test

Neural network models are thought to have been inspired by the human brain. A neural network is made up of many different components; each neuron receives an input, processes it, and produces an output.
Which of the following statements correctly describes a biological neuron?

Explanation:
A neuron can have a single input and output, or multiple inputs and outputs.

The most critical stage in constructing a neural network is determining each neuron's weights and bias. If you can figure out how to obtain the right weights and bias for each neuron, you can approximate any function. What is the best strategy for doing this?

Explanation:
Gradient descent is described in Option B.

What are the steps involved in employing a gradient descent algorithm?
1. Repeat until you find the best weights for the network
2. Go to each neuron that contributes to the error and change its respective values to reduce the error
3. Calculate the error between the actual value and the predicted value
4. Pass an input through the network and get values from the output layer
5. Initialize random weights and biases

Explanation:
The correct order is 5, 4, 3, 2, 1: initialize random weights and biases, pass an input through the network, compute the error, adjust the contributing weights to reduce the error, and repeat until the weights converge.
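
A minimal sketch of this loop, assuming a toy single-layer network with squared error; the data, learning rate, and variable names are illustrative, not part of the quiz:

```python
import numpy as np

# Hypothetical toy data: 100 samples, 3 features (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=(100, 1))

# Step 5: initialize random weight and bias
W = rng.normal(size=(3, 1))
b = np.zeros((1,))

lr = 0.01
for epoch in range(1000):            # Step 1: repeat until the weights converge
    y_pred = X @ W + b               # Step 4: pass input through the network
    error = y_pred - y               # Step 3: error between predicted and actual
    # Step 2: adjust each weight/bias in proportion to its contribution to the error
    grad_W = X.T @ error / len(X)
    grad_b = error.mean()
    W -= lr * grad_W
    b -= lr * grad_b
```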

"Convolutional Neural Networks can change an input in a variety of ways (rotations or scaling)." True or False:
Is the statement correct?

Explanation:
Before you feed the data to the neural network, you must perform preprocessing steps such as rotation and scaling yourself, because the network cannot do them on its own.
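
As an illustration, rotation and scaling are typically applied as a preprocessing/augmentation step before the images reach the network. A sketch using torchvision (the specific transforms and parameters are illustrative assumptions):

```python
from torchvision import transforms

# Rotation and scaling are applied to the data before it reaches the network;
# the CNN itself does not perform these transformations.
preprocess = transforms.Compose([
    transforms.RandomRotation(degrees=15),           # random rotation up to +/-15 degrees
    transforms.RandomResizedCrop(size=224,           # random scale/crop to 224x224
                                 scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
```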

Which of the following strategies is similar to dropout in a neural network in terms of operations?

Explanation:
Dropout can be viewed as a form of bagging in which each sub-model is effectively trained on only a single example, and each parameter is heavily regularized by being shared with the corresponding parameter in all the other sub-models.
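
A minimal sketch of (inverted) dropout applied to one layer's activations; the drop probability and array values are illustrative:

```python
import numpy as np

def dropout(activations, p_drop=0.5, training=True, rng=np.random.default_rng()):
    """Inverted dropout: randomly zero units and rescale the survivors."""
    if not training:
        return activations                       # no dropout at inference time
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)   # each kept unit scaled by 1/(1-p)

h = np.array([0.2, -1.3, 0.7, 2.1])
print(dropout(h, p_drop=0.5))                    # roughly half the units are zeroed
```

Each forward pass activates a different random subset of units, which is what links dropout to training a large ensemble of weight-sharing subnetworks.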

A neural network's non-linearity is caused by which of the following?

Explanation:
Rectified Linear Units (ReLUs) are an activation function frequently used in deep learning models. If the function receives a negative value, it returns 0; if it receives a positive value, it returns the same value.
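
In other words, ReLU(x) = max(0, x); a one-line NumPy version:

```python
import numpy as np

def relu(x):
    # Returns 0 for negative inputs and the input itself for positive inputs
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```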

Which of the following statements concerning model capacity (the ability of a neural network to approximate complex functions) is correct?

Explanation:
The correct answer is: as the number of hidden layers increases, model capacity increases.

The classification error on test data always decreases as the number of hidden layers in a Multi-Layer Perceptron increases. Is this statement true or false?

Explanation:
This is not always true. Overfitting may cause the error to increase.

In a perceptron, what is the sequence of the following tasks?
1. For a sample input, compute an output
2. If the prediction does not match the output, change the weights
3. Initialize the weights of the perceptron randomly
4. Go to the next batch of the dataset

Explanation:
The correct order is 3, 1, 2, 4: initialize the weights randomly, compute an output for a sample, update the weights if the prediction does not match, and move on to the next batch.
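
A minimal perceptron training sketch following that order; the toy data, learning rate, and epoch count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                  # toy inputs
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # toy binary labels

w = rng.normal(size=2)                        # 3. initialize weights randomly
b = 0.0
lr = 0.1

for epoch in range(10):                       # 4. keep moving through the data
    for xi, target in zip(X, y):
        pred = int(xi @ w + b > 0)            # 1. compute an output for a sample
        if pred != target:                    # 2. if the prediction is wrong, update weights
            w += lr * (target - pred) * xi
            b += lr * (target - pred)
```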

Assume you have to adjust the parameters to minimize the cost function. Which of the following methods could be utilized in this situation?

Explanation:
Any of the techniques listed above can be used to adjust the parameters.

Weight sharing occurs in which neural net architecture?

Explanation:
The correct answer: Both B and C.

Batch normalization is helpful because

Explanation:
Batch Normalization is a normalization technique applied between the layers of a neural network rather than to the raw data. It is computed over mini-batches instead of the entire data set. Its purpose is to make learning easier by speeding up training and allowing higher learning rates.
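
A sketch of the batch-normalization forward pass over one mini-batch; the learnable scale (gamma) and shift (beta) values here are illustrative:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch per feature, then scale and shift."""
    mean = x.mean(axis=0)                 # statistics computed over the mini-batch,
    var = x.var(axis=0)                   # not over the whole dataset
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta           # learnable scale and shift

batch = np.random.randn(32, 4)            # mini-batch of 32 samples, 4 features
out = batch_norm(batch, gamma=np.ones(4), beta=np.zeros(4))
```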

Instead of trying to reach an absolute zero error, we specify a metric called Bayes error, which is the lowest error we can expect to attain. What is the rationale for employing Bayes error?

Explanation:
Perfectly accurate prediction is a myth, not a reality. As a result, we should aim for an "achievable result."

Which of the following strategies is used to cope with overfitting in a neural network?

Explanation:
To cope with overfitting, all of the strategies can be used.

In a supervised learning task, the number of neurons in the output layer should match the number of classes (where the number of classes is larger than 2). Is this statement true or false?

Explanation:
It depends on the output encoding. The statement is true for one-hot encoding. However, you can represent four classes with only two outputs by using binary values (00, 01, 10, 11).
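
For example, with four classes the two encodings might look like this (illustrative arrays only):

```python
import numpy as np

# One-hot encoding: one output neuron per class (4 classes -> 4 outputs)
one_hot = np.eye(4)            # class k is row k, e.g. class 2 -> [0, 0, 1, 0]

# Binary encoding: 4 classes fit into only 2 output neurons
binary = np.array([[0, 0],     # class 0
                   [0, 1],     # class 1
                   [1, 0],     # class 2
                   [1, 1]])    # class 3
```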

Y = ax^2 + bx + c (polynomial equation of degree 2)

Is it possible to represent this equation using a neural network with a single hidden layer and a linear threshold?

Explanation:
The answer is no: a linear threshold constrains your neural network, turning it into a linear transformation, and a linear function cannot represent a degree-2 polynomial.
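
To see why, assume the linear threshold reduces each unit to an affine map: the hidden layer computes h = W1 x + b1 and the output computes y = W2 h + b2 = (W2 W1) x + (W2 b1 + b2), which is again linear in x and therefore cannot match y = ax^2 + bx + c when a is nonzero.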

Which of the following statements describes early stopping the best?

Explanation:
The correct answer: evaluate the network on a held-out test dataset after every epoch of training, and stop training when the generalization error starts to increase.
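
A sketch of that procedure written as a generic helper; train_one_epoch and evaluate are caller-supplied placeholders, and the patience value is an illustrative choice:

```python
def train_with_early_stopping(model, train_one_epoch, evaluate,
                              max_epochs=100, patience=3):
    """Stop training once the held-out error stops improving for `patience` epochs."""
    best_error, bad_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch(model)                   # one pass over the training data
        error = evaluate(model)                  # generalization error after this epoch
        if error < best_error:
            best_error, bad_epochs = error, 0    # still improving, keep training
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # error has started to increase
                break                            # stop early
    return model
```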

In a neural network, what is a dead unit?

Explanation:
The correct answer: a unit that is not updated during training by any of its neighbours.
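
One common concrete case is a "dying ReLU": if a unit's pre-activation is always negative, both its output and its gradient are zero, so its incoming weights never receive an update. A small NumPy illustration with made-up values:

```python
import numpy as np

x = np.array([0.5, 1.2, 0.3])
w = np.array([-2.0, -3.0, -1.5])     # weights that drive the pre-activation negative
b = -1.0

z = w @ x + b                        # pre-activation is negative for these inputs
a = np.maximum(0, z)                 # ReLU output is 0
grad_z = (z > 0).astype(float)       # dReLU/dz is 0, so no gradient flows back
print(a, grad_z)                     # 0.0 0.0 -> the unit never updates ("dead")
```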

What if we utilize an excessively high learning rate?

Explanation:
Option C is the best option: with an excessively high learning rate, the error becomes unpredictable and explodes.
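
A tiny illustration of this on f(w) = w^2, whose gradient is 2w; the step sizes are made up:

```python
# Minimizing f(w) = w**2 with gradient descent; the gradient is 2*w.
def run(lr, steps=10, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(run(lr=0.1))    # small learning rate: w shrinks toward the minimum at 0
print(run(lr=1.5))    # too-high learning rate: |w| doubles each step and explodes
```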

Translation invariance is preserved when a pooling layer is added to a convolutional neural network. Is this statement true or false?

Explanation:
When you employ pooling, you get (local) translation invariance: small shifts of the input within a pooling window leave the pooled output unchanged.
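
A small NumPy illustration: max-pooling a 1-D signal and a copy shifted by one position can yield the same pooled output (the signal values are made up):

```python
import numpy as np

def max_pool_1d(x, size=2):
    # Max over non-overlapping windows of the given size
    return np.maximum.reduceat(x, np.arange(0, len(x), size))

signal = np.array([0, 9, 0, 0, 7, 0, 0, 0])
shifted = np.array([9, 0, 0, 0, 0, 7, 0, 0])   # features shifted by one position

print(max_pool_1d(signal))    # [9 0 7 0]
print(max_pool_1d(shifted))   # [9 0 7 0]  -> same output despite the shift
```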

Assume a convolutional neural network is trained on the ImageNet dataset (an object-recognition dataset). The trained model is then given a completely white image as input, and the output probabilities are equal for all classes. Is this statement true or false?

Explanation:
There may be some neurons that do not activate when white pixels are used as input, so the class probabilities will not be equal.

When the data is too large to fit in RAM all at once, which gradient descent technique is more advantageous?

Explanation:
Stochastic gradient descent (commonly abbreviated SGD) is an iterative approach for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). Because it updates the parameters from one example (or a small mini-batch) at a time, the full dataset never has to be held in memory.
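
Since only one mini-batch needs to be in memory at a time, SGD scales to datasets that do not fit in RAM. A sketch with toy linear-regression data (in practice each batch could be streamed from disk; all names and sizes are illustrative):

```python
import numpy as np

def minibatches(X, y, batch_size=32, rng=np.random.default_rng(0)):
    """Yield small random mini-batches; only one batch is in use at a time."""
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=1000)

w, lr = np.zeros(5), 0.01
for epoch in range(20):
    for xb, yb in minibatches(X, y):
        grad = xb.T @ (xb @ w - yb) / len(xb)   # gradient from this mini-batch only
        w -= lr * grad                          # stochastic update
```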

Which architecture of neural network would be more suited to address an image identification challenge (recognizing a cat in a photo)?

Explanation:
Because it inherently takes into account the relationships between neighboring locations of an image, the Convolutional Neural Network is better suited for image-related problems.
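
A minimal CNN sketch in PyTorch, assuming 3-channel 32x32 inputs and a hypothetical number of classes (the layer sizes are illustrative):

```python
import torch.nn as nn

num_classes = 10                                   # illustrative assumption
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),    # local filters exploit neighboring pixels
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, num_classes),
)
```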

Consider the following scenario. There isn't a lot of data in the problem you're trying to address. You have a pre-trained neural network that was trained on a similar problem, which is fortunate. Which of the following approaches would you use to put this pre-trained network to work for you?

Explanation:
If the datasets are mostly similar, training only the last layer is the best option, since the earlier layers serve as feature extractors.
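
One common way to do this in PyTorch, assuming a torchvision ResNet-18 as the pre-trained network; the model choice, the weights argument (which varies by torchvision version), and num_classes are illustrative assumptions:

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # network pre-trained on a similar problem

for param in model.parameters():
    param.requires_grad = False                    # freeze the feature-extractor layers

num_classes = 5                                    # illustrative assumption
model.fc = nn.Linear(model.fc.in_features, num_classes)  # train only this new last layer
```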

The performance of a convolutional network always improves as the size of the convolutional kernel is increased. Is this statement true or false?

Explanation:
Increasing kernel size would not necessarily increase performance. This depends heavily on the dataset.