Deep Learning (Data Scientist) Practice Test

Neural network models are thought to have been inspired by the human brain. A neural network is made up of many different components; each neuron receives an input, processes it, and produces an output.
Which of the following statements correctly describes a biological neuron?

Explanation:
A neuron can have a single input and output, or multiple inputs and outputs.

The most critical stage in constructing a neural network is determining each neuron's weights and bias. If you can figure out how to obtain the right weights and bias for each neuron, you can approximate any function. What is the best strategy for doing this?

Explanation:
Gradient descent is described in Option B.

What are the steps involved in employing a gradient descent algorithm?
1. Repeat until you find the best weights for the network
2. Go to each neuron that contributes to the error and change its respective values to reduce the error
3. Calculate the error between the actual value and the predicted value
4. Pass an input through the network and get values from the output layer
5. Initialize random weights and biases

Explanation:
The correct order is 5, 4, 3, 2, 1: initialize random weights and biases, pass an input through the network, compute the error, adjust the contributing weights to reduce the error, and repeat until the weights converge.
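
A minimal sketch of this loop, assuming a toy single-layer network with squared error; the data, learning rate, and variable names are illustrative, not part of the quiz:

```python
import numpy as np

# Hypothetical toy data: 100 samples, 3 features (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=(100, 1))

# Step 5: initialize random weight and bias
W = rng.normal(size=(3, 1))
b = np.zeros((1,))

lr = 0.01
for epoch in range(1000):            # Step 1: repeat until the weights converge
    y_pred = X @ W + b               # Step 4: pass input through the network
    error = y_pred - y               # Step 3: error between predicted and actual
    # Step 2: adjust each weight/bias in proportion to its contribution to the error
    grad_W = X.T @ error / len(X)
    grad_b = error.mean()
    W -= lr * grad_W
    b -= lr * grad_b
```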

"Convolutional Neural Networks can change an input in a variety of ways (rotations or scaling)." True or False:
Is the statement correct?

Explanation:
Before you feed the data to the neural network, you must perform preprocessing steps such as rotation and scaling yourself, because the network cannot do them on its own.
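
As an illustration, rotation and scaling are typically applied as a preprocessing/augmentation step before the images reach the network. A sketch using torchvision (the specific transforms and parameters are illustrative assumptions):

```python
from torchvision import transforms

# Rotation and scaling are applied to the data before it reaches the network;
# the CNN itself does not perform these transformations.
preprocess = transforms.Compose([
    transforms.RandomRotation(degrees=15),           # random rotation up to +/-15 degrees
    transforms.RandomResizedCrop(size=224,           # random scale/crop to 224x224
                                 scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
```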

Which of the following strategies is similar to dropout in a neural network in terms of operations?

Explanation:
Dropout can be viewed as a form of bagging in which each sub-model is effectively trained on only a single example, and each parameter is heavily regularized by being shared with the corresponding parameter in all the other sub-models.
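
A minimal sketch of (inverted) dropout applied to one layer's activations; the drop probability and array values are illustrative:

```python
import numpy as np

def dropout(activations, p_drop=0.5, training=True, rng=np.random.default_rng()):
    """Inverted dropout: randomly zero units and rescale the survivors."""
    if not training:
        return activations                       # no dropout at inference time
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)   # each kept unit scaled by 1/(1-p)

h = np.array([0.2, -1.3, 0.7, 2.1])
print(dropout(h, p_drop=0.5))                    # roughly half the units are zeroed
```

Each forward pass activates a different random subset of units, which is what links dropout to training a large ensemble of weight-sharing subnetworks.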

A neural network's non-linearity is caused by which of the following?

Explanation:
Rectified Linear Units (ReLUs) are an activation function frequently used in deep learning models. If the function receives a negative value, it returns 0; if it receives a positive value, it returns the same value.
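
In other words, ReLU(x) = max(0, x); a one-line NumPy version:

```python
import numpy as np

def relu(x):
    # Returns 0 for negative inputs and the input itself for positive inputs
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]
```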

Which of the following statements concerning model capacity (the ability of a neural network to approximate complex functions) is correct?

Explanation:
The correct answer is: as the number of hidden layers increases, model capacity increases.

The classification error on test data always decreases as the number of hidden layers in a Multi-Layer Perceptron increases. Is this statement true or false?

Explanation:
This is not always true. Overfitting may cause the error to increase.

In a perceptron, what is the sequence of the following tasks?
1. For a sample input, compute an output
2. If the prediction does not match the output, change the weights
3. Initialize the weights of the perceptron randomly
4. Go to the next batch of the dataset

Explanation:
The correct order is 3, 1, 2, 4: initialize the weights randomly, compute an output for a sample, update the weights if the prediction does not match, and move on to the next batch.
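
A minimal perceptron training sketch following that order; the toy data, learning rate, and epoch count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                  # toy inputs
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # toy binary labels

w = rng.normal(size=2)                        # 3. initialize weights randomly
b = 0.0
lr = 0.1

for epoch in range(10):                       # 4. keep moving through the data
    for xi, target in zip(X, y):
        pred = int(xi @ w + b > 0)            # 1. compute an output for a sample
        if pred != target:                    # 2. if the prediction is wrong, update weights
            w += lr * (target - pred) * xi
            b += lr * (target - pred)
```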

Assume you have to adjust the parameters to minimize the cost function. Which of the following methods could be utilized in this situation?

Explanation:
Any of the techniques listed above can be used to adjust the parameters.

Weight sharing occurs in which neural net architecture?

Explanation:
The correct answer: Both B and C.

Batch normalization is helpful because

Explanation:
Batch Normalization is a normalization technique applied between the layers of a neural network rather than to the raw data. It is computed over mini-batches instead of the entire data set. Its purpose is to make learning easier by speeding up training and allowing higher learning rates.
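
A sketch of the batch-normalization forward pass over one mini-batch; the learnable scale (gamma) and shift (beta) values here are illustrative:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch per feature, then scale and shift."""
    mean = x.mean(axis=0)                 # statistics computed over the mini-batch,
    var = x.var(axis=0)                   # not over the whole dataset
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta           # learnable scale and shift

batch = np.random.randn(32, 4)            # mini-batch of 32 samples, 4 features
out = batch_norm(batch, gamma=np.ones(4), beta=np.zeros(4))
```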

Instead of trying to reach an absolute zero error, we specify a metric called Bayes error, which is the lowest error we can expect to attain. What is the rationale for employing Bayes error?

Explanation:
Perfectly accurate prediction is a myth, not a reality. As a result, we should aim for an "achievable result."

Which of the following strategies is used to cope with overfitting in a neural network?

Explanation:
To cope with overfitting, all of the strategies can be used.

In a supervised learning task, the number of neurons in the output layer should match the number of classes (where the number of classes is larger than 2). Is this statement true or false?

Explanation:
It depends on the output encoding. The statement is true for one-hot encoding. However, you can represent four classes with only two outputs by using binary values (00, 01, 10, 11).
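
For example, with four classes the two encodings might look like this (illustrative arrays only):

```python
import numpy as np

# One-hot encoding: one output neuron per class (4 classes -> 4 outputs)
one_hot = np.eye(4)            # class k is row k, e.g. class 2 -> [0, 0, 1, 0]

# Binary encoding: 4 classes fit into only 2 output neurons
binary = np.array([[0, 0],     # class 0
                   [0, 1],     # class 1
                   [1, 0],     # class 2
                   [1, 1]])    # class 3
```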

Y = ax^2 + bx + c (polynomial equation of degree 2)

Is it possible to represent this equation using a neural network with a single hidden layer and a linear threshold?

Explanation:
The answer is no: a linear threshold constrains your neural network, turning it into a linear transformation, and a linear function cannot represent a degree-2 polynomial.
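
To see why, assume the linear threshold reduces each unit to an affine map: the hidden layer computes h = W1 x + b1 and the output computes y = W2 h + b2 = (W2 W1) x + (W2 b1 + b2), which is again linear in x and therefore cannot match y = ax^2 + bx + c when a is nonzero.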

Which of the following statements describes early stopping the best?

Explanation:
The correct answer: evaluate the network on a held-out test dataset after every epoch of training, and stop training when the generalization error starts to increase.
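
A sketch of that procedure written as a generic helper; train_one_epoch and evaluate are caller-supplied placeholders, and the patience value is an illustrative choice:

```python
def train_with_early_stopping(model, train_one_epoch, evaluate,
                              max_epochs=100, patience=3):
    """Stop training once the held-out error stops improving for `patience` epochs."""
    best_error, bad_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch(model)                   # one pass over the training data
        error = evaluate(model)                  # generalization error after this epoch
        if error < best_error:
            best_error, bad_epochs = error, 0    # still improving, keep training
        else:
            bad_epochs += 1
            if bad_epochs >= patience:           # error has started to increase
                break                            # stop early
    return model
```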

In a neural network, what is a dead unit?

Explanation:
The correct answer: a unit that is not updated during training by any of its neighbours.
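
One common concrete case is a "dying ReLU": if a unit's pre-activation is always negative, both its output and its gradient are zero, so its incoming weights never receive an update. A small NumPy illustration with made-up values:

```python
import numpy as np

x = np.array([0.5, 1.2, 0.3])
w = np.array([-2.0, -3.0, -1.5])     # weights that drive the pre-activation negative
b = -1.0

z = w @ x + b                        # pre-activation is negative for these inputs
a = np.maximum(0, z)                 # ReLU output is 0
grad_z = (z > 0).astype(float)       # dReLU/dz is 0, so no gradient flows back
print(a, grad_z)                     # 0.0 0.0 -> the unit never updates ("dead")
```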

What if we utilize an excessively high learning rate?

Explanation:
Option C is the best option: with an excessively high learning rate, the error becomes unpredictable and explodes.
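
A tiny illustration of this on f(w) = w^2, whose gradient is 2w; the step sizes are made up:

```python
# Minimizing f(w) = w**2 with gradient descent; the gradient is 2*w.
def run(lr, steps=10, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(run(lr=0.1))    # small learning rate: w shrinks toward the minimum at 0
print(run(lr=1.5))    # too-high learning rate: |w| doubles each step and explodes
```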

Translation invariance is preserved when a pooling layer is added to a convolutional neural network. Is this statement true or false?

Explanation:
When you employ pooling, you get (local) translation invariance: small shifts of the input within a pooling window leave the pooled output unchanged.
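
A small NumPy illustration: max-pooling a 1-D signal and a copy shifted by one position can yield the same pooled output (the signal values are made up):

```python
import numpy as np

def max_pool_1d(x, size=2):
    # Max over non-overlapping windows of the given size
    return np.maximum.reduceat(x, np.arange(0, len(x), size))

signal = np.array([0, 9, 0, 0, 7, 0, 0, 0])
shifted = np.array([9, 0, 0, 0, 0, 7, 0, 0])   # features shifted by one position

print(max_pool_1d(signal))    # [9 0 7 0]
print(max_pool_1d(shifted))   # [9 0 7 0]  -> same output despite the shift
```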

Assume a convolutional neural network is trained on the ImageNet dataset (an object-recognition dataset). The trained model is then given a completely white image as input, and the output probabilities are equal for all classes. Is this statement true or false?

Explanation:
There may be some neurons that do not activate when white pixels are used as input, so the class probabilities will not be equal.

When the data is too large to fit in RAM all at once, which gradient descent technique is more advantageous?

Explanation:
Stochastic gradient descent (commonly abbreviated SGD) is an iterative approach for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). Because it updates the parameters from one example (or a small mini-batch) at a time, the full dataset never has to be held in memory.
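
Since only one mini-batch needs to be in memory at a time, SGD scales to datasets that do not fit in RAM. A sketch with toy linear-regression data (in practice each batch could be streamed from disk; all names and sizes are illustrative):

```python
import numpy as np

def minibatches(X, y, batch_size=32, rng=np.random.default_rng(0)):
    """Yield small random mini-batches; only one batch is in use at a time."""
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=1000)

w, lr = np.zeros(5), 0.01
for epoch in range(20):
    for xb, yb in minibatches(X, y):
        grad = xb.T @ (xb @ w - yb) / len(xb)   # gradient from this mini-batch only
        w -= lr * grad                          # stochastic update
```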

Which architecture of neural network would be more suited to address an image identification challenge (recognizing a cat in a photo)?

Explanation:
Because it inherently takes into account the relationships between neighboring locations of an image, the Convolutional Neural Network is better suited for image-related problems.
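
A minimal CNN sketch in PyTorch, assuming 3-channel 32x32 inputs and a hypothetical number of classes (the layer sizes are illustrative):

```python
import torch.nn as nn

num_classes = 10                                   # illustrative assumption
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),    # local filters exploit neighboring pixels
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                               # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, num_classes),
)
```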

Consider the following scenario. There isn't a lot of data in the problem you're trying to address. You have a pre-trained neural network that was trained on a similar problem, which is fortunate. Which of the following approaches would you use to put this pre-trained network to work for you?

Explanation:
If the datasets are mostly similar, training only the last layer is the best option, since the earlier layers serve as feature extractors.
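
One common way to do this in PyTorch, assuming a torchvision ResNet-18 as the pre-trained network; the model choice, the weights argument (which varies by torchvision version), and num_classes are illustrative assumptions:

```python
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # network pre-trained on a similar problem

for param in model.parameters():
    param.requires_grad = False                    # freeze the feature-extractor layers

num_classes = 5                                    # illustrative assumption
model.fc = nn.Linear(model.fc.in_features, num_classes)  # train only this new last layer
```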

The performance of a convolutional network always improves as the size of the convolutional kernel is increased. Is this statement true or false?

Explanation:
Increasing kernel size would not necessarily increase performance. This depends heavily on the dataset.