Neural Networks Practice Test Video Answers

1. A
A perceptron combines inputs with weights, adds bias, and passes the sum through an activation function.
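That weighted-sum-plus-bias computation can be sketched in a few lines; the weights, inputs, and step activation here are illustrative, not from the test itself:

```python
# Minimal perceptron sketch: weighted sum of inputs + bias,
# passed through a step activation. Values are illustrative.
def perceptron(inputs, weights, bias):
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0  # step activation fires on positive sums

out = perceptron([1.0, 0.5], [0.6, -0.4], bias=-0.1)  # sum = 0.3 -> fires
```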

2. B
Activation functions introduce non-linearity, enabling the network to learn complex patterns.

3. B
ReLU (Rectified Linear Unit) is the most common hidden-layer activation function.
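ReLU is simple enough to write out directly; it is the non-linearity that lets stacked layers model more than a single linear map:

```python
def relu(x):
    # Passes positive values through unchanged, zeroes out negatives.
    return max(0.0, x)

activated = [relu(v) for v in [-2.0, 0.0, 3.5]]  # applied elementwise
```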

4. B
Backpropagation adjusts weights using gradients from the output layer backward.

5. C
Feedforward networks pass data from input → hidden layers → output.

6. B
Gradient descent minimizes the loss function by adjusting weights.
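A one-weight sketch of the update rule, using an assumed toy loss L(w) = (w − 3)² with gradient 2(w − 3); repeated steps move w toward the minimum at 3:

```python
# One gradient-descent step: w <- w - lr * dL/dw, for L(w) = (w - 3)^2.
def grad_step(w, lr):
    return w - lr * 2 * (w - 3)

w = 0.0
for _ in range(100):
    w = grad_step(w, lr=0.1)
# w converges toward the minimizer w = 3
```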

7. B
Overfitting occurs when the network memorizes training data, reducing generalization.

8. B
Dropout randomly disables neurons during training to reduce overfitting.
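A sketch of "inverted" dropout, one common formulation: each unit is zeroed with probability p during training and survivors are rescaled by 1/(1−p), so no scaling is needed at inference. Names and values here are illustrative:

```python
import random

def dropout(activations, p=0.5, training=True, rng=None):
    # Inverted dropout: zero each unit with prob p, scale survivors by 1/(1-p).
    if not training:
        return activations[:]          # inference: identity
    rng = rng or random.Random(0)
    return [0.0 if rng.random() < p else a / (1 - p) for a in activations]

dropped = dropout([1.0, 2.0, 3.0, 4.0], p=0.5)
```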

9. B
Batch normalization stabilizes and speeds up training by normalizing layer inputs.
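The normalization step can be sketched for a single feature across a batch; the learnable scale and shift (gamma, beta) that follow it in a real layer are omitted for brevity:

```python
# Normalize a batch of activations to zero mean, unit variance.
# eps guards against division by zero on constant batches.
def batch_norm(batch, eps=1e-5):
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

normed = batch_norm([2.0, 4.0, 6.0, 8.0])  # mean ~0, variance ~1
```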

10. B
Softmax outputs probabilities that sum to 1 across all classes.
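A standard softmax sketch, with the usual max-subtraction trick for numerical stability; the logits are illustrative:

```python
import math

def softmax(logits):
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]          # non-negative, sums to 1

probs = softmax([2.0, 1.0, 0.1])
```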

11. B
CNNs excel in image and spatial data analysis.

12. B
Pooling layers reduce spatial size, lowering computation and overfitting risk.
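The size reduction is easy to see with 2×2 max pooling at stride 2, which keeps only the largest value in each block (sketch assumes an even-sized 2D input):

```python
def max_pool_2x2(grid):
    # 2x2 max pooling, stride 2: each output cell is the max of a 2x2 block.
    out = []
    for i in range(0, len(grid), 2):
        row = []
        for j in range(0, len(grid[0]), 2):
            row.append(max(grid[i][j], grid[i][j + 1],
                           grid[i + 1][j], grid[i + 1][j + 1]))
        out.append(row)
    return out

pooled = max_pool_2x2([[1, 3, 2, 4],
                       [5, 6, 1, 2],
                       [7, 2, 9, 1],
                       [3, 4, 6, 8]])   # 4x4 input -> 2x2 output
```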

13. B
RNNs handle sequential data such as text or speech.

14. B
Very deep networks with sigmoid/tanh activations suffer vanishing gradients.

15. B
LSTMs address vanishing gradients in RNNs using memory cells and gates.

16. B
An epoch is one complete pass through the training dataset.

17. B
The loss function measures error between predictions and true labels.
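Mean squared error is one common example of such a loss; the predictions and targets below are illustrative:

```python
def mse(preds, targets):
    # Mean squared error: average squared gap between prediction and truth.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

err = mse([2.5, 0.0], [3.0, -1.0])  # (0.25 + 1.0) / 2 = 0.625
```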

18. B
Adam combines momentum and adaptive learning rates for optimization.

19. B
Overfitting is reduced with dropout, early stopping, and data augmentation.

20. A
Shallow networks have at most one hidden layer between input and output.

21. B
Weight initialization helps convergence and prevents symmetry issues.

22. B
Transfer learning reuses a pre-trained model for a new, related task.

23. B
The universal approximation theorem states that a single hidden layer with enough neurons can approximate any continuous function on a compact domain.

24. B
Precision, recall, and F1-score are best for imbalanced datasets.

25. B
A policy network maps states to actions in reinforcement learning.

26. B
ReLU allows gradients for positive values, reducing vanishing gradients.

27. A
Autoencoders learn representations without labels → unsupervised learning.

28. B
Hyperparameter tuning adjusts settings like learning rate and batch size.

29. B
The bottleneck layer compresses data into a reduced representation.

30. A
GANs use generator vs. discriminator in a competitive setup.

31. B
Learning rate controls the step size in weight updates.

32. B
Too high learning rate → diverging or oscillating loss.
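A toy illustration of that effect on the assumed loss L(w) = w², where the update is w ← w(1 − 2·lr): a small rate shrinks w toward the minimum, while a rate above 1 makes each step overshoot farther than the last:

```python
# Gradient descent on L(w) = w^2, gradient 2w, so w <- w * (1 - 2 * lr).
# |1 - 2*lr| < 1 converges; lr > 1 makes the step overshoot and diverge.
def run(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return abs(w)

small = run(lr=0.1)   # shrinks toward 0
big = run(lr=1.1)     # grows without bound
```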

33. B
A kernel is a small matrix of weights for extracting features in CNNs.
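Sliding that small weight matrix over the input can be sketched as a "valid" (no padding) 2D convolution; the edge-detecting kernel below is a hypothetical example, and as in most deep-learning libraries the kernel is applied without flipping (i.e., cross-correlation):

```python
def conv2d_valid(image, kernel):
    # "Valid" 2D convolution (no padding): slide the kernel over the image
    # and take the weighted sum at each position.
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# Hypothetical horizontal-edge kernel applied to a 3x3 image.
edges = conv2d_valid([[0, 0, 0],
                      [1, 1, 1],
                      [0, 0, 0]],
                     [[1, 1],
                      [-1, -1]])
```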
