An engineer is developing a system for real-time object detection on a mobile device with limited computational power. Inference speed is the highest priority, even at the cost of slightly lower accuracy, particularly on very small objects. Which object detection model architecture is most suitable for this scenario?