Autonomous vehicle systems can be divided into two main parts: the perception system and the decision-making system. This project focuses on one subsystem of the perception system, an object detector for road obstacles on the roads of Karachi. First, a dataset of 3,000 images was annotated with bounding boxes for 12 kinds of road obstacles. The images are frames extracted at 10-second intervals from about 10 hours of dashcam footage recorded in different areas of Karachi. In parallel, three models (YOLOv3, RetinaNet and Faster R-CNN) were trained on the Berkeley Deep Drive dataset (BDD100k). Due to computational constraints, the models were trained on only 5,000 of the 70,000 images and validated on 1,000 of the 10,000 images available in BDD100k. The resulting models achieved the following mAP on BDD100k: YOLOv3 (29.47), RetinaNet (37.34) and Faster R-CNN (35.78). These models were then used as pretrained models for transfer learning on the Karachi dataset, producing three new models with the following mAP: YOLOv3 (41.67), RetinaNet (67.26) and Faster R-CNN (65.80). Analysis suggests that transfer learning from BDD100k is the most effective technique for training an object detection model on the Karachi dataset.
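The frame-sampling step described above (one frame every 10 seconds of dashcam footage) can be sketched as follows. This is a minimal illustration, not the project's actual extraction script: the function name `sample_frame_indices`, the 30 fps frame rate, and the rounding rule are all assumptions made for the example.

```python
from __future__ import annotations


def sample_frame_indices(total_frames: int, fps: float,
                         interval_s: float = 10.0) -> list[int]:
    """Return the indices of frames spaced `interval_s` seconds apart.

    Hypothetical helper: the source only states that frames were taken
    every 10 seconds; the frame rate and rounding are assumptions.
    """
    if total_frames < 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval_s must be > 0")
    # Number of frames between two consecutive samples (at least 1).
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    assumed_fps = 30.0                       # assumption, not from the source
    total = int(assumed_fps * 10 * 3600)     # about 10 hours of footage
    indices = sample_frame_indices(total, assumed_fps)
    print(len(indices), indices[:3])
```

Under these assumptions, 10 hours of footage gives 3,600 candidate frames, which is of the same order as the 3,000 annotated images; how the candidates were narrowed to the final set is not stated here.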