Published April 30, 2020 | Version v1
Journal article Open

Object Detector for Visually Impaired with Distance Calculation for Humans

  • 1. Asst. Professor, Department of Information Technology, Chaitanya Bharathi Institute of Technology (CBIT), Hyderabad, India
  • 2. Bachelor of Engineering, Information Technology, Chaitanya Bharathi Institute of Technology (CBIT), Hyderabad, India
  • 3. Bachelor of Engineering, Information Technology, Chaitanya Bharathi Institute of Technology (CBIT), Hyderabad, India
  • 1. Publisher


Object detection is a computer vision technique for locating instances of objects in images and videos. Humans looking at an image or video can recognize and locate objects within moments. The main goal of this project is to replicate that ability using deep neural networks and IoT hardware: a Raspberry Pi and a camera. The model could help visually impaired people navigate better and move without collisions. In real-time scenarios, numerous objects appear in a single frame, so a robust model is needed to identify different items simultaneously as they are captured. YOLO (You Only Look Once), a detector based on a convolutional neural network (CNN), helps reach that objective. The algorithm applies a single neural network to the full image; the network divides the image into regions and predicts bounding boxes and probabilities for each region. The bounding boxes are weighted by the predicted probabilities. The second objective of this model is to calculate the distance of humans from the camera; to achieve that, a Haar classifier is created and used. This classifier also enhances human detection in addition to supporting distance calculation. A Haar feature is similar to a kernel in a CNN, except that CNN kernel values are determined by training while Haar feature values are determined manually. Whenever a person is detected by both YOLO and the Haar classifier, a formula that considers the height and width of the human contour is applied to calculate that person's distance from the camera. As objects are identified, their names are read out using a text-to-speech engine known as gTTS (Google Text-to-Speech), which stores the speech in an mp3 file. The Pygame package loads and plays the mp3 file dynamically as objects are detected. The developed deep learning model is integrated with the Raspberry Pi using OpenCV.
Though this project is primarily developed to aid visually impaired people, it can have various other applications, such as self-driving cars, video surveillance, pedestrian detection, and face detection.
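
The abstract does not state the distance formula itself, so the following is only a minimal sketch of one common way such a step could look, not the authors' method. It assumes a pinhole-camera (similar triangles) estimate based on bounding-box height alone, whereas the article's formula also uses the contour width. The names `Box`, `iou`, `confirmed_by_both`, and `estimate_distance_m`, the overlap threshold, and both constants are hypothetical and chosen for illustration; a real focal length would come from camera calibration.

```python
from dataclasses import dataclass

# Assumed values for illustration only; neither comes from the article.
ASSUMED_PERSON_HEIGHT_M = 1.7    # typical standing height of an adult
ASSUMED_FOCAL_LENGTH_PX = 600.0  # focal length in pixels, from calibration


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels: top-left corner plus size."""
    x: float
    y: float
    w: float
    h: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.w, b.x + b.w)
    bottom = min(a.y + a.h, b.y + b.h)
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def confirmed_by_both(yolo_box: Box, haar_box: Box, min_iou: float = 0.5) -> bool:
    """Treat a person as confirmed when the two detectors' boxes overlap enough."""
    return iou(yolo_box, haar_box) >= min_iou


def estimate_distance_m(
    box: Box,
    real_height_m: float = ASSUMED_PERSON_HEIGHT_M,
    focal_length_px: float = ASSUMED_FOCAL_LENGTH_PX,
) -> float:
    """Pinhole model: distance = real height * focal length / height in pixels."""
    if box.h <= 0:
        raise ValueError("box height must be positive")
    return real_height_m * focal_length_px / box.h


if __name__ == "__main__":
    # Hypothetical boxes for the same person from the two detectors.
    yolo_person = Box(100, 50, 120, 340)
    haar_person = Box(105, 55, 115, 330)
    if confirmed_by_both(yolo_person, haar_person):
        print(f"person at about {estimate_distance_m(yolo_person):.1f} m")
```

Requiring overlap between the two detections mirrors the abstract's condition that a person be found by both YOLO and the Haar classifier before a distance is computed. The height-only estimate degrades when the person is partly out of frame or not standing upright, which is one reason a formula might also take the width into account.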




Additional details

Related works

Is cited by
Journal article: 2249-8958 (ISSN)


Retrieval Number