Realtime Face Mask Detector Project (Part 2 : OpenCV Implementation)

Aoishi Das
7 min read · May 6, 2022


Hey Guys!!

In the previous blog we learnt how to build a Face Mask Detector using a CNN, so we now have the trained model architecture with us. Next we need to implement it using OpenCV so that we can detect whether a person is wearing a face mask in real time.

Link for the previous part:

For the OpenCV implementation part we are going to code in Jupyter Notebook.

Why aren’t we using Google Colab?

Google Colab runs on the cloud. As long as we don't need to access any local hardware, Google Colab is a great choice. However, for this project we need to access the webcam, which Colab cannot reach. Hence we are going to use Jupyter Notebook for the OpenCV implementation.

Before we move on, it's very important to know:

What is OpenCV?

OpenCV is an open-source library that provides functions and tools for real-time computer vision problems.

Here we will capture an image using the webcam and send it to the machine learning model as input. The model will return whether the person is wearing a face mask, and that information will be shown on the screen.

Before diving into the code, let's have a look at the output once the project is complete:

Now let's dive into the code:

Step 1 : Importing the libraries

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow.keras.models import model_from_json
import json

  • cv2 is the library for OpenCV.
  • numpy is used for n-dimensional array operations.
  • tensorflow is used to load the model and the weights.
  • json is used to parse JSON files.

Step 2 : Reconstructing the saved model from the JSON file and loading the weights

Now we need to load the model and the weights. If you remember, when we created and trained the CNN model for face mask detection, we saved the architecture as a JSON file and the weights in a .h5 file. Now it's time to load them.

# Model reconstruction from JSON file
with open(r"C:\Users\Suprakash\Downloads\model_architecture_FaceMask_Detection_second_try_three.json", 'r') as f:
    model = model_from_json(f.read())

# Load weights into the new model
model.load_weights(r"C:\Users\Suprakash\Downloads\FaceMask_Detection_second_try_three.h5")

First we load the model architecture. We pass the path of the architecture file and open it with a with block. model_from_json() then reads the file contents, reconstructs the architecture, and the result is stored as model.

Next it's time to load the weights. For this, .load_weights() is used and the path of the weights file is passed.
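As a side note, if you had saved the entire model with model.save() instead of splitting it into architecture and weights, you could reload everything in one call with load_model() (a small sketch; the file path here is just a placeholder):

# Alternative: load architecture + weights in one step
# (only works if the model was saved with model.save())
model = load_model(r"C:\path\to\FaceMask_Detection_full_model.h5")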

If you want, you can check the model summary using model.summary():

model.summary()

Step 3 : Realtime face mask detection

Now we create a dictionary for the labels:

label = {0: "With Mask", 1: "Without Mask"}
color_label = {0: (0, 255, 0), 1: (0, 0, 255)}

We have already seen that 0 corresponds to the class “With Mask” and 1 corresponds to “Without Mask”.

A color label is also associated with each class. We will use it later.

Now for With Mask the color is Green (B=0, G=255, R=0) and for Without Mask the color is Red (B=0, G=0, R=255).

Wait a minute!! Confused about why B is coming first and R at the end?

Well that’s because OpenCV reads colors in the format BGR.
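If you ever need to hand a frame to a library that expects RGB (like matplotlib), you can convert the channel order explicitly. A minimal sketch, assuming frame is an image already read with OpenCV:

# Convert from OpenCV's BGR order to RGB
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)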

Now we will proceed towards capturing the video using the webcam.

cap = cv2.VideoCapture(0)

A VideoCapture object is created to capture the video. Here we select the first webcam of our system by passing 0.
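It's good practice to confirm that the webcam actually opened before entering the loop. A small defensive check (not part of the original code, just a suggestion):

# Sanity check: make sure the webcam was opened successfully
if not cap.isOpened():
    raise RuntimeError("Could not open webcam (device index 0)")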

cascade = cv2.CascadeClassifier(r'C:\Users\Suprakash\Downloads\haarcascade_frontalface_default.xml')

What is a Haar Cascade Classifier?

A Haar Cascade classifier is a machine-learning-based object detection approach: a cascade function is trained on a large number of images, both positive and negative, and is then used to detect objects in other images. There are different pre-trained classifiers, e.g. for detecting eyes, frontal faces, vehicles, pedestrians and so on.

Now in our case we need to detect people’s faces. So we are going to use haarcascade_frontalface_default.xml.

You can download the file from this link:

So a CascadeClassifier object is created.
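To get a feel for how the classifier works on its own, here is a minimal sketch that detects faces in a single still image (test.jpg is a placeholder filename):

# Standalone demo: detect faces in one still image
img = cv2.imread("test.jpg")
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
detections = cascade.detectMultiScale(gray_img, 1.1, 4)
print("Found", len(detections), "face(s)")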

Now we want to keep checking whether the people who come in front of the camera are wearing masks. So we write the rest of the code inside an infinite loop.

while True:

Don’t worry. We definitely have a way to exit from the loop. We will discuss that later. Till then let’s move forward.

rval, frame = cap.read()

cap.read() reads one frame from the webcam. It returns two things:

  1. rval holds the return value, True or False, depending on whether the frame was read correctly.
  2. frame holds the frame read from the webcam.

Confused about what a frame is?

Well, a video is nothing but a sequence of images; a frame is simply one of these images. So we capture one frame, send it to the model, get the output and move on to the next frame.
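You can verify this yourself: the frame we just read is a plain NumPy array (a quick check, not part of the final code):

# A frame is simply a NumPy array of shape (height, width, 3)
print(frame.shape)   # e.g. (480, 640, 3) for a 640x480 webcam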

gray = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

Now we convert the color image into a grayscale image because a grayscale image is easier to process and computationally less intensive: it contains only one intensity channel instead of three color channels. So we convert the captured frame from BGR to grayscale.

faces = cascade.detectMultiScale(gray,1.1,4)

Now we locate the faces in the frame. The parameters used are:

  1. Image : Pass the Grayscale image generated
  2. scaleFactor : Decides how much the image size is reduced at each image scale. We are using 1.1, which shrinks the image by roughly 10% at each scale.
  3. minNeighbors : Specifies how many neighbours each candidate rectangle should have for it to be retained. We are using 4.

The function detectMultiScale returns a list of rectangles for all the faces detected, where each element of the list has 4 values: the x- and y-coordinates of the top-left corner of the rectangle, and the width (w) and height (h) of the rectangle.
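If you are curious, you can print faces to see these rectangles for yourself (a quick sanity check, not part of the final code):

print(faces)   # e.g. [[212 118 161 161]] for a single detected face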

Now we use a for loop and take one face at a time.

for x,y,w,h in faces:
    face_image = frame[y:y+h, x:x+w]
    resize_img = cv2.resize(face_image, (150,150))
    normalized = resize_img/255.0
    reshaped = np.reshape(normalized, (1,150,150,3))
    result = model.predict(reshaped)
    result = result[0][0]
  1. We slice out just the face from the frame and store it as face_image.
  2. A little preprocessing is done because the face will be sent to the CNN model that we trained previously. That model has an input shape of (150,150,3), so we resize the image to match it.
  3. Next we apply normalization: dividing the pixels by 255 brings them into the range 0–1.
  4. The image then needs to be reshaped to 4D, as the model expects a 4D input of the form (batch size, height, width, number of channels). Since we are sending a single image, the batch size is 1, the height and width are 150, and the number of channels is 3.
  5. Now the image is ready to be sent as input to the model. .predict() returns the probability: if it is greater than 0.5 the face belongs to class 1 (i.e. Without Mask), otherwise to class 0 (i.e. With Mask).
if result <= 0.5:
    cv2.rectangle(frame, (x,y), (x+w,y+h), color_label[0], 3)
    cv2.rectangle(frame, (x,y-50), (x+w,y), color_label[0], -1)
    cv2.putText(frame, label[0], (x,y-10), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2)

This part deals with the case when the output is less than or equal to 0.5, i.e. the person is wearing a mask.

cv2.rectangle() is used to draw a rectangle on an image. The parameters used:

  • image: It is the image on which the rectangle is to be drawn.
  • start_point: It is the starting coordinates of the rectangle. The coordinates are represented as tuples of two values i.e. (X coordinate value, Y coordinate value).
  • end_point: It is the ending coordinates of the rectangle. The coordinates are represented as tuples of two values i.e. (X coordinate value, Y coordinate value).
  • color: It is the color of the border line of the rectangle to be drawn. Here we want the color_label[0] which holds the green color.
  • thickness: It is the thickness of the rectangle border line in px. A thickness of -1 px fills the rectangle with the specified color.

We do this twice: first to draw the rectangle around the entire face, and then to draw a small color-filled rectangle on top of the bounding rectangle, where the text saying whether the person is wearing a mask will be placed.

cv2.putText() is used for drawing text strings on an image. The parameters are:

  • image: It is the image on which text is to be drawn.
  • text: Text string to be drawn.
  • org: It is the coordinates of the bottom-left corner of the text string in the image. The coordinates are represented as tuples of two values i.e. (X coordinate value, Y coordinate value).
  • font: It denotes the font type.
  • fontScale: Font scale factor that is multiplied by the font-specific base size.
  • color: It is the color of the text string to be drawn. Here we are using white, so we pass (255,255,255).
  • thickness: It is the thickness of the line in px.

Now a similar code block is used for the else condition i.e. when the person isn’t wearing a mask.

else:
    cv2.rectangle(frame, (x,y), (x+w,y+h), color_label[1], 3)
    cv2.rectangle(frame, (x,y-50), (x+w,y), color_label[1], -1)
    cv2.putText(frame, label[1], (x,y-10), cv2.FONT_HERSHEY_SIMPLEX, 1, (255,255,255), 2)

Here the only changes are in the label and color used.

cv2.imshow('LIVE', frame)

Now using the cv2.imshow() function the frame is shown on the screen. The parameters are:

  • window_name: A string representing the name of the window in which the image is to be displayed.
  • image: It is the image that is to be displayed.

key = cv2.waitKey(10)

cv2.waitKey() waits for the specified duration in milliseconds for a key press, and returns the code of the pressed key (or -1 if no key was pressed).

if key == 27:
    break

27 is the code of the Escape key. So if the user presses the Escape key, the break statement is executed and we come out of the infinite loop.

So if you want to stop the process just press the Escape key on the keyboard.
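If you prefer a letter key instead, Python's built-in ord() gives you its key code. A small variation (not in the original code):

# Variation: exit on either Escape (27) or the 'q' key
if key == 27 or key == ord('q'):
    break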

cap.release()

Finally the capturing device is released.

cv2.destroyAllWindows()

It destroys all the windows opened.
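For reference, here is the whole OpenCV part assembled into a single script. The file paths are placeholders (substitute your own), and the duplicated if/else drawing code is folded into a single index lookup; otherwise it is exactly the code we walked through above:

import cv2
import numpy as np
from tensorflow.keras.models import model_from_json

# Reconstruct the model and load the trained weights
with open(r"model_architecture.json", 'r') as f:       # placeholder path
    model = model_from_json(f.read())
model.load_weights(r"model_weights.h5")                 # placeholder path

label = {0: "With Mask", 1: "Without Mask"}
color_label = {0: (0, 255, 0), 1: (0, 0, 255)}          # green / red, in BGR

cap = cv2.VideoCapture(0)
cascade = cv2.CascadeClassifier(r"haarcascade_frontalface_default.xml")  # placeholder path

while True:
    rval, frame = cap.read()
    if not rval:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, 1.1, 4)
    for x, y, w, h in faces:
        # Crop the face, then preprocess it for the CNN
        face_image = frame[y:y+h, x:x+w]
        resize_img = cv2.resize(face_image, (150, 150))
        normalized = resize_img / 255.0
        reshaped = np.reshape(normalized, (1, 150, 150, 3))
        result = model.predict(reshaped)[0][0]
        idx = 0 if result <= 0.5 else 1                 # 0 = With Mask, 1 = Without Mask
        cv2.rectangle(frame, (x, y), (x+w, y+h), color_label[idx], 3)
        cv2.rectangle(frame, (x, y-50), (x+w, y), color_label[idx], -1)
        cv2.putText(frame, label[idx], (x, y-10),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
    cv2.imshow('LIVE', frame)
    if cv2.waitKey(10) == 27:                           # Escape key
        break

cap.release()
cv2.destroyAllWindows()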

Hope you guys enjoyed this project.

Project Link:

Feel free to comment down below if you have any doubts..

All the best and Happy Learning..!!

