๐ Colab Notebook
๐ Blog Post
Description
In this project, I trained a neural network to localize key points on faces. Resnet-18 was used as the model with some slight modifications to the input and output layer. The model was trained on the official DLib Dataset containing 6666 images along with corresponding 68-point landmarks for each face. Additionally, I wrote a custom data preprocessing pipeline in PyTorch to increase variance in the input images to help the model generalize better. The neural network was trained for 30 epochs before it reached the optima.
During inference, OpenCV Harr Cascades are used to detect faces in the input images. Detected faces are then cropped, resized to (224, 224), and fed to our trained neural network to predict landmarks in them. The predicted landmarks in the cropped faces are then overlayed on top of the original image.