Face Recognition with MTCNN and FaceNet; RL with Proximal Policy Optimization
#CellStratAILab #disrupt4.0 #WeCreateAISuperstars #AlwaysUpskilling
Minutes from Saturday 7th March 2020 AI Lab meetup at BLR :-
Last Saturday, we had excellent sessions in the AI Lab meetup.
Face Recognition with MTCNN and FaceNet :-
First, Amit Kumar presented a detailed overview of Face Recognition with MTCNN and FaceNet.
Face Recognition involves a pipeline of Face Detection, Feature Extraction and Face Classification. MTCNN is a face detector built as a cascade of three networks (a minimal detection sketch follows the list below) :-
- P-net : Proposal Network to propose candidate facial regions
- R-net : Refine Network to filter and refine the bounding boxes
- O-net : Output Network to further refine the bounding boxes and detect five facial landmarks on the face
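Below is a minimal detection sketch in Python using the open-source mtcnn package (pip install mtcnn). The package choice and the file name celebrity.jpg are assumptions for illustration; Amit's demo may have used a different setup :-

```python
# Minimal MTCNN face-detection sketch (assumes: pip install mtcnn opencv-python).
import cv2
from mtcnn import MTCNN

detector = MTCNN()

# The mtcnn package expects an RGB image; OpenCV loads BGR, so convert.
image = cv2.cvtColor(cv2.imread("celebrity.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical file

for face in detector.detect_faces(image):
    x, y, w, h = face["box"]                   # bounding box after R-Net/O-Net refinement
    print("confidence :", face["confidence"])  # face vs. non-face probability
    print("box :", (x, y, w, h))
    print("landmarks :", face["keypoints"])    # the five O-Net landmarks :
                                               # eyes, nose, mouth corners
```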
The loss function is a weighted sum of three different losses (written out after the list) :-
- L-det : Cross-entropy loss for the face vs. non-face classification of a sample
- L-box : bounding-box regression loss
- L-landmark : distance loss between predicted landmarks and ground truth
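Written out, the combined loss is the weighted sum below. The alpha values quoted in the comment are the ones reported in the original MTCNN paper (Zhang et al., 2016), not something specific to Amit's demo :-

```latex
% Weighted multi-task loss; the alphas are hyperparameters.
% (The MTCNN paper uses alpha_det = 1, alpha_box = 0.5, alpha_landmark = 0.5
%  for P-Net and R-Net, and raises alpha_landmark to 1 for O-Net.)
L = \alpha_{det}\,L_{det} + \alpha_{box}\,L_{box} + \alpha_{landmark}\,L_{landmark}
```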
FaceNet takes the face crops output by MTCNN and recognizes them. FaceNet learns a unified embedding for face recognition and clustering : it extracts face features as an embedding vector, and the Euclidean distance between two face-image embeddings then measures how similar the faces are.
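As a rough sketch of that verification step (the threshold value here is an assumption; in practice it is tuned on a validation set of matching and non-matching pairs) :-

```python
# Compare two FaceNet-style embedding vectors by Euclidean distance.
import numpy as np

THRESHOLD = 1.1  # assumed value; tune on held-out face pairs

def same_person(emb_a: np.ndarray, emb_b: np.ndarray) -> bool:
    distance = np.linalg.norm(emb_a - emb_b)  # Euclidean distance between embeddings
    return distance < THRESHOLD               # small distance -> same identity
```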
The loss function used in FaceNet is a triplet loss. Here a baseline anchor input is compared to a positive sample (same identity) and a negative sample (different identity). The triplet loss minimizes the distance between the anchor and the positive sample while maximizing the distance between the anchor and the negative sample.
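A minimal numpy sketch of the triplet loss, assuming the margin of 0.2 used in the FaceNet paper :-

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    pos_dist = np.sum((anchor - positive) ** 2)    # squared anchor-positive distance
    neg_dist = np.sum((anchor - negative) ** 2)    # squared anchor-negative distance
    return max(pos_dist - neg_dist + margin, 0.0)  # zero once the margin is satisfied
```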
Amit demonstrated a code demo for celebrity face detection in images using MTCNN-FaceNet architecture.
For a detailed discussion on MTCNN and FaceNet algorithms, click here.
RL with Policy Gradients and Proximal Policy Optimization :-
Next came an in-depth presentation on Proximal Policy Optimization (PPO) by Shubha M.
Reinforcement Learning (RL) broadly involves Value-based methods and Policy-based methods.
Policy methods learn the optimal policy for an RL agent directly. Policy Gradients are a class of policy methods that optimize the weights of a parameterized policy by means of gradient ascent. Policy methods are well suited to continuous action spaces (such as steering a moving car).
A trajectory is a sequence of states and actions in the world (s0, a0, s1, a1, …). There can be numerous trajectories. The objective function is the expected reward over trajectories, and training picks the policy weights that maximize it by gradient ascent with respect to those weights.
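In standard notation, the objective and its gradient take the familiar textbook form below (this is the generic REINFORCE-style estimator, not a formula taken verbatim from the talk) :-

```latex
% Expected total reward over trajectories sampled from the policy:
J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\big[R(\tau)\big]
% Policy gradient (log-derivative trick), ascended with respect to the weights:
\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\Big[\sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,R(\tau)\Big]
```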
Vanilla Policy Gradient (VPG) is one kind of policy-gradient algorithm. Compared to the basic PG algorithm called REINFORCE, VPG subtracts a baseline reward (b) from the current reward in each iteration. The baseline is adjusted in subsequent iterations, which reduces the variance of the gradient estimate and moves training towards a better policy.
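A minimal numpy sketch of the baseline idea; the running-mean baseline here is one simple, common choice and an assumption on our part, not necessarily the variant shown in the session :-

```python
import numpy as np

def advantages_with_baseline(returns, baseline, lr=0.1):
    returns = np.asarray(returns, dtype=float)
    advantages = returns - baseline               # (R - b) : the weight on grad log pi
    baseline += lr * (returns.mean() - baseline)  # nudge baseline towards recent returns
    return advantages, baseline

# Each policy-gradient term then becomes:
#   grad_theta log pi_theta(a_t | s_t) * advantage_t
adv, b = advantages_with_baseline([1.0, 0.5, 2.0], baseline=0.0)
```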
Another kind of policy algorithm is Proximal Policy Optimization or PPO. It uses Importance Sampling, where one weights updates by the ratio between the probability of recreating the old trajectory under the new policy and under the old policy. PPO has several more aspects, such as the Surrogate Function, the Clipped Objective and the Advantage Function. For more details on the entire PPO algorithm, refer to this blog here.
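As a taste of the Clipped Objective, here is a minimal numpy sketch of PPO's clipped surrogate (eps = 0.2 is the clip range from the original PPO paper; the function and variable names are ours) :-

```python
import numpy as np

def ppo_clipped_objective(logp_new, logp_old, advantages, eps=0.2):
    ratio = np.exp(logp_new - logp_old)            # pi_new / pi_old per action
    unclipped = ratio * advantages                 # plain importance-sampled surrogate
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return np.minimum(unclipped, clipped).mean()   # pessimistic (lower) bound, to maximize

# Hypothetical per-step log-probabilities and advantages:
obj = ppo_clipped_objective(np.array([-0.9, -1.2]),
                            np.array([-1.0, -1.0]),
                            np.array([0.5, -0.3]))
```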
CellStrat AI Lab is the benchmark of AI innovation in India. Our AI/ML courses train expert-level professionals, all of whom have been gainfully placed in top firms.
Visit our AI Lab this Saturday in BLR to check out our AI projects and training programs.
BLR AI Lab :
Register : https://bit.ly/39EtKeq
Topic : Hands-on Workshop on Stock Market predictions with Actor Critic RL
Date : Saturday 14th Mar 2020, 10:30 AM – 5:00 PM
Presenter : Shubha M. (CellStrat AI Researcher)
See you this Saturday for the AI Lab meetup in BLR ! Let's disrupt the world with AI !
Questions ? Call me at +91-9742800566 !
Co-Founder & Chief Data Scientist, CellStrat