For a project in our MSc Artificial Intelligence at the University of Amsterdam, Gilles de Hollander and I implemented a new system for automatically fixing group portraits. You are probably familiar with the situation: you try to take a good picture of a group of people, but not everyone is smiling at the same time, people look away, they have their eyes closed, and so on. We used computer vision techniques to automatically combine a series of ‘bad’ group portraits into a single good one, where everybody smiles.
Step One: Finding Faces
This was the easiest part of the project. Face detection has pretty much been solved: the Viola-Jones object detection framework works more than well enough for our needs. Many implementations of the Viola-Jones algorithm are available for download; we used the one included in OpenCV.
Step Two: Rating Smiles
Now that we know where the faces are, it’s time for the real A.I. task: judging whether the faces are smiling or not. I’ll give a quick rundown of how that works. We collected and labeled about 6000 images of smiling and “not smiling” people and trained a support vector machine (SVM) on them. As preprocessing, we computed the histogram of oriented gradients (HOG) of each image (refer to the report for more details). The trained SVM can then assign a smile score to each face. We use these scores to sort the faces, and pick the most smiling face for each person in the image.
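The scoring and selection can be sketched roughly as follows, using only NumPy. This is a simplified illustration, not our implementation. The HOG here keeps one orientation histogram per cell and skips the block normalisation of the full descriptor. The score assumes a linear SVM whose `weights` and `bias` have already been learned (the kernel is an assumption). All function names are hypothetical, and faces are assumed to be matched across photos already.

```python
import numpy as np


def hog_features(gray, cell=8, bins=9):
    """Simplified HOG: one gradient-orientation histogram per cell."""
    gray = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(gray)  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = gray.shape[0] // cell, gray.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            block = (slice(r * cell, (r + 1) * cell),
                     slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[r, c] = np.bincount(bin_index[block].ravel(),
                                     weights=magnitude[block].ravel(),
                                     minlength=bins)
    vec = hist.ravel()
    return vec / (np.linalg.norm(vec) + 1e-9)


def smile_score(face_gray, weights, bias):
    """Decision value of a linear SVM: larger means 'more smiling'."""
    return float(hog_features(face_gray) @ weights + bias)


def pick_best_photos(scores):
    """scores[photo][person] -> index of the best photo for each person."""
    return np.argmax(np.asarray(scores), axis=0).tolist()
```

For example, with two photos and two people, `pick_best_photos([[0.1, 2.0], [1.5, -0.3]])` selects photo 1 for the first person and photo 0 for the second.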
Step Three: Stitching it all back together
The third and final step is to take the areas of the photos that were rated as the best smiling faces and stitch them back together into a neat-looking composite. We have to find places where we can cut the image without creating a visible seam. Obviously, just pasting the smiling faces onto the non-smiling faces will give conspicuous artifacts, as you can see in the following image.
State-of-the-art image segmentation methods use a technique called minimum graph cut. As input, you define a map of “costs” that indicates where seams can be placed without being obtrusive. The min-cut/max-flow algorithm then finds the cheapest “cut” through the image.
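To make the idea concrete, here is a toy sketch in plain Python. It is not the solver from our project: it uses the textbook Edmonds-Karp augmenting-path algorithm, which is only practical for tiny grids. Each pixel is a node, and cutting between two neighbouring pixels costs the sum of their cost-map values. As a simplifying assumption, the leftmost column is tied to photo A and the rightmost column to photo B. The function name `min_cut_labels` is hypothetical.

```python
from collections import deque


def min_cut_labels(cost):
    """Label each pixel 0 (photo A) or 1 (photo B) using a minimum graph cut.

    cost[r][c] is high where a seam would be visible. The grid needs at
    least two columns, and integer costs avoid floating-point residue.
    """
    rows, cols = len(cost), len(cost[0])
    source, sink = "A", "B"
    inf = float("inf")
    capacity = {}

    def add_edge(u, v, cap):
        # Undirected edge: the same capacity in both directions.
        capacity.setdefault(u, {})
        capacity.setdefault(v, {})
        capacity[u][v] = capacity[u].get(v, 0) + cap
        capacity[v][u] = capacity[v].get(u, 0) + cap

    for r in range(rows):
        add_edge(source, (r, 0), inf)        # left column must stay A
        add_edge((r, cols - 1), sink, inf)   # right column must become B
        for c in range(cols):
            if c + 1 < cols:
                add_edge((r, c), (r, c + 1), cost[r][c] + cost[r][c + 1])
            if r + 1 < rows:
                add_edge((r, c), (r + 1, c), cost[r][c] + cost[r + 1][c])

    def search():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in capacity[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return parent

    while True:
        parent = search()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck along the path, then push that much flow.
        flow, v = inf, sink
        while parent[v] is not None:
            flow = min(flow, capacity[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            capacity[u][v] -= flow
            capacity[v][u] += flow
            v = u

    # Pixels still reachable from the source lie on the A side of the cut.
    return [[0 if (r, c) in parent else 1 for c in range(cols)]
            for r in range(rows)]
```

With a cost map such as `[[9, 9, 1, 1, 9, 9]] * 3`, the cut runs through the cheap middle columns, so the three left columns are labelled 0 and the three right columns are labelled 1. Real images need a dedicated max-flow implementation, since this toy version is far too slow for them.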
As a final touch, we apply a tiny bit of blending to the edges. We applied the procedure to a set that includes the two pictures at the beginning of the post plus two more images; the result is shown below.
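One simple way to do this kind of edge blending is to feather the cut mask, so that pixels near the seam become a weighted mix of both photos. The sketch below assumes NumPy and a box blur as the feathering filter. Both are illustrative choices rather than the exact blending from our project, and `feather_blend` is a hypothetical name.

```python
import numpy as np


def feather_blend(base, patch, mask, radius=2):
    """Mix patch into base, softening the mask edge with a box blur.

    base and patch have the same shape (grayscale or colour); mask is a
    2-D array that is 1 where the patch should be used and 0 elsewhere.
    """
    alpha = np.asarray(mask, dtype=np.float64)
    size = 2 * radius + 1
    kernel = np.ones(size) / size

    # Separable box blur: pad with edge values, then average along each axis.
    padded = np.pad(alpha, radius, mode="edge")
    blurred = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    blurred = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, blurred)

    if np.ndim(base) == 3:
        blurred = blurred[..., None]  # broadcast over colour channels
    return blurred * patch + (1.0 - blurred) * base
```

Far from the seam the output equals one photo or the other. Within `radius` pixels of the seam, it fades gradually from one to the other.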
This whole procedure requires no setup or intervention: you put the group shots in at one end, and a neat composite comes out the other end in less than a minute. We feel that this method could be used for a special “group shot mode” in digital cameras. This mode would take multiple shots and composite them all at the press of a single button. We are also looking at the feasibility of turning this project into an iPhone app. If you want to know more about the exact implementation, read the report.