Digital systems for analyzing and processing faces are now widely used, for example to identify people from a photo or to sign documents. One branch of face analysis is face replacement, also known as face swapping. In this post, we will talk about this technology.
To begin with, what does face replacement provide? A face swap system takes a selected photo or video stream and replaces the face in the original image with a face taken from another image. Where can it be applied? Such systems are already used for entertainment, such as celebrity masks in popular social services, and as supporting software for video recording studios, video editing, and so on. Each face swapping app or system has its own implementation, which raises the question of how to make such a system cheaper to implement by automating the approach as much as possible.
How do these systems work, and what do you need to swap faces and produce a new, realistic photo or video?
We begin with a method based on affine transformations of a triangular face mesh. First, key points are detected on both images. Next, a mesh of triangles is built over these key points so that the triangles together cover the face in each photo. Finally, each triangle from the original image is transformed into the corresponding triangle of the target image according to its position.
The advantage of this method is speed. The most computationally expensive step is the neural network that finds the facial key points; the remaining transformations do not need much computing power, so this face swap solution can run in real time on many devices without cloud computing, unlike the solutions below.
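To make the triangle step more concrete, here is a minimal, dependency-free sketch of warping one triangle onto another. It assumes the key points and the triangle mesh are already available, and it uses barycentric coordinates, which is equivalent to applying the affine transformation between the two triangles. The function names (`barycentric`, `map_point`, `warp_triangle`) and the list-of-rows image format are illustrative assumptions, not part of any particular app; a real pipeline would more likely rely on an image library such as OpenCV.

```python
def barycentric(point, triangle):
    """Return the barycentric weights of `point` with respect to `triangle`."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    px, py = point
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (px - x3) + (x3 - x2) * (py - y3)) / det
    w2 = ((y3 - y1) * (px - x3) + (x1 - x3) * (py - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def map_point(point, from_triangle, to_triangle):
    """Apply the affine map that sends `from_triangle` onto `to_triangle`."""
    w1, w2, w3 = barycentric(point, from_triangle)
    (x1, y1), (x2, y2), (x3, y3) = to_triangle
    return (w1 * x1 + w2 * x2 + w3 * x3, w1 * y1 + w2 * y2 + w3 * y3)


def warp_triangle(src_img, dst_img, src_tri, dst_tri):
    """Copy the pixels of `src_tri` in `src_img` into `dst_tri` in `dst_img`.

    Images are assumed to be lists of rows (img[y][x]); `dst_img` is
    modified in place and returned.
    """
    height, width = len(dst_img), len(dst_img[0])
    xs = [p[0] for p in dst_tri]
    ys = [p[1] for p in dst_tri]
    eps = 1e-9  # tolerance so that pixels on the triangle edge count as inside
    for y in range(max(0, int(min(ys))), min(height - 1, int(max(ys))) + 1):
        for x in range(max(0, int(min(xs))), min(width - 1, int(max(xs))) + 1):
            if min(barycentric((x, y), dst_tri)) < -eps:
                continue  # pixel lies outside the destination triangle
            # Inverse mapping: find where this destination pixel comes from.
            sx, sy = map_point((x, y), dst_tri, src_tri)
            # Nearest-neighbour sampling, clamped to the source image bounds.
            sx = min(max(int(round(sx)), 0), len(src_img[0]) - 1)
            sy = min(max(int(round(sy)), 0), len(src_img) - 1)
            dst_img[y][x] = src_img[sy][sx]
    return dst_img


# Tiny usage example on a synthetic 4x4 "image".
source = [[100 + 10 * y + x for x in range(4)] for y in range(4)]
target = [[0] * 4 for _ in range(4)]
triangle = [(0, 0), (3, 0), (0, 3)]
warp_triangle(source, target, triangle, triangle)
```

Running `warp_triangle` once per mesh triangle covers the whole face. The loop iterates over destination pixels and looks up their source position, which avoids leaving holes in the result.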
The next approach is built on a relatively simple idea: train an autoencoder neural network that consists of two parts, an encoder and a decoder. The encoder compresses the input image into a hidden (latent) representation, and the decoder reconstructs the original image from it. The distinctive feature of this network is that the encoder is shared, while each person has their own decoder.
The algorithm works as follows. Suppose we want to replace the first person's face with the face of the person from the second image. We run the first person's face through the shared encoder and then pass the resulting latent representation through the second person's decoder. Essentially, the encoder's job is to produce a hidden representation that describes the facial expression, the direction of the gaze, the turn of the face, skin tones, and so on. The decoder, having learned one specific face, applies these features to reconstruct an image of that face.
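The data flow of the shared encoder and the per-person decoders can be sketched in a few lines. This is only an illustration of the routing, under heavy assumptions: the "networks" are untrained random linear layers, faces are flat lists of numbers, and the class and identity names (`FaceSwapAutoencoder`, `person_a`, `person_b`) are hypothetical. A real system would use trained convolutional networks in a deep learning framework.

```python
import random


def linear(weights, vector):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights; a real model would train these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class FaceSwapAutoencoder:
    """One shared encoder, one decoder per identity (illustrative only)."""

    def __init__(self, input_dim, latent_dim, identities, seed=0):
        rng = random.Random(seed)
        self.encoder = random_matrix(latent_dim, input_dim, rng)
        self.decoders = {
            name: random_matrix(input_dim, latent_dim, rng) for name in identities
        }

    def encode(self, face):
        # Shared by all identities: compresses a face into a latent vector.
        return linear(self.encoder, face)

    def decode(self, latent, identity):
        # Identity-specific: renders the latent vector as that person's face.
        return linear(self.decoders[identity], latent)

    def reconstruct(self, face, identity):
        # Training path: a person's face goes through their own decoder.
        return self.decode(self.encode(face), identity)

    def swap(self, face, target_identity):
        # Inference path: the latent code of one person's face is rendered
        # by the decoder of another person.
        return self.decode(self.encode(face), target_identity)


model = FaceSwapAutoencoder(input_dim=8, latent_dim=3,
                            identities=["person_a", "person_b"])
face_a = [0.1 * i for i in range(8)]      # placeholder for a face image
swapped = model.swap(face_a, "person_b")  # rendered by person_b's decoder
```

During training, each decoder only ever sees faces of its own person through `reconstruct`, while the encoder sees everyone. That is what lets the latent vector carry pose and expression while the decoder supplies the identity.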
The next system is one of the most effective face swapping systems in terms of output quality. Its main drawbacks, which can be decisive for some tasks, are the need for manual tuning and for large computing power. An internet connection is also required to use this face swap system. The algorithm is similar to the autoencoder solution and adds extra features, such as manual correction of individual processed frames. Note that high-quality results require numerous source images of each person being transferred, and these images must be varied in face rotation angle, lighting, and even mouth and eye movement.
In general, this face swapping system is a good option when output quality is the determining factor. In other cases, alternative solutions are needed.
In the previous approach, we considered a model that requires significant intervention by a specialist. But what if we want a fully automated process? For such a task, there is a solution that needs only one image: First Order Motion Model. The project's source code is open on GitHub. The idea is not to generate new frames from scratch but to track the positions of key points and apply affine transformations to the target image accordingly.
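As a rough illustration of keypoint-driven affine motion, the sketch below maps a pixel near a key point in the driving frame back to a position in the single source image using a first-order (affine) approximation. This is a simplification written for this post, not code from the project: the function name `local_affine_motion` and its arguments are hypothetical, and in the real model the key points and the 2x2 matrices are predicted by a neural network.

```python
def local_affine_motion(pixel, kp_driving, kp_source, jacobian):
    """Map a driving-frame pixel near a key point to the source image.

    First-order approximation around the key point:
        source_position = kp_source + J * (pixel - kp_driving)
    where J is a 2x2 matrix given as ((a, b), (c, d)).
    """
    dx = pixel[0] - kp_driving[0]
    dy = pixel[1] - kp_driving[1]
    (a, b), (c, d) = jacobian
    return (kp_source[0] + a * dx + b * dy,
            kp_source[1] + c * dx + d * dy)


# With an identity matrix the motion is a pure shift between the key points.
identity = ((1, 0), (0, 1))
shifted = local_affine_motion((6, 5), (5, 5), (2, 3), identity)
```

Each key point contributes its own local affine map, so different parts of the face can move differently while every map stays cheap to evaluate.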
This face swap model has a few significant drawbacks. The main one is poor quality on face rotations: parts of the body that are hidden in the initial image are difficult for the generator to reconstruct. Another is that the image is not adapted to the target skin color.
We have reviewed different neural-network-based ways to solve the face swapping problem, from the simplest to the state-of-the-art (SOTA) solutions that currently exist. Of course, there are many face swap apps on the market, but they all rely on the technologies above or similar ones.
We hope that you have learned more about face swapping systems and how they differ. As you have read, several face swap systems are built on similar technologies, yet they differ a lot. A better understanding of them should help you use this kind of technology efficiently and choose the best face swapping app. Start swapping faces from static photos and make your own unique pictures!