What exactly is “CFG (classifier-free guidance)”, the setting that determines how closely the prompt is followed in the image generation AI “Stable Diffusion”? - GIGAZINE


“Stable Diffusion”, an AI that generates images based on input prompts, has been attracting attention from people all over the world since it was released to the public, and guides to installing and using it have also been published. One of the settings used when generating images with Stable Diffusion is the “CFG scale (classifier-free guidance scale)” value.

Classifier-Free Diffusion Guidance (PDF)
https://arxiv.org/pdf/2207.12598.pdf

[Paper commentary] Understanding OpenAI “GLIDE” | A fun introduction to AI and machine learning
https://data-analytics.fun/2022/02/13/openai-glide/#toc5

The Road to Realistic Full-Body Deepfakes – Metaphysic.ai
https://metaphysic.ai/the-road-to-realistic-full-body-deepfakes/

When generating images or audio with AI, one approach is not to rely only on training with labeled data, but to prepare a separate classification model (classifier) and use it at sampling time to steer generation toward a realistic result. This method is called “classifier guidance”. A method that improves on it by jointly training a conditional and an unconditional diffusion model, so that no separate classifier has to be prepared, is called “CFG (classifier-free guidance)”.

Introducing CFG makes it possible to improve sample quality in a diffusion model at the cost of reduced sample diversity. In the following images, released by the Google researchers who published CFG, the left shows “images generated without using CFG” and the right shows “images generated using CFG”. You can see that the images generated using CFG are very similar to one another in composition and subject, but are also of higher quality.


CFG is also one of the main parameters in Stable Diffusion. The larger the CFG scale, the more closely the generated image follows the prompt or, in “img2img”, where a new image is generated based on an input image, the input image, but the more likely the image is to be distorted. On the other hand, the smaller the CFG scale, the more likely the result is to move away from the prompt or input image, but the better the quality.
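The effect of the CFG scale can be illustrated with a minimal sketch. It assumes the formulation commonly used in Stable Diffusion implementations, in which the model predicts noise once with the prompt and once without at each sampling step, and the two predictions are combined as uncond + scale * (cond - uncond). The function name apply_cfg and the toy numbers are hypothetical and only for illustration.

```python
from typing import List


def apply_cfg(eps_uncond: List[float], eps_cond: List[float], scale: float) -> List[float]:
    """Combine unconditional and prompt-conditioned noise predictions.

    scale = 0 ignores the prompt, scale = 1 uses the conditional prediction
    as-is, and scale > 1 extrapolates past it, pushing harder toward the prompt.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions (hypothetical values, not real model output).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

for scale in (0.0, 1.0, 7.5, 28.0):
    print(scale, apply_cfg(eps_uncond, eps_cond, scale))
```

With a large scale, the combined prediction moves well outside the range spanned by the two inputs, which is one plausible intuition for why very high CFG values tend to distort the image.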


Metaphysic, an AI development company, notes: “The LAION dataset on which the model was trained is reliable, and even short and simple img2img instructions give valid results, so large CFG scales are often unnecessary.” However, if you want to generate something that Stable Diffusion was not trained on, something that is difficult for the AI to produce coherently, or a combination of multiple people or concepts, you may need to raise the CFG scale or the denoising strength.

One of the things Stable Diffusion is not good at when generating images with img2img is “changing the color of clothes”. Below is the screen where Metaphysic fed a picture of a woman in a red dress into Stable Diffusion's img2img and tried to generate an image from the prompt “Woman in a blue evening dress”. The CFG scale is set to “13.5”, which is slightly higher than the generally recommended value (around 7 to 11), and even though “red” was specified as a forbidden word, the woman's evening dress remains red.


Metaphysic also used Stable Diffusion to attempt “changing the color of the clothes the performer was wearing from dark blue to red, and changing her face to that of Salma Hayek”. The result is shown in the following video.

CFG Hayek Comparison – YouTube

The left side was generated with the standard parameters “CFG: 12, denoising strength: 0.46”, and the right side with parameters close to the maximum, “CFG: 28, denoising strength: 0.94”. The movement of the female performer, which is the input data, is shown in the lower left frame. The texture and pose clearly differ between the left and right sides: the left side is much more accurate as an image, and only the color of the dress on the right side is more in line with the prompt.


On the left side, the movement is almost the same as the performer's, but on the right side the figure sometimes takes a pose completely different from the performer's.


If the CFG scale is too large, quality suffers and the image sometimes becomes completely different from the input. The instruction to “change the color of the dress to red” was certainly followed faithfully, but other parts of the image were badly damaged.
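Besides CFG, the two settings compared in the video differ in their second parameter, the img2img denoising strength. As a rough sketch, assuming the convention used by common Stable Diffusion img2img implementations, in which the strength sets the fraction of the sampling schedule that is actually run on the noised input image, the number of denoising steps can be estimated as follows. The function name img2img_steps and the total of 50 steps are assumptions for illustration; the article does not state the step count that was used.

```python
def img2img_steps(strength: float, num_inference_steps: int = 50) -> int:
    """Estimate how many denoising steps img2img runs for a given strength.

    Assumes strength in [0, 1] scales the schedule: 0 leaves the input image
    untouched, 1 runs the full schedule starting from (almost) pure noise.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


# The two settings compared in the video (the step count of 50 is assumed).
for label, cfg, strength in (("left", 12, 0.46), ("right", 28, 0.94)):
    steps = img2img_steps(strength)
    print(f"{label}: CFG {cfg}, strength {strength}, about {steps} of 50 steps")
```

Under this assumption, the right-hand setting regenerates almost the entire image from noise, which would be consistent with its pose and texture drifting away from the performer.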


As a way to mitigate these problems, Metaphysic suggests starting from source material that is somewhat closer to what you ultimately want to produce.

