Current version as of 4 April 2023, 09:33
Stable Diffusion
TBD
Metalab
anlumo
Talk
0
planning
Open Source AI Image Generation
Last updated: 04.04.2023
Stable Diffusion Introduction
Stable Diffusion (SD) is a new open source project for generating and manipulating images based on text prompts locally on the user's computer (with a powerful GPU). Even the training itself can be done with open source tools (and a lot of compute power).
anlumo has spent quite a significant amount of time learning the Ways of the Prompt Artist™ and wonders if anybody else would be interested in learning about this completely new field.
What Can Stable Diffusion Do?
Stable Diffusion is a collection of tools for AI art generation. Its capabilities include:
- Text-to-image: Enter a text prompt (positive and negative) and generate a low-res image from it.
- Image-to-image: Take an image as input and modify it based on a text prompt. This can be used for style transfer, for example, or for reusing the composition of another image in a new creation. (example: https://www.reddit.com/r/StableDiffusion/comments/1196vyi)
- Inpainting: Same as image-to-image, but only modify a part of the image. This can be used to add or remove details in images, for example. (example: https://www.reddit.com/r/StableDiffusion/comments/11gbijd)
- Outpainting: Same as image-to-image, but extending an existing image instead. For example, if you have an image of the upper half of a person, you can add the lower half or add more of the environment (based on the text prompt).
- ControlNet: Applicable to any of the above. Take a reference image, extract some property of it, like the pose of a person or a depth map, and nudge the AI to generate one of the above outputs with this extra information (example: https://www.reddit.com/r/StableDiffusion/comments/11fn96y). This can also be used in text-to-image to convert a pencil sketch to a photorealistic image, for example (example: https://www.reddit.com/r/StableDiffusion/comments/11h0m9v).
- Upscaling of images: This can increase the resolution of an image by adding details that weren't in the original image (like individual strands of hair). Usually this is used to increase the low resolution output of the techniques above to usable resolutions.
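To make the inpainting and outpainting items above more concrete, here is a minimal conceptual sketch in plain Python. It is not how SD is implemented internally; it only illustrates the usual idea of a mask that marks which pixels may be repainted. Images are nested lists of numbers, and the names `pad_for_outpainting`, `composite` and `fake_model_output` are made up for this illustration.

```python
# Conceptual sketch only: images are nested lists of pixel values,
# and the "model output" is a hard-coded stand-in, not a real SD call.

def pad_for_outpainting(image, pad_bottom, fill=0):
    """Extend `image` downward by `pad_bottom` rows and build a matching mask.

    In the mask, 1 marks pixels that may be repainted and 0 marks pixels
    that must stay exactly as they are in the original.
    """
    width = len(image[0])
    canvas = [row[:] for row in image] + [[fill] * width for _ in range(pad_bottom)]
    mask = [[0] * width for _ in image] + [[1] * width for _ in range(pad_bottom)]
    return canvas, mask


def composite(original, generated, mask):
    """Keep original pixels where the mask is 0, take generated pixels where it is 1."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# "Upper half of a person": a 2-row image that gets 2 new rows below it.
upper_half = [[5, 5, 5], [6, 6, 6]]
canvas, mask = pad_for_outpainting(upper_half, pad_bottom=2)

# Stand-in for whatever the model would generate for the full canvas.
fake_model_output = [[9, 9, 9] for _ in canvas]

result = composite(canvas, fake_model_output, mask)
print(result)  # [[5, 5, 5], [6, 6, 6], [9, 9, 9], [9, 9, 9]]
```

Inpainting follows the same pattern without the padding step: the mask covers a region inside the existing image instead of newly added rows.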
There have been some recent attempts to apply these capabilities to video as well. Here is a YouTube Short demonstrating this (full video of the end result).
So, SD is much more capable than commercial offerings like Midjourney. However, it also has far more knobs to adjust and settings to optimize. Hence the idea for this talk.
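To give a feel for the kind of knobs meant here, the following is a rough sketch that assumes the Hugging Face `diffusers` library, which this page does not mention and which may not be the tool used in the talk. The `GenerationSettings` class, the default values and the model identifier are illustrative assumptions, not recommendations.

```python
from dataclasses import dataclass, asdict


@dataclass
class GenerationSettings:
    """A few common text-to-image settings (defaults here are assumptions)."""
    prompt: str
    negative_prompt: str = ""       # things the image should NOT contain
    num_inference_steps: int = 30   # more steps: slower, often more detail
    guidance_scale: float = 7.5     # how strongly to follow the prompt
    width: int = 512
    height: int = 512


def to_pipeline_kwargs(settings):
    """Turn the settings into keyword arguments for a diffusers pipeline call."""
    return asdict(settings)


def generate(settings, model_id="CompVis/stable-diffusion-v1-4", seed=0):
    """Run text-to-image on a local GPU.

    Needs `torch`, `diffusers` and a CUDA-capable card; `model_id` is only
    an example identifier (an assumption, swap in whichever model you use).
    """
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    generator = torch.Generator("cuda").manual_seed(seed)  # fixed seed for repeatable output
    return pipe(**to_pipeline_kwargs(settings), generator=generator).images[0]


settings = GenerationSettings(
    prompt="a lighthouse at sunset, oil painting",
    negative_prompt="blurry, low quality",
)
kwargs = to_pipeline_kwargs(settings)
print(sorted(kwargs))
```

Only `generate` touches the GPU; building and inspecting the settings works on any machine, which makes it easy to compare parameter sets before spending compute on them.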
Plan
anlumo's plan would be to hold an interactive talk showing off how to install and use SD on a local machine.
Doing a workshop was considered, but the problem is that SD requires a beefy graphics card, which is hard to provide for a room full of people. If somebody has a gaming notebook with a 3000 or 4000 series GPU, they're free to participate.
Outline:
- Basics/Introduction
- Installation
- Text-to-Image generation
- Embeddings
- Upscaling
- Image-to-Image generation
- Inpainting
- ControlNet
Rating
This interactive talk shall be rated PEGI 16.
Non-Technical Aspects
Of course there's a lot to discuss about AI art replacing artists, legal aspects and related issues. These things can be discussed as part of this interactive talk, but anlumo would like to avoid having the whole time taken over by this. Also, he thinks that having direct insight into the techniques used while creating AI art can have a deep impact on opinions, so this could be more of a start to forming an opinion rather than butting heads between differing ones.
Interested?
The whole point of this page is to gauge interest, so if you'd like to participate, please add your name to this list:
- ripper
- ncl
- Qubit23
- eesti
- zentibel
- Sonstwer
- Cerise
- Your name could be here!
If enough people appear on this list, a date and time will be discussed and announced.