Stable Diffusion

Event info:
- When: TBD
- Where: Metalab
- Involved: anlumo
- Type: Vortrag (talk)
- Status: planning
- Topic: Open Source AI Image Generation
- Image: Stable-diffusion-metalab.png
- Language: English
- Last updated: 4 April 2023
Introduction
Stable Diffusion (SD) is a new open source project for generating and manipulating images based on text prompts locally on the user's computer (with a powerful GPU). Even the training itself can be done with open source tools (and a lot of compute power).
anlumo has spent a significant amount of time learning the Ways of the Prompt Artist™ and wonders if anybody else would be interested in learning about this completely new field.
What Can Stable Diffusion Do?
Stable Diffusion is, broadly speaking, a collection of tools for AI art generation. Among its features are the following capabilities (a minimal code sketch follows the list):
- Text-to-image: Enter a text prompt (positive and negative) and generate a low-res image from it.
- Image-to-image: Take an image as input and modify it based on a text prompt. This can be used for style transfer, for example, or to reuse the composition of another image for a new creation. (example: https://www.reddit.com/r/StableDiffusion/comments/1196vyi)
- Inpainting: Same as image-to-image, but only modifies part of the image. This can be used to add or remove details in images, for example. (example: https://www.reddit.com/r/StableDiffusion/comments/11gbijd)
- Outpainting: Same as image-to-image, but extends an existing image instead. For example, if you have an image of the upper half of a person, you can add the lower half or more of the environment (based on the text prompt).
- ControlNet: Applicable to any of the above. Take a reference image, extract some property of it, like the pose of a person or a depth map, and nudge the AI to generate one of the above outputs with this extra information (example: https://www.reddit.com/r/StableDiffusion/comments/11fn96y). This can also be used in text-to-image to convert a pencil sketch to a photorealistic image, for example (example: https://www.reddit.com/r/StableDiffusion/comments/11h0m9v).
- Upscaling of images: This can increase the resolution of an image by adding details that weren't in the original image (like individual strands of hair). Usually this is used to bring the low-resolution output of the techniques above up to usable resolutions.
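To make the first item concrete, here is a minimal text-to-image sketch using the Hugging Face diffusers library. This is not from the talk itself; the checkpoint name, prompts, and parameters are illustrative assumptions, and the local web UIs wrap the same steps behind a graphical interface.

```python
# Minimal text-to-image sketch with Hugging Face diffusers (illustrative only).
# Assumes: pip install diffusers transformers accelerate torch, and a CUDA GPU.
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint; other Stable Diffusion 1.x checkpoints work the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision roughly halves VRAM usage
).to("cuda")

result = pipe(
    prompt="a watercolor painting of a hackerspace full of tinkerers",
    negative_prompt="blurry, low quality",  # what the image should NOT contain
    num_inference_steps=30,  # more steps: more detail, but slower
    guidance_scale=7.5,      # how strictly to follow the prompt
)
result.images[0].save("output.png")  # low-res output; see upscaling above
```

The other modes follow the same pattern with different pipeline classes: StableDiffusionImg2ImgPipeline and StableDiffusionInpaintPipeline additionally take an input image (plus a mask for inpainting), StableDiffusionControlNetPipeline takes the extracted control image, and StableDiffusionUpscalePipeline covers the final upscaling step.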
There have been some recent attempts at applying these capabilities to video as well. A YouTube Short demonstrating this: https://www.youtube.com/shorts/iZZdogrTBVE (full video of the end result: https://www.youtube.com/watch?v=GVT3WUa-48Y).
SD is thus much more capable than commercial offerings like Midjourney. However, it also has far more knobs to adjust and settings to optimize. Hence the idea for this talk.
Plan
anlumo's plan would be to hold an interactive talk showing off how to install and use SD on a local machine.
Doing a workshop was considered, but the problem is that SD requires a beefy graphics card, which is hard to provide for a room full of people. If somebody has a gaming notebook with a 3000- or 4000-series GPU, they're free to participate (a quick way to check your hardware is sketched below).
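If you're unsure whether your notebook qualifies, a quick check with PyTorch could look like this (a sketch; the VRAM threshold that counts as "beefy" is an assumption and depends on model and resolution):

```python
# Quick sanity check of the local GPU before installing Stable Diffusion.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    vram_gib = props.total_memory / 2**30
    print(f"GPU: {props.name}, VRAM: {vram_gib:.1f} GiB")
    # Rough rule of thumb (assumption): ~4 GiB is the practical minimum,
    # 8+ GiB is comfortable for 512x512 generation.
    print("Looks usable." if vram_gib >= 4 else "Probably too little VRAM.")
else:
    print("No CUDA GPU found - generation on CPU is painfully slow.")
```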
Outline:
- Basics/Introduction
- Installation
- Text-to-Image generation
- Embeddings
- Upscaling
- Image-to-Image generation
- Inpainting
- ControlNet
Rating
This interactive talk shall be rated PEGI 16.
Non-Technical Aspects
Of course there's a lot to discuss about AI art replacing artists, legal aspects, and related issues. These things can be discussed as part of this interactive talk, but anlumo would like to avoid having the whole time taken over by them. He also thinks that direct insight into the techniques used while creating AI art can have a deep impact on opinions, so this could be more of a starting point for forming an opinion than a clash between already-formed ones.
Interested?
The whole point of this page is to gauge interest, so if you'd like to participate, please add your name to this list:
- min
- ripper
- ncl
- Qubit23
- eesti
- zentibel
- Sonstwer
- Cerise
- Your name could be here!

If enough people appear on this list, a date and time will be discussed/announced.