Oct 04, 2022 21:00:00

Summary of the specialized model data available for the image generation AI 'Stable Diffusion'

Stable Diffusion, an image generation AI, is a 'latent diffusion model' that generates images by removing noise. Because it was developed as open source and released to the public in August 2022, anyone can retrain it on a different dataset, and many fork models now specialize in generating specific kinds of images. We have summarized the specialized models derived from Stable Diffusion, their characteristics, and examples of their output.

Stable Diffusion Models
https://rentry.org/sdmodels

The results of actually generating images with multiple models and seed values, using the same prompt, number of steps, and CFG scale, have been summarized.

From the left, the models are Stable Diffusion v1.4, Waifu-Diffusion v1.2, Trinart Stable Diffusion, Hentai Diffusion, and Zack3D_Kinky v1. Although the general composition and colors are similar for each seed value, you can see at a glance that the art style differs considerably from model to model.
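A comparison like this can be organized as a grid in which only the model and the seed vary, while the prompt, number of steps, and CFG scale stay fixed. The following is a minimal sketch of how such a grid might be enumerated. The names GenerationJob and build_comparison_grid, and the prompt, step count, CFG scale, and seed values, are hypothetical placeholders, not the settings actually used for the comparison. The sketch only builds the list of jobs and does not call any image generation library.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class GenerationJob:
    """One cell of the comparison grid (hypothetical structure)."""
    model: str
    seed: int
    prompt: str
    steps: int
    cfg_scale: float


# Column order, left to right, as listed in the article.
MODELS = [
    "Stable Diffusion v1.4",
    "Waifu-Diffusion v1.2",
    "Trinart Stable Diffusion",
    "Hentai Diffusion",
    "Zack3D_Kinky v1",
]


def build_comparison_grid(models, seeds, prompt, steps, cfg_scale):
    """Return jobs row by row: one row per seed, one column per model."""
    return [
        GenerationJob(model, seed, prompt, steps, cfg_scale)
        for seed, model in product(seeds, models)
    ]


# Placeholder settings (assumptions, not taken from the article).
grid = build_comparison_grid(
    MODELS, seeds=[1, 2, 3], prompt="a castle on a hill", steps=20, cfg_scale=7.0
)
for job in grid[:5]:
    print(job.seed, job.model)
```

Because every job in a row shares the same seed and settings, differences between the images in that row can be attributed to the model alone.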

◆Waifu-Diffusion
'Waifu-Diffusion' is a fork model of Stable Diffusion trained on 'Danbooru2021', a dataset of more than 4.9 million SFW (safe for work) images posted on the 2D illustration site Danbooru. You can see what images generated with Waifu-Diffusion actually look like by reading the article below.

Summary of how to use model data 'Waifu-Diffusion' specialized for drawing illustrations with image generation AI 'Stable Diffusion' - GIGAZINE

At the time of writing, the following versions of Waifu-Diffusion are available. v1.2 is the official release, and v1.3 is available only as a beta. Note that 'epoch' indicates the number of times training has passed over the entire dataset to adjust the parameters; in general, a higher number of epochs can be expected to give higher accuracy.
・Waifu-Diffusion v1.2
・Waifu-Diffusion v1.3 beta epoch03
・Waifu-Diffusion v1.3 beta epoch04
・Waifu-Diffusion v1.3 beta epoch05
・Waifu-Diffusion v1.3 beta epoch06
・Waifu-Diffusion v1.3 beta epoch07
・Waifu-Diffusion v1.3 beta epoch08
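
To make the meaning of 'epoch' concrete, the following minimal sketch shows the arithmetic that relates epochs to the number of parameter updates (steps). The dataset size and batch size used here are purely illustrative assumptions; the actual training settings of Waifu-Diffusion are not stated in this article.

```python
import math


def steps_per_epoch(dataset_size: int, batch_size: int) -> int:
    """Parameter updates needed to see every image once (one epoch)."""
    return math.ceil(dataset_size / batch_size)


def total_steps(dataset_size: int, batch_size: int, epochs: int) -> int:
    """Parameter updates performed after the given number of epochs."""
    return steps_per_epoch(dataset_size, batch_size) * epochs


# Illustrative numbers only (assumptions): 1,000 images, 4 images per update.
print(steps_per_epoch(1000, 4))        # 250
print(total_steps(1000, 4, epochs=8))  # 2000
```

Under these assumptions, a checkpoint labeled with a later epoch has simply received more parameter updates on the same dataset than one labeled with an earlier epoch.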

◆Trinart Stable Diffusion
Trinart Stable Diffusion is a model from Japan. It is an improved version of a model that specialized 'Torinsama AI', the model from the novel generation AI 'AI Novelist', for 2D illustrations such as anime and manga while retaining the art style of Stable Diffusion v1.4 as much as possible.

naclbit/trinart_stable_diffusion_v2 · Hugging Face
https://huggingface.co/naclbit/trinart_stable_diffusion_v2

Example of an image actually generated by Trinart Stable Diffusion

At the time of writing, three models have been released, differing in the number of training steps (the number of parameter updates performed during training).
・trinart2_step60000.ckpt
・trinart2_step95000.ckpt
・trinart2_step115000.ckpt

◆gg1342_testrun1_pruned.ckpt
The model was trained on 280 NSFW (not safe for work: explicit content that should not be viewed at work) images and 80 SFW images of fictional characters, and is said to be suitable for generating photorealistic, live-action style images.

The model can be downloaded from the link below. Please note that some models, including gg1342_testrun1_pruned.ckpt, are distributed as torrents, so a BitTorrent client is required to download them.
・(BitTorrent required) gg1342_testrun1_pruned.ckpt

◆Hentai Diffusion
Hentai Diffusion is a fork model of Waifu-Diffusion v1.2, trained on 150,000 images uploaded to the 2D image sites Rule34 and Gelbooru. Hentai Diffusion repositories are hosted on Hugging Face and GitHub, but at the time of writing the GitHub repository had been disabled for violating the terms of service.

Deltaadams/Hentai-Diffusion at main
https://huggingface.co/Deltaadams/Hentai-Diffusion/tree/main

GitHub - Delcos/HentaiDiffusion
https://github.com/Delcos/HentaiDiffusion

The following image was generated by Hentai Diffusion.

At the time of writing, two models are available:
・RD1212.ckpt
・RD1412.ckpt

◆Bare Feet / Full Body b4_t16_noadd
Bare Feet / Full Body b4_t16_noadd is a model trained on a carefully selected dataset of images that include bare feet and full nudity. An image actually generated with Bare Feet / Full Body b4_t16_noadd has been posted on the image sharing site Imgur.

Imgur: The magic of the Internet
https://imgur.com/2sJGz3j

The fp16 and fp32 versions of the model are distributed as torrents.

・(BitTorrent required) bf_fb_v3_t4_b16_noadd-ema-pruned-fp16.ckpt
・(BitTorrent required) bf_fb_v3_t4_b16_noadd-ema-pruned-fp32.ckpt
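
Here 'fp16' and 'fp32' refer to 16-bit (half precision) and 32-bit (single precision) floating-point storage of the model's parameters. The following minimal sketch uses only Python's standard struct module to illustrate the trade-off: each fp16 value takes half the bytes of an fp32 value but is stored less precisely. The parameter count is a hypothetical round number, not the size of this model, and real checkpoint files also contain data other than raw parameters.

```python
import struct

FP32 = "<f"  # 32-bit single precision float
FP16 = "<e"  # 16-bit half precision float


def roundtrip(value: float, fmt: str) -> float:
    """Store a value in the given format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def raw_parameter_bytes(num_parameters: int, fmt: str) -> int:
    """Bytes needed for the raw parameters alone in the given format."""
    return num_parameters * struct.calcsize(fmt)


# Hypothetical parameter count, chosen only for illustration.
num_parameters = 1_000_000
print(raw_parameter_bytes(num_parameters, FP32))  # 4000000
print(raw_parameter_bytes(num_parameters, FP16))  # 2000000

# Rounding error introduced by each format for the same weight value.
weight = 0.1
print(abs(roundtrip(weight, FP32) - weight))
print(abs(roundtrip(weight, FP16) - weight))
```

The fp16 file is therefore the smaller download, while the fp32 file keeps the parameters at higher precision.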

The developer said, 'I noticed that regular SD and other models have many problems with drawing limbs. In particular, bare feet are probably the biggest weakness, so I created a dataset and checkpoints that focus on them. Bare Feet / Full Body b4_t16_noadd seems to draw bare feet better than other models, and as a bonus it also draws genitals fairly well. It seems to work as a general NSFW image generation model as long as you lower the "stylize" setting that emphasizes originality. It is not 100% consistent, but the number of usable images out of 100 has gone from 0 to 1 to about 5. This model is still a work in progress, but I intend to complete it. Bare Feet / Full Body b4_t16_noadd also suffered a huge amount of catastrophic forgetting, and because it has become too specialized, it has lost a lot of versatility and stylization.'

◆Lewd Diffusion
'Lewd' means 'obscene'. Lewd Diffusion uses the same Danbooru2021 dataset as Waifu-Diffusion, but the images used for Lewd Diffusion include NSFW images. In other words, it is a model specialized for generating 2D NSFW images.

There are three types of models: Lewd Diffusion v0 is trained with 20,000 images selected from the dataset, and Lewd Diffusion 70k 1e and Lewd Diffusion 70k 2e are trained with 70,000 images. Note that '1e' and '2e' each indicate the number of epochs.
・(BitTorrent required) Lewd Diffusion v0
・(BitTorrent required) Lewd Diffusion 70k 1e
・(BitTorrent required) Lewd Diffusion 70k 2e

◆Yiffy
Yiffy is a model developed mainly on the Discord channels of the overseas furry community. It is trained on a dataset of up to 70,000 images posted on e621, an image board for furry (kemono) fans. Yiffy has three models, which differ in the number of epochs.
・yiffy-e13.ckpt
・yiffy-e15.ckpt
・yiffy-e18.ckpt

Images actually generated with Yiffy are posted on Twitter.

It's Friday night and I want to try Furry Diffusion.

First yiffy-e18.ckpt

Nuuuuu. pic.twitter.com/VeRY18loto
— Kyotaro Shibata (@sofia_2020_sen) September 30, 2022

yiffy epoch15 output trial. The training data seems to come from e621 pic.twitter.com/pfatchpALk
— Muramasa | MuramasA (@MuramasA__JP) September 29, 2022

◆Furry
Like Yiffy, Furry is an image generation model specialized for furry art, and has been trained on a dataset of 300,000 images posted on e621. Two models are available, which differ in the number of epochs.
・Furry_epoch1.ckpt
・Furry_epoch4.ckpt

The image generated with Furry looks like this.

Then furry_epoch4.ckpt

Nuuuuuuu.
pic.twitter.com/vnaAnH4TJF
— Kyotaro Shibata (@sofia_2020_sen) September 30, 2022

Notice to those who are creating animal characters with Stable Diffusion: finally, a model specialized for animal characters (kemono/furries) has been created!
It was almost impossible with conventional SD and WaifuDiffusion, but with furry_epoch4.ckpt, you can now get this kind of picture.
(With a little more effort, you might be able to create a loose and cute character!) pic.twitter.com/mF7GwtReBF
— Pajoca⁰Nya! (@Pajoca_) September 27, 2022

◆Zack3D_Kinky-v1.ckpt
Zack3D_Kinky-v1.ckpt is also a model trained using over 100,000 images uploaded to e621. The image actually generated with Zack3D_Kinky-v1.ckpt is below.

Note that the Zack3D_Kinky-v1.ckpt dataset includes NSFW images, so the model can also generate NSFW images aimed at furry fans. It is reportedly possible to generate furry images that cater to a wide variety of sexual preferences, such as 'transformation', 'latex', 'tentacles', 'ferals', and 'bondage'. The model can be downloaded below.

Zack3D_Kinky-v1.ckpt ~ pixeldrain

https://pixeldrain.com/u/DEocAHsx

◆r34_150k_epoch0.ckpt
A model trained using 150,000 NSFW images uploaded to Rule34.
・r34_150k_epoch0.ckpt

◆pony-diffusion
A model trained on a dataset consisting of SFW images of 'My Little Pony', an overseas animated series popular with girls. It therefore specializes in generating images of characters that appear in My Little Pony.

Below is an example of an image generated with pony-diffusion.

◆mio-wd-v1.2-e24-ex-ad
A Waifu-Diffusion fork model trained for 24 epochs on approximately 500 images of Mio Naganohara, a character from the manga 'Nichijou', which was also adapted into a TV anime. It is a fairly small model, specialized in generating images of Mio Naganohara.

mio-wd-v1.2-e24-ex-ad can be downloaded from below as 'epoch=000023-pruned.ckpt'.

chavinlo/mio-naganohara-waifu-diffusion at main

https://huggingface.co/chavinlo/mio-naganohara-waifu-diffusion/tree/main

◆fubuki-ld-v1-e13-ex-ad
A medium-sized model trained for 24 epochs on approximately 5,000 images of Shirakami Fubuki, a member of the popular VTuber group Hololive.
・(BitTorrent required) fubuki-ld-v1-e13-ex-ad

◆asuka-ld-v1-e4-ex-ad
A pruned model trained for 4 epochs on 17,000 images of Asuka, a character from the popular anime 'Neon Genesis Evangelion'.
・(BitTorrent required) asuka-ld-v1-e4-ex-ad

◆tomoko-kuroki-ld-v1-e20-ex-ad
A pruned model trained using 20 images and epochs of Tomoko Kuroki, the main character of the popular manga 'No Matter How I Look at It, It's You Guys' Fault I'm Not Popular!'.
・(BitTorrent required) tomoko-kuroki-ld-v1-e20-ex-ad

◆70gg30LD70k.ckpt
A model made by merging gg1342_testrun1_pruned.ckpt and Lewd Diffusion 70k 1e at a ratio of 70:30. It is not provided as a separately trained model, and you can create it by merging the two models yourself, but it is also distributed as a torrent.
・(BitTorrent required) 70gg30LD70k.ckpt
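
Merging two models at a ratio such as 70:30 is commonly done by taking a weighted average of the corresponding parameters in the two checkpoints. The following is a minimal sketch of that idea, assuming this is how the merge was performed, which the article does not confirm. Real .ckpt files hold tensors and must be loaded with a deep learning library; here, plain Python dictionaries of lists stand in for the parameters, and the function name merge_checkpoints and the parameter name 'layer.weight' are hypothetical.

```python
def merge_checkpoints(weights_a, weights_b, ratio_a=0.7):
    """Weighted average of two parameter sets: ratio_a * A + (1 - ratio_a) * B."""
    if not 0.0 <= ratio_a <= 1.0:
        raise ValueError("ratio_a must be between 0 and 1")
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models must have the same parameter names")

    ratio_b = 1.0 - ratio_a
    merged = {}
    for name in weights_a:
        a, b = weights_a[name], weights_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [ratio_a * x + ratio_b * y for x, y in zip(a, b)]
    return merged


# Toy stand-ins for the two models (not real weights).
model_gg = {"layer.weight": [1.0, 0.0, 2.0]}
model_ld = {"layer.weight": [0.0, 1.0, 2.0]}

merged = merge_checkpoints(model_gg, model_ld, ratio_a=0.7)
print(merged["layer.weight"])
```

A weighted average of this kind requires both models to share the same architecture and parameter names, which holds for forks of the same Stable Diffusion version.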

◆wd1-2_sd1-4_merged.ckpt
It is said to be a model that combines Waifu-Diffusion v1.2 and Stable Diffusion v1.4, but the ratio is unknown.
・(BitTorrent required) wd1-2_sd1-4_merged.ckpt

◆Hiten girl_anime_8k_wallpaper_4k.ckpt
Hiten girl_anime_8k_wallpaper_4k.ckpt is an image generation model specialized in reproducing the art style of Taiwanese illustrator Hiten with high accuracy, trained for 4,000 steps on a total of 40 of Hiten's images. The model can be downloaded from Hugging Face.

BumblingOrange/Hiten · Hugging Face
https://huggingface.co/BumblingOrange/Hiten

Below is an image generated by Hiten girl_anime_8k_wallpaper_4k.ckpt.