Backlog

GitHub - vipermu/StyleCLIP: Using CLIP and StyleGAN to generate faces from prompts.

For faces only, not arbitrary images.

GitHub - lucidrains/imagen-pytorch: Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

GitHub - borisdayma/dalle-mini: DALL·E Mini - Generate images from a text prompt

WIP

GitHub - rkhamilton/vqgan-clip-generator: Implements VQGAN+CLIP for image and video generation, and style transfers, based on text and image prompts. Emphasis on ease-of-use, documentation, and smooth video creation.

Easy to set up, and comes with some very handy example scripts. Also includes integrated support for upscaling and video. Can run from a local notebook or the CLI. Favorite project so far.

GitHub - NotNANtoN/diffusion_gen

In theory, this should run disco-diffusion locally at decent resolution levels. In practice, it has been stupidly hard to get running on Windows so far. May keep trying, since it should make diffusion possible at decent resolution with 12GB RAM!

GitHub - pixray/pixray

Features multiple generators (including diffusion plus others I’ve not played with at all), and can run in Docker or in a slightly higher-level container runtime wrapper called Cog. Not much luck with this one yet. Cog is known to lack Windows compatibility (see issue #352).

Done

GitHub - lucidrains/big-sleep: A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN. Technique was originally created by https://twitter.com/advadnoun

This was a very early project, and the results are subpar. Got it working without much hassle, but the low output quality makes it not worth pursuing further.

GitHub - nerdyrodent/CLIP-Guided-Diffusion: Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.

Failed to get it working.

GitHub - nerdyrodent/VQGAN-CLIP: Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Works great locally; one of the first I was able to get to that point, so I spent a lot of time with it before moving on. Wrote a PowerShell wrapper script for queuing background jobs, which worked amazingly well. Be sure to keep it single-threaded: running jobs in parallel either exhausts VRAM quickly or cuts throughput drastically, only for the run to randomly crash OOM minutes or hours into your job.
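The idea behind that wrapper is simple: run each generation job to completion before starting the next, so only one process ever holds the GPU. A minimal Python sketch of the same pattern (the `generate.py` entry point and `-p` flag here are stand-ins, not the repo’s actual interface):

```python
import subprocess

def run_jobs(prompts, dry_run=False):
    """Run one generation job per prompt, strictly sequentially,
    so a single process owns all GPU VRAM for its whole run."""
    commands = [
        # "generate.py" / "-p" are hypothetical placeholders for the
        # real script and prompt flag of whatever generator you use.
        ["python", "generate.py", "-p", prompt]
        for prompt in prompts
    ]
    if dry_run:
        return commands  # just show what would run, in order
    for cmd in commands:
        # subprocess.run blocks until the job exits, which is exactly
        # what keeps the queue single-threaded.
        subprocess.run(cmd, check=True)
    return commands
```

Kicking off a long overnight queue is then just `run_jobs(list_of_prompts)`; each prompt waits its turn instead of fighting the others for VRAM.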

GitHub - tnwei/vqgan-clip-app: Local image generation using VQGAN-CLIP or CLIP guided diffusion

Useful as a cool example of Streamlit, a web framework I’d never heard of. Neat web UI for VQGAN. (If it had the ability to queue up jobs, I would use it.) Unsatisfying for the disco-diffusion model due to VRAM size limits, as noted here: https://github.com/tnwei/vqgan-clip-app/blob/main/docs/notes-and-observations.md#gpu-vram-consumption. The code was very easy to work with.

Softology - Visions of Chaos

I wanted to like this software, but I simply could not get past the install. The author somehow learned programming without also learning how to use computers efficiently. Clearly self-taught, which can be perfectly fine (I did not go far in secondary education myself), but... just no. I know he has something powerful, but I can’t get over the choices he’s made in writing the software. It’s like every pet peeve I have, wrapped up in one package.

GitHub - xinntao/Real-ESRGAN: Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

This one is known to work when run locally, but it isn’t a text-to-image tool. It’s only for upscaling and image restoration, and is supposed to be pretty darn cool for that. Update: the Python version works great, and they make it even easier with a native Win32 binary. Note that the vqgan-clip-generator project has a zoom feature using Real-ESRGAN.