Top Related Projects
- stable-diffusion-webui: Stable Diffusion web UI
- stablediffusion: High-Resolution Image Synthesis with Latent Diffusion Models
- diffusers: 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX
- InvokeAI: Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. It offers an industry-leading WebUI and serves as the foundation for multiple commercial products.
Quick Overview
Stablediffusion-infinity is an open-source project that extends the capabilities of Stable Diffusion, a popular text-to-image generation model. It aims to provide infinite outpainting functionality, allowing users to expand generated images in any direction without limitations. This project enhances the creative possibilities for artists and designers working with AI-generated imagery.
Pros
- Enables unlimited expansion of AI-generated images in any direction
- Maintains consistency and coherence across expanded areas
- Integrates seamlessly with existing Stable Diffusion workflows
- Provides a user-friendly interface for easy image manipulation
Cons
- May require significant computational resources for large-scale expansions
- Potential for inconsistencies in complex scenes or highly detailed images
- Limited documentation and community support compared to mainstream Stable Diffusion
- Possible learning curve for users unfamiliar with advanced image generation techniques
Code Examples
Note: the actual project ships as a PyScript/Gradio web app (see the README below) rather than an installable Python package; the snippets that follow sketch a hypothetical `sd_infinity` API purely to illustrate the outpainting workflow.
```python
# Initialize the stablediffusion-infinity model
from sd_infinity import SDInfinity

model = SDInfinity.from_pretrained("path/to/model")

# Generate an initial image
initial_image = model.generate("A serene landscape with mountains")

# Expand the image to the right
expanded_image = model.expand(initial_image, direction="right", steps=5)
```
```python
# Expand an image in multiple directions
from sd_infinity import SDInfinity, ExpansionDirection

model = SDInfinity.from_pretrained("path/to/model")
image = model.generate("A futuristic cityscape")
expanded_image = model.expand_multi(image, [
    (ExpansionDirection.UP, 3),
    (ExpansionDirection.RIGHT, 2),
    (ExpansionDirection.DOWN, 1),
])
```
```python
# Use custom prompts for different expansion directions
from sd_infinity import SDInfinity, ExpansionDirection

model = SDInfinity.from_pretrained("path/to/model")
image = model.generate("A cozy living room")
expanded_image = model.expand_with_prompts(image, [
    (ExpansionDirection.LEFT, "A hallway with paintings", 2),
    (ExpansionDirection.RIGHT, "Large windows with a garden view", 3),
])
```
Getting Started
To get started with stablediffusion-infinity (again using the illustrative `sd_infinity` API sketched above), follow these steps:

1. Install the library:

```bash
pip install stablediffusion-infinity
```

2. Import and initialize the model:

```python
from sd_infinity import SDInfinity

model = SDInfinity.from_pretrained("path/to/model")
```

3. Generate an initial image and expand it:

```python
initial_image = model.generate("Your prompt here")
expanded_image = model.expand(initial_image, direction="right", steps=3)
expanded_image.save("expanded_image.png")
```
For more advanced usage and customization options, refer to the project's documentation and examples in the GitHub repository.
Competitor Comparisons
Stable Diffusion web UI
Pros of stable-diffusion-webui
- More extensive features and options for image generation and manipulation
- Larger community and more frequent updates
- Better support for custom models and extensions
Cons of stable-diffusion-webui
- Higher system requirements and potentially slower performance
- Steeper learning curve due to numerous options and settings
Code Comparison
stable-diffusion-webui:

```python
def create_infotext(p, all_prompts, all_seeds, all_subseeds, comments=None, iteration=0, position_in_batch=0):
    index = position_in_batch + iteration * p.batch_size
    clip_skip = getattr(p, 'clip_skip', opts.CLIP_stop_at_last_layers)
    enable_hr = getattr(p, 'enable_hr', False)
    denoising_strength = getattr(p, 'denoising_strength', None)
```
stablediffusion-infinity:

```python
def get_model_path(model_name):
    model_path = os.path.join(models_path, model_name)
    if os.path.exists(model_path):
        return model_path
    return None
```
The code snippets show different functionalities: stable-diffusion-webui focuses on creating detailed information text for generated images, while stablediffusion-infinity deals with model path retrieval. This reflects the more comprehensive approach of stable-diffusion-webui compared to the simpler implementation of stablediffusion-infinity.
High-Resolution Image Synthesis with Latent Diffusion Models
Pros of stablediffusion
- Official implementation from Stability AI, ensuring authenticity and direct updates
- Comprehensive documentation and extensive community support
- Broader range of features and functionalities for advanced users
Cons of stablediffusion
- Potentially more complex setup and configuration process
- May require more computational resources due to its comprehensive nature
Code Comparison
stablediffusion:

```python
from ldm.util import instantiate_from_config
from ldm.models.diffusion.ddim import DDIMSampler

model = load_model_from_config(config, f"{opt.ckpt}")
sampler = DDIMSampler(model)
```
stablediffusion-infinity:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```
The stablediffusion repository uses a more low-level approach, allowing for greater customization but requiring more setup. In contrast, stablediffusion-infinity utilizes the Diffusers library, providing a more streamlined and user-friendly interface for quick implementation.
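For reference, the streamlined diffusers path looks roughly like this end to end. This is a minimal sketch: the model id and prompt are illustrative, but the API calls are diffusers' public ones.

```python
# Minimal text-to-image sketch with diffusers; model id and prompt are
# illustrative, the API is diffusers' public one.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("a photograph of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```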
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Pros of diffusers
- Comprehensive library with support for multiple diffusion models and techniques
- Extensive documentation and community support
- Regular updates and maintenance from Hugging Face team
Cons of diffusers
- Steeper learning curve for beginners
- May require more setup and configuration for specific use cases
Code Comparison
stablediffusion-infinity:

```python
model = create_model('./models/ldm/stable-diffusion-v1/model.ckpt')
sampler = DDIMSampler(model)
```
diffusers:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```
stablediffusion-infinity focuses on a specific implementation of Stable Diffusion, while diffusers provides a more flexible and extensible framework for various diffusion models. diffusers offers a higher-level API, making it easier to use different models and techniques, but may require more understanding of the underlying concepts. stablediffusion-infinity might be more straightforward for users specifically interested in Stable Diffusion, but lacks the broader functionality and community support of diffusers.
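Since stablediffusion-infinity is powered by the Stable Diffusion inpainting model (see the README below), its outpainting can be approximated with diffusers' public inpainting pipeline. The sketch below is illustrative: the pipeline class and model id are real, but the shift-and-mask logic is not the project's actual implementation.

```python
# A minimal outpainting-style sketch on top of diffusers' inpainting pipeline.
# Illustrative only; not stablediffusion-infinity's actual implementation.
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
).to("cuda")

init = Image.open("landscape.png").convert("RGB").resize((512, 512))

# Shift the image left so the right half of the canvas is empty, then mask
# the empty half (white = region for the model to fill).
canvas = Image.new("RGB", (512, 512))
canvas.paste(init.crop((256, 0, 512, 512)), (0, 0))
mask = Image.new("L", (512, 512), 0)
mask.paste(255, (256, 0, 512, 512))

result = pipe(
    prompt="the same scene, continued to the right",
    image=canvas,
    mask_image=mask,
).images[0]
result.save("outpainted.png")
```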
Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
Pros of InvokeAI
- More comprehensive and feature-rich, offering a wide range of AI-powered image generation and manipulation tools
- Active development with frequent updates and a larger community of contributors
- Provides a user-friendly web interface alongside command-line options
Cons of InvokeAI
- Potentially more complex to set up and use for beginners due to its extensive feature set
- May require more computational resources to run smoothly compared to stablediffusion-infinity
Code Comparison
InvokeAI:

```python
from invokeai.app.invocations.baseinvocation import BaseInvocation, InvocationContext, InvocationError

class MyCustomInvocation(BaseInvocation):
    def invoke(self, context: InvocationContext):
        # Custom invocation logic here
        ...
```
stablediffusion-infinity:

```python
from modules import scripts

class Script(scripts.Script):
    def run(self, p, *args):
        # Custom script logic here
        ...
```
Both projects allow for custom extensions, but InvokeAI uses a more structured approach with its invocation system, while stablediffusion-infinity relies on a simpler script-based extension mechanism.
README
stablediffusion-infinity
Outpainting with Stable Diffusion on an infinite canvas.
https://user-images.githubusercontent.com/1665437/197244111-51884b3b-dffe-4dcf-a82a-fa5117c79934.mp4
Status
Powered by the Stable Diffusion inpainting model, this project now works well. However, result quality is still not guaranteed: you may need to do prompt engineering, change the size of the selection, or reduce the size of the outpainting region to get better outpainting results.
The project is now a web app based on PyScript and Gradio. For the Jupyter Notebook version, please check out the ipycanvas branch.
Pull requests are welcome for better UI control, ideas to achieve better results, or any other improvements.
Update: the project adds photometric correction to suppress seams. To use this feature, you need to install fpie (Linux/macOS only): `pip install fpie`
Docs
Get Started
- Setup for Windows: setup_guide
- Setup for Linux: setup_guide
- Setup for MacOS: setup_guide
- Running with Docker on Windows or Linux with NVIDIA GPU: run_with_docker
- Usage: usage
FAQs

- The result is a black square: the false positive rate of the safety checker is relatively high; you may disable the safety_checker (see the sketch after this list).
- Some GPUs might not work with `fp16`: run `python app.py --fp32 --lowvram`.
- What is `init_mode`? `init_mode` indicates how to fill the empty/masked region; `patch_match` is usually better than the others.
- Why not use `postMessage` for iframe interaction? The iframe and the Gradio app are in the same origin. For a `postMessage` version, check out the gradio-space version.
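For the black-square item above, the snippet below shows a generic diffusers pattern for disabling the safety checker at load time. This is an assumption-laden sketch: it uses diffusers' public API, and is not necessarily the switch stablediffusion-infinity itself exposes.

```python
# Generic diffusers pattern for disabling the safety checker; whether and how
# stablediffusion-infinity exposes this option is an assumption here.
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    safety_checker=None,  # black squares are often safety-checker false positives
)
```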
Known issues

- The canvas is implemented with `NumPy` + `PyScript` (the project was originally implemented with `ipycanvas` inside a Jupyter notebook), which is relatively inefficient compared with pure frontend solutions.
- By design, the canvas is infinite. In practice, however, the canvas size is limited by your RAM and browser; the canvas might crash or behave strangely when zoomed out past a certain scale.
- The canvas requires internet access: you can deploy and serve PyScript, Pyodide, and other JS/CSS assets with a local HTTP server and modify `index.html` accordingly.
- Photometric correction might not work (`taichi` does not support the multithreading environment). A dirty hack (quite unreliable) is implemented to move the related computation inside a subprocess (see the sketch after this list).
- The Stable Diffusion inpainting model is much slower when the selection size is larger than 512x512.
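The subprocess workaround mentioned in the photometric-correction item can be sketched generically: run the thread-sensitive step in a fresh process (which gets its own main thread) and ship the result back over a queue. This is a minimal illustration of the pattern, not the project's actual code.

```python
# Generic "run it in a subprocess" pattern: the worker gets a fresh process,
# results come back over a queue. Illustrative only; not the project's code.
import multiprocessing as mp

def photometric_worker(queue, args):
    # Imagine importing taichi and running the correction here, isolated
    # from the parent process's threads.
    result = sum(args)  # placeholder for the real computation
    queue.put(result)

if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=photometric_worker, args=(q, [1, 2, 3]))
    p.start()
    result = q.get()  # blocks until the worker reports back
    p.join()
    print(result)
```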
Credit

The code of `perlin2d.py` is from https://stackoverflow.com/questions/42147776/producing-2d-perlin-noise-with-numpy/42154921#42154921 and is not included in the scope of the LICENSE used in this repo.

The submodule `glid_3_xl_stable` is based on https://github.com/Jack000/glid-3-xl-stable

The submodule `PyPatchMatch` is based on https://github.com/vacancy/PyPatchMatch

The code of `postprocess.py` and `process.py` is modified from https://github.com/Trinkle23897/Fast-Poisson-Image-Editing

The code of `convert_checkpoint.py` is modified from https://github.com/huggingface/diffusers/blob/main/scripts/convert_original_stable_diffusion_to_diffusers.py

The submodule `sd_grpcserver` and `handleImageAdjustment()` in `utils.py` are based on https://github.com/hafriedlander/stable-diffusion-grpcserver and https://github.com/parlance-zz/g-diffuser-bot

`w2ui.min.js` and `w2ui.min.css` are from https://github.com/vitmalina/w2ui. `fabric.min.js` is a custom build of https://github.com/fabricjs/fabric.js

`interrogate.py` is based on https://github.com/pharmapsychotic/clip-interrogator v1; the submodule `blip_model` is based on https://github.com/salesforce/BLIP