SD.Next Release 2024-10 #3506
vladmandic announced in Announcements
A month later and with nearly 300 commits, here is the latest SD.Next update!
Workflow highlights
reprocess: reprocess at higher quality for select images only, or generate without hires/refine and then reprocess with hires/refine
history: you can pick any previous latent from the auto-captured history!
model analyzer: see all details of your currently loaded model, including components, parameter count, layer count, etc.
extract lora: load any LoRA(s) and generate as usual, and once you like the results simply extract the combined LoRA for future use!
New models: Stable Diffusion 3.5 Large, CogView 3 Plus, Meissonic, OmniGen
New integrations: Ctrl+X, APG: Adaptive Projected Guidance, LinFusion, SageAttention
What else?
Supported quantization engines include: BitsAndBytes, TorchAO, Optimum.quanto, NNCF, GGUF
Oh, and we've compiled a full table listing the top-30 (how many have you tried?) popular text-to-image generative models, their respective parameters, and an architecture overview: Models Overview
And there are also other goodies like multiple XYZ grid improvements, additional Flux ControlNets, additional Interrogate models, better LoRA tags support, and more...
README | CHANGELOG | WiKi | Discord
Details
reprocess
note: you can change hires/refine settings and run reprocess again!
history
each history item includes info on the operations used, a timestamp, and metadata
e.g. generate base + upscale + hires + detailer
memory usage is ~130kb of ram per 1mp image
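That figure checks out with a back-of-the-envelope sketch, assuming the latent is stored as a fp16 4-channel tensor at 1/8 spatial resolution (standard for sd-style vaes):

```python
# rough check of the ~130kb figure, assuming a fp16 4-channel latent
# at 1/8 spatial resolution
width, height = 1024, 1024                                  # ~1mp image
lw, lh, channels, bytes_per_value = width // 8, height // 8, 4, 2
print(lw * lh * channels * bytes_per_value / 1024)          # 128.0 kib, consistent with ~130kb
```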
model analyzer
text encoder:
will automatically find the appropriate encoder in the loaded model and replace it with the loaded text encoder (see the sketch below)
download text encoders into the folder set in settings -> system paths -> text encoders
default `models/Text-encoder` folder is used if no custom path is set
finetuned clip-vit-l models: Detailed, Smooth, LongCLIP
reference clip-vit-l and clip-vit-g models: OpenCLIP-Laion2b
note: sd/sdxl contain heavily distilled versions of the reference models, so switching to a reference model produces vastly different results
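For context, a minimal diffusers-level sketch of what such a swap amounts to; SD.Next automates the matching and replacement, and the model ids below are illustrative assumptions:

```python
import torch
from transformers import CLIPTextModel
from diffusers import StableDiffusionPipeline

# load a sd15-class pipeline (model id is an illustrative assumption)
pipe = StableDiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
)
# swap the distilled text encoder for a reference clip-vit-l model
pipe.text_encoder = CLIPTextModel.from_pretrained(
    "openai/clip-vit-large-patch14", torch_dtype=torch.float16
).to(pipe.device)
```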
detailer:
set path in settings -> system paths -> yolo
configurable options: sampler, steps, prompts, strength, max detected objects, edge padding, edge blur, min detection confidence, max detection overlap, min and max size of detected object
to apply your defaults, set ui values and apply via system -> settings -> apply settings
e.g. the original yolo detection model is trained on the coco dataset with 80 predefined classes
if you leave the class field blank, it will use any class found in the model
you can see the classes defined in the model when the model is first loaded (see the sketch below)
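A minimal sketch, assuming the `ultralytics` package and an illustrative model filename, of how the classes baked into a detection model can be inspected:

```python
from ultralytics import YOLO

# any detection model placed in the configured yolo folder would do
model = YOLO("yolov8n.pt")
print(model.names)  # e.g. {0: 'person', 1: 'bicycle', ...} for coco-trained models
```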
extract lora: extract combined lora from current memory state, thanks @AI-Casanova
load any LoRA(s) and generate as usual, and once you like the results, simply extract the combined LoRA for future use!
available in models -> extract lora
sampler options: full rewrite
sampler notes:
e.g. the karras checkbox is replaced with a choice of different sigma methods (see the sketch below)
e.g. sd15/sdxl typically use epsilon prediction
to apply your defaults, set ui values and apply via system -> settings -> apply settings
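A minimal diffusers-level sketch of the knobs these options map onto; the scheduler class and model id are illustrative assumptions:

```python
from diffusers import EulerDiscreteScheduler

# sigma method and prediction type are standard diffusers scheduler options
scheduler = EulerDiscreteScheduler.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    subfolder="scheduler",
    use_karras_sigmas=True,     # one of several selectable sigma methods
    prediction_type="epsilon",  # sd15/sdxl typically use epsilon prediction
)
```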
Ctrl+X:
control image structure and appearance without the need for extra models, all via code feed-forwards!
just describe what it is in the structure prompt so it can be de-structured and correctly applied
APG: Adaptive Projected Guidance
LinFusion
Flux
gguf binary format for loading unet/transformer component
gguf binary format for loading t5/text-encoder component: requires a `transformers` pr
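A minimal diffusers-level sketch of gguf loading for the transformer component; SD.Next wires this up via its own model loader, and the checkpoint url here is an illustrative assumption:

```python
import torch
from diffusers import FluxTransformer2DModel, GGUFQuantizationConfig

# load a pre-quantized gguf flux transformer directly from a single file
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q4_K_S.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
```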
xhinker prompt parser for flux models
OmniGen
add `|image|` placeholder where the input image is used!
examples: `in |image| remove glasses from face`, `using depth map from |image|, create new image of a cute robot`
Recommended: guidance=3.0, refine-guidance=1.6
Stable Diffusion 3.5 Large
CogView 3 Plus
fp16 is not supported due to internal model overflows
Meissonic
SageAttention
gpu
`cuda_dtype` in settings previously defaulted to `fp16` if available
`cuda_dtype` now defaults to Auto, which executes `bf16` and `fp16` tests on startup and selects the best available dtype (a rough sketch of this kind of probing follows below)
if you have specific requirements, you can still set it to fp32/fp16/bf16 as desired
if you have a gpu that incorrectly identifies bf16 or fp16 availability, let us know so we can improve the auto-detection
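For illustration only, a rough sketch of the kind of startup probing described, not SD.Next's actual test:

```python
import torch

def detect_cuda_dtype() -> torch.dtype:
    # try bf16 first, then fp16; fall back to fp32 if neither computes cleanly
    for dtype in (torch.bfloat16, torch.float16):
        try:
            x = torch.ones(8, device="cuda", dtype=dtype)
            if torch.isfinite(x @ x).all():
                return dtype
        except Exception:
            continue
    return torch.float32
```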
enable in settings -> compute -> torch expandable segments
can provide significant memory savings for some models
not enabled by default as it's only supported on the latest versions of torch and some gpus
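This setting presumably maps onto pytorch's standard allocator option, which can also be set manually via an environment variable (an assumption about the mechanism, not SD.Next's exact code):

```python
import os

# must be set before torch initializes cuda for the allocator to pick it up
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
```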
xyz grid full refactor
allowed params will be checked against the model's call signature
example:
width=768; height=512, width=512; height=768
params are set directly on the main processing object and can be known or new params
example:
steps=10, steps=20; test=unknown
now you can adjust width/height in the grid just like any other param
interrogate
lora auto-apply tags to prompt: 0:disable, -1:all-tags, n:top-n-tags
if the prompt contains `_tags_`, it will be used as a placeholder for replacement, otherwise tags will be appended (see the toy sketch below)
`extra_networks_default_multiplier` is used if no scale is specified
`lora_load_gpu` to load LoRA directly to GPU; default: true unless lowvram
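A toy sketch of the `_tags_` placeholder behaviour described above; illustrative logic only, not SD.Next's actual implementation:

```python
def apply_tags(prompt: str, tags: list[str], n: int) -> str:
    # n follows the setting: 0 disables, -1 uses all tags, n uses top-n tags
    selected = "" if n == 0 else ", ".join(tags if n == -1 else tags[:n])
    if "_tags_" in prompt:
        return prompt.replace("_tags_", selected)  # placeholder replacement
    return f"{prompt}, {selected}" if selected else prompt  # otherwise append
```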
quantization
configure in settings -> quantization
in addition to `optimum.quanto` and `nncf`, we now have `bitsandbytes` and `torchao`
can quantize `transformers` and `t5` in sd3 and flux
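A minimal diffusers-level sketch of on-the-fly bitsandbytes quantization of a transformer component; the model id and quantization parameters are illustrative assumptions:

```python
import torch
from diffusers import BitsAndBytesConfig, SD3Transformer2DModel

# quantize the transformer to 4-bit nf4 while loading
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = SD3Transformer2DModel.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
```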
huggingface: set the cache folder via `HF_HUB` or `HF_HUB_CACHE` environment variables, or via settings -> system paths
cogvideox:
torch
`backend=original` is now in maintenance-only mode
python 3.12: improved compatibility, automatically handles `setuptools`
control
video: add option `gradio_skip_video` to avoid gradio issues with displaying generated videos
add support for manually downloaded diffusers models from huggingface
ui
moved full quality, tiling, hidiffusion to the advanced section
free-u: check if device/dtype are fft-compatible and cast as necessary
rocm
directml
updated `torch` to 2.4.1, thanks @lshqqytiger
extensions
`sd-webui-controlnet` and `adetailer` set to last-known working commits
upscaling
refactor