Stable Diffusion Course Series, Part 1: Installation, Prompt Basics, Common Models (checkpoint, embedding, LoRA), Upscaling Algorithms, Inpainting, and Common Extensions

03-11

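Contents

1. Stable Diffusion installation and source code walkthrough

1.1 Installing Stable Diffusion
1.2 webui startup code analysis

1.2.1 Loading webui-user.sh
1.2.2 Running launch.py
1.2.3 Running webui.py and launching the UI
1.2.4 cmd_args
2. Text-to-image (understanding prompts)

2.1 Prompt basics
2.2 Weights
2.3 Negative prompt
2.4 Generation parameters
2.5 Prompt-writing methods for beginners
3. Image-to-image

3.1 Image-to-image basics
3.2 Random seeds explained
3.3 Image-to-image extras
4. Models

4.1 `Checkpoint`

4.1.1 About checkpoints
4.1.2 Checkpoint categories and downloads
4.2 `VAE` (variational autoencoder)
4.3 `embeddings`

4.3.1 Specific characters (with reverse tag inference)
4.3.2 Character turnaround
4.3.3 Fixing bad cases
4.4 `LoRA`

4.4.1 About `LoRA`
4.4.2 Loading through extra networks
4.4.3 Loading with additional networks
4.4.4 LoRA in practice
4.4.5 Inpainting + LoRA
4.5 `hypernetwork`
5. Hires fix and upscaling algorithms

5.1 Hires fix (Hi-Res Fix)
5.2 SD Upscale script (tiled upscaling)
5.3 Upscaling in the Extras tab (AI upscaling)
5.4 Ultimate upscale
5.5 Tiled diffusion

5.5.1 Overview
5.5.2 Upscaler comparison
5.5.3 img2img: regular upscaling (2K)
5.5.4 img2img: ultra-high-resolution upscaling (6K)
5.5.5 Region-based drawing
5.5.6 txt2img upscaling (with ControlNet tile)
6. Inpainting

6.1 Basic inpainting
6.2 Inpainting with a hand-painted mask (inpaint sketch)
6.3 Sketch
6.4 Inpainting with an uploaded mask (inpaint upload)
7. Extensions

7.1 Installing extensions
7.2 Chinese localization pack
7.3 Image browser
7.4 Tag extensions

7.4.1 `tag complete`
7.4.2 `tagger`
7.4.3 `prompt-all-in-one`
7.4.4 one button prompt
7.5 `ultimate upscale`
7.6 Local Latent Couple
7.7 cutoff: precise color control

7.8 oldsix-prompt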

AUTOMATIC1111/stable-diffusion-webui
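Based on Nenly's Bilibili video course 《零基础学会Stable Diffusion》 (Learn Stable Diffusion from Scratch) and its course slides
Recommended sites: stable-diffusion-art, Civitai (requires a proxy), libilibi, AI艺术天堂

Recommended Stable Diffusion resource collections:

NovelAI资源整合, 《AI绘图指南wiki》, AiDraw绘画手册

重绘学派法术绪论1.2, Stable Diffusion 潜工具书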

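1. Stable Diffusion installation and source code walkthrough

1.1 Installing Stable Diffusion

Installation guides for various environments:

AutoDL:

Image AUTOMATIC1111/stable-diffusion-webui/tzwm_sd_webui_A1111: actively maintained by its author and currently at V16 (8.6). You can join the author's group chat to ask questions. This is the one I currently use, and I strongly recommend it.
[Stable Diffusion tutorial] Detailed step-by-step guide to AutoDL cloud deployment
秋叶 (Qiuye): Stable Diffusion all-in-one package v4.2 release
星空 (Xingkong): AI painting all-in-one package (adds ControlNet 1.1 and SadTalker)
kaggle: stable-diffusion-webui-kaggle, zh-stable-diffusion-webui-kaggle
colab: stable-diffusion-webui-colab. Colab no longer allows running Stable Diffusion on the free tier; paid plans still work.

Tencent Cloud (Windows): "Can't run SD for free on Google Colab any more? Here is how to do AI painting cheaply on a cloud server", and "Complete tutorial for deploying Stable Diffusion on a cloud server" with its accompanying video walkthrough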

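1.2 webui startup code analysis

References:

"stable-diffusion-webui source code analysis (1): Gradio"

"stable-diffusion-webui source code analysis (4): startup flow"

1.2.1 Loading webui-user.sh

stable-diffusion-webui is started with bash webui.sh. webui.sh first checks whether the file webui-user.sh exists (the -f test in the if statement). If it does, the script uses the source command to load the user-defined variables from webui-user.sh.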

# Read variables from webui-user.sh
# shellcheck source=/dev/null
if [[ -f webui-user.sh ]]
then
    source ./webui-user.sh
fi

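Loading the webui-user.sh variables
The contents of webui-user.sh are shown below; by default every line is commented out. Users can configure parameters as needed, such as the install location (install_dir), the project folder name (clone_dir), the command-line arguments passed to webui.py (COMMANDLINE_ARGS), the paths to the python and git executables, the Python virtual environment path, the script that launches the app (export LAUNCH_SCRIPT="launch.py"), and the install commands for torch and other dependencies.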

#!/bin/bash
#########################################################
# Uncomment and change the variables below to your need:#
#########################################################
# Install directory without trailing slash
#install_dir="/home/$(whoami)"
# Name of the subdirectory
#clone_dir="stable-diffusion-webui"
# Commandline arguments for webui.py, for example: export COMMANDLINE_ARGS="--medvram --opt-split-attention"
#export COMMANDLINE_ARGS=""
# python3 executable
#python_cmd="python3"
# git executable
#export GIT="git"
# python3 venv without trailing slash (defaults to ${install_dir}/${clone_dir}/venv)
#venv_dir="venv"
# script to launch to start the app
#export LAUNCH_SCRIPT="launch.py"
# install command for torch
#export TORCH_COMMAND="pip install torch==1.12.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113"
# Requirements file to use for stable-diffusion-webui
#export REQS_FILE="requirements_versions.txt"
...

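Setting defaults
Next the script sets some defaults. If the user has not defined the variables above in webui-user.sh (the -z test in the if statement checks whether a variable is empty), the script falls back to its own defaults, which are largely the same as the values shown in webui-user.sh.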

# Set defaults
# Install directory without trailing slash
if [[ -z "${install_dir}" ]]
then
    install_dir="$(pwd)"
fi
# Name of the subdirectory (defaults to stable-diffusion-webui)
if [[ -z "${clone_dir}" ]]
then
    clone_dir="stable-diffusion-webui"
fi
...

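Checking root permission
The script then sets can_run_as_root from its command-line flags. The default is 0, which means the script refuses to run as the root user; passing -f sets it to 1, which allows running as root.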

# read any command line flags to the webui.sh script
while getopts "f" flag > /dev/null 2>&1
do
    case ${flag} in
        f) can_run_as_root=1;;
        *) break;;
    esac
done

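The next two lines set ERROR_REPORTING to FALSE, which disables error reporting (sentry logging), and PIP_IGNORE_INSTALLED to 0, so that pip packages that already exist are not reinstalled.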

# Disable sentry logging
export ERROR_REPORTING=FALSE
# Do not reinstall existing pip packages on Debian/Ubuntu
export PIP_IGNORE_INSTALLED=0

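Checking for git and python
Creating or activating the Python virtual environment
Running launch.py: the script checks whether accelerate is enabled and installed. If not, it runs LAUNCH_SCRIPT (launch.py) directly with Python; if so, it runs the script through accelerate. accelerate is a library published by Hugging Face for running PyTorch code on distributed setups.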

if [[ ! -z "${ACCELERATE}" ]] && [ ${ACCELERATE}="True" ] && [ -x "$(command -v accelerate)" ]; then
    printf "\n%s\n" "${delimiter}"
    printf "Accelerating launch.py..."
    printf "\n%s\n" "${delimiter}"
    prepare_tcmalloc
    accelerate launch --num_cpu_threads_per_process=6 "${LAUNCH_SCRIPT}" "$@"
else
    printf "\n%s\n" "${delimiter}"
    printf "Launching launch.py..."
    printf "\n%s\n" "${delimiter}"
    prepare_tcmalloc
    "${python_cmd}" "${LAUNCH_SCRIPT}" "$@"
fi
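The branch above can be sketched in Python to make the decision explicit. This is an illustration only, not part of webui.sh: the function name choose_launcher and its parameters are hypothetical, and the sketch assumes the intended condition is "ACCELERATE equals True and an accelerate executable is on PATH". Note that the shell test as quoted writes ${ACCELERATE}="True" without spaces, which the shell evaluates as a plain non-empty-string check, so the real script may be more permissive than this sketch.

```python
import shutil


def choose_launcher(env, launch_script="launch.py", python_cmd="python3", which=shutil.which):
    """Return the command list the launch branch would run (hypothetical helper)."""
    # Assumed intent: use accelerate only when it is requested AND installed.
    use_accelerate = env.get("ACCELERATE") == "True" and which("accelerate") is not None
    if use_accelerate:
        return ["accelerate", "launch", "--num_cpu_threads_per_process=6", launch_script]
    # Otherwise fall back to running the launch script with plain Python.
    return [python_cmd, launch_script]


if __name__ == "__main__":
    import os
    print(choose_launcher(dict(os.environ)))
```

Passing which as a parameter keeps the function easy to try out without installing accelerate.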

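1.2.2 Running launch.py

Once webui.sh has finished checking the runtime environment, it runs the script named by LAUNCH_SCRIPT, which is launch.py by default. launch.py does two things: it prepares the environment (prepare_environment) and then calls start. The implementation of start is simple: it calls webui.webui() to launch the UI, or webui.api_only() when --nowebui is passed.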

# launch.py
from modules import launch_utils
args = launch_utils.args
...
start = launch_utils.start
def main():
    if not args.skip_prepare_environment:
        prepare_environment()
    if args.test_server:
        configure_for_tests()
    start()
if __name__ == "__main__":
    main()

# launch_utils.py
def start():
    print(f"Launching {'API server' if '--nowebui' in sys.argv else 'Web UI'} with arguments: {' '.join(sys.argv[1:])}")
    import webui
    if '--nowebui' in sys.argv:
        webui.api_only()
    else:
        webui.webui()

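1.2.3 Running webui.py and launching the UI

Gradio is a Python web UI framework well suited to demonstrating deep learning tasks; a few lines of code are enough to build an interface. stable-diffusion-webui is built on the open-source Gradio project.

After startup the program enters webui.py, which uses the command-line options to decide between API mode (api_only) and UI mode. In UI mode it calls the webui() function.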

# webui.py
if __name__ == "__main__":
    if cmd_opts.nowebui:
        api_only()
    else:
        webui()

Inside webui(), modules.ui.create_ui() builds the UI, and shared.demo.launch, a Gradio method, starts serving it.

# webui.py
from modules.shared import cmd_opts
...
def webui():
    launch_api = cmd_opts.api
    initialize()
    while 1:
        if shared.opts.clean_temp_dir_at_start:
            ui_tempdir.cleanup_tmpdr()
            startup_timer.record("cleanup temp dir")
        modules.script_callbacks.before_ui_callback()
        startup_timer.record("scripts before_ui_callback")
        shared.demo = modules.ui.create_ui()      # build the UI
        startup_timer.record("create ui")
        if not cmd_opts.no_gradio_queue:
            shared.demo.queue(64)
        gradio_auth_creds = list(get_gradio_auth_creds()) or None
        # launch the UI
        app, local_url, share_url = shared.demo.launch(
            share=cmd_opts.share,
            server_name=server_name,
            server_port=cmd_opts.port,
            ssl_keyfile=cmd_opts.tls_keyfile,
            ssl_certfile=cmd_opts.tls_certfile,
            ssl_verify=cmd_opts.disable_tls_verify,
            debug=cmd_opts.gradio_debug,
            auth=gradio_auth_creds,
            inbrowser=cmd_opts.autolaunch and os.getenv('SD_WEBUI_RESTARTING ') != '1',
            prevent_thread_lock=True,
            allowed_paths=cmd_opts.gradio_allowed_path,
            app_kwargs={
                "docs_url": "/docs",
                "redoc_url": "/redoc",
            },
        )

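As the code above shows, webui() relies on the cmd_opts variable imported from shared.py in the modules folder, and cmd_opts in turn is produced by the parser variable defined in cmd_args.py.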

# shared.py
from modules import localization, script_loading, errors, ui_components, shared_items, cmd_args
parser = cmd_args.parser
script_loading.preload_extensions(extensions_dir, parser)
script_loading.preload_extensions(extensions_builtin_dir, parser)
if os.environ.get('IGNORE_CMD_ARGS_ERRORS', None) is None:
    cmd_opts = parser.parse_args()
else:
    cmd_opts, _ = parser.parse_known_args()
...
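The difference between the two branches in shared.py can be seen with a cut-down parser. The sketch below is a minimal illustration, not webui code: parse_cmd_opts is a hypothetical helper, only two of the real options are registered, and the environment is passed in as a plain dict. It relies on standard argparse behavior: parse_args exits on an unrecognized flag, while parse_known_args returns the unrecognized ones separately.

```python
import argparse


def parse_cmd_opts(argv, environ):
    """Mimic the strict/lenient parsing switch (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--api", action="store_true")
    parser.add_argument("--port", type=int, default=None)
    if environ.get("IGNORE_CMD_ARGS_ERRORS") is None:
        # Strict mode: an unknown flag makes argparse print an error and exit.
        return parser.parse_args(argv)
    # Lenient mode: unknown flags are collected and ignored.
    cmd_opts, _unknown = parser.parse_known_args(argv)
    return cmd_opts


if __name__ == "__main__":
    opts = parse_cmd_opts(["--api", "--made-up-flag"], {"IGNORE_CMD_ARGS_ERRORS": "1"})
    print(opts.api, opts.port)
```

Lenient mode is handy when a flag belongs to an extension that is not currently installed.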

1.2.4 cmd_args

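The file modules/cmd_args.py parses the command-line arguments and sets the corresponding options. It creates an argparse.ArgumentParser instance named parser and then registers the various command-line options on it, for example:

-f: when given on the command line, allows running as the root user. action='store_true' means that the option's value is set to True when it is specified on the command line.

--update-all-extensions: when given on the command line, downloads updates for all extensions.

These arguments also let us set the model directory (--ckpt-dir), the VAE directory (--vae-dir), the embeddings directory (--embeddings-dir), the hypernetwork directory (--hypernetwork-dir) and so on. For example, we can run: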

cd /root/stable-diffusion-webui/ && ./webui.sh -f \
--port 6006 \
--enable-insecure-extension-access \
--api \
--xformers \
--opt-sdp-attention \
--no-half-vae \
--ckpt-dir /root/autodl-tmp/models/ckpt \
--embeddings-dir /root/autodl-tmp/models/embeddings \
--lora-dir /root/autodl-tmp/models/lora \
--vae-dir /root/autodl-tmp/models/vae \
--controlnet-dir /root/autodl-tmp/models/controlnet \
--lyco-dir /root/autodl-tmp/models/lycoris \
--skip-torch-cuda-test \
--skip-version-check \
--skip-python-version-check
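To see how a few of these flags end up as attributes, here is a small self-contained sketch. It registers only six of the options, copied from the cmd_args.py definitions, and feeds it a shortened version of the command line; it is an illustration rather than the real parser. It shows two standard argparse behaviors: store_true flags default to False, and dashes in option names become underscores in attribute names.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-f", action="store_true")               # allow running as root
parser.add_argument("--port", type=int, default=None)
parser.add_argument("--api", action="store_true")
parser.add_argument("--no-half-vae", action="store_true")
parser.add_argument("--ckpt-dir", type=str, default=None)
parser.add_argument("--vae-dir", type=str, default=None)

# A shortened version of the command line above (flags only, no shell).
argv = ["-f", "--port", "6006", "--api", "--ckpt-dir", "/root/autodl-tmp/models/ckpt"]
opts = parser.parse_args(argv)

# Flags that were not passed keep their defaults (False or None).
print(opts.f, opts.port, opts.api, opts.no_half_vae, opts.ckpt_dir, opts.vae_dir)
```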

import argparse
import json
import os
from modules.paths_internal import models_path, script_path, data_path, extensions_dir, extensions_builtin_dir, sd_default_config, sd_model_file  # noqa: F401
parser = argparse.ArgumentParser()
parser.add_argument("-f", action='store_true', help=argparse.SUPPRESS)  # allows running as root; implemented outside of webui
parser.add_argument("--update-all-extensions", action='store_true', help="launch.py argument: download updates for all extensions when starting the program")
parser.add_argument("--skip-python-version-check", action='store_true', help="launch.py argument: do not check python version")
parser.add_argument("--skip-torch-cuda-test", action='store_true', help="launch.py argument: do not check if CUDA is able to work properly")
parser.add_argument("--reinstall-xformers", action='store_true', help="launch.py argument: install the appropriate version of xformers even if you have some version already installed")
parser.add_argument("--reinstall-torch", action='store_true', help="launch.py argument: install the appropriate version of torch even if you have some version already installed")
parser.add_argument("--update-check", action='store_true', help="launch.py argument: check for updates at startup")
parser.add_argument("--test-server", action='store_true', help="launch.py argument: configure server for testing")
parser.add_argument("--skip-prepare-environment", action='store_true', help="launch.py argument: skip all environment preparation")
parser.add_argument("--skip-install", action='store_true', help="launch.py argument: skip installation of packages")
parser.add_argument("--data-dir", type=str, default=os.path.dirname(os.path.dirname(os.path.realpath(__file__))), help="base path where all user data is stored")
parser.add_argument("--config", type=str, default=sd_default_config, help="path to config which constructs model",)
parser.add_argument("--ckpt", type=str, default=sd_model_file, help="path to checkpoint of stable diffusion model; if specified, this checkpoint will be added to the list of checkpoints and loaded",)
parser.add_argument("--ckpt-dir", type=str, default=None, help="Path to directory with stable diffusion checkpoints")
parser.add_argument("--vae-dir", type=str, default=None, help="Path to directory with VAE files")
parser.add_argument("--gfpgan-dir", type=str, help="GFPGAN directory", default=('./src/gfpgan' if os.path.exists('./src/gfpgan') else './GFPGAN'))
parser.add_argument("--gfpgan-model", type=str, help="GFPGAN model file name", default=None)
parser.add_argument("--no-half", action='store_true', help="do not switch the model to 16-bit floats")
parser.add_argument("--no-half-vae", action='store_true', help="do not switch the VAE model to 16-bit floats")
parser.add_argument("--no-progressbar-hiding", action='store_true', help="do not hide progressbar in gradio UI (we hide it because it slows down ML if you have hardware acceleration in browser)")
parser.add_argument("--max-batch-count", type=int, default=16, help="maximum batch count value for the UI")
parser.add_argument("--embeddings-dir", type=str, default=os.path.join(data_path, 'embeddings'), help="embeddings directory for textual inversion (default: embeddings)")
parser.add_argument("--textual-inversion-templates-dir", type=str, default=os.path.join(script_path, 'textual_inversion_templates'), help="directory with textual inversion templates")
parser.add_argument("--hypernetwork-dir", type=str, default=os.path.join(models_path, 'hypernetworks'), help="hypernetwork directory")
parser.add_argument("--localizations-dir", type=str, default=os.path.join(script_path, 'localizations'), help="localizations directory")
parser.add_argument("--allow-code", action='store_true', help="allow custom script execution from webui")
parser.add_argument("--medvram", action='store_true', help="enable stable diffusion model optimizations for sacrificing a little speed for low VRM usage")
parser.add_argument("--lowvram", action='store_true', help="enable stable diffusion model optimizations for sacrificing a lot of speed for very low VRM usage")
parser.add_argument("--lowram", action='store_true', help="load stable diffusion checkpoint weights to VRAM instead of RAM")
parser.add_argument("--always-batch-cond-uncond", action='store_true', help="disables cond/uncond batching that is enabled to save memory with --medvram or --lowvram")
parser.add_argument("--unload-gfpgan", action='store_true', help="does not do anything.")
parser.add_argument("--precision", type=str, help="evaluate at this precision", choices=["full", "autocast"], default="autocast")
parser.add_argument("--upcast-sampling", action='store_true', help="upcast sampling. No effect with --no-half. Usually produces similar results to --no-half with better performance while using less memory.")
parser.add_argument("--share", action='store_true', help="use share=True for gradio and make the UI accessible through their site")
parser.add_argument("--ngrok", type=str, help="ngrok authtoken, alternative to gradio --share", default=None)
parser.add_argument("--ngrok-region", type=str, help="does not do anything.", default="")
parser.add_argument("--ngrok-options", type=json.loads, help='The options to pass to ngrok in JSON format, e.g.: \'{"authtoken_from_env":true, "basic_auth":"user:password", "oauth_provider":"google", "oauth_allow_emails":"user@asdf.com"}\'', default=dict())
parser.add_argument("--enable-insecure-extension-access", action='store_true', help="enable extensions tab regardless of other options")
parser.add_argument("--codeformer-models-path", type=str, help="Path to directory with codeformer model file(s).", default=os.path.join(models_path, 'Codeformer'))
parser.add_argument("--gfpgan-models-path", type=str, help="Path to directory with GFPGAN model file(s).", default=os.path.join(models_path, 'GFPGAN'))
parser.add_argument("--esrgan-models-path", type=str, help="Path to directory with ESRGAN model file(s).", default=os.path.join(models_path, 'ESRGAN'))
parser.add_argument("--bsrgan-models-path", type=str, help="Path to directory with BSRGAN model file(s).", default=os.path.join(models_path, 'BSRGAN'))
parser.add_argument("--realesrgan-models-path", type=str, help="Path to directory with RealESRGAN model file(s).", default=os.path.join(models_path, 'RealESRGAN'))
parser.add_argument("--clip-models-path", type=str, help="Path to directory with CLIP model file(s).", default=None)
parser.add_argument("--xformers", action='store_true', help="enable xformers for cross attention layers")
parser.add_argument("--force-enable-xformers", action='store_true', help="enable xformers for cross attention layers regardless of whether the checking code thinks you can run it; do not make bug reports if this fails to work")
parser.add_argument("--xformers-flash-attention", action='store_true', help="enable xformers with Flash Attention to improve reproducibility (supported for SD2.x or variant only)")
parser.add_argument("--deepdanbooru", action='store_true', help="does not do anything")
parser.add_argument("--opt-split-attention", action='store_true', help="prefer Doggettx's cross-attention layer optimization for automatic choice of optimization")
parser.add_argument("--opt-sub-quad-attention", action='store_true', help="prefer memory efficient sub-quadratic cross-attention layer optimization for automatic choice of optimization")
parser.add_argument("--sub-quad-q-chunk-size", type=int, help="query chunk size for the sub-quadratic cross-attention layer optimization to use", default=1024)
parser.add_argument("--sub-quad-kv-chunk-size", type=int, help="kv chunk size for the sub-quadratic cross-attention layer optimization to use", default=None)
parser.add_argument("--sub-quad-chunk-threshold", type=int, help="the percentage of VRAM threshold for the sub-quadratic cross-attention layer optimization to use chunking", default=None)
parser.add_argument("--opt-split-attention-invokeai", action='store_true', help="prefer InvokeAI's cross-attention layer optimization for automatic choice of optimization")
parser.add_argument("--opt-split-attention-v1", action='store_true', help="prefer older version of split attention optimization for automatic choice of optimization")
parser.add_argument("--opt-sdp-attention", action='store_true', help="prefer scaled dot product cross-attention layer optimization for automatic choice of optimization; requires PyTorch 2.*")
parser.add_argument("--opt-sdp-no-mem-attention", action='store_true', help="prefer scaled dot product cross-attention layer optimization without memory efficient attention for automatic choice of optimization, makes image generation deterministic; requires PyTorch 2.*")
parser.add_argument("--disable-opt-split-attention", action='store_true', help="prefer no cross-attention layer optimization for automatic choice of optimization")
parser.add_argument("--disable-nan-check", action='store_true', help="do not check if produced images/latent spaces have nans; useful for running without a checkpoint in CI")
parser.add_argument("--use-cpu", nargs='+', help="use CPU as torch device for specified modules", default=[], type=str.lower)
parser.add_argument("--listen", action='store_true', help="launch gradio with 0.0.0.0 as server name, allowing to respond to network requests")
parser.add_argument("--port", type=int, help="launch gradio with given server port, you need root/admin rights for ports  
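2. Text-to-image (understanding prompts)

References: NovelAI资源整合, 《图解法术Ⅰ：服装咒语》

References: Stable Diffusion & NovelAI资源及使用技巧收集汇总（自用）, Stable Diffusion完整入门指南

Prompts must be in English, tend to be long, and are full of symbols, much like an arcane incantation, so people jokingly call writing a prompt "chanting a spell". The model often does not know what we actually want, and the prompt is how we instruct and guide it. In Stable Diffusion both text-to-image and image-to-image rely on prompts; they are the foundation of everything.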

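2.1 Prompt basics

Prompts must be in English; if your English is weak, fall back on translation software or a plugin. A prompt does not need to follow full-sentence grammar. A pile of phrases is acceptable and often works better. For example, to draw "a long, wide noodle and a big, round bowl", writing (noodle, long, wide), (bowl, big, round) is fine and may even give a better result.

Use separators: phrases in a prompt must be separated from each other. Because the underlying code works with English text, the usual separator is the English (half-width) comma.
Line breaks: a prompt may span several lines, but it is best to end each line with an English comma
High-quality images need prompts that describe the content in detail and state clear quality standards
A girl is walking in the forest can be written as 1 girl,walking,forest,path,sun,sunshine,shining on body. The image generated this way still falls well short of what we expect, because the model (Stable Diffusion) generates images with some randomness and "a girl walking in the forest" is too vague. The model does not know the girl's look, clothing, or camera angle, or what the forest is like, so it has to guess and the result is poor. We can refine, adjust, and extend the prompt step by step.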
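Prompt categories: prompts can be grouped as follows, which makes it easier to fill in each slot when writing.

Content prompts: adapt these to your own needs

Subject features: the more specific, the clearer the AI's direction. Abstract adjectives such as beautiful and happy also affect the overall feel
Scene features: add outdoor for outdoor scenes and indoor for interiors; this noticeably affects the mood of the image
Ambient lighting

Framing and viewpoint: use close up for close shots; for medium distance you can write full body

Standard prompts: fairly fixed, so you can copy them from others
With content prompts alone, the generated image will most likely be unsatisfying: blurry, lacking detail, and so on. Quality and style prompts are then needed to pull the image toward a fixed standard. (Image style also depends on the pretrained model.)

Framing and viewpoint

Distance: close-up, distant
Figure proportion: full body, upper body
Viewing angle: from above, view of back
Lens type: wide angle, Sony A7I

Style prompts

Illustration: illustration, painting, paintbrush
Anime: anime, comic, game CG
Realistic: photorealistic, realistic, photograph

General high quality

best quality, ultra-detailed, masterpiece, hires, 8k

Specific high-resolution types

extremely detailed CG unity 8k wallpaper (an ultra-detailed 8K Unity game CG), unreal engine rendered (rendered in Unreal Engine)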
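The categories above lend themselves to a simple "fill in the slots, then join with commas" routine. The sketch below is only one way to organize this; build_prompt and its argument names are hypothetical and do not correspond to any webui feature. It trims each tag, skips empty or repeated tags, and joins the rest with half-width commas.

```python
def build_prompt(subject, scene, lighting, framing, style, quality):
    """Join tag groups into one comma-separated prompt (hypothetical helper)."""
    tags = []
    for group in (subject, scene, lighting, framing, style, quality):
        for tag in group:
            tag = tag.strip()
            # Skip empty strings and tags that were already added.
            if tag and tag not in tags:
                tags.append(tag)
    return ", ".join(tags)


prompt = build_prompt(
    subject=["1girl", "walking"],
    scene=["forest", "path", "outdoor"],
    lighting=["sun", "sunshine"],
    framing=["full body"],
    style=["illustration", "painting"],
    quality=["best quality", "ultra-detailed", "masterpiece", "hires", "8k"],
)
print(prompt)
```

Keeping content tags first and quality tags last mirrors the split between content prompts and standard prompts described above.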

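2.2 Weights

We added white flower to the prompt just now, but no white flower appeared in the generated image. A prompt contains many words, so the model does not necessarily pick up on what you want. Weights let you adjust this, in two ways:

Brackets: each layer of [] multiplies the weight by 0.9 (a decrease), each layer of {} by 1.05, and each layer of () by 1.1. So (((white flower))) gives white flower a weight of 1.1 × 1.1 × 1.1 = 1.331.

Brackets plus a number: (white flower:1.5) sets the weight of white flower to 1.5.

A safe range for prompt weights is roughly 1 ± 0.5; weights that are too high tend to distort the image. If we want still more white-flower elements, we can add several related terms so that they reinforce one another. Advanced prompt rules (blending, transfer, iteration, and so on) will be covered later.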
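The bracket arithmetic can be checked with a few lines of Python. This is a toy calculator for the two notations described above, not the parser webui actually uses; tag_weight and the factor table are hypothetical names, and the sketch handles only a single tag wrapped in one kind of bracket or written in the explicit (text:number) form.

```python
import re

# Per-layer multipliers as stated in the text.
BRACKET_FACTORS = {"(": 1.1, "{": 1.05, "[": 0.9}
CLOSERS = {"(": ")", "{": "}", "[": "]"}


def tag_weight(token):
    """Return (text, weight) for one weighted tag (toy calculator)."""
    token = token.strip()
    # Explicit form: (text:number)
    explicit = re.fullmatch(r"\((.+):([0-9]*\.?[0-9]+)\)", token)
    if explicit:
        return explicit.group(1).strip(), float(explicit.group(2))
    # Nested form: peel one bracket layer at a time and multiply the factor.
    weight = 1.0
    while len(token) >= 2 and token[0] in BRACKET_FACTORS and token[-1] == CLOSERS[token[0]]:
        weight *= BRACKET_FACTORS[token[0]]
        token = token[1:-1].strip()
    return token, round(weight, 4)


print(tag_weight("(((white flower)))"))
print(tag_weight("(white flower:1.5)"))
```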

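Some additional rules:

The earlier a term appears in the prompt, the more weight it carries. For example, if scenery comes first the person will be small; conversely, the person becomes larger or is shown half-length
Larger images need more prompt terms, otherwise the terms bleed into one another

Prompts can include emoji, which are quite expressive.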

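2.3 Negative prompt

A negative prompt specifies content you do not want generated. It can remove common Stable Diffusion deformities such as extra limbs. The sampler compares the result guided by the prompt with the result guided by the negative prompt and pushes the final output toward the former and away from the latter. Here is an example:

The original image is foggy (fog) and grainy (grainy, low quality)
Negative prompt fog: the fog is gone, but a strange purple tint appears
Negative prompt grainy: no fog and no purple, but the colors are flat

Negative prompt fog, grainy, purple: no fog and no purple, high image quality, and high color saturation.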


[Figure: four-panel comparison. Panels: negative prompt None; negative prompt fog; negative prompt grainy; negative prompt fog, grainy, purple]
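The "toward the prompt, away from the negative prompt" behavior is usually described with the classifier-free guidance formula. The sketch below assumes that formulation and uses plain lists of floats in place of real noise-prediction tensors; guided_noise and its parameter names are illustrative only. The negative-prompt prediction takes the place of the unconditional one, and the guidance scale controls how far the result is pushed along the difference.

```python
def guided_noise(cond, neg, cfg_scale):
    """Classifier-free guidance on toy vectors.

    cond: prediction conditioned on the prompt
    neg:  prediction conditioned on the negative prompt
    """
    # Start from the negative prediction and move toward the prompt prediction.
    return [n + cfg_scale * (c - n) for c, n in zip(cond, neg)]


# With scale 1 the result equals the prompt prediction; larger scales overshoot
# past it, moving further away from the negative prediction.
print(guided_noise([1.0, 2.0], [0.0, 1.0], 1.0))
print(guided_noise([1.0, 2.0], [0.0, 1.0], 7.0))
```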

General-purpose negative prompt template:

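2.4 Generation parameters

Sampling steps, 20-50:
Stable Diffusion generates an image by adding noise and then removing it; the added noise gives the model more room to work with. Denoising can be thought of as the model building up the image you asked for bit by bit, and each flicker of the preview during generation corresponds to one iteration.
In theory more steps give a sharper image. In practice the improvement is small beyond about 20 steps, and computation time grows with the step count. The default is 20; use 30-50 when you need higher quality; below 10 the results look horrifying.

Sampling method: roughly, the algorithm the model uses to generate the image. Choose one with a + in its name or the one the model recommends

Euler and Euler a: illustration style, fairly simple
DPM++ 2M and DPM++ 2M Karras: fast

DPM++ SDE Karras: rich detail

In practice, the sampling methods with a + in their names are recommended because they have been improved. Many models also recommend a sampling method, usually the one that performed best in the author's tests.

Resolution: about 1024×1024 is recommended; use hires fix for higher quality
Too low and the image lacks detail and quality; too high and you may run out of VRAM or get extra people, hands, and feet. Models are trained at modest resolutions, so at a very high resolution the AI tends to treat the canvas as several images stitched together. If you really need high quality, generate at a low resolution first and then enlarge with hires fix, which is essentially image-to-image and is covered later.
Restore faces: usually enabled
Tiling: not recommended

CFG Scale (prompt relevance): how strongly the prompt influences the generated image. With a higher value the image follows the prompt more closely; with a lower value the prompt carries less weight and the image is more random

For prompts about people, keep CFG Scale between 7 and 12; higher values tend to distort the image
For large scenes such as architecture, keep it around 3-7. This leaves some room for randomness without hurting the visual quality of the image.
Seed: the dice button means a new random seed each time (-1); the recycle button reuses the seed from the previous generation

Batch count and batch size: used to generate several images in one go. Raising the batch size also raises VRAM usage, so leave it unchanged when generating high-resolution images.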
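When webui is started with --api, the same settings can be sent as a JSON request instead of being set in the UI. The sketch below only builds the request body and does not send it. The field names (prompt, negative_prompt, steps, sampler_name, cfg_scale, width, height, seed, batch_size, n_iter, restore_faces, tiling) follow the txt2img endpoint as I understand it and should be treated as assumptions; the /docs page of a running instance lists the authoritative schema. The helper name txt2img_payload and the multiple-of-8 size check are likewise assumptions made for illustration.

```python
import json


def txt2img_payload(prompt, width, height, negative_prompt="", steps=20,
                    sampler_name="Euler a", cfg_scale=7.0, seed=-1,
                    batch_size=1, n_iter=1, restore_faces=False, tiling=False):
    """Collect the generation settings into one JSON-ready dict (hypothetical helper)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # Assumption: image sizes are kept to multiples of 8.
    if width % 8 or height % 8:
        raise ValueError("width and height should be multiples of 8")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,                # sampling steps
        "sampler_name": sampler_name,  # sampling method
        "cfg_scale": cfg_scale,        # prompt relevance
        "width": width,
        "height": height,
        "seed": seed,                  # -1 means a new random seed each time
        "batch_size": batch_size,      # images per batch
        "n_iter": n_iter,              # number of batches
        "restore_faces": restore_faces,
        "tiling": tiling,
    }


payload = txt2img_payload(
    "1girl, walking, forest, best quality",
    width=1024, height=1024,
    negative_prompt="fog, grainy, purple",
    steps=30, cfg_scale=8.0,
)
print(json.dumps(payload, indent=2))
```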

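2.5 Prompt-writing methods for beginners

For more methods, see AI艺术天堂

Natural language: describe the image in Chinese and translate it into English, for example with a translation plugin
Prompt tools:
- AI提示词加速器, AI tag生成工具, NovelAI tag生成器 V2.1, 魔咒百科词典, NovelAi魔导书: all of these sites let you tick tags freely; try them to see which suits you best.
- ClickPrompt, ChatFlow
- PromptBase: DALL·E, GPT, Midjourney, Stable Diffusion, ChatGPT Prompt Marketplace.
- Copying from others: Civitai (requires a proxy), libilibi, 炼丹阁, DesAi, openart (leans Western), and arthub (leans Asian) all host many excellent images to learn from. There is also 室内设计tag整合表, an interior-design tag sheet.
- Generating prompts with ChatGPT: first paste the following text into ChatGPT: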

- Reference guide of what is Stable Diffusion and how to Prompt -
Stable Diffusion is a deep learning model for generating images based on text descriptions and can be applied to inpainting, outpainting, and image-to-image translations guided by text prompts. Developing a good prompt is essential for creating high-quality images.
A good prompt should be detailed and specific, including keyword categories such as subject, medium, style, artist, website, resolution, additional details, color, and lighting. Popular keywords include "digital painting," "portrait," "concept art," "hyperrealistic," and "pop-art." Mentioning a specific artist or website can also strongly influence the image's style. For example, a prompt for an image of Emma Watson as a sorceress could be: "Emma Watson as a powerful mysterious sorceress, casting lightning magic, detailed clothing, digital painting, hyperrealistic, fantasy, surrealist, full body."
Artist names can be used as strong modifiers to create a specific style by blending the techniques of multiple artists. Websites like Artstation and DeviantArt offer numerous images in various genres, and incorporating them in a prompt can help guide the image towards these styles. Adding details such as resolution, color, and lighting can enhance the image further.
Building a good prompt is an iterative process. Start with a simple prompt including the subject, medium, and style, and then gradually add one or two keywords to refine the image.
Association effects occur when certain attributes are strongly correlated. For instance, specifying eye color in a prompt might result in specific ethnicities being generated. Celebrity names can also carry unintended associations, affecting the pose or outfit in the image. Artist names, too, can influence the generated images.
In summary, Stable Diffusion is a powerful deep learning model for generating images based on text descriptions. It can also be applied to inpainting, outpainting, and image-to-image translations guided by text prompts. Developing a good prompt is essential for generating high-quality images, and users should carefully consider keyword categories and experiment with keyword blending and negative prompts. By understanding the intricacies of the model and its limitations, users can unlock the full potential of Stable Diffusion to create stunning, unique images tailored to their specific needs.
--
Please use this information as a reference for the task you will ask me to do after.
--
Below is a list of prompts that can be used to generate images with Stable Diffusion.
- Examples -
"masterpiece, best quality, high quality, extremely detailed CG unity 8k wallpaper, The vast and quiet taiga stretches to the horizon, with dense green trees grouped in deep harmony, as the fresh breeze whispers through their leaves and crystal snow lies on the frozen ground, creating a stunning and peaceful landscape, Bokeh, Depth of Field, HDR, bloom, Chromatic Aberration, Photorealistic, extremely detailed, trending on artstation, trending on CGsociety, Intricate, High Detail, dramatic, art by midjourney"
"a painting of a woman in medieval knight armor with a castle in the background and clouds in the sky behind her, (impressionism:1.1), ('rough painting style':1.5), ('large brush texture':1.2), ('palette knife':1.2), (dabbing:1.4), ('highly detailed':1.5), professional majestic painting by Vasily Surikov, Victor Vasnetsov, (Konstantin Makovsky:1.3), trending on ArtStation, trending on CGSociety, Intricate, High Detail, Sharp focus, dramatic"
"masterpiece, best quality, high quality, extremely detailed CG unity 8k wallpaper,flowering landscape, A dry place like an empty desert, dearest, foxy, Mono Lake, hackberry,3D Digital Paintings, award winning photography, Bokeh, Depth of Field, HDR, bloom, Chromatic Aberration, Photorealistic, extremely detailed, trending on artstation, trending on CGsociety, Intricate, High Detail, dramatic, art by midjourney"
"portrait of french women in full steel knight armor, highly detailed, heart professional majestic oil painting by Vasily Surikov, Victor Vasnetsov, Konstantin Makovsky, trending on ArtStation, trending on CGSociety, Intricate, High Detail, Sharp focus, dramatic, photorealistic"
"(extremely detailed CG unity 8k wallpaper), full shot photo of the most beautiful artwork of a medieval castle, snow falling, nostalgia, grass hills, professional majestic oil painting by Ed Blinkey, Atey Ghailan, Studio Ghibli, by Jeremy Mann, Greg Manchess, Antonio Moro, trending on ArtStation, trending on CGSociety, Intricate, High Detail, Sharp focus, dramatic, photorealistic painting art by midjourney and greg rutkowski"
"micro-details, fine details, a painting of a fox, fur, art by Pissarro, fur, (embossed painting texture:1.3), (large brush strokes:1.6), (fur:1.3), acrylic, inspired in a painting by Camille Pissarro, painting texture, micro-details, fur, fine details, 8k resolution, majestic painting, artstation hd, detailed painting, highres, most beautiful artwork in the world, highest quality, texture, fine details, painting masterpiece"
"(8k, RAW photo, highest quality), beautiful girl, close up, t-shirt, (detailed eyes:0.8), (looking at the camera:1.4), (highest quality), (best shadow), intricate details, interior, (ponytail, ginger hair:1.3), dark studio, muted colors, freckles"
"(dark shot:1.1), epic realistic, broken old boat in big storm, illustrated by herg, style of tin tin comics, pen and ink, female pilot, art by greg rutkowski and artgerm, soft cinematic light, adobe lightroom, photolab, hdr, intricate, highly detailed, (depth of field:1.4), faded, (neutral colors:1.2), (hdr:1.4), (muted colors:1.2), hyperdetailed, (artstation:1.4), cinematic, warm lights, dramatic light, (intricate details:1.1), complex background, (rutkowski:0.66), (teal and orange:0.4), (intricate details:1.12), hdr, (intricate details, hyperdetailed:1.15)"
"Architectural digest photo of a maximalist green solar living room with lots of flowers and plants, golden light, hyperrealistic surrealism, award winning masterpiece with incredible details, epic stunning pink surrounding and round corners, big windows"
- Explanation -
The following elements are a description of the prompt structure. You should not include the label of a section like "Scene description:".
Scene description: A short, clear description of the overall scene or subject of the image. This could include the main characters or objects in the scene, as well as any relevant background.
Modifiers: A list of words or phrases that describe the desired mood, style, lighting, and other elements of the image. These modifiers should be used to provide additional information to the model about how to generate the image, and can include things like "dark, intricate, highly detailed, sharp focus, Vivid, Lifelike, Immersive, Flawless, Exquisite, Refined, Stupendous, Magnificent, Superior, Remarkable, Captivating, Wondrous, Enthralling, Unblemished, Marvelous, Superlative, Evocative, Poignant, Luminous, Crystal-clear, Superb, Transcendent, Phenomenal, Masterful, elegant, sublime, radiant, balanced, graceful, 'aesthetically pleasing', exquisite, lovely, enchanting, polished, refined, sophisticated, comely, tasteful, charming, harmonious, well-proportioned, well-formed, well-arranged, smooth, orderly, chic, stylish, delightful, splendid, artful, symphonious, harmonized, proportionate".
Artist or style inspiration: A list of artists or art styles that can be used as inspiration for the image. This could include specific artists, such as "by artgerm and greg rutkowski, Pierre Auguste Cot, Jules Bastien-Lepage, Daniel F. Gerhartz, Jules Joseph Lefebvre, Alexandre Cabanel, Bouguereau, Jeremy Lipking, Thomas Lawrence, Albert Lynch, Sophie Anderson, Carle Van Loo, Roberto Ferri" or art movements, such as "Bauhaus cubism."
Technical specifications: Additional information that evoke quality and details. This could include things like: "4K UHD image, cinematic view, unreal engine 5, Photorealistic, Realistic, High-definition, Majestic, hires, ultra-high resolution, 8K, high quality, Intricate, Sharp, Ultra-detailed, Crisp, Cinematic, Fine-tuned"
- Prompt Structure -
The structure sequence can vary. However, the following is a good reference:
[Scene description]. [Modifiers], [Artist or style inspiration], [Technical specifications]
- Special Modifiers -
In the examples you can notice that some terms are enclosed in (). That instructs the Generative Model to pay more attention to these words. If there are more (()) it means more attention.
Similarly, you can find a structure like this (word:1.4). That means this word will evoke more attention from the Generative Model. The number "1.4" means 140%. Therefore, if a word without modifiers has a weight of 100%, a word as in the example (word:1.4), will have a weight of 140%.
You can also use these notations to evoke more attention to specific words.
- Your Task -
Based on the examples and the explanation of the structure, you will create 5 prompts. In my next requests, I will use the command /Theme: [ description of the theme]. Then, execute your task based on the description of the theme.
--
Acknowledge that you understood the instructions

或者是：

# Stable Diffusion prompt 助理
你来充当一位有艺术气息的Stable Diffusion prompt 助理。
## 任务
我用自然语言告诉你要生成的prompt的主题，你的任务是根据这个主题想象一幅完整的画面，然后转化成一份详细的、高质量的prompt，让Stable Diffusion可以生成高质量的图像。
## 背景介绍
Stable Diffusion是一款利用深度学习的文生图模型，支持通过使用 prompt 来产生新的图像，描述要包含或省略的元素。
## prompt 概念
- 完整的prompt包含“**Prompt:**”和"**Negative Prompt:**"两部分。
- prompt 用来描述图像，由普通常见的单词构成，使用英文半角","做为分隔符。
- negative prompt用来描述你不想在生成的图像中出现的内容。
- 以","分隔的每个单词或词组称为 tag。所以prompt和negative prompt是由系列由","分隔的tag组成的。
## () 和 [] 语法
调整关键字强度的等效方法是使用 () 和 []。 (keyword) 将tag的强度增加 1.1 倍，与 (keyword:1.1) 相同，最多可加三层。 [keyword] 将强度降低 0.9 倍，与 (keyword:0.9) 相同。
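## Prompt format requirements
Below I describe the steps for generating the prompt. The prompt here can describe people, landscapes, objects, or abstract digital art. You may add reasonable picture details as needed, but no fewer than 5.
### 1. Prompt requirements
- The Stable Diffusion prompt you output starts with "**Prompt:**".
- The prompt covers the main subject, medium, additional details, image quality, art style, color tone, lighting, and so on, but your output must not be split into sections. Section labels such as "medium:" are not needed, and the prompt must not contain ":" or ".".
- Main subject: a not-too-brief English description of the main subject, such as A girl in a garden, summarizing the core content of the picture (the subject can be a person, an event, an object, or a scene). Generate this part from the theme I give you each time. You may add more reasonable details related to the theme.
- For themes involving people, you must describe the person's eyes, nose, and lips, for example 'beautiful detailed eyes,beautiful detailed lips,extremely detailed eyes and face,longeyelashes', so that Stable Diffusion does not generate deformed facial features at random. This is very important. You can also describe the person's appearance, mood, clothes, pose, viewing angle, action, background, and so on. Among character attributes, 1girl means one girl and 2girls means two girls.
- Medium: the material used to make the artwork, for example illustration, oil painting, 3D rendering, and photography. Medium has a strong effect, because a single keyword can change the style dramatically.
- Additional details: scene details or character details that make the image look fuller and more plausible. This part is optional. Keep the picture harmonious overall and do not conflict with the theme.
- Image quality: this part must always begin with "(best quality,4k,8k,highres,masterpiece:1.2),ultra-detailed,(realistic,photorealistic,photo-realistic:1.37)", which is the mark of high quality. Other commonly used quality tags, which you can add as the theme requires: HDR,UHD,studio lighting,ultra-fine painting,sharp focus,physically-based rendering,extreme detail description,professional,vivid colors,bokeh.
- Art style: this part describes the style of the image. Adding a suitable art style improves the generated image. Common art styles include portraits,landscape,horror,anime,sci-fi,photography,concept artists, and so on.
- Color tone: control the overall color of the picture by adding colors.
- Lighting: the lighting effect of the whole picture.
### 2. Negative prompt requirements
- The negative prompt part starts with "**Negative Prompt:**". Anything you want to keep out of the image can be added after "**Negative Prompt:**".
- In all cases, the negative prompt must include this: "nsfw,(low quality,normal quality,worst quality,jpeg artifacts),cropped,monochrome,lowres,low saturation,((watermark)),(white letters)"
- For themes involving people, your output needs an additional people-related negative prompt: "skin spots,acnes,skin blemishes,age spot,mutated hands,mutated fingers,deformed,bad anatomy,disfigured,poorly drawn face,extra limb,ugly,poorly drawn hands,missing limb,floating limbs,disconnected limbs,out of focus,long neck,long body,extra fingers,fewer fingers,(multi nipples),bad hands,signature,username,bad feet,blurry,bad body".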
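### 3. Restrictions:
- Describe tags with English words or phrases. You are not limited to the words I give you. Include only keywords or phrases.
- Do not output sentences, and do not add any explanation.
- Use at most 40 tags and at most 60 words.
- Do not put quotation marks ("") around tags.
- Use the ASCII comma "," as the separator.
- Order the tags from most to least important.
- The theme I give you may be described in Chinese. Your prompt and negative prompt must be in English only.
My first theme is: a beautiful Chinese girl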
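
The () and [] weighting rules quoted in both templates above can be made concrete with a small calculator. This is only a sketch, not the webui's own parser: it assumes the common webui convention of multiplying by 1.1 for each ( ) layer and dividing by 1.1 for each [ ] layer (close to the 0.9 quoted above), and the function name emphasis_weight is made up for illustration.

```python
ROUND_BRACKET = 1.1        # assumed multiplier for one ( ) layer
SQUARE_BRACKET = 1 / 1.1   # assumed multiplier for one [ ] layer


def emphasis_weight(tag: str) -> tuple[str, float]:
    """Return (bare_word, effective_weight) for a single bracketed tag."""
    text = tag.strip()
    weight = 1.0
    while len(text) >= 2:
        if text[0] == "(" and text[-1] == ")":
            inner = text[1:-1]
            word, sep, number = inner.rpartition(":")
            # Explicit form (word:1.4): the number replaces the 1.1 step.
            if sep and not word.startswith(("(", "[")):
                try:
                    return word.strip(), weight * float(number)
                except ValueError:
                    pass  # not a number, treat it as a plain ( ) layer
            weight *= ROUND_BRACKET
            text = inner.strip()
        elif text[0] == "[" and text[-1] == "]":
            weight *= SQUARE_BRACKET
            text = text[1:-1].strip()
        else:
            break
    return text, weight


if __name__ == "__main__":
    for sample in ["word", "(word)", "((word))", "[word]", "(word:1.4)"]:
        print(sample, emphasis_weight(sample))
```

Two nested layers, as in ((word)), give 1.1 × 1.1 ≈ 1.21, while the explicit form (word:1.4) sets the weight directly, which is usually easier to read than stacked brackets.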

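III. Image-to-Image (img2img)

3.1 Getting started with img2img

In txt2img, we use prompts to tell the AI model what image we want, but AI painting has a degree of randomness and may not fully get what you have in mind. If you give it a reference image, the AI can take more information from the picture and grasp your idea more directly.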

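Prompt
img2img also needs a specific, accurate prompt. With no prompt at all, the result usually goes wrong. Adding content prompts that describe details (short hair, blue eyes, beard, wool hat, plaid shirt), plus some standard positive and negative prompts, gives better results.
Denoising strength: a parameter unique to img2img. The higher it is, the more the original image is redrawn.
In the example below, we pick a photo of a real person and use the AbyssOrangeMix model to generate the corresponding anime character. A denoising strength of 0.6-0.8 is recommended here: too high and the character may be distorted, too low and the effect is barely visible. The result looks like this: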

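Resolution: keeping the same resolution as the original image is generally recommended.
- If the original is too large (for example 3000×3000), scale it down proportionally;
- If you really want a different aspect ratio, crop the original first and then generate.
If the aspect ratio you set differs from the original, the image will be distorted or stretched. Several resize options below the image can crop part of it for you. The last one, Just resize (latent upscale), needs a lot of VRAM and is not recommended.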
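
The three img2img settings above (prompt, denoising strength, resolution) can be sketched as a request body for the webui's optional HTTP API. This is a rough sketch under several assumptions: the webui was started with the --api flag, the field names match the /sdapi/v1/img2img endpoint, and the 1024-pixel cap and rounding to multiples of 8 are illustrative choices rather than values from this article. Nothing is actually sent.

```python
def fit_resolution(width: int, height: int, max_side: int = 1024, multiple: int = 8) -> tuple[int, int]:
    """Scale (width, height) down proportionally so the long side fits max_side.

    Both sides are snapped to a multiple of 8, which latent-space models expect.
    Images already small enough keep their size (apart from the snapping).
    """
    scale = min(1.0, max_side / max(width, height))

    def snap(value: int) -> int:
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


def build_img2img_payload(image_b64: str, prompt: str, negative_prompt: str,
                          source_size: tuple[int, int],
                          denoising_strength: float = 0.7, seed: int = -1) -> dict:
    """Assemble a request body; 0.6-0.8 is the range suggested above for photo-to-anime."""
    width, height = fit_resolution(*source_size)
    return {
        "init_images": [image_b64],   # base64-encoded source image
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "denoising_strength": denoising_strength,
        "width": width,
        "height": height,
        "seed": seed,                 # -1 asks for a random seed
    }


if __name__ == "__main__":
    payload = build_img2img_payload("<base64 data>", "1boy, short hair, blue eyes",
                                    "lowres, bad anatomy", (3000, 3000))
    print(payload["width"], payload["height"], payload["denoising_strength"])
```

Because the size is derived from the source image, the aspect ratio is preserved and the stretching described above is avoided.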

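3.2 Understanding the random seed

In the example above, the model generated a closer match after we added more detail. But the generated image is an indoor scene, while the original was outdoors. We can add scene words (xx in the background) as constraints, such as wilderness, forest, travel, or depth of field (a blurred background).

Depth of field

After adding scene words, the character's whole appearance changed as well. This is due to the randomness of AI painting. To keep the earlier character and change only the background, we just need to fix the random seed.

Image generation is a random process: every run starts from randomly sampled noise, and the random seed is the number that determines that noise. With the same seed and otherwise similar settings, the generated images share many similarities, because they start from the same random draw.

Dice button: sets the seed to -1, meaning a different random seed every time

Recycle button: reuses the seed from the previous generation

This time we reuse the seed of the image we liked and add the scene words. The background changes, while the character stays largely the same.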
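
Why a fixed seed keeps the character stable can be illustrated with Python's standard random module. This is a toy sketch only: the real sampler draws a latent noise tensor rather than a short list, and both helper names are made up for illustration.

```python
import random


def resolve_seed(seed: int) -> int:
    """Mimic the dice button: -1 means 'pick a fresh random seed'."""
    return random.randrange(2**32) if seed == -1 else seed


def initial_noise(seed: int, size: int = 4) -> list[float]:
    """Draw Gaussian 'starting noise' that depends only on the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


if __name__ == "__main__":
    fixed = resolve_seed(1234)
    # Same seed, same starting noise: the composition stays close.
    print(initial_noise(fixed) == initial_noise(fixed))
    # A seed of -1 resolves to a different number on (almost) every call.
    print(resolve_seed(-1), resolve_seed(-1))
```

Changing only the prompt while holding the seed fixed therefore changes what the model is asked to draw, not the noise it starts from.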

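3.3 More uses of img2img

Anthropomorphizing objects
Import a picture of an object and enter anthropomorphic prompts to turn objects, or even landscapes, into characters.
Turning anime characters into 3D
Import an anime character image, use a 2.5D model, and constrain it with standard realistic-texture prompts to get a near-3D result. For a more specific and accurate conversion, you can use some LoRAs.
Abstract painting
Sometimes even a casual scribble is enough for the model to produce a stunning result (the Sketch mode in img2img).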


IV. Models

Type	File extension	Size	Location
checkpoint	.ckpt or .safetensors	2-7 GB / 1-2 GB	stable-diffusion-webui/models/Stable-diffusion
VAE	.pt or .safetensors	a few hundred MB	stable-diffusion-webui/models/VAE
embeddings	.pt or .safetensors	tens to a few hundred KB	stable-diffusion-webui/embeddings
hypernetwork	.pt or .safetensors	a few hundred MB	stable-diffusion-webui/models/hypernetworks
LoRA	.pt or .safetensors	over a hundred MB	stable-diffusion-webui/models/Lora
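
The table above can be turned into a small lookup that tells you where a downloaded file belongs. This is a sketch under assumptions: the folder names follow the table (with the LoRA folder spelled models/Lora, as in a default install, which matters on case-sensitive file systems), and the helper name install_path and the example file name are hypothetical.

```python
from pathlib import Path

# Folder per model type, relative to the webui root (taken from the table above).
MODEL_DIRS = {
    "checkpoint": "models/Stable-diffusion",
    "vae": "models/VAE",
    "embedding": "embeddings",
    "hypernetwork": "models/hypernetworks",
    "lora": "models/Lora",
}

# Extensions listed in the table for each type.
ALLOWED_SUFFIXES = {
    "checkpoint": {".ckpt", ".safetensors"},
    "vae": {".pt", ".safetensors"},
    "embedding": {".pt", ".safetensors"},
    "hypernetwork": {".pt", ".safetensors"},
    "lora": {".pt", ".safetensors"},
}


def install_path(webui_root: str, model_type: str, filename: str) -> Path:
    """Return the destination path for a downloaded model file."""
    kind = model_type.lower()
    if kind not in MODEL_DIRS:
        raise ValueError(f"unknown model type: {model_type}")
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES[kind]:
        raise ValueError(f"unexpected extension {suffix!r} for {kind}")
    return Path(webui_root) / MODEL_DIRS[kind] / filename


if __name__ == "__main__":
    print(install_path("stable-diffusion-webui", "LoRA", "example_style.safetensors"))
```

Checking the extension before copying catches the most common mistake, which is dropping a multi-gigabyte checkpoint into the embeddings or LoRA folder.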

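4.1 Checkpoint

4.1.1 About checkpoints

For model authors, training a model usually means producing checkpoint files. These files contain the model parameters along with information such as optimizer state, and are state snapshots saved periodically during training.

For users, a checkpoint can be thought of as a style filter, such as oil painting, comic, or photorealistic. By choosing a checkpoint, you steer what Stable Diffusion generates toward that style. Note that some checkpoints may need to be combined with specific add-on models such as LoRA (Low-Rank Adaptation) to get better results.

When downloading a checkpoint, read the model description. Authors usually provide the relevant files and usage notes to help you use and understand the model.

If you add a model file while the webui is open, just click refresh. Generating an image before the model has finished loading may cause an error.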

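4.1.2 Checkpoint categories and downloads

By art style, checkpoints can be divided into three categories.

The officially released Stable Diffusion 1.4/1.5/2.0/2.1 models give fairly mediocre results because of copyright constraints. Most of what people use now are community-trained models (training AI models is jokingly called "alchemy" because so much of it is hard to control).

The main model download sites at the moment are:

Hugging Face: click the Models tab at the top and choose the text-to-image tag on the left to see text-to-image models.
Civitai (needs a proxy to access), LiblibAI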
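
4.2 VAE (variational autoencoder)

The VAE decodes the denoised latent-space data into a normal image. You can think of it simply as the model's color-grading filter: it mainly affects color and texture. Most newer models already have a VAE merged into the file, and some authors recommend a suitable VAE in the model card.

VAE files usually have the suffix .pt or .safetensors and are stored in stable-diffusion-webui/models/VAE. There is also a way to auto-load the VAE for a specific model: put the VAE file in the models/Stable-diffusion folder as well, rename it to match the model name, and insert .vae before the .pt suffix. The VAE can then be loaded automatically together with the model.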
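
The renaming rule for auto-loading a VAE can be written down as a one-line helper. This is a sketch only: the helper name and the checkpoint file name are hypothetical, and it covers the .pt case described above.

```python
from pathlib import Path


def auto_vae_name(checkpoint_file: str, vae_suffix: str = ".pt") -> str:
    """Name a VAE so it sits next to its checkpoint: <model name>.vae<suffix>."""
    stem = Path(checkpoint_file).stem   # drops only the final extension
    return f"{stem}.vae{vae_suffix}"


if __name__ == "__main__":
    # Hypothetical checkpoint in models/Stable-diffusion
    print(auto_vae_name("my-anime-model-v2.safetensors"))
```

For the hypothetical checkpoint above, the VAE would be saved as my-anime-model-v2.vae.pt in the same folder.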

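
4.3 embeddings

On Civitai or LiblibAI, embeddings are filtered with the Textual Inversion tag.

In Stable Diffusion, an embedding can be understood as a component that turns input into a vector representation the model can process and generate from. If a checkpoint is a thick dictionary in which many entries (keywords) can be looked up for generation, an embedding is like an efficient index that points to specific content, while a LoRA is like a color plate in the dictionary: it points to specific content in more detail (it contains more information).

For example, to generate a happy Pikachu we would normally have to enter many descriptive words, such as yellow fur, mouse, long ears, and blush. With a Pikachu embedding, two words are enough: Pikachu and happy. The embedding packs all of Pikachu's descriptive features, so we no longer have to type many words each time to control the picture.

In everyday use, embeddings are typically used to control a character's pose and features, or to produce a specific art style. Compared with other models such as LoRA, an embedding is only tens of KB rather than hundreds of MB or several GB. Its fidelity is somewhat lower than a LoRA's, but it is more convenient to store and use.

In short, embeddings make it easier to generate what we expect without typing a large number of descriptive words by hand. A few commonly used embeddings:
- badhandv4: trigger word badhandv4
- EasyNegativeV2: trained for anime models. It addresses a range of negative issues such as tangled limbs, muddled colors, and abnormal grayscale. Trigger word easynegative.
- Deep Negative V1.x: trained for realistic models. It addresses issues including wrong human anatomy, unpleasant color schemes, and inverted spatial structure. Trigger word NG_DeepNegative_V1_75T.
- CharTurnerV2: basic pattern A character turnaround of a (X) wearing (Y). The earlier this tag appears, the higher its weight. You can also add Multiple views of the same character in the same outfit to get several views of the same character and outfit. It does not work well alone and gives better results with the character-turnaround LoRA CharTurnerBeta - Lora.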
  
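4.3.1 Specific characters (with reverse tag lookup)

For example, Corneo's D.va on LiblibAI is trained on D.va, a popular character from Overwatch. After downloading, put it in the stable-diffusion-webui/embeddings folder and activate it with its trigger word in the prompt.

In this embedding's model card, the author says the trigger word is corneo_dva and recommends a weight of 0.9 to 0.95, so we can write (corneo_dva:0.95). Adding a description of the character to the prompt also makes generation more accurate, so we upload one of the author's showcase images, interrogate its tags, and paste them into the prompt.

In img2img, there are two algorithms for interrogating the tags of an image, CLIP and DeepBooru. The latter is faster and more accurate. After interrogation, filter once and keep only the accurate descriptors.
If manual filtering is too tedious, open the Tagger tab and use the tagger extension instead. After you upload an image, each inferred tag is shown with its confidence. There is also a sensitive entry that gives a safety rating. We can manually set the confidence threshold to 0.8 and click interrogate again, which keeps only tags with confidence > 0.8.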
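
The threshold step can be expressed as a short filter over tag scores. This is a sketch with made-up scores: the dictionary below is illustrative data, not real tagger output, and skipping the sensitive rating entry is an assumption based on the description above.

```python
def filter_tags(scores: dict[str, float], threshold: float = 0.8,
                skip: tuple[str, ...] = ("sensitive",)) -> str:
    """Keep tags whose confidence is above the threshold, highest first."""
    kept = [(tag, score) for tag, score in scores.items()
            if score > threshold and tag not in skip]
    kept.sort(key=lambda item: item[1], reverse=True)
    return ", ".join(tag for tag, _ in kept)


if __name__ == "__main__":
    # Illustrative confidences only
    example_scores = {
        "1girl": 0.99,
        "bodysuit": 0.91,
        "brown hair": 0.84,
        "smile": 0.45,
        "sensitive": 0.88,   # rating entry, not a content tag
    }
    print("(corneo_dva:0.95), " + filter_tags(example_scores))
```

The filtered tags are then appended after the embedding's trigger word, as in the prompt printed by the example.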
  
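4.3.2 Character turnaround

A while ago, many very polished 3D character examples circulated online. They were made with the CharTurner embedding, which is simply trained on images that place the same character facing several directions side by side. In the model card, the author gives the basic activation pattern: A character turnaround of a (X) wearing (Y). The earlier this tag appears, the higher its weight. You can also add Multiple views of the same character in the same outfit to get several views of the same character and outfit. A prompt can enable multiple embeddings at once, and you have to judge the combined effect yourself.

The CharTurner embedding still has many shortcomings: it cannot precisely control character details or the turning motion. LoRAs partly solve this. For example, CharTurnerBeta - Lora gives a better turnaround and can be combined with the embedding.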
  
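4.3.3 Fixing bad cases

Even now, Stable Diffusion still tends to draw hands and feet wrong, sometimes with extra hands or feet. Several highly ranked embeddings on Civitai address this. They record a range of ways in which the AI draws things wrong. Putting them in the negative prompt box activates them and avoids such mistakes to some extent.
- badhandv4: trigger word badhandv4
- EasyNegativeV2: trained for anime models. It addresses a range of negative issues such as tangled limbs, muddled colors, and abnormal grayscale. Trigger word easynegative.
- Deep Negative V1.x: trained for realistic models. It addresses issues including wrong human anatomy, unpleasant color schemes, and inverted spatial structure. Trigger word NG_DeepNegative_V1_75T.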
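
4.4 LoRa

Here are a few LoRAs I like:
- Adjuster clothing add/remove LoRA: adds or removes clothing
- Detail Tweaker LoRA: increases or reduces detail while keeping the overall style and character. It works with various base models (anime and realistic), style LoRAs, character LoRAs, and so on. Weight range: -2 to 2
- Polaroid LoRA: keeps only the look of a Polaroid photo, without the white border.
- CharTurnerBeta - Lora: trigger word charturnbetalora, suggested weight 0.2 to 0.4. It works better together with the CharTurnerV2 embedding.
- 娜乌斯嘉 character LoRA, 墨幽 character LoRA, FilmGirl film-style LoRA, 现代修仙传 LoRA, 国风未来 LoRA, 汉服唐风 LoRA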
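
4.4.1 About LoRa

A LoRA is similar in nature to an embedding, but because the file carries far more trained information, it reproduces characters and fine details more faithfully.

To activate a LoRA, write <lora:lora_filename:weight> in the prompt, where lora_filename is the file name (without extension) of the LoRA you want to enable. For example, this is how you enable the character-turnaround LoRA CharTurnerBeta. Some LoRAs also have trigger words, meaning the author reinforced training on that tag. Using both together strengthens the effect.

The base model and trigger words for a LoRA can be found in the model information provided by its author.

Note that LoRAs are trained on mixed image sources, so they usually shift the art style slightly. Lowering the weight suppresses this: the higher the weight, the stronger the influence on the art style. In general, keep the weight between 0.3 and 1.

For best results, choose suitable positive and negative prompts for each LoRA and tune them together with the weight. It also helps to learn from other authors' experience and techniques.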
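
The activation tag and its weight can be built and read back with a couple of helpers. This is a minimal sketch, assuming the <lora:filename:weight> prompt syntax used by the webui. The helper names are made up, and the file name CharTurnerBeta.safetensors is a guess at how the downloaded file might be named.

```python
import re

# Matches tags such as <lora:CharTurnerBeta:0.3>
LORA_TAG = re.compile(r"<lora:([^:<>]+):(-?\d+(?:\.\d+)?)>")


def lora_tag(filename: str, weight: float = 1.0) -> str:
    """Build the prompt tag for a LoRA file, dropping a known extension."""
    name = filename
    for ext in (".safetensors", ".pt"):
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    return f"<lora:{name}:{weight:g}>"


def find_loras(prompt: str) -> list[tuple[str, float]]:
    """List (name, weight) for every LoRA tag found in a prompt."""
    return [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]


if __name__ == "__main__":
    tag = lora_tag("CharTurnerBeta.safetensors", 0.3)
    prompt = f"1girl, white jacket, {tag}, charturnbetalora"
    print(prompt)
    print(find_loras(prompt))
```

Reading the tags back is handy when checking a shared prompt: it shows at a glance which LoRAs were used and how strongly.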
      
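
4.4.2 Loading extra networks

Click the small red button in the txt2img tab to show the extra networks panel.

These models have no preview image by default. You can load one, generate an image, and then click to replace the preview with the currently generated image. Once set, an image with the same name is created next to the model file, and you can delete it if you don't want it. You can also pick any image and rename it to match the model. After a refresh it becomes the model's cover image.

Under Settings > Extra Networks you can adjust details such as whether models are shown as cards or thumbnails, the card width and height, and the default LoRA weight.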
      
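
4.4.3 Loading via Additional Networks

GitHub: sd-webui-additional-networks

With the methods above, the LoRA is written explicitly in the prompt, so it is visible when the image is shared. There is another way to load LoRAs, meant for using several at once: click Additional Networks at the bottom of the page to enable up to five LoRAs and set a weight for each.
By default, Additional Networks reads LoRA files from the models/lora folder inside the extension's own directory (extensions/sd-webui-additional-networks/models/lora). In Settings > Additional Networks, the first row lets you point it to the default LoRA install path instead.
Loading LoRAs this way is fully independent of the prompt, which keeps the prompt readable. The downside is that LoRA information loaded this way is not shown when the image is shared. So when you copy the parameters of an image you like and cannot reproduce it, the author may have loaded LoRA files this way.

Additional Networks also adds a mask feature for LoRAs, so that they can act on specific regions of the image. More on this later.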
      
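
4.4.4 LoRA in practice

LoRA applications fall into the following six types:
1. Character LoRA: recommended weight 0.7-0.8.
  Take the lucy LoRA as an example. We first run tagger on the cover image and then generate directly with txt2img. Even with such a detailed description, the AI cannot reproduce an accurate lucy. Once the lucy LoRA is added, the result immediately looks much closer, because this LoRA was trained on many images of lucy and conveys the information more accurately.
  For example, our interrogated prompt includes a white jacket, but there are countless white jackets in the world and the AI does not know which one to draw. With the LoRA, the AI can extract the key information and draw it accurately.
  Combining a realistic base model with an anime character LoRA gives a real-life cosplayer look. Together with ControlNet, covered later, you can also design the character's pose and composition and create your own work.
  Below are results with several other models (from 《AI 绘画融合于工作流的案例和经验》):
  - Character LoRA + sticker model waves-chibi-style
  - Character LoRA + big-head doll model bigheaddoll_v1
  - Character LoRA + traditional Chinese style model moxin1.0
  - Character LoRA + Ghibli model StudioGhibliStyle

2. Style LoRA (art-style): weight 0.2-0.3. Too high a weight makes the character LoRA lose some of its original features.
  Take fashion girl, which is very popular on Civitai. The author trained it on 100 photos of fashionable girls that match his taste, so generated female characters look more appealing. Trigger words are fashi-girl, red lips, mature female, makeup. Similar ones include FilmGirl film style and 花想容/Chinese style, which let you get the style you like.

3. Painting-style LoRA: recommended weight 0.2-0.4.
  For example, Ghibli Style LoRA, used together with the trigger word ghibli style, produces the look of Studio Ghibli (Hayao Miyazaki). This look can be summarized as picture-book character design, rich gouache-like colors, and exquisitely detailed backgrounds.

4. Concept LoRA
  Take Gacha splash LORA. It is trained on the polished splash art shown when drawing cards in gacha games, so images generated with it take on that splash-art style. Concept LoRAs are more demanding about how the prompt is written. Read the model card carefully before use and refer to the author's example images. Similar concept LoRAs include tarot cards, mugshot lora (file photos), and 国风未来.

5. Clothing LoRA: too high a weight tends to produce missing body parts, because this kind of LoRA is trained on clothes.
  For example, to generate mecha-style work, searching for mecha turns up many mecha LoRAs. Other examples include 机械风暴 lora, holographic clothing, and 汉服唐风 lora.
  Finally, when you want to strengthen a particular quality in your work, you can stack several LoRAs of the same type, for example several mecha LoRAs to generate a mecha-style image. By adjusting the weight of each LoRA, you can make the work lean toward one of them.

6. Object: specific elements. This type can be used for product design, product blueprints, and so on.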
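
The weight ranges suggested above can be kept in a small table and applied when stacking several LoRAs. This is only a sketch: the ranges are the ones quoted in this section, types without a stated range are left unchanged, and the LoRA file names in the example are hypothetical.

```python
# Recommended weight ranges quoted in this section
RECOMMENDED_WEIGHTS = {
    "character": (0.7, 0.8),
    "style": (0.2, 0.3),
    "painting_style": (0.2, 0.4),
}


def clamp_weight(kind: str, weight: float) -> float:
    """Pull a weight into the recommended range for its LoRA type, if one is known."""
    bounds = RECOMMENDED_WEIGHTS.get(kind)
    if bounds is None:
        return weight            # no recommendation recorded: keep as given
    low, high = bounds
    return min(max(weight, low), high)


def stack_loras(base_prompt: str, loras: list[tuple[str, str, float]]) -> str:
    """Append one <lora:name:weight> tag per (name, kind, weight) entry."""
    tags = [f"<lora:{name}:{clamp_weight(kind, weight):g}>" for name, kind, weight in loras]
    return ", ".join([base_prompt, *tags])


if __name__ == "__main__":
    print(stack_loras("1girl, white jacket", [
        ("lucy", "character", 1.0),            # hypothetical file names
        ("ghibli_style", "painting_style", 0.3),
        ("mecha_armor", "clothing", 0.6),
    ]))
```

Treat the clamped values as starting points: the model card of each LoRA remains the better guide when it states its own range.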
        
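
4.4.5 Inpainting + LoRA

A more advanced approach is to bring a LoRA into the image through inpainting. For example, to add a helmet to a sci-fi girl image, we inpaint the generated image: paint over the head and a little beyond it to give the AI some room to work, then inpaint with a helmet LoRA. There are two ways to redraw:

Whole picture: keep most of the prompt and parameters unchanged and only add the helmet LoRA's prompt words and trigger words.
Only masked: remove the earlier content words and keep only the LoRA-related prompt words and trigger words. The redraw is then more precise.
What is the benefit? A helmet LoRA only concerns a small part of the picture. Forcing it onto the whole image has some chance of interfering with the base model's ability to produce a good picture, so inpainting becomes a good standalone solution. This pattern of drawing the full image without the LoRA and the local area with the LoRA also applies to clothing, product, and other LoRAs.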
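
The two redraw strategies differ only in which prompt words are kept, which a short helper can make explicit. This is a sketch: the mode names, the function name, and the helmet LoRA file name are all made up for illustration.

```python
def inpaint_prompt(mode: str, original_tags: list[str],
                   lora_name: str, lora_weight: float, lora_tags: list[str]) -> str:
    """Build the prompt for inpainting with a LoRA.

    whole_picture: keep the original prompt and append the LoRA part.
    only_masked:   drop the original content words, keep only the LoRA part.
    """
    lora_part = [f"<lora:{lora_name}:{lora_weight:g}>", *lora_tags]
    if mode == "whole_picture":
        return ", ".join([*original_tags, *lora_part])
    if mode == "only_masked":
        return ", ".join(lora_part)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    original = ["1girl", "sci-fi", "city street", "neon lights"]
    for mode in ("whole_picture", "only_masked"):
        print(mode, "->", inpaint_prompt(mode, original, "scifi_helmet", 0.8, ["helmet"]))
```

In only-masked mode the prompt contains just the LoRA tag and its trigger word, so the redraw focuses on the masked head region.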
        
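
4.5 hypernetwork

A hypernetwork has an effect similar to a LoRA. The difference is that hypernetworks are generally used to adjust the overall art style, in finer steps than switching checkpoints: not the large gap between a realistic model and an anime model, but a small difference like the one between Van Gogh and Monet.

Take WavenChibiStyle, a cute chibi style, as an example. To use it, open the Settings tab, choose Extra Networks on the left, select the hypernetwork model in the hypernetwork dropdown, and save the settings. Keeping the character settings unchanged, we remove all scene words, use a solid-color background and a square resolution, and generate an image.

Today the role of hypernetworks can be partly taken over by LoRA, and their results are not as good as LoRA's, but when you need a specific style a hypernetwork is still an option.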
        
        最后，装了tag complete这个补全tag的插件之后，输入