Stable Diffusion + Google Colaboratory Is the Best Way to Start, Even on a Low-Spec PC [Image Generation AI]

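Image generation AI such as Midjourney, which lets anyone easily create high-quality pictures, has been getting a lot of attention.

Stable Diffusion is free to start with, but it demands a fairly capable PC, so it has not been something anyone could pick up and use right away.

With a cloud environment, however, you can get started even on a low-spec PC or a smartphone. This article shows how to get started with Stable Diffusion using Google Colaboratory.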

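What is Stable Diffusion?

Stable Diffusion is an image generation AI that turns input text into images.

Users are granted the rights to use the images they generate. However, the model was trained on a large number of images from the internet, including copyrighted ones, which has raised ethical and legal concerns, so use it at your own risk.

It is released as an open-source project, so you can start using it for free.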

However, it requires an NVIDIA GPU with at least 4 GB of VRAM and at least 10 GB of free disk space, so it is hard to run without a PC that meets these specs.
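The requirements above can be written as a quick check. This is only a minimal sketch: the thresholds are the figures quoted in this article, and the function names `unmet_requirements` and `current_free_disk_gb` are hypothetical helpers made up for illustration, not part of Stable Diffusion or any library.

```python
import shutil

MIN_VRAM_GB = 4    # minimum VRAM quoted in the article
MIN_DISK_GB = 10   # minimum free disk space quoted in the article


def unmet_requirements(vram_gb, free_disk_gb, has_nvidia_gpu):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if not has_nvidia_gpu:
        problems.append("no NVIDIA GPU detected")
    if vram_gb < MIN_VRAM_GB:
        problems.append(f"VRAM {vram_gb} GB is below {MIN_VRAM_GB} GB")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"free disk {free_disk_gb:.1f} GB is below {MIN_DISK_GB} GB")
    return problems


def current_free_disk_gb(path="."):
    """Free space on the drive that contains `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024**3


# Example: a machine with 2 GB of VRAM and no NVIDIA GPU fails at least two checks.
for problem in unmet_requirements(2, current_free_disk_gb(), False):
    print(problem)
```

The VRAM size and GPU presence are passed in by hand here because how to detect them depends on the machine.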

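What is Google Colaboratory?

Google Colaboratory is a web service that Google provides for machine learning education and research. It gives you a Python runtime with no environment setup.

Python runs in the cloud, so you do not need a high-spec PC, and the service is free to use.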

Getting started with Stable Diffusion on Google Colaboratory

Preparation

Create a new notebook in Google Colaboratory.

[File] -> [New notebook]

Set the notebook name. Click Untitled0.ipynb at the top and change it to stable-diffusion.ipynb.

Open [Runtime] -> [Change runtime type], change the hardware accelerator from None to GPU, and save.
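After switching the runtime, it can be worth confirming that a GPU is actually attached before running the long setup cell. The following is a rough sketch, assuming the `nvidia-smi` command is on the PATH when a GPU runtime is active; `gpu_summary` is a hypothetical helper name, not a Colab API.

```python
import shutil
import subprocess


def gpu_summary():
    """Return a 'name, total memory' line per GPU, or None if no GPU is visible."""
    # Without the NVIDIA driver tools there is nothing to query.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


summary = gpu_summary()
print(summary if summary else "No GPU found: check the runtime type setting.")
```

If this prints the fallback message, the hardware accelerator is probably still set to None.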

Launch

Paste the following code into the code cell and click the run button.

!curl -Lo memfix.zip https://github.com/nolanaatama/sd-webui/raw/main/memfix.zip
!unzip /content/memfix.zip
!apt install -qq libunwind8-dev
!dpkg -i *.deb
%env LD_PRELOAD=libtcmalloc.so
!rm *
!pip install --upgrade fastapi==0.90.1
!git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
!git clone https://github.com/nolanaatama/sd-webui-tunnels /content/stable-diffusion-webui/extensions/sd-webui-tunnels
!git clone https://github.com/Mikubill/sd-webui-controlnet /content/stable-diffusion-webui/extensions/sd-webui-controlnet
!git clone https://github.com/DominikDoom/a1111-sd-webui-tagcomplete /content/stable-diffusion-webui/extensions/a1111-sd-webui-tagcomplete

%cd /content/stable-diffusion-webui/extensions
!git clone https://github.com/Katsuyuki-Karasawa/stable-diffusion-webui-localization-ja_JP.git


!curl -Lo /content/stable-diffusion-webui/models/Stable-diffusion/AOM3_orangemixs.safetensors https://huggingface.co/WarriorMama777/OrangeMixs/resolve/main/Models/AbyssOrangeMix3/AOM3_orangemixs.safetensors

!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_canny.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_canny-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_depth.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_depth-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_hed-fp16.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_hed-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_mlsd-fp16.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_mlsd-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_normal-fp16.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_normal-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_openpose-fp16.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_openpose-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_scribble-fp16.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_scribble-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_seg-fp16.safetensors https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_seg-fp16.safetensors
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_canny_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_canny_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_color_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_color_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_depth_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_depth_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_keypose_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_keypose_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_openpose_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_openpose_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_seg_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_seg_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_sketch_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_sketch_sd14v1.pth
!curl -Lo /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/t2iadapter_style_sd14v1.pth https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_style_sd14v1.pth
import shutil
shutil.rmtree('/content/stable-diffusion-webui/embeddings')
%cd /content/stable-diffusion-webui
!git checkout 0cc0ee1
!git clone https://huggingface.co/nolanaatama/embeddings

!COMMANDLINE_ARGS="--share --disable-safe-unpickle --no-half-vae --xformers --reinstall-xformers --enable-insecure-extension- --gradio-queue --remotemoe" REQS_FILE="requirements.txt" python launch.py

The log is displayed below the cell. Click one of the URLs in it to use Stable Diffusion.

The URL at the bottom is the recommended one.

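The Stable Diffusion screen

Roughly speaking, the parts of the screen work as follows.

Prompts are often called "spells." Enter the words you want generated in the prompt field, separated by commas, and the words you want excluded in the negative prompt field. Press the Generate button, wait a moment, and the image is generated.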
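Because a prompt is just a comma-separated list of words, it can be convenient to keep the words in a Python list and join them before pasting. This is a small sketch; `build_prompt` is a hypothetical helper for illustration, and dropping repeated words is this sketch's own choice, not something the web UI requires.

```python
def build_prompt(tags):
    """Join tags into a comma-separated prompt, skipping blanks and repeats."""
    seen = set()
    cleaned = []
    for tag in tags:
        text = tag.strip()
        key = text.lower()
        # Skip empty entries and tags already added (case-insensitive).
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return ", ".join(cleaned)


prompt = build_prompt(["solo", " short hair ", "dark background", "solo", ""])
negative_prompt = build_prompt(["low res", "watermark", "text"])
print(prompt)
print(negative_prompt)
```

The two resulting strings go into the prompt field and the negative prompt field respectively.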

Example outputs

Prompt

super fine illustration, an extremely cool and beautiful man, highly detailed beautiful face and eyes, look at viewer, cowboy shot, short hair, solo, dynamic angle, beautiful detailed short crystal wear , dark background, there are many luminous crystals in background, dynamic angle

Negative prompt

flat color, flat shading, retro style, poor quality, bad face, bad fingers, bad anatomy, missing fingers, low res, cropped, signature, watermark, username, artist name, text

Generated image

Prompt

body,fullbody,1girl,gothic lolita,red hair,long hair,red eyes,gothic dress,smile,simple background,celtic hair ornaments,ahoge,white background,large breasts,long skirt,black frill,bare shoulder,black chinese dress,red long skirt,wide sleeves,floating sleeves,hair bun,open mouth

Negative prompt

flat color, flat shading, retro style, poor quality, bad face, bad fingers, bad anatomy, missing fingers, low res, cropped, signature, watermark, username, artist name, text