torch.cuda.amp: mixed precision on CUDA

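torch.cuda.amp.autocast: a context manager (also usable as a decorator) that runs the operations inside its scope in mixed precision. Operations that benefit from reduced precision and tolerate it, such as matrix multiplications and convolutions, run in FP16, while numerically sensitive operations, such as softmax, stay in FP32. The output of an FP16 operation stays in FP16; autocast does not convert results back to FP32 when the block exits, so call .float() explicitly where an FP32 result is needed. Tensors created outside the block and the model's stored parameters are not modified. autocast also depends on the hardware and software environment: without a CUDA-enabled build of PyTorch and a GPU it emits a warning and disables itself, and on GPUs without fast FP16 support the lower precision may bring little or no speedup.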
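The per-operation behavior can be checked by looking at tensor dtypes. The following is a minimal sketch, not taken from the original post: it uses the device-agnostic torch.autocast entry point and assumes FP16 on a CUDA GPU, falling back to bfloat16 on CPU so that it also runs on a machine without a GPU. The helper name matmul_dtypes is hypothetical.

```python
import torch


def matmul_dtypes():
    """Return (input dtype, dtype inside autocast, dtype outside, low dtype)."""
    use_cuda = torch.cuda.is_available()
    device_type = "cuda" if use_cuda else "cpu"
    # Assumption: float16 on GPU; bfloat16 is used as the CPU stand-in.
    low = torch.float16 if use_cuda else torch.bfloat16

    a = torch.randn(4, 4, device=device_type)  # float32 by default
    b = torch.randn(4, 4, device=device_type)

    with torch.autocast(device_type=device_type, dtype=low):
        inside = torch.mm(a, b)   # matmul is eligible for low precision

    outside = torch.mm(a, b)      # same op outside the block: float32
    return a.dtype, inside.dtype, outside.dtype, low


if __name__ == "__main__":
    print(matmul_dtypes())
```

The inputs keep their FP32 dtype throughout; only the result produced inside the block is low precision, and it stays that way after the block exits.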

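contextlib.nullcontext(): a context manager that does nothing on entry or exit. It serves as a placeholder where a with statement is required but no setup or teardown is needed, so the same with block can be used whether or not a real context manager applies, without duplicating the code inside it.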
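A small standard-library sketch of the placeholder pattern, with hypothetical names (logged, run): the body of the with block is written once, and a flag decides whether a real context manager or nullcontext wraps it.

```python
import contextlib


@contextlib.contextmanager
def logged(events):
    """A real context manager: records entry and exit in the given list."""
    events.append("enter")
    try:
        yield
    finally:
        events.append("exit")


def run(enable_logging):
    events = []
    # Pick the real context manager or the do-nothing placeholder.
    ctx = logged(events) if enable_logging else contextlib.nullcontext()
    with ctx:
        events.append("work")  # the body is written only once
    return events


if __name__ == "__main__":
    print(run(True))
    print(run(False))
```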

import contextlib

import torch

# Method of the model class; self.device is assumed to be the model's device.
def maybe_autocast(self, dtype=torch.float16):
    # On CPU, return a no-op context instead of autocast.
    # On GPU, use autocast with the given dtype (torch.float16 by default).
    enable_autocast = self.device != torch.device("cpu")

    if enable_autocast:
        return torch.cuda.amp.autocast(dtype=dtype)
    else:
        return contextlib.nullcontext()

# Usage inside the model's forward pass:
with self.maybe_autocast():
    image_embeds = self.ln_vision(self.visual_encoder(image))

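Here self.maybe_autocast() returns a context manager that controls whether the model runs under Automatic Mixed Precision (AMP). With AMP enabled, eligible operations compute in FP16 and produce FP16 activations, which can reduce computation time and activation memory. autocast itself does not change how parameters are stored: they keep their original dtype (FP32 by default) and are cast on the fly for FP16 operations, and gradients come out in the dtype of the parameters. Because FP16 has a narrower range and lower precision than FP32, AMP can affect model accuracy to some degree.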
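What autocast does and does not change can be verified on a single layer. This is a minimal sketch under stated assumptions, not code from the original model: it uses torch.autocast with FP16 on a CUDA GPU and bfloat16 as a stand-in on CPU, and the helper name inspect_dtypes is hypothetical.

```python
import torch
import torch.nn as nn


def inspect_dtypes():
    """Return dtypes of (weight, activation, weight gradient, low dtype)."""
    use_cuda = torch.cuda.is_available()
    device_type = "cuda" if use_cuda else "cpu"
    # Assumption: float16 on GPU; bfloat16 is used as the CPU stand-in.
    low = torch.float16 if use_cuda else torch.bfloat16

    layer = nn.Linear(8, 4).to(device_type)
    x = torch.randn(2, 8, device=device_type)

    with torch.autocast(device_type=device_type, dtype=low):
        out = layer(x)  # the linear op runs in low precision

    # Reduce in FP32 outside the autocast region, then backpropagate.
    out.float().sum().backward()
    return layer.weight.dtype, out.dtype, layer.weight.grad.dtype, low


if __name__ == "__main__":
    print(inspect_dtypes())
```

The activation is low precision, while the stored weight and its gradient remain FP32, because autocast only casts inputs at the point where each eligible operation runs.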
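What self.maybe_autocast() actually checks is the device the model is on, not whether the environment supports AMP: on any device other than the CPU it returns torch.cuda.amp.autocast, and on the CPU it returns a no-op nullcontext. This lets the same forward code run unchanged on CPU and GPU, with AMP applied automatically when the model is on a GPU.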
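To try the pattern end to end, here is a self-contained sketch. TinyEncoder, its proj layer, its device property, and run_demo are hypothetical stand-ins for the real model, which is not shown in this post. Two details differ from the method above and are choices of this sketch: the check compares the device type against "cuda" rather than testing for "not CPU", and it calls torch.autocast(device_type="cuda", ...), the entry point that newer PyTorch releases recommend over torch.cuda.amp.autocast.

```python
import contextlib

import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the model that owns maybe_autocast."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 4)

    @property
    def device(self):
        # Assumption: all parameters live on the same device.
        return next(self.parameters()).device

    def maybe_autocast(self, dtype=torch.float16):
        # Enable autocast only on CUDA devices ("cuda:0", "cuda:1", ...).
        if self.device.type == "cuda":
            return torch.autocast(device_type="cuda", dtype=dtype)
        # Everywhere else, fall back to a context manager that does nothing.
        return contextlib.nullcontext()

    def forward(self, x):
        with self.maybe_autocast():
            return self.proj(x)


def run_demo():
    """Run one forward pass and return (device type, output dtype)."""
    model = TinyEncoder()
    if torch.cuda.is_available():
        model = model.to("cuda")
    x = torch.randn(2, 8, device=model.device)
    return model.device.type, model(x).dtype


if __name__ == "__main__":
    print(run_demo())
```

On a CPU-only machine the output stays FP32 because the with block wraps nullcontext; on a CUDA device the linear layer runs under autocast and returns FP16.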

cuda_amp

https://johnson7788.github.io/2023/03/30/cuda-amp/

Author: Johnson

Published: March 30, 2023
