Using DeepSpeed
DeepSpeed documentation roundup
Estimating the GPU memory a model needs
from transformers import AutoModel
from deepspeed.runtime.zero.stage3 import estimate_zero3_model_states_mem_needs_all_live
model = AutoModel.from_pretrained("bigscience/T0_3B") # your huggingface model name or path
# Prints the estimated memory needed for ZeRO-3 model states on the given hardware setup
estimate_zero3_model_states_mem_needs_all_live(model, num_gpus_per_node=1, num_nodes=1)
Some notes
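The estimator above covers model states only (parameters, gradients, and optimizer states); activations are not included. As a rough cross-check, the sketch below applies the per-parameter accounting from the ZeRO paper for mixed-precision Adam: 2 bytes for fp16 parameters, 2 bytes for fp16 gradients, and 12 bytes for optimizer states. The function `rough_model_state_bytes_per_gpu` and the 3-billion parameter count are hypothetical and used only for illustration. DeepSpeed's own estimator uses its own constants and buffer factors, so its output will differ from these numbers.

```python
# Back-of-envelope sketch, NOT DeepSpeed's estimator.
# Assumption: mixed-precision Adam, accounted for as in the ZeRO paper.
BYTES_FP16_PARAMS = 2        # fp16 copy of the weights
BYTES_FP16_GRADS = 2         # fp16 gradients
BYTES_OPTIMIZER_STATES = 12  # fp32 weights + Adam momentum + Adam variance


def rough_model_state_bytes_per_gpu(total_params: int, num_gpus: int, zero_stage: int = 3) -> float:
    """Approximate per-GPU bytes for model states at a given ZeRO stage (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError("zero_stage must be 0, 1, 2, or 3")

    params = BYTES_FP16_PARAMS * total_params
    grads = BYTES_FP16_GRADS * total_params
    optim = BYTES_OPTIMIZER_STATES * total_params

    # Each ZeRO stage partitions one more kind of state across the GPUs.
    if zero_stage >= 1:
        optim /= num_gpus
    if zero_stage >= 2:
        grads /= num_gpus
    if zero_stage >= 3:
        params /= num_gpus
    return params + grads + optim


if __name__ == "__main__":
    total_params = 3_000_000_000  # assumed round figure for a ~3B-parameter model
    for gpus in (1, 2, 4, 8):
        gib = rough_model_state_bytes_per_gpu(total_params, gpus) / 2**30
        print(f"{gpus} GPU(s), ZeRO-3: about {gib:.1f} GiB of model states per GPU")
```

A figure like this is useful for quickly ruling out a hardware setup; for real planning, rely on the DeepSpeed estimator output and leave headroom for activations.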