MONAI Label 1.5.0 Not Recognizing PyTorch 2.7.1 With GPU On Windows

by gitftunila

Introduction

This article addresses an issue where MONAI Label with MONAI 1.5.0 fails to recognize PyTorch 2.7.1 with GPU support on Windows 10 and Windows 11. The problem manifests as MONAI Label and PyTorch reverting to CPU-only operation despite a successful GPU installation. This can significantly impact the performance of medical imaging AI workflows, which heavily rely on GPU acceleration for training and inference. This article provides a comprehensive analysis of the issue, including server logs, reproduction steps, and environment details. We will also explore potential causes and solutions to help users effectively utilize MONAI Label with GPU support on Windows systems.

Problem Description

I encountered a problem while using MONAI Label with MONAI 1.5.0 on both Windows 10 and Windows 11. I had installed PyTorch 2.7.1 with CUDA 11.8 and CUDA 12.6 respectively, but MONAI Label did not recognize the GPU during training. The server logs reported the PyTorch version as 2.6.0+cpu, even though I had installed a GPU build. The issue also affected PyTorch itself: after installing MONAI Label, it could no longer detect the GPU. The same configuration worked flawlessly on Ubuntu 22.04, where MONAI Label correctly recognized PyTorch 2.7.1 with CUDA 11.8.

Detailed Observations

  1. Incorrect PyTorch Version: When running python -c "import monai; monai.config.print_config()", the output showed Pytorch version: 2.6.0+cpu instead of the expected 2.7.1+cu118 or 2.7.1+cu126.
  2. GPU Detection Failure: After installing MONAI Label, PyTorch itself failed to recognize the GPU. Before installing MONAI Label, python -c "import torch; print(torch.cuda.is_available())" would return True, but after installation, it would return False.
  3. Training on CPU: The server logs during training showed that the device being used was cpu, indicating that the GPU was not being utilized.
  4. CUDA Error in Logs: The traceback in the server logs revealed a RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. This further confirms that the GPU was not being detected during the training process.
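
The key symptom above is the local build tag in the version string: a `+cpu` suffix means a CPU-only wheel, while `+cuXYZ` means a CUDA build. A small helper (illustrative only, not part of MONAI Label) can make this check explicit:

```python
# Classify a torch version string by its local build tag (illustrative):
# "+cpu" means a CPU-only wheel, "+cuXYZ" means a CUDA build.
def build_flavor(version: str) -> str:
    if "+" not in version:
        return "unknown"
    local = version.split("+", 1)[1]
    if local == "cpu":
        return "cpu-only"
    if local.startswith("cu"):
        return "cuda " + local[2:]
    return local

print(build_flavor("2.6.0+cpu"))    # cpu-only  (what the logs showed)
print(build_flavor("2.7.1+cu118"))  # cuda 118  (what was expected)
```

Running `build_flavor(torch.__version__)` on the affected machine would return "cpu-only", confirming that the CUDA wheel was replaced rather than misconfigured.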

Server Logs Analysis

The server logs provide crucial insights into the problem. Here's a breakdown of the key log snippets:

(monai_cu26) C:\Users\chi.zhang\Documents>monailabel start_server --app apps/radiology --studies datasets/Task09_Spleen/imagesTr --conf models deepedit
Using PYTHONPATH=...
[2025-07-08 14:04:28,299] [7508] [MainThread] [INFO] (__main__:285) - USING:: version = False
...
[2025-07-08 14:04:33,915] [7508] [MainThread] [INFO] (uvicorn.error:216) - Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

This section shows the server starting up and initializing the application with the specified configurations. The server appears to launch without any immediate errors.

[2025-07-08 14:05:45,536] [7508] [MainThread] [INFO] (monailabel.utils.async_tasks.task:41) - Train request: {'model': 'deepedit', 'name': 'train_01', 'pretrained': True, 'device': 'cpu', ...}

This log indicates that a training request has been initiated, but the device is set to cpu, which is a clear sign that the GPU is not being used.

[2025-07-08 14:05:54,605] [3064] [MainThread] [ERROR] (ignite.engine.engine.SupervisedTrainer:992) - Engine run is terminating due to exception: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False.

This critical error message confirms that the training process is failing because CUDA is not available. The system is trying to use CUDA, but the torch.cuda.is_available() check returns False.

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

The traceback provides further detail: the error arises while deserializing a PyTorch object that was saved on a CUDA device, at a point where CUDA is not detected on the loading machine.
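
The error message itself points at the standard mitigation: pass `map_location` to `torch.load` so storages saved on a CUDA device can be remapped. A minimal, self-contained sketch (the checkpoint here is a stand-in created on the spot, not a real MONAI Label model):

```python
import torch

# Create a tiny stand-in checkpoint for illustration.
torch.save({"w": torch.zeros(2)}, "ckpt.pt")

# Remap storages to whichever device is actually available, so a
# checkpoint saved on CUDA still loads on a CPU-only machine.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
state = torch.load("ckpt.pt", map_location=device)
print(state["w"].device)
```

Note this only masks the symptom: it lets the checkpoint load, but training would still run on the CPU. The real fix is restoring a CUDA-enabled PyTorch build.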

Steps to Reproduce the Behavior

The following steps can be used to reproduce the issue:

  1. Create a Conda Environment:

    conda create -n monai python=3.9
    conda activate monai
    
  2. Upgrade Pip, Setuptools, and Wheel:

    python -m pip install --upgrade pip setuptools wheel
    
  3. Install PyTorch with CUDA:

    pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    
  4. Verify GPU Availability in PyTorch:

    python -c "import torch; print(torch.cuda.is_available())"
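
A plausible cause of this symptom (an inference from the logs, not a confirmed diagnosis) is that installing MONAI Label pulls in a CPU-only torch wheel from PyPI, replacing the CUDA build — the 2.6.0+cpu version string is consistent with pip's resolver downgrading torch. If so, one workaround is to reinstall the CUDA wheels after MONAI Label:

```shell
# Force-reinstall the CUDA wheels so a CPU-only torch pulled in as a
# dependency does not remain in place (cu118 index is illustrative).
pip3 install --force-reinstall torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Re-check the build tag and GPU visibility afterwards.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```

If the version string reverts to a `+cpu` build again after any subsequent `pip install`, pinning the torch version in the same install command may help keep the resolver from swapping it out.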