docs/deployments/platforms/virtualization/proxmox/Detecting and Removing Orphaned VM Disks.md
nicole 430a2857a6
2026-06-04 13:46:20 -06:00


Proxmox VE Shared iSCSI/LVM Orphan Disk Audit and Cleanup Procedure

Purpose

This procedure describes how to identify and safely remove orphaned Proxmox VE virtual machine disks from shared iSCSI-backed LVM storage.

It is intended for environments where:

  • Proxmox VE is clustered.
  • Multiple Proxmox nodes access the same shared iSCSI LUN.
  • The shared storage is exposed to Proxmox as LVM storage.
  • VM disks are stored as LVM logical volumes.
  • Some volumes may remain after VM disk deletion, failed migrations, failed resizes, storage UI inconsistencies, or manual recovery work.

The goal is to reclaim storage space without accidentally deleting disks that are still attached to running or stopped VMs.


Scope

This document focuses on the following storage type:

Proxmox storage type: lvm
Backing storage:      shared iSCSI
Volume group:         vg_proxmox_iscsi
Storage ID example:   iscsi-cluster-lvm

Adjust the storage ID and volume group names as needed for your environment.


Safety Requirements

!!! danger "Never delete based on the Storage UI alone"
    The Proxmox storage UI may show a volume as belonging to a VM because its name follows the pattern vm-<vmid>-disk-<n>. That does not prove the disk is currently attached to the VM.

    Always verify against VM configuration files and active QEMU processes before deleting.

!!! warning "Run the audit before running any cleanup commands"
    The audit scripts in this document are read-only. The cleanup commands are destructive. Do not run cleanup commands until the audit output has been reviewed.

!!! warning "Snapshot volumes require extra caution"
    Volumes named like the following may be part of a snapshot chain:

    ```text
    snap_vm-<vmid>-disk-<n>_<snapshot-name>
    ```

    Do not remove snapshot volumes manually unless you have verified that the VM and snapshot are no longer known to Proxmox, no backing chain references them, and no QEMU process has them open.

!!! note "Shared storage does not mean shared config"
    On each node, /etc/pve/qemu-server lists only the VMs currently assigned to that node. An audit run from a single node can therefore falsely report disks that belong to VMs on other nodes as orphaned.

    Run the confirmation script on every node in the cluster.

Terms

| Term | Meaning |
|---|---|
| Attached disk | A disk volume referenced in a VM config, such as scsi0, sata0, virtio0, efidisk0, or tpmstate0. |
| Orphan disk | A storage volume that exists on shared storage but is not referenced by any VM config on any node and is not opened by any active process. |
| Volume ID | Proxmox storage identifier, such as iscsi-cluster-lvm:vm-107-disk-1.qcow2. |
| LV | LVM logical volume, such as /dev/vg_proxmox_iscsi/vm-107-disk-1.qcow2. |
| Snapshot chain | A chain of qcow2 backing files or Proxmox snapshot volumes. |

Phase 1: Identify Storage Names

Run this on any Proxmox node:

pvesm status
cat /etc/pve/storage.cfg
vgs

Identify the shared iSCSI/LVM storage.

Example:

Storage ID: iscsi-cluster-lvm
VG name:    vg_proxmox_iscsi

For the rest of this document, replace these values if your environment differs:

STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
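
If the VG name is not obvious from the output above, it can be read from the storage definition. The following is a minimal sketch, assuming the usual storage.cfg layout of a `<type>: <id>` header line followed by indented `key value` lines; the function name `vg_for_storage` is hypothetical and not part of Proxmox.

```bash
#!/usr/bin/env bash
# Print the vgname configured for an LVM-type storage ID.
# Usage: vg_for_storage <storage-id> [path-to-storage.cfg]
# Assumption: sections look like "lvm: <id>" followed by indented "vgname <vg>".
vg_for_storage() {
  local id="$1" cfg="${2:-/etc/pve/storage.cfg}"
  awk -v id="$id" '
    # A non-indented, non-comment line starts a new storage section.
    /^[^[:space:]#]/ { in_section = ($1 == "lvm:" && $2 == id) }
    # Inside the matching section, print the volume group and stop.
    in_section && $1 == "vgname" { print $2; exit }
  ' "$cfg"
}
```

For example, `VG="$(vg_for_storage iscsi-cluster-lvm)"` would set VG from the live configuration; an empty result means the storage ID was not found or is not of type lvm.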

Phase 2: Run the Storage Orphan Audit

Run the following script on one node that can see the shared LVM storage.

This script does not delete anything.

cat > /root/pve-iscsi-orphan-audit.sh <<'EOF'
#!/usr/bin/env bash
set -u

STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
OUT="/root/pve-iscsi-orphan-audit-$(hostname)-$(date +%Y%m%d-%H%M%S).txt"

{
  echo "===== PVE ISCSI ORPHAN AUDIT ====="
  echo "Host: $(hostname)"
  echo "Date: $(date)"
  echo "Storage: ${STORAGE}"
  echo "VG: ${VG}"

  echo
  echo "===== STORAGE STATUS ====="
  pvesm status 2>&1 | grep -E "^(Name|${STORAGE})" || true
  vgs "${VG}" 2>&1 || true

  echo
  echo "===== ALL VOLUMES IN ${STORAGE} ====="
  pvesm list "${STORAGE}" 2>&1 || true

  echo
  echo "===== ALL LVs IN ${VG} ====="
  lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "${VG}" 2>&1 || true

  echo
  echo "===== LOCAL VM CONFIG FILES ====="
  for conf in /etc/pve/qemu-server/*.conf; do
    [ -e "$conf" ] || continue
    echo
    echo "----- $conf -----"
    cat "$conf"
  done

  echo
  echo "===== REFERENCE ANALYSIS - LOCAL CONFIG FILES ONLY ====="
  printf '%-55s | %-8s | %-10s | %-8s | %-8s | %s\n' \
    "volume" "vmid" "referenced" "open" "size" "path"

  {
    pvesm list "${STORAGE}" 2>/dev/null | awk 'NR>1 {print $1}' | sed "s#^${STORAGE}:##"
    lvs --noheadings -o lv_name "${VG}" 2>/dev/null | awk '{print $1}'
  } | sort -u | while read -r vol; do
    [ -n "$vol" ] || continue

    case "$vol" in
      vm-*-disk-*|snap_vm-*-disk-*) ;;
      *) continue ;;
    esac

    vmid="unknown"
    if [[ "$vol" =~ ^vm-([0-9]+)-disk- ]]; then
      vmid="${BASH_REMATCH[1]}"
    elif [[ "$vol" =~ ^snap_vm-([0-9]+)-disk- ]]; then
      vmid="${BASH_REMATCH[1]}"
    fi

    ref="no"
    if grep -R -Fq "$vol" /etc/pve/qemu-server 2>/dev/null; then
      ref="yes"
    fi

    open="no"
    if lsof 2>/dev/null | grep -Fq "$vol"; then
      open="yes"
    fi

    size="$(lvs --noheadings -o lv_size "${VG}/${vol}" 2>/dev/null | awk '{$1=$1;print}')"
    path="$(lvs --noheadings -o lv_path "${VG}/${vol}" 2>/dev/null | awk '{$1=$1;print}')"

    printf '%-55s | %-8s | %-10s | %-8s | %-8s | %s\n' \
      "$vol" "$vmid" "$ref" "$open" "$size" "$path"
  done

  echo
  echo "===== DONE ====="
} | tee "$OUT"

echo
echo "Saved audit to: $OUT"
EOF

chmod +x /root/pve-iscsi-orphan-audit.sh
/root/pve-iscsi-orphan-audit.sh

The script writes a file similar to:

/root/pve-iscsi-orphan-audit-<node>-<timestamp>.txt

How to Read the First Audit

The most important section is:

REFERENCE ANALYSIS - LOCAL CONFIG FILES ONLY

Example:

volume                 | vmid | referenced | open | size
vm-107-disk-0.qcow2    | 107  | no         | no   | 4.00m
vm-107-disk-1.qcow2    | 107  | no         | no   | 256.04g
vm-107-disk-2.qcow2    | 107  | yes        | no   | 4.00m
vm-107-disk-3.qcow2    | 107  | yes        | no   | 256.04g

Interpretation:

| Field | Meaning |
|---|---|
| referenced=yes | The volume appears in a local VM config file. Do not delete. |
| referenced=no | The volume does not appear in local VM configs. It may be orphaned, but confirm across all nodes first. |
| open=yes | A process has the volume open. Do not delete. |
| open=no | No process on this node has the volume open. Still confirm across all nodes. |

!!! warning "Local reference analysis is not enough"
    If a VM runs on another cluster node, its config may not appear on the node where you ran the audit. This can make valid disks look orphaned.

    Continue to Phase 3 before deleting anything.
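
On a large volume group the reference analysis can run to hundreds of rows. As a rough aid, the rows that show referenced=no and open=no can be filtered out of the saved report. This sketch assumes the pipe-separated column order printed by the audit script above; `orphan_candidates` is a hypothetical helper name, and its output is only a list of candidates for Phase 3, not a deletion list.

```bash
#!/usr/bin/env bash
# Print volume names whose audit row shows referenced=no and open=no.
# Reads the audit report from the given file(s), or from stdin if none.
orphan_candidates() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    NF >= 5 {
      vol = trim($1); ref = trim($3); is_open = trim($4)
      # Only data rows for VM disk or snapshot volumes; this skips the header row.
      if (vol ~ /^(snap_)?vm-[0-9]+-disk-/ && ref == "no" && is_open == "no")
        print vol
    }
  ' "$@"
}
```

Usage would look like `orphan_candidates /root/pve-iscsi-orphan-audit-<node>-<timestamp>.txt`.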

Phase 3: Run Cluster-Wide Confirmation

Run the following script on every Proxmox node in the cluster.

This script is read-only.

cat > /root/pve-cluster-vm-confirm.sh <<'EOF'
#!/usr/bin/env bash
set -u

STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
OUT="/root/pve-cluster-vm-confirm-$(hostname)-$(date +%Y%m%d-%H%M%S).txt"

{
  echo "===== NODE ====="
  hostname
  date

  echo
  echo "===== CLUSTER RESOURCES - VMS ====="
  pvesh get /cluster/resources --type vm 2>&1 || true

  echo
  echo "===== LOCAL QM LIST ====="
  qm list 2>&1 || true

  echo
  echo "===== QEMU CONFIG FILES PRESENT ====="
  ls -la /etc/pve/qemu-server/ 2>&1 || true

  echo
  echo "===== QEMU CONFIG FILE CONTENTS ====="
  for conf in /etc/pve/qemu-server/*.conf; do
    [ -e "$conf" ] || continue
    echo
    echo "----- $conf -----"
    cat "$conf"
  done

  echo
  echo "===== ALL STORAGE VOLUMES ====="
  pvesm list "${STORAGE}" 2>&1 || true

  echo
  echo "===== ALL LVs ====="
  lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "${VG}" 2>&1 || true

} | tee "$OUT"

echo
echo "Saved to: $OUT"
EOF

chmod +x /root/pve-cluster-vm-confirm.sh
/root/pve-cluster-vm-confirm.sh

Collect the output file from each node.

Example for a three-node cluster:

/root/pve-cluster-vm-confirm-cluster-node-01-YYYYMMDD-HHMMSS.txt
/root/pve-cluster-vm-confirm-cluster-node-02-YYYYMMDD-HHMMSS.txt
/root/pve-cluster-vm-confirm-cluster-node-03-YYYYMMDD-HHMMSS.txt

How to Read the Cluster Confirmation

For each suspicious volume, search the output collected from every node.

Example candidate:

vm-107-disk-1.qcow2

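Check whether it appears in any VM config. The path /etc/pve/qemu-server covers only the local node, so search every node's directory under /etc/pve/nodes instead:

grep -RF "vm-107-disk-1.qcow2" /etc/pve/nodes/*/qemu-server/ || true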

If reviewing output files manually, look for config lines such as:

scsi0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
virtio0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
efidisk0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
tpmstate0: iscsi-cluster-lvm:vm-107-disk-1.qcow2

If a volume appears in any of those lines, it is attached to a VM and must not be deleted.
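
A plain substring search can mislead when volume names have no suffix: searching for vm-107-disk-1 also matches vm-107-disk-10. The sketch below searches the collected output files for exact volume names only. The helper name `volume_refs` is hypothetical, and the set of characters treated as part of a volume name (letters, digits, underscore, dot, hyphen) is an assumption.

```bash
#!/usr/bin/env bash
# Print lines that reference exactly the given volume name.
# Usage: volume_refs <volume-name> <file>...
volume_refs() {
  local vol="$1"; shift
  awk -v vol="$vol" '
    {
      line = $0
      while ((i = index(line, vol)) > 0) {
        before = (i == 1) ? "" : substr(line, i - 1, 1)
        after  = substr(line, i + length(vol), 1)
        # Accept the hit only if it is not embedded in a longer name.
        if (before !~ /[A-Za-z0-9_.-]/ && after !~ /[A-Za-z0-9_.-]/) {
          print FILENAME ": " $0
          break
        }
        line = substr(line, i + length(vol))
      }
    }
  ' "$@"
}
```

For example, `volume_refs vm-107-disk-1.qcow2 /root/pve-cluster-vm-confirm-*.txt` lists every collected line that names that volume. Expect hits from the storage and LV listings as well; only hits inside a VM config section indicate an attached disk.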


Phase 4: Classify Candidate Volumes

Use the following decision table.

| Condition | Classification | Action |
|---|---|---|
| Volume appears in any VM config on any node | In use | Do not delete |
| Volume is opened by QEMU or another process | In use or unsafe | Do not delete |
| Volume is a snap_vm-* snapshot volume | Snapshot-chain item | Inspect snapshot/backing chain before deletion |
| Volume does not appear in any VM config and is not open | Orphan candidate | Eligible for final verification |
| VMID no longer exists in cluster resources and disk is not referenced | Strong orphan | Eligible for cleanup |
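
The decision table can be expressed as a small function, which helps when classifying many volumes consistently. This is a sketch only: the function name, the yes/no argument convention, and the order of precedence (referenced first, then open, then snapshot naming) are assumptions, and the four inputs must come from the checks described in Phases 2 and 3.

```bash
#!/usr/bin/env bash
# classify_volume <name> <referenced yes|no> <open yes|no> <vmid_exists yes|no>
# Prints one classification label from the decision table.
classify_volume() {
  local name="$1" referenced="$2" is_open="$3" vmid_exists="$4"

  # Any config reference on any node wins over everything else.
  if [ "$referenced" = "yes" ]; then echo "in-use"; return; fi
  # An open device is unsafe to remove even if nothing references it.
  if [ "$is_open" = "yes" ]; then echo "in-use-or-unsafe"; return; fi
  # Snapshot volumes need a backing-chain review before any decision.
  case "$name" in
    snap_vm-*) echo "snapshot-chain"; return ;;
  esac
  # Unreferenced and closed: strength depends on whether the VMID still exists.
  if [ "$vmid_exists" = "no" ]; then
    echo "strong-orphan"
  else
    echo "orphan-candidate"
  fi
}
```

For instance, `classify_volume vm-107-disk-1.qcow2 no no yes` yields orphan-candidate, which still requires the final verification in Phase 5.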

Example: Valid VM Disks

If VM 107 has this config:

efidisk0: iscsi-cluster-lvm:vm-107-disk-2.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-3.qcow2

Then these disks are valid and must not be deleted:

vm-107-disk-2.qcow2
vm-107-disk-3.qcow2

If storage also contains:

vm-107-disk-0.qcow2
vm-107-disk-1.qcow2

and neither appears in any config file on any node, those are orphan candidates.


Phase 5: Final Verification Before Deletion

For each candidate volume, run the following checks on a node that can see the shared storage.

Replace the volume name as appropriate.

VOL="vm-107-disk-1.qcow2"
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"

echo "===== Check config references on every node ====="
grep -RF "$VOL" /etc/pve/nodes/*/qemu-server/ || true

echo
echo "===== Check Proxmox storage listing ====="
pvesm list "$STORAGE" | grep "$VOL" || true

echo
echo "===== Check LVM volume ====="
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "$VG" | grep "$VOL" || true

echo
echo "===== Check whether open by any process ====="
lsof | grep "$VOL" || true

echo
echo "===== Check qemu-img metadata if device path exists ====="
LVPATH="$(lvs --noheadings -o lv_path "${VG}/${VOL}" 2>/dev/null | awk '{$1=$1;print}')"
if [ -n "$LVPATH" ] && [ -e "$LVPATH" ]; then
  qemu-img info --backing-chain "$LVPATH"
else
  echo "LV path missing or inactive: $LVPATH"
fi

Safe deletion pattern:

grep -R ...          no output
pvesm list ...       shows the volume
lvs ...              shows the volume
lsof ...             no output
qemu-img info ...    no unexpected backing file dependency

!!! danger "Stop if grep finds a reference"
    If the candidate volume appears in any /etc/pve/qemu-server/*.conf file, do not delete it.

!!! danger "Stop if lsof finds a process"
    If lsof shows the volume is open, do not delete it.
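
As a complement to lsof, LVM itself records whether a device is open. lsof may list an open LV under its resolved device-mapper node (for example /dev/dm-12) rather than under the LV name, in which case a grep for the volume name finds nothing. The sketch below reads the sixth character of lv_attr, which is `o` when the device is open on the local node. The function names are hypothetical, and the result is per node, so the check would need to be repeated on every node.

```bash
#!/usr/bin/env bash
# Return 0 if an lv_attr string (for example "-wi-ao----") marks the LV as open.
lv_attr_is_open() {
  local attr="$1"
  # Character 6 (zero-based index 5) of lv_attr is the "device open" flag.
  [ "${attr:5:1}" = "o" ]
}

# Query the attribute for <vg> <lv> on this node and test it.
lv_is_open() {
  local attr
  attr="$(lvs --noheadings -o lv_attr "$1/$2" 2>/dev/null | tr -d '[:space:]')"
  lv_attr_is_open "$attr"
}
```

A call such as `lv_is_open vg_proxmox_iscsi vm-107-disk-1.qcow2 && echo "OPEN: do not delete"` could sit next to the lsof check rather than replace it.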


Phase 6: Cleanup Commands

Preferred Method: Proxmox Storage Layer

Use pvesm free first.

Example:

pvesm free iscsi-cluster-lvm:vm-107-disk-0.qcow2
pvesm free iscsi-cluster-lvm:vm-107-disk-1.qcow2

Then verify:

pvesm list iscsi-cluster-lvm | grep "vm-107-disk" || true
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk" || true
vgs vg_proxmox_iscsi
pvesm status | grep -E '^(Name|iscsi-cluster-lvm)'

Expected result:

vm-107-disk-0.qcow2    gone
vm-107-disk-1.qcow2    gone
vm-107-disk-2.qcow2    still present
vm-107-disk-3.qcow2    still present

Fallback Method: Direct LVM Removal

Only use this if pvesm free refuses and the final verification confirms the volume is not referenced and not open.

lvremove /dev/vg_proxmox_iscsi/vm-107-disk-0.qcow2
lvremove /dev/vg_proxmox_iscsi/vm-107-disk-1.qcow2

Then refresh device nodes and verify:

vgscan --mknodes
udevadm settle

pvesm list iscsi-cluster-lvm | grep "vm-107-disk" || true
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk" || true
vgs vg_proxmox_iscsi

!!! warning "Prefer pvesm free over lvremove"
    pvesm free lets Proxmox remove the volume through its storage abstraction. Use direct lvremove only when Proxmox refuses and the orphan status is already proven.
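
To reduce the chance of freeing the wrong volume by typo, the reference check and the delete can be tied together in one guarded step. This is a minimal sketch, not a replacement for Phase 5. It assumes every node's VM configs are readable under /etc/pve/nodes/<node>/qemu-server, and the names `free_orphan`, `PVE_ROOT`, and `DRY_RUN` are hypothetical. It defaults to printing the command instead of running it, refuses when any config mentions the volume, and also refuses when no config directories are found. Container configs under lxc/ are not checked.

```bash
#!/usr/bin/env bash
# free_orphan <storage> <volume>
# Env: PVE_ROOT (default /etc/pve), DRY_RUN (default 1 = print only)
free_orphan() {
  local storage="$1" vol="$2" root="${PVE_ROOT:-/etc/pve}"
  local dirs=( "$root"/nodes/*/qemu-server )

  # Fail closed: without config directories nothing can be verified.
  if [ ! -d "${dirs[0]}" ]; then
    echo "REFUSED: no VM config directories under $root" >&2
    return 2
  fi
  # Substring match on purpose: a false positive only blocks the delete.
  if grep -rFq -- "$vol" "${dirs[@]}" 2>/dev/null; then
    echo "REFUSED: $vol is referenced in a VM config" >&2
    return 1
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: pvesm free ${storage}:${vol}"
  else
    pvesm free "${storage}:${vol}"
  fi
}
```

Running `free_orphan iscsi-cluster-lvm vm-107-disk-1.qcow2` prints the intended command; setting `DRY_RUN=0` performs the deletion. The function does not check open state or backing chains, so those checks still apply.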


Phase 7: Post-Cleanup Validation

After deleting orphan volumes, validate storage and VM health.

pvesm status
vgs vg_proxmox_iscsi
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi

Check the affected VM's config:

qm config 107

Confirm the VM still starts or remains healthy:

qm status 107

If the VM is running, confirm its active QEMU process only references expected disks:

ps auxww | grep "kvm -id 107" | grep -o "/dev/vg_proxmox_iscsi/[^ ,\"]*" | sort -u

Expected example:

/dev/vg_proxmox_iscsi/vm-107-disk-2.qcow2
/dev/vg_proxmox_iscsi/vm-107-disk-3.qcow2
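
To compare that process list against what the VM config expects, the volume names can be pulled out of the `qm config` output. The sketch below assumes disk lines of the form `<key>: <storage>:<volume>[,options]`; `config_volumes` is a hypothetical helper name. It stops at the first `[section]` header so that, when fed a raw .conf file, only the current configuration is considered.

```bash
#!/usr/bin/env bash
# Print the volume names on <storage> referenced by a VM config read from stdin.
# Usage: qm config <vmid> | config_volumes <storage>
config_volumes() {
  local storage="$1"
  awk -v s="$storage" '
    /^\[/ { exit }   # raw .conf files continue with [snapshot] sections
    {
      # Split on spaces and commas, then keep tokens that start with "<storage>:".
      n = split($0, parts, /[ ,]/)
      for (i = 1; i <= n; i++)
        if (index(parts[i], s ":") == 1)
          print substr(parts[i], length(s) + 2)
    }
  ' | sort -u
}
```

For example, `qm config 107 | config_volumes iscsi-cluster-lvm` should list the same volume names that appear at the end of the device paths in the process output above.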

Snapshot Volume Handling

Snapshot volumes require additional review.

Examples:

snap_vm-105-disk-0_Fresh_Install.qcow2
snap_vm-106-disk-0_Fresh_Install_FullyUpdated.qcow2

Before deleting a snapshot volume, check:

qm config <vmid>
qm listsnapshot <vmid>
grep -R "snap_vm-<vmid>" /etc/pve/qemu-server/ || true
qemu-img info --backing-chain /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2

If the VM still has a parent: line or qm listsnapshot shows the snapshot, remove it through Proxmox first:

qm delsnapshot <vmid> <snapshot-name>

Only consider manual removal if:

  • the VM no longer references the snapshot,
  • no backing chain references the snapshot volume,
  • no QEMU process has it open,
  • and Proxmox cannot delete it normally.

!!! danger "Do not manually delete active snapshot-chain volumes"
    Deleting an active snapshot backing volume can corrupt the VM disk chain.
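
When reviewing many snapshot volumes, it helps to split each name into its VMID, disk index, and snapshot name so they can be fed into the checks above. This sketch assumes the snap_vm-<vmid>-disk-<n>_<snapshot-name> pattern shown in this document, with an optional .qcow2 suffix; `parse_snap_volume` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Split a snapshot volume name into "<vmid> <disk-index> <snapshot-name>".
# Returns 1 if the name does not follow the snap_vm-... pattern.
parse_snap_volume() {
  local name="$1"
  local re='^snap_vm-([0-9]+)-disk-([0-9]+)_(.+)$'
  if [[ "$name" =~ $re ]]; then
    local vmid="${BASH_REMATCH[1]}" disk="${BASH_REMATCH[2]}" snap="${BASH_REMATCH[3]}"
    # Drop the image-format suffix, if present, to get the Proxmox snapshot name.
    snap="${snap%.qcow2}"
    echo "$vmid $disk $snap"
  else
    return 1
  fi
}
```

The three fields can then be read with `read -r vmid disk snap < <(parse_snap_volume "$vol")` and used with `qm listsnapshot "$vmid"` to see whether Proxmox still knows the snapshot.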


Example Cleanup Walkthrough

Scenario

VM 107 has this config:

efidisk0: iscsi-cluster-lvm:vm-107-disk-2.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-3.qcow2

Storage contains:

vm-107-disk-0.qcow2
vm-107-disk-1.qcow2
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2

disk-0 and disk-1 do not appear in any config and are not open by any process.

Verify

grep -RF "vm-107-disk-0.qcow2" /etc/pve/nodes/*/qemu-server/ || true
grep -RF "vm-107-disk-1.qcow2" /etc/pve/nodes/*/qemu-server/ || true

lsof | grep "vm-107-disk-0.qcow2" || true
lsof | grep "vm-107-disk-1.qcow2" || true

Expected output:

no output

Delete

pvesm free iscsi-cluster-lvm:vm-107-disk-0.qcow2
pvesm free iscsi-cluster-lvm:vm-107-disk-1.qcow2

Validate

pvesm list iscsi-cluster-lvm | grep "vm-107-disk"
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk"
vgs vg_proxmox_iscsi

Expected remaining volumes:

vm-107-disk-2.qcow2
vm-107-disk-3.qcow2

Technician Checklist

Use this checklist before removing any orphan disk.

  • I ran the storage orphan audit.
  • I ran the cluster confirmation script on every Proxmox node.
  • I confirmed the candidate volume is not referenced in any VM config.
  • I confirmed the candidate volume is not open by any process.
  • I confirmed the candidate volume is not part of an active snapshot chain.
  • I confirmed the VMID relationship is understood.
  • I used pvesm free first.
  • I used lvremove only if Proxmox refused and the volume was proven orphaned.
  • I validated storage state after cleanup.
  • I validated the affected VM still references only expected disks.

Quick Reference Commands

List shared storage volumes

pvesm list iscsi-cluster-lvm

List LVs

lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices vg_proxmox_iscsi

Search VM configs

grep -RF "vm-<vmid>-disk-<n>" /etc/pve/nodes/*/qemu-server/ || true

Check open files

lsof | grep "vm-<vmid>-disk-<n>" || true

Check image metadata

qemu-img info --backing-chain /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2

Delete via Proxmox

pvesm free iscsi-cluster-lvm:vm-<vmid>-disk-<n>.qcow2

Delete via LVM fallback

lvremove /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2

Verify storage usage

pvesm status
vgs vg_proxmox_iscsi