Proxmox VE Shared iSCSI/LVM Orphan Disk Audit and Cleanup Procedure
Purpose
This procedure describes how to identify and safely remove orphaned Proxmox VE virtual machine disks from shared iSCSI-backed LVM storage.
It is intended for environments where:
- Proxmox VE is clustered.
- Multiple Proxmox nodes access the same shared iSCSI LUN.
- The shared storage is exposed to Proxmox as LVM storage.
- VM disks are stored as LVM logical volumes.
- Some volumes may remain after VM disk deletion, failed migrations, failed resizes, storage UI inconsistencies, or manual recovery work.
The goal is to reclaim storage space without accidentally deleting disks that are still attached to running or stopped VMs.
Scope
This document focuses on the following storage type:
Proxmox storage type: lvm
Backing storage: shared iSCSI
Volume group: vg_proxmox_iscsi
Storage ID example: iscsi-cluster-lvm
Adjust the storage ID and volume group names as needed for your environment.
Safety Requirements
!!! danger "Never delete based on the Storage UI alone"
    The Proxmox storage UI may show a volume as belonging to a VM because its name follows the pattern vm-<vmid>-disk-<n>. That does not prove the disk is currently attached to the VM.

    Always verify against VM configuration files and active QEMU processes before deleting.

!!! warning "Run the audit before running any cleanup commands"
    The audit scripts in this document are read-only. The cleanup commands are destructive. Do not run cleanup commands until the audit output has been reviewed.

!!! warning "Snapshot volumes require extra caution"
    Volumes named like the following may be part of a snapshot chain:

    ```text
    snap_vm-<vmid>-disk-<n>_<snapshot-name>
    ```

    Do not remove snapshot volumes manually unless you have verified that the VM and snapshot are no longer known to Proxmox, no backing chain references them, and no QEMU process has them open.
!!! note "Shared storage does not mean shared config"
    On each node, /etc/pve/qemu-server is a link to that node's own directory under /etc/pve/nodes/, so it only shows config files for VMs assigned to that node. Therefore, an audit run from only one node can falsely report disks from other nodes as orphaned.

    Run the confirmation script on every node in the cluster.
Terms
| Term | Meaning |
|---|---|
| Attached disk | A disk volume referenced in a VM config, such as scsi0, sata0, virtio0, efidisk0, or tpmstate0. |
| Orphan disk | A storage volume that exists on shared storage but is not referenced by any VM config on any node and is not opened by any active process. |
| Volume ID | Proxmox storage identifier, such as iscsi-cluster-lvm:vm-107-disk-1.qcow2. |
| LV | LVM logical volume, such as /dev/vg_proxmox_iscsi/vm-107-disk-1.qcow2. |
| Snapshot chain | A chain of qcow2 backing files or Proxmox snapshot volumes. |
Phase 1: Identify Storage Names
Run this on any Proxmox node:
pvesm status
cat /etc/pve/storage.cfg
vgs
Identify the shared iSCSI/LVM storage.
Example:
Storage ID: iscsi-cluster-lvm
VG name: vg_proxmox_iscsi
For the rest of this document, replace these values if your environment differs:
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
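Rather than reading the VG name off the file by eye, it can be looked up from the storage definition. The following is a minimal sketch, assuming the usual storage.cfg layout of an unindented `type: id` header followed by indented `key value` lines; `vg_for_storage` is a hypothetical helper, not a Proxmox command.

```shell
#!/usr/bin/env bash
# Sketch: print the vgname configured for one LVM storage ID.
# vg_for_storage is a hypothetical helper name (assumption, not part of Proxmox).
vg_for_storage() {
    local storage="$1" cfg="${2:-/etc/pve/storage.cfg}"
    awk -v id="$storage" '
        # An unindented line starts a new section, e.g. "lvm: iscsi-cluster-lvm".
        /^[^ \t]/ { in_section = ($1 == "lvm:" && $2 == id); next }
        # Inside the matching section, print the vgname value and stop.
        in_section && $1 == "vgname" { print $2; exit }
    ' "$cfg"
}

# Example use on a node:
#   VG="$(vg_for_storage iscsi-cluster-lvm)"
```

An empty result means the storage ID was not found or is not of type lvm, which is worth resolving before going further.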
Phase 2: Run the Storage Orphan Audit
Run the following script on one node that can see the shared LVM storage.
This script does not delete anything.
cat > /root/pve-iscsi-orphan-audit.sh <<'EOF'
#!/usr/bin/env bash
set -u
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
OUT="/root/pve-iscsi-orphan-audit-$(hostname)-$(date +%Y%m%d-%H%M%S).txt"
{
echo "===== PVE ISCSI ORPHAN AUDIT ====="
echo "Host: $(hostname)"
echo "Date: $(date)"
echo "Storage: ${STORAGE}"
echo "VG: ${VG}"
echo
echo "===== STORAGE STATUS ====="
pvesm status 2>&1 | grep -E "^(Name|${STORAGE})" || true
vgs "${VG}" 2>&1 || true
echo
echo "===== ALL VOLUMES IN ${STORAGE} ====="
pvesm list "${STORAGE}" 2>&1 || true
echo
echo "===== ALL LVs IN ${VG} ====="
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "${VG}" 2>&1 || true
echo
echo "===== LOCAL VM CONFIG FILES ====="
for conf in /etc/pve/qemu-server/*.conf; do
[ -e "$conf" ] || continue
echo
echo "----- $conf -----"
cat "$conf"
done
echo
echo "===== REFERENCE ANALYSIS - LOCAL CONFIG FILES ONLY ====="
printf '%-55s | %-8s | %-10s | %-8s | %-8s | %s\n' \
"volume" "vmid" "referenced" "open" "size" "path"
{
pvesm list "${STORAGE}" 2>/dev/null | awk 'NR>1 {print $1}' | sed "s#^${STORAGE}:##"
lvs --noheadings -o lv_name "${VG}" 2>/dev/null | awk '{print $1}'
} | sort -u | while read -r vol; do
[ -n "$vol" ] || continue
case "$vol" in
vm-*-disk-*|snap_vm-*-disk-*) ;;
*) continue ;;
esac
vmid="unknown"
if [[ "$vol" =~ ^vm-([0-9]+)-disk- ]]; then
vmid="${BASH_REMATCH[1]}"
elif [[ "$vol" =~ ^snap_vm-([0-9]+)-disk- ]]; then
vmid="${BASH_REMATCH[1]}"
fi
ref="no"
if grep -R -Fq "$vol" /etc/pve/qemu-server 2>/dev/null; then
ref="yes"
fi
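open="no"
# lsof may list an LV by its /dev/dm-N node instead of its name, so also check
# the LVM "open" flag (6th character of lv_attr) as seen from this node.
attr="$(lvs --noheadings -o lv_attr "${VG}/${vol}" 2>/dev/null | awk '{print $1}')"
if [ "${attr:5:1}" = "o" ] || lsof 2>/dev/null | grep -Fq "$vol"; then
open="yes"
fi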
size="$(lvs --noheadings -o lv_size "${VG}/${vol}" 2>/dev/null | awk '{$1=$1;print}')"
path="$(lvs --noheadings -o lv_path "${VG}/${vol}" 2>/dev/null | awk '{$1=$1;print}')"
printf '%-55s | %-8s | %-10s | %-8s | %-8s | %s\n' \
"$vol" "$vmid" "$ref" "$open" "$size" "$path"
done
echo
echo "===== DONE ====="
} | tee "$OUT"
echo
echo "Saved audit to: $OUT"
EOF
chmod +x /root/pve-iscsi-orphan-audit.sh
/root/pve-iscsi-orphan-audit.sh
The script writes a file similar to:
/root/pve-iscsi-orphan-audit-<node>-<timestamp>.txt
How to Read the First Audit
The most important section is:
REFERENCE ANALYSIS - LOCAL CONFIG FILES ONLY
Example:
volume | vmid | referenced | open | size
vm-107-disk-0.qcow2 | 107 | no | no | 4.00m
vm-107-disk-1.qcow2 | 107 | no | no | 256.04g
vm-107-disk-2.qcow2 | 107 | yes | no | 4.00m
vm-107-disk-3.qcow2 | 107 | yes | no | 256.04g
Interpretation:
| Field | Meaning |
|---|---|
| referenced=yes | The volume appears in a local VM config file. Do not delete. |
| referenced=no | The volume does not appear in local VM configs. It may be orphaned, but confirm across all nodes first. |
| open=yes | A process has the volume open. Do not delete. |
| open=no | No process on this node has the volume open. Still confirm across all nodes. |
!!! warning "Local reference analysis is not enough"
    If a VM runs on another cluster node, its config may not appear on the node where you ran the audit. This can make valid disks look orphaned.

    Continue to Phase 3 before deleting anything.
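The rows worth carrying into Phase 3 are the ones where both referenced and open read no. As a rough sketch, assuming the pipe-separated layout the audit script prints, they can be pulled out of a saved audit file like this; `orphan_candidates` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list volumes whose audit row shows referenced=no and open=no.
# Reads audit files given as arguments, or stdin when none are given.
# orphan_candidates is a hypothetical helper (assumption).
orphan_candidates() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Columns: volume | vmid | referenced | open | size | path
        NF >= 5 && trim($3) == "no" && trim($4) == "no" { print trim($1) }
    ' "$@"
}

# Example use:
#   orphan_candidates /root/pve-iscsi-orphan-audit-*.txt
```

The result is only a list of local candidates. It says nothing about configs or open devices on other nodes.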
Phase 3: Run Cluster-Wide Confirmation
Run the following script on every Proxmox node in the cluster.
This script is read-only.
cat > /root/pve-cluster-vm-confirm.sh <<'EOF'
#!/usr/bin/env bash
set -u
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
OUT="/root/pve-cluster-vm-confirm-$(hostname)-$(date +%Y%m%d-%H%M%S).txt"
{
echo "===== NODE ====="
hostname
date
echo
echo "===== CLUSTER RESOURCES - VMS ====="
pvesh get /cluster/resources --type vm 2>&1 || true
echo
echo "===== LOCAL QM LIST ====="
qm list 2>&1 || true
echo
echo "===== QEMU CONFIG FILES PRESENT ====="
ls -la /etc/pve/qemu-server/ 2>&1 || true
echo
echo "===== QEMU CONFIG FILE CONTENTS ====="
for conf in /etc/pve/qemu-server/*.conf; do
[ -e "$conf" ] || continue
echo
echo "----- $conf -----"
cat "$conf"
done
echo
echo "===== ALL STORAGE VOLUMES ====="
pvesm list "${STORAGE}" 2>&1 || true
echo
echo "===== ALL LVs ====="
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "${VG}" 2>&1 || true
} | tee "$OUT"
echo
echo "Saved to: $OUT"
EOF
chmod +x /root/pve-cluster-vm-confirm.sh
/root/pve-cluster-vm-confirm.sh
Collect the output file from each node.
Example for a three-node cluster:
/root/pve-cluster-vm-confirm-cluster-node-01-YYYYMMDD-HHMMSS.txt
/root/pve-cluster-vm-confirm-cluster-node-02-YYYYMMDD-HHMMSS.txt
/root/pve-cluster-vm-confirm-cluster-node-03-YYYYMMDD-HHMMSS.txt
How to Read the Cluster Confirmation
For each suspicious volume, search the output files from every node.
Example candidate:
vm-107-disk-1.qcow2
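Check whether it appears in any VM config. On a cluster node, /etc/pve/nodes/*/qemu-server/ holds the configs of every node, not only the local one:
grep -R -F "vm-107-disk-1.qcow2" /etc/pve/nodes/*/qemu-server/ || true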
If reviewing output files manually, look for config lines such as:
scsi0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
virtio0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
efidisk0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
tpmstate0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
If a volume appears in any of those lines, it is attached to a VM and must not be deleted.
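When searching the collected output files, a plain substring search can over-match: vm-107-disk-1 is also a prefix of vm-107-disk-10, and the storage listing sections mention every volume whether attached or not. The sketch below is one way to report only config-style references, where the volume ID is followed by a comma or the end of the line. `find_volume_refs` is a hypothetical helper, and the file names in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: print lines that reference exactly STORAGE:VOL as a disk value.
# Usage: find_volume_refs STORAGE VOL FILE...
# find_volume_refs is a hypothetical helper (assumption).
find_volume_refs() {
    local storage="$1" vol="$2"
    shift 2
    awk -v needle="${storage}:${vol}" '
        {
            line = $0
            while ((i = index(line, needle)) > 0) {
                # Character right after the match: "" (end of line) or "," means
                # the volume ID is complete and used as a config value.
                next_char = substr(line, i + length(needle), 1)
                if (next_char == "" || next_char == ",") {
                    print FILENAME ": " $0
                    break
                }
                line = substr(line, i + length(needle))
            }
        }
    ' "$@"
}

# Example use on the collected files:
#   find_volume_refs iscsi-cluster-lvm vm-107-disk-1.qcow2 pve-cluster-vm-confirm-*.txt
```

No output from every node's file is consistent with the volume being unreferenced. Any printed line means it is attached somewhere.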
Phase 4: Classify Candidate Volumes
Use the following decision table.
| Condition | Classification | Action |
|---|---|---|
| Volume appears in any VM config on any node | In use | Do not delete |
| Volume is opened by QEMU or another process | In use or unsafe | Do not delete |
| Volume is a snap_vm-* snapshot volume | Snapshot-chain item | Inspect snapshot/backing chain before deletion |
| Volume does not appear in any VM config and is not open | Orphan candidate | Eligible for final verification |
| VMID no longer exists in cluster resources and disk is not referenced | Strong orphan | Eligible for cleanup |
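The decision table can be read as an ordered set of checks, where the first matching row wins. A minimal sketch of that ordering follows. `classify_volume` and its yes/no arguments are hypothetical, and the inputs are assumed to come from the cluster-wide review rather than from a single node.

```shell
#!/usr/bin/env bash
# Sketch: apply the decision table in order; the first matching rule wins.
# Arguments: volume name, referenced (yes/no), open (yes/no), vmid_exists (yes/no).
# classify_volume is a hypothetical helper (assumption).
classify_volume() {
    local vol="$1" referenced="$2" open="$3" vmid_exists="$4"
    if [ "$referenced" = "yes" ]; then
        echo "in-use"
    elif [ "$open" = "yes" ]; then
        echo "in-use-or-unsafe"
    elif [[ "$vol" == snap_vm-* ]]; then
        echo "snapshot-chain"
    elif [ "$vmid_exists" = "no" ]; then
        echo "strong-orphan"
    else
        echo "orphan-candidate"
    fi
}

# Example use:
#   classify_volume vm-107-disk-1.qcow2 no no yes
```

Checking references and open state before anything else keeps the destructive classifications reachable only when both safety checks have passed.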
Example: Valid VM Disks
If VM 107 has this config:
efidisk0: iscsi-cluster-lvm:vm-107-disk-2.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-3.qcow2
Then these disks are valid and must not be deleted:
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2
If storage also contains:
vm-107-disk-0.qcow2
vm-107-disk-1.qcow2
and neither appears in any config file on any node, those are orphan candidates.
Phase 5: Final Verification Before Deletion
For each candidate volume, run the following checks on a node that can see the shared storage.
Replace the volume name as appropriate.
VOL="vm-107-disk-1.qcow2"
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
echo "===== Check all cluster config references ====="
grep -R -F "$VOL" /etc/pve/nodes/*/qemu-server/ || true
echo
echo "===== Check Proxmox storage listing ====="
pvesm list "$STORAGE" | grep "$VOL" || true
echo
echo "===== Check LVM volume ====="
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "$VG" | grep "$VOL" || true
echo
echo "===== Check whether open by any process ====="
lsof | grep "$VOL" || true
echo
echo "===== Check qemu-img metadata if device path exists ====="
LVPATH="$(lvs --noheadings -o lv_path "${VG}/${VOL}" 2>/dev/null | awk '{$1=$1;print}')"
if [ -n "$LVPATH" ] && [ -e "$LVPATH" ]; then
qemu-img info --backing-chain "$LVPATH"
else
echo "LV path missing or inactive: $LVPATH"
fi
Safe deletion pattern:
grep -R ... no output
pvesm list ... shows the volume
lvs ... shows the volume, and its lv_attr has no "o" (open) as the sixth character
lsof ... no output
qemu-img info ... no unexpected backing file dependency
!!! danger "Stop if grep finds a reference"
    If the candidate volume appears in any /etc/pve/qemu-server/*.conf file, do not delete it.

!!! danger "Stop if lsof finds a process"
    If lsof shows the volume is open, do not delete it.
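One caveat with the lsof check: LVs are device-mapper devices, so lsof may report an open disk as /dev/dm-N rather than by its LV name, in which case grep finds nothing even for a disk in use. A complementary check, sketched below, reads the sixth character of lv_attr, which lvs(8) documents as `o` when the device is open. `lv_attr_is_open` is a hypothetical helper, and the flag only reflects the node where lvs runs, so it would need to be checked on every node.

```shell
#!/usr/bin/env bash
# Sketch: succeed when an lv_attr string marks the device as open.
# $1 is an lv_attr value such as "-wi-ao----"; position 6 is the open flag.
# lv_attr_is_open is a hypothetical helper (assumption).
lv_attr_is_open() {
    [ "${1:5:1}" = "o" ]
}

# Example use on a node (VG and VOL as set earlier in this phase):
#   attr="$(lvs --noheadings -o lv_attr "${VG}/${VOL}" | tr -d ' ')"
#   if lv_attr_is_open "$attr"; then
#       echo "open on $(hostname): do not delete"
#   fi
```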
Phase 6: Cleanup Commands
Preferred Method: Proxmox Storage Layer
Use pvesm free first.
Example:
pvesm free iscsi-cluster-lvm:vm-107-disk-0.qcow2
pvesm free iscsi-cluster-lvm:vm-107-disk-1.qcow2
Then verify:
pvesm list iscsi-cluster-lvm | grep "vm-107-disk" || true
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk" || true
vgs vg_proxmox_iscsi
pvesm status | grep -E '^(Name|iscsi-cluster-lvm)'
Expected result:
vm-107-disk-0.qcow2 gone
vm-107-disk-1.qcow2 gone
vm-107-disk-2.qcow2 still present
vm-107-disk-3.qcow2 still present
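Before typing pvesm free commands by hand, it can help to turn the reviewed candidate list into commands for one last read-through. The sketch below only prints commands and never runs them. `print_free_commands` is a hypothetical helper, and the input is assumed to be one volume name per line.

```shell
#!/usr/bin/env bash
# Sketch: print (do not execute) pvesm free commands for reviewed candidates.
# Reads volume names from stdin; snapshot volumes are skipped for manual review.
# print_free_commands is a hypothetical helper (assumption).
print_free_commands() {
    local storage="$1" vol
    while read -r vol; do
        [ -n "$vol" ] || continue
        case "$vol" in
            snap_vm-*)   echo "# skipped (snapshot volume, review manually): $vol" ;;
            vm-*-disk-*) echo "pvesm free ${storage}:${vol}" ;;
            *)           echo "# skipped (unexpected name): $vol" ;;
        esac
    done
}

# Example use:
#   print_free_commands iscsi-cluster-lvm < reviewed-candidates.txt
```

Printing instead of executing keeps a human review step between the audit and the destructive command.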
Fallback Method: Direct LVM Removal
Only use this if pvesm free refuses and the final verification confirms the volume is not referenced and not open.
lvremove /dev/vg_proxmox_iscsi/vm-107-disk-0.qcow2
lvremove /dev/vg_proxmox_iscsi/vm-107-disk-1.qcow2
Then refresh device nodes and verify:
vgscan --mknodes
udevadm settle
pvesm list iscsi-cluster-lvm | grep "vm-107-disk" || true
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk" || true
vgs vg_proxmox_iscsi
!!! warning "Prefer pvesm free over lvremove"
    pvesm free lets Proxmox remove the volume through its storage abstraction. Use direct lvremove only when Proxmox refuses and the orphan status is already proven.
Phase 7: Post-Cleanup Validation
After deleting orphan volumes, validate storage and VM health.
pvesm status
vgs vg_proxmox_iscsi
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi
Check the affected VM’s config:
qm config 107
Confirm the VM still starts or remains healthy:
qm status 107
If the VM is running, confirm its active QEMU process only references expected disks:
ps auxww | grep "kvm -id 107" | grep -o "/dev/vg_proxmox_iscsi/[^ ,\"]*" | sort -u
Expected example:
/dev/vg_proxmox_iscsi/vm-107-disk-2.qcow2
/dev/vg_proxmox_iscsi/vm-107-disk-3.qcow2
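To compare the process view with what the config declares, the volume names for one storage can be extracted from the config text. This is a minimal sketch, assuming disk values have the form storage:volume followed by a comma, a space, or the end of the line. `config_volumes` is a hypothetical helper, and the storage ID is used as a regular expression, so an ID containing a dot would match slightly more loosely than intended.

```shell
#!/usr/bin/env bash
# Sketch: list the volumes of one storage that a VM config refers to.
# Usage: config_volumes STORAGE [FILE...]   (reads stdin when no file is given)
# config_volumes is a hypothetical helper (assumption).
config_volumes() {
    local storage="$1"
    shift
    grep -ohE "${storage}:[^, ]+" "$@" | sed "s#^${storage}:##" | sort -u
}

# Example use:
#   qm config 107 | config_volumes iscsi-cluster-lvm
```

The names printed here should correspond one to one with the device paths shown by the process check.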
Snapshot Volume Handling
Snapshot volumes require additional review.
Examples:
snap_vm-105-disk-0_Fresh_Install.qcow2
snap_vm-106-disk-0_Fresh_Install_FullyUpdated.qcow2
Before deleting a snapshot volume, check:
qm config <vmid>
qm listsnapshot <vmid>
grep -R "snap_vm-<vmid>" /etc/pve/qemu-server/ || true
qemu-img info --backing-chain /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2
If the VM still has a parent: line or qm listsnapshot shows the snapshot, remove it through Proxmox first:
qm delsnapshot <vmid> <snapshot-name>
Only consider manual removal if:
- the VM no longer references the snapshot,
- no backing chain references the snapshot volume,
- no QEMU process has it open,
- and Proxmox cannot delete it normally.
!!! danger "Do not manually delete active snapshot-chain volumes"
    Deleting an active snapshot backing volume can corrupt the VM disk chain.
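To know which VMID and snapshot name to pass to the qm commands above, the snapshot volume name can be split into its parts. The sketch below assumes the naming pattern shown in this document (snap_vm, VMID, disk index, snapshot name, optional .qcow2 suffix). `parse_snap_volume` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: split a snapshot volume name into "vmid disk-index snapshot-name".
# Returns non-zero when the name does not follow the snap_vm-... pattern.
# parse_snap_volume is a hypothetical helper (assumption).
parse_snap_volume() {
    local re='^snap_vm-([0-9]+)-disk-([0-9]+)_(.+)$'
    if [[ "$1" =~ $re ]]; then
        local snap="${BASH_REMATCH[3]}"
        snap="${snap%.qcow2}"   # drop the image suffix if present
        printf '%s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "$snap"
    else
        return 1
    fi
}

# Example use:
#   read -r vmid disk snap < <(parse_snap_volume snap_vm-105-disk-0_Fresh_Install.qcow2)
#   qm listsnapshot "$vmid"
```

If the parsed snapshot name still appears in the qm listsnapshot output, the volume belongs to a known snapshot and should be removed through Proxmox, not by hand.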
Example Cleanup Walkthrough
Scenario
VM 107 has this config:
efidisk0: iscsi-cluster-lvm:vm-107-disk-2.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-3.qcow2
Storage contains:
vm-107-disk-0.qcow2
vm-107-disk-1.qcow2
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2
disk-0 and disk-1 do not appear in any config and are not open by any process.
Verify
grep -R "vm-107-disk-0.qcow2" /etc/pve/qemu-server/ || true
grep -R "vm-107-disk-1.qcow2" /etc/pve/qemu-server/ || true
lsof | grep "vm-107-disk-0.qcow2" || true
lsof | grep "vm-107-disk-1.qcow2" || true
Expected output:
no output
Delete
pvesm free iscsi-cluster-lvm:vm-107-disk-0.qcow2
pvesm free iscsi-cluster-lvm:vm-107-disk-1.qcow2
Validate
pvesm list iscsi-cluster-lvm | grep "vm-107-disk"
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk"
vgs vg_proxmox_iscsi
Expected remaining volumes:
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2
Technician Checklist
Use this checklist before removing any orphan disk.
- I ran the storage orphan audit.
- I ran the cluster confirmation script on every Proxmox node.
- I confirmed the candidate volume is not referenced in any VM config.
- I confirmed the candidate volume is not open by any process.
- I confirmed the candidate volume is not part of an active snapshot chain.
- I confirmed the VMID relationship is understood.
- I used pvesm free first.
- I used lvremove only if Proxmox refused and the volume was proven orphaned.
- I validated storage state after cleanup.
- I validated the affected VM still references only expected disks.
Quick Reference Commands
List shared storage volumes
pvesm list iscsi-cluster-lvm
List LVs
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices vg_proxmox_iscsi
Search VM configs
grep -R "vm-<vmid>-disk-<n>" /etc/pve/qemu-server/ || true
Check open files
lsof | grep "vm-<vmid>-disk-<n>" || true
Check image metadata
qemu-img info --backing-chain /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2
Delete via Proxmox
pvesm free iscsi-cluster-lvm:vm-<vmid>-disk-<n>.qcow2
Delete via LVM fallback
lvremove /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2
Verify storage usage
pvesm status
vgs vg_proxmox_iscsi