# Proxmox VE Shared iSCSI/LVM Orphan Disk Audit and Cleanup Procedure

## Purpose

This procedure describes how to identify and safely remove orphaned Proxmox VE virtual machine disks from shared iSCSI-backed LVM storage.

It is intended for environments where:

* Proxmox VE is clustered.
* Multiple Proxmox nodes access the same shared iSCSI LUN.
* The shared storage is exposed to Proxmox as LVM storage.
* VM disks are stored as LVM logical volumes.
* Some volumes may remain after VM disk deletion, failed migrations, failed resizes, storage UI inconsistencies, or manual recovery work.

The goal is to reclaim storage space without accidentally deleting disks that are still attached to running or stopped VMs.

---

## Scope

This document focuses on the following storage type:

```text
Proxmox storage type: lvm
Backing storage:      shared iSCSI
Volume group:         vg_proxmox_iscsi
Storage ID example:   iscsi-cluster-lvm
```

Adjust the storage ID and volume group names as needed for your environment.

---

## Safety Requirements

!!! danger "Never delete based on the Storage UI alone"
    The Proxmox storage UI may show a volume as belonging to a VM because its name follows the pattern `vm-<vmid>-disk-<n>`. That does not prove the disk is currently attached to the VM.

    Always verify against VM configuration files and active QEMU processes before deleting.

!!! warning "Run the audit before running any cleanup commands"
    The audit scripts in this document are read-only. The cleanup commands are destructive. Do not run cleanup commands until the audit output has been reviewed.

!!! warning "Snapshot volumes require extra caution"
    Volumes named like the following may be part of a snapshot chain:

    ```text
    snap_vm-<vmid>-disk-<n>_<snapshot-name>
    ```

    Do not remove snapshot volumes manually unless you have verified that the VM and snapshot are no longer known to Proxmox, no backing chain references them, and no QEMU process has them open.

!!! note "Shared storage does not mean shared config"
    In some cluster layouts, each node may only show VM config files for VMs assigned to that node. Therefore, an audit run from only one node can falsely report disks from other nodes as orphaned.

    Run the confirmation script on every node in the cluster.

---

## Terms

| Term           | Meaning                                                                                                                                    |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| Attached disk  | A disk volume referenced in a VM config, such as `scsi0`, `sata0`, `virtio0`, `efidisk0`, or `tpmstate0`.                                  |
| Orphan disk    | A storage volume that exists on shared storage but is not referenced by any VM config on any node and is not opened by any active process. |
| Volume ID      | Proxmox storage identifier, such as `iscsi-cluster-lvm:vm-107-disk-1.qcow2`.                                                               |
| LV             | LVM logical volume, such as `/dev/vg_proxmox_iscsi/vm-107-disk-1.qcow2`.                                                                   |
| Snapshot chain | A chain of qcow2 backing files or Proxmox snapshot volumes.                                                                                |

---

# Phase 1: Identify Storage Names

Run this on any Proxmox node:

```bash
pvesm status
cat /etc/pve/storage.cfg
vgs
```

Identify the shared iSCSI/LVM storage.

Example:

```text
Storage ID: iscsi-cluster-lvm
VG name:    vg_proxmox_iscsi
```

For the rest of this document, replace these values if your environment differs:

```bash
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
```

---

# Phase 2: Run the Storage Orphan Audit

Run the following script on one node that can see the shared LVM storage. This script does not delete anything.
```bash
cat > /root/pve-iscsi-orphan-audit.sh <<'EOF'
#!/usr/bin/env bash
set -u

STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
OUT="/root/pve-iscsi-orphan-audit-$(hostname)-$(date +%Y%m%d-%H%M%S).txt"

{
  echo "===== PVE ISCSI ORPHAN AUDIT ====="
  echo "Host: $(hostname)"
  echo "Date: $(date)"
  echo "Storage: ${STORAGE}"
  echo "VG: ${VG}"
  echo

  echo "===== STORAGE STATUS ====="
  pvesm status 2>&1 | grep -E "^(Name|${STORAGE})" || true
  vgs "${VG}" 2>&1 || true
  echo

  echo "===== ALL VOLUMES IN ${STORAGE} ====="
  pvesm list "${STORAGE}" 2>&1 || true
  echo

  echo "===== ALL LVs IN ${VG} ====="
  lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "${VG}" 2>&1 || true
  echo

  echo "===== LOCAL VM CONFIG FILES ====="
  for conf in /etc/pve/qemu-server/*.conf; do
    [ -e "$conf" ] || continue
    echo
    echo "----- $conf -----"
    cat "$conf"
  done
  echo

  echo "===== REFERENCE ANALYSIS - LOCAL CONFIG FILES ONLY ====="
  printf '%-55s | %-8s | %-10s | %-8s | %-8s | %s\n' \
    "volume" "vmid" "referenced" "open" "size" "path"

  # Capture lsof output once instead of once per volume.
  open_files="$(lsof 2>/dev/null || true)"

  {
    pvesm list "${STORAGE}" 2>/dev/null | awk 'NR>1 {print $1}' | sed "s#^${STORAGE}:##"
    lvs --noheadings -o lv_name "${VG}" 2>/dev/null | awk '{print $1}'
  } | sort -u | while read -r vol; do
    [ -n "$vol" ] || continue

    # Only VM disk and snapshot volumes are of interest.
    case "$vol" in
      vm-*-disk-*|snap_vm-*-disk-*) ;;
      *) continue ;;
    esac

    vmid="unknown"
    if [[ "$vol" =~ ^vm-([0-9]+)-disk- ]]; then
      vmid="${BASH_REMATCH[1]}"
    elif [[ "$vol" =~ ^snap_vm-([0-9]+)-disk- ]]; then
      vmid="${BASH_REMATCH[1]}"
    fi

    ref="no"
    if grep -R -Fq -- "$vol" /etc/pve/qemu-server 2>/dev/null; then
      ref="yes"
    fi

    # lsof may list an open LV under its /dev/dm-N name, so also read the
    # LVM "device open" flag (6th lv_attr character) for this node.
    attr="$(lvs --noheadings -o lv_attr "${VG}/${vol}" 2>/dev/null | awk '{print $1}')"
    open="no"
    if grep -Fq -- "$vol" <<<"$open_files" || [ "${attr:5:1}" = "o" ]; then
      open="yes"
    fi

    size="$(lvs --noheadings -o lv_size "${VG}/${vol}" 2>/dev/null | awk '{$1=$1;print}')"
    path="$(lvs --noheadings -o lv_path "${VG}/${vol}" 2>/dev/null | awk '{$1=$1;print}')"

    printf '%-55s | %-8s | %-10s | %-8s | %-8s | %s\n' \
      "$vol" "$vmid" "$ref" "$open" "$size" "$path"
  done

  echo
  echo "===== DONE ====="
} | tee "$OUT"

echo
echo "Saved audit to: $OUT"
EOF

chmod +x /root/pve-iscsi-orphan-audit.sh
/root/pve-iscsi-orphan-audit.sh
```

The script writes a file similar to:

```text
/root/pve-iscsi-orphan-audit-<hostname>-<YYYYMMDD-HHMMSS>.txt
```

---

## How to Read the First Audit

The most important section is:

```text
REFERENCE ANALYSIS - LOCAL CONFIG FILES ONLY
```

Example:

```text
volume              | vmid | referenced | open | size
vm-107-disk-0.qcow2 | 107  | no         | no   | 4.00m
vm-107-disk-1.qcow2 | 107  | no         | no   | 256.04g
vm-107-disk-2.qcow2 | 107  | yes        | no   | 4.00m
vm-107-disk-3.qcow2 | 107  | yes        | no   | 256.04g
```

Interpretation:

| Field            | Meaning                                                                                                 |
| ---------------- | ------------------------------------------------------------------------------------------------------- |
| `referenced=yes` | The volume appears in a local VM config file. Do not delete.                                            |
| `referenced=no`  | The volume does not appear in local VM configs. It may be orphaned, but confirm across all nodes first. |
| `open=yes`       | A process on this node has the volume open. Do not delete.                                              |
| `open=no`        | No process on this node has the volume open. Still confirm across all nodes.                            |

!!! warning "Local reference analysis is not enough"
    If a VM runs on another cluster node, its config may not appear on the node where you ran the audit. This can make valid disks look orphaned.

    Continue to Phase 3 before deleting anything.

---

# Phase 3: Run Cluster-Wide Confirmation

Run the following script on **every Proxmox node** in the cluster. This script is read-only.
```bash
cat > /root/pve-cluster-vm-confirm.sh <<'EOF'
#!/usr/bin/env bash
set -u

STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"
OUT="/root/pve-cluster-vm-confirm-$(hostname)-$(date +%Y%m%d-%H%M%S).txt"

{
  echo "===== NODE ====="
  hostname
  date
  echo

  echo "===== CLUSTER RESOURCES - VMS ====="
  pvesh get /cluster/resources --type vm 2>&1 || true
  echo

  echo "===== LOCAL QM LIST ====="
  qm list 2>&1 || true
  echo

  echo "===== QEMU CONFIG FILES PRESENT ====="
  ls -la /etc/pve/qemu-server/ 2>&1 || true
  echo

  echo "===== QEMU CONFIG FILE CONTENTS ====="
  for conf in /etc/pve/qemu-server/*.conf; do
    [ -e "$conf" ] || continue
    echo
    echo "----- $conf -----"
    cat "$conf"
  done
  echo

  echo "===== ALL STORAGE VOLUMES ====="
  pvesm list "${STORAGE}" 2>&1 || true
  echo

  echo "===== ALL LVs ====="
  lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "${VG}" 2>&1 || true
} | tee "$OUT"

echo
echo "Saved to: $OUT"
EOF

chmod +x /root/pve-cluster-vm-confirm.sh
/root/pve-cluster-vm-confirm.sh
```

Collect the output file from each node.

Example for a three-node cluster:

```text
/root/pve-cluster-vm-confirm-cluster-node-01-YYYYMMDD-HHMMSS.txt
/root/pve-cluster-vm-confirm-cluster-node-02-YYYYMMDD-HHMMSS.txt
/root/pve-cluster-vm-confirm-cluster-node-03-YYYYMMDD-HHMMSS.txt
```

---

## How to Read the Cluster Confirmation

For each suspicious volume, search the output from every node (three files in the example above).

Example candidate:

```text
vm-107-disk-1.qcow2
```

Check whether it appears in any VM config. Run this on each node, because `/etc/pve/qemu-server/` holds only the configs of VMs assigned to the local node:

```bash
grep -RF "vm-107-disk-1.qcow2" /etc/pve/qemu-server/ || true
```

If reviewing output files manually, look for config lines such as:

```text
scsi0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
virtio0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
efidisk0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
tpmstate0: iscsi-cluster-lvm:vm-107-disk-1.qcow2
```

If a volume appears in any of those lines, it is attached to a VM and must not be deleted.

---

# Phase 4: Classify Candidate Volumes

Use the following decision table.
| Condition                                                             | Classification      | Action                                         |
| --------------------------------------------------------------------- | ------------------- | ---------------------------------------------- |
| Volume appears in any VM config on any node                           | In use              | Do not delete                                  |
| Volume is opened by QEMU or another process                           | In use or unsafe    | Do not delete                                  |
| Volume is a `snap_vm-*` snapshot volume                               | Snapshot-chain item | Inspect snapshot/backing chain before deletion |
| Volume does not appear in any VM config and is not open               | Orphan candidate    | Eligible for final verification                |
| VMID no longer exists in cluster resources and disk is not referenced | Strong orphan       | Eligible for cleanup                           |

---

## Example: Valid VM Disks

If VM `107` has this config:

```text
efidisk0: iscsi-cluster-lvm:vm-107-disk-2.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-3.qcow2
```

Then these disks are valid and must not be deleted:

```text
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2
```

If storage also contains:

```text
vm-107-disk-0.qcow2
vm-107-disk-1.qcow2
```

and neither appears in any config file on any node, those are orphan candidates.

---

# Phase 5: Final Verification Before Deletion

For each candidate volume, run the following checks on a node that can see the shared storage.

Replace the volume name as appropriate.
```bash
VOL="vm-107-disk-1.qcow2"
STORAGE="iscsi-cluster-lvm"
VG="vg_proxmox_iscsi"

echo "===== Check all cluster config references ====="
# /etc/pve/nodes/*/qemu-server/ holds the VM configs of every node;
# /etc/pve/qemu-server/ covers the local node only.
grep -RF -- "$VOL" /etc/pve/nodes/*/qemu-server/ || true

echo
echo "===== Check Proxmox storage listing ====="
pvesm list "$STORAGE" | grep -F -- "$VOL" || true

echo
echo "===== Check LVM volume ====="
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices "$VG" | grep -F -- "$VOL" || true

echo
echo "===== Check whether open by any process ====="
lsof | grep -F -- "$VOL" || true

echo
echo "===== Check qemu-img metadata if device path exists ====="
LVPATH="$(lvs --noheadings -o lv_path "${VG}/${VOL}" 2>/dev/null | awk '{$1=$1;print}')"
if [ -n "$LVPATH" ] && [ -e "$LVPATH" ]; then
  qemu-img info --backing-chain "$LVPATH"
else
  echo "LV path missing or inactive: $LVPATH"
fi
```

Safe deletion pattern:

```text
grep -R ...       no output
pvesm list ...    shows the volume
lvs ...           shows the volume
lsof ...          no output
qemu-img info ... no unexpected backing file dependency
```

!!! danger "Stop if grep finds a reference"
    If the candidate volume appears in any `/etc/pve/qemu-server/*.conf` file on any node, do not delete it.

!!! danger "Stop if lsof finds a process"
    If `lsof` shows the volume is open, do not delete it.

---

# Phase 6: Cleanup Commands

## Preferred Method: Proxmox Storage Layer

Use `pvesm free` first.

Example:

```bash
pvesm free iscsi-cluster-lvm:vm-107-disk-0.qcow2
pvesm free iscsi-cluster-lvm:vm-107-disk-1.qcow2
```

Then verify:

```bash
pvesm list iscsi-cluster-lvm | grep "vm-107-disk" || true
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk" || true
vgs vg_proxmox_iscsi
pvesm status | grep -E '^(Name|iscsi-cluster-lvm)'
```

Expected result:

```text
vm-107-disk-0.qcow2 gone
vm-107-disk-1.qcow2 gone
vm-107-disk-2.qcow2 still present
vm-107-disk-3.qcow2 still present
```

---

## Fallback Method: Direct LVM Removal

Only use this if `pvesm free` refuses and the final verification confirms the volume is not referenced and not open.
```bash
lvremove /dev/vg_proxmox_iscsi/vm-107-disk-0.qcow2
lvremove /dev/vg_proxmox_iscsi/vm-107-disk-1.qcow2
```

Then refresh device nodes and verify:

```bash
vgscan --mknodes
udevadm settle
pvesm list iscsi-cluster-lvm | grep "vm-107-disk" || true
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk" || true
vgs vg_proxmox_iscsi
```

!!! warning "Prefer pvesm free over lvremove"
    `pvesm free` lets Proxmox remove the volume through its storage abstraction. Use direct `lvremove` only when Proxmox refuses and the orphan status is already proven.

---

# Phase 7: Post-Cleanup Validation

After deleting orphan volumes, validate storage and VM health.

```bash
pvesm status
vgs vg_proxmox_iscsi
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi
```

Check the affected VM’s config:

```bash
qm config 107
```

Confirm the VM still starts or remains healthy:

```bash
qm status 107
```

If the VM is running, confirm its active QEMU process only references expected disks:

```bash
# The trailing space after the VMID avoids matching VMIDs such as 1070.
ps auxww | grep "kvm -id 107 " | grep -o "/dev/vg_proxmox_iscsi/[^ ,\"]*" | sort -u
```

Expected example:

```text
/dev/vg_proxmox_iscsi/vm-107-disk-2.qcow2
/dev/vg_proxmox_iscsi/vm-107-disk-3.qcow2
```

---

# Snapshot Volume Handling

Snapshot volumes require additional review.

Examples:

```text
snap_vm-105-disk-0_Fresh_Install.qcow2
snap_vm-106-disk-0_Fresh_Install_FullyUpdated.qcow2
```

Before deleting a snapshot volume, check:

```bash
qm config <vmid>
qm listsnapshot <vmid>
grep -R "snap_vm-" /etc/pve/qemu-server/ || true
qemu-img info --backing-chain /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2
```

If the VM still has a `parent:` line or `qm listsnapshot` shows the snapshot, remove it through Proxmox first:

```bash
qm delsnapshot <vmid> <snapshot-name>
```

Only consider manual removal if:

* the VM no longer references the snapshot,
* no backing chain references the snapshot volume,
* no QEMU process has it open,
* and Proxmox cannot delete it normally.

!!! danger "Do not manually delete active snapshot-chain volumes"
    Deleting an active snapshot backing volume can corrupt the VM disk chain.

---

# Example Cleanup Walkthrough

## Scenario

VM `107` has this config:

```text
efidisk0: iscsi-cluster-lvm:vm-107-disk-2.qcow2
sata0: iscsi-cluster-lvm:vm-107-disk-3.qcow2
```

Storage contains:

```text
vm-107-disk-0.qcow2
vm-107-disk-1.qcow2
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2
```

`disk-0` and `disk-1` do not appear in any config and are not open by any process.

## Verify

```bash
# /etc/pve/nodes/*/qemu-server/ covers the VM configs of every node.
grep -RF "vm-107-disk-0.qcow2" /etc/pve/nodes/*/qemu-server/ || true
grep -RF "vm-107-disk-1.qcow2" /etc/pve/nodes/*/qemu-server/ || true
lsof | grep -F "vm-107-disk-0.qcow2" || true
lsof | grep -F "vm-107-disk-1.qcow2" || true
```

Expected output:

```text
no output
```

## Delete

```bash
pvesm free iscsi-cluster-lvm:vm-107-disk-0.qcow2
pvesm free iscsi-cluster-lvm:vm-107-disk-1.qcow2
```

## Validate

```bash
pvesm list iscsi-cluster-lvm | grep "vm-107-disk"
lvs -a -o lv_name,lv_size,lv_attr,devices vg_proxmox_iscsi | grep "vm-107-disk"
vgs vg_proxmox_iscsi
```

Expected remaining volumes:

```text
vm-107-disk-2.qcow2
vm-107-disk-3.qcow2
```

---

# Technician Checklist

Use this checklist before removing any orphan disk.

* [ ] I ran the storage orphan audit.
* [ ] I ran the cluster confirmation script on every Proxmox node.
* [ ] I confirmed the candidate volume is not referenced in any VM config.
* [ ] I confirmed the candidate volume is not open by any process.
* [ ] I confirmed the candidate volume is not part of an active snapshot chain.
* [ ] I confirmed the VMID relationship is understood.
* [ ] I used `pvesm free` first.
* [ ] I used `lvremove` only if Proxmox refused and the volume was proven orphaned.
* [ ] I validated storage state after cleanup.
* [ ] I validated the affected VM still references only expected disks.
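
For the checklist item on config references, the per-node confirmation outputs can be searched in one pass once they are copied to a single place. The following is a minimal sketch, not part of the original procedure: the function name `find_config_refs` and the collection directory in the usage comment are hypothetical, and it assumes config lines appear in the outputs in the usual `key: value` form (for example `scsi0: iscsi-cluster-lvm:vm-107-disk-1.qcow2`). Plain storage listings are skipped because they lack the `key: ` prefix.

```bash
#!/usr/bin/env bash
# Hypothetical helper: find VM config lines that reference a volume in the
# collected confirmation outputs.

# find_config_refs STORAGE VOL FILE...
# Prints "file:line" for every config-style line that mentions STORAGE:VOL.
# Always returns 0, so "no match" does not abort a calling script.
find_config_refs() {
  local storage="$1" vol="$2"
  shift 2
  # Without files grep would wait on stdin, so return early.
  [ "$#" -gt 0 ] || return 0
  grep -HE '^[a-z]+[0-9]+: ' -- "$@" 2>/dev/null \
    | grep -F -- "${storage}:${vol}" || true
}

# Usage (the directory is an assumption):
#   find_config_refs iscsi-cluster-lvm vm-107-disk-1.qcow2 \
#     /root/collected/pve-cluster-vm-confirm-*.txt
```

Empty output from every collected file is consistent with an orphan candidate. Any printed line means the volume is still referenced and must not be deleted.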
---

# Quick Reference Commands

## List shared storage volumes

```bash
pvesm list iscsi-cluster-lvm
```

## List LVs

```bash
lvs -a -o lv_name,lv_path,lv_size,lv_attr,devices vg_proxmox_iscsi
```

## Search VM configs

```bash
grep -R "vm-<vmid>-disk-" /etc/pve/qemu-server/ || true
```

## Check open files

```bash
lsof | grep "vm-<vmid>-disk-" || true
```

## Check image metadata

```bash
qemu-img info --backing-chain /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2
```

## Delete via Proxmox

```bash
pvesm free iscsi-cluster-lvm:vm-<vmid>-disk-<n>.qcow2
```

## Delete via LVM fallback

```bash
lvremove /dev/vg_proxmox_iscsi/vm-<vmid>-disk-<n>.qcow2
```

## Verify storage usage

```bash
pvesm status
vgs vg_proxmox_iscsi
```
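
One caveat for the open-file check: `lsof` may list an open LV under its resolved `/dev/dm-N` or `/dev/mapper/...` name rather than `/dev/<vg>/<lv>`, in which case a grep for the plain LV name finds nothing. Device-mapper names for LVM volumes double every hyphen inside the VG and LV names and join the two with a single hyphen. The sketch below derives that name. `dm_name` is a hypothetical helper, not a Proxmox or LVM command.

```bash
#!/usr/bin/env bash
# dm_name VG LV
# Prints the device-mapper name LVM uses for VG/LV: hyphens inside each
# name are doubled, then the two parts are joined with a single hyphen.
dm_name() {
  local vg="$1" lv="$2"
  printf '%s-%s\n' "${vg//-/--}" "${lv//-/--}"
}

dm_name vg_proxmox_iscsi vm-107-disk-1.qcow2
# prints: vg_proxmox_iscsi-vm--107--disk--1.qcow2
```

On a Proxmox node the result can be used as an extra search string for `lsof`, or passed to `dmsetup info -c -o name,open` to read the kernel open count for that device. Either check reflects only the node it runs on, so repeat it on every node.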