Documentation Restructure

This commit is contained in:
2026-01-27 05:25:22 -07:00
parent 3ea11e04ff
commit e73bb0376f
205 changed files with 469 additions and 146 deletions

View File

@@ -0,0 +1,119 @@
**Purpose**: Deploying a Windows Server Node into the Hyper-V Failover Cluster is an essential part of rebuilding and expanding the backbone of my homelab. The documentation below goes over the process of setting up a bare-metal host from scratch and integrating it into the Hyper-V Failover Cluster.
!!! note "Prerequisites & Assumptions"
This document assumes you have installed and are running a bare-metal Hewlett-Packard Enterprise server with iLO (Integrated Lights-Out) and the latest build of **Windows Server 2022 Datacenter (Desktop Experience)**.
This document also assumes that you are adding an additional server node to an existing Hyper-V Failover Cluster. This document does not outline the exact process of setting up a Hyper-V Failover Cluster from-scratch, setting up a domain, DNS server, etc. Those are assumed to already exist in the environment. Your domain controller(s) need to be online and accessible from the Failover Cluster node you are building for things to work correctly.
Download the newest build ISO of Windows Server 2022 at the [Microsoft Evaluation Center](https://go.microsoft.com/fwlink/p/?linkid=2195686&clcid=0x409&culture=en-us&country=us)
### Enable Remote Desktop
Enable Remote Desktop however you prefer, but be sure to disable NLA; see the notes below for details.
!!! warning "Disable NLA (Network Level Authentication)"
Ensure that "Allow Connections only from computers running Remote Desktop with Network Level Authentication" is un-checked. This is important because if you are running a Hyper-V Failover Cluster, if the domain controller(s) are not running, you may be effectively locked out from using Remote Desktop to access the failover cluster's nodes, forcing you to use iLO or a physical console into the server to log in and bootstrap the cluster's Guest VMs online.
This step can be disregarded if the domain controller(s) exist outside of the Hyper-V Failover Cluster.
``` powershell
# Enable Remote Desktop (NLA-Disabled)
Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Control\Terminal Server" -Name "fDenyTSConnections" -Value 0
Set-ItemProperty -Path "HKLM:\System\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp" -Name "UserAuthentication" -Value 0 Enable-NetFirewallRule -DisplayGroup "Remote Desktop"
```
### Provision Server Roles, Activate, and Domain Join
``` powershell
# Rename the server
Rename-Computer BUNNY-NODE-02
# Install Hyper-V, Failover, and MPIO Server Roles
Install-WindowsFeature -Name Hyper-V, Failover-Clustering, Multipath-IO -IncludeManagementTools
# Change edition of Windows (Then Reboot)
irm https://get.activated.win | iex
# Force activate server (KMS38)
irm https://get.activated.win | iex
# Configure DNS Servers
Get-NetAdapter | Where-Object { $_.Status -eq 'Up' } | ForEach-Object { Set-DnsClientServerAddress -InterfaceIndex $_.InterfaceIndex -ServerAddresses ("192.168.3.25","192.168.3.26") }
# Domain-join the server
Add-Computer BUNNY-LAB.io
# Restart the Server
Restart-Computer
```
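After the reboot, it is worth confirming that the roles landed and the domain join succeeded before continuing. A quick verification sketch (not part of the original procedure, using built-in cmdlets):
``` powershell
# Confirm the required roles/features are installed
Get-WindowsFeature -Name Hyper-V, Failover-Clustering, Multipath-IO | Select-Object Name, InstallState
# Confirm the server is joined to the expected domain
(Get-CimInstance -ClassName Win32_ComputerSystem).Domain
```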
## Failover Cluster Configuration
### Configure Cluster SET Networking
!!! note "Disable Embedded Ports"
We want to only use the 10GbE Cluster_SET network for both virtual machines and the virtualization host itself. This ensures that **all** traffic goes through the 10GbE team. Disable all other non-10GbE network adapters.
You will need to start off by configuring a Switch Embedded Teaming (SET) team. This is the backbone that the server will use for all Guest VM traffic as well as remote-desktop access to the server node itself. You will need to rename the network adapters to make management easier.
- Navigate to "Network Connections" then "Change Adapter Options"
* Rename the network adapters with simpler names. e.g. (`Ethernet 1` becomes `Port_1`)
* For the sake of demonstration, assume there are 2 10GbE NICs (`Port_1` and `Port_2`)
``` powershell
# Create Switch Embedded Teaming (SET) team
New-VMSwitch -Name Cluster_SET -NetAdapterName Port_1, Port_2 -EnableEmbeddedTeaming $true
# Disable IPv4 and IPv6 on all other network adapters
Get-NetAdapter | Where-Object { $_.Name -ne "vEthernet (Cluster_SET)" } | ForEach-Object { Set-NetAdapterBinding -Name $_.Name -ComponentID "ms_tcpip" -Enabled $false; Set-NetAdapterBinding -Name $_.Name -ComponentID "ms_tcpip6" -Enabled $false }
# Set IP Address of Cluster_SET for host-access and clustering
New-NetIPAddress -InterfaceAlias "vEthernet (Cluster_SET)" -IPAddress 192.168.3.5 -PrefixLength 24 -DefaultGateway 192.168.3.1
Set-DnsClientServerAddress -InterfaceAlias "vEthernet (Cluster_SET)" -ServerAddresses ("192.168.3.25","192.168.3.26")
```
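Before moving on, it can help to confirm that the node can reach the iSCSI portal over the new network. A minimal connectivity check, assuming the TrueNAS portal address `192.168.3.3` used later in this document:
``` powershell
# Verify the iSCSI portal is reachable and listening on TCP 3260
Test-NetConnection -ComputerName 192.168.3.3 -Port 3260
```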
### Configure iSCSI Initiator to Connect to TrueNAS Core Server
Now that you have verified that the 10GbE NICs can ping their respective iSCSI target server IP addresses, add them to the iSCSI Initiator in Server Manager, which will allow us to mount the cluster storage for the Hyper-V Failover Cluster. (A PowerShell alternative to the GUI steps is sketched at the end of this section.)
- Open **Server Manager > MPIO**
* Navigate to the "Discover Multi-Paths" tab
* Check the "Add support for iSCSI devices" checkbox
* Click the "Add" button
- Open **TrueNAS Core Server**
* Navigate to the [TrueNAS Core server](http://192.168.3.3) and add the "Initiator Name" seen on the "Configuration" tab of the iSCSI Initiator on the Virtualization Host to the `Sharing > iSCSI > Initiator Groups` > "iSCSI-Connected Servers"
- Open **iSCSI Initiator**
* Click on the "Discovery" tab
* Click the "Discover Portal" button
* Enter the IP address "192.168.3.3". Leave the port as "3260".
* Example Initiator Name: `iqn.1991-05.com.microsoft:bunny-node-02.bunny-lab.io`
* Click the "Targets" tab to go back to the main page
* Click the "Refresh" button to display available iSCSI Targets
* Click on the first iSCSI Target `iqn.2005-10.org.moon-storage-01.ctl:iscsi-cluster-storage` then click the "Connect" button
* Check the "Enable Multi-Path" checkbox
* Click the "Advanced" button
* Click the "OK" button
* Navigate to "Disk Management" to bring the iSCSI drives "Online" (Dont do anything after this in Disk Management)
## Initialize and Join to Existing Failover-Cluster
### Validate Server is Ready to Join Cluster
Now it is time to validate the new server so it can be joined to the existing cluster. (A PowerShell equivalent of the validation and join steps is sketched at the end of the join section below.)
- Open **Server Manager**
* Click on the "Tools" dropdown menu
* Click on "Failover Cluster Manager"
* Click the "Validate Configuration" button in the middle of the window that appears
* Click "Next"
* Enter Server Name: `BUNNY-NODE-02.bunny-lab.io`
* Click the "Add" button, then "Next"
* Ensure "Run All Tests (Recommended)" is selected, then click "Next", then click "Next" to start.
### Join Server to Failover Cluster
* On the left-hand side, right-click on the "Failover Cluster Manager" in the tree
* Click on "Connect to Cluster"
* Enter `USAGI-CLUSTER.bunny-lab.io`
* Click "OK"
* Expand "USAGI-CLUSTER.bunny-lab.io" on the left-hand tree
* Right-click on "Nodes"
* Click "Add Node..."
* Click "Next"
* Enter Server Name: `BUNNY-NODE-02.bunny-lab.io`
* Click the "Add" button, then "Next"
* Ensure that "Run Configuration Validation Tests" radio box is checked, then click "Next"
* Validate that the node was successfully added to the Hyper-V Failover Cluster
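The validation and join can also be performed with the FailoverClusters PowerShell module. A rough equivalent of the GUI steps above, run from the new node or any existing cluster node:
``` powershell
# Validate the new node
Test-Cluster -Node "BUNNY-NODE-02.bunny-lab.io"
# Add the node to the existing failover cluster
Add-ClusterNode -Name "BUNNY-NODE-02.bunny-lab.io" -Cluster "USAGI-CLUSTER.bunny-lab.io"
# Confirm the node shows up and is online
Get-ClusterNode -Cluster "USAGI-CLUSTER.bunny-lab.io"
```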
## Cleanup & Final Touches
Ensure that you run all available Windows Updates before delegating guest VM roles to the new server in the failover cluster. This ensures you are up-to-date before you become reliant on the server for production operations.

View File

@@ -0,0 +1,80 @@
**Purpose**: If you run multiple Hyper-V Failover Clusters with replication handled by a `Hyper-V Replica Broker` role installed within the Failover Cluster, a GuestVM will sometimes fail to replicate itself to the replica cluster and be unable to recover on its own. This guide outlines the process of rebuilding replication for GuestVMs on a one-by-one basis.
!!! note "Assumptions"
This guide assumes you have two Hyper-V Failover Clusters; for the sake of the guide, we will refer to the Production cluster as `CLUSTER-01` and the Replication cluster as `CLUSTER-02`. This guide also assumes that replication was set up beforehand, and does not include instructions on how to deploy a Replica Broker (at this time).
## Production Cluster - CLUSTER-01
### Locate the GuestVM
You need to start by locating the GuestVM in the Production cluster, CLUSTER-01. You will know you found the VM if the "Replication Health" is either `Unhealthy`, `Warning`, or `Critical`.
### Remove Replication from GuestVM
- Within a node of the Hyper-V: Failover Cluster Manager
- Right-Click the GuestVM
- Navigate to "**Replication > Remove Replication**"
- Confirm the removal by clicking the "**Yes**" button. You will know if it removed replication when the "Replication State" of the GuestVM is `Not enabled`
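If you prefer PowerShell, the same removal can be done from the node that currently owns the VM. A minimal sketch, assuming the Hyper-V module and a VM named `GuestVM`:
``` powershell
# Remove replication for the GuestVM and confirm its replication state
Remove-VMReplication -VMName "GuestVM"
(Get-VM -Name "GuestVM").ReplicationState   # Expect: Disabled
```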
## Replication Cluster - CLUSTER-02
### Note the storage GUID of the GuestVM in the replication cluster
- Within a node of the replication cluster's Hyper-V: Failover Cluster Manager
- Right-Click the same GuestVM and click "Manage..." `This will open Hyper-V Manager`
- Right-Click the GuestVM and click "Settings..."
- Navigate to "**ISCSI Controller**"
- Click on one of the Virtual Disks attached to the replica VM, and note the full folder path for later. e.g. `C:\ClusterStorage\Volume1\HYPER-V REPLICA\VIRTUAL HARD DISKS\020C9A30-EB02-41F3-8D8B-3561C4521182`
!!! warning "Noting the GUID of the GuestVM"
You need to note the folder location so you have the GUID. Without the GUID, cleaning up the old storage associated with the GuestVM replica files will be much more difficult / time-consuming. Note it down somewhere safe, and reference it later in this guide.
### Delete the GuestVM from the Replication Cluster
Now that you have noted the GUID of the storage folder of the GuestVM, we can safely move onto removing the GuestVM from the replication cluster.
- Within a node of the replication cluster's Hyper-V: Failover Cluster Manager
- Right-Click the GuestVM
- Navigate to "**Replication > Remove Replication**"
- Confirm the removal by clicking the "**Yes**" button. You will know if it removed replication when the "Replication State" of the GuestVM is `Not enabled`
- Right-Click the GuestVM (again) `You will see that "Enable Replication" is an option now, indicating it was successfully removed.`
!!! note "Replica Checkpoint Merges"
When you removed replication, there may have been replication checkpoints that automatically try to merge together with a `Merge in Progress` status. Just let it finish before moving forward.
- Within the same node of the replication cluster's Hyper-V: Failover Cluster Manager `Switch back from Hyper-V Manager`
- Right-Click the GuestVM and click "**Remove**"
- Confirm the action by clicking the "**Yes**" button
### Delete the GuestVM manually from Hyper-V Manager on all replication cluster hosts
At this point, we need to remove the GuestVM from all of the servers in the replication cluster; removing it from the Hyper-V Failover Cluster did not remove it from the cluster's nodes. We can streamline this work by opening Hyper-V Manager on the same node we have been working on thus far and connecting the remaining replication nodes to it, giving us one place to manage all of the nodes without hopping between servers.
- Open Hyper-V Manager
- Right-Click "Hyper-V Manager" on the left-hand navigation menu
- Click "Connect to Server..."
- Type the names of every node in the replication cluster to connect to each of them, repeating the two steps above for every node
- Remove GuestVM from the node it appears on
- On one of the replication cluster nodes, you will see the GuestVM listed; right-click the GuestVM and select "**Delete**"
### Delete the GuestVM's replicated VHDX storage from replication ClusterStorage
Now we need to clean up the storage left behind by the replication cluster.
- Within a node of the replication cluster
- Navigate to `C:\ClusterStorage\Volume1\HYPER-V REPLICA\VIRTUAL HARD DISKS`
- Delete the entire GUID folder noted in the previous steps. `e.g. 020C9A30-EB02-41F3-8D8B-3561C4521182`
## Production Cluster - CLUSTER-01
### Re-Enable Replication on GuestVM in Cluster-01 (Production Cluster)
At this point, we have disabled replication for the GuestVM and cleaned up traces of it in the replication cluster. Now we need to re-enable replication on the GuestVM back in the production cluster. (A PowerShell sketch of this step is included at the end of this section.)
- Within a node of the production Hyper-V: Failover Cluster Manager
- Right-Click the GuestVM
- Navigate to "**Replication > Enable Replication...**"
- Click "Next"
- For the "**Replica Server**", enter the name of the role of the Hyper-V Replica Broker role in the (replication cluster's) Failover Cluster. `e.g. CLUSTER-02-REPL`, then click "Next"
- Click the "Select Certificate" button, since the Broker was configured with Certificate-based authentication instead of Kerberos (in this example environment). It will prompt you to accept the certificate by clicking "OK". (e.g. `HV Replica Root CA`), then click "Next"
- Make sure every drive you want replicated is checked, then click "Next"
- Replication Frequency: `5 Minutes`, then click "Next"
- Additional Recovery Points: `Maintain only the latest recovery point`, then click "Next"
- Initial Replication Method: `Send initial copy over the network`
- Schedule Initial Replication: `Start replication immediately`
- Click "Next"
- Click "Finish"
!!! success "Replication Enabled"
If everything was successful, you will see a dialog box named "Enable replication for `<GuestVM>`" with a message similar to the following: "Replica virtual machine `<GuestVM>` was successfully created on the specified Replica server `<Node-in-Replication-Cluster>`."
At this point, you can click "Close" to finish the process. Under the GuestVM details, you will see "Replication State": `Initial Replication in Progress`.
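The re-enablement can also be scripted from the production node that owns the VM. A sketch assuming certificate-based authentication as described above; the VM name and certificate thumbprint are placeholders you must substitute:
``` powershell
# Re-enable replication to the replica broker using certificate-based authentication
Enable-VMReplication -VMName "GuestVM" `
    -ReplicaServerName "CLUSTER-02-REPL.bunny-lab.io" `
    -ReplicaServerPort 443 `
    -AuthenticationType Certificate `
    -CertificateThumbprint "<Replication-Certificate-Thumbprint>" `
    -ReplicationFrequencySec 300
# Start the initial replication over the network immediately
Start-VMInitialReplication -VMName "GuestVM"
```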

View File

@@ -0,0 +1,35 @@
**Purpose**:
You may want to live-migrate GuestVMs in a Hyper-V environment that is not clustered as a Hyper-V Failover Cluster, and when you try, you will run into permission issues. One way to work around this is to use CredSSP as the authentication mechanism, which is not ideal but useful in a pinch; alternatively, you can use Kerberos-based authentication.
This document will cover both scenarios.
=== "Kerberos Authentication (*Preferred*)"
- Log into a domain controller that both Hyper-V hosts are capable of communicating with
- Open "**Server Manager > Tools " Active Directory Users & Computers**"
- Locate the computer objects representing both of the Hyper-V servers and repeat the steps below for each Hyper-V computer object:
- Right-Click > "**Properties**"
- Click on the "**Delegation**" Tab
- Check the radio button for "**Trust this computer for delegation to specified services only**"
- Ensure that "**Use Kerberos only**" is checked
- Click on the "**Add**" button
- Click the "**Users or Computers...**" button
- Within the object search field, type in the name of the Hyper-V server you want to delegate access to (this will be the opposite host. e.g. VIRT-NODE-02, then repeat these steps later to delegate access for VIRT-NODE-01, etc)
- You will see a list of services that you can allow delegation to, add the following services:
- `cisvc`
- `mcsvc`
- `cifs`
- `Virtual Machine Migration Service`
- `Microsoft Virtualization Console`
- Click the "**Apply**" button, then click the "**OK**" button to finalize these changes.
- Repeat the above steps for the opposite Hyper-V host. This way, both hosts are delegated to each other
- e.g. `VIRT-NODE-01 <---(delegation)---> VIRT-NODE-02`
=== "CredSSP Authentication"
- Log into both Hyper-V hosts as the same administrative user, preferably a domain administrator
- From the Hyper-V host currently running the GuestVM that needs to be migrated, open Hyper-V Manager and right-click > "**Move**" the guestVM.
- Select the destination by providing the fully-qualified domain name of the destination server (or in some cases the shorthand hostname of the destination server)
- It should begin the migration process.
**Note**: Do not perform a "Pull" from source to the destination. You want to always "Push" the VM to its destination. It will generally fail if you try to "Pull" the VM to its destination due to the way that CredSSP works in this context.
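Whichever authentication mechanism you use, the host settings and the migration itself can also be driven from PowerShell. A minimal sketch, assuming the Hyper-V module, the example hostnames above, and a placeholder destination storage path:
``` powershell
# Allow live migrations on both hosts and pick the authentication type
Enable-VMMigration
Set-VMHost -VirtualMachineMigrationAuthenticationType Kerberos   # Use CredSSP for the second scenario
# Push a running VM (with its storage) from the source host to the destination
Move-VM -Name "GuestVM" -DestinationHost "VIRT-NODE-02.bunny-lab.io" -IncludeStorage -DestinationStoragePath "D:\Hyper-V\GuestVM"
```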

View File

@@ -0,0 +1,150 @@
!!! warning "Document Under Construction"
This document is very unfinished and should **NOT** be followed by anyone for deployment at this time.
**Purpose**: Deploying OpenStack via Ansible.
## Required Hardware/Infrastructure Breakdown
Every node in the OpenStack environment (including the deployment node) will be running Rocky Linux 9.5, as OpenStack Ansible only supports CentOS/RHEL/Rocky for its deployment.
| **Hostname** | **IP** | **Storage** | **Memory** | **CPU** | **Network** | **Purpose** |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| OPENSTACK-BOOTSTRAPPER | 192.168.3.46 (eth0) | 32GB (OS) | 4GB | 4-Cores | eth0 | OpenStack Ansible Playbook Deployment Node |
| OPENSTACK-NODE-01 | 192.168.3.43 (eth0) | 250GB (OS), 500GB (Ceph Storage) | 32GB | 16-Cores | eth0, eth1 | OpenStack Cluster/Target Node |
| OPENSTACK-NODE-02 | 192.168.3.44 (eth0) | 250GB (OS), 500GB (Ceph Storage) | 32GB | 16-Cores | eth0, eth1 | OpenStack Cluster/Target Node |
| OPENSTACK-NODE-03 | 192.168.3.45 (eth0) | 250GB (OS), 500GB (Ceph Storage) | 32GB | 16-Cores | eth0, eth1 | OpenStack Cluster/Target Node |
## Configure Hard-Coded DNS for Cluster Nodes
We want to ensure everything works even if the nodes have no internet access. Hard-coding the FQDNs protects us against several avoidable failure scenarios.
Run the following script to add the DNS entries.
```sh
# Make yourself root
sudo su
```
!!! note "Run `sudo su` Separately"
When I ran `sudo su` and the echo commands below as one block of commands, it did not correctly write the changes to the `/etc/hosts` file. Just run `sudo su` by itself, then you can copy paste the codeblock below for all of the echo lines for each DNS entry.
```sh
# Add the OpenStack node entries to /etc/hosts
echo "192.168.3.43 OPENSTACK-NODE-01.bunny-lab.io OPENSTACK-NODE-01" >> /etc/hosts
echo "192.168.3.44 OPENSTACK-NODE-02.bunny-lab.io OPENSTACK-NODE-02" >> /etc/hosts
echo "192.168.3.45 OPENSTACK-NODE-03.bunny-lab.io OPENSTACK-NODE-03" >> /etc/hosts
```
### Validate DNS Entries Added
```sh
cat /etc/hosts
```
!!! example "/etc/hosts Example Contents"
When you run `cat /etc/hosts`, you should see output similar to the following:
```ini title="/etc/hosts"
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.3.43 OPENSTACK-NODE-01.bunny-lab.io OPENSTACK-NODE-01
192.168.3.44 OPENSTACK-NODE-02.bunny-lab.io OPENSTACK-NODE-02
192.168.3.45 OPENSTACK-NODE-03.bunny-lab.io OPENSTACK-NODE-03
```
## OpenStack Deployment Node
The "Deployment" node / bootstrapper is responsible for running Ansible playbooks against the cluster nodes that will eventually be running OpenStack. [Original Deployment Node Documentation](https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/deploymenthost.html)
### Install Necessary Software
```sh
sudo su
dnf upgrade
dnf install -y git chrony openssh-server python3-devel sudo
dnf group install -y "Development Tools"
```
### Configure SSH keys
Ansible uses SSH with public key authentication to connect the deployment host and target hosts. Run the following commands to configure this.
!!! warning "Do not run as root"
You want to make sure you run these commands as a normal user. (e.g. `nicole`).
``` sh
# Generate SSH Keys (Private / Public)
ssh-keygen
# Install Public Key on OpenStack Cluster/Target Nodes
ssh-copy-id -i /home/nicole/.ssh/id_rsa.pub nicole@openstack-node-01.bunny-lab.io
ssh-copy-id -i /home/nicole/.ssh/id_rsa.pub nicole@openstack-node-02.bunny-lab.io
ssh-copy-id -i /home/nicole/.ssh/id_rsa.pub nicole@openstack-node-03.bunny-lab.io
# Validate that SSH Authentication Works Successfully on Each Node
ssh nicole@openstack-node-01.bunny-lab.io
ssh nicole@openstack-node-02.bunny-lab.io
ssh nicole@openstack-node-03.bunny-lab.io
```
### Install the source and dependencies
Install the source and dependencies for the deployment host.
```sh
sudo su
git clone -b master https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible
cd /opt/openstack-ansible
bash scripts/bootstrap-ansible.sh
```
### Disable Firewalld
The `firewalld` service is enabled on most CentOS systems by default and its default ruleset prevents OpenStack components from communicating properly. Stop the firewalld service and mask it to prevent it from starting.
```sh
systemctl stop firewalld
systemctl mask firewalld
```
## OpenStack Target Node (1/3)
Now we need to get the cluster/target nodes configured so that OpenStack can be deployed into them via the bootstrapper node later. [Original Target Node Documentation](https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html)
### Disable SELinux
SELinux enabled is not currently supported in OpenStack-Ansible for CentOS/RHEL due to a lack of maintainers for the feature.
```sh
sudo sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/sysconfig/selinux
```
### Disable Firewalld
The `firewalld` service is enabled on most CentOS systems by default and its default ruleset prevents OpenStack components from communicating properly. Stop the firewalld service and mask it to prevent it from starting.
```sh
systemctl stop firewalld
systemctl mask firewalld
```
### Install Necessary Software
```sh
dnf upgrade
dnf install -y iputils lsof openssh-server sudo tcpdump python3
```
### Reduce Kernel Logging
Reduce the kernel log level by changing the printk value in your sysctls.
```sh
sudo echo "kernel.printk='4 1 7 4'" >> /etc/sysctl.conf
```
### Configure Local Cinder/Ceph Storage (Optional if using iSCSI)
At this point, we need to configure `/dev/sdb` as the local storage for Cinder.
```sh
pvcreate --metadatasize 2048 /dev/sdb
vgcreate cinder-volumes /dev/sdb
```
!!! failure "`Cannot use /dev/sdb: device is partitioned`"
You may (in rare cases) see the following error when trying to run `pvcreate --metadatasize 2048 /dev/sdb`. If that happens, just use `lsblk` to identify the expected disk. In my example, we want the 500GB disk located at `/dev/sda`, as seen in the example below:
```
[root@openstack-node-02 nicole]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 500G 0 disk
sdb 8:16 0 250G 0 disk
├─sdb1 8:17 0 600M 0 part /boot/efi
├─sdb2 8:18 0 1G 0 part /boot
├─sdb3 8:19 0 15.7G 0 part [SWAP]
└─sdb4 8:20 0 232.7G 0 part /
sr0 11:0 1 1024M 0 rom
```
!!! question "End of Current Documentation"
This is the end of where I have currently iterated in my lab and followed-along with the official documentation while generalizing it for my specific lab scenarios. The following link is where I am currently at/stuck and need to revisit at my earliest convenience.
https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/targethosts.html#configuring-the-network

View File

@@ -0,0 +1,76 @@
# OpenStack
OpenStack is basically a virtual machine hypervisor that is HA and cluster-friendly. This particular variant is deployed via Canonical's MicroStack environment using Snap. It will deploy OpenStack onto a single node, which can later be expanded to additional nodes. You can also use something like OpenShift to deploy a Kubernetes Cluster onto OpenStack automatically via its various APIs.
**Reference Documentation**:
- https://discourse.ubuntu.com/t/single-node-guided/35765
- https://microstack.run/docs/single-node-guided
!!! note
This document assumes your bare-metal host server is running Ubuntu 22.04 LTS, has at least 16GB of Memory (**32GB for Multi-Node Deployments**), two network interfaces (one for management, one for remote VM access), 200GB of Disk Space for the root filesystem, another 200GB disk for Ceph distributed storage, and 4 processor cores. See [Single-Node Mode System Requirements](https://ubuntu.com/openstack/install)
!!! note "Assumed Networking on the First Cluster Node"
- **eth0** = 192.168.3.5
- **eth1** = 192.168.5.200
### Update APT then install upgrades
```
sudo apt update && sudo apt upgrade -y && sudo apt install htop ncdu iptables nano -y
```
!!! tip
At this time, it would be a good idea to take a checkpoint/snapshot of the server (if it is a virtual machine). This gives you a starting point to come back to as you troubleshoot inevitable deployment issues.
### Update SNAP then install OpenStack SNAP
```
sudo snap refresh
sudo snap install openstack --channel 2023.1
```
### Install & Configure Dependencies
Sunbeam can generate a script to ensure that the machine has all of the required dependencies installed and is configured correctly for use in MicroStack.
```
sunbeam prepare-node-script | bash -x && newgrp snap_daemon
sudo reboot
```
### Bootstrapping
Deploy the OpenStack cloud using the cluster bootstrap command.
```
sunbeam cluster bootstrap
```
!!! warning
If you get an "Unable to connect to websocket" error, run `sudo snap restart lxd`.
[Known Bug Report](https://bugs.launchpad.net/snap-openstack/+bug/2033400)
!!! note
Management networks shared by hosts = `192.168.3.0/24`
MetalLB address allocation range (supports multiple ranges, comma separated) (10.20.21.10-10.20.21.20): `192.168.3.50-192.168.3.60`
### Cloud Initialization:
- nicole@moon-stack-01:~$ `sunbeam configure --openrc demo-openrc`
- Local or remote access to VMs [local/remote] (local): `remote`
- CIDR of network to use for external networking (10.20.20.0/24): `192.168.5.0/24`
- IP address of default gateway for external network (192.168.5.1):
- Populate OpenStack cloud with demo user, default images, flavors etc [y/n] (y):
- Username to use for access to OpenStack (demo): `nicole`
- Password to use for access to OpenStack (Vb********): `<PASSWORD>`
- Network range to use for project network (192.168.122.0/24):
- List of nameservers guests should use for DNS resolution (192.168.3.11 192.168.3.10):
- Enable ping and SSH access to instances? [y/n] (y):
- Start of IP allocation range for external network (192.168.5.2): `192.168.5.201`
- End of IP allocation range for external network (192.168.5.254): `192.168.5.251`
- Network type for access to external network [flat/vlan] (flat):
- Free network interface that will be configured for external traffic: `eth1`
- WARNING: Interface eth1 is configured. Any configuration will be lost, are you sure you want to continue? [y/n]: y
### Pull Down / Generate the Dashboard URL
```
sunbeam openrc > admin-openrc
sunbeam dashboard-url
```
### Launch a Test VM:
Verify the cloud by launching a VM called test based on the ubuntu image (Ubuntu 22.04 LTS).
```
sunbeam launch ubuntu --name test
```
!!! note "Sample Output"
- Launching an OpenStack instance ...
- Access instance with `ssh -i /home/ubuntu/.config/openstack/sunbeam ubuntu@10.20.20.200`

View File

@@ -0,0 +1,116 @@
## Purpose
You may need to deploy many copies of a virtual machine rapidly, and don't want to go through the hassle of setting up everything ad-hoc as the needs arise for each VM workload. Creating a cloud-init template allows you to more rapidly deploy production-ready copies of a template VM (that you create below) into a ProxmoxVE environment.
### Download Image and Import into ProxmoxVE
You will first need to pull down the OS image from Ubuntu's website via CLI, as there is currently no way to do this via the WebUI. Using SSH or the Shell within the WebUI of one of the ProxmoxVE servers, run the following commands to download and import the image into ProxmoxVE.
```sh
# Make a place to keep cloud images
mkdir -p /var/lib/vz/template/images/ubuntu && cd /var/lib/vz/template/images/ubuntu
# Download Ubuntu 24.04 LTS cloud image (amd64, server)
wget -q --show-progress https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
# Create a Placeholder VM to Attach Cloud Image
qm create 9000 --name ubuntu-2404-cloud --memory 8192 --cores 8 --net0 virtio,bridge=vmbr0
# Set UEFI (OVMF) + SCSI controller (Cloud images expect UEFI firmware and SCSI disk.)
qm set 9000 --bios ovmf --scsihw virtio-scsi-pci
qm set 9000 --efidisk0 nfs-cluster-storage:0,pre-enrolled-keys=1
# Import the disk into ProxmoxVE
qm importdisk 9000 noble-server-cloudimg-amd64.img nfs-cluster-storage --format qcow2
# Query ProxmoxVE to find out where the volume was created
pvesm list nfs-cluster-storage | grep 9000
# Attach the disk to the placeholder VM
qm set 9000 --scsi0 nfs-cluster-storage:9000/vm-9000-disk-0.qcow2
# Configure Disk to Boot
qm set 9000 --boot c --bootdisk scsi0
```
### Add Cloud-Init Drive & Configure Template Defaults
Now that the Ubuntu cloud image is attached as the VM's primary disk, you need to attach a Cloud-Init drive. This special drive is where Proxmox writes your user data (username, SSH keys, network settings, etc.) at clone time.
```sh
# Add a Cloud-Init drive to the VM
qm set 9000 --ide2 nfs-cluster-storage:cloudinit
# Enable QEMU Guest Agent
qm set 9000 --agent enabled=1
# Set a default Cloud-Init user (replace 'nicole' with your preferred username)
qm set 9000 --ciuser nicole
# Set a default password (this will be resettable per-clone)
qm set 9000 --cipassword 'SuperSecretPassword'
# Set DNS Servers and Search Domain
qm set 9000 --nameserver "1.1.1.1 1.0.0.1"
qm set 9000 --searchdomain bunny-lab.io
# Enable automatic package upgrades within the VM on first boot
qm set 9000 --ciupgrade 1
# Download your infrastructure public SSH key onto the Proxmox node
wget -O /root/infrastructure_id_rsa.pub \
https://git.bunny-lab.io/Infrastructure/LinuxServer_SSH_PublicKey/raw/branch/main/id_rsa.pub
# Tell Proxmox to inject this key via Cloud-Init
qm set 9000 --sshkey /root/infrastructure_id_rsa.pub
# Configure networking to use DHCP by default (this will be overridden at cloning)
qm set 9000 --ipconfig0 ip=dhcp
```
### Setup Packages in VM & Convert to Template
At this point, there are a few things we need to do before we can turn the VM into a template and make clones of it. Boot up the VM we made (ID 9000) and run the following commands to prepare it for becoming a template:
```sh
# Run these commands inside the VM (ID 9000):
# Install Updates
sudo apt update && sudo apt upgrade
sudo apt install -y qemu-guest-agent cloud-init
sudo systemctl enable qemu-guest-agent --now
# (Any additional package installation / customization goes here)
# Shut the VM down, then run this on the ProxmoxVE host:
# Convert the placeholder VM into a reusable template (ignore chattr errors on NFS storage backends)
qm template 9000
```
### Clone the Template into a New VM
You can now create new VMs instantly from the template we created above.
=== "Via WebUI"
- Log into the ProxmoxVE node where the template was created
- Right-Click the Template > "**Clone**"
- Give the new VM a name
- Set the "Mode" of the clone to "**Full Clone**"
- Navigate to the new GuestVM in ProxmoxVE and click on the "**Cloud-Init**" tab
- Change the "**User**" and "**Password**" fields if you want to change them
- Double-click on the "**IP Config (net0)**" option
- **IPv4/CIDR**: `192.168.3.67/24`
- **Gateway (IPv4)**: `192.168.3.1`
- Click the "**OK**" button
- Start the VM and wait for it to automatically provision itself
=== "Via CLI"
``` sh
# Create a new VM (example: VM 9100) cloned from the template
qm clone 9000 9100 --name ubuntu-2404-test --full
# Optionally, override Cloud-Init settings for this clone:
qm set 9100 --ciuser nicole --cipassword 'AnotherStrongPass'
qm set 9100 --ipconfig0 ip=192.168.3.67/24,gw=192.168.3.1
# Boot the new cloned VM
qm start 9100
```
### Configure VM Hostname
At this point, the hostname of the VM will be randomized and you will probably want to set it statically. Assuming a standard Ubuntu cloud image, a minimal approach is to run the following commands after the server has finished starting (the hostname below is an example):
```sh
# Set the hostname statically (replace with the desired name)
sudo hostnamectl set-hostname ubuntu-2404-test
# Reflect the new name in /etc/hosts as well
sudo nano /etc/hosts
```

View File

@@ -0,0 +1,7 @@
**Purpose**: The purpose of this document is to outline common maintenance tasks that you may need to perform in your ProxmoxVE cluster.
## Delete Node from Cluster
Sometimes you may need to delete a node from the cluster if you have re-built it or had issues and needed to destroy it. In these instances, you would run the following command (assuming you have a 3-node quorum in your cluster).
```
pvecm delnode proxmox-node-01
```
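Afterwards, you can confirm the remaining cluster membership and quorum from any surviving node:
```
pvecm status
pvecm nodes
```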

View File

@@ -0,0 +1,245 @@
## Purpose
This document describes the **end-to-end procedure** for creating a **thick-provisioned iSCSI-backed shared storage target** on **TrueNAS CORE**, and consuming it from a **Proxmox VE cluster** using **shared LVM**.
This approach is intended to:
- Provide SAN-style block semantics
- Enable Proxmox-native snapshot functionality (LVM volume chains)
- Avoid third-party plugins or middleware
- Be fully reproducible via CLI
## Assumptions
- TrueNAS **CORE** (not SCALE)
- ZFS pool already exists and is healthy
- SSH service is enabled on TrueNAS
- Proxmox VE nodes have network connectivity to TrueNAS
- iSCSI traffic is on a reliable, low-latency network (10GbE recommended)
- All VM workloads are drained from at least one Proxmox node for maintenance
!!! note "Proxmox VE Version Context"
This guide assumes **Proxmox VE 9.1.4 (or later)**. Snapshot-as-volume-chain support on shared LVM (e.g., iSCSI) is available and improved, including enhanced handling of vTPM state in offline snapshots.
!!! warning "Important"
`volblocksize` **cannot be changed after zvol creation**. Choose carefully.
## Target Architecture
```
ZFS Pool
└─ Zvol (Thick / Reserved)
└─ iSCSI Extent
└─ Proxmox LVM PV
└─ Shared VG
└─ VM Disks
```
## Create a Dedicated Zvol for Proxmox
### Variables
Adjust as needed before execution.
```sh
POOL_NAME="CLUSTER-STORAGE"
ZVOL_NAME="iscsi-storage"
ZVOL_SIZE="14T"
VOLBLOCKSIZE="16K"
```
### Create the Zvol (Thick-Provisioned)
```sh
zfs create -V ${ZVOL_SIZE} \
-o volblocksize=${VOLBLOCKSIZE} \
-o compression=lz4 \
-o refreservation=${ZVOL_SIZE} \
${POOL_NAME}/${ZVOL_NAME}
```
!!! note
The `refreservation` enforces **true thick provisioning** and prevents overcommit.
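Before exposing the zvol over iSCSI, you can confirm it was created with the intended geometry and reservation (a quick check using the variables defined above):
```sh
zfs get volsize,volblocksize,refreservation,compression ${POOL_NAME}/${ZVOL_NAME}
```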
## Configure iSCSI Target (TrueNAS CORE)
This section uses a **hybrid approach**:
- **CLI** is used for ZFS and LUN (extent backing) creation
- **TrueNAS GUI** is used for iSCSI portal, target, and association
- **CLI** is used again for validation
### Enable iSCSI Service
```sh
service ctld start
sysrc ctld_enable=YES
```
### Create the iSCSI LUN Backing (CLI)
This step creates the **actual block-backed LUN** that will be exported via iSCSI.
```sh
# Sanity check: confirm the backing zvol exists
ls -l /dev/zvol/${POOL_NAME}/${ZVOL_NAME}
# Create CTL LUN backed by the zvol
ctladm create -b block \
-o file=/dev/zvol/${POOL_NAME}/${ZVOL_NAME} \
-S ISCSI-STORAGE \
-d ISCSI-STORAGE
```
### Verify the LUN is real and correctly sized
```sh
ctladm devlist -v
```
!!! tip
`Size (Blocks)` must be **non-zero** and match the zvol size. If it is `0`, stop and correct before proceeding.
### Configure iSCSI Portal, Target, and Extent Association (CLI Only)
!!! warning "Do NOT Use the TrueNAS iSCSI GUI"
**Once you choose a CLI-managed iSCSI configuration, the TrueNAS Web UI must never be used for iSCSI.**
Opening or modifying **Sharing → Block Shares (iSCSI)** in the GUI will **overwrite CTL runtime state**, invalidate manual `ctladm` configuration, and result in targets that appear correct but expose **no LUNs** to initiators.
**This configuration is CLI-owned and CLI-managed.**
- Do **not** add, edit, or view iSCSI objects in the GUI
- Do **not** use the iSCSI wizard
- Do **not** mix GUI extents with CLI-created LUNs
#### Create iSCSI Portal (Listen on All Interfaces)
```sh
# Backup any existing ctl.conf
cp -av /etc/ctl.conf /etc/ctl.conf.$(date +%Y%m%d-%H%M%S).bak 2>/dev/null || true
# Write a clean /etc/ctl.conf
cat > /etc/ctl.conf <<'EOF'
# --- Bunny Lab: Proxmox iSCSI (CLI-only) ---
auth-group "no-auth" {
auth-type none
initiator-name "iqn.1993-08.org.debian:01:5b963dd51f93" # cluster-node-01 ("cat /etc/iscsi/initiatorname.iscsi")
initiator-name "iqn.1993-08.org.debian:01:1b4df0fa3540" # cluster-node-02 ("cat /etc/iscsi/initiatorname.iscsi")
initiator-name "iqn.1993-08.org.debian:01:5669aa2d89a2" # cluster-node-03 ("cat /etc/iscsi/initiatorname.iscsi")
}
# Listen on all interfaces on the default iSCSI port
portal-group "pg0" {
listen 0.0.0.0:3260
discovery-auth-group "no-auth"
}
# Create a target IQN
target "iqn.2026-01.io.bunny-lab:storage" {
portal-group "pg0"
auth-group "no-auth"
# Export LUN 0 backed by the zvol device
lun 0 {
path /dev/zvol/CLUSTER-STORAGE/iscsi-storage
serial "ISCSI-STORAGE"
device-id "ISCSI-STORAGE"
}
}
EOF
# Restart ctld to apply the configuration file
service ctld restart
# Verify the iSCSI listener is actually up
sockstat -4l | grep ':3260'
# Verify CTL now shows an iSCSI frontend
ctladm portlist -v | egrep -i '(^Port|iscsi|listen=)'
```
!!! success
At this point, the iSCSI target is live and correctly exposing a block device to initiators. You may now proceed to **Connect from ProxmoxVE Nodes** section.
## Connect from ProxmoxVE Nodes
Perform the following **on each Proxmox node**.
```sh
# Install iSCSI Utilities
apt update
apt install -y open-iscsi lvm2
# Discover Target
iscsiadm -m discovery -t sendtargets -p <TRUENAS_IP>
# Log In
iscsiadm -m node --login
# Show session details to confirm the login succeeded
iscsiadm -m session -P 3
# Verify Device
# If everything works successfully, you should see something like "sdi 8:128 0 8T 0 disk".
lsblk
```
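By default, `open-iscsi` node records may be set to manual startup, meaning the LUN will not reappear automatically after a reboot. Assuming you want the sessions restored at boot, mark the discovered node records for automatic login:
```sh
# Log back into all discovered targets automatically at boot
iscsiadm -m node --op update -n node.startup -v automatic
```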
## Create Shared LVM (Execute on One Node Only)
!!! warning "Important"
**Only run LVM creation on ONE node**. All other nodes will only scan.
```sh
# Initialize Physical Volume
pvcreate /dev/sdX
# Create Volume Group
vgcreate vg_proxmox_iscsi /dev/sdX
```
## Register Storage in Proxmox
### Rescan LVM (Other Nodes)
```sh
pvscan
vgscan
```
### Add Storage (GUI)
**Datacenter → Storage → Add → LVM** (a CLI equivalent is sketched after this list)
- ID: `iscsi-cluster-lvm`
- Volume Group: `vg_proxmox_iscsi`
- Content: `Disk image, Container`
- Shared: ✔️
- Allow Snapshots as Volume-Chain: ✔️
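If you prefer the CLI, a roughly equivalent storage definition can be registered with `pvesm` (a sketch; the snapshot-as-volume-chain option from the GUI step may still need to be enabled in the web interface depending on your PVE version):
```sh
pvesm add lvm iscsi-cluster-lvm --vgname vg_proxmox_iscsi --content images,rootdir --shared 1
```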
## Validation
- Snapshot create / revert / delete
- Live migration between nodes
- PBS backup and restore test
!!! success
If all validation tests pass, the storage is production-ready.
## Expanding iSCSI Storage (No Downtime)
If you need to expand the storage space of the newly-created iSCSI LUN, you can run the ZFS commands seen below on the TrueNAS Core server. The first command increases the size, and the second command pre-allocates the space (thick-provisioned).
!!! warning "ProxmoxVE Cluster-specific Notes"
- `pvresize` must be executed on **exactly one** ProxmoxVE node.
- All other nodes should only perform `pvscan` / `vgscan` after the resize.
- Running `pvresize` on multiple nodes can corrupt shared LVM metadata.
```sh
# Expand Zvol (TrueNAS)
zfs set volsize=16T CLUSTER-STORAGE/iscsi-storage
zfs set refreservation=16T CLUSTER-STORAGE/iscsi-storage
service ctld restart
# Rescan the block device on all ProxmoxVE nodes
echo 1 > /sys/class/block/sdX/device/rescan
# Verify on all nodes that the new size is displayed
lsblk /dev/sdX
# Run this on only ONE of the ProxmoxVE nodes.
pvresize /dev/sdX
# Rescan on the other nodes that you did not run the pvresize command on. They will now see the expanded free space.
pvscan
vgscan
```

View File

@@ -0,0 +1,15 @@
## Purpose
In some very specific situations, you will find that an LVM volume group just won't come online in ProxmoxVE. If this happens, you can run the following commands (replacing the placeholder names) to manually bring the storage online.
```sh
lvchange -an local-vm-storage/local-vm-storage
lvchange -an local-vm-storage/local-vm-storage_tmeta
lvchange -an local-vm-storage/local-vm-storage_tdata
vgchange -ay local-vm-storage
```
!!! info "Be Patient"
It can take some time for everything to come online.
!!! success
If you see something like this: `6 logical volume(s) in volume group "local-vm-storage" now active`, then you successfully brought the volume online.

View File

@@ -0,0 +1,38 @@
## Purpose
There are a few steps you have to take when upgrading ProxmoxVE from 8.4.1+ to 9.0+. The process is fairly straightforward, so just follow the instructions seen below.
!!! info "GuestVM Assumptions"
It is assumed that if you are running a ProxmoxVE cluster, you will migrate all GuestVMs to another cluster node. If this is a standalone ProxmoxVE server, you will shut down all GuestVMs safely before proceeding.
!!! warning "Perform `pve8to9` Readiness Check"
It's critical that you run the `pve8to9` command to ensure that your ProxmoxVE server meets all of the requirements and doesn't have any failures or potentially server-breaking warnings. If the `pve8to9` command is unknown, then run `apt update && apt dist-upgrade` in the shell then try again. Warnings should be addressed ad-hoc, but *CPU Microcode warnings can be safely ignored*.
**Example pve8to9 Summary Output**:
```sh
= SUMMARY =
TOTAL: 48
PASSED: 39
SKIPPED: 8
WARNINGS: 1
FAILURES: 0
```
### Update Repositories from `bookworm` to `trixie`
```sh
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list.d/pve-install-repo.list
apt update
```
### Upgrade to ProxmoxVE 9.0
!!! warning "Run Upgrade Commands in iLO/iDRAC/IPMI"
At this point, it's very likely that an SSH session would be unexpectedly terminated during the upgrade, so you absolutely want to use a local or remote console to the server to run the commands below, both to ensure you maintain access to the console and to see if any issues arise during POST after the reboot.
```sh
apt dist-upgrade -y
reboot
```
!!! note "Disable `pve-enterprise` Repository"
At this point, the ProxmoxVE server should be running on v9.0+. You will want to disable the `pve-enterprise` repository, as it will goof up future updates if you don't disable it.
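A minimal way to do that from the shell, assuming the repository is still defined in `/etc/apt/sources.list.d/pve-enterprise.list` (newer installs may use a `.sources` file instead):
```sh
# Comment out the enterprise repository and refresh the package index
sed -i 's/^deb/# deb/' /etc/apt/sources.list.d/pve-enterprise.list
apt update
```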

View File

@@ -0,0 +1,152 @@
## Initial Installation / Configuration
Proxmox Virtual Environment is an open source server virtualization management solution based on QEMU/KVM and LXC. You can manage virtual machines, containers, highly available clusters, storage and networks with an integrated, easy-to-use web interface or via CLI.
!!! note
This document assumes you have a storage server that hosts both ISO files via CIFS/SMB share, and has the ability to set up an iSCSI LUN (VM & Container storage). This document assumes that you are using a TrueNAS Core server to host both of these services.
### Create the first Node
You will need to download the [Proxmox VE 8.1 ISO Installer](https://www.proxmox.com/en/downloads) from the Official Proxmox Website. Once it is downloaded, you can use [Balena Etcher](https://etcher.balena.io/#download-etcher) or [Rufus](https://rufus.ie/en/) to deploy Proxmox onto a server.
!!! warning
If you are virtualizing Proxmox under a Hyper-V environment, you will need to follow the [Official Documentation](https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/user-guide/enable-nested-virtualization) to ensure that nested virtualization is enabled. An example is listed below:
```
Set-VMProcessor -VMName <VMName> -ExposeVirtualizationExtensions $true # (1)
Get-VMNetworkAdapter -VMName <VMName> | Set-VMNetworkAdapter -MacAddressSpoofing On # (2)
```
1. This tells Hyper-V to allow the GuestVM to behave as a hypervisor, nested under Hyper-V, allowing the virtualization functionality of the Hypervisor's CPU to be passed-through to the GuestVM.
2. This tells Hyper-V to allow your GuestVM to have multiple nested virtual machines with their own independent MAC addresses. This is useful when using nested Virtual Machines, but is also a requirement when you set up a [Docker Network](../../../networking/docker-networking/docker-networking.md) leveraging MACVLAN technology.
### Networking
You will need to set a static IP address, in this case, it will be an address within the 20GbE network. You will be prompted to enter these during the ProxmoxVE installation. Be sure to set the hostname to something that matches the following FQDN: `proxmox-node-01.MOONGATE.local`.
| Hostname | IP Address | Subnet Mask | Gateway | DNS Server | iSCSI Portal IP |
| --------------- | --------------- | ------------------- | ------- | ---------- | ----------------- |
| proxmox-node-01 | 192.168.101.200 | 255.255.255.0 (/24) | None | 1.1.1.1 | 192.168.101.100 |
| proxmox-node-01 | 192.168.103.200 | 255.255.255.0 (/24) | None | 1.1.1.1 | 192.168.103.100 |
| proxmox-node-02 | 192.168.102.200 | 255.255.255.0 (/24) | None | 1.1.1.1 | 192.168.102.100 |
| proxmox-node-02 | 192.168.104.200 | 255.255.255.0 (/24) | None | 1.1.1.1 | 192.168.104.100 |
### iSCSI Initiator Configuration
You will need to add the iSCSI initiator from the proxmox node to the allowed initiator list in TrueNAS Core under "**Sharing > Block Shares (iSCSI) > Initiators Groups**"
In this instance, we will reference Group ID: `2`. We need to add the initiator to the "**Allowed Initiators (IQN)**" section. This also includes the following networks that are allowed to connect to the iSCSI portal:
- `192.168.101.0/24`
- `192.168.102.0/24`
- `192.168.103.0/24`
- `192.168.104.0/24`
To get the iSCSI Initiator IQN of the current Proxmox node, you need to navigate to the Proxmox server's webUI, typically located at `https://<IP>:8006` then log in with username `root` and whatever you set the password to during initial setup when the ISO image was mounted earlier.
- On the left-hand side, click on the name of the server node (e.g. `proxmox-node-01` or `proxmox-node-02`)
- Click on "**Shell**" to open a CLI to the server
- Run the following command to get the iSCSI Initiator (IQN) name to give to TrueNAS Core for the previously-mentioned steps:
``` sh
cat /etc/iscsi/initiatorname.iscsi | grep "InitiatorName=" | sed 's/InitiatorName=//'
```
!!! example
Output of this command will look something like `iqn.1993-08.org.debian:01:b16b0ff1778`.
## Disable Enterprise Subscription functionality
You will likely not be paying for / using the enterprise subscription, so we are going to disable that functionality and enable unstable builds. The unstable builds are surprisingly stable, and should not cause you any issues.
Add Unstable Update Repository:
```jsx title="/etc/apt/sources.list"
# Add to the end of the file
# Non-Production / Unstable Updates
deb https://download.proxmox.com/debian/pve bookworm pve-no-subscription
```
!!! warning
Please note the reference to `bookworm` in both the sections above and below this notice, this may be different depending on the version of ProxmoxVE you are deploying. Please reference the version indicated by the rest of the entries in the sources.list file to know which one to use in the added line section.
Comment-Out Enterprise Repository:
```jsx title="/etc/apt/sources.list.d/pve-enterprise.list"
# deb https://enterprise.proxmox.com/debian/pve bookworm pve-enterprise
```
Pull / Install Available Updates:
``` sh
apt-get update
apt dist-upgrade
reboot
```
## NIC Teaming
You will need to set up NIC teaming to configure a LACP LAGG. This will add redundancy and a way for devices outside of the 20GbE backplane to interact with the server.
- Ensure that all of the network interfaces appear as something similar to the following:
```jsx title="/etc/network/interfaces"
iface eno1 inet manual
iface eno2 inet manual
# etc
```
- Adjust the network interfaces to add a bond:
```jsx title="/etc/network/interfaces"
auto eno1
iface eno1 inet manual
auto eno2
iface eno2 inet manual
auto bond0
iface bond0 inet manual
bond-slaves eno1 eno2
bond-miimon 100
bond-mode 802.3ad
bond-xmit-hash-policy layer2+3
auto vmbr0
iface vmbr0 inet static
address 192.168.0.11/24
gateway 192.168.0.1
bridge-ports bond0
bridge-stp off
bridge-fd 0
# bridge-vlan-aware yes # I do not use VLANs
# bridge-vids 2-4094 # I do not use VLANs (This could be set to any VLANs you want it a member of)
```
!!! warning
Be sure to include both interfaces for the (Dual-Port) 10GbE connections in the network configuration. Final example document will be updated at a later point in time once the production server is operational.
- Reboot the server again to make the networking changes take effect fully. Use iLO / iDRAC / IPMI if you have that functionality on your server in case your configuration goes errant and needs manual intervention / troubleshooting to re-gain SSH control of the proxmox server.
## Generalizing VMs for Cloning / Templating:
These are the commands I run after cloning a Linux machine so that it resets all information for the machine it was cloned from.
!!! note
If you use cloud-init-aware OS images as described under Cloud-Init Support on https://pve.proxmox.com/pve-docs/chapter-qm.html, these steps won't be necessary!
```jsx title="Change Hostname"
sudo nano /etc/hostname
```
```jsx title="Change Hosts File"
sudo nano /etc/hosts
```
```jsx title="Reset the Machine ID"
rm -f /etc/machine-id /var/lib/dbus/machine-id
dbus-uuidgen --ensure=/etc/machine-id
dbus-uuidgen --ensure
```
```jsx title="Regenerate SSH Keys"
# Debian/Ubuntu: remove the existing host keys and regenerate them
sudo rm -f /etc/ssh/ssh_host_*
sudo dpkg-reconfigure openssh-server
```
```jsx title="Reboot the Server to Apply Changes"
reboot
```
## Configure Alerting
Setting up alerts in Proxmox is important and critical to making sure you are notified if something goes wrong with your servers.
https://technotim.live/posts/proxmox-alerts/

View File

@@ -0,0 +1,116 @@
**Purpose**: There is a way to incorporate ProxmoxVE and TrueNAS more deeply using SSH, simplifying the deployment of virtual disks/volumes passed into GuestVMs in ProxmoxVE. Using ZFS over iSCSI will give you the following non-exhaustive list of benefits:
- Automatically make Zvols in a ZFS Storage Pool
- Automatically bind device-based iSCSI Extents/LUNs to the Zvols
- Allow TrueNAS to handle VM snapshots directly
- Simplify the filesystem overhead of using TrueNAS and iSCSI with ProxmoxVE
- Ability to take snapshots of GuestVMs
- Ability to perform live-migrations of GuestVMs between ProxmoxVE cluster nodes
!!! note "Environment Assumptions"
This document assumes you are running at least 2 ProxmoxVE nodes. For the sake of the example, it will assume they are named `proxmox-node-01` and `proxmox-node-02`. We will also assume you are using TrueNAS Core; TrueNAS SCALE should work in the same way, but there may be minor operational / setup differences between the two deployments of TrueNAS.
Secondly, this guide assumes the ProxmoxVE cluster nodes and TrueNAS server exist on the same network `192.168.101.0/24`.
## ZFS over iSCSI Operational Flow
``` mermaid
sequenceDiagram
participant ProxmoxVE as ProxmoxVE Cluster
participant TrueNAS as TrueNAS Core (inc. iSCSI & ZFS Storage)
ProxmoxVE->>TrueNAS: Cluster VM node connects via SSH to create ZVol for VM
TrueNAS->>TrueNAS: Create ZVol in ZFS storage pool
TrueNAS->>TrueNAS: Bind ZVol to iSCSI LUN
ProxmoxVE->>TrueNAS: Connect to iSCSI & attach ZVol as VM storage
ProxmoxVE->>TrueNAS: (On-Demand) Connect via SSH to create VM snapshot of ZVol
TrueNAS->>TrueNAS: Create Snapshot of ZVol/VM
```
## All ZFS Storage Nodes / TrueNAS Servers
### Configure SSH Key Exchange
You first need to make some changes to the SSHD configuration of the ZFS server(s) storing data for your cluster. This is fairly straight-forward and only needs two lines adjusted. This is based on the [Proxmox ZFS over ISCSI](https://pve.proxmox.com/wiki/Legacy:_ZFS_over_iSCSI) documentation. Be sure to restart the SSH service or reboot the storage server after making the changes below before proceeding onto the next steps.
=== "OpenSSH-based OS"
```jsx title="/etc/ssh/sshd_config"
UseDNS no
GSSAPIAuthentication no
```
=== "Solaris-based OS"
```jsx title="/etc/ssh/sshd_config"
LookupClientHostnames no
VerifyReverseMapping no
GSSAPIAuthentication no
```
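For example, on TrueNAS CORE (FreeBSD-based) the SSH daemon can be restarted from a shell as follows; other platforms will have their own equivalent:
``` sh
# Restart SSH on TrueNAS CORE to apply the sshd_config changes
service sshd restart
```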
## All ProxmoxVE Cluster Nodes
### Configure SSH Key Exchange
The first step is creating SSH trust between the ProxmoxVE cluster nodes and the TrueNAS storage appliance. You will leverage the ProxmoxVE `shell` on every node of the cluster to run the following commands.
**Note**: I will be writing the SSH configuration with the name `192.168.101.100` for simplicity so I know what server the identity belongs to. You could also name it something else like `storage.bunny-lab.io_id_rsa`.
``` sh
mkdir /etc/pve/priv/zfs
ssh-keygen -f /etc/pve/priv/zfs/192.168.101.100_id_rsa # (1)
ssh-copy-id -i /etc/pve/priv/zfs/192.168.101.100_id_rsa.pub root@192.168.101.100 # (2)
ssh -i /etc/pve/priv/zfs/192.168.101.100_id_rsa root@192.168.101.100 # (3)
```
1. Do not set a password. It will break the automatic functionality.
2. Send the SSH key to the TrueNAS server.
3. Connect to the TrueNAS server at least once to finish establishing the connection.
### Install & Configure Storage Provider
Now you need to set up the storage provider for TrueNAS on the ProxmoxVE nodes. Run the commands below within a ProxmoxVE shell, then when finished, log out of the ProxmoxVE WebUI, clear the browser cache for ProxmoxVE, and log back in. This will add a new storage provider called `FreeNAS-API` under the `ZFS over iSCSI` storage type.
``` sh
keyring_location=/usr/share/keyrings/ksatechnologies-truenas-proxmox-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/ksatechnologies/truenas-proxmox/gpg.284C106104A8CE6D.key' | gpg --dearmor >> ${keyring_location}
#################################################################
cat << EOF > /etc/apt/sources.list.d/ksatechnologies-repo.list
# Source: KSATechnologies
# Site: https://cloudsmith.io
# Repository: KSATechnologies / truenas-proxmox
# Description: TrueNAS plugin for Proxmox VE - Production
deb [signed-by=${keyring_location}] https://dl.cloudsmith.io/public/ksatechnologies/truenas-proxmox/deb/debian any-version main
EOF
#################################################################
apt update
apt install freenas-proxmox
apt full-upgrade
systemctl restart pvedaemon
systemctl restart pveproxy
systemctl restart pvestatd
```
## Primary ProxmoxVE Cluster Node
From this point, we are ready to add the shared storage provider to the cluster via the primary node in the cluster. This is not strictly required; it just simplifies the documentation.
Navigate to **"Datacenter (BUNNY-CLUSTER) > Storage > Add > ZFS over iSCSI"**
| **Field** | **Value** | **Additional Notes** |
| :--- | :--- | :--- |
| ID | `bunny-zfs-over-iscsi` | Friendly Name |
| Portal | `192.168.101.100` | IP Address of iSCSI Portal |
| Pool | `PROXMOX-ZFS-STORAGE` | This is the ZFS Storage Pool you will use to store GuestVM Disks |
| ZFS Block Size | `4k` | |
| Target | `iqn.2005-10.org.moon-storage-01.ctl:proxmox-zfs-storage` | The iSCSI Target |
| Target Group | `<Leave Blank>` | |
| Enable | `<Checked>` | |
| iSCSI Provider | `FreeNAS-API` | |
| Thin-Provision | `<Checked>` | |
| Write Cache | `<Checked>` | |
| API use SSL | `<Unchecked>` | Disabled unless you have SSL Enabled on TrueNAS |
| API Username | `root` | This is the account that is allowed to make ZFS zvols / datasets |
| API IPv4 Host | `192.168.101.100` | iSCSI Portal Address |
| API Password | `<Root Password of TrueNAS Box>` | |
| Nodes | `proxmox-node-01,proxmox-node-02` | All ProxmoxVE Cluster Nodes |
!!! success "Storage is Provisioned"
At this point, the storage should propagate throughout the ProxmoxVE cluster, and appear as a location to deploy virtual machines and/or containers. You can now use this storage for snapshots and live-migrations between ProxmoxVE cluster nodes as well.

View File

@@ -0,0 +1,51 @@
**Purpose**: Rancher Harvester is an awesome tool that acts like a self-hosted cloud VDI provider, similar to AWS, Linode, and other online cloud compute platforms. In most scenarios, you will deploy "Rancher" in addition to Harvester to orchestrate the deployment, management, and rolling upgrades of a Kubernetes Cluster. You can also just run standalone Virtual Machines, similar to Hyper-V, RHEV, oVirt, Bhyve, XenServer, XCP-NG, and VMware ESXi.
!!! note "Prerequisites"
    This document assumes your bare-metal host has at least 32GB of Memory, 200GB of Disk Space, and 8 processor cores. See [Recommended System Requirements](https://docs.harvesterhci.io/v1.1/install/requirements)
## First Harvester Node
### Download Installer ISO
You will need to navigate to the Rancher Harvester GitHub to download the [latest ISO release of Harvester](https://releases.rancher.com/harvester/v1.1.2/harvester-v1.1.2-amd64.iso), currently **v1.1.2**. Then image it onto a USB flashdrive using a tool like [Rufus](https://github.com/pbatard/rufus/releases/download/v4.2/rufus-4.2p.exe). Proceed to boot the bare-metal server from the USB drive to begin the Harvester installation process.
### Begin Setup Process
You will be waiting a few minutes while the server boots from the USB drive, but you will eventually land on a page where it asks you to set up various values to use for networking and the cluster itself.
The values seen below are examples and represent how my homelab is configured.
- **Management Interface(s)**: `eno1,eno2,eno3,eno4`
- **Network Bond Mode**: `Active-Backup`
- **IP Address**: `192.168.3.254/24` *<---- **Note:** Be sure to add CIDR Notation*.
- **Gateway**: `192.168.3.1`
- **DNS Server(s)**: `1.1.1.1,1.0.0.1,8.8.8.8,8.8.4.4`
- **Cluster VIP (Virtual IP)**: `192.168.3.251` *<---- **Note**: See "VIRTUAL IP CONFIGURATION" note below.*
- **Cluster Node Token**: `19-USED-when-JOINING-more-NODES-to-EXISTING-cluster-55`
- **NTP Server(s)**: `0.suse.pool.ntp.org`
!!! warning "Virtual IP Configuration"
    The VIP assigned to the first node in the cluster will act as a proxy to the built-in load-balancing system. It is important that you do not create a second node with the same VIP (Could cause instability in existing cluster), or use an existing VIP as the Node IP address of a new Harvester Cluster Node.
!!! tip
    Based on your preference, it would be good to assign the device a static DHCP reservation, or use numbers counting down from **.254** (e.g. `192.168.3.254`, `192.168.3.253`, `192.168.3.252`, etc...)
### Wait for Installation to Complete
The installation process will take quite some time, but when it is finished, the Harvester Node will reboot and take you to a splash screen with the Harvester logo, with indicators as to what the VIP and Management Interface IPs are configured as, and whether or not the associated systems are operational and ready. **Be patient until both statuses say `READY`**. If after 15 minutes the status has still not changed to `READY` for both fields, see the note below.
!!! warning "Issues with the `rancher-harvester-repo` Image"
    During my initial deployment efforts with Harvester v1.1.2, I noticed that the Harvester Node never came online. That was because something bugged-out during installation and the `rancher-harvester-repo` image was not properly installed prior to node initialization. This will effectively soft-lock the node unless you reinstall the node from scratch, as the Docker Hub Registry that Harvester is looking for to finish the deployment does not exist anymore and depends on the local image bundled with the installer ISO.
    If this happens, you unfortunately need to start over and reinstall Harvester and hope that it works the second time around. No other workarounds are currently known at this time on version 1.1.2.
## Additional Harvester Nodes
If you work in a production environment, you will want more than one Harvester node to allow live-migrations, high-availability, and better load-balancing in the Harvester Cluster. The section below will outline the steps necessary to create additional Harvester nodes, join them to the existing Harvester cluster, and validate that they are functioning without issues.
### Installation Process
Not Documented Yet
### Joining Node to Existing Cluster
Not Documented Yet
## Installing Rancher
If you plan on using Harvester for more than just running Virtual Machines (e.g. Containers), you will want to deploy Rancher inside of the Harvester Cluster in order to orchestrate the deployment, management, and rolling upgrades of various forms of Kubernetes Clusters (RKE2 suggested). The steps below will go over the process of deploying a High-Availability Rancher environment to "adopt" Harvester as a VDI/compute platform for deploying the Kubernetes Cluster.
### Provision ControlPlane Node(s) VMs on Harvester
Not Documented Yet
### Adopt Harvester as Cluster Target
Not Documented Yet
### Deploy Production Kubernetes Cluster to Harvester
Not Documented Yet