**Purpose**: Deploying a Rancher RKE2 Cluster-based Ansible AWX Operator server. This can scale to a larger, more enterprise-grade environment if needed.

!!! note "Prerequisites"
    This document assumes you are running **Ubuntu Server 22.04** or later with at least 16GB of memory, 8 CPU cores, and 64GB of storage.

## Deploy Rancher RKE2 Cluster
You will need to deploy a [Rancher RKE2 Cluster](https://docs.bunny-lab.io/Docker%20%26%20Kubernetes/Servers/Kubernetes%20Clusters/Rancher%20RKE2/) on an Ubuntu Server-based virtual machine. After this phase, you can focus on the Ansible AWX-specific deployment. A single ControlPlane node is all you need to set up AWX; additional infrastructure can be added after-the-fact.

!!! tip "Checkpoint/Snapshot Reminder"
    If this is a virtual machine, after deploying the RKE2 cluster and validating it functions, now would be the best time to take a checkpoint / snapshot of the VM before moving forward, in case you need to roll back the server(s) if you accidentally misconfigure something during deployment.

## Server Configuration
The AWX deployment will consist of 3 YAML files that configure the containers for AWX as well as the NGINX ingress networking. You will need all of them in the same folder for the deployment to be successful. For the purpose of this example, we will put all of them into a folder located at `/awx`.
``` sh
# Make the deployment folder
mkdir -p /awx
cd /awx
```

We need to increase the filesystem inotify limits.

Temporarily set the limits now:
``` sh
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512
```

Permanently set the limits for later:
```ini title="/etc/sysctl.conf"
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512
```

Apply the settings:
``` sh
sudo sysctl -p
```

### Create AWX Deployment Configuration Files
You will need to create these files all in the same directory using the content of the examples below. Be sure to replace values such as the `host: awx.bunny-lab.io` entry in the `ingress.yml` file with a hostname you can point a DNS server / record to.

=== "awx.yml"

    ```yaml title="/awx/awx.yml"
    apiVersion: awx.ansible.com/v1beta1
    kind: AWX
    metadata:
      name: awx
    spec:
      service_type: ClusterIP
    ```

=== "ingress.yml"

    ```yaml title="/awx/ingress.yml"
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: ingress
    spec:
      rules:
      - host: awx.bunny-lab.io
        http:
          paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: awx-service
                port:
                  number: 80
    ```

=== "kustomization.yml"

    ```yaml title="/awx/kustomization.yml"
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
      - github.com/ansible/awx-operator/config/default?ref=2.10.0
      - awx.yml
      - ingress.yml
    images:
      - name: quay.io/ansible/awx-operator
        newTag: 2.10.0
    namespace: awx
    ```

## Ensure the Kubernetes Cluster is Ready
Check that the status of the cluster is ready by running the following commands; the output should appear similar to the [Rancher RKE2 Example](https://docs.bunny-lab.io/Containers/Kubernetes/Rancher%20RKE2/Rancher%20RKE2%20Cluster/#install-helm-rancher-certmanager-jetstack-rancher-and-longhorn):
``` sh
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
kubectl get pods --all-namespaces
```

## Ensure the Timezone / Date is Accurate
You want to make sure that the Kubernetes environment and the node itself have accurate time for a number of reasons, not least of which is that if you are using Ansible with Kubernetes authentication and the date/time is inaccurate, things will not work correctly.
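Before changing anything, you can confirm the node's current clock, timezone, and NTP synchronization state; a minimal check:
``` sh
# Show current local/universal time, configured timezone, and whether NTP sync is active
timedatectl status
```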
Then set the timezone to match your environment:
``` sh
sudo timedatectl set-timezone America/Denver
```

## Deploy AWX using Kustomize
Now it is time to tell Kubernetes to read the configuration files using Kustomize (*built into newer versions of Kubernetes*) to deploy AWX into the cluster.

!!! warning "Be Patient"
    The AWX deployment process can take a while. Use the commands in the [Troubleshooting](https://docs.bunny-lab.io/Containers/Kubernetes/Rancher%20RKE2/AWX%20Operator/Ansible%20AWX%20Operator/#troubleshooting) section if you want to track the progress after running the commands below.

    If you get an error that looks like the below, re-run the `kubectl apply -k .` command a second time after waiting about 10 seconds. The second time, the error should be gone.
    ``` sh
    error: resource mapping not found for name: "awx" namespace: "awx" from ".": no matches for kind "AWX" in version "awx.ansible.com/v1beta1" ensure CRDs are installed first
    ```

    To check on the progress of the deployment, you can run the following command: `kubectl get pods -n awx`

    You will know that AWX is ready to be accessed in the next step if the output looks like below:
    ```
    NAME                                               READY   STATUS    RESTARTS        AGE
    awx-operator-controller-manager-7b9ccf9d4d-cnwhc   2/2     Running   2 (3m41s ago)   9m41s
    awx-postgres-13-0                                  1/1     Running   0               6m12s
    awx-task-7b5f8cf98c-rhrpd                          4/4     Running   0               4m46s
    awx-web-6dbd7df9f7-kn8k2                           3/3     Running   0               93s
    ```

``` sh
cd /awx
kubectl apply -k .
```

!!! warning "Be Patient - Wait 20 Minutes"
    The process may take a while to spin up AWX, PostgreSQL, Redis, and the other workloads necessary for AWX to function. Depending on the speed of the server, it may take between 5 and 20 minutes for AWX to be ready to connect to. You can watch the progress via the CLI commands listed above, or directly on Rancher's WebUI at https://rancher.bunny-lab.io.

## Access the AWX WebUI behind Ingress Controller
After you have deployed AWX into the cluster, it will not be immediately accessible to the host's network (such as your personal computer) unless you set up a DNS record pointing to it. In the example above, you would have an `A` or `CNAME` DNS record pointing to the internal IP address of the Rancher RKE2 Cluster host. The RKE2 Cluster will translate `awx.bunny-lab.io` to the AWX web-service container(s) automatically. SSL certificates are not covered in this documentation, but suffice it to say, they can be configured on another reverse proxy such as Traefik or via Cert-Manager / JetStack. The process of setting this up goes outside the scope of this document.

!!! success "Accessing the AWX WebUI"
    If you have gotten this far, you should now be able to access AWX via the WebUI and log in.

    - AWX WebUI: https://awx.bunny-lab.io
    ![Ansible AWX WebUI](awx.png)

    You may see a prompt about "AWX is currently upgrading. This page will refresh when complete". Be patient, let it finish. When it's done, it will take you to a login page.

    AWX generates its own secure password the first time you set up AWX. The username is `admin`. You can run the following command to retrieve the password:
    ```
    kubectl get secret awx-admin-password -n awx -o jsonpath="{.data.password}" | base64 --decode ; echo
    ```

## Change Admin Password
You will want to change the admin password straight away.
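If you prefer the command line, a minimal sketch using `awx-manage` inside the task container (assuming the default `awx-task` deployment and container names created by the operator):
``` sh
# Interactively set a new password for the built-in "admin" account
kubectl exec -it deployment/awx-task -n awx -c awx-task -- awx-manage changepassword admin
```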
To change it from the WebUI, use the following navigation structure:
``` mermaid
graph LR
    A[AWX Dashboard] --> B[Access]
    B --> C[Users]
    C --> D[admin]
    D --> E[Edit]
```

## Upgrading from 2.10.0 to 2.19.1
There is a known issue with upgrading / installing AWX Operator beyond version 2.10.0, because of how the PostgreSQL database upgrades from 13.0 to 15.0 and has changed permissions. The following workflow will help get past that and adjust the permissions in such a way that allows the upgrade to proceed successfully. If this is a clean installation, you can also perform this step if the fresh install of 2.19.1 is not working yet. (It won't work out of the box because of this bug.)

### Create a Temporary Pod to Adjust Permissions
We need to create a pod that will mount the PostgreSQL PVC, make changes to permissions, then destroy the v15.0 pod to have the AWX Operator automatically regenerate it.

```yaml title="/awx/temp-pod.yml"
apiVersion: v1
kind: Pod
metadata:
  name: temp-pod
  namespace: awx
spec:
  containers:
  - name: temp-container
    image: busybox
    command: ['sh', '-c', 'sleep 3600']
    volumeMounts:
    - mountPath: /var/lib/pgsql/data
      name: postgres-data
  volumes:
  - name: postgres-data
    persistentVolumeClaim:
      claimName: postgres-15-awx-postgres-15-0
  restartPolicy: Never
```

``` sh
# Deploy Temporary Pod
kubectl apply -f /awx/temp-pod.yml

# Open a Shell in the Temporary Pod
kubectl exec -it temp-pod -n awx -- sh

# Adjust Permissions of the PostgreSQL 15.0 Database Folder
chown -R 26:root /var/lib/pgsql/data
exit

# Delete the Temporary Pod
kubectl delete pod temp-pod -n awx

# Delete the Crashlooped PostgreSQL 15.0 Pod to Regenerate It
kubectl delete pod awx-postgres-15-0 -n awx

# Track the Migration
kubectl get pods -n awx
kubectl logs -n awx awx-postgres-15-0
```

!!! warning "Be Patient"
    This upgrade may take a few minutes depending on the speed of the node it is running on. Be patient and wait until the output looks similar to this:
    ```
    root@awx:/awx# kubectl get pods -n awx
    NAME                                               READY   STATUS      RESTARTS   AGE
    awx-migration-24.6.1-bh5vb                         0/1     Completed   0          9m55s
    awx-operator-controller-manager-745b55d94b-2dhvx   2/2     Running     0          25m
    awx-postgres-15-0                                  1/1     Running     0          12m
    awx-task-7946b46dd6-7z9jm                          4/4     Running     0          10m
    awx-web-9497647b4-s4gmj                            3/3     Running     0          10m
    ```

    If you see a migration pod, like in the above example, you can feel free to delete it with the following command: `kubectl delete pod awx-migration-24.6.1-bh5vb -n awx`.

## Troubleshooting
You may want to track the deployment process to verify that it is actually doing something. There are a few Kubernetes commands that can assist with this, listed below.

### AWX-Manager Deployment Logs
You may want to track the internal logs of the `awx-manager` container, which is responsible for the majority of the automated deployment of AWX. You can do so by running the command below.
```
kubectl logs -n awx awx-operator-controller-manager-6c58d59d97-qj2n2 -c awx-manager
```

!!! note
    The `-6c58d59d97-qj2n2` noted at the end of the Kubernetes "Pod" mentioned in the command above is randomized. You will need to change it based on the name shown when running the `kubectl get pods -n awx` command.

## Kerberos Implementation
You may find that you need to be able to run playbooks on domain-joined Windows devices using Kerberos. You need to go through some extra steps to set this up after you have successfully deployed AWX Operator into Kubernetes. At a high level, the workflow looks like the diagram below.
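Each of these stages is covered in the subsections that follow:
``` mermaid
graph LR
    A[Configure WinRM on Windows Devices] --> B[Create Kerberos / DNS ConfigMaps]
    B --> C[Create Kerberos Container Group]
    C --> D[Attach Instance Group to Job Templates]
```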
### Configure Windows Devices
You will need to prepare the Windows devices to allow them to be remotely controlled by Ansible playbooks. Run the following PowerShell script on all of the devices that will be managed by the Ansible AWX environment.

- [WinRM Prerequisite Setup Script](https://docs.bunny-lab.io/Docker%20%26%20Kubernetes/Servers/AWX/AWX%20Operator/Enable%20Kerberos%20WinRM/)

### Create an AWX Instance Group
At this point, we need to make an "Instance Group" for the AWX Execution Environments that will use both a Kerberos configuration file and custom DNS records, defined by the ConfigMap files created below. Reference information was found [here](https://github.com/kurokobo/awx-on-k3s/blob/main/tips/use-kerberos.md#create-container-group). This group allows for persistence across playbooks/templates, so that if you establish a Kerberos authentication in one playbook, it will persist through the entire job's workflow.

Create the following files in the `/awx` folder on the AWX Operator server you deployed earlier (when setting up the Kubernetes Cluster and deploying AWX Operator into it) so we can later mount them into the new Execution Environment we will be building.

=== "Custom DNS Records"

    ```yaml title="/awx/custom_dns_records.yml"
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: custom-dns
      namespace: awx
    data:
      custom-hosts: |
        192.168.3.25 LAB-DC-01.bunny-lab.io LAB-DC-01
        192.168.3.26 LAB-DC-02.bunny-lab.io LAB-DC-02
        192.168.3.4 VIRT-NODE-01.bunny-lab.io VIRT-NODE-01
        192.168.3.5 BUNNY-NODE-02.bunny-lab.io BUNNY-NODE-02
    ```

=== "Kerberos Configuration File"

    ```ini title="/awx/krb5.conf"
    [libdefaults]
    default_realm = BUNNY-LAB.IO
    dns_lookup_realm = false
    dns_lookup_kdc = false

    [realms]
    BUNNY-LAB.IO = {
      kdc = 192.168.3.25
      kdc = 192.168.3.26
      admin_server = 192.168.3.25
    }

    [domain_realm]
    192.168.3.25 = BUNNY-LAB.IO
    192.168.3.26 = BUNNY-LAB.IO
    .bunny-lab.io = BUNNY-LAB.IO
    bunny-lab.io = BUNNY-LAB.IO
    ```

Then we apply these ConfigMaps to the AWX namespace with the following commands:
``` sh
cd /awx
kubectl -n awx create configmap awx-kerberos-config --from-file=/awx/krb5.conf
kubectl apply -f custom_dns_records.yml
```

- Open the AWX UI and click on "**Instance Groups**" under the "**Administration**" section, then press "**Add > Add container group**".
- Enter a descriptive name as you like (e.g. `Kerberos`) and click the toggle "**Customize Pod Specification**".
- Put the following YAML string in "**Custom pod spec**" then press the "**Save**" button.

```yaml title="Custom Pod Spec"
apiVersion: v1
kind: Pod
metadata:
  namespace: awx
spec:
  serviceAccountName: default
  automountServiceAccountToken: false
  initContainers:
    - name: init-hosts
      image: busybox
      command:
        - sh
        - '-c'
        - cat /etc/custom-dns/custom-hosts >> /etc/hosts
      volumeMounts:
        - name: custom-dns
          mountPath: /etc/custom-dns
  containers:
    - image: quay.io/ansible/awx-ee:latest
      name: worker
      args:
        - ansible-runner
        - worker
        - '--private-data-dir=/runner'
      resources:
        requests:
          cpu: 250m
          memory: 100Mi
      volumeMounts:
        - name: awx-kerberos-volume
          mountPath: /etc/krb5.conf
          subPath: krb5.conf
  volumes:
    - name: awx-kerberos-volume
      configMap:
        name: awx-kerberos-config
    - name: custom-dns
      configMap:
        name: custom-dns
```

### Job Template & Inventory Examples
At this point, you need to adjust your existing Job Template(s) that need to communicate via Kerberos with domain-joined Windows devices to use the "**Kerberos**" Instance Group, while keeping the same Execution Environment you have been using up until this point. You can sanity-check the container group with the commands shown below.
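A minimal sanity check, assuming the ConfigMap names used above; container group EE pods are transient (typically named `automation-job-*`) and only exist while a job runs:
``` sh
# Confirm both ConfigMaps exist in the awx namespace
kubectl get configmap awx-kerberos-config custom-dns -n awx

# Watch transient EE pods appear/disappear while a job runs in the container group
kubectl get pods -n awx -w
```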
Assigning the "**Kerberos**" Instance Group changes the Execution Environment to include the Kerberos configuration in the EE at playbook runtime. When the playbook has completed running (or if you are chain-loading multiple playbooks in a workflow job template), it will cease to exist. The Kerberos keytab data will be regenerated at the next runtime.

Also add the following variables to the job template you have associated with the playbook below:
``` yaml
---
kerberos_user: nicole.rappe@BUNNY-LAB.IO
kerberos_password:
```

You will want to ensure your inventory file is configured to use Kerberos authentication as well, so the following example is a starting point:
``` ini
virt-node-01 ansible_host=virt-node-01.bunny-lab.io
bunny-node-02 ansible_host=bunny-node-02.bunny-lab.io

[virtualizationHosts]
virt-node-01
bunny-node-02

[virtualizationHosts:vars]
ansible_connection=winrm
ansible_port=5986
ansible_winrm_transport=kerberos
ansible_winrm_scheme=https
ansible_winrm_server_cert_validation=ignore
#kerberos_user=nicole.rappe@BUNNY-LAB.IO #Optional, if you define this in the Job Template, it is not necessary here.
#kerberos_password= #Optional, if you define this in the Job Template, it is not necessary here.
```

!!! failure "Usage of Fully-Qualified Domain Names"
    It is **critical** that you define Kerberos-authenticated devices with fully qualified domain names. This is just something I found out from 4+ hours of troubleshooting. If the device is Linux, or you are using NTLM authentication instead of Kerberos authentication, you can skip this warning. If you do not define the inventory using FQDNs, it will fail to run the commands against the targeted device(s). In this example, the host is defined via FQDN: `virt-node-01 ansible_host=virt-node-01.bunny-lab.io`

### Kerberos Connection Playbook
At this point, you need a playbook that you can run in a Workflow Job Template (to keep things modular and simplified) to establish a connection to an Active Directory Domain Controller via Kerberos. The following playbook is an example pulled from https://git.bunny-lab.io

!!! note "Playbook Redundancies"
    There are several areas where I could optimize this playbook and remove redundancies. I just have not had enough time to iterate through it deeply enough to narrow down exact things I can remove, so for now, it will remain as-is, since it functions as expected with the example below.

```yaml title="Establish_Kerberos_Connection.yml"
---
- name: Generate Kerberos Ticket to Communicate with Domain-Joined Windows Devices
  hosts: localhost
  vars:
    kerberos_password: "{{ lookup('env', 'KERBEROS_PASSWORD') }}" # Alternatively, you can set this as an environment variable
    # BE SURE TO PASS "kerberos_user: nicole.rappe@BUNNY-LAB.IO" and "kerberos_password: " to the template variables when running this playbook in a template.
  tasks:
    - name: Generate the keytab file
      ansible.builtin.shell: |
        ktutil <