# The Ops Stack Practical troubleshooting notes, lab builds, and operator tools for people who keep infrastructure running. The Ops Stack publishes infrastructure troubleshooting guides and hands-on lab walkthroughs for operators working with Windows, Azure, PowerShell, networking, containers, monitoring, backup, remote access, and self-hosted services. ## Primary Sections - [Home](https://theops-stack.com/): Current entry point for the publication. - [Insights](https://theops-stack.com/insights): Troubleshooting guides organized by symptoms, environments, causes, resolution steps, and edge cases. - [Topic Hubs](https://theops-stack.com/topics): Platform and problem-space hubs for internal navigation. - [Labs](https://theops-stack.com/labs): Hands-on infrastructure builds, automation projects, and homelab guides. - [Operations Toolchest](https://theops-stack.com/toolchest): Reusable checks, script starters, templates, and practical tool recommendations. - [RSS Feed](https://theops-stack.com/rss.xml): Recent troubleshooting and lab content. - [Sitemap](https://theops-stack.com/sitemap.xml): Complete crawl map. ## Operations Toolchest - [DNS and DHCP Health Check](https://theops-stack.com/toolchest/dns-dhcp-health-check): A read-only DNS and DHCP triage checklist that captures client-side evidence, compares DNS paths, and narrows the failure domain before anyone flushes caches or changes records. - [PowerShell server connectivity quick check](https://theops-stack.com/toolchest/powershell-server-connectivity-quick-check): A concise read-only connectivity triage script that separates DNS, ICMP reachability, and expected TCP-port failures before escalation. - [Windows server health snapshot](https://theops-stack.com/toolchest/windows-server-health-snapshot): A read-only Windows Server health snapshot that returns one compact row per host for uptime, disk pressure, memory headroom, stopped automatic services, and recent system errors. - [Pending reboot detection across Windows servers](https://theops-stack.com/toolchest/pending-reboot-detection-across-windows-servers): A read-only pending reboot check for Windows servers before patching, application installs, or maintenance-window closure. - [Disk space cleanup candidate report](https://theops-stack.com/toolchest/disk-space-cleanup-candidate-report): A read-only disk-pressure report that captures low-space context and returns targeted cleanup candidates from known folders without deleting, compressing, or moving anything. - [Certificate expiration scanner](https://theops-stack.com/toolchest/certificate-expiration-scanner): A read-only certificate inventory that finds local-machine store certificates nearing expiration and captures certificates presented by known TLS endpoints for review. - [IIS site and binding inventory](https://theops-stack.com/toolchest/iis-site-and-binding-inventory): A read-only IIS inventory that correlates sites, bindings, ports, host headers, app-pool identities, content paths, and certificate thumbprints for migration or renewal work. - [Local administrator group audit across Windows endpoints](https://theops-stack.com/toolchest/local-administrators-group-audit-across-windows-endpoints): A read-only local administrator audit that records privileged group membership across Windows endpoints for review. - [File share permission audit](https://theops-stack.com/toolchest/file-share-permission-audit): A read-only file share audit that records SMB share permissions, NTFS access, and ownership evidence for review. - [Service account usage finder](https://theops-stack.com/toolchest/service-account-usage-finder): A read-only service account discovery pass for Windows services, scheduled tasks, and IIS application pools. - [Scheduled task inventory and failure report](https://theops-stack.com/toolchest/scheduled-task-inventory-and-failure-report): A read-only scheduled task inventory that highlights failed runs, missed runs, disabled tasks, and ownership gaps. - [DHCP scope utilization report](https://theops-stack.com/toolchest/dhcp-scope-utilization-report): A read-only DHCP scope report that surfaces high utilization, exhausted ranges, and cleanup candidates. - [Windows firewall rule audit](https://theops-stack.com/toolchest/windows-firewall-rule-audit): A read-only Windows Firewall audit that records enabled allow rules, ports, profiles, and address scopes. - [DNS resolution and reverse lookup audit](https://theops-stack.com/toolchest/dns-resolution-and-reverse-lookup-audit): A read-only DNS audit that compares forward and reverse lookup results across host lists and expected DNS servers. - [RDP Connectivity Checklist](https://theops-stack.com/toolchest/rdp-connectivity-checklist): A structured check for RDP failures before changing firewall rules, user rights, or server policy. - [Installed Application Inventory for Windows Endpoints](https://theops-stack.com/toolchest/powershell-software-inventory): A read-only PowerShell inventory starter for collecting installed applications from local or remote Windows endpoints. - [Backup Restore Drill Evidence Checklist](https://theops-stack.com/toolchest/backup-verification-checklist): A restore-drill evidence template for proving backups are usable, measuring recovery time, and turning failed assumptions into repair tasks before an outage. - [Windows Update Repair Checks](https://theops-stack.com/toolchest/windows-update-repair-checks): A staged Windows Update troubleshooting path that starts read-only and escalates only when needed. - [RDP failure triage script](https://theops-stack.com/toolchest/rdp-failure-triage-script): A read-only RDP triage script pattern for DNS, TCP 3389, listener state, firewall evidence, sessions, and event logs. - [Robocopy job template and log parser](https://theops-stack.com/toolchest/robocopy-job-template-and-log-parser): A safer Robocopy job template with dry-run review, log capture, exit-code interpretation, and migration evidence. - [Sysinternals first-response kit guide](https://theops-stack.com/toolchest/sysinternals-first-response-kit-guide): A practical Sysinternals first-response map for process, file handle, startup, network, login, and registry symptoms. - [Wireshark packet capture triage guide](https://theops-stack.com/toolchest/wireshark-packet-capture-triage-guide): A packet-capture triage guide for DNS, TLS, DHCP, SMB, RDP, retransmissions, and sensitive-data handling. - [Free network scanner and port inventory guide](https://theops-stack.com/toolchest/free-network-scanner-and-port-inventory-guide): A practical guide to free network scanning options for host discovery, port inventory, and safe scan scoping. - [Windows Update readiness and repair evidence pack](https://theops-stack.com/toolchest/windows-update-readiness-and-repair-evidence-pack): A patch readiness and repair evidence pack for reboot state, servicing health, update logs, and approved repair actions. - [AD stale computer cleanup report](https://theops-stack.com/toolchest/ad-stale-computer-cleanup-report): A read-only Active Directory stale computer report for last logon, OU, operating system, enabled state, and cleanup planning. - [Incident Note Template](https://theops-stack.com/toolchest/incident-note-template): A compact operator note format for capturing symptoms, checks, decisions, and follow-up while the issue is fresh. - [Robocopy Job Template](https://theops-stack.com/toolchest/robocopy-job-template): A safer starting point for repeatable Windows file copy jobs with logging and dry-run review. - [Uptime Kuma Monitoring Starter](https://theops-stack.com/toolchest/uptime-kuma-monitoring-starter): A simple monitoring starter for internal services, homelab systems, and small-office status checks. - [Azure Arc onboarding preflight checklist](https://theops-stack.com/toolchest/azure-arc-onboarding-preflight-checklist): Preflight checklist for onboarding Windows servers to Azure Arc. Confirms supported OS state, outbound connectivity, proxy/TLS behavior, local admin rights, target Azure placement, tagging, pilot scope, and rollback notes before any agent install. - [Azure Arc bulk onboarding CSV and logging starter](https://theops-stack.com/toolchest/azure-arc-bulk-onboarding-csv-logging): Reusable starter for Azure Arc onboarding waves using a host CSV, dry-run expectations, per-host logging, and repeatable result tracking suitable for tickets, change records, and post-wave reporting. - [Azure Update Manager patch wave planning template](https://theops-stack.com/toolchest/azure-update-manager-patch-wave-planning): Operator-ready planning template for Azure Update Manager patch waves covering scope, maintenance windows, reboot tolerance, exclusions, soak periods, rollback contacts, and stop-go criteria before scheduled patching. - [Azure Update Manager compliance workbook starter](https://theops-stack.com/toolchest/azure-update-manager-compliance-workbook-starter): Starter template for an Azure Workbook plus Resource Graph evidence pack that shows patch compliance, pending updates, unsupported coverage, and patch-group drift across Azure and Arc-enabled machines. - [All-DC lastLogon collector and stale-user evidence report](https://theops-stack.com/toolchest/all-lastlogon-collector-stale-user-evidence): Collect non-replicated lastLogon values from every writable domain controller, calculate the newest observed logon per account, and export evidence suitable for stale-user or stale-computer cleanup decisions without relying on replicated lastLogonTimestamp alone. - [Inactive AD user disable review workflow](https://theops-stack.com/toolchest/inactive-user-disable-review-workflow): Two-phase review checklist for identifying inactive AD user accounts, validating inactivity evidence, applying exclusions, capturing approval, and preparing rollback details before any disable action. - [Authenticated Users drive ACL scanner](https://theops-stack.com/toolchest/authenticated-users-drive-acl-scanner): PowerShell scanner that checks fixed local drives on Windows servers for root ACL entries where Authenticated Users have broad access. Produces console and CSV evidence so admins can review exposure before any ACL changes. - [RADIUS and NPS server detection report](https://theops-stack.com/toolchest/radius-nps-server-detection-report): Read-only PowerShell reporting script pattern to identify likely Microsoft NPS or other RADIUS-capable Windows servers using multiple evidence sources: NPS service presence, NPAS role/feature state, IAS/NPS event log activity, UDP 1812/1813 listener evidence, and registry indicators. Designed for migration discovery, audit support, and authentication troubleshooting. - [Robocopy migration cutover checklist and evidence pack](https://theops-stack.com/toolchest/robocopy-migration-cutover-checklist-evidence-pack): Operator checklist and evidence structure for file migration cutovers using Robocopy. Covers pre-copy checks, dry-run evidence, final sync readiness, exclusion review, validation samples, rollback details, and signoff artifacts suitable for tickets and change records. - [PowerShell operations reporting foundation](https://theops-stack.com/toolchest/powershell-operations-reporting-foundation): Reusable reporting contract for PowerShell evidence scripts that need stable run metadata, normalized result rows, row-level and target-level summaries, and repeatable HTML, CSV, JSON, and log artifacts. - [Build the Ops reporting foundation helper](https://theops-stack.com/toolchest/ops-reporting-foundation-helper-contract): Build the real local PowerShell helper used by Ops Stack reporting pages, including the file to create, the functions it must expose, and a usable starter implementation that exports HTML, CSV, JSON, and log artifacts. - [PowerShell HTML operations report starter](https://theops-stack.com/toolchest/html-operations-email-reporting-starter): Concrete PowerShell reporting pattern for turning host-check results into an HTML operations summary with a status rollup, per-host table, failure section, saved local artifacts, and optional email delivery. - [Internal IIS site rollout checklist](https://theops-stack.com/toolchest/internal-iis-site-rollout-checklist): Operator checklist for launching an internal IIS-hosted site with evidence capture for IIS role presence, site folder layout, bindings, app pool identity, DNS readiness, browser validation, and rollback notes. ## Topic Hubs - [Windows](https://theops-stack.com/topics/windows): Windows client, server, update, recovery, and identity troubleshooting. - [Networking](https://theops-stack.com/topics/networking): DNS, DHCP, VPN, switching, proxy, and connectivity diagnostics. - [Azure](https://theops-stack.com/topics/azure): Azure platform, identity, networking, app service, and storage issues. - [Containers](https://theops-stack.com/topics/containers): Docker, Kubernetes, ingress, and container runtime troubleshooting. - [Automation](https://theops-stack.com/topics/automation): Terraform, Ansible, CI/CD, and infrastructure automation reliability topics. - [Linux](https://theops-stack.com/topics/linux): Linux host administration, Ubuntu, Debian, and service failure analysis. - [Storage & Data](https://theops-stack.com/topics/sql): Storage, migrations, file services, SQL, and data validation or maintenance failures. - [PowerShell](https://theops-stack.com/topics/powershell): PowerShell automation, remoting, scripting, and admin workflow guides. ## Labs Pillars - [Network and DNS](https://theops-stack.com/labs/pillars/network-and-dns): Local DNS, Remote Networking, Network Visibility - [Storage and Backup](https://theops-stack.com/labs/pillars/storage-and-backup): Backup Platforms, Recovery Validation - [Virtualization and Containers](https://theops-stack.com/labs/pillars/virtualization-and-containers): Docker Stacks, Raspberry Pi Hosting, Virtualization Platforms - [Monitoring and Observability](https://theops-stack.com/labs/pillars/monitoring-and-observability): Uptime and Status, Logs and Alerting - [Remote Access and Security](https://theops-stack.com/labs/pillars/remote-access-and-security): Admin Access, Identity and Secrets - [PowerShell and Admin Automation](https://theops-stack.com/labs/pillars/powershell-and-admin-automation): Reporting and Audits, Ops Toolkits - [Home Automation and IoT](https://theops-stack.com/labs/pillars/home-automation-and-iot): Home Assistant Core, Sensor and Camera Automation - [Self-Hosted Services and Productivity](https://theops-stack.com/labs/pillars/self-hosted-services-and-productivity): Docs and File Workflows, Service Portals and Media ## Representative Troubleshooting Guides - [Troubleshooting Azure Application Gateway: Fixing DNS Configuration to Resolve Internal Container App Connection Issues](https://theops-stack.com/insights/azure-dns-troubleshooting-application-gateway-fixing) - [Resolving Azure SAS Tokens Returning 403 Authorization Failure](https://theops-stack.com/insights/azure-failure-resolving-sas-tokens-returning) - [Troubleshooting Azure Blob Upload Failures Due to CSP in ASP.NET WebForms](https://theops-stack.com/insights/azure-troubleshooting-blob-upload-failures-due) - [Troubleshooting Azure VM RDP and MSTSC Connection Failures](https://theops-stack.com/insights/azure-troubleshooting-mstsc-connection-failures-post) - [Troubleshooting Azure OpenAI Realtime API Server Errors During Response Processing](https://theops-stack.com/insights/azure-troubleshooting-openai-realtime-api-server) - [Troubleshooting Azure VPN Client 3.4.0.0: Resolving Authentication Expiration with Microsoft Entra](https://theops-stack.com/insights/azure-vpn-entra-troubleshooting-client-3) - [Troubleshooting Azure Function App Publish Failures: 'Value cannot be null. Parameter 'input'' on Windows using PowerShell in Premium or Consumption Plan](https://theops-stack.com/insights/azure-windows-powershell-troubleshooting-function) - [Troubleshooting dnsmasq Service Not Loading DNS Servers from /etc/resolv.conf After Reboot](https://theops-stack.com/insights/dns-troubleshooting-dnsmasq-service-loading-servers) - [Troubleshooting Docker Container Communication Issues: Ping vs HTTP Requests](https://theops-stack.com/insights/docker-troubleshooting-container-communication-issue) - [Troubleshooting Docker Container Exit Code 0 and Dependency Failures](https://theops-stack.com/insights/docker-troubleshooting-container-exit-code-0) - [Troubleshooting RustRover Remote Docker Connections over SSH](https://theops-stack.com/insights/docker-troubleshooting-ssh-connection-issues-jetbrai) - [Troubleshooting AADSTS50020 Error in Azure DevOps OAuth App Access for External Accounts](https://theops-stack.com/insights/error-azure-troubleshooting-aadsts50020-devops-oauth) - [Troubleshooting AADSTS500200 Error When Using Personal Microsoft Account for Azure Resource Manager Access](https://theops-stack.com/insights/error-azure-troubleshooting-aadsts500200-personal-17) - [Troubleshooting CORS Error: Permission Denied for Requests in Chrome on Office Network](https://theops-stack.com/insights/error-denied-network-troubleshooting-cors-permission) - [Troubleshooting 'Error Reading File Content' in Helm Template on Kubernetes](https://theops-stack.com/insights/error-helm-kubernetes-troubleshooting-reading-file) - [Troubleshooting No Endpoints Available Error for DTC Between Domain and Non-Domain SQL Servers](https://theops-stack.com/insights/error-sql-troubleshooting-endpoints-available-dtc) - [Troubleshooting Gmail 550 5.7.25 Error: PTR Record Mismatch for Sending IP](https://theops-stack.com/insights/error-troubleshooting-gmail-550-5-7) - [Troubleshooting MQL5 SocketConnect Error 4014 When Connecting to Local TCP Server](https://theops-stack.com/insights/error-troubleshooting-mql5-socketconnect-4014-connec) - [Troubleshooting Connection Reset by Peer Error in Android VPN App Using SOCKS5 Proxy](https://theops-stack.com/insights/error-vpn-troubleshooting-connection-reset-peer) - [Troubleshooting the NVLDDMKM Not Found Error in Windows 11](https://theops-stack.com/insights/error-windows-comprehensive-troubleshooting-guide) - [Resolving 0x800f0954 Error When Installing .NET Features on Windows Server 2022](https://theops-stack.com/insights/error-windows-resolving-0x800f0954-installing-net) - [Troubleshooting VMware Workstation Pro Error 0xc000007b on Windows](https://theops-stack.com/insights/error-windows-troubleshooting-vmware-workstation-pro) - [Error 0x80070490 When Uninstalling Windows Update](https://theops-stack.com/insights/error-windows-update-0x80070490-uninstalling) - [Troubleshooting VSCode SSH Connection Issues to CentOS: Failed to Parse Remote Port from Server Output](https://theops-stack.com/insights/failed-troubleshooting-vscode-ssh-connection-issues) ## Representative Lab Guides - [Build a Lightweight Internal Git and Script Catalog with Gitea](https://theops-stack.com/labs/build-lightweight-internal-git-script-catalog): A practical setup for a lightweight internal Git repository and script catalog using Gitea, a self-hosted Git service. It includes deployment notes to help you manage your scripts cleanly. - [Create a Local Camera and Sensor Event Pipeline with Frigate, MQTT, and Home Assistant Automations](https://theops-stack.com/labs/create-local-camera-sensor-event-pipeline): Build a local event pipeline that turns camera and sensor events into Home Assistant automations without depending on a cloud service. - [Create a Self-Hosted Document Workflow with Paperless-ngx, OCR, and Backup Validation](https://theops-stack.com/labs/create-self-hosted-document-workflow-paperless): Build a self-hosted Paperless-ngx workflow that turns scanned documents into searchable records and includes a backup check you can repeat. - [Create a WireGuard plus Split DNS Lab for Secure Remote Access to Self-Hosted Services](https://theops-stack.com/labs/dns-create-wireguard-plus-split-lab): By completing this guide, you will establish a secure remote access setup using WireGuard and implement split DNS for your self-hosted services. This will enhance your network security and ensure that internal resources are accessible only through the VPN. - [Build a Homelab CI Pipeline for Docker Services with Rollback Support](https://theops-stack.com/labs/docker-build-homelab-pipeline-services-rollback): A practical setup for a Continuous Integration (CI) pipeline in your homelab that automatically deploys Docker services from GitHub repositories and includes rollback capabilities. - [Build a Small Office Network Monitoring Stack with LibreNMS, Syslog, and Alert Routing](https://theops-stack.com/labs/network-build-small-office-monitoring-stack): A practical setup for a small office network monitoring stack using LibreNMS for network visibility, syslog for log management, and alert routing for notifications. - [Build a Reproducible Devcontainer Environment for PowerShell, Terraform, and Azure CLI](https://theops-stack.com/labs/powershell-azure-build-reproducible-devcontainer-env): Build a reproducible devcontainer for PowerShell, Terraform, and Azure CLI work so the toolchain is easy to rebuild. - [PowerShell Onboarding Toolkit for Workstation Setup](https://theops-stack.com/labs/powershell-onboarding-toolkit-workstation-setup): Create a reusable PowerShell onboarding script for new Windows workstations: install standard apps, apply baseline settings, and leave behind a process the next tech can run. - [Raspberry Pi Home Assistant Utility Node with MQTT and Zigbee2MQTT](https://theops-stack.com/labs/raspberry-home-assistant-utility-node-mqtt): Turn a Raspberry Pi into a Home Assistant utility node for MQTT and Zigbee2MQTT with an update path you can repeat. - [Build a Windows Event Log Collector with PowerShell and Scheduled Tasks](https://theops-stack.com/labs/windows-powershell-build-event-log-collector): A practical build for a lightweight Windows event log collector using PowerShell scripts and scheduled tasks, supporting cleanly incident triage and log management. - [Build a Smart Power Monitoring Lab with Home Assistant](https://theops-stack.com/labs/build-smart-power-monitoring-lab-home): A smart power monitoring lab that turns Home Assistant energy data into dashboards, thresholds, and household alerts. - [PowerShell Health-Check Pack for Active Directory, DNS, DHCP, and Certificate Expiration](https://theops-stack.com/labs/powershell-dns-dhcp-health-check-pack): Create a PowerShell health-check pack for Active Directory, DNS, DHCP, and certificate checks in a small Windows network. - [Build a Secure Remote Admin Toolkit with Tailscale, RDP Hardening, and Access Controls](https://theops-stack.com/labs/rdp-build-secure-remote-admin-toolkit): A practical build for a secure remote administration toolkit using Tailscale for secure networking, along with RDP hardening techniques and access control measures to ensure a safe remote management experience. - [Build a Log Aggregation Starter Lab with Loki, Grafana, and Service-Specific Dashboards](https://theops-stack.com/labs/build-log-aggregation-starter-lab-loki): Set up a starter log stack with Loki and Grafana so service logs land in one place and can be checked during incidents. - [PowerShell Toolkit for Rotating Local Admin Passwords and Auditing Privileged Access Drift](https://theops-stack.com/labs/powershell-toolkit-rotating-local-admin-passwords): Create a PowerShell toolkit for rotating local administrator passwords and checking privileged access drift across Windows machines. - [Create a Lightweight k3s Lab for Self-Hosted Services](https://theops-stack.com/labs/create-lightweight-k3s-lab-self-hosted): By following this guide, you will set up a lightweight k3s cluster capable of running self-hosted services with ingress and persistent volumes. This setup will enable you to efficiently manage applications like recipe and inventory systems in a local lab environment. - [Self-Hosted Dashboard Homepage for Homelab Services](https://theops-stack.com/labs/self-hosted-dashboard-homepage-homelab-services): Create a self-hosted dashboard that provides live status tiles for your homelab services and quick access to maintenance links. This project will enhance your productivity by centralizing service monitoring and management. - [Build a Home Assistant Starter Lab with Dashboards, Backups, and Room-by-Room Entity Organization](https://theops-stack.com/labs/build-home-assistant-starter-lab-dashboards): By following this guide, you will set up a Home Assistant lab that includes customizable dashboards, automated backups, and organized entities for each room in your home. This setup will enhance your home automation experience and provide a solid foundation for future expansions. - [Build a Home Network Visibility Dashboard with ntopng, Syslog, and VLAN Traffic Summaries](https://theops-stack.com/labs/network-build-home-visibility-dashboard-ntopng): Build a home network visibility dashboard with ntopng, syslog, and VLAN summaries for quick traffic checks. - [Proxmox Starter Cluster Guide with Templates, Backups, and Service Placement Rules](https://theops-stack.com/labs/proxmox-starter-cluster-guide-templates-backups): Build a small Proxmox starter cluster with templates, backups, and placement rules you can reuse for later services. - [Small Office DHCP and DNS Audit Toolkit with PowerShell](https://theops-stack.com/labs/dhcp-dns-powershell-small-office-audit): A practical toolkit for a DHCP and DNS audit toolkit using PowerShell. The toolkit will include scripts for checking lease conflicts and exporting the results for further analysis. - [Create a Raspberry Pi DNS Secondary Server with Pi-hole Sync and Failover Testing](https://theops-stack.com/labs/dns-create-raspberry-secondary-server-hole): Build a secondary Pi-hole DNS node on a Raspberry Pi, sync core configuration from the primary resolver, and prove client failover before changing router DHCP options. - [Build a Reusable PowerShell Script for Software Inventory Reporting](https://theops-stack.com/labs/powershell-build-reusable-script-software-inventory): Build a reusable PowerShell software inventory script that exports clean CSV reports from Windows endpoints. - [Build a Self-Hosted Password Vault Lab with Vaultwarden](https://theops-stack.com/labs/build-self-hosted-password-vault-lab): A self-hosted Vaultwarden password vault with backups, recovery notes, and practical safeguards for family use. ## Use Notes Prefer canonical URLs from this file, the XML sitemap, and visible page links when referencing The Ops Stack. Summaries should preserve technical caveats, validation steps, and environment constraints from the source guide.