Micro-S2D

Table of Contents

Preface

I know we all hate nothing more than a long story before the recipe, but this one has some real explaining to do. Lucky you, I have a table of contents and you can skip over my woes.

First, this is not what this project was supposed to be. I chose the Minisforum MS-01 to make a 2-node S2D cluster due to the NVMe slots, dual 20g Thunderbolt, and dual 10g SFP+. I was hoping to use the dual 20g Thunderbolt ports for storage & live migration, and the 10g X710 ports for Management & Compute. I even picked up some QNAP cards so I could do 5x NVMe per node.

Unfortunately, I could not get the Thunderbolt ports to work for SMB Multichannel or Network ATC, so using them for storage was off the table. They did, however, come in handy as a way to RDP from one node to the other while I was messing with network changes, so they're not entirely useless. Next problem: I could not get the X710s to behave. I tried firmware updates, the latest drivers, the oldest drivers, etc. I just could not get them to behave properly, even for compute & management traffic.

Additionally, I started off with a big lot of enterprise 1.92TB M.2 PM983 NVMe drives. These refused to work whenever the OS was booted from an NVMe, regardless of which slot, how many drives, etc. I ended up getting Samsung 9100 Pros that work fine, but they are about half the size and don't have the same Power Loss Protection capabilities.

So I eventually settled on putting a Mellanox ConnectX-4 Lx NIC in each node, and running Storage, Compute, and Management traffic over those. That of course meant no QNAP card, and only 3x SSDs per node. It also meant that I needed some RDMA-capable switches, which was not my initial intention. It did, however, give me a chance to try out Cumulus. Let me emphasize: nearly every part of this is unsupported by Microsoft. You should not use this for your business. But it is quite the fun little homelab cluster.


Hardware Setup

Hardware

First up, the hardware I used:

Part       Model
CPU        i9-12900H (14c/20t)
RAM        2x 48GB DDR5 Crucial
Boot SSD   Random old NVMe
Data SSD   2x Samsung 9100 Pro 1TB
NIC        Mellanox ConnectX-4 Lx

Step 1 is of course to install everything. When it comes to SSD placement, make sure to put the boot drive in the slot furthest to the right. That slot only runs at PCIe 3.0 x2, so we don't want to use it for a real data drive. Also, make sure that the M.2/U.2 switch is set to M.2, or you're going to have a really bad day.


BIOS

Once they’re all put together, onto the BIOS Setup.

  • All of my nodes shipped with BIOS 1.26, which seemed to work fine for me, but if you bought used I’d make sure both nodes are on the same version.
  • If you bought used, go ahead and reset to factory defaults. If you bought new, maybe do it anyways. I’ve seen some weird things with these guys.
  • Advanced menu.
    • Trusted Computing:
      • Enable SHA384 and SM3_256 PCR Banks
    • Onboard Devices Settings:
      • DVMT Pre-Allocated: 48M
      • Aperture Size: 128MB
    • ACPI Settings:
      • Restore (Restory?) on AC Power Loss: Last State
    • HW Monitor & Smart Fan:
      • Set all fans to “Full Mode”. These boxes are still going to run plenty warm, and at full speed they’re still quieter than my other servers.
  • I also set the ME Password, but haven’t yet started to do anything with it.

AD Prep

Lots of this is optional and environment-specific, but this is what I did ahead of time to make things easier. (A small PowerShell sketch of the OU/GPO scaffolding follows this list.)

  • Create OU for HV Clusters, with a sub OU for this particular cluster.
  • Create and link GPO to allow RDP and Remote Powershell.
  • Create and link GPO to configure W32TM (NTP) settings.
  • Create and link GPO to use Delivery Optimization on the local network.
  • Create and link GPO to disable Interactive Logon CTRL+ALT+DEL.
  • Create and link GPO to allow necessary Windows Firewall rules.
    • RDP
    • ICMP (Ping)
    • File and Printer Sharing (SMB)
    • WinRM
    • WMI
    • Delivery Optimization
    • Performance Logs and Alerts
    • Virtual Machine Monitoring
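  • If you would rather script the scaffolding above, here is a minimal sketch using the ActiveDirectory and GroupPolicy modules. The domain, OU names, and GPO name are placeholders for illustration, and the actual policy settings still need to be filled in (the RDP registry value is shown as one example).
    # Create the OU structure (placeholder names and paths)
    New-ADOrganizationalUnit -Name "HV Clusters" -Path "DC=lab,DC=local"
    New-ADOrganizationalUnit -Name "MICRO-S2D" -Path "OU=HV Clusters,DC=lab,DC=local"

    # Create a GPO and link it to the cluster OU
    New-GPO -Name "HV - Remote Management" | New-GPLink -Target "OU=MICRO-S2D,OU=HV Clusters,DC=lab,DC=local"

    # Example setting: allow RDP by clearing fDenyTSConnections
    Set-GPRegistryValue -Name "HV - Remote Management" -Key "HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server" -ValueName "fDenyTSConnections" -Type DWord -Value 0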

Initial OS Setup

I used the absolute latest version of Windows Server vNext, but feel free to use the retail release of Windows Server 2025 if you'd like.

  • Prep Windows Server on a USB drive
  • Boot, and install Windows Server to the boot NVMe.
    • If you did vNext, you can use the public vNext activation key “2KNJJ-33Y9H-2GXGX-KMQWH-G6H67”
  • Set an admin password, and login.
  • Do a Rename-Computer and restart
    Rename-Computer -NewName HV01 -Restart
  • Set a static IP on a NIC. My preference is the first ConnectX-4 that shows up. (A scripted sketch of this and the domain join is at the end of this list.)
  • Join the Domain
    • Move the computer objects to the dedicated OU now, not later.
  • Restart the computer
    Restart-Computer
  • Set the timezone.
    Set-TimeZone -Id "Central Standard Time"
  • Set Power Plan to High Performance
    Powercfg -setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
  • Set minimum processor state to 25%
    Powercfg -setacvalueindex 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c 54533251-82be-4824-96c1-47b60b740d00 893dee8e-2bef-41e0-89c6-b55d0929964c 25
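  • As promised above, a minimal sketch of the static IP and domain join steps. The gateway, DNS server, domain name, and OU path are placeholders; the interface alias is whatever Get-NetAdapter shows for the first ConnectX-4 (the adapters get renamed later in this post).
    # Static IP on the first ConnectX-4 (management addressing from the IP table later in this post)
    New-NetIPAddress -InterfaceAlias 'Ethernet' -IPAddress 10.10.0.41 -PrefixLength 23 -DefaultGateway 10.10.0.1
    Set-DnsClientServerAddress -InterfaceAlias 'Ethernet' -ServerAddresses 10.10.0.10

    # Join the domain and drop the computer object straight into the cluster OU (placeholder names)
    Add-Computer -DomainName lab.local -OUPath "OU=MICRO-S2D,OU=HV Clusters,DC=lab,DC=local" -Credential (Get-Credential) -Restart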

Drivers

Host Network Prep

This part is pretty important, and will help set you up for Failover Clustering health checks, and Network ATC Configuration.

  • Now that you have the chipset drivers, if you connected the nodes with a Thunderbolt link, it will pop up as a network connection. Set static IPs on both ends with no gateway (a quick sketch follows the IP table below).

As an example, I've included this IP table to help. I would highly recommend not touching the storage networks (VLANs 711 and 712), but feel free to modify the others to fit your environment.

Cluster IP: 10.10.0.40/23

NIC           HV01            HV02            VLAN
vManagement   10.10.0.41/23   10.10.0.42/23   1
vSMB1         10.71.1.41/24   10.71.1.42/24   711
vSMB2         10.71.2.41/24   10.71.2.42/24   712
TB1           10.72.1.41/24   10.72.1.42/24   N/A
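
As a quick sketch of the Thunderbolt link config on HV01 (the 10.72.1.x addressing comes from the table above, the 'TB1' alias assumes the adapter rename in the next section, and no gateway or DNS is set on purpose):

    New-NetIPAddress -InterfaceAlias 'TB1' -IPAddress 10.72.1.41 -PrefixLength 24
    # Repeat on HV02 with 10.72.1.42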

Rename Network Adapters

You will need to substitute the proper -NewName depending on the order in which your adapters got named by default. You can see in this screenshot that the adapters on this node got added in a different order.

Get-NetAdapter | Sort Name

Rename-NetAdapter -Name 'Ethernet' -NewName 'CX4-1'
Rename-NetAdapter -Name 'Ethernet 2' -NewName 'CX4-2'
Rename-NetAdapter -Name 'Ethernet 3' -NewName 'X710-1'
Rename-NetAdapter -Name 'Ethernet 4' -NewName 'X710-2'
Rename-NetAdapter -Name 'Ethernet 5' -NewName 'TB1'

Cluster Configuration

  • First, we’re going to check that the data drives are ready to be pooled. At this point, you will have to run this on each node.
    Get-PhysicalDisk
  • Add Roles and Features. You 100% want to do this from PowerShell, not Server Manager.
    Install-WindowsFeature -Name "Hyper-V", "Failover-Clustering", "Data-Center-Bridging", "RSAT-Clustering-PowerShell", "Hyper-V-PowerShell", "FS-FileServer", "NetworkATC" -IncludeAllSubFeature -IncludeManagementTools -Restart
  • FCM Validation:
    • Open Failover Cluster Manager. If you did server core, you will need to do this from a management machine.
    • In the top right, click “Validate Configuration”
    • It will take a few minutes to validate, and provide a report. There may be some errors or warnings, and you should view the report and check them all.
    • Now, if you set a static IP on CX4-1, but not CX4-2, this validation will complain about a DHCP mismatch on a cluster network. This is fine and can be ignored. It’s also not abnormal that one of the two nodes got a defender update that the other one didn’t get. I always ignore that.
    • Assuming everything else in the validation report looks good, click the “Create the cluster now using the validated nodes” box, and click finish.
    • Fill in the name you’d like to provide the cluster, click next a couple of times, and click create.
    • From the cluster overview page, go to “Cluster Core Resources” and right-click the “Server Name” object with your cluster's name. Go to Properties, select the network address and click Edit, enter the IP address you'd like to use for the cluster, then click OK. Optionally also select “Publish PTR records” and click OK again. (If you'd rather skip the GUI entirely, a rough PowerShell equivalent of the validate/create flow is sketched below.)
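  • For reference, a minimal PowerShell sketch of the same validate/create flow, using the node names, cluster name, and cluster IP from this post. -NoStorage keeps the data drives out of the cluster until S2D is enabled later.
    Test-Cluster -Node HV01, HV02
    New-Cluster -Name HV-CLUS1 -Node HV01, HV02 -StaticAddress 10.10.0.40 -NoStorage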

Network ATC

  • This section assumes you are using the same exact network setup as me, but this will hopefully at least give you a good starting point to understand ATC. You only have to do this on one node, and it will apply to the whole cluster.
  • First, create the intent
    Add-NetIntent -ClusterName HV-CLUS1 -Name ConvergedIntent -Management -Compute -Storage -AdapterName CX4-1, CX4-2
  • Next, we need to get the status and make sure it’s succeeded. DO NOT CONTINUE UNTIL THIS IS COMPLETE
    Get-NetIntentStatus
  • Next, we've got some overrides. Network ATC will revert settings you configure manually elsewhere (that's its intentional configuration-drift correction at work), so cluster-wide settings like these need to be applied as overrides.
    $ClusterOverride = New-NetIntentGlobalClusterOverrides
    $ClusterOverride.EnableNetworkNaming = $True
    $ClusterOverride.EnableLiveMigrationNetworkSelection = $True
    $ClusterOverride.EnableVirtualMachineMigrationPerformanceSelection = $True
    $ClusterOverride.VirtualMachineMigrationPerformanceOption = "SMB"
    $ClusterOverride.MaximumVirtualMachineMigrations = "2"
    Set-NetIntent -GlobalClusterOverrides $ClusterOverride
  • And just like before, we want to watch until it’s completed with the following commands.
    Get-NetIntentStatus
    Get-NetIntentStatus -GlobalOverrides
  • If “Get-NetIntentStatus -GlobalOverrides” comes back with the error “WindowsFeatureNotInstalled”, run this command and then configure the settings from WAC instead.
    Remove-NetIntent -GlobalOverrides
  • If you’re looking for some alternate Network ATC setup configurations, Lee has a great blog post here: https://www.hciharrison.com/azure-stack-hci/network-atc/
  • If you're feeling really adventurous and trying to get the X710s to work, Network ATC will fail because they do not support RDMA. This is the same problem you'd have trying to use Network ATC in a VM or with any other non-RDMA NICs. Here are the necessary overrides to make it work. DO NOT DO THIS IF YOU HAVE RDMA-CAPABLE NICS.
    $Override = New-NetIntentAdapterPropertyOverrides
    $Override.JumboPacket = "9000"
    $Override.NetworkDirect = $false
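  • Note that the snippet above only builds the override object; as far as I can tell it still has to be passed in when the intent is created. A hedged sketch (the intent name is just an example):
    Add-NetIntent -Name X710Intent -Management -Compute -AdapterName X710-1, X710-2 -AdapterPropertyOverrides $Override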

Setup Cluster Aware Updating (CAU)

  • First step here is often overlooked, and VERY important. Give the cluster object full control of its OU! (A command-line sketch for this is at the end of this section.)
  • These are the parameters I chose to use for this lab cluster. I expect most of you to somewhat modify these settings.
     $Parameters = @{
         ClusterName = 'HV-CLUS1'
         DaysOfWeek = 'Monday', 'Friday'
         WeeksOfMonth = 1, 2, 3, 4
         MaxFailedNodes = 0
         MaxRetriesPerNode = 3
         RebootTimeoutMinutes = 30
         SuspendClusterNodeTimeoutMinutes = 30
         SuspendRetriesPerNode = 2
         WaitForStorageRepairTimeoutMinutes = 60
         RequireAllNodesOnline = $true
         AttemptSoftReboot = $true
         EnableFirewallRules = $true
         Force = $true
     }
     Add-CauClusterRole @Parameters
  • Then, we’ll check that it’s working.
    Get-CauClusterRole
  • Assuming it says Online, and the settings look right, you should be good. I did however run into an odd issue a few days after deployment. My CAU status went to Offline. Thankfully, super easy to check, and super easy to fix.
    Get-CauClusterRole
    Enable-CauClusterRole
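  • For the “full control of its OU” step at the top of this section, the GUI route is Active Directory Users and Computers > (your cluster OU) > Properties > Security. A hedged command-line sketch, where the OU path is a placeholder and HV-CLUS1$ is the cluster name object's computer account:
    dsacls 'OU=MICRO-S2D,OU=HV Clusters,DC=lab,DC=local' /G 'LAB\HV-CLUS1$:GA'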

Set Cluster Witness

  • Since this is a 2-node cluster, we definitely want a cluster witness. This is super easy, you just need to point it at a SMB share that both nodes can write to.
    Set-ClusterQuorum -Cluster HV-CLUS1 -FileShareWitness \\SERVER-1\HV-CLUS1-SHARE -Credential (Get-Credential)
    Get-ClusterQuorum -Cluster HV-CLUS1
    • You should then check that the cluster witness shows as “online” in FCM.

Setup S2D

  • Finally the fun part!
    Enable-ClusterStorageSpacesDirect -PoolFriendlyName MICRO-S2D -Verbose
  • And check that the storage pool is online and healthy.
    Get-StoragePool
  • Or if you want more details:
    Get-StoragePool -IsPrimordial $false | FL
  • Validate that the ClusterPerformanceHistory volume has been created. This can take up to ~15 minutes.
    Get-Volume

S2D Volume Creation

  • First step here is really to understand how much space is available in your pool.
  • Since I used 2x 1TB SSDs per node, I'm going to “lose” one drive's worth of capacity per node to the recommended reserve. With 2-way mirroring across the two nodes, that leaves just under 1TB usable. I don't want to run at 100% right out of the gate, so I'm going with a 600GB CSV (Cluster Shared Volume) to start.
    $VolumeName = "Workloads1"
    $StoragePool = Get-StoragePool -IsPrimordial $False
    New-Volume -StoragePool $StoragePool -FriendlyName $VolumeName -FileSystem CSVFS_ReFS -Size 600GB -ResiliencySettingName "Mirror" -ProvisioningType "Fixed"
  • And for fun, I wanted a dedicated CSV for the PDC that I’m going to run on this cluster.
    $VolumeName = "DC1"
    $StoragePool = Get-StoragePool -IsPrimordial $False
    New-Volume -StoragePool $StoragePool -FriendlyName $VolumeName -FileSystem CSVFS_ReFS -Size 128GB -ResiliencySettingName "Mirror" -ProvisioningType "Fixed"

Hyper-V Tweaks

  • Create (my) standard folder structure.
    New-Item -Path C:\ClusterStorage\Workloads1\VMs -ItemType Directory
    New-Item -Path C:\ClusterStorage\Workloads1\VHDs -ItemType Directory
    New-Item -Path C:\ClusterStorage\Workloads1\ISOs -ItemType Directory
  • Set default locations for VM Creation.
    Get-ClusterNode | Foreach { Set-VMHost -ComputerName $_.Name -VirtualMachinePath 'C:\ClusterStorage\Workloads1\VMs' }
    Get-ClusterNode | Foreach { Set-VMHost -ComputerName $_.Name -VirtualHardDiskPath 'C:\ClusterStorage\Workloads1\VHDs' }
  • Increase Failover Cluster load balancer aggressiveness (Yes, this is actually the valid PowerShell way to set this…)
    (Get-Cluster).AutoBalancerLevel = 2
  • Set maximum parallel migrations. Don’t ask me why this isn’t covered with all the other things we’ve set.
    (Get-Cluster).MaximumParallelMigrations = 3

Double checks, extra validation, and tidbits

  • I had plenty of weird things going on with WAC Network ATC on this deployment. I blame it on using the latest vNext version of Windows Server, latest preview version of WAC, and some other unnamed new features. This is basically a list of good things to double check.

  • Double check that the Network ATC extension works in WAC. Mine did not, and I had to add a different extension feed to get a newer version of Network ATC than what was in the default feed.

  • Double check Network ATC Global Overrides. I had some funky stuff with my Network ATC Powershell module, so I double checked everything looked good in WAC.

    • WAC > Cluster > Network ATC Cluster Settings


  • Set CPU Scheduler to the newer Core Scheduler. I haven’t found a way to do this in Powershell yet. If you find one, let me know!

    • WAC > Cluster > Settings > General

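    • Not strictly PowerShell, but the classic command-line route (run from an elevated prompt, reboot afterwards) is supposed to be:
      bcdedit /set hypervisorschedulertype core
    • After the reboot, event ID 2 in the Microsoft-Windows-Hyper-V-Hypervisor operational log shows which scheduler is active.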

  • Enable In-Memory read cache. I have enough RAM, so I'm happy to give some up to the in-memory read cache. This is basically what every Linux ZFS system is doing with ARC.

    • WAC > Cluster > Settings > In-memory Cache

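    • If you would rather script it, the knob WAC is turning here is (as far as I can tell) the cluster's CSV in-memory read cache, exposed as a cluster property in MB; a quick sketch setting it to 2GB:
      (Get-Cluster).BlockCacheSize = 2048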

  • Set Storage Spaces Repair Speed

    • WAC > Cluster > Settings > Storage Spaces and Pools


  • Check SMB Multichannel

    Get-SmbMultichannelConnection -SmbInstance SBL

  • Getting rid of removed adapters showing up as partitioned in FCM

    • The docs claim that you can use Add-ClusterExcludedAdapter commands to fix this, but I have never been able to get them to work. Here’s the actual fix:
    • Open Regedit, navigate to “Computer\HKEY_LOCAL_MACHINE\Cluster\NetworkInterfaces" and delete all references to the removed adapter. Having good friendly names will help here.
  • Affinity rules vs Preferred Owners

    • Affinity rules can be configured in WAC or Powershell, and allow you to try and keep two or more VMs together or apart. For most production VMs, this is what you probably want to use. You can also enable “soft anti-affinity” so if you don’t have enough nodes in the cluster to keep them apart, it will keep them running. Perfect use case for this is having two domain controllers on a 3-node cluster.
    • Preferred Owner allows you to specify which node in the cluster you’d like the role (VM) to run on. This can have some interesting use cases. In my case, I want to keep DC1 on HV01, and DC2 on HV02 so I chose to use preferred owners. Preferred owner does however have some… unintuitive logic to it. If you think you want to use this, you should read the doc thoroughly. https://learn.microsoft.com/en-us/troubleshoot/windows-server/high-availability/groups-fail-logic-three-more-cluster-node-members
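    • For reference, a hedged PowerShell sketch of both approaches (the group/VM names are just examples): an anti-affinity rule to keep two DCs apart, and preferred owners to pin each DC to a node.
      # Anti-affinity rule (FailoverClusters module, Windows Server 2022 and later)
      New-ClusterAffinityRule -Name 'DC-AntiAffinity' -RuleType DifferentNode
      Add-ClusterGroupToAffinityRule -Name 'DC-AntiAffinity' -Groups 'DC1', 'DC2'

      # Preferred owners on the clustered VM roles
      Set-ClusterOwnerNode -Group 'DC1' -Owners 'HV01'
      Set-ClusterOwnerNode -Group 'DC2' -Owners 'HV02'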
  • Ubuntu VM optimizations

    sudo apt update
    sudo apt install linux-azure linux-image-azure linux-headers-azure linux-tools-common linux-cloud-tools-common linux-tools-azure linux-cloud-tools-azure
    sudo apt full-upgrade
    sudo apt install openssh-server
    sudo reboot