Hi MrVSAN,
Firstly, thank you for your commitment to the community. I'm having difficulty understanding something about a vSAN ROBO deployment with direct connect. The configuration is as follows:
2x 10Gb NICs in each node
vmk1 as vSAN traffic (default IP stack)
vmk2 as vMotion traffic (vMotion stack)
DvPortGroup vSAN: Uplink5, Uplink6 active/active
DvPortGroup vMotion: Uplink5, Uplink6 active/active
Routing based on physical NIC
With this configuration the vmkernel ports are spread across the uplinks, and up to that point everything is fine. However, when an uplink fails on one node, the vmkernel port on that NIC is moved to the working uplink and vMotion then stops working. With active/standby it is the same: when an uplink fails and the vmkernel port moves to the surviving NIC, there is no vMotion. Can you suggest how I can handle redundancy in that case?
Thank you in advance
Tariq
Hi Tariq
In my configuration the two uplinks are configured as a LAG, and in my testing vMotion continued to work even with a NIC failure. My question is: how are you testing the NIC failure?
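For what it's worth, here is roughly how I exercise a NIC failure and check the result from the ESXi shell. This is a sketch only; the vmnic/vmk names and the IP address below are placeholders, and option names can vary slightly between ESXi releases:

    # list physical NICs and their current link state
    esxcli network nic list

    # administratively down one uplink to simulate a failure
    # (vmnic5 is just a placeholder for whichever NIC backs Uplink5)
    esxcli network nic down -n vmnic5

    # see which vmkernel interfaces are enabled and where they sit now
    esxcli network ip interface list

    # ping the other node's vMotion address over the vMotion netstack
    # (interface, netstack and IP are assumptions, substitute your own)
    esxcli network diag ping --netstack=vmotion --interface=vmk2 --host=192.168.20.2

    # bring the uplink back when finished
    esxcli network nic up -n vmnic5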
I have a few questions:
1. Can you or anyone recommend an all-flash caching device? I am looking for the best price/performance device.
2. Same question for capacity drives.
3. We have a 4 node vSAN Cluster, each node with 1x 960GB SSD (Samsung SM863 – SATA) for caching and 3x 1.92TB SSD (Samsung PM863 – SATA) for capacity.
The cluster runs on a 10Gbit/s network with failover.
Which is the best way to increase performance in this setup? Is it better/faster caching, more capacity drives, or more nodes?
4. Dissimilar-size capacity drives. We have 1.92TB capacity drives, but the manufacturer also makes 3.84TB versions. Could we put one of these in each node and add them to our existing vSAN pool, or should that be a separate pool with separate caching?
5. If I understand correctly, vSAN caching is used as a write buffer only. If so, should we not have a pair of mirrored caching devices?
What happens if a caching device fails?
Hi
The best way to increase performance is to have multiple disk groups per host. It is also best to keep the disks uniform in size: vSAN places object components based on currently used capacity, so if you have a 1.92TB drive and a 3.84TB drive and place 960GB of data on each, the 1.92TB disk will be 50% full but the 3.84TB disk only 25% full. The 3.84TB disk will therefore get more components placed on it and may introduce hot spots.
As for the "mirrored" cache disks, this would be unsupported; vSAN does not support disks that are in a RAID configuration. If a cache device fails, the whole disk group behind it goes offline and vSAN rebuilds the affected components elsewhere in the cluster, which is another reason to have more than one disk group per host.
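If it helps, a quick way to see what each host is contributing today (how many disk groups, and which cache and capacity devices sit in each) is the vSAN storage listing from the ESXi shell, for example:

    # list the devices this host has claimed for vSAN; cache devices show
    # "Is Capacity Tier: false" and each one anchors its own disk group
    esxcli vsan storage list

    # basic cluster membership and state, for context
    esxcli vsan cluster get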
Hello MrVSAN,
I hope this is the correct way to ask questions. My company is new to the vSAN world, and we are a bit cautious about the installation. We are currently deploying a 2-node cluster, which will be used for VDI. We are leaning towards the layer 2 witness setup, using VLANs and vmkernel ports for the separation. Another consultant recommended the layer 3 witness setup. Do you have a preference? Or can you at least give me some pros/cons of the two variations?
many thanks in advance!
Mike
Hi Mike
The witness has to exist on a different subnet to the actual ESXi hosts; this is to prevent vSAN I/O traffic traversing through the witness.
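One related point for a 2-node direct connect design: on vSAN 6.5 and later you would normally tag a separate, routed vmkernel interface for witness traffic, so that only the witness metadata leaves the site while the vSAN data traffic stays on the direct links. Roughly (vmk0 is just a placeholder for whichever routed interface you pick):

    # tag a routed vmkernel interface so it carries witness traffic only
    esxcli vsan network ip add -i vmk0 -T=witness

    # confirm which interfaces are tagged for vsan vs. witness traffic
    esxcli vsan network list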
Hi,
I attended the vSAN class you ran last week in central Glasgow and we talked briefly about performance monitoring. You mentioned that you have a ticket logged with Extreme for some Brocade switches that are dropping VMware packets. We are about to go live on Brocade switches of the same model as yours.
Would you be able to provide a bit more detail on the issue and its status, so I can determine whether we have the same problem? I've asked our network admins to take a look at the port counters on the switches, which is a start. It would help to know the technical conditions under which you see the problem (ESXi and Brocade firmware versions, whether it affects all VLANs, drop volume, load when dropping, etc.).
Thanks
Sorry, I've only just seen this comment; I think we have had an email thread on this anyway 🙂
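For anyone else trying to pin down drops like this, it is worth looking at the ESXi side as well as the switch port counters; something along these lines is a reasonable starting point (the vmnic names are placeholders):

    # per-NIC receive/transmit statistics, including dropped packets and errors
    esxcli network nic stats get -n vmnic5
    esxcli network nic stats get -n vmnic6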
Good job, I really like your blog!
Thanks
Hey MrVSAN,
I am fairly new to vSAN and working through how best to architect the storage layer for Elasticsearch (a stateful document storage database) deployed on vSphere using vSAN.
My first quandary is whether to use FTT=0 or FTT=1 at the vSAN layer. The Elasticsearch layer already keeps primary and replica copies, and we can guarantee at that layer that the primary and replica never reside on the same host. It seems logical that FTT=0 would be the best choice for performance and storage, but some of my co-workers think there is value in FTT=1 at the vSAN layer, although they are having a hard time articulating why. Can you offer some advice here?
Thank you,
Brad
Hey Brad
vSAN supports what is referred to as "shared nothing", where the workload can also be pinned to the host where its storage object is located. This is used where, as you say, the application takes care of the replication, so the data does not also need to be resilient at the disk level.
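If you do go down the FTT=0 route for the Elasticsearch data disks, it is worth being able to confirm what the objects actually ended up with. The per-VM policy itself is assigned through SPBM in vCenter, but from an ESXi shell you can at least sanity-check things; a rough sketch (output and availability vary between vSAN releases):

    # show the datastore default policy; hostFailuresToTolerate is the FTT
    # value applied when no explicit VM storage policy is assigned
    esxcli vsan policy getdefault

    # on newer vSAN releases, list the objects on this host together with the
    # policy each one resolved to
    esxcli vsan debug object list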