How traffic routes between VMs on ESX hosts

Here is an interesting article I read about routing (or to be more precise switching) of traffic between VMs on ESX hosts.

The article talks about three cases.

Different vSwitches, same port group and VLAN

Same vSwitch, different port group and VLAN

Same vSwitch, same port group and VLAN

and correctly concludes that in the first two cases the traffic leaves the ESX server, goes to the physical switch, and comes back, while in the third case, the traffic stays within the ESX server.

However, there is a fourth case:

Same vSwitch, different port group and same VLAN – The way you would set this up in ESX is to create separate portgroups, then set the same VLAN ID (e.g., 300) on both of them. In this case too, the traffic stays within the ESX server. Moreover, our performance tests show that it's just as fast as the same vSwitch, same portgroup, same VLAN case. The advantage of this setup, as explained in VLANs vs Vswitches, is that you can change the VLAN easily without having to change the portgroup.

The reason I say switching instead of routing is that the traffic in most of these cases stays within a single L2 domain, and is never routed. Routing involves leaving an L2 domain, bumping up the stack from Ethernet to IP, getting routed over a different interface and going down into a different L2 domain. The only case in which that happens is when we have different VLANs. Then the traffic must go through a router (perhaps only a switch acting as a router) which forwards the traffic from one VLAN to another.

That brings up an interesting point. The only reason the traffic has to leave the ESX server in the second case (same vSwitch, different port group and VLAN) is to get to a router that can take it off VLAN 1 and put it back on VLAN 2. If you deploy a virtual router or a virtual firewall (such as the TBDVirtualFirewall), you can forward traffic from the first VLAN to the second using two interfaces of the VFW, and avoid leaving the ESX server. This again gives you a big performance boost, at the cost of some CPU cycles for running the VFW.
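The four cases, plus the virtual router variation, can be summarized as a small decision function. This is only a sketch of the rules described in this post, assuming both VMs sit on the same ESX host; the `VNic` class and the `stays_on_host` function are hypothetical names made up for illustration, not anything from ESX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VNic:
    """Hypothetical description of where a VM's virtual NIC is attached."""
    vswitch: str
    portgroup: str
    vlan: int


def stays_on_host(a: VNic, b: VNic, virtual_router: bool = False) -> bool:
    """Return True if traffic between a and b never leaves the ESX server.

    Assumes both VMs run on the same host. virtual_router means a virtual
    router or firewall with one interface on each VLAN is deployed on the
    same vSwitch.
    """
    if a.vswitch != b.vswitch:
        # Case 1: vSwitches are not connected to each other inside the host.
        return False
    if a.vlan == b.vlan:
        # Cases 3 and 4: same vSwitch and same VLAN; the portgroup is irrelevant.
        return True
    # Case 2: different VLANs need a router; only a virtual one keeps it local.
    return virtual_router


if __name__ == "__main__":
    a = VNic("vSwitch0", "pgA", 300)
    print(stays_on_host(a, VNic("vSwitch1", "pgA", 300)))        # case 1
    print(stays_on_host(a, VNic("vSwitch0", "pgB", 301)))        # case 2
    print(stays_on_host(a, VNic("vSwitch0", "pgA", 300)))        # case 3
    print(stays_on_host(a, VNic("vSwitch0", "pgB", 300)))        # case 4
    print(stays_on_host(a, VNic("vSwitch0", "pgB", 301), True))  # case 2 with a VFW
```

Note that the portgroup never appears in the decision: only the vSwitch and the VLAN matter.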

November 13, 2008 at 7:24 pm

VMware Network Performance

I want to follow up on my June 5 post about virtual network performance. We did some further testing and now have some concrete numbers to talk about.

Let's first set the context. We are using a VM as a virtual firewall inside of an ESX server. So the question arises: how much does that slow us down? How fast can you move packets from one VM into the VFW, out the other interface of the VFW, and on to a second VM?

We used netperf to do the testing, and used the range test.

The throughput you can squeeze out depends on the size of the datagram, so we end up with a table that looks something like this. We used stripped-down Linux VMs running netperf (specifically the tcp_range-script) to generate the test packets and measure the throughput.

Message Size (bytes)    Throughput (Mbits/sec)
1                       4.82
4                       18.30
16                      60.91
64                      108.85
256                     118.03
1024                    123.85
4098                    127.46
16384                   130.13
65536                   130.63

That means that if you send tiny 1-byte messages, you can only push a bit less than 5 Mbps (because of the per-packet copy overhead), but if you send big messages (it looks like anything of about 4K and above) you can get close to 130 Mbps.
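One way to see the per-message overhead is to convert the throughput numbers above into messages per second. The sketch below does that arithmetic on the table values, assuming 1 Mbit = 10^6 bits; the function name is made up for illustration.

```python
# Message size in bytes -> measured throughput in Mbits/sec (from the table above).
measurements = {
    1: 4.82,
    4: 18.30,
    16: 60.91,
    64: 108.85,
    256: 118.03,
    1024: 123.85,
    4098: 127.46,
    16384: 130.13,
    65536: 130.63,
}


def messages_per_second(size_bytes: int, mbits_per_sec: float) -> float:
    """Convert throughput into a message rate, assuming 1 Mbit = 10**6 bits."""
    return mbits_per_sec * 1_000_000 / (8 * size_bytes)


if __name__ == "__main__":
    for size, mbps in measurements.items():
        rate = messages_per_second(size, mbps)
        print(f"{size:>6} bytes  {mbps:>7.2f} Mbits/sec  {rate:>12,.0f} messages/sec")
```

The message rate falls by several orders of magnitude between the smallest and largest sizes while the throughput levels off, which is what you would expect if a fixed cost per message dominates at small sizes.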

But is that good or bad? How fast could you go if you went directly from VM to VM? Also, can anything be done to speed this up?

We looked at the VMware paper on performance mentioned in my June 5 entry (http://www.vmware.com/files/pdf/ESX_networking_performance.pdf) and realized that you can get a significant speedup by using the optimized network driver. So we went ahead and did that on the test VMs, and then looked at the comparison.

Throughput (Mbits/sec)

Message Size (bytes)    VM to VM (vmxnet drivers)    With VFW (default drivers)
1                       2.71                         4.82
4                       10.87                        18.30
16                      43.41                        60.91
64                      130.74                       108.85
256                     238.01                       118.03
1024                    330.61                       123.85
4098                    359.67                       127.46
16384                   366.29                       130.13
65536                   365.39                       130.63

That clearly looks bad, but we have not yet optimized the VFW drivers. So let's do that and compare again.

Throughput (Mbits/sec)

Message Size (bytes)    VM to VM (vmxnet drivers)    With VFW (default drivers)    With VFW (vmxnet drivers)
1                       2.71                         4.82                          4.26
4                       10.87                        18.30                         16.13
16                      43.41                        60.91                         66.07
64                      130.74                       108.85                        230.65
256                     238.01                       118.03                        292.02
1024                    330.61                       123.85                        348.46
4098                    359.67                       127.46                        371.69
16384                   366.29                       130.13                        382.5
65536                   365.39                       130.63                        382.16

That looks pretty good. Even with the VFW in between, we can go as fast as the direct VM to VM communication. The numbers actually suggest that we can go faster with the VFW, although we don't have a really good explanation for that. Perhaps it is some kind of pipelining effect.
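To put a number on "faster with the VFW", here is a small sketch that divides the vmxnet VFW column by the direct VM to VM column for each message size. It only re-uses the measurements from the table above; the variable names are made up for illustration.

```python
# Values copied from the table above (throughput in Mbits/sec).
sizes = [1, 4, 16, 64, 256, 1024, 4098, 16384, 65536]
direct_vmxnet = [2.71, 10.87, 43.41, 130.74, 238.01, 330.61, 359.67, 366.29, 365.39]
vfw_vmxnet = [4.26, 16.13, 66.07, 230.65, 292.02, 348.46, 371.69, 382.5, 382.16]

# Ratio above 1.0 means the path through the VFW measured faster than direct.
ratios = {
    size: through_vfw / direct
    for size, direct, through_vfw in zip(sizes, direct_vmxnet, vfw_vmxnet)
}

if __name__ == "__main__":
    for size, ratio in ratios.items():
        print(f"{size:>6} bytes  VFW/direct = {ratio:.2f}")
```

The ratio is above 1 at every message size, and the gap is widest for the small and mid-sized messages; for the largest messages the two paths are within a few percent of each other.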

August 7, 2008 at 4:52 am

Virtual Network Performance

How fast can you drive the network from a VM?

VMware claims that they can go as fast as native hardware…
http://www.vmware.com/files/pdf/ESX_networking_performance.pdf

But a bit of deeper reading into VMware's claims seems to indicate that in order to get the network performance up to native hardware levels, you have to install the vmxnet driver. Also, you have to enable a few things (jumbo frames, TCP Segmentation Offload).
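To get a feel for why jumbo frames matter, here is a back-of-the-envelope sketch of how many frames per second a link has to carry at the standard 1500-byte MTU versus a 9000-byte jumbo MTU. It assumes plain Ethernet framing (no VLAN tag) and IPv4 plus TCP headers without options; these are assumptions for illustration, not numbers from the VMware paper.

```python
# Per-frame Ethernet overhead on the wire, in bytes:
# 14 header + 4 frame check sequence + 8 preamble + 12 inter-frame gap.
ETHERNET_OVERHEAD = 38
# 20 bytes IPv4 header + 20 bytes TCP header, assuming no options.
IP_TCP_HEADERS = 40


def frames_per_second(link_bits_per_sec: float, mtu: int) -> float:
    """Frames per second needed to fill the link with full-sized frames."""
    return link_bits_per_sec / (8 * (mtu + ETHERNET_OVERHEAD))


def payload_efficiency(mtu: int) -> float:
    """Fraction of the wire bandwidth that carries TCP payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        fps = frames_per_second(1e9, mtu)
        eff = payload_efficiency(mtu)
        print(f"MTU {mtu}: {fps:,.0f} frames/sec at 1 Gbit/s, {eff:.1%} payload")
```

The payload efficiency only improves by a few percentage points, but the number of frames to process per second drops by roughly a factor of six, and per-frame processing is where a virtualized network path spends much of its CPU time.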

We are still playing around with the various settings in our lab. While the performance has improved significantly from the starting point, we are still not at a level comparable to native hardware. Stay tuned for more results.

June 5, 2008 at 10:01 pm

Application Virtualization

I promised to do some research into Application Virtualization and how it relates to network virtualization, so here goes.

The idea behind application virtualization is that applications are not installed on users' computers, but rather streamed to them on demand. Some reasons why you might want to do something like this are:

  • The OS image is not modified by installing the application so the images for all users remain the same regardless of what apps they use, making it easier to maintain the images. This reduces the cost of maintaining the desktops of large numbers of users.
  • The environment isolates the OS from the app, and vice versa. This might help an application run in an environment where it otherwise would not run (e.g., an application that requires superuser privileges can run without them) or it might protect the OS or other apps from a poorly written app.

Some of the application virtualization approaches come from:

  • Microsoft. SoftGrid Application Virtualization. Streams apps to a Vista-based desktop environment. Acquired with Softricity in ’06.
  • Citrix. Client-side Application Virtualization. Streams apps to Citrix Presentation Server (renamed to XenApp after they acquired XenSource, but it has nothing to do with Xen other than marketing). Here is an interesting blog entry from a colleague.

A completely different approach with some of the same benefits is Google Apps. Google Apps run in your browser as lightweight JavaScript-based applications. This looks like the most lightweight approach for the user, but of course, you have to (or rather Google has to) rewrite every app.

I think there is a positive interaction between App Virtualization and Desktop Virtualization. App virtualization allows you to deploy exactly the same image to all the VMs. If the Desktop Virtualization system is able to take advantage of that (and as far as I can tell, some of them do), then you can deploy a very large number of virtual desktops with very little additional memory per new desktop. Then you stream the applications to whoever needs them, and those additional bits also end up being shared by the users that use the same apps.

The implications on Network Virtualization are similar. On the one hand, you have to stream the application across the network to where it is running (the final desktop in the case of MS or the server running the virtual desktops in the case of Citrix) so that would increase the network traffic. On the other hand, the Desktop Virtualization in the Citrix system might deduplicate these bits, bringing the traffic down again. However, you have to get from the DV system to the final desktop to actually display the stuff (and carry mouse clicks back), so that’s network traffic again. So in both cases network traffic will definitely go up. The network design would at least have to take that into account for sizing purposes. Also, you have to configure the network to allow traffic from the terminal to the DV system and from the DV system to the app server (if they are not on the same machine).

May 29, 2008 at 4:21 pm

Follow up to VLANs and VSwitches

Seems that my post a couple of weeks ago on VLANs vs. Vswitches was somewhat confusing to Keshav. Let me try to clarify with a quick follow up.

A portgroup is the equivalent (in the ESX virtual network) of the port on a physical switch. The way you connect a VM to a VSwitch in ESX is to connect the interface of the VM to a port group. So you need the port group anyway, regardless of whether you define VLANs in the VSwitch or not.

Once you do that, then you have the option of changing the VLAN setting from the default (variously denoted as * or 0 in the ESX server, and meaning don’t do any VLANs) to a specific VLAN tag. If you do that, it means that portgroup will only pass packets for that specific VLAN, and not pass anything else, in the direction from the switch to the VM. In the other direction, the VM sends packets without any VLAN tags, and the VSwitch puts the VLAN tag on it for the purposes of packet forwarding.

If it is going to send the packet to another VM on another portgroup in the same VSwitch, then it will just take the tag off again, so effectively it only uses the VLAN tags on the various portgroups to decide which set of portgroups to forward packets between.

If it is going to forward that packet through a physical network interface card (pnic) to a physical switch and if the pnic is in trunking mode, it will send the packet with the tags to the physical switch so that the physical switch can look at the vlan tag to decide where to forward the packet. In the other direction, if the vswitch gets a tagged packet from the pnic, it will only forward it to your port group if the tags match.

If forwarding to another vswitch… well, ESX does not support connecting vswitches together, probably because they don’t want to implement, or have not yet implemented, the spanning tree protocol and all of its extensions. So that’s not an option.

So the short answer is, portgroups are there anyway, just to allow packets to go from VMs to VSwitches, and that’s also where you put the VLAN tag in a VSwitch. You have to use portgroups to do pretty much anything (including VLANs) in VSwitches.
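The tagging behavior described above can be sketched as a toy model. This is not ESX code; the `PortGroup` class and both functions are hypothetical, and the model treats every frame as a broadcast (it ignores MAC learning) and treats VLAN 0 as "no VLAN handling".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PortGroup:
    """Hypothetical portgroup: a name, one VLAN slot, and the attached VMs."""
    name: str
    vlan: int  # 0 means "don't do any VLANs" in this sketch
    vms: List[str] = field(default_factory=list)


def forward_from_vm(
    portgroups: List[PortGroup], src_vm: str, trunk_uplink: bool = True
) -> Tuple[List[str], Optional[int]]:
    """Model a frame sent by src_vm.

    Returns the local VMs that receive it (untagged) and the VLAN tag the
    frame carries on the physical uplink, or None if it goes out untagged.
    """
    src_pg = next(pg for pg in portgroups if src_vm in pg.vms)
    tag = src_pg.vlan  # the VM sends untagged; the vSwitch applies the tag
    local = [
        vm
        for pg in portgroups
        if pg.vlan == tag  # only portgroups with a matching VLAN get the frame
        for vm in pg.vms
        if vm != src_vm
    ]
    uplink_tag = tag if (trunk_uplink and tag != 0) else None
    return local, uplink_tag


def deliver_from_uplink(portgroups: List[PortGroup], frame_tag: int) -> List[str]:
    """A tagged frame from the pnic reaches only portgroups with a matching tag."""
    return [vm for pg in portgroups if pg.vlan == frame_tag for vm in pg.vms]


if __name__ == "__main__":
    pgs = [
        PortGroup("pgA", 300, ["vm1"]),
        PortGroup("pgB", 300, ["vm2"]),
        PortGroup("pgC", 400, ["vm3"]),
    ]
    print(forward_from_vm(pgs, "vm1"))
    print(deliver_from_uplink(pgs, 400))
```

In this model vm1 and vm2 can talk even though they are in different portgroups, because both portgroups carry VLAN 300, while vm3 on VLAN 400 never sees their traffic.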

May 22, 2008 at 3:04 pm

IO Virtualization and Network Virtualization

There’s something happening in the storage world which closely parallels Network Virtualization. They call it IO Virtualization, and as far as I understand it, the idea is to unify the wires coming out of the back of the server. You used to have an Ethernet cable to connect you to the Internet, and a Fibre Channel cable to connect to your SAN. Then, with Fibre Channel over Ethernet (FCoE) or iSCSI, it became possible to run the storage protocols over the network card, opening up the possibility of converging the wiring. Given that wiring is one of the constraining parameters of a data center, this is a very attractive possibility. The only problem was speed. Even over 1 Gig Ethernet, a shared network card is just not as fast as a dedicated FC connection.

Now they are coming up with 10 Gig unified Ethernet, which has the potential to become the unified fabric of the data center. They are working on extensions of the basic protocol to support things like bandwidth allocation (802.1Qaz), reliability (802.1ag) and failover (802.1Qay).

The interactions with Network Virtualization start from setting up VLANs to separate the SAN traffic from the network data traffic. You may, of course, create additional separate VLANs to separate network data traffic from different applications. You may also create separate VLANs for different SAN logical disks. And you could assign each of these VLANs different QoS parameters, such as bandwidth.

If you are using iSCSI, you can also extend the virtual network across the WAN using a tunnel. The SAN you are accessing might be at the other end of the country. You probably cannot do anything about the latency, but you can guarantee some throughput, so for throughput bound applications this might be acceptable.

And if any of these things dynamically move around, network virtualization can keep track of the moving targets, and make sure the network configuration follows the changes. So if VMotion moves a VM to another ESX server, network virtualization could make sure that the SAN connection it needs is still there, by reconfiguring the VLAN that carries the SAN traffic so that it reaches the new ESX server.

May 14, 2008 at 2:03 pm

VLANs vs Vswitches

Let’s say you want to separate out some VMs from other VMs in the network inside of an ESX server. How should you do it? Should you:

  • Create a new vswitch and put one group in vswitch A and the other group in vswitch B?
  • Create separate VLANs within the same vswitch and put the different VMs into VLAN A and VLAN B?

Both will give you separation of traffic. And with the caveat of bugs in the security implementation of the hypervisor, both are reasonably secure. So is there a strong reason to go one way or the other?

VLANs are more flexible than Vswitches just because the VLAN setting is easier to change. If you want to move a VM from one Vswitch to another, you either have to shut down the VM, delete the portgroup from the first vswitch, create another portgroup with the same name in the other vswitch, and then start the VM again, or you can shut down the VM, change the portgroup the VM is in, and start it again. Either way, you have to shut down the VM.

To move the VM to another VLAN, all you have to do is change the VLAN tag of the portgroup. No need to shut down the VM.

You get the most flexibility if you put each VM into its own portgroup. Then you can move a single VM to another VLAN. Otherwise, you have to move all the VMs in the portgroup, since you only have one slot for VLAN tag on the portgroup.
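The difference between a shared portgroup and one portgroup per VM can be sketched with a toy model. The dictionary layout and the `set_vlan` and `vlan_of` helpers are hypothetical names for illustration only; they do not correspond to any ESX API.

```python
from typing import Dict, List


def set_vlan(portgroups: Dict[str, dict], name: str, new_vlan: int) -> List[str]:
    """Change the single VLAN slot of a portgroup; return every VM that moved."""
    portgroups[name]["vlan"] = new_vlan
    return list(portgroups[name]["vms"])


def vlan_of(portgroups: Dict[str, dict], vm: str) -> int:
    """Look up the VLAN a VM is currently in via its portgroup."""
    for pg in portgroups.values():
        if vm in pg["vms"]:
            return pg["vlan"]
    raise KeyError(vm)


if __name__ == "__main__":
    # One shared portgroup: retagging it moves both VMs at once.
    shared = {"pg-shared": {"vlan": 100, "vms": ["web1", "web2"]}}
    print(set_vlan(shared, "pg-shared", 200))

    # One portgroup per VM: a single VM can be moved on its own.
    per_vm = {
        "pg-web1": {"vlan": 100, "vms": ["web1"]},
        "pg-web2": {"vlan": 100, "vms": ["web2"]},
    }
    print(set_vlan(per_vm, "pg-web1", 200))
    print(vlan_of(per_vm, "web2"))
```

The price of the per-VM layout is more portgroups to manage, which is the trade-off against the flexibility described above.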

May 7, 2008 at 3:45 pm

