April 8, 2009

Hardening the VMX File: How Your Servers May Already be 0wned by Your Users

If you've been working with any of the VMware products for even a short amount of time, you've probably explored the .vmx file. This single text file stores almost all of a virtual machine's configurable parameters, and has several optional settings that are not available from the administration tools and must be added by hand to the file. In this tutorial, we'll explore a few parameters that can be used to eliminate some serious vulnerabilities with VMware Tools and lock down the communications channel between a guest VM and the ESX server hosting it.

The .vmx file parameters we'll discuss have already been recommended by different sources on the Internet, so this may seem like old news. But there is very little information describing exactly what they do and what issues they are intended to fix. Many of the lists out there have parameters that don't even apply to the VI3 / ESX 3.5 products! So rather than just post another list, we're going to really explore what each parameter changes or fixes. If you have not gotten around to hardening your virtual machine configuration files, keep reading, as this may open your eyes to some pretty serious security holes in your environment.

I like using a combination of PuTTY and vi to make changes to .vmx files. Just copy the list of parameters at the end of this posting, open the .vmx in vi, and right click with vi in insert mode to paste them all in. If you've never used vi, or just find it scary, see the very bottom for a quick tutorial.

You can also add these parameters from the VI Client by right clicking a VM, choosing Edit Settings..., Options tab, Advanced - General, and clicking the Configuration Parameters... button. But that is a lot of work compared to the vi method. Note that whichever method you choose to add the parameters, the VM must be shut down. If you add anything to the .vmx of a running VM, your changes are going to be discarded when the VM shuts down.

Seeing is believing
You really have to see the vulnerabilities in VMware Tools with your own eyes to get a sense of their scope. To demonstrate, we're going to bring up a new virtual machine installed with Windows 2003 Server Standard and VMware Tools. We'll add a new local user account to the server named testuser, and add this user to the Remote Desktop Users group. This account should only be a member of the Users and Remote Desktop Users groups, and should not be in the Administrators group. This is exactly how a typical Citrix user account is configured, so if you've got virtualized Citrix servers in your environment, you're going to want to test this out for yourself.

And if you're thinking this doesn't apply to your environment because you have hidden the VMware Tools icon in your VMs, think again, as any user can simply open up a Run window and launch "C:\Program Files\VMware\VMware Tools\VMControlPanel.cpl".

Dis-connectable
We'll start by looking at the ability to connect and disconnect CD-ROM, floppy, and network devices from the VMware Tools applet. Open a Remote Desktop Connection session to your test server and log in with the testuser account. Once you have a desktop, right click the VMware Tools icon and open it, then click on the Devices tab. Uncheck the NIC device and click Apply, and then watch as your RDP connection dies. That's right, a non-administrator account just disconnected the VM's NIC, taking it completely off the network! If you have virtual Citrix servers, you should already be getting an idea of just how serious a vulnerability this is.

Lucky for us, this is very easy to disable by simply adding these two directives to the .vmx file:

isolation.device.connectable.disable = "true"
isolation.device.edit.disable = "true"

Timecop
This next VMware Tools vulnerability could also ruin your day if one of your Citrix users gets some malicious ideas. Open another Remote Desktop Connection to your test VM, and log in as testuser. Open the VMware Tools applet again and on the Options tab, uncheck Time synchronization between the virtual machine and the host operating system. Click Apply, and then OK to close the window. Now open the VMware Tools applet again and notice that the change you made as a non-administrator was saved!

If you are using the default W32Time service in your Windows guests, and disabling time synchronization between your guest VMs and ESX hosts, you may be thinking this is not a big deal. But if a user logs in and enables guest-host time synchronization, you now have two processes synchronizing time that are totally unaware of each other. And if you are disabling W32Time in your guests, and relying on the VMware Tools synchronization like we are, this could be a really big problem. If a user changes this without your knowledge, the VM could slip out of sync with the domain and cause you much grief.

Hang on, something else must be going on here. That setting is stored in the .vmx file as the tools.syncTime = "TRUE" parameter, and there is no way a regular Windows user account is changing the .vmx on the ESX host!

Let's see for ourselves. Open an SSH session to the ESX server hosting our test VM, elevate to root, and run this command while the VMware Tools option Time synchronization between the virtual machine and the host operating system is checked:

grep tools.syncTime /PathToTheVM/vm.vmx

Our terminal outputs tools.syncTime = "TRUE". Now let's make the change to the VMware Tools time sync option as testuser, apply it and close the VMware Tools applet, and run the same grep command on the ESX host. Our terminal returns tools.syncTime = "FALSE". Yikes, this really is happening!
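
If you'd rather audit every VM on a host than grep them one at a time, a small shell function along these lines can report the setting for each .vmx it finds. This is only a sketch: the function name report_synctime is made up for illustration, and it assumes the parameter, when present, starts its own line in the file.

```shell
#!/bin/sh
# report_synctime DIR: print the tools.syncTime line of every .vmx under DIR.
# The function name is hypothetical; DIR is whatever directory holds your VMs.
report_synctime() {
    find "$1" -name '*.vmx' | while IFS= read -r vmx; do
        # Take the first matching line; the dot is escaped to match literally.
        value=$(grep -i '^tools\.syncTime' "$vmx" | head -n 1)
        # Fall back to a note when the parameter is absent from the file.
        echo "$vmx: ${value:-tools.syncTime not set}"
    done
}
```

Run as root on the ESX host, something like report_synctime /vmfs/volumes prints one line per .vmx file, which makes it easy to spot a VM whose setting has been flipped.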

The documentation on the optional .vmx file parameters is pretty sparse, so it took quite a bit of trial and error to finally figure out how to prevent non-privileged users from changing the time synchronization behavior of virtualized servers. As it turns out, the parameter to disable this option, isolation.tools.setOption.disable, is on many lists as a recommended parameter, but I couldn't find a good explanation of what it actually does. When in place, a user can check and uncheck the time synchronization option, click Apply and OK, and it looks like the change might save. But when you look at the .vmx file, the parameter does not change, and when you open the VMware Tools applet again, the check box is back to the state it was before you attempted the change. As far as I can tell, this directive only affects the ability to change the tools.syncTime option, so we are going to apply it:

isolation.tools.setOption.disable = "true"

Wait a second...
I was putting the finishing touches on this article when it occurred to me that I have changed the time sync option with a batch script in the past, with the intent of scheduling the script to flip the time sync bit off and back on again to force an immediate sync with the host ESX server. Could a non-privileged user make this change from the command line? I went back to my notes and found the command, "C:\Program Files\VMware\VMware Tools\VMwareService.exe" -cmd "vmx.set_option synctime 0 1" (the path needs quotes because it contains spaces), and without the setOption.disable parameter in place, this works for the testuser account from the command line just as it did from the applet!

Ok, that's interesting, set_option matches the setOption.disable parameter we used to close this hole. Now things are becoming a little bit clearer. Let's explore the VMwareService.exe command further, and see if there is a help screen: VMwareService.exe -help

Usage: c:\Program Files\VMware\VMware Tools\VMwareService.exe {-v,-i,-u,-kill,-cmd "<command>"}

Well, that's not especially helpful. What are the commands we can send it? There's a little bit of information in the SDK documentation, but not enough. So I finally went and downloaded the source code for the Open Virtual Machine Tools and searched through it. After a lot of grepping, I focused in on a couple of remote procedure calls that can be made through what the developers refer to as the "backdoor".

This command allows you to make log entries to the virtual machine's vmware.log file on the ESX server:

"C:\Program Files\VMware\VMware Tools\VMwareService.exe" -cmd "log %string %string"

And this one allows you to create custom GuestInfo variables to store information about the virtual machine in the ESX server's memory:

"C:\Program Files\VMware\VMware Tools\VMwareService.exe" -cmd "info-set guestinfo.%variable %string"

And to read a custom GuestInfo variable:

"C:\Program Files\VMware\VMware Tools\VMwareService.exe" -cmd "info-get guestinfo.%variable"

Note that these custom GuestInfo variables are not stored permanently. They will survive a VM reboot, but will disappear with a VM shutdown. This command looked familiar, so I went and searched some more and found it in the VMware Scripting API documentation.

These backdoor capabilities are also mentioned in the VI3 Security Hardening white paper from VMware in the Limit Data Flow from the Virtual Machine to the Datastore section. And yes, amazingly both RPC commands can be initiated by a non-privileged user!

Close the backdoor
To see if I could use these commands maliciously, I wrote up a couple of quick scripts and launched them as testuser. For the ESX log spamming attack, I attempted to write 1 KB log entries in a loop that would repeat one million times. I then watched the log file grow, and it got up to about 1 MB in size, and then stopped growing. A quick tail on the log revealed the last message was <<< Log Throttled >>>. So that's a good sign, looks like VMware has placed some kind of sanity check on how these log files can grow. But I'd really prefer that this backdoor didn't exist at all, so let's try this obvious directive and see if it shuts the functionality off:

isolation.tools.log.disable = "true"

Good! That shut off logging through the backdoor, but not logging in general to the vmware.log file, which we want in place for troubleshooting.

Then I checked whether it was possible to spam the ESX host's memory with fake custom variables. I wrote up a quick VBScript that cycles through a loop, increments an 'i' variable by one each time, creates a custom variable 'test-i', and stores 32 KB of string data in it. I found that I could only store 32 individual variables. At 32 KB each, that adds up to exactly 1 MB (32 x 32 KB = 1,024 KB), which is exactly how much data the VI3 Security Hardening white paper says we're allowed to write.

The VI3 Security Hardening white paper gives us two options for limiting this backdoor RPC channel. We could turn off setting GuestInfo variables from the VM altogether, with this directive:

isolation.tools.setInfo.disable = "true"

But I can't wholeheartedly recommend that, as it also prevents the VM from sending its IP address and DNS name information back to the host, which means they are no longer displayed in the VI Client. There is no guarantee this won't break something like VCB or a third-party application.

We also have the option to lower the amount of memory available for storing GuestInfo variables, with this directive:

tools.setInfo.sizeLimit = "some # in bytes"

But really the 1 MB limit is pretty reasonable in the first place. I would suggest that if you can live without a VM's IP and DNS information in the VI Client, completely disabling GuestInfo variables is probably the safest way to go. But since the possibility for breakage is there, we won't list this parameter in our final recommendations.

Shrinkage factor
If you open up the VMware Tools applet on your test VM again, you'll find the curious Shrink tab. I'm not describing it as curious because of its function, I can see where shrinking a virtual disk could be useful, though I have yet to use this feature in a production environment. The Shrink feature is curious to me because VMware has decided to make the option visible to any user, even non-administrators. When a Shrink operation is running, the system operates at 100% processor usage and becomes almost unusable, making it a nice little DoS attack for someone with malicious intent. However, the Shrink feature will not actually work unless the user has write permission to the root of the drive selected for shrinking, C:\ for instance. In a default Windows 2003 Server installation, the built-in Users group does not have write access to the root of the system partition, or to the roots of new volumes you create with the Disk Management console, so you may not be exposed to the Shrink attack vector.

But if another administrator, or an application, were to add write permissions to the root of a partition for the Users or Everyone groups, the potential to exploit the Shrink feature would become a real threat, especially on a Citrix server! Unless you are using Shrink in your environment, it's best to disable it altogether with these two directives:

isolation.tools.diskWiper.disable = "true"
isolation.tools.diskShrink.disable = "true"

PXE dust off
I came across this next virtual machine vulnerability while troubleshooting a SAN issue. The ESX hosts were rudely disconnected from their iSCSI LUNs for an unknown reason overnight, and when we opened up console connections to the VMs, we were surprised by what we found. All of the VMs were still in a running state, but had rebooted themselves after essentially having their hard drives ripped out while they were powered on. Since the .vmx configuration file for each VM had been loaded into memory on the ESX hosts, ESX didn't care that the iSCSI storage that contained the .vmx file was gone, so the VM went through its list of bootable devices, and landed on the PXE boot option when no others were available.

After seeing this first hand, I realized it could be used as a powerful attack vector. If a malicious person were able to connect a PXE boot server to the network, and then launch a DoS attack against the iSCSI network, they could reboot VMs with their own images loaded with the tools they need to launch further attacks. Fortunately, the PXE boot OPROM on the vlance and vmxnet virtual adapters is easy to disable:

vlance.noOprom = "true"
vmxnet.noOprom = "true"

If you are using the e1000 adapter in any virtual machines, you'll need to disable the PXE boot OPROM using a different method, as there is no .noOprom option for the e1000. The directive below will reduce the memory size that the BIOS makes available to the OPROM to zero, effectively preventing it from loading. Note that this is a per-adapter setting:

ethernet0.opromsize = "0"
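
Since the setting is per adapter, a VM with several e1000 NICs needs one line for each. As a rough sketch, the function below scans a .vmx for adapters declared with ethernetN.virtualDev = "e1000" and prints the matching opromsize directive for each one. The function name is hypothetical, and it assumes the adapter type is written out explicitly in the file, so adapters without a virtualDev line will not be picked up.

```shell
#!/bin/sh
# e1000_oprom_lines FILE: print an opromsize directive for every adapter in
# FILE whose virtualDev is "e1000". The function name is hypothetical.
e1000_oprom_lines() {
    # Capture the ethernetN prefix and reuse it in the generated directive.
    sed -n 's/^[[:space:]]*\(ethernet[0-9][0-9]*\)\.virtualDev[[:space:]]*=[[:space:]]*"e1000".*/\1.opromsize = "0"/p' "$1"
}
```

The function only prints the lines, so you can review them first and then paste or append them to the .vmx of a powered-off VM.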

No paste for you
To prevent the possibility that confidential information, such as passwords, could leak from the clipboard of the VI Client to the clipboard of a VM, or vice versa, the copy and paste capability between the VI Client and VM console should be disabled. This is easy enough to do with the following directives, and should have little impact on your environment unless you are using the VI Client for day to day administration of your servers:

isolation.tools.copy.disable = "true"
isolation.tools.paste.disable = "true"

A third directive is often mentioned with the copy.disable and paste.disable parameters, isolation.tools.setGUIOptions.enable = "false", but I cannot find any documentation on exactly what this parameter does. Another curious thing is that it is sometimes written with GUI in all caps, and sometimes Gui with a capital G. You can also find references to the similar parameter isolation.tools.setGUIOptions.disable, but again there is no documentation on it, and after trying all the various forms, I cannot find any difference with this directive set to "true" or "false". We'll include it in our list of directives to apply anyway, because it is listed in several security and hardening guides, and it doesn't seem to affect anything whether it's on or off:

isolation.tools.setGUIOptions.enable = "false"

Log bucking
The last set of directives we'll recommend deals with the virtual machine log file, vmware.log. We've already seen how this log could potentially be abused through the virtual machine backdoor, but there is another threat from this file. I actually just witnessed this: we had a virtual machine that showed no signs of trouble, but was failing VCB backups every evening. Thankfully we were using VCB Wrangler to manage our virtual machine backups ( <- shameless plug ), a scheduled script that launches vcbMounter.exe and emails the backup results and a list of the files backed up with each VM. While looking through the email for any obvious signs of what was causing the failures, we noticed that the vmware-####.log files had rotated into the 7000 range! Issuing an ls -la command in the VM's directory from an SSH session, we could see that the log files had all rotated within the last hour. Some process was causing a lot of logging, and luckily we had already put these directives in place to rotate through 10 logs of 100000 bytes each:

log.rotateSize = "100000"
log.keepOld = "10"

Had these parameters not been in place, the VM's log files could have completely consumed the available free space on the VMFS LUN, causing a disaster of unknown potential.
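
As a back-of-the-envelope check on what those two directives buy you, the sketch below works out a rough ceiling for a VM's log usage and compares it with what is actually on disk. It rests on two assumptions that are not confirmed here: that log.keepOld counts rotated files kept in addition to the active vmware.log, and that a log rotates close to the log.rotateSize threshold. The function name check_vm_logs is hypothetical.

```shell
#!/bin/sh
# check_vm_logs DIR ROTATE_SIZE KEEP_OLD
# Prints the estimated ceiling and the actual total size of vmware*.log in DIR.
# Returns 0 when the actual size is within the ceiling, 1 otherwise.
check_vm_logs() {
    dir="$1"
    # Assumption: KEEP_OLD rotated logs plus one active log, each near ROTATE_SIZE.
    ceiling=$(( $2 * ($3 + 1) ))
    # Total bytes across all matching logs; tr strips padding some wc builds add.
    actual=$(cat "$dir"/vmware*.log 2>/dev/null | wc -c | tr -d '[:space:]')
    echo "ceiling=$ceiling actual=$actual"
    [ "$actual" -le "$ceiling" ]
}
```

With the values above, check_vm_logs /vmfs/volumes/somevolume/somevm 100000 10 gives a ceiling of 1,100,000 bytes, or a little over 1 MB per VM, under those assumptions.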

Test test test
If you've made these recommended changes, and are testing things out within a VM, you may have noticed that on the Options tab in the VMware Tools applet, there are still two check boxes that a non-privileged user can modify and save, Show VMware Tools in the taskbar, and Notify if update is available. These two parameters are actually stored in the registry in the HKEY_CURRENT_USER hive and are therefore stored separately for each user account. You can watch them change to see for yourself by opening regedit and drilling down to HKEY_CURRENT_USER\Software\VMware, Inc.\VMware Tools\ShowTray. Since these aren't global parameters, and they don't really do anything (a non-administrator wouldn't be able to install a VMware Tools update), we'll just live with them.

Some are recommending that the file security settings be changed on the C:\Program Files\VMware\VMware Tools folder to restrict non-privileged user access to the VMware Tools settings. VMware has even issued a KB article with instructions on how to restrict the folder and completely disable access. The problem with this method is that even after you have completely restricted the folder, if a user is able to get VMControlPanel.cpl or VMwareService.exe from another source and copy it to their desktop, they can still use the VMware Tools control panel or the command line application to make changes.

You should really try this for yourself. Copy VMControlPanel.cpl and VMwareService.exe to the testuser account's desktop folder, and then delete the C:\Program Files\VMware\VMware Tools folder, which will first require stopping all VMware services and killing the processes in Task Manager. Now log in as testuser and successfully make all of the changes discussed in this article! If you reboot the server after deleting the C:\Program Files\VMware\VMware Tools folder, the VMware Tools services won't even start, but you are still able to make the changes!

I recommend you forget about this option and add the parameters to each .vmx file in your environment. This is the only way to fully remediate these vulnerabilities, besides uninstalling VMware Tools completely.

That's it for our recommended list of .vmx file hardening parameters. If you do some further research (and you should), you'll find that there are a lot of additional parameters not listed here, and many of them are included in security focused white papers as required parameters. I recommend you extensively test any parameters not listed here before rolling them out in a production environment. All of the parameters included in this article, except for isolation.tools.setOption.disable, vlance.noOprom, and vmxnet.noOprom, are listed as recommended in the VI3 Security Hardening white paper.

Recommended VMX Directives:

isolation.device.connectable.disable = "true"
isolation.device.edit.disable = "true"
isolation.tools.setOption.disable = "true"
isolation.tools.log.disable = "true"
isolation.tools.diskWiper.disable = "true"
isolation.tools.diskShrink.disable = "true"
isolation.tools.copy.disable = "true"
isolation.tools.paste.disable = "true"
isolation.tools.setGUIOptions.enable = "false"
log.rotateSize = "100000"
log.keepOld = "10"
vlance.noOprom = "true"
vmxnet.noOprom = "true"

# PXE boot on the e1000 vNIC can be disabled with this directive:
ethernet0.opromsize = "0"
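
For more than a handful of VMs, pasting by hand gets tedious. Here is a minimal sketch of a shell function that appends each directive from the list above only if its key is not already in the file, so it should be safe to run twice. The function name harden_vmx is hypothetical, and matching keys case-insensitively is an assumption made to avoid adding near-duplicates. As with manual edits, the VM should be shut down and the .vmx backed up first.

```shell
#!/bin/sh
# harden_vmx FILE: append each hardening directive to FILE unless its key is
# already set there. The function name is hypothetical.
harden_vmx() {
    vmx="$1"

    # Make sure the file ends with a newline so appended lines start cleanly.
    if [ -n "$(tail -c 1 "$vmx")" ]; then
        echo >> "$vmx"
    fi

    while IFS= read -r line; do
        # Everything before the first " =" is the parameter name.
        key="${line%% =*}"
        # Escape the dots so they match literally in the grep pattern.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Assumption: compare keys case-insensitively to avoid near-duplicates.
        if grep -qi "^[[:space:]]*${pattern}[[:space:]]*=" "$vmx"; then
            echo "already set, skipped: $key"
        else
            printf '%s\n' "$line" >> "$vmx"
            echo "added: $line"
        fi
    done <<'EOF'
isolation.device.connectable.disable = "true"
isolation.device.edit.disable = "true"
isolation.tools.setOption.disable = "true"
isolation.tools.log.disable = "true"
isolation.tools.diskWiper.disable = "true"
isolation.tools.diskShrink.disable = "true"
isolation.tools.copy.disable = "true"
isolation.tools.paste.disable = "true"
isolation.tools.setGUIOptions.enable = "false"
log.rotateSize = "100000"
log.keepOld = "10"
vlance.noOprom = "true"
vmxnet.noOprom = "true"
EOF
}
```

Called as harden_vmx /vmfs/volumes/somevolume/somevm/somevm.vmx, it leaves any value you have already set untouched and reports what it added. The e1000 directive is left out on purpose, since it only applies to VMs that use that adapter.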

Live and let vi
Here's a quick vi tutorial for those who haven't discovered its awesome power. Open an SSH session to a test ESX server and type vi at the prompt. If you press i on your keyboard, you'll notice -- INSERT -- appears in the lower left corner. Type in some text, and then press Esc on your keyboard. The -- INSERT -- disappears, and you can't just enter text like before. vi has two modes, command mode and insert mode, and after hitting Esc, you returned to command mode. To edit text like you would in any WYSIWYG text editor, simply press i and make sure you see -- INSERT -- in the lower left corner.

The real power of vi lies in its 'modal' nature, but if you are just starting out, make all of your edits in -- INSERT -- mode, hit Esc to enter command mode, and then one of the following:

:wq (that's a colon, a w for write, and a q for quit)

or

:q! (that's a colon, a q for quit, and an exclamation for don't save anything)

Those are the only two commands you need for right now in command mode. If you like your changes, :wq, if you mess up and get lost, :q!

And before making any changes to a .vmx file, make sure the virtual machine is shut down, and back up the .vmx first, with a command like this (as root):

cp /vmfs/volumes/somevolume/somevm/somevm.vmx /root

That will copy the .vmx file to the root user's home directory, where it will be nice and safe in case something happens.

To edit a .vmx file with vi, just type:

vi /vmfs/volumes/somevolume/somevm/somevm.vmx


6 comments:

  1. Thanks for the article.
    I'm just curious - a lot of the stuff you've referenced here is predicated on the fact that the user has access to the VMware Tools cpl.

    I'd guess there are other more worrying things a user can access if they know what they are doing. Wouldn't the kinds of problems you're referencing here be addressed by a robust server desktop security configuration (i.e. using the example of Citrix, locking down the desktop so that users can't open cpls, can't start a run prompt/command shell etc).

    Just a thought...

    Jon

  2. Jon, that's a great point and I may just have to set up a test lab with a lock down policy and see what I can do.

    I guess the point I was trying to make, that I somehow omitted from the post, was that VMware has forgotten to build in basic application-level security for each feature of VMware Tools. It's especially distressing because this app wields great low-level power, and it runs on multi-user systems.

    I probably picked on the Citrix thing too much, but it was a great demo for how even an unsophisticated user could do some ugly things.

    ~Robert

  3. After adding the recommendations to the vmx file, is a reboot of the guest required?

  4. Hey Andy, if the guest isn't completely shut down when you add the .vmx directives, the changes you make won't stick. So in a sense, it does require a reboot, but it's more of a shut down, then boot back up.

  5. Great post.

    I have created VMware support requests for this because it's a very big security issue. They proposed the same solution: modifying the VMX file. However, this is not a solution, but a workaround IMHO.

    For a start, these VMX settings also apply to administrators, who should legitimately be allowed to perform the operations.

    Secondly, every time VMware adds a "feature" to their tools, everybody will have to add the corresponding disabling lines to each VMX of each server in each of their datacenters. Completely stupid.

    The real issue is the VMware backdoor. The backdoor itself should make an authorization check on the executing user. THAT would solve all issues, once and for all.
