April 29, 2009

Protect the Service Console Network With a Virtual Firewall

Unless you've been living under a rock for a while, you've probably read the recommendations to completely isolate the ESX console network. This sage advice addresses a serious threat in a virtual infrastructure: if a malicious person out in the LAN were able to gain SSH access to an ESX console, they would have an unholy amount of power. With just two commands at a shell prompt, they could snapshot a virtual disk, FTP it off somewhere, and then use an assortment of tools on the completely exposed filesystem from the comfort of their lair. This network isolation best practice could be the single most important security consideration in an environment, and should also be extended to the VMotion and storage networks.

In fact, if security were the primary architecture concern, the ESX service console network and VirtualCenter server would be plugged into their own isolated network switch. The only way to log into VirtualCenter or SSH to an ESX server would be to march down the hall and pull out the monitor and keyboard in the server room. But obviously that isn't feasible for most organizations, so some sort of compromise between security and administrative access must be made.

One option is to install two NICs in the VirtualCenter server: one in the LAN, and one plugged in to an isolated service console network switch. You could further enhance this by placing a physical firewall between the VirtualCenter server and the LAN, only allowing the specific ports needed for Active Directory authentication and VI client access from specific workstations to the VirtualCenter server.

With this design in place, an administrator would be able to get many things done from the comfort of a LAN workstation. But unfortunately there are a few critical elements that require direct network access to the ESX servers, such as virtual machine console access, physical-to-virtual migrations with Converter Enterprise, and VCB backups.

We could require that each workstation that needs VI Client or SSH access to the service console network have a second NIC and be physically patched into the isolated switch. But that's not very scalable, and a burden in all but the smallest of environments. There's a better option, and it provides a higher level of security: a virtual firewall with vNICs in two different port groups that only permits specific traffic between the console network and the LAN.

There are several good virtual firewall options out there, and many of them are ready-to-use virtual appliances. But for this application, a virtual appliance isn't going to cut it. We want to know exactly how the firewall VM was built, what applications were installed, and what services were enabled. Turns out that the Vyatta Community Edition open source networking software is just about perfect for what we're trying to accomplish here.

Vyatta is a full-featured networking operating system that provides routing, firewalling, VPN, IDS, DHCP, dynamic routing protocols, and pretty much all the features you would only expect from one of the major networking vendors. Think of it like a Linux-based version of Cisco IOS that can be installed on x86 hardware. And in the latest version, VC5, just released in March, they've really upped the ante by providing a web-based GUI.

Give it a spin, it's free!
Head over to the Vyatta.org site and you'll find download links that don't require registration. You'll probably notice that Vyatta also offers VC5 in a VMware Appliance format, but we'll be using the live CD image. Check out the video tutorials at the Vyatta.com site as well if you want to get a feel for what we're about to cover.

Once you've got the VC5 ISO file downloaded, set up a new virtual machine, perform a custom install, and give it Vyatta's minimum recommended settings:

  • 512 MB of RAM

  • 2 GB virtual hard disk

  • Other Linux (32-bit) for the guest operating system

  • Two virtual NICs, one in a port group on the same vSwitch as the Service Console, and one in a port group that uplinks to the LAN

Don't power up the VM just yet. First edit the VM's settings, and remove the floppy as we don't need it, and point the CD-ROM drive to the VC5 ISO.

The version I tested wasn't loading the advanced vmxnet virtual NIC, and instead was defaulting to the pcnet32 device, which is a very basic 10/100 NIC with limited features. So I went ahead and forced the VM to use an Intel PRO/1000 vNIC by editing the .vmx file for the VM and changing the vNIC to the e1000 device:

  • Open an SSH session to the ESX server hosting the virtual router

  • Browse to the datastore folder that contains the VM, and edit the .vmx with vi

  • Add a line for each vNIC:
    
    ethernet0.virtualDev = "e1000"
    
    ethernet1.virtualDev = "e1000"
    


  • And delete these lines to force a regeneration of the MAC addresses:
    
    ethernet0.addressType = "vpx"
    
    ethernet0.generatedAddress = "00:50:56:af:18:93"
    
    ethernet1.addressType = "vpx"
    
    ethernet1.generatedAddress = "00:50:56:bd:27:1e"
    
    


  • Now from vi press Esc and then :wq to save the .vmx file
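If you'd rather not make these edits by hand in vi, they could be scripted. The following is a minimal sketch, not part of the procedure above: patch_vmx is a made-up helper name, and it assumes the VM is powered off and that the .vmx uses the ethernet0 and ethernet1 entries shown in the list.

```shell
#!/bin/bash
# Sketch: force e1000 vNICs in a .vmx file and drop the generated MAC lines.
# patch_vmx is a hypothetical helper name; run it only while the VM is powered off.
patch_vmx() {
    vmx="$1"
    # Keep a backup copy next to the original
    cp -p "$vmx" "$vmx.bak" || return 1
    # Drop any existing virtualDev, addressType and generatedAddress lines
    # for ethernet0 and ethernet1, leaving every other setting untouched
    grep -v -E '^ethernet[01]\.(virtualDev|addressType|generatedAddress) *=' \
        "$vmx.bak" > "$vmx"
    # Append the e1000 device type for both vNICs
    echo 'ethernet0.virtualDev = "e1000"' >> "$vmx"
    echo 'ethernet1.virtualDev = "e1000"' >> "$vmx"
}
```

You would call it with the full path to the VM's .vmx file on the datastore. The backup copy is there so the original can be restored if the VM refuses to power on.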

Boot the VM, and when it arrives at a login prompt, log on to the system as root with the password vyatta.

At the prompt enter this command:
install-system


The Vyatta installer will launch. It's an interactive script that prompts you for some basic information and confirmation during the install. Accept all of the defaults until you reach the prompt "This will destroy all data on /dev/xxx", then type Yes to continue.

You'll be prompted to set the passwords for the system accounts. Type Yes and enter strong passwords for the root and vyatta accounts.

When the installer finishes, shut the VM down with this command:

shutdown -h now

Disconnect the CD-ROM, check Connect at power on on the vNIC connected to the service console network, and boot the VM back up.

When the system boots up to the vyatta login: prompt, log on as user vyatta with the password you established during the install.

We first need to give the service console facing interface an IP address. To accomplish this, we'll enter configuration mode and assign an IP address, replacing the IP below with one appropriate for the environment. We're going to set ours to the gateway address we've already set on our ESX servers:


configure
set interfaces ethernet eth0 address 172.20.1.254/24
commit



All changes from the Vyatta console begin with the configure command to enter configuration mode, and end with a commit command to apply the changes.

We should save the changes now, so the configuration we made will survive a reboot of the Vyatta virtual machine. To save the configuration to the hard drive of the VM, just enter:


save



If you've set everything up correctly to this point, you should be able to ping that address from an ESX server.

If you type exit, you'll leave configuration mode and return to 'operational' mode. If you have spent any time with IOS, you should be pretty familiar with this type of modal workflow. From operational mode, the command show configuration displays the Vyatta configuration, much like show run in IOS, which isn't available from IOS configuration mode. Also, just like in IOS, if you hit Tab, you should get command completion in the Vyatta CLI.

I've had enough of configuring the Vyatta VM through the VI Client, so let's enable SSH access and the Vyatta web GUI with these commands:


set service ssh
set service https
commit
save



Now you should be able to use putty to log in as the user vyatta via SSH. Let's try out the web console: from a machine with access to the service console network, open a web browser to the IP address we set up earlier, in our case https://172.20.1.254

Log in with the vyatta user account and password.

If you take a look around the web GUI, it should be pretty obvious that this is not a system you're going to just intuitively figure out. The GUI represents the CLI commands as objects, so without knowing the CLI logic, the GUI really isn't that helpful. To see what I mean, drill down to interfaces > ethernet > eth0, selecting eth0. Check out the address: attribute, and notice that the path we took through the web GUI reflects the command we entered through the CLI: set interfaces ethernet eth0 address 172.20.1.254/24

So let's configure the LAN facing interface through the web GUI. Drill down to interfaces > ethernet > eth1, selecting eth1. In the address: attribute section in the right pane, enter the IP address you want to use, in our case we'll enter 10.1.1.254/24. The other settings are good, so we'll press the Set button.

Just like the CLI, we need to commit the changes before they become active, so press the Commit button from the upper right of the web GUI. If you connect the second vNIC from the VI Client now, you should be able to ping that address from the LAN. If everything is set up correctly, the Vyatta virtual machine should be routing between the service console network and the LAN. If you are not able to ping from both sides, then there's likely a routing or default gateway issue on one or both sides.

Let's configure a few system settings from the web GUI. Select system from the left side, and you'll see we can set the domain-name, gateway-address, host-name, name-server, ntp-server, and time-zone settings.

Start your filtering


In order to follow along with the rest of this tutorial, you'll need a few things in place:

  • An isolated ESX service console network.

  • A good relationship with the network administrator in your environment, or something to blackmail them with, because they'll need to add a static route to the ESX console network on the layer three device routing between the subnets in your LAN. If you're using RIP or OSPF in your network, the Vyatta VM is capable of running one of those dynamic routing protocols, but that's way beyond the scope of this article.



In our test environment, we've got a simplified setup: the ESX hosts are using the service console facing interface IP of the Vyatta VM as their default gateway, and the LAN connected Windows VM we're using to connect to the ESX servers is using the LAN facing interface IP of the Vyatta VM as its default gateway.


ESX Hosts <-> Vyatta VM <-> LAN VM
172.20.1.x : 172.20.1.254 eth0 : 10.1.1.254 eth1 : 10.1.1.33


So at this point, we are routing between the networks and able to ping from both sides of the Vyatta VM. We want to restrict traffic in both directions, protecting the console network from the LAN, and also limiting the access the console network has to the LAN, so we'll need to create two firewall rule sets and apply them to the two interfaces.

This first rule set, lan_in, grants SSH and VI Client access to a range of ESX hosts (172.20.1.1 to 172.20.1.5) from host 10.1.1.33, and this is how we would add it using the CLI:


vyatta@vyatta:~$ configure
vyatta@vyatta# set firewall name lan_in rule 10 action accept
vyatta@vyatta# set firewall name lan_in rule 10 description "Allow SSH and VI client access from 10.1.1.33"
vyatta@vyatta# set firewall name lan_in rule 10 destination address 172.20.1.1-172.20.1.5
vyatta@vyatta# set firewall name lan_in rule 10 destination port ssh,https,902-903
vyatta@vyatta# set firewall name lan_in rule 10 protocol tcp
vyatta@vyatta# set firewall name lan_in rule 10 source address 10.1.1.33
vyatta@vyatta# set firewall name lan_in rule 10 state established enable
vyatta@vyatta# set firewall name lan_in rule 10 state new enable
vyatta@vyatta# set firewall name lan_in rule 10 state related enable
vyatta@vyatta# set firewall name lan_in rule 10 state invalid disable
vyatta@vyatta# set interfaces ethernet eth1 firewall in name lan_in
vyatta@vyatta# commit
vyatta@vyatta# save


To set up the matching rule set on the interface in the console network, we just flip the source and destination, and configure it like this from the CLI:


vyatta@vyatta:~$ configure
vyatta@vyatta# set firewall name console_in rule 10 action accept
vyatta@vyatta# set firewall name console_in rule 10 description "SSH and VI client replies to 10.1.1.33"
vyatta@vyatta# set firewall name console_in rule 10 destination address 10.1.1.33
vyatta@vyatta# set firewall name console_in rule 10 protocol tcp
vyatta@vyatta# set firewall name console_in rule 10 source address 172.20.1.1-172.20.1.5
vyatta@vyatta# set firewall name console_in rule 10 source port ssh,https,902-903
vyatta@vyatta# set firewall name console_in rule 10 state established enable
vyatta@vyatta# set firewall name console_in rule 10 state related enable
vyatta@vyatta# set firewall name console_in rule 10 state new disable
vyatta@vyatta# set firewall name console_in rule 10 state invalid disable
vyatta@vyatta# set interfaces ethernet eth0 firewall in name console_in
vyatta@vyatta# commit
vyatta@vyatta# save


Now if you attempt to ping an ESX host from the LAN side it should fail as each firewall rule set has an implicit deny as the last rule. In order to restore the ability to ping, you would need to add a rule to both the lan_in and console_in rule sets.

We'll add another accept rule for an additional administrator's workstation. This time we'll do it using the web GUI, so open a web browser, log in to the web GUI, drill down to firewall > name > lan_in > rule, and select rule. In the text box on the right, type 20 and then click the Set button. From the screen that opens up, we will set up the second rule just like the first. Drop down to accept in the action: box, enter a description like Allow SSH and VI client access from 10.1.1.62, enter tcp for the protocol:, and click the Set button.

From the left pane now, select destination, click the Create button, and type 172.20.1.1-172.20.1.5, our range of ESX server IP addresses, in the address: box. In the port: box, enter ssh,https,902-903 and click the Set button.

*By the way, if you're curious as to how the service names map to the numbered ports, just type less /etc/services

From the left pane again, select source, click the Create button, in the address: box enter 10.1.1.62 and click the Set button.

And finally, from the left pane, select state, click the Create button, check established:, new:, and related:, and then click the Set button.

We need a corresponding accept rule for the reply traffic back to the LAN workstation, so drill down to firewall > name > console_in > rule, and select rule. In the text box on the right, type 20 and then click the Set button.

Drop down the action: box to accept, enter the description SSH and VI client replies to 10.1.1.62, enter tcp for the protocol:, and click the Set button.

From the left pane, select destination, click the Create button, type 10.1.1.62 in the address: box, and click the Set button.

From the left pane again, select source, click the Create button, and in the address: box, enter 172.20.1.1-172.20.1.5. In the port: box, enter ssh,https,902-903 and click the Set button.

And finally, from the left pane, select state, click the Create button, check established: and related:, then click the Set button.

From the top of the web GUI, click the Commit button, and then the Save button, and you should now have VI Client and SSH access to the isolated ESX service console network from the second workstation.
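If you have more than a couple of workstations to add, clicking through the GUI gets old quickly. As a rough sketch, the shell function below prints the CLI equivalent of the rule pair for one workstation, modeled on the rule 10 commands shown earlier. print_admin_rules is a hypothetical helper name, and the ESX range and port list are simply the values used in this article.

```shell
#!/bin/bash
# Sketch: print the Vyatta CLI commands for one more admin workstation.
# print_admin_rules is a hypothetical helper; the ESX range and ports are the
# values used in this article, so adjust them for your environment.
print_admin_rules() {
    rule="$1"   # rule number, for example 20
    ws="$2"     # workstation IP, for example 10.1.1.62
    esx="172.20.1.1-172.20.1.5"
    ports="ssh,https,902-903"
    cat <<EOF
set firewall name lan_in rule $rule action accept
set firewall name lan_in rule $rule description "Allow SSH and VI client access from $ws"
set firewall name lan_in rule $rule destination address $esx
set firewall name lan_in rule $rule destination port $ports
set firewall name lan_in rule $rule protocol tcp
set firewall name lan_in rule $rule source address $ws
set firewall name lan_in rule $rule state established enable
set firewall name lan_in rule $rule state new enable
set firewall name lan_in rule $rule state related enable
set firewall name console_in rule $rule action accept
set firewall name console_in rule $rule description "SSH and VI client replies to $ws"
set firewall name console_in rule $rule destination address $ws
set firewall name console_in rule $rule protocol tcp
set firewall name console_in rule $rule source address $esx
set firewall name console_in rule $rule source port $ports
set firewall name console_in rule $rule state established enable
set firewall name console_in rule $rule state related enable
EOF
}

print_admin_rules 20 10.1.1.62
```

The output is meant to be pasted into the Vyatta CLI after entering configure, followed by commit and save. Nothing here talks to the Vyatta VM directly, so you can review the commands before applying them.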

The last configuration we should make to the virtual firewall is to lock down the Vyatta web GUI and SSH access from the LAN to just the workstations we defined above. To do that we'll create another rule set, but this time apply it as local to the interface rather than in. This will filter packets sent directly to the interface IP address.

From the left side of the web GUI, drill down to firewall > name, and select name. In the text box on the right, type lan_admin and then click the Set button.

Enter a description for the new rule set, like "Restricts administrative access from the LAN"

On the left, select rule. In the text box on the right, type 10 and then click the Set button.

Drop down the action: box to accept, enter the description Allow SSH and web GUI access from 10.1.1.33, enter tcp for the protocol:, and click the Set button.

From the left pane, select destination, click the Create button, type ssh,https in the port: box, and click the Set button.

From the left pane again, select source, click the Create button, in the address: box enter 10.1.1.33 and click the Set button.

And finally, from the left pane, select state, click the Create button, check established:, new:, and related:, and then click the Set button.

On the left side, drill down to interfaces > ethernet > eth1 > firewall > local, selecting local, and click the Create button. In the text box on the right, type lan_admin and then click the Set button.

Save the changes by clicking the Commit button, and then the Save button. You could now repeat those steps, creating a rule 20 under rule set lan_admin for the second administrator's workstation, 10.1.1.62.

Gleaming the cube
We've only touched the surface of the Vyatta networking software, and if you head over to the Vyatta website, you can download a lot of good reference material after a quick registration. If you take the time to read some of the documentation, you'll get a real sense for how dense and powerful the Vyatta operating system is, especially for extending virtual networking. Stay tuned, as we'll be doing some interesting projects with Vyatta in the future.

April 23, 2009

Find the update version of an ESX 3.5 server

Need to know the release version (Update 3, Update 4, etc.) of the ESX 3.5 servers in an environment, but don't want to consult a chart of build numbers? Just execute this command as root:


esxupdate query | grep -o 'ESX Server 3.5.0 Update.*' | sort | tail -n 1

If the server is a build of ESX prior to Update 1, the command will return nothing. It's nice to have the update version in this familiar format for documentation.


April 19, 2009

The Ultimate Kickstart File - Part 3

In the third and final chapter of our kickstart adventure, we're going to confront a scenario that would be a nightmare without a kickstart process in place: install and configure 100 ESX hosts. Good thing we've got our kickstart file dialed, because there is no way we're going to do that by hand. But we're still looking at some serious work in the UDA web console, because adding 100 templates is a lot of clicks and copy and paste operations.

Thankfully, the UDA configuration files are pretty straightforward and easy to manipulate through the file system, and the underlying operating system is Fedora, so all the tools we need are available. Each individual UDA template is a plain text file with a .cfg extension, and they are stored in the /var/public/www/kickstart folder. The index file for the templates, /var/public/conf/templates.conf, is also a text file and can be directly edited with vi or modified from a shell script.

If you've been following this series, and have your kickstarts configured so that the only item that needs to be changed is the hostname, then the strategy is fairly simple:
  • Duplicate the master template .cfg file 100 times

  • Update the one configuration item in each template, the hostname

  • Add the 100 templates to the /var/public/conf/templates.conf file

Prep the master template
We're going to use sed to replace the hostname parameter in each template, so in the master template, use a marker that stands out from anything in the script. In this example we'll change esx02.vmnet.local from the kickstart script in Part 2 to 00HOSTNAME00.vmnet.local to make it easy to filter. Also, we don't want to overwrite our master template during this process, so make sure you give the master template a name that is not going to be in the range of names for the 100 cloned templates.

Make sure the perl regular expression you use to get the HOSTIP index works for the naming scheme you are going to use. In this example, we'll be naming the hosts pdx-labXXX.vmnet.local, and with the dash in there, the perl one-liner we wrote in Part 1 is not going to match. We need to change it in the master template to read:

HOSTIP=`/usr/bin/perl -e 'use Sys::Hostname; hostname() =~ /^[a-z-A-Z]+([0-9]+)\.?.*/ ; printf "%d", $1'`

The one-liner will now account for the dash by just adding it between the 'a-z' and 'A-Z'. It's a good idea to always test this out before committing to using a master template because if it doesn't get the IP from the hostname, the install is going to fail in a big way. Test out a few hostnames by hard coding them in a perl command on a test ESX server:

perl -e '$_ = "pdx-lab044.vmnet.local"; /^[a-z-A-Z]+([0-9]+)\.?.*/ ; printf "%d", $1'

One hundred templates with two commands
Let's break down the two one-liners we're going to use to create the templates. First, change directories into the UDA folder that stores the kickstart files:

cd /var/public/www/kickstart

Here's how the first one-line bash script will work:
  • First use seq to print the series of numbers 1 to 100 using a three digit format in a for loop

  • Then for each number, cat the master template into a sed filter

  • Use sed to replace the 00HOSTNAME00 tag with the specific hostname and number

  • And finally redirect each instance of the loop into a config file named with the sequence number


for i in `seq -f %03g 1 100`; do cat mastr.cfg | sed "s/00HOSTNAME00/pdx-lab$i/" > ./pd$i.cfg; \
chown apache:apache ./pd$i.cfg; done

There's a big gotcha here: remember that UDA insists you give each template a five character template ID, and we want all of our file names and template IDs to match up. In order for the file names from this command to end up being 5 characters long, we can only use two characters in front of the three digits generated from the seq command.
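
Since the five character limit is easy to trip over, here is a hedged sketch of the same cloning loop with a length check added. clone_templates is a hypothetical helper name, it writes into the current directory, and it leaves out the chown step so it can be tried on a scratch copy of the master template first.

```shell
#!/bin/bash
# Sketch: clone a master template, refusing template IDs that are not
# exactly five characters long. clone_templates is a hypothetical helper.
# Usage: clone_templates MASTER_FILE ID_PREFIX COUNT
clone_templates() {
    master="$1"; prefix="$2"; count="$3"
    for i in $(seq -f %03g 1 "$count"); do
        id="$prefix$i"
        # UDA wants a five character template ID, e.g. pd001
        if [ "${#id}" -ne 5 ]; then
            echo "template ID $id is not 5 characters long" >&2
            return 1
        fi
        # Replace the marker with the real hostname and write the clone
        sed "s/00HOSTNAME00/pdx-lab$i/" "$master" > "./$id.cfg"
    done
}
```

Running clone_templates mastr.cfg pd 100 from a scratch directory should produce pd001.cfg through pd100.cfg, while a three character prefix stops before writing anything. On the UDA itself you would still need the chown apache:apache step from the one-liner above.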

Now we need to update the templates.conf file to include all of the new kickstarts we just created. If you cat templates.conf, you'll notice that this is where the boot parameters for each template are kept. The boot parameters are visible in the UDA web console when you edit a template, in the text box just below the Save button. In our master template, the line is append ip=dhcp ksdevice=eth0 load_ramdisk=1 initrd=initrd.esx301 network ks=http://172.20.1.245/kickstart/mastr.cfg

Each template has its own line in the UDA templates.conf file. The line holds the boot parameters, and in front of them a section with the unique name of the template, the OS for the template, and the Bind to MAC: address if you are using it:

mastr;esx301;configfilename;;00-00-00-00-00-00;

We'll need to modify each new line as we add it to the file, changing mastr to the five character template ID we used when cloning the templates. Copy the boot parameters line from your master template in the UDA web console to use in the command below.

First we'll change directories into the /var/public/conf folder:

cd /var/public/conf

And back up templates.conf first just in case:

cp -p ./templates.conf ./backup_templates.conf

This is how the second one-line bash script will work:
  • Use seq again to print 1 to 100 using a three digit format in a for loop

  • Echo the boot parameters for each kickstart file, substituting the value of i each time

  • Append each unique line of the loop into the ./templates.conf file


for i in `seq -f %03g 1 100`; do echo \
"pd$i;esx301;configfilename;;00-00-00-00-00-00;append ip=dhcp ksdevice=eth0 \
load_ramdisk=1 initrd=initrd.esx301 network ks=http://172.20.1.245/kickstart/pd$i.cfg" \
>> ./templates.conf; done

Right after you issue that command, if you click on Templates in the UDA web console, you should see all 100 of the new templates. Click Edit Configuration on one of them and check to make sure the hostname changed.
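
Spot checking one template in the web console is fine, but a quick cross-check from the shell can confirm that every entry in the index has a matching file. The sketch below assumes, as in this article, that the first semicolon separated field of each templates.conf line is the template ID and that the kickstart file is named after it. missing_templates is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: list template IDs found in the index that have no matching .cfg file.
# missing_templates is a hypothetical helper.
# Usage: missing_templates INDEX_FILE KICKSTART_DIR
missing_templates() {
    conf="$1"; dir="$2"
    # The template ID is the first semicolon separated field on each line
    cut -d';' -f1 "$conf" | while read -r id; do
        [ -n "$id" ] || continue              # skip blank lines
        [ -f "$dir/$id.cfg" ] || echo "$id"   # report IDs with no file
    done
}
```

For the paths used here, the call would be missing_templates /var/public/conf/templates.conf /var/public/www/kickstart, and no output means every indexed template has a file.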

We're almost done, but if you boot up an unconfigured server hoping to use the new templates, you'll find that the TFTP boot menu doesn't have them.

The UDA wasn't meant to be used in this way, and it only updates the boot menu when a template is added or removed. So all we need to do is create a dummy template, name it 12345, and just save it without configuring anything. Then go right back to Templates and delete it. That will rebuild the boot menu with all 100 options. You'll find the new TFTP boot menu scrolls all the way down the list to the last in the series of available templates, which is sort of lame, but it's not a big enough deal to try and figure out how the menu is created.

I messed up, how can I delete these without clicking 100 times in the UDA?!
If you backed up templates.conf with the command above before trying any of this out, just use mv with the -f (force) option to restore your original templates.conf file:

cd /var/public/conf
mv -f ./backup_templates.conf ./templates.conf

Now you can also delete all the template files, but first test the bulk delete with an ls command:

cd /var/public/www/kickstart
ls pd*.cfg

If ls gives you a list of the specific files you want to delete, then you can safely remove them with rm:

cd /var/public/www/kickstart
rm -f pd*.cfg

Disclaimer: Running these commands on your UDA virtual appliance could potentially spam it with thousands of bogus templates or ruin the template index file if you are not careful, so please back it up or take a snapshot before attempting this. The specific commands listed above were tested in the lab, but you are on your own when modifying the examples for use in your environment. Be careful and know what the commands are going to do before executing them!

Also, the creator of the UDA probably never meant for the appliance to be used in this way, so we should apologize and promise to never request assistance with this or complain if we ruin our UDAs with these commands. :-J

And finally, pay close attention to when backticks (`) are used as opposed to single quotes ('). I should be using the $( ) syntax for command substitution, but I just don't like the way it looks, I'm stuck on backticks.


# Create 100 templates, replacing 00HOSTNAME00 from a template 
# named /var/public/www/kickstart/mastr.cfg with pdx-labXXX
# and naming each new template pdXXX.cfg
 
for i in `seq -f %03g 1 100`; \
do cat /var/public/www/kickstart/mastr.cfg | sed "s/00HOSTNAME00/pdx-lab$i/" \
> /var/public/www/kickstart/pd$i.cfg; \
chown apache:apache /var/public/www/kickstart/pd$i.cfg; \
done


# Backup /var/public/conf/templates.conf, preserving ownership

cp -p /var/public/conf/templates.conf /var/public/conf/backup_templates.conf


# Add the 100 templates named pdXXX.cfg to the UDA template index 

for i in `seq -f %03g 1 100`; do echo \
"pd$i;esx301;configfilename;;00-00-00-00-00-00;append ip=dhcp ksdevice=eth0 \
load_ramdisk=1 initrd=initrd.esx301 network ks=http://172.20.1.245/kickstart/pd$i.cfg" \
>> /var/public/conf/templates.conf; done


April 14, 2009

The Ultimate Kickstart File - Part 2

In Part 2 of our quest for The Ultimate Kickstart File, we'll get down and dirty with the %post section. We'll create a shell script and set it to execute after reboot, and use the strategy we explored in Part 1 for deriving the IP addresses for the various networking components by grabbing the digits from the ESX hostname. The kickstart we build here could be used to install hundreds of ESX hosts, requiring only one change per host.

Since we'll be changing the IP configuration of the Service Console after the kickstart installation, use DHCP for the boot protocol when setting up a template with UDA.

There's a lot going on here, so we'll break out each section as we go. To see the whole kickstart file come together, just scroll down to the bottom.

In the first part of our %post section, we'll get some of our Linux customizations out of the way. As we saw in Part 1, commands in the %post section will execute before the ESX host reboots for the first time, so there are no VMware services running yet. Our VMware commands will need to wait until we construct the shell script that will run after the first reboot. These first commands will affect the Linux service console only.

As you are working your way through these commands, notice when we are using the > symbol to clobber and create new files, and the >> symbol to append to the end of existing files.

We'll create our SSH banner and append its location to sshd_config:

/bin/cat > /etc/ssh/banner <<'SSHEOF' 
         ========================================= 
          WARNING: UNAUTHORIZED USE IS PROHIBITED 
         ----------------------------------------- 

     All Virtual Foundry maintained telecommunications, 
     data information systems,  and  related  equipment 
     are  for the communication, transmission, process- 
     ing, and  storage of Virtual  Foundry  information 
     only,  and  should  only be accessed by authorized 
     Virtual  Foundry   employees.  These  systems  and 
     equipment  are subject to authorized monitoring to 
     ensure  proper  functioning,  to  protect  against 
     unauthorized  use,  and to verify the presence and 
     performance of applicable security features.  Such 
     monitoring  may result in the acquisition, record- 
     ing, and analysis of all data being  communicated, 
     transmitted,  processed,  or stored in this system 
     by a user. 
     If monitoring reveals possible evidence of  crimi- 
     nal activity, such evidence may be provided to law 
     enforcement personnel.  Anyone using  this  system 
     expressly consents to such monitoring. 

SSHEOF

/bin/echo "banner /etc/ssh/banner" >> /etc/ssh/sshd_config 

Next we'll append a few lines to syslog.conf that send warnings and errors to the various consoles, which can come in handy during those crazy 'the network is down!' moments when you are running to the server racks. You can press Alt-F2, Alt-F3, and Alt-F4 to access the different consoles and see various degrees of message importance:

/bin/cp /etc/syslog.conf /etc/backup_syslog.conf

/bin/cat >> /etc/syslog.conf <<'EOFSYSLOG'
# Send error msgs to tty2-tty4 for troubleshooting
*.crit /dev/tty2
*.err /dev/tty3
*.warning /dev/tty4
EOFSYSLOG

Now we'll customize the root account's .bashrc file and add some custom colors so any VMware specific file extensions will stand out when we list directories. I always add an alias named lah for ls -lah, as I use this command constantly:

/bin/cp /root/.bashrc /root/backup_bashrc 

/usr/bin/dircolors -p > /root/.dircolors 

/bin/cat >> /root/.bashrc <<'BASHRCEOF' 
[ -e "$HOME/.dircolors" ] && DIR_COLORS="$HOME/.dircolors" 
[ -e "$DIR_COLORS" ] || DIR_COLORS="" 
eval "`dircolors -b $DIR_COLORS`" 
alias ls='ls --color=auto' 
alias lah='ls -lah' 
BASHRCEOF

/bin/cat >> /root/.dircolors <<'DIRCOLORSEOF' 
# VMware files 
.vmx 00;36 
.vmdk 01;35 
.vmtx 01;32 
.vmsn 01;31 
DIRCOLORSEOF

I like to have the customizations made to root's bash shell available to the non-root user automatically, so I'll copy root's .bashrc and .dircolors files to the /etc/skel directory, which is like a template directory for new user accounts. There are several different ways you could do this, like using /etc/profile and /etc/bashrc for instance, but I like the flexibility of giving each user a base set of rc files that they are free to customize as they see fit:

mv -f /etc/skel/.bashrc /root/backup_skel_bashrc
cp /root/.bashrc /etc/skel/.bashrc 
cp /root/.dircolors /etc/skel/.dircolors 

Adding the non-root user is simple enough with useradd, but first we need to generate an encrypted password. To do this, just execute /sbin/grub-md5-crypt from the console of one of your ESX servers and enter the password you want to use. grub-md5-crypt will output an encrypted password that we can use with useradd. Make sure you single quote the password, because we don't want bash doing any parameter substitution when it encounters the $ characters in the hash:

useradd -p '$1$IIoWw$UBdW2FnKMwci0OeBXmM.i0' admin
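To see why the single quotes matter, here's a quick sketch you can try in any bash shell. The variable names are made up for illustration, and the hash is the one from the useradd line above:

```shell
#!/bin/bash
# HASH_SINGLE and HASH_DOUBLE are illustrative names, not part of the kickstart file

# Single quotes: bash leaves every $ alone
HASH_SINGLE='$1$IIoWw$UBdW2FnKMwci0OeBXmM.i0'

# Double quotes: bash treats $1, $IIoWw, and $UBdW2FnKMwci0OeBXmM as
# parameters and substitutes whatever values they hold (most likely nothing)
HASH_DOUBLE="$1$IIoWw$UBdW2FnKMwci0OeBXmM.i0"

echo "single quoted: $HASH_SINGLE"
echo "double quoted: $HASH_DOUBLE"
```

The double quoted version loses most of the hash, so the account would end up with a password nobody can log in with.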

Now we'll start writing our post-reboot configuration script. When these commands are executed, we'll have running VMware services, so the vmware-vim-cmd and esxcfg-* commands are at our disposal. Also notice that we're creating the post-reboot script in the root directory. Since this script will have encrypted passwords and a good deal of information about the ESX server that was configured by it, we should place it in a secure location rather than on /tmp, which is world writable by default:

/bin/cat > /root/esx_script.sh <<'ESXCFG'
#!/bin/bash

It's always a good practice to define constants for variables you know might change at some point in the future, so we'll define our DNS, NTP, and iSCSI server IP addresses. If you do this, remember to enclose the constants in double quotes rather than single quotes if you need to quote them so bash will perform parameter substitution. Also, we have to place these within the post-reboot cat script, as they won't be expanded until the post-reboot script runs because we have quoted our cat script limit string.

CDNSSERVER1='172.20.1.240'
CDNSSERVER2='172.20.1.241'
CNTPSERVER1='172.20.1.240'
CISCSITARG1='172.21.1.250'
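If the difference between a quoted and an unquoted limit string is still a bit fuzzy, this small sketch shows it in isolation. DEMO_DIR, GREETING, and the file names are assumptions made up for the demonstration; nothing here touches the kickstart file:

```shell
#!/bin/bash
# Illustrative only: DEMO_DIR, GREETING, and the file names are made up for this sketch
DEMO_DIR=$(mktemp -d)
GREETING='set while the outer script runs'

# Unquoted limit string: $GREETING is expanded now, while cat runs
cat > "$DEMO_DIR/unquoted.sh" <<DEMOEOF
echo "$GREETING"
DEMOEOF

# Quoted limit string: the text is written literally and only gets
# expanded later, when the generated script itself runs
cat > "$DEMO_DIR/quoted.sh" <<'DEMOEOF'
echo "$GREETING"
DEMOEOF

cat "$DEMO_DIR/unquoted.sh"   # echo "set while the outer script runs"
cat "$DEMO_DIR/quoted.sh"     # echo "$GREETING"
```

Our post-reboot script uses the quoted form, which is why the constants have to be defined inside it rather than up in the %post section.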

We'll have to pause for a bit after the reboot to give the VMware services time to initialize. Four minutes is probably overkill, but it's safe:

sleep 4m

We need to add our DNS servers to resolv.conf, as the ESX host is currently picking those up from DHCP:

/bin/echo "nameserver $CDNSSERVER1" > /etc/resolv.conf
/bin/echo "nameserver $CDNSSERVER2" >> /etc/resolv.conf

Let's grab our host IP address from our hostname like we covered in Part 1:

HOSTIP=`perl -e 'use Sys::Hostname; hostname() =~ /^[a-zA-Z]+([0-9]+)\.?.*/ ; printf "%d", $1'`
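If you'd rather not lean on perl for this, the same idea can be sketched with sed and bash arithmetic. This is only an alternative under the same naming assumption (letters followed by a number, like esx02); host_index is a hypothetical helper name, not something the kickstart file defines:

```shell
#!/bin/bash
# host_index is a hypothetical helper, not part of the kickstart file
host_index() {
    local digits
    # Keep the first run of digits that follows the leading letters
    digits=$(printf '%s\n' "$1" | sed -n 's/^[a-zA-Z][a-zA-Z]*\([0-9][0-9]*\).*/\1/p')
    # Fail if the hostname does not follow the letters-then-number pattern
    [ -n "$digits" ] || return 1
    # 10# forces base 10, so a leading zero (esx08) is not read as octal
    echo $((10#$digits))
}

host_index esx02.vmnet.local    # prints 2
host_index esx08                # prints 8
```

The non-zero return for a hostname that doesn't match gives the script a chance to bail out before it assigns a bogus IP address.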

We'll change the Service Console IP address from the DHCP setting we put in the kickstart configuration section to the real IP address we derived from the hostname:

/usr/sbin/esxcfg-vswif --ip 172.20.1.$HOSTIP --netmask 255.255.255.0 vswif0

We're also going to change the name of the Service Console to differentiate it from the iSCSI Service Console we'll need to configure as well:

/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--portgroup-name="ESX_Console" vSwitch0 "Service Console"

Then we'll add a port group named Console_Net that we can use to home VMs in the ESX Console Network:

/usr/sbin/esxcfg-vswitch --add-pg="Console_Net" vSwitch0

Now we'll add another vSwitch, linking vmnic1 to it, and create our primary VM network, Server_Net:

/usr/sbin/esxcfg-vswitch --add vSwitch1
/usr/sbin/esxcfg-vswitch --link=vmnic1 vSwitch1
/usr/sbin/esxcfg-vswitch --add-pg="Server_Net" vSwitch1

Let's create our iSCSI networking at this point. Notice we're using the 172.21.1.1 - 100 range for the vmkernel IP addresses, and the 172.21.1.101 - 200 range for the iSCSI Service Console. In order to get the iSCSI Service Console IP address from the hostname, we tell bash to perform arithmetic expansion with the $(( )) syntax and add 100 to our host IP:

/usr/sbin/esxcfg-vswitch --add vSwitch2
/usr/sbin/esxcfg-vswitch --link=vmnic2 vSwitch2
/usr/sbin/esxcfg-vswitch --add-pg="iSCSI_Console" vSwitch2
/usr/sbin/esxcfg-vswif --add vswif1 --ip 172.21.1.$(($HOSTIP+100)) \
--netmask 255.255.255.0 --portgroup "iSCSI_Console"

/usr/sbin/esxcfg-vswitch --add-pg="iSCSI_VMkernel" vSwitch2
/usr/sbin/esxcfg-vmknic --add --ip 172.21.1.$HOSTIP \
--netmask 255.255.255.0 "iSCSI_VMkernel"

/usr/sbin/esxcfg-route --add default 172.21.1.254
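Here's the address arithmetic on its own, so you can check the results for a given host number before turning the script loose. The variable names other than HOSTIP are made up for this sketch, and the range check is an assumption based on the 1 - 100 scheme described above:

```shell
#!/bin/bash
# HOSTIP is hard-coded here for illustration; the real script derives it from the hostname
HOSTIP=2

# The addressing scheme only leaves room for host numbers 1 through 100
if [ "$HOSTIP" -lt 1 ] || [ "$HOSTIP" -gt 100 ]; then
    echo "host index $HOSTIP is outside 1-100" >&2
    exit 1
fi

SC_IP="172.20.1.$HOSTIP"                    # Service Console (vswif0)
ISCSI_VMK_IP="172.21.1.$HOSTIP"             # iSCSI vmkernel
ISCSI_SC_IP="172.21.1.$((HOSTIP + 100))"    # iSCSI Service Console (vswif1)

echo "$SC_IP $ISCSI_VMK_IP $ISCSI_SC_IP"    # prints 172.20.1.2 172.21.1.2 172.21.1.102
```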

We've got one more vSwitch to add, this one will handle VMotion traffic:

/usr/sbin/esxcfg-vswitch --add vSwitch3
/usr/sbin/esxcfg-vswitch --link=vmnic3 vSwitch3
/usr/sbin/esxcfg-vswitch --add-pg="VMotion_VMkernel" vSwitch3
/usr/sbin/esxcfg-vmknic --add --ip 172.22.1.$HOSTIP \
--netmask 255.255.255.0 "VMotion_VMkernel"

/usr/bin/vmware-vim-cmd hostsvc/vmotion/vnic_set vmk1

Our security policy dictates that we disable Promiscuous Mode, MAC Address Changes, and Forged Transmits on all vSwitches and port groups, so we'll put those commands here:

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch0

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch0

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch0

/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--securepolicy-promisc=false vSwitch0 ESX_Console

/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--securepolicy-macchange=false vSwitch0 ESX_Console

/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--securepolicy-forgedxmit=false vSwitch0 ESX_Console

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch1

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch1

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch1

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch2

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch2

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch2

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch3

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch3

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch3
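That's a lot of nearly identical lines. If you prefer, the vSwitch portion could be collapsed into a pair of loops, something like the sketch below. The function name and the DRYRUN variable are assumptions of this example; by default it only prints the commands, so you can try it on any machine:

```shell
#!/bin/bash
# Dry run by default: each command is printed instead of executed.
# Set DRYRUN="" on a real ESX host to actually run them.
DRYRUN=${DRYRUN-echo}

# set_vswitch_policies is a hypothetical helper name
set_vswitch_policies() {
    local vswitch policy
    for vswitch in vSwitch0 vSwitch1 vSwitch2 vSwitch3; do
        for policy in promisc macchange forgedxmit; do
            $DRYRUN /usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
                "--securepolicy-$policy=false" "$vswitch"
        done
    done
}

set_vswitch_policies
```

The long-hand version is easier to comment out one line at a time while testing, so pick whichever suits your workflow.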

Since we started out with a DHCP assigned address, we'll have to manually add an entry to /etc/hosts with the proper IP. The entry includes the short hostname, so we'll strip the domain from the hostname first and keep it in BAREHOST, which we'll use again for the iSCSI IQN:

BAREHOST=`hostname | cut --delimiter . --field 1`

printf "172.20.1.$HOSTIP\t\t`hostname` $BAREHOST\n" >>/etc/hosts

Now we'll allow iSCSI traffic through the firewall, enable the software iSCSI adapter, set an IQN name, and add a send target. We'll even set up our CHAP password:

/usr/bin/vmware-vim-cmd hostsvc/firewall_enable_ruleset swISCSIClient
/usr/bin/vmware-vim-cmd hostsvc/storage/software_iscsi_enabled true
/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_set_name \
vmhba32 iqn.1998-01.com.vmware:$BAREHOST

/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_add_send_target \
vmhba32 $CISCSITARG1

# Set CHAP password
/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_enable_chap \
vmhba32 iqn.1998-01.com.vmware:$BAREHOST "0123456789abcdef"

EDIT - this section had turned into an essay on random number generation, so I removed the bad options and am just listing the one now ~ RP

Notice that we specified the CHAP password in clear text here. That's not very secure as it will be hanging around on our UDA server. So let's do that a little more securely, by first using a bash one-liner to generate a random password into a text file on the ESX server itself, and then we'll read it from the text file for our iscsi_enable_chap command:

until [ ${#KEY} -eq 16 ]; do D=$(od -A n -N 1 -t u1 </dev/random); \
if ([ $D -ge 48 ] && [ $D -le 57 ]) || \
   ([ $D -ge 65 ] && [ $D -le 90 ]) || \
   ([ $D -ge 97 ] && [ $D -le 122 ]); \
then KEY="$KEY"$(printf \\$(printf '%03o' $D)); fi; done; \
printf $KEY >/root/chap_pwd.txt; unset KEY

We're telling bash to loop until the key is 16 characters long (${#var} returns variable length), read one byte from /dev/random into od (octal dump), not show us the offset value (-A n) as we just want the data, and output it in unsigned decimal format (-t u1). This gives us an integer ranging in value from 0 to 255. This range nicely covers the decimal values for the ASCII character sets we want to grab, so we'll run each number through a series of short-circuit tests to see if it falls within one of the character sets: [0-9] (48 to 57), [A-Z] (65 to 90), or [a-z] (97 to 122). If a number is in range, we'll add it to the end of the key by using printf to convert the decimal to an octal value, and printf again to print the ASCII character that corresponds to the octal value.
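If the one-liner is hard on the eyes, here is the same logic unrolled into a function with one step per line. The function name gen_key and its two arguments (key length and random device) are assumptions of this sketch, added so it can be tried against /dev/urandom on a test box:

```shell
#!/bin/bash
# gen_key is a hypothetical name; usage: gen_key [length] [random device]
gen_key() {
    local length="${1:-16}" dev="${2:-/dev/random}" key="" byte
    until [ "${#key}" -eq "$length" ]; do
        # Read one byte and print it as an unsigned decimal (0-255), no offset column
        byte=$(od -A n -N 1 -t u1 < "$dev" | tr -d ' \n')
        # Keep only 0-9 (48-57), A-Z (65-90), and a-z (97-122)
        if { [ "$byte" -ge 48 ] && [ "$byte" -le 57 ]; } ||
           { [ "$byte" -ge 65 ] && [ "$byte" -le 90 ]; } ||
           { [ "$byte" -ge 97 ] && [ "$byte" -le 122 ]; }; then
            # Decimal -> three-digit octal -> the matching ASCII character
            key="$key$(printf "\\$(printf '%03o' "$byte")")"
        fi
    done
    printf '%s\n' "$key"
}

gen_key 16 /dev/urandom
```

Only 62 of the 256 possible byte values pass the test, so expect the loop to read roughly four bytes for every character it keeps.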

Now make sure the chap_pwd.txt file is only readable by root:

chmod 600 /root/chap_pwd.txt

And we'll read the password file into the iscsi_enable_chap command:

# Set CHAP password
CHAP_PWD=`cat /root/chap_pwd.txt`; \
/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_enable_chap \
vmhba32 iqn.1998-01.com.vmware:$BAREHOST $CHAP_PWD

On to the NTP setup. This was gleaned from viewing the ntp.conf and step-tickers files after setting up NTP through the VI Client:

/usr/sbin/esxcfg-firewall --enableService ntpClient

/bin/mv -f /etc/ntp.conf /etc/backup_ntp.conf

/bin/cat > /etc/ntp.conf <<NTPEOF
restrict default kod nomodify notrap noquery nopeer
restrict 127.0.0.1
server 127.127.1.0
server $CNTPSERVER1
driftfile /var/lib/ntp/drift
NTPEOF

/bin/mv -f /etc/ntp/step-tickers /etc/ntp/backup_step-tickers
 
/bin/cat > /etc/ntp/step-tickers <<STEPEOF
server 127.127.1.0
server $CNTPSERVER1
STEPEOF

chkconfig --level 2345 ntpd on
service ntpd restart

Our security policies dictate that we set the non-root user account used for SSH access, admin in this example, to have a minimum password age of 2 days, a maximum of 90 days between password changes, and 14 days notice before the password expires. We'll use the chage command to change the password expiration settings. The esxcfg-auth command will enable the PAM module pam_tally.so, which tracks failed login attempts, and uses by default a failed attempt counter database /var/log/faillog, which we'll create and restrict to root access only. Finally, we'll use the faillog command to set the admin account to 5 login attempts with a 120 minute lockout duration:

chage -m 2 -M 90 -W 14 admin

# enables pam_tally.so for failed attempt lockout
esxcfg-auth --maxfailedlogins=5

# create a failed login log file
touch /var/log/faillog
chown root:root /var/log/faillog
chmod 600 /var/log/faillog

# set max attempts to 5 and duration to 120 minutes for user admin
faillog -u admin -m 5 -l 7200
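The lockout value passed to faillog -l is in seconds, which is where the 7200 comes from. If your policy is written in minutes, a couple of lines of arithmetic keep the intent visible. LOCK_MINUTES and LOCK_SECONDS are made-up names for this sketch, which only prints the resulting command:

```shell
#!/bin/bash
# LOCK_MINUTES and LOCK_SECONDS are illustrative names
LOCK_MINUTES=120
LOCK_SECONDS=$((LOCK_MINUTES * 60))

echo "faillog -u admin -m 5 -l $LOCK_SECONDS"   # prints faillog -u admin -m 5 -l 7200
```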

We'll end the cat script with the closing limit string:

ESXCFG

And add the executable bit to our shell script and remove all access for everyone but root:

chmod 700 /root/esx_script.sh

So we've created a shell script, but what's going to execute it before we log in to the ESX server for the first time? We're going to use the rc.local script to run our shell script after the ESX installation process reboots the host. The rc.local script is the last init script to run before the system presents a login prompt, and it's perfect for launching a one-time configuration script.

First of all, let's make a backup of rc.local:

cp /etc/rc.d/rc.local /etc/rc.d/rc.local.backup

Now we'll use a cat script to append our commands on to the end of rc.local:

cat >> /etc/rc.d/rc.local <<RCCFGEOF
/root/esx_script.sh
# Move rc.local.backup back to rc.local, restoring the original
mv -f /etc/rc.d/rc.local.backup /etc/rc.d/rc.local
RCCFGEOF
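If you want to convince yourself that the run-once trick really cleans up after itself, you can rehearse it in a scratch directory first. Everything in this sketch is a stand-in: DEMO replaces /etc/rc.d and /root, and the one-line scripts replace the real rc.local and esx_script.sh:

```shell
#!/bin/bash
# Scratch stand-ins for /etc/rc.d/rc.local and /root/esx_script.sh
DEMO=$(mktemp -d)
echo '# original rc.local contents' > "$DEMO/rc.local"
echo 'echo "configuring host"' > "$DEMO/esx_script.sh"

# Same steps as above: back up rc.local, then append the one-time commands
cp "$DEMO/rc.local" "$DEMO/rc.local.backup"
cat >> "$DEMO/rc.local" <<RCDEMOEOF
sh $DEMO/esx_script.sh
mv -f $DEMO/rc.local.backup $DEMO/rc.local
RCDEMOEOF

sh "$DEMO/rc.local"     # first "boot": runs the script, then restores the backup
cat "$DEMO/rc.local"    # only the original line is left
```

On the next "boot" the appended lines are gone, so the configuration script can't run a second time.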

We're done! We'll paste our worked up %post section into a template through the UDA web console and fire up an unconfigured server.

During the kickstart install, you can see what's going on behind the scenes by pressing Alt-F3, Alt-F4, and Alt-F5 to see the logging from various processes as the system installs. From the Alt-F3 screen, you can see if there are any errors in your script as the %post section is parsed.

Use the source
If you watch the console of the ESX server as it is being deployed automatically, press Alt-F1 when the ESX splash screen comes up and you can see the post-reboot processes execute. You'll notice that some of the commands being executed by the shell script are sending their output to the terminal. It would be nice if we could write our own messages during the post-reboot shell script, not only to see the script's progress as it executes, but also to see if any of the commands fail. It would be even better if we could format the output to match the output of the init scripts, with the fancy green OKs and red FAILs. But we don't have time for that and it would be a pain, wouldn't it?

It actually couldn't be any easier! Take a look at the shell scripts in /etc/init.d; almost all of them contain this line near the top:

. /etc/init.d/functions

- or -

. /etc/rc.d/init.d/functions

The period is a bash shortcut for the source builtin command, which runs the script specified under the current process, giving the current shell access to the subroutines and variables in the sourced script. And since /etc/init.d is just a symbolic link to /etc/rc.d/init.d, they are all sourcing the same file.

If you look through /etc/rc.d/init.d/functions, you'll see the functions that the startup scripts are calling, such as success, failure, action, and you can try them out in your shell. Just source functions in your shell by typing, . /etc/rc.d/init.d/functions

Then type success to see the familiar [ OK ], or failure to get a [FAILED].

You can browse through the startup scripts to get all kinds of ideas for how to use these, but we're going to use the action function in our Ultimate Kickstart File. Calling action will print out the descriptive string we want, run the command we specify, and then give us an [ OK ] or [FAILED] depending on the exit status of the command, plus it will even log the results to /var/log/messages for us!
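To get a feel for the pattern without an ESX host handy, here's a stripped-down stand-in. To be clear, demo_action is not the real function from /etc/rc.d/init.d/functions (that one also handles colors, column positioning, and logging); it's a simplified sketch of the same idea:

```shell
#!/bin/bash
# demo_action is a simplified, hypothetical stand-in for action
demo_action() {
    local label="$1" rc
    shift
    printf '%s' "$label"    # print the descriptive string first
    "$@"                    # run the command given after the label
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "[  OK  ]"
    else
        echo "[FAILED]"
    fi
    return "$rc"            # pass the command's exit status back to the caller
}

demo_action "Doing nothing, successfully: " true
demo_action "Failing on purpose: " false || echo "exit status was non-zero"
```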

We could do one big action, using it to run the entire post-reboot script:

cat >> /etc/rc.d/rc.local <<RCCFGEOF
# Load the init script functions so rc.local can call action
. /etc/init.d/functions
action "Executing ESX Configuration Script: " /root/esx_script.sh
# Move rc.local.backup back to rc.local, restoring the original
action "Restoring original rc.local: " mv -f /etc/rc.d/rc.local.backup /etc/rc.d/rc.local
RCCFGEOF

But we want to know when the script hits a few key places, so when we're staring at the boot screen, we have something to look at. Plus it makes things much more exciting when showing your boss the kickstart script you've just spent the past two weeks customizing! In the final kickstart script at the bottom, you'll notice a few action calls sprinkled around. Notice that they also provide some handy comments throughout the script, documenting things as we go. One pitfall: make sure you don't action any commands that redirect their output to a file or pipe it to another command, as the redirection applies to the whole line and you'll actually be capturing the output from action itself. Instead, just tell action to run /bin/true, which does nothing more than return successfully, and run the real command on its own line.
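The redirect pitfall is easy to reproduce with a scratch file. In this sketch, label_run is a made-up, simplified stand-in for action, and HOSTS_DEMO is a temporary file playing the part of /etc/hosts:

```shell
#!/bin/bash
# label_run is a simplified, hypothetical stand-in for action
label_run() {
    local label="$1"
    shift
    printf '%s' "$label"
    if "$@"; then echo "[  OK  ]"; else echo "[FAILED]"; fi
}

HOSTS_DEMO=$(mktemp)    # scratch file standing in for /etc/hosts

# Pitfall: the redirect applies to the whole line, so the label and the
# status land in the file along with the data, and nothing shows on screen
label_run "Adding entry to hosts: " echo "172.20.1.2 esx02" >> "$HOSTS_DEMO"
cat "$HOSTS_DEMO"

# The approach used in this article: report with true, then run the
# real command on its own line
: > "$HOSTS_DEMO"
label_run "Adding entry to hosts: " true
echo "172.20.1.2 esx02" >> "$HOSTS_DEMO"
cat "$HOSTS_DEMO"
```

The trade-off is that the [ OK ] no longer reflects whether the real command worked, so treat those lines as progress markers rather than status checks.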

Known issues
I've heard of folks having issues with physical NICs getting swapped around during a kickstart deployment, and have seen this issue myself on an HP DL360. This article from Frank Denneman's excellent blog seems to point to a solution, but unfortunately I don't have any HP hardware right now to test with. If you have seen this issue, and have a working solution, posting a comment with the details would be greatly appreciated.

There are also a couple of syntactic gotchas to watch out for. Make sure there is no white space after your limit strings in the cat scripts (you know, the BLAHBLAHEOF things scattered all over this script). Just one space at the end of a closing limit string will cause everything below it to be included in the cat command. Also, watch for when a backtick (`) is used as opposed to just a single quote (').

And finally, this kickstart script isn't meant to address every configuration item you'll need in your environment, there's a lot of security tweaking you can put in here as well. But hopefully this will give you some ideas to play with when you are getting it dialed in.

Pretty sweet, but...
So that's slick and all, but what if we had 100 ESX hosts to build?! We've gotten each template down to only one change per ESX host, but that's still 100 edits to make by hand! Plus we'll have to add 100 templates to the UDA web console, not cool!

In Part 3, the final part of our kickstart crushing series, we'll look at this exact scenario, and configure 100 UDA templates for 100 unique ESX hosts with just two commands.

Sample script:
Many of the commands listed in this script were generously shared with the community by virtualization and kickstart pioneers. Many thanks and kudos to the blogs and user forum members who have developed and given us this info!


# Regional Settings
keyboard us
lang en_US
langsupport --default en_US
timezone America/Los_Angeles

# Installation settings
skipx
mouse none
firewall --disabled
rootpw --iscrypted  $1$hL0l./$ifxVO8MxcXwYoQ5sfdzQn0
reboot
install
url --url http://172.20.1.245/esx/esx301/

# Driver disks

# Load drivers

# Bootloader options
bootloader --location=mbr --driveorder=sda  

# Authentication
auth --enableshadow --enablemd5

# Partitioning
clearpart --all --drives=sda --initlabel
part /boot --fstype ext3  --size 250  --ondisk=sda --asprimary
part / --fstype ext3  --size 5120  --ondisk=sda --asprimary
part swap   --size 1600  --ondisk=sda --asprimary
part /var/log --fstype ext3  --size 4096  --ondisk=sda 
part /var --fstype ext3  --size 4096  --ondisk=sda 
part /opt --fstype ext3  --size 2048  --ondisk=sda 
part None --fstype vmfs3  --size 1 --grow --ondisk=sda 
part /tmp --fstype ext3  --size 2048  --ondisk=sda 
part /home --fstype ext3  --size 2048  --ondisk=sda 
part None --fstype vmkcore  --size 100  --ondisk=sda 


#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#+ Initial Network Configuration
#+ Change --hostname
#+
network --device eth0 --bootproto dhcp --hostname esx02.vmnet.local --addvmportgroup=0 
#+
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

vmaccepteula

%packages
@base
@ everything

%post

/bin/cat > /etc/ssh/banner <<SSHEOF
         ========================================= 
          WARNING: UNAUTHORIZED USE IS PROHIBITED 
         ----------------------------------------- 

     All Virtual Foundry maintained telecommunications, 
     data information systems,  and  related  equipment 
     are  for the communication, transmission, process- 
     ing, and  storage of Virtual  Foundry  information 
     only,  and  should  only be accessed by authorized 
     Virtual  Foundry   employees.  These  systems  and 
     equipment  are subject to authorized monitoring to 
     ensure  proper  functioning,  to  protect  against 
     unauthorized  use,  and to verify the presence and 
     performance of applicable security features.  Such 
     monitoring  may result in the acquisition, record- 
     ing, and analysis of all data being  communicated, 
     transmitted,  processed,  or stored in this system 
     by a user. 
     If monitoring reveals possible evidence of  crimi- 
     nal activity, such evidence may be provided to law 
     enforcement personnel.  Anyone using  this  system 
     expressly consents to such monitoring. 

SSHEOF

/bin/echo "banner /etc/ssh/banner" >> /etc/ssh/sshd_config

/bin/cp /etc/syslog.conf /etc/backup_syslog.conf

/bin/cat >> /etc/syslog.conf <<'EOFSYSLOG'
# Send error msgs to tty2-tty4 for troubleshooting
*.crit /dev/tty2
*.err /dev/tty3
*.warning /dev/tty4
EOFSYSLOG

/bin/cp /root/.bashrc /root/backup_bashrc

/usr/bin/dircolors -p > /root/.dircolors

/bin/cat >> /root/.bashrc <<'BASHRCEOF'
[ -e "$HOME/.dircolors" ] && DIR_COLORS="$HOME/.dircolors"
[ -e "$DIR_COLORS" ] || DIR_COLORS=""
eval "`dircolors -b $DIR_COLORS`"
alias ls='ls --color=auto'
alias lah='ls -lah'
BASHRCEOF

/bin/cat >> /root/.dircolors <<'DIRCOLORSEOF'
# VMware files
.vmx 00;36
.vmdk 01;35
.vmtx 01;32
.vmsn 01;31
DIRCOLORSEOF

mv -f /etc/skel/.bashrc /root/backup_skel_bashrc
cp /root/.bashrc /etc/skel/.bashrc
cp /root/.dircolors /etc/skel/.dircolors

useradd -p '$1$IIoWw$UBdW2FnKMwci0OeBXmM.i0' admin

#########################################################################################
#
# The section below is the post-reboot script, VMware commands should be placed here
#
/bin/cat > /root/esx_script.sh <<'ESXCFG'
#!/bin/bash

. /etc/rc.d/init.d/functions

# Define constants for various servers here
CDNSSERVER1='172.20.1.240'
CDNSSERVER2='172.20.1.241'
CNTPSERVER1='172.20.1.240'
CISCSITARG1='172.21.1.250'

action "   Sleeping four minutes: " sleep 4m

BAREHOST=`hostname | cut --delimiter . --field 1`

action "   Using hostname $BAREHOST: " /bin/true

HOSTIP=`/usr/bin/perl -e \
'use Sys::Hostname; hostname() =~ /^[a-z-A-Z]+([0-9]+)\.?.*/ ; printf "%d", $1'`

action "   Using host IP index of $HOSTIP: " /bin/true

action "   Setting Service Console IP address: " \
/usr/sbin/esxcfg-vswif --ip 172.20.1.$HOSTIP --netmask 255.255.255.0 vswif0

action "   Adding DNS server info to resolv.conf: " /bin/true

/bin/echo "nameserver $CDNSSERVER1" > /etc/resolv.conf
/bin/echo "nameserver $CDNSSERVER2" >> /etc/resolv.conf

action "   Setting up virtual networking: " /bin/true

/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--portgroup-name="ESX_Console" vSwitch0 "Service Console"

/usr/sbin/esxcfg-vswitch --add-pg="Console_Net" vSwitch0

/usr/sbin/esxcfg-vswitch --add vSwitch1
/usr/sbin/esxcfg-vswitch --link=vmnic1 vSwitch1
/usr/sbin/esxcfg-vswitch --add-pg="Server_Net" vSwitch1

/usr/sbin/esxcfg-vswitch --add vSwitch2
/usr/sbin/esxcfg-vswitch --link=vmnic2 vSwitch2
/usr/sbin/esxcfg-vswitch --add-pg="iSCSI_Console" vSwitch2
/usr/sbin/esxcfg-vswif --add vswif1 --ip 172.21.1.$(($HOSTIP+100)) \
--netmask 255.255.255.0 --portgroup "iSCSI_Console" >/dev/null 2>&1

/usr/sbin/esxcfg-vswitch --add-pg="iSCSI_VMkernel" vSwitch2
/usr/sbin/esxcfg-vmknic --add --ip 172.21.1.$HOSTIP \
--netmask 255.255.255.0 "iSCSI_VMkernel"

/usr/sbin/esxcfg-route --add default 172.21.1.254 >/dev/null 2>&1

/usr/sbin/esxcfg-vswitch --add vSwitch3
/usr/sbin/esxcfg-vswitch --link=vmnic3 vSwitch3
/usr/sbin/esxcfg-vswitch --add-pg="VMotion_VMkernel" vSwitch3
/usr/sbin/esxcfg-vmknic --add --ip 172.22.1.$HOSTIP \
--netmask 255.255.255.0 "VMotion_VMkernel"

/usr/bin/vmware-vim-cmd hostsvc/vmotion/vnic_set vmk1

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch0
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch0
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch0

/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--securepolicy-macchange=false vSwitch0 ESX_Console
/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--securepolicy-promisc=false vSwitch0 ESX_Console
/usr/bin/vmware-vim-cmd hostsvc/net/portgroup_set \
--securepolicy-forgedxmit=false vSwitch0 ESX_Console

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch1
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch1
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch1

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch2
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch2
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch2

/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-promisc=false vSwitch3
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-macchange=false vSwitch3
/usr/bin/vmware-vim-cmd hostsvc/net/vswitch_setpolicy \
--securepolicy-forgedxmit=false vSwitch3

action "   Adding entry to /etc/hosts: " /bin/true
printf "172.20.1.$HOSTIP\t\t`hostname` $BAREHOST\n" >>/etc/hosts

action "   Opening iSCSI firewall ports: " \
/usr/bin/vmware-vim-cmd hostsvc/firewall_enable_ruleset swISCSIClient

action "   Enabling software iSCSI: " \
/usr/bin/vmware-vim-cmd hostsvc/storage/software_iscsi_enabled true

action "   Setting iSCSI IQN name: " \
/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_set_name \
vmhba32 iqn.1998-01.com.vmware:$BAREHOST

action "   Adding iSCSI send target: " \
/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_add_send_target \
vmhba32 $CISCSITARG1

action "   Generating random iSCSI CHAP password: " /bin/true

until [ ${#KEY} -eq 16 ]; do D=$(od -A n -N 1 -t u1 </dev/random); \
if ([ $D -ge 48 ] && [ $D -le 57 ]) || \
   ([ $D -ge 65 ] && [ $D -le 90 ]) || \
   ([ $D -ge 97 ] && [ $D -le 122 ]); \
then KEY="$KEY"$(printf \\$(printf '%03o' $D)); fi; done; \
printf $KEY >/root/chap_pwd.txt; unset KEY

chmod 600 /root/chap_pwd.txt

CHAP_PWD=`cat /root/chap_pwd.txt`

action "   Setting iSCSI CHAP password: " \
/usr/bin/vmware-vim-cmd hostsvc/storage/iscsi_enable_chap \
vmhba32 iqn.1998-01.com.vmware:$BAREHOST $CHAP_PWD

unset CHAP_PWD

action "   Setting up NTP: " /bin/true

/usr/sbin/esxcfg-firewall --enableService ntpClient
/bin/mv -f /etc/ntp.conf /etc/backup_ntp.conf
/bin/cat > /etc/ntp.conf <<NTPEOF
restrict default kod nomodify notrap noquery nopeer
restrict 127.0.0.1
server 127.127.1.0
server $CNTPSERVER1
driftfile /var/lib/ntp/drift
NTPEOF

/bin/mv -f /etc/ntp/step-tickers /etc/ntp/backup_step-tickers

/bin/cat > /etc/ntp/step-tickers <<STEPEOF
server 127.127.1.0
server $CNTPSERVER1
STEPEOF

chkconfig --level 2345 ntpd on
service ntpd restart >/dev/null 2>&1

action "   Configuring user account limits: " /bin/true

chage -m 2 -M 90 -W 14 admin

# enables pam_tally.so for failed attempt lockout
esxcfg-auth --maxfailedlogins=5

# create a failed login log file
touch /var/log/faillog
chown root:root /var/log/faillog
chmod 600 /var/log/faillog

# set max attempts to 5 and duration to 120 minutes for user admin
faillog -u admin -m 5 -l 7200

ESXCFG
#
# limit string ESXCFG marks the end of the post-reboot script
#########################################################################################

chmod 700 /root/esx_script.sh

cp /etc/rc.d/rc.local /etc/rc.d/rc.local.backup

cat >> /etc/rc.d/rc.local <<RCCFGEOF
. /etc/init.d/functions
action "Running ESX Configuration Script: "  /bin/true
# Execute post-reboot config script
/root/esx_script.sh
# Move rc.local.backup back to rc.local, restoring the original
action "   Restoring original rc.local: " mv -f /etc/rc.d/rc.local.backup /etc/rc.d/rc.local
RCCFGEOF

April 12, 2009

The Ultimate Kickstart File - Part 1

If you've built a few ESX servers by hand in a moderately complex Virtual Infrastructure environment, you know that it can be a very tedious task. The CD-ROM based installation portion of the install is simple and quick, but the post-install configuration can be downright painful. In this three part series, we'll explore the options for automating ESX installations, and detail the methods and commands we can use to make the process as painless as possible.

Is it really that tedious?
Here's a short list of the post-install tasks I've had to endure:
  • Create multiple vSwitches and name them accordingly, often having to log into VirtualCenter and open up the network configuration of an existing host to ensure the vSwitches are all named exactly the same

  • Set up multiple vmkernels and consult diagrams to figure out the IP addresses for them

  • Add iSCSI hosts and configure CHAP authentication, requiring that a password safe be logged into and the CHAP passwords copied out of it

  • Enable the NTP service and configure multiple NTP servers

  • Enable VMotion

  • Customize the .bashrc file for both root and our non-root user so they have the aliases we like to use

  • Enable account password expiration, lockout intervals, etc. according to our security policies

  • Find the document with our standard SSH banner and copy that into a putty window on the new host

  • Configure the various little tweaks and fixes we have found to be helpful in our environment

And if even one of these tasks is overlooked, it's sure to cause severe annoyance down the road, especially during a moment of crisis.

So it's no wonder that after the third or fourth ESX install I went right to the web to search for any way to automate the process. And lo and behold, VMware has actually provided us with a scripted install generator right from the home pages of our ESX servers! But even better, two of our ESX building brethren are actively working on projects to provide automated network installations!

UDA and EDA
If you haven't checked out the Ultimate Deployment Appliance (UDA) or the ESX Deployment Appliance (EDA) yet, you need to jump on that immediately. They are virtual appliances that allow you to PXE boot your hosts and run scripted ESX installations over the network. Both projects have their strengths and weaknesses, but we will focus on the UDA for the purposes of this tutorial. That's not to take anything away from the EDA, it's a great appliance and seems to be the preferred ESX deployment vehicle at the moment, but we're used to the quirks of the UDA, and prefer its single kickstart configuration window to the multiple windows and drop-downs of the EDA. We've actually gotten so obsessed with creating the perfect kickstart file, that each new change is checked into a svn repository for some basic source control, which is very easy to manage with UDA as the entire kickstart file is easily copied and pasted to and from the window. You should definitely check out EDA however if shell scripting doesn't interest you, as it takes care of a lot of the details for you. There are many good sources of information for setting up both UDA and EDA in your environment, so we'll leave that to the experts and assume that you are up and running with UDA for the rest of the tutorial.

A lofty goal
We've got some excellent tools available to us for scripting ESX installations, and an extreme aversion to repetitive tasks, so we're going to hack away at this until the entire installation and post-configuration are completely automated. Our goal is to have to change only one item in our kickstart files: the hostname. The only task left after the scripted install is complete will be to add the ESX host to VirtualCenter. That's a big undertaking, so we'd better start with the basics.

Anatomy of a kickstart file
To get a good feel for how a scripted installation works, it's a worthwhile exercise to create a basic kickstart configuration file using the ESX Server Scripted Installer, which is accessible via a link available from the web interface of any ESX server. Before using the Scripted Installer it must be enabled by editing a configuration file on the ESX server itself. The details of this are covered in the ESX Server 3 Installation Guide, in the Remote and Scripted Installations section.

A kickstart file consists of four basic sections: command, %packages, %pre, and %post.

The command section consists of kickstart specific configuration settings used during the operating system install. We'll let UDA create much of this for us when we create a new template using the UDA web interface.

The %packages section defines the software packages that should be included during the operating system installation. This is always @base and @ everything for an ESX installation, so we won't be touching this section.

The %pre section runs immediately after the kickstart options have been parsed, but before the operating system installation begins. The %pre section would be handy for any advanced partitioning options you may want to include before files are written to the system's hard drives. We won't be working with the %pre section in this tutorial, but it could be very handy in certain situations.

The %post section allows us to specify commands to be run immediately following the operating system installation, and it's where we'll spend most of our time in this tutorial. The commands in this section run after the installation, but before the system reboots. This really restricts the ESX specific things we can do here as there are no running VMware services yet. To work around this, we'll primarily use the %post section to configure a shell script that will be executed after the system reboots.

Note that by default the lines of commands included in the %post section are interpreted by the bash shell, but that can be changed with the --interpreter option, allowing you to use python syntax for example in the %post section. To change the interpreter, add the option to the %post line itself, as in %post --interpreter /usr/bin/python, and replace /usr/bin/python with the scripting language you prefer, provided it's available in the default install of ESX of course.

Now that we understand the basic layout of a kickstart file, we can develop some strategies for making our ESX installations as automated as possible:

- Use UDA to create our most generic configuration image
We build most of our ESX servers with almost exactly the same basic configuration, so we can use UDA to generate this baseline template. Create a new template in UDA using the root password you use for deployments, your Linux partitioning scheme, and your regional and licensing information. We'll use this basic development template to build and test our custom kickstart file.

- Enter our generic post-install commands in the %post section
If we edit our new template from the UDA website, we'll find that UDA has placed a %post section marker at the bottom of the kickstart file for us. As we discovered earlier, these commands are interpreted by the bash shell immediately after the operating system installation completes, but before the post-installation reboot. Since we won't have any VMware services running at this phase of the install, only our most generic Linux customization commands should go here. After putting in all of our non-ESX configuration commands, we will use the %post section to create a shell script that will be run after the ESX server automatically reboots as part of the install process.

- Use the %post section to create our shell script
It's a common practice to use a shell script to create another shell script, and that's exactly what we need to do here: use the kickstart script to create a new shell script. To do this, we'll use a here document to feed a list of commands to cat, and cat will output the commands to a shell script. You've probably seen here documents used a hundred times without looking at exactly what is going on under the hood, but we need to understand the basics in order to avoid a big pitfall with our shell script.

Building the post-install script
A here document is just a special block of text. When bash encounters a here document, instead of seeing the commands and the carriage returns in the text block as an indication that we want bash to execute the commands, bash feeds the block of text into the command we are directing it into. The here document is indicated by two less-than characters, <<, so in its basic form it looks like this:

COMMAND <<InputComesFromHERE
text
InputComesFromHERE

For our purposes, we'll be using cat as the COMMAND, so we'll call our here document a cat script, and it will look like this:

/bin/cat <<UntilYouSeeEOF
bash command 1
bash command 2
etc.
UntilYouSeeEOF

If you paste that code block into a bash shell, the output will be:

bash command 1
bash command 2
etc.

The shell fed the whole block of text to cat, and cat spit it back out to the terminal just as we wrote it, with the original white space and line feeds. If you remove the << characters, you'll get several 'command not found' messages from the shell, because without the <<, bash thinks you want to run the lines of text as commands.

Since we want to use our cat script to create a bash script, we need to tell cat to redirect its output to a file, rather than to our terminal. That's easy enough to do with the > character.

/bin/cat > ~/our_script.sh <<EOF
cp /etc/skel/.bashrc /etc/skel/backup_bashrc
cp /root/.bashrc /root/backup_bashrc
cp /root/.dircolors /root/backup_dircolors
EOF

If you copy that into a bash shell, you'll find it creates a new shell script in your home folder called our_script.sh, and the script has three commands in it to back up some .bashrc and .dircolors files.

So you're probably thinking Enough already, we already knew what a here document was, what's the pitfall we need to look out for? Well, in its default form, the shell will substitute anything it thinks is a parameter in our cat script. So if we write:

/bin/cat > ~/our_script.sh <<EOF
echo $HOSTNAME
EOF

Bash will substitute $HOSTNAME with our actual hostname at the moment the cat script runs, which may not be what we intended. This can be a real problem when trying to create users with useradd and specifying the password, since an encrypted password string usually contains $ characters. Any $ or other special shell characters in our cat script will trigger parameter replacement. This is easy enough to fix, though: we just need to single quote our limit string like this:

/bin/cat > ~/our_script.sh <<'EOF'
echo $HOSTNAME
EOF

Now check out the script our cat script created:

cat ~/our_script.sh

The output should be:

echo $HOSTNAME

Bash did not substitute $HOSTNAME in our cat script with our real host name.
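
Quoting the limit string is all-or-nothing. If a generated script needs some values filled in when the cat script runs and others left alone for later, one option is to leave the limit string unquoted and backslash-escape only the dollar signs that should survive. The sketch below assumes a made-up BUILD_DATE variable and writes to a scratch directory rather than your home folder:

```shell
# Scratch directory so nothing in the real home folder is touched.
workdir=$(mktemp -d)

# Hypothetical value that is already known while the generating script runs.
BUILD_DATE="2009-04-29"

# Unquoted limit string: $BUILD_DATE is expanded right now,
# while \$HOSTNAME is written to the file as a literal $HOSTNAME.
cat > "$workdir/mixed.sh" <<EOF
echo "built on $BUILD_DATE"
echo "running on \$HOSTNAME"
EOF

# Show what ended up in the generated script.
cat "$workdir/mixed.sh"
```

The generated file contains the date as plain text on its first line and an untouched $HOSTNAME on its second, so the hostname is only looked up when mixed.sh itself is run.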

Start your hacking
So now that we understand how our kickstart file needs to be constructed, and we have our basic strategy planned out, we can start building our scripted installation. If you have a development or test environment, or even a new server you can spend a few days using for kickstart testing, you're set. Even an old server sitting in the recycle pile can be very useful for this purpose, especially if it is from the same vendor as your newer servers.

If you have used UDA before, and have been paying attention, you may be wondering what all the talk about parameter substitution and quoted 'here documents' was for. The first time you attempt to copy one of your kickstart scripts to another template for an additional ESX server you want to bring up, you'll realize one of the big bummers of this setup; we have to change the hostname and every IP address in the script. Ugh!

One of the big draws of EDA is that it will do some of this for you, but if you read through the forum postings, it's pretty clear that it's not 100% yet. So what do we do, just deal with hunting through our kickstart scripts and meticulously changing IP addresses for each ESX host we want a template for? Not a chance! We'll use our natural tendency to name and address our virtual environments in an organized manner to our advantage.

Compulsive labeling
If you've seen a few VMware environments in the wild, you'll know that the ESX servers are almost always named with a series of digits indicating their uniqueness. Whether it's the simple (esx01, esx02, esx03) or something amazingly complex (nyc01-dc03-esx001, nyc01-dc03-esx002), we all appear to be labeling ESX servers in this format. Chalk it up to the fact that an ESX host can be serving many different roles in an enterprise at the same time, so this is really the only way to name them. The IP addressing schemes also tend to follow the naming scheme, so for esx01, the IP address of the service console might be 172.20.1.101, and esx02 would be 172.20.1.102, etc.

esx01
Service Console: 172.20.1.101
iSCSI VMkernel:  172.21.1.101
iSCSI Svc Console: 172.21.1.201
VMotion VMkernel: 172.22.1.101
esx02
Service Console: 172.20.1.102
iSCSI VMkernel:  172.21.1.102
iSCSI Svc Console: 172.21.1.202
VMotion VMkernel: 172.22.1.102

So if we're doing this in our environment, and our IP addressing scheme is tied to our host naming scheme in some fashion, couldn't we just grab those digits from the end of our hostname and feed them to our networking configuration for the IP addresses? You bet we can! Our kickstart script is creating a bash script that will run after the initial reboot, and we can do stuff like this easily in bash. So how do we use bash to grab just the two digits in our hostnames?

As with all problems like this, there are many good solutions. We could use sed or awk to grab the digits, but the syntax for capturing backreferences in both of those can look like Klingon. Backreferences? In the simplest terms, a backreference allows us to enclose a section of our regular expression in parentheses and save it for later as a match within our match. So we can create a regular expression to match the FQDN of our host naming scheme, and enclose the digits at the end of our hostname in (..) to grab only those after the match. Since Perl handles backreferences (and regular expressions in general) in a very straightforward manner, we're going to use a perl one-liner from our bash script to grab our host digits. Let's imagine that we have an ESX server named esx42.area51.mil, and we want to pull the 42 (or 43, 44, etc.) out of that FQDN. Our regular expression for matching the whole FQDN format would be:

/esx[0-9]+\.area51\.mil/

The leading / and trailing / serve to enclose our regular expression, and we're telling perl to match something that begins with esx followed by the set of numbers 0 to 9 ([0-9]) one or more times (+), then a period (\.), the word area51, another period (\.), and the word mil.

To have perl store the digits right after esx in a backreference for us, we simply enclose them, including the + quantifier, in parentheses:

/esx([0-9]+)\.area51\.mil/

And then reference the backreference using the special variable $1 that perl uses to store it, like this:

perl -e '$_ = "esx42.area51.mil" ; /esx([0-9]+)\.area51\.mil/ ; print $1'

That should output 42 to our terminal. Since we're using perl anyway, we might as well use perl's Sys::Hostname module in our one-liner to grab the hostname of our ESX server. So our complete one-liner to grab just the hostname digits of our ESX servers is:

perl -e 'use Sys::Hostname; hostname() =~ /esx([0-9]+)\.area51\.mil/ ; print $1'

Let's make it a little more portable so we can use it without modification in other scripts. We know our hostnames will always begin with some combination of letters, followed by a two or three digit number, followed by a period, and then some domain name, but possibly no domain name if we somehow forgot to put one in the kickstart.

In the original version of this post, I wasn't concerned about a hostname containing numbers with leading zeroes, like 003, as the VMware utilities will gladly accept an IP address in that format. But keeping the final octet of the IP in that format is causing trouble later in the script, so we'll just use printf instead of plain old print and specify that printf should output a decimal value, which will strip any leading zeroes. So let's change our perl one-liner to:

perl -e 'use Sys::Hostname; hostname() =~ /^[a-zA-Z]+([0-9]+)\.?.*/ ; printf "%d", $1'

That's pretty compact, fairly readable, and it's going to save us a ton of tedium.
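
To show where those digits might end up, here is a minimal sketch that turns a host number into the four addresses from the esx01/esx02 chart above. The function name, the 100 and 200 offsets, and the resulting 1 to 54 limit are assumptions read off that example chart, so adjust them to your own scheme.

```shell
# addresses_for_host NUMBER
# Prints the four example addresses for a host number such as 1 or 42.
addresses_for_host() {
    n=$1

    # Accept digits only.
    case "$n" in
        ''|*[!0-9]*) echo "not a host number: '$n'" >&2; return 1 ;;
    esac

    # Drop leading zeroes so a value like 08 is not read as octal.
    n=${n#"${n%%[!0]*}"}
    [ -n "$n" ] || n=0

    # 200 + n has to stay a usable final octet, which caps this scheme at 54.
    if [ "$n" -lt 1 ] || [ "$n" -gt 54 ]; then
        echo "host number out of range for this scheme: $n" >&2
        return 1
    fi

    echo "Service Console:   172.20.1.$((100 + n))"
    echo "iSCSI VMkernel:    172.21.1.$((100 + n))"
    echo "iSCSI Svc Console: 172.21.1.$((200 + n))"
    echo "VMotion VMkernel:  172.22.1.$((100 + n))"
}

addresses_for_host 42
```

In the generated post-install script, the argument would be the output of the perl one-liner rather than a literal 42. The range check is worth keeping in some form, because with these offsets a host numbered 55 or higher would produce an invalid address for the iSCSI service console.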

Well I can see how that would work in your environment, but we're naming our ESX servers after Greek deities, or characters from the Smurfs, how am I going to get an IP address from azrael, hefty, johan and peewit?

We can easily use a bash case statement to maintain a chart of hostnames to IPs. To grab our hostname this time, we're going to switch things up a little and call the hostname command and then pipe it through the cut utility, using a period as the delimiter, and only print the first field, basically stripping out our domain name. We'll then pipe it to the tr utility and tell it to translate any upper case letters to lower case so we don't have to worry about that in our case statement.

Why didn't you just call `hostname --short` to grab the bare host name without the domain name? I've seen the --short option return localhost on several different types of Linux hosts in the past, even though the man page claims it cuts the full hostname at the first dot. So basically I don't trust the --short option. Our cut method is also safe if hostname returns something without any periods, so even if you forgot to put your domain name in the kickstart file, this still works.

As you can see below, this becomes burdensome very quickly, but it may be the only way to go, so here's how it could look:

BAREHOST=`hostname | cut --delimiter . --field 1 | tr '[:upper:]' '[:lower:]'`

case $BAREHOST in
  azrael)
    IPNUM='1'
    ;;
  hefty)
    IPNUM='2'
    ;;
  johan)
    IPNUM='3'
    ;;
  peewit)
    IPNUM='4'
    ;;
  *)
    IPNUM='99'
    ;;
esac

The *) in our case statement is a catch all, so in the event we don't match any of the host names, we'll give the host an IP ending in 99 so the installation doesn't just bomb out.

It's hard to see how maintaining a case statement like this would be less work than just editing the kickstart for each host by hand, but at least it is possible.
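
If the chart has to exist, one way to make it slightly easier to maintain is to keep it as a two-column table and look the host up in a loop. This is only a sketch of an alternative layout: the function name is made up, and the host names and numbers are the same examples used in the case statement.

```shell
# ipnum_for_host NAME
# Prints the IP number for NAME, or 99 when the name is not in the table.
ipnum_for_host() {
    # Same normalization as BAREHOST: strip the domain, lower-case the rest.
    wanted=$(printf '%s\n' "$1" | cut -d . -f 1 | tr '[:upper:]' '[:lower:]')

    # One "hostname number" pair per line; adding a host means adding a line.
    while read -r name num; do
        if [ "$name" = "$wanted" ]; then
            echo "$num"
            return 0
        fi
    done <<'EOF'
azrael 1
hefty  2
johan  3
peewit 4
EOF

    # Catch-all, matching the *) branch of the case statement.
    echo 99
}

ipnum_for_host "Johan.area51.mil"
```

In the post-install script the argument would be the output of the hostname command instead of a literal name. The table is quoted with 'EOF', so nothing in it is subject to parameter substitution.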

Where's the beef?
That's a lot to digest, so we'll end Part 1 of our quest for kickstart nirvana here. In Part 2, we'll get into the real meat of automating the install process, including vSwitch creation and IP addressing, iSCSI configuration, SSH customization, creation of a non-root user account, and much more.


April 10, 2009

Hardening the VMX File: Redux

If you missed the first post on Hardening the VMX File, I recommend you give it a read, especially if you are currently not adding any VMware Tools hardening parameters to your .vmx files. It's pretty eye-opening to see what a non-privileged Windows user account is capable of changing in your VMs.

I've received some questions and comments about how the post overlooked the fact that a Citrix / Terminal Server can be really locked down, preventing users from opening control panels, command prompts, etc. This is all very true, and a security savvy AD administrator could probably have used some Group Policy changes to lock me out of everything I was able to do as a Terminal Services user.

After reading the post again, I realized that I had gone for the low-hanging fruit (a Terminal Services user) and had not done a very good job of making my point. What I should have stressed is that many of the powerful features of VMware Tools do not perform any user validation, allowing any process to make significant changes to the VM's configuration.

The Terminal Server example was easy to illustrate, but in order to log into a Terminal Server, the testuser account needed to be a member of the Users and Remote Desktop Users groups. So to make the point a little better, for this quick demonstration we're going to use an IIS website and an ASP form to execute the VMwareService.exe command remotely to make some changes. We'll also set the authentication method for the site to anonymous access, and use the Internet Guest Account, a restricted account that is only a member of the Guests group.

Demo setup
We'll roll out a fresh Windows Server 2003 Standard VM from a template for this demo. This server will have Service Pack 2 installed, a default instance of IIS 6, and will not be in a domain, just a workgroup. The VM will be hosted on an ESX 3.5u4 server, and will have the latest version of VMware Tools shipping with 3.5u4, which is build 153875.

First we'll install IIS in the VM: open the Control Panel > Add or Remove Programs > Add/Remove Windows Components > double click Application Server > and check Internet Information Server (IIS). You're going to need the Windows Server 2003 CD you installed the server with for the IIS installation.

Tip - During the IIS install, you'll be prompted to insert a CD with the Service Pack 2 files. Don't bother: the file the installer is looking for, convlog.exe, is actually cached in C:\WINDOWS\ServicePackFiles\i386, so just browse for it.

Once the IIS install is complete, open the Start Menu > Programs > Administrative Tools > Internet Information Services (IIS) Manager

We're going to create a new site:
  • Right click Web Sites > New > Web Site

  • Give the website a description of test

  • Drop down the IP address selector to the IP address of the server

  • For the website path, select C:\Inetpub, and click Make New Folder to create a folder in C:\Inetpub named test

  • Select the test directory and continue, leaving Allow anonymous access to this Web site checked

  • Check the Read and Run scripts permissions

We have to enable Active Server Pages for our test, so in IIS Manager, select Web Service Extensions, right click Active Server Pages, and choose Allow

Stop the Default Web Site by right clicking it and choosing Stop, just to make sure it doesn't interfere with our test

Open the C:\Inetpub\test folder, right click in it and create a new text document, naming it get.asp

Open C:\Inetpub\test\get.asp with Notepad, and paste in this code segment:

<html>
<body>
    
  <form method="GET" action="post.asp">
    Command <input type="text" name="MyCommand" size="80"/>
    <input type="submit" />
  </form>

</body>
</html>

Close and save C:\Inetpub\test\get.asp

Right click in C:\Inetpub\test again and create another new text file, naming it post.asp

Open C:\Inetpub\test\post.asp with Notepad, and paste in this code segment:

<html>
<body>
  <%
    Dim cmd, oShell, oCmd, strRes
    cmd = Request.QueryString("MyCommand")

    set oShell = CreateObject("WScript.Shell") 
    set oCmd = oShell.Exec(cmd) 
    strRes = oCmd.StdOut.Readall() 
    set oCmd = nothing
    set oShell = nothing 
 
    response.write "<pre>" & strRes & "</pre>"
  %>
</body>
</html>

Close and save C:\Inetpub\test\post.asp

This may be an overly simple setup for the website, but the purpose of this exercise is not to prove some problem with IIS 6 or ASP, we just want a website running as the Internet Guest Account so we can attempt to run VMwareService.exe as a restricted account with no rights to even log into the server.

Back in IIS Manager, right click the test website and choose Properties, and the Documents tab

Remove the existing documents from the list, and then click Add, type get.asp, and click Apply and OK to save the changes

Now from another system able to reach the test server over the network, open a browser to the IP address of the test server. If you set everything up correctly, you should see a simple web page with a text box labeled Command and a button labeled Submit Query

Since our test website can't execute files outside of the C:\Inetpub\test folder, we need to copy an executable into the folder to confirm things are working. Let's copy C:\WINDOWS\system32\ping.exe to C:\Inetpub\test

Now on the remote machine with the web page open, paste this into the Command text box: C:\Inetpub\test\ping.exe localhost

If you've set everything up correctly, you should get a page with output like this:

Pinging lab-test [127.0.0.1] with 32 bytes of data:

Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time=6ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128

Ping statistics for 127.0.0.1:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 6ms, Average = 1ms

Now let's copy C:\Program Files\VMware\VMware Tools\VMwareService.exe to the C:\Inetpub\test folder

Back in the Command text box of our website, put this in to see if we can store a variable:

C:\Inetpub\test\VMwareService.exe -cmd "info-set guestinfo.testvar1 This is our first test variable"

We should get this response back. Hmmm, wonder if it stored it:

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"

Let's try and retrieve it, in the Command text box, put this in:

C:\Inetpub\test\VMwareService.exe -cmd "info-get guestinfo.testvar1"

Well, we get that warning again... Ah but look, our variable!

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"
This is our first test variable

Let's see what this command does:

C:\Inetpub\test\VMwareService.exe -cmd "vmx.capability.edit_devices"

That returned a 1

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"
1

Let's see what it reports if we apply isolation.device.edit.disable = "true" to the .vmx and reboot:

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"
Unknown command

Interesting, it shut down the ability to even query the parameter if it is set. Let's see what this command does:

C:\Inetpub\test\VMwareService.exe -cmd "vmx.set_option broadcastIP 0 1"

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"
Invalid old value

Ah, must need to try it with the bits swapped:

C:\Inetpub\test\VMwareService.exe -cmd "vmx.set_option broadcastIP 1 0"

Well, that adds tools.broadcastIP = "TRUE" to the .vmx, but I'm not sure what that does.

Let's try changing the time synchronization as the Internet Guest Account, by placing this command in the website:

C:\Inetpub\test\VMwareService.exe -cmd "vmx.set_option synctime 0 1"

This is the response:

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"
Invalid old value

Oh, we must have already had it enabled, let's try:

C:\Inetpub\test\VMwareService.exe -cmd "vmx.set_option synctime 1 0"

Just the warning this time:

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"

Let's check the VMware Tools applet and see... it changed! We changed the time synchronization behavior of our virtual machine from a remote system as the Internet Guest Account!

Backdoor blues
This brings us to something we haven't explored yet: if we uninstall VMware Tools in a VM, have we closed the backdoor? Let's examine what the backdoor is exactly. A comment from the Open Virtual Machine Tools source code sums it up perfectly:

/*
 * backdoor.c --
 *
 *    First layer of the internal communication channel between guest
 *    applications and vmware
 *
 *    This is the backdoor. By using special ports of the virtual I/O space,
 *    and the virtual CPU registers, a guest application can send a
 *    synchroneous basic request to vmware, and vmware can reply to it.
 */

If you are really interested, you can read ./lib/backdoor/backdoorGcc32.c and find the inline assembly code for sending messages to the hypervisor and reading the responses. The backdoor is always there, but we need a client in the VM to communicate with it. So while removing VMware Tools might leave the VM unable to use the backdoor, we could always just compile our own tool and copy it over. Or in the worst case, some malware application could have backdoor communication functions built in! And as we've shown, the backdoor does no user privilege checking; it relies on special processor instructions embedded in the calls to the virtual CPU, so that malware could be nasty.

However, we can close the backdoor with a single .vmx directive:

monitor_control.restrict_backdoor = "true"

Let's see what happens if we try a command through the backdoor now:

Warning: Unable to open "C:\Documents and Settings\All Users\Application Data\VMware\VMware Tools\tools.conf"
Log: The VMwareService application must be run in a Virtual Machine.
Log: Backtrace:
Log: ----Backtrace using dbghelp.dll----
Log: Module path: C:\Inetpub\test\VMwareService.exe
Log: Module directory: C:\Inetpub\test\

 -- output omitted --

Wow, looks like the VMwareService application doesn't even think it's running in a VM any longer.

Wait, are you suggesting we just shut down the backdoor in our VMs? No, that's what the list of .vmx directives is for; we can select which portions of the backdoor to restrict.

So this list covers every possible malicious thing someone could attempt? I can't say that, but this list covers what VMware recommends in the VI3 Security Hardening white paper as well as several other published ESX security related guidelines.

"I don't like you because you're dangerous..."
I hope this has illustrated how important it is to apply these .vmx file hardening parameters to your virtual machines, whether or not they have VMware Tools installed. It doesn't matter if the potential attacker is using a Citrix account or just a cracked website; any process with any level of privileges is able to make significant changes to the configuration of your VMs. I'll list the recommended parameters we've explored in these two posts again, but check out Hardening the VMX File to see in detail what each parameter does:

Recommended VMX Directives:

isolation.device.connectable.disable = "true"
isolation.device.edit.disable = "true"
isolation.tools.setOption.disable = "true"
isolation.tools.log.disable = "true"
isolation.tools.diskWiper.disable = "true"
isolation.tools.diskShrink.disable = "true"
isolation.tools.copy.disable = "true"
isolation.tools.paste.disable = "true"
isolation.tools.setGUIOptions.enable = "false"
log.rotateSize = "100000"
log.keepOld = "10"
vlance.noOprom = "true"
vmxnet.noOprom = "true"

# PXE boot on the e1000 vNIC can be disabled with this directive:
ethernet0.opromsize = "0"
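
As a rough aid for checking a VM against this list, the sketch below reports which of the recommended directives are absent from a given .vmx file. It only reads the file and changes nothing. The function name and the example path in the final comment are hypothetical, and it assumes each directive sits on its own line in key = "value" form. The e1000 directive is left out because it depends on which vNIC the VM uses.

```shell
# audit_vmx FILE
# Prints every recommended directive that FILE does not contain.
# Returns 0 when nothing is missing, 1 when something is, 2 if FILE is unreadable.
audit_vmx() {
    vmx=$1
    missing=0

    # Normalize each line to key="value" so spacing differences do not matter.
    normalized=$(sed -e 's/^[[:space:]]*//' \
                     -e 's/[[:space:]]*$//' \
                     -e 's/[[:space:]]*=[[:space:]]*/=/' "$vmx") || return 2

    while read -r key value; do
        # -F literal match, -x whole line, -i ignore case so "TRUE" matches "true".
        if ! printf '%s\n' "$normalized" | grep -Fxqi "$key=\"$value\""; then
            echo "missing: $key = \"$value\""
            missing=1
        fi
    done <<'EOF'
isolation.device.connectable.disable true
isolation.device.edit.disable true
isolation.tools.setOption.disable true
isolation.tools.log.disable true
isolation.tools.diskWiper.disable true
isolation.tools.diskShrink.disable true
isolation.tools.copy.disable true
isolation.tools.paste.disable true
isolation.tools.setGUIOptions.enable false
log.rotateSize 100000
log.keepOld 10
vlance.noOprom true
vmxnet.noOprom true
EOF

    return $missing
}

# Example call with a hypothetical path:
#   audit_vmx /vmfs/volumes/datastore1/testvm/testvm.vmx
```

Because the function returns non-zero when anything is missing, it can be dropped into a loop over several .vmx files to flag the VMs that still need attention.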

