Linked clones falling off the domain

July 3, 2014

This is something I’ve seen before, and someone just posted the question. I thought I’d share my answer here in case anyone out there hits the same issue.

This is something unique to linked clones, and I do not expect it to happen with any other types of View desktops.

The issue is this: a user tries to log in to a View virtual desktop and is presented with the following message. (Sorry, no screenshot here.)

“The SAM database on the Windows Server does not have a computer account for this workstation trust relationship”

This basically means the desktop thinks it is on the domain, but Active Directory does not agree. As a result, when a user tries to log in with a domain account, it fails.

Now why does this happen?

For any standard Windows machine (desktop or server, physical or virtual), a computer account is created when the machine joins an AD domain. This account has a password that is known to the computer itself and, of course, to the directory. What happens if the password changes on one side only, or the account gets deleted from the domain? The machine at the other end does not know about it, so it keeps trying to log in with the credential it knows. But of course, the directory will reject those logins since the credentials do not match.

This is exactly like the situation where an admin changes your password (or deletes your account), but doesn’t tell you.

So, how can this happen if the sys admin from hell wasn’t let loose?

There are many possibilities, but I shall focus on one particular type of event. Time-traveling virtual machines! No, I don’t mean time drift, but snapshots!

Every virtual infrastructure admin loves the capability of snapshots (and hates it too). Yes, it’s a double-edged sword.

Think about this scenario.

  • Monday 3pm – a snapshot was taken due to an upcoming change, and the snapshot is for an easy revert [computer password at this point is : just@password]
  • Tuesday 4am – the Windows VM has reached the time to change its computer account password (by default the machine changes it every 30 days), and the password gets changed on both the VM and in AD [computer password is now : new@password]
  • Over the next few days, things are working just as normal, no issues.
  • Friday 2pm – a critical issue was discovered and a decision was made to revert to the snapshot [computer password before revert is : new@password]
  • After snapshot revert – the error appears, and no one can log in to the machine with an AD credential [computer password on the VM is now : just@password; but in AD it is still new@password]
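
If it helps to see the timeline as code, here is a minimal sketch of it in Python. This is a toy model I made up for illustration, not how Windows or AD actually store or check machine passwords: the `Directory` and `Machine` classes and the name `DESKTOP-01` are all assumptions, and the real secure channel is more involved than a single password comparison.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy stand-in for AD: maps computer name -> computer account password."""
    machine_passwords: dict = field(default_factory=dict)

    def authenticate(self, name: str, password: str) -> bool:
        # The directory only accepts the password it currently has on record.
        return self.machine_passwords.get(name) == password


@dataclass
class Machine:
    """Toy stand-in for a domain-joined VM holding its own password copy."""
    name: str
    password: str

    def rotate_password(self, directory: Directory, new_password: str) -> None:
        # A normal rotation updates both sides at the same time.
        self.password = new_password
        directory.machine_passwords[self.name] = new_password


ad = Directory({"DESKTOP-01": "just@password"})
vm = Machine("DESKTOP-01", "just@password")

snapshot = copy.deepcopy(vm)               # Monday 3pm: snapshot taken
vm.rotate_password(ad, "new@password")     # Tuesday 4am: scheduled rotation
ok_before_revert = ad.authenticate(vm.name, vm.password)

vm = copy.deepcopy(snapshot)               # Friday 2pm: revert to snapshot
ok_after_revert = ad.authenticate(vm.name, vm.password)

print("before revert:", ok_before_revert)
print("after revert: ", ok_after_revert)
```

The point of the sketch is simply that the revert rolls back only the VM's copy of the password. The directory never travels back in time with it.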

So, that is a situation we don’t want to be in, is it? Of course, one of the easiest ways to fix it is to rejoin the machine to the domain.

So, how is this relevant to linked clones?

Every linked clone has a refresh point that is created once provisioning completes. This is the state the virtual desktop always reverts to during a refresh operation, so in some ways a refresh is very much like a snapshot revert. As a result, a linked clone would eventually run into the scenario above.

To avoid that, VMware engineers have cleverly added a small 20MB disk to every linked clone. This disk does not get affected by refresh operations and is used to store the computer password for the linked clone. As a result, it is safe to always allow a refresh of a linked clone, and yet not have the computer fall off the domain.
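
To make the idea concrete, here is another small Python sketch. Again, this is only a toy model under my own assumptions: the `LinkedClone` class, the dictionaries standing in for disks, and the `internal_disk_readable` flag are all hypothetical, and the real mechanism View uses to write and read that disk is not shown here. It only illustrates why a copy of the password that survives the refresh keeps the desktop on the domain, and what happens when that copy cannot be read.

```python
class LinkedClone:
    """Toy model of a linked clone with a refreshable OS disk and a
    small disk that refresh operations leave alone."""

    def __init__(self, name: str, password: str):
        self.name = name
        self.os_disk = {"machine_password": password}
        # State captured once provisioning completes.
        self.refresh_point = dict(self.os_disk)
        # Small disk that is not touched by refresh operations.
        self.internal_disk = {"machine_password": password}

    def rotate_password(self, directory: dict, new_password: str) -> None:
        # Rotation updates the directory, the OS disk, and the small disk.
        directory[self.name] = new_password
        self.os_disk["machine_password"] = new_password
        self.internal_disk["machine_password"] = new_password

    def refresh(self, internal_disk_readable: bool = True) -> None:
        # The OS disk goes back to the refresh point, old password included.
        self.os_disk = dict(self.refresh_point)
        if internal_disk_readable:
            # The surviving copy of the password is put back in place.
            self.os_disk["machine_password"] = self.internal_disk["machine_password"]

    def can_log_in(self, directory: dict) -> bool:
        return directory.get(self.name) == self.os_disk["machine_password"]


ad = {"VDI-001": "just@password"}
clone = LinkedClone("VDI-001", "just@password")
clone.rotate_password(ad, "new@password")

clone.refresh()                                # small disk is readable
healthy = clone.can_log_in(ad)

clone.refresh(internal_disk_readable=False)    # small disk blocked or removed
broken = clone.can_log_in(ad)

print("refresh with disk readable:", healthy)
print("refresh with disk blocked: ", broken)
```

When the small disk is readable, the refreshed desktop ends up with the same password the directory has. When it is not, the desktop is back in the snapshot-revert situation.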

So much for the long introduction to the issue. The question now is why, even with this clever trick, some linked clones can still experience this issue.

I have seen a few cases where it was due to security software that customers wanted to install in the virtual desktop. Some of these products prevent the disk from being accessed, so the virtual desktop is unable to retrieve the password, and login fails.

I have also seen admins who have no idea where that disk came from and thought it should be removed. Oops!!

Whatever the cause, you need to know the importance of that little but non-trivial disk, and have it checked if you ever encounter this problem.
