IT Connect

Information technology tools and resources at the UW

20130801: MI DNS suffix management

Problem Statement

An overwhelming number of computers joined to the NETID domain have a misconfigured primary DNS suffix of netid.washington.edu.

Customers are not allowed to get a DNS entry in the netid.washington.edu DNS zone because there isn’t a good way to delegate that DNS zone.

Having a primary DNS suffix that does not resolve to the computer means that:

  • Kerberos negotiation will fail in certain scenarios, degrading secure authentication communications,
  • certain services can be prevented from running, and
  • Windows computers make an undesirable, bogus DDNS registration attempt against the DNS servers hosting that DNS zone.

Background

Windows computers have a NetBIOS name or hostname that is 15 characters or fewer. Windows computers can optionally have a primary DNS suffix. The fully qualified DNS name (FQDN) for a Windows computer is generally <NetBIOS name> + <primary DNS suffix>. In addition, a Windows computer can have a connection-specific DNS suffix that changes what name the computer uses on specific network interfaces.
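
The naming rule above can be sketched in a few lines of PowerShell. This is only an illustration: Join-ComputerFqdn is a hypothetical helper name, and it simply joins the two parts with a dot and enforces the 15-character limit described above.

```powershell
# Hypothetical helper (not part of any UWWI download): builds the FQDN
# from a NetBIOS name and an optional primary DNS suffix.
function Join-ComputerFqdn {
    param(
        [Parameter(Mandatory = $true)][string]$NetBiosName,
        [string]$PrimaryDnsSuffix = ''
    )
    if ($NetBiosName.Length -gt 15) {
        throw "NetBIOS name '$NetBiosName' exceeds 15 characters."
    }
    # Tolerate a leading or trailing dot on the suffix.
    $suffix = $PrimaryDnsSuffix.Trim('.')
    if ($suffix.Length -eq 0) {
        # No primary DNS suffix: the computer only has its short name.
        return $NetBiosName
    }
    return ('{0}.{1}' -f $NetBiosName, $suffix)
}
```

For example, Join-ComputerFqdn -NetBiosName 'POTTERY-WS01' -PrimaryDnsSuffix 'clients.uw.edu' returns POTTERY-WS01.clients.uw.edu.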

The FQDN for a Windows computer that is joined to a Windows domain is written to its AD computer account. This value is stored in the dnsHostName attribute.

In the Windows GUI, the NetBIOS name and the primary DNS suffix values are accessible via the System control panel. See http://www.netid.washington.edu/documentation/faqDelegated.aspx#dnsSuffixConfig for pictures.

By default, a Windows computer will change its existing DNS suffix to match the DNS suffix of the Windows domain it joins. This default behavior is configurable and a script for this is available (see details below).

The NETID domain service permits any DNS suffix to be used on computers that join the domain except netid.washington.edu (http://www.netid.washington.edu/documentation/delegatedOuPractices.aspx#dnsSuffix). Customers are encouraged to use either the DDNS zone provided by the UWWI line of business (clients.uw.edu) or to use their existing DNS zone.

Delegated OU customers can find their misconfigured computers by searching their OU with an LDAP search filter of (dnshostname=*.netid.washington.edu).

PowerShell commands that will do that are:

PS> Import-Module ActiveDirectory

PS> Get-ADComputer -LDAPFilter "(dnshostname=*.netid.washington.edu)" -SearchBase "OU=uwit,OU=Delegated,DC=netid,DC=washington,DC=edu"

Replace the value for the SearchBase parameter with the correct value for your delegated OU.
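
For repeated use, the search above could be wrapped in a small function. The sketch below is one possible shape, not an official UWWI script: Test-MisconfiguredDnsHostName and Get-MisconfiguredComputer are hypothetical names, and the wrapper assumes the ActiveDirectory module is installed on the computer where it runs.

```powershell
# Hypothetical helper: true when a dnsHostName value falls in the
# netid.washington.edu zone, which customers cannot use.
function Test-MisconfiguredDnsHostName {
    param([string]$DnsHostName)
    return ($DnsHostName -like '*.netid.washington.edu')
}

# Hypothetical wrapper around the search shown above; returns only the
# computer name and its dnsHostName value.
function Get-MisconfiguredComputer {
    param([Parameter(Mandatory = $true)][string]$SearchBase)
    Import-Module ActiveDirectory
    Get-ADComputer -LDAPFilter '(dnshostname=*.netid.washington.edu)' -SearchBase $SearchBase |
        Select-Object Name, DNSHostName
}
```

The output of Get-MisconfiguredComputer can be piped to Export-Csv to keep a list of the computers that still need fixing.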

Design Considerations and Assumptions

The UWWI service has an unwritten contract with Delegated OU customers: the customers are responsible for managing the computers in their OU, and UWWI will refrain from interfering unless the service is affected or university interests are at risk.

Some of the negative effects are not present for non-Windows computers.

Some customers will not have an existing DNS zone to use.

Customers will have a diversity of ways they want to manage the DNS suffix of their computers. A solution should take this into account.

Customers want correctly configured computers; secure authentication matters.

If UWWI changes the default behavior, we shouldn’t interfere with servers; there is no obvious DNS suffix default for servers.

Discussion

The default behavior with respect to the DNS suffix of a domain joined computer is the leading cause of the problem. Changing the default behavior or preventing that default behavior from happening is important. Specifically, the fact that a Windows computer will change its existing DNS suffix to match the DNS suffix of the Windows domain it joins is something to avoid or change.

Not having the computer’s DNS suffix match the Windows domain’s DNS suffix means the computer will be in what Microsoft calls a “disjoint namespace.” Microsoft supports the disjoint namespace configuration and has since 2000. Some issues can arise from a disjoint namespace configuration, but these are primarily because of poor coding practices in products, and there are documented ways to address them.

Not having a DNS suffix is almost as bad as having netid.washington.edu as the DNS suffix. The right default setting (for non-servers at least) would seem to be clients.uw.edu, the DDNS zone that the UWWI line of business provides for delegated OU customers. So if we can change the default setting, that would seem to be the right default to change to.

The number of delegated OU customers is significant, and so is the number of DNS suffixes they might want to use. In addition to changing the default value, we also need to fix existing misconfigured computers, at least to the new default, while still allowing OU customers to choose a DNS suffix other than the new default value.

There are not many options when it comes to managing the primary DNS suffix. It’s worth noting that there are many scripts out there which purport to change the primary DNS suffix, but instead change the connection-specific DNS suffix. A key tip-off that a script is really about connection-specific DNS suffix is if it references the registry keys at HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters. So the valid options for changing the primary DNS suffix that we are aware of are:

  1. Manually via the GUI: http://www.netid.washington.edu/documentation/faqDelegated.aspx#dnsSuffixConfig
  2. Programmatically via the Win32 API function SetComputerNameEx (only possible via local execution).
  3. Programmatically via netdom.exe (only possible via local execution).
  4. Via the DNS client settings in Group Policy: http://www.netid.washington.edu/documentation/ddns.aspx#gpSettings

Looking more closely at these options:

#1 is fine for a handful of computers, but doesn’t scale well.

At first glance, #2 & #3 have the same problem. However, it is possible to write a script that would leverage either #2 or #3 (and http://poshcode.org/2958 is a start on using PowerShell to do this with #2). But a scripted approach would also need to solve several other issues, including:

  • running with the right set of permissions (local admin + OU admin + running in an elevated context)
  • getting executed on the right set of computers
  • choosing the right DNS suffix

These issues aren’t insurmountable, but they do pose a challenge and likely imply lots of support issues if one of them goes wrong.
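
To make option #2 concrete, here is a minimal sketch of calling SetComputerNameEx from PowerShell. It is not the poshcode script referenced above and has not been validated against the NETID domain. Set-PrimaryDnsSuffix and the DnsSuffixSketch namespace are hypothetical names; the value 6 is the Win32 COMPUTER_NAME_FORMAT member ComputerNamePhysicalDnsDomain. The sketch must run locally in an elevated session and does not address the permission, targeting, or suffix-selection issues listed above.

```powershell
# Hypothetical helper (illustration only): stages a new primary DNS suffix
# on the local computer through the Win32 SetComputerNameEx function.
function Set-PrimaryDnsSuffix {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        [Parameter(Mandatory = $true)][string]$Suffix
    )

    # COMPUTER_NAME_FORMAT value ComputerNamePhysicalDnsDomain.
    $physicalDnsDomain = 6

    if ($PSCmdlet.ShouldProcess('local computer', "Set primary DNS suffix to '$Suffix'")) {
        # P/Invoke declaration for the Win32 function named in option #2.
        $signature = @'
[DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
public static extern bool SetComputerNameEx(int nameType, string lpBuffer);
'@
        $native = Add-Type -MemberDefinition $signature -Name 'ComputerNameNative' `
            -Namespace 'DnsSuffixSketch' -PassThru

        if (-not $native::SetComputerNameEx($physicalDnsDomain, $Suffix)) {
            $code = [System.Runtime.InteropServices.Marshal]::GetLastWin32Error()
            throw [System.ComponentModel.Win32Exception]::new($code)
        }
        Write-Verbose 'New primary DNS suffix staged; it takes effect after the next restart.'
    }
}
```

Running Set-PrimaryDnsSuffix -Suffix 'clients.uw.edu' -WhatIf shows what would be changed without calling the Win32 function.
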

#4 seems to be the best option in terms of scaling, simplicity, and the ability to scope where it applies. Group policy has both ACLs and WMI filters to scope where a given group policy is applied. We can leverage these to limit a GPO to apply to the Windows workstation operating systems only.

Unfortunately, we can’t limit this GPO to only apply to computers whose existing DNS suffix is netid.washington.edu, which means that we couldn’t apply this GPO at the domain level as a new default without intruding upon the ability of delegated OU customers to assert their own DNS suffix on their computers.

If we did apply this at the domain level and intrude, OU customers would still have the option to override this group policy with a group policy linked to a container closer to the computer account, but all other options to manage the DNS suffix would be lost.

This exhausts the obvious solutions, and we must now look at more nuanced variations. The simplest solution is #4, so we’ll pursue nuances of that solution further.

Asserting a GPO solution at the domain level doesn’t meet our design considerations because it crosses a line by taking away capabilities. A more nuanced solution should seek to limit the scenarios where capabilities are taken away, or seek agreement.

Nuanced approaches might be:

4a. Voluntary. Encourage each delegated OU to use the GPO solution on their own, and revisit this issue at a later time to check on effectiveness.
4b. Time constrained. Encourage each delegated OU to implement some solution by a certain date. If no solution has been adopted by that date (as evidenced by fixed values), the GPO solution is employed on that OU by UW-IT.
4c. Required with an option to request removal. Employ the GPO solution. Customers can request removal of the GPO solution.

4a is unlikely to comprehensively solve the problem, but seems like a reasonable next step since it is believed that many customers aren’t even aware that they are in violation of UWWI practices.

4b seems like a good evidence-based approach to comprehensively solve the problem in a given time period. It allows customers to solve the problem using their own methods, but if they don’t, a solution is employed to close the gap.

4c is the most heavy-handed of the nuanced solutions. This would be a good option if a solution was needed immediately, but we wanted to provide a way for customers to back out.

A combination of the most successful elements of 4a, 4b, and 4c is possible and might be best.

Finally, it’s worth noting that none of these solutions addresses non-Windows computers and servers. However, it is believed that by focusing on Windows workstations, awareness of this issue is likely to result in voluntary solutions for these computers too. If the non-Windows computers and servers remain a significant problem, we can address those later.

Proposal

We encourage all OU customers to uncheck the ‘Change primary DNS suffix when domain membership changes’ checkbox. A script for doing this is available with the migration helper scripts download; see doNotChangeDnsSuffixWhenDomainMembershipChanges.ps1 within that download. This step isn’t required, just strongly recommended, as it prevents the majority of problems.

We encourage all OU customers to change their practices around computer DNS suffix values to meet the UWWI practice of not using a DNS suffix of ‘netid.washington.edu’. Customers have until August 1, 2013 to come into compliance voluntarily.

To assist OU customers, UWWI will provide email-based reports that list all computers in their OU that are out of compliance. These reports will be sent on a weekly or bi-weekly basis.
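
As an illustration of what such a report might be built from, the sketch below counts out-of-compliance computers per delegated OU. It is not the UWWI reporting implementation. Get-DelegatedOuName and Get-ComplianceSummary are hypothetical names, and the code assumes the OU=<name>,OU=Delegated,DC=netid,DC=washington,DC=edu layout shown in the SearchBase example earlier. Input objects need DNSHostName and DistinguishedName properties, such as those returned by Get-ADComputer.

```powershell
# Hypothetical helper: extracts the delegated OU name from a computer's
# distinguished name, assuming the OU layout shown earlier in this document.
function Get-DelegatedOuName {
    param([Parameter(Mandatory = $true)][string]$DistinguishedName)
    if ($DistinguishedName -match 'OU=([^,]+),OU=Delegated,DC=netid,DC=washington,DC=edu$') {
        return $Matches[1]
    }
    return $null
}

# Hypothetical report builder: counts computers per delegated OU whose
# dnsHostName is still in the netid.washington.edu zone.
function Get-ComplianceSummary {
    param([Parameter(Mandatory = $true)][object[]]$Computer)
    $Computer |
        Where-Object { $_.DNSHostName -like '*.netid.washington.edu' } |
        Group-Object { Get-DelegatedOuName -DistinguishedName $_.DistinguishedName } |
        Select-Object @{ Name = 'DelegatedOu'; Expression = { $_.Name } }, Count
}
```

Each row of the summary pairs a delegated OU name with the number of computers in it that still need a new DNS suffix.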

On August 1, 2013, at its discretion, UWWI will implement enforcement of this policy where reporting indicates it is needed. This enforcement will involve a group policy link to a group policy object with settings as documented here: http://www.netid.washington.edu/documentation/gp/uwwi_defaultWorkstationDNSSuffix.htm. UWWI will contact the point of contact for each OU to advise them before taking that action. That GPO has a WMI filter so that it applies only to computers running a Windows workstation operating system. That GPO applies a DNS suffix of clients.uw.edu.
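
For readers unfamiliar with WMI filters, a workstation-only filter is commonly written as a WQL query on the ProductType property of Win32_OperatingSystem. The sketch below shows that common form and a way to check it on a single Windows computer. The query is an assumption for illustration, and the filter on the actual UWWI GPO may differ. Test-WorkstationWmiFilter is a hypothetical helper name.

```powershell
# Assumed WQL for a workstation-only WMI filter; the filter on the actual
# UWWI GPO may differ.
# Win32_OperatingSystem.ProductType: 1 = workstation, 2 = domain controller, 3 = server.
$workstationFilter = 'SELECT * FROM Win32_OperatingSystem WHERE ProductType = 1'

# Hypothetical helper: reports whether the local computer would pass the filter.
# A GPO WMI filter passes when its query returns at least one instance.
function Test-WorkstationWmiFilter {
    param([string]$Query = $workstationFilter)
    $instances = @(Get-CimInstance -Query $Query)
    return ($instances.Count -gt 0)
}
```

Run on a Windows workstation, Test-WorkstationWmiFilter should return True; on a server or domain controller the query returns nothing, so the GPO would not apply.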

At any time after August 1, delegated OU customers who would like to deploy an alternative solution can request that the GPO link be removed. An implementation period of 1 month is provided, after which the effectiveness of their solution will be re-evaluated.

Examples

The Pottery OU has 100 computers: 80 Windows workstations, 15 Macs, and 5 servers. All of its computers have a primary DNS suffix of netid.washington.edu. The Pottery OU admins aren’t really comfortable with group policy, and don’t have time to visit 100 computers or to write & support a script to fix their computers. So they choose to do nothing, allowing the new primary DNS suffix GPO to be linked to their OU on 8/1/13. After 8/1, the 80 workstations have a primary DNS suffix of clients.uw.edu, but the other 20 computers still have a primary DNS suffix of netid.washington.edu. UWWI follows up with the Pottery OU admins to get the remaining 20 computers manually fixed.

The Basket Weaving OU has 400 computers: 300 Windows workstations in a lab setting, 70 Windows workstations for employees, 26 Macs for employees, and 4 servers. They refresh their lab image to change the primary DNS suffix to weaving.uw.edu and deploy this new image. They don’t have time to fix their other computers, but have changed their build process so new domain joined computers all have weaving.uw.edu as the primary DNS suffix. On 8/1, UWWI informs them that the GPO solution will be linked to the root of the Basket Weaving OU. They realize this will affect their lab computers, and ask for an alternative that won’t undo their image work. UWWI suggests that they voluntarily link the GPO on the OU for their employee computers. Weaving chooses this alternative. Later in the year, they decide they have time to manually fix the employee computers. They remove the link themselves, and fix those computers.