Hello! This is Jessev from the Directory Services team with some advice on how to deal with an annoyance created by the print spooler service. We on the Directory Services team tend to see this issue more often than our User Experience team, who handle printer issues.
The term “Print Nightmare” refers to the security vulnerability fixed in the July 6, 2021 (7B.21) update. This blog post describes a situation that can develop as a result of the fix for the so-called Print Nightmare vulnerability.
Common symptoms include slow or sluggish DCs, slow or sluggish print servers, slow print clients, clients that cannot connect to print queues, and the like.
As with many performance issues, the source of the problem can fester and stew for a long time until, all of a sudden, problems begin to manifest. Sooner or later thresholds are breached and the symptoms begin. So, before you reach for the sword of MaxConcurrentAPI to slay performance monsters on your DCs, investigate and see whether the source of the problem is this Print Nightmare artifact.
Before we get into it, I am going to assume you know how to take and evaluate a network trace. Any network sniffer will do, so long as it captures full packets and can parse Kerberos traffic. We here in Directory Services use Netmon 3.4 and Wireshark.
The issue is caused by the spooler service sending a bad Service Principal Name (SPN) to a Domain Controller (DC) by way of the InitializeSecurityContext function. These Kerberos ticket requests fail, so the client falls back to NTLM. The bad SPN sent by the spooler is “krbtgt/NT Authority”.
The client spooler reaches out to the print server frequently to check the queue, check connectivity, check for updated printer drivers and, of course, send print jobs. Each of these connections requires some level of authentication. This extra authentication traffic leads to the performance problems, and to a loss of sanity among security teams wondering why their logs are flooded with Kerberos and NTLM traffic.
Symptoms tend to fall into these three areas:
- Slow Print Jobs, slow update of print queue, printer driver update failures
- Slow DCs, slow logons, replication issues
- Excessive Kerberos and NTLM traffic
Identify and Prove
The best source of data for finding the artifact is a network trace from a DC or from a known noisy client. The artifact will not be seen on the print server itself, unless that print server is also a DC. Note that the clients could be using any DC in the domain; it is on these DCs that the artifact will be found. Take the network and site topologies, and the relative location of the print server(s), into account when deciding on which DC to collect the network trace. When in doubt, collect the trace on the DC closest to the clients, on the PDC emulator, or on a client that you know is furiously attempting to authenticate.
Evaluating the Network trace
Once you have the trace, filter on KerberosV5 (just 'kerberos' in Wireshark) and then look for frames that display the error KDC_ERR_S_PRINCIPAL_UNKNOWN. In the frame details, expand Kerberos, then the Kerberos error, and look for the sname krbtgt/NT Authority. This sname is our artifact. If this is the source of the performance problems, there will be no lack of example frames.
More precise filters:
Netmon: KerberosV5.KrbError.Sname.String.NameString.String.String.OctetStream == "NT Authority"
Wireshark: kerberos.SNameString == "NT Authority"
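To see which clients generate the most of these errors, the Wireshark filter can be run in batch with `tshark` and the results tallied. The following is a minimal sketch rather than part of the procedure itself. It assumes `tshark` is installed and on the PATH, that the traffic is IPv4, and that restricting the filter to KRB-ERROR frames (`kerberos.msg_type == 30`) and reading `ip.dst` yields the client address. The function names are hypothetical.

```python
import subprocess
from collections import Counter

# Wireshark filter from the text, narrowed (as an assumption) to KRB-ERROR
# frames so that the destination address is the client that sent the bad SPN.
ARTIFACT_FILTER = 'kerberos.msg_type == 30 && kerberos.SNameString == "NT Authority"'


def tshark_command(capture_path):
    """Build a tshark command that prints the destination IP of each matching frame."""
    return ["tshark", "-r", capture_path, "-Y", ARTIFACT_FILTER,
            "-T", "fields", "-e", "ip.dst"]


def count_clients(tshark_output):
    """Tally matching frames per address, ignoring blank lines."""
    lines = (line.strip() for line in tshark_output.splitlines())
    return Counter(line for line in lines if line)


def top_clients(capture_path, limit=10):
    """Run tshark on a saved trace and return the noisiest client addresses."""
    result = subprocess.run(tshark_command(capture_path),
                            capture_output=True, text=True, check=True)
    return count_clients(result.stdout).most_common(limit)
```

Calling `top_clients` with the path to a saved trace returns a list of (address, frame count) pairs, which gives a first impression of the top talkers before any netlogon logs are collected.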
An example image from Netmon is below:
The mitigation for this is the registry value RpcNamedPipeAuthentication. If the goal is short-term relief, change the value on the top talkers identified using the method described in the Log Parser Studio (LPS) section below. For a longer-term solution, this value may be deployed using a Group Policy Preference item.
We set this on the clients:
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Printers\RPC
RpcNamedPipeAuthentication = 0x2 (DWORD)
Setting RpcNamedPipeAuthentication to 0x2 does not lead to a security vulnerability. This registry value controls whether the client machine sends authentication information to the remote machine for RPC over named pipes calls. Depending on the configuration of the environment, the remote machine might reject an incoming RPC call that lacks the authentication information, but the setting does not open a security vulnerability.
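For testing the change on a handful of clients before building the Group Policy Preference item, the value can be packaged as a .reg file. The sketch below only generates the file text. It assumes the key path HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Printers\RPC and the value name given above; the function name is hypothetical.

```python
# Key path and value name as given in the text (assumed spelling of the path).
RPC_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows NT\Printers\RPC"
VALUE_NAME = "RpcNamedPipeAuthentication"


def render_reg_file(value=2):
    """Return the text of a .reg file that sets the DWORD value under RPC_KEY."""
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{RPC_KEY}]\r\n"
        # DWORD data in a .reg file is written as eight hexadecimal digits.
        f'"{VALUE_NAME}"=dword:{value:08x}\r\n'
    )
```

The returned text can be saved to a file and imported on a test client with `reg import` from an elevated prompt. Review the generated file before importing it anywhere that matters.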
Now that we have evidence that the Print Nightmare Artifact could be the source of our problems, we can target top talkers for symptom relief. We do this by collecting netlogon logs from one or more DCs and then aggregating the log(s) into a table using Log Parser Studio.
Ok … so how do we know from which DCs to collect netlogon logs?
When the client fails to authenticate with Kerberos, it falls back to NTLM. The print server then passes the client's NTLM authentication through to the DC with which the print server has its secure channel. To find the print server's secure channel, use the method below.
On the print server run the following from an elevated command prompt, using contoso.com as the example domain:
The result will resemble the following:
Flags: b0 HAS_IP HAS_TIMESERV
Trusted DC Name dc.contoso.com
Trusted DC Connection Status Status = 0 0x0 NERR_Success
Trust Verification Status = 0 0x0 NERR_Success
The command completed successfully
In this case, the DC with which the print server has its secure channel is dc.contoso.com.
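When there are several print servers to check, the DC name can be pulled out of the command output with a few lines of script. This is a minimal sketch, assuming the output comes from an `nltest` secure channel query such as `nltest /sc_verify:contoso.com` (the exact command is an assumption here) and has the shape shown above. The function name is hypothetical.

```python
def parse_trusted_dc(nltest_output):
    """Return the name on the 'Trusted DC Name' line, or None if there is none."""
    prefix = "Trusted DC Name"
    for line in nltest_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # The name is often printed with leading backslashes; drop them.
            name = line[len(prefix):].strip().lstrip("\\")
            return name or None
    return None
```

Feeding the sample output shown above to `parse_trusted_dc` yields `dc.contoso.com`.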
Now that we have identified the DC, we can enable netlogon logging. Note that, if a preferred DC could not be identified, we can collect netlogon logs from all or some of the DCs. Collect from DCs that are in the same site as the print server if nothing else.
The commands below assume an elevated command prompt.
To enable Netlogon logging:
To disable Netlogon logging:
The default netlogon log location is c:\windows\debug\netlogon.log.
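Turning the logging on and off can also be scripted when several DCs are involved. The sketch below assumes the `nltest /dbflag` switch and the flag values from the Microsoft article on enabling debug logging for the Netlogon service (`0x2080ffff` to enable, `0x0` to disable). Confirm them against that article before use. The function names are hypothetical.

```python
import subprocess

# Flag values assumed from the Netlogon debug logging article.
ENABLE_FLAG = "0x2080ffff"
DISABLE_FLAG = "0x0"


def netlogon_logging_command(enable):
    """Build the nltest command that turns Netlogon debug logging on or off."""
    flag = ENABLE_FLAG if enable else DISABLE_FLAG
    return ["nltest", f"/dbflag:{flag}"]


def set_netlogon_logging(enable):
    """Run the command locally; needs an elevated session on the DC."""
    subprocess.run(netlogon_logging_command(enable), check=True)
```

Keeping the command builder separate from the function that runs it makes it easy to print and review the exact command line before executing anything on a DC.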
Use the frequency of the Kerberos errors in the network trace to judge how long to let netlogon logging collect data. Netlogon logging is relatively lightweight, so you can leave it running for as long as desired. The netlogon log rolls over in FIFO fashion. The default size is 20 MB.
Log Parser Studio
Once we have our netlogon.log file(s), we can evaluate that data with Log Parser Studio (LPS).
Download and Install
Log Parser Studio (LPS) can be downloaded here.
Create the Query
- In LPS, click on File on the menu bar, then New, and then New Query.
- In the bottom window, delete the default contents and paste in the query below. When you paste in the query, mind the word wrap and make sure to clean up any leading or trailing spaces.
SELECT EXTRACT_SUFFIX(TEXT, 0, 'Returns ') AS ERR,
TO_UPPERCASE(EXTRACT_PREFIX(EXTRACT_SUFFIX(TEXT, 0, 'logon of '), 0, 'from ')) AS UserName,
TO_UPPERCASE(EXTRACT_PREFIX(EXTRACT_SUFFIX(TEXT, 0, 'from '), 0, 'Returns ')) AS MachineName,
COUNT(*) FROM '[LOGFILEPATH]'
WHERE INDEX_OF(TO_UPPERCASE(TEXT), 'SAMLOGON') > 0
AND INDEX_OF(TO_UPPERCASE(TEXT), 'RETURNS') > 0
AND NOT INDEX_OF(TO_UPPERCASE(TEXT), 'KERBEROS') > 0
GROUP BY ERR, UserName, MachineName ORDER BY COUNT(*) DESC
- Click the Log Type item on the middle bar and choose TEXTLINELOG.
- Click the Log button on the toolbar and navigate to the directory that contains your netlogon log(s).
- Click the Run Query button (the red exclamation icon).
The query will sort the list with the top talkers at the top. Each line will resemble the following:
0x0 | CONTOSO\jdoe | WORKSTATION01 (VIA PRINTSERVER01) | 2416
Once the query has completed running, you may export the results to a CSV using the Green Arrow icon on the toolbar.
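If Log Parser Studio is not available, the same aggregation can be approximated with a short script. The sketch below mirrors the query's logic (non-Kerberos SamLogon lines that contain "Returns", grouped by error, user and machine). It assumes netlogon.log lines contain the literal markers `logon of `, `from ` and `Returns ` that the query relies on. The function names are hypothetical.

```python
from collections import Counter


def parse_samlogon(line):
    """Return (err, user, machine) for a non-Kerberos SamLogon 'Returns' line, else None."""
    upper = line.upper()
    if "SAMLOGON" not in upper or "RETURNS" not in upper or "KERBEROS" in upper:
        return None
    try:
        # Same cuts as the query: text after 'Returns ', between 'logon of '
        # and 'from ', and between 'from ' and 'Returns '.
        err = line.split("Returns ", 1)[1].strip()
        user = line.split("logon of ", 1)[1].split("from ", 1)[0].strip().upper()
        machine = line.split("from ", 1)[1].split("Returns ", 1)[0].strip().upper()
    except IndexError:
        # A marker is missing, so the line does not have the expected shape.
        return None
    return err, user, machine


def top_talkers(lines, limit=20):
    """Count matching lines per (err, user, machine), busiest first."""
    counts = Counter()
    for line in lines:
        row = parse_samlogon(line)
        if row is not None:
            counts[row] += 1
    return counts.most_common(limit)
```

Passing an open netlogon.log file object to `top_talkers` returns the noisiest (error, user, machine) combinations together with their counts, comparable to the rows the LPS query produces.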
The registry value change, described previously, can be applied to these top talkers. Once the value is changed on the client and the client is rebooted, a new netlogon log can be examined to prove that the client has stopped being so noisy. This can be helpful in a situation where change control requires proof of a solution before being rolled out to the enterprise.
This scenario can lead to a lot of difficult-to-define problems. DCs can become slow, causing logon delays, replication errors and the like. Print servers slow down, and print clients can have problems sending print jobs, checking the queue and updating printer drivers.
Once the scenario is confirmed, the fix can be rolled out to the entire enterprise with a Group Policy Preference item. If immediate relief is required because an enterprise-wide change must first pass through change control and InfoSec teams, use Log Parser Studio to evaluate the netlogon logs, identify the clients that are creating the most load, and get them under control.
(Though I have no ETA, and it is just a rumor, this particular value might be exposed in the future as an individual setting in Group Policy, instead of having to be deployed as a registry change with a Group Policy Preference item.)
- Managing deployment of Printer RPC binding changes for CVE-2021-1678 (KB4599464)
- Introducing: Log Parser Studio
- Enabling debug logging for the Netlogon service
- InitializeSecurityContext (Kerberos) function
- Group Policy Preferences
P.S. Special thanks to two customers, MJ and MB, who helped in one fashion or another in the creation of this blog post – You know who you are!