I received an email from a customer yesterday regarding their Replication, which began failing with this error after Windows updates were applied:
Message Replication-Replication Transaction-Log Reader Subsystem: agent servername-xxx2 failed. Executed as user: domainname\svcaccount. A required privilege is not held by the client. The step failed.
Slightly dummied, but the important content is in red. What does that mean? 'A required privilege is not held by the client'... he didn't change anything, I didn't change anything - why is Replication suddenly failing with permissions problems?
The only known change was the Windows Update, but every SQL Server DBA really should know that Windows Updates are often known to trigger failures like this due to security changes. In this case, Group Policy reapplications were performed, and silently overwrote the local security settings for the SQL Server Agent service account.
How did I know this?
First, I checked the security polices for the SQL Server Agent service account in Run \ secpol.msc \ Local Policies \ User Rights Assignment. Specifically, 'Log on as a batch job' and 'Log on as a service', which are needed by the Agent to run snapshot and log reader replication tasks. Neither were enabled for the Agent's service account, DOMAIN\SQLAgentSvc:
So, I input the Agent service account in there, applied and restarted the Agent, but it made no impact. The errors continued. I even rebooted the server but still saw no impact from the change. Then I ran this in powershell:
Get-ResultantSetOfPolicy -ReportType Html -Path C:\GPOReport.html
That cmd retrieves the Resultant Set of Policy (RSoP) information for the computer to html file for review, and it did NOT show 'Log on as a batch job' or 'Log on as a service' for the Agent's service account within Local Policies, User Rights Assignment:
This tells us that the domain-level GPO (Group Policy Object) reasserted itself after the Windows updates and overwrote the local security policy settings for the SQL Server service accounts. Without this policy, the Agent no longer has the permissions needed to run the snapshot and log reader replication tasks. Hence, the error 'A required privilege is not held by the client'.
The fix?
Assign both of these policies at the Group Level within Group Policy Management Console (gpm.msc) for the SQL Server Agent service account:
Log on as a batch jobLog on as a service
Then restart the SQL Server Agent.
After doing so, the failing replication tasks immediately begin to recover. I believe it took about 5-7 minutes for everything to recover fully.
In my Customer's words... "I just want to call out that you are amazing."
No comments:
Post a Comment