08 Jan, 2020
Sitecore Restarting 120 Times in a Day Is Not What I Call Ideal
This situation reminds me of the time before I got into IT, when I was a sound technician, working on tour or supplementing touring shows with our equipment and staff. We would receive technical riders explaining everything the band needed, from tea with honey to cases of beer and cold cut platters. Invariably the old urban legend would come up where acts would ask for red-only M&M's, ice that doesn't melt, etc. They didn't really want these things; they were checking to see whether the whole document was actually being read.
Now, when we supply an IT team with a requirements document for what's needed in their environment, we really hope they read and follow it thoroughly. For one project a couple of things were missed, such as a couple of servers' resources being too low, and these were quickly sorted out. We went live with the site and everything was smooth as silk, but the day after, the site restarted over 120 times! I can't do 120 sit-ups in a day, maybe not even in a week, and this site is restarting that much? This has got to get fixed!
The first thing anyone in this situation would do is check the logs for the last message written to the last closed log file, and here's what I see:
CONFIG change
CONFIG change
CONFIG change
CONFIG change
CONFIG change
CONFIG change
CONFIG change
CONFIG change
CONFIG change
Because we don't see the following shutdown message, we know Sitecore wasn't in the best of moods:
5304 13:58:24 INFO **************************************************
5304 13:58:24 WARN Sitecore shutting down
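If there are dozens of closed log files to go through, the same check can be scripted. Here is a minimal Python sketch that flags log files whose last few lines lack the shutdown message. The function names and the `log.*.txt` glob pattern are my own assumptions for illustration, so adjust them to however your logs are actually named.

```python
from pathlib import Path

# Marker text taken from the clean-shutdown log excerpt above.
SHUTDOWN_MARKER = "Sitecore shutting down"


def ended_cleanly(log_text: str, tail_lines: int = 5) -> bool:
    """Return True if the shutdown marker appears in the last few non-empty lines."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    return any(SHUTDOWN_MARKER in line for line in lines[-tail_lines:])


def unclean_logs(log_dir: str, pattern: str = "log.*.txt") -> list[str]:
    """List log files in log_dir whose tail lacks the shutdown marker.

    The glob pattern is an assumption, not something Sitecore guarantees.
    """
    unclean = []
    for path in sorted(Path(log_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        if not ended_cleanly(text):
            unclean.append(path.name)
    return unclean
```

Keep in mind the log file that is currently being written will show up in the list too, since it hasn't had a chance to shut down yet.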
Something is changing critical files in our Sitecore directory, causing IIS to restart the app pool. With no automated deployments to this environment, what could it be? Next, let's check the configuration history of this instance. In Data\diagnostics\configuration_history for versions prior to 9, or App_Data\diagnostics\configuration_history for 9+, you will find zip files named like this:
- 20190228Z.131548Z.zip
- 20190228Z.131401Z.zip
- 20190228Z.131341Z.zip
Each zip holds a collection of the files that were changed, so comparing them with the live copy should make it obvious what's going on, but it doesn't. They're all the same!
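Comparing snapshots by eye gets old fast, so here is a rough Python sketch of that comparison. It hashes every file inside a snapshot zip and checks it against the file at the same relative path in the live site. The function name is hypothetical, and it assumes the paths inside the zip mirror the layout under the live root, which may not hold for your instance.

```python
import hashlib
import zipfile
from pathlib import Path


def diff_snapshot(zip_path: str, live_root: str) -> dict[str, str]:
    """Compare each file in a snapshot zip with its counterpart under live_root.

    Returns {member_name: "same" | "different" | "missing"}.
    Assumption: member names are paths relative to live_root.
    """
    results = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content to compare
            snapshot_hash = hashlib.sha256(archive.read(info)).hexdigest()
            live_file = Path(live_root) / info.filename
            if not live_file.is_file():
                results[info.filename] = "missing"
            elif hashlib.sha256(live_file.read_bytes()).hexdigest() == snapshot_hash:
                results[info.filename] = "same"
            else:
                results[info.filename] = "different"
    return results
```

If every entry comes back as "same", as it did here, the file contents aren't really changing and something else is making IIS think they are.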
The answer to this problem is our old friend, McAfee. It turns out the IT team, who didn't really read our requirements document thoroughly enough, enabled a virus scanner on the critical directories the day after deployment “just to be safe”. When a virus scanner checks these directories, IIS believes the files were changed, causing a restart. Once McAfee was disabled for these directories everything settled down, including my heart rate. Another one for the books.