Who's Watching Your mediaIndexing Folder Size? No One, That's Who.

Out of all the Sitecore projects we've rolled out, I had never seen an issue with hard drive space. Our latest client launched their MVP phase a couple of months ago, and to my surprise we got a drive space alert in Azure Managed Cloud. After a few checks, it turned out the mediaIndexing folder was the offending party, and at 150 GB+, that's a big offense!

Of course, there are maintenance actions that clear directories on a regular basis, as seen here. The curious thing is, mediaIndexing isn't one of the directories they clean.

<files hint="raw:AddCommand" patch:source="Foundation.Indexing.Cleaning.config">
  <remove folder="/App_Data/logs" pattern="*log.*.txt" maxAge="30.00:00:00"/>
  <remove folder="/App_Data/diagnostics" pattern="*.*" maxAge="30.00:00:00" recursive="true"/>
  <remove folder="/App_Data/viewstate" pattern="*.txt" maxAge="2.00:00:00" recursive="true"/>
  <remove folder="/temp/diagnostics" pattern="*.*" maxAge="00:10:00" recursive="true"/>
  <remove folder="/App_Data/MediaCache" pattern="*.*" maxAge="90.00:00:00" recursive="true"/>
  <remove folder="/App_Data/diagnostics/configuration_history" pattern="*" maxAge="30.00:00:00" recursive="false" patch:source="Sitecore.Diagnostics.config"/>
  <remove folder="/App_Data/diagnostics/health_monitor" pattern="*.*" maxAge="07.00:00:00" recursive="false" patch:source="Sitecore.Diagnostics.config"/>
</files>

Since Sitecore 9, media such as PDFs are indexed internally, and files are created in this folder as part of that process. They aren't needed once indexing is done, so you can dump the contents of mediaIndexing if you're in a bind.


Managing the MediaIndexing Folder

As with any other modification to Sitecore, you'll want to use a patch file, and woe to those who edit out-of-the-box (OOB) files directly!

I added this file to our Foundation.Indexing project, and it cleared out the excess files within 6 hours (the agent's run interval). As I mentioned earlier, there's no harm in manually clearing this folder if you don't want to wait. The change removes any file older than 15 days, so feel free to modify the maxAge value to meet your needs.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/" xmlns:env="http://www.sitecore.net/xmlconfig/env/">
  <sitecore>
    <scheduling>
      <agent type="Sitecore.Tasks.CleanupAgent" method="Run" interval="06:00:00" >
        <files hint="raw:AddCommand">
          <remove folder="$(dataFolder)/mediaIndexing" pattern="*.*" maxAge="15.00:00:00" role:require="ContentManagement"/>
        </files>
      </agent>
    </scheduling>
  </sitecore>
</configuration>
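
If 15 days still lets the folder grow larger than you'd like, a shorter retention window is one option. The sketch below is a variation on the patch above, not something we rolled out: the 2-day maxAge and the recursive="true" attribute are assumptions to adapt. The recursive attribute appears on other CleanupAgent entries in the listing earlier, but whether mediaIndexing actually contains subfolders is something to verify in your own environment. The maxAge value follows the d.hh:mm:ss format used by the other entries.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore>
    <scheduling>
      <!-- Same CleanupAgent as above; only the remove entry differs -->
      <agent type="Sitecore.Tasks.CleanupAgent" method="Run" interval="06:00:00">
        <files hint="raw:AddCommand">
          <!-- Assumed values: 2-day retention, subfolders included -->
          <remove folder="$(dataFolder)/mediaIndexing" pattern="*.*" maxAge="2.00:00:00" recursive="true" role:require="ContentManagement"/>
        </files>
      </agent>
    </scheduling>
  </sitecore>
</configuration>
```

Use this in place of the 15-day patch rather than alongside it, then check the merged result on /sitecore/admin/showconfig.aspx to confirm the entry landed under the CleanupAgent.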