I’ve recently been tasked with consolidating our data footprint across our network. We have several terabytes of data spread across multiple servers, and I would estimate that about 30–40% of it is useless. I’d classify ‘useless’ data as data that hasn’t been touched in over 7 years because someone moved it to a new location and never cleaned up the old one, or data that’s simply old and irrelevant.
My question is…
Is there a tool that would allow me to scan large amounts of data to help me identify possible orphaned directories on our network?
Here’s a suggestion: search for DoubleKiller. I found it really useful for identifying duplicate files across terabytes of stuff, and it has lots of search options and constraints on which directories to scan. It’s a useful tool for the arsenal, but as with anything that reads files, it will clobber access times, which matters if you might need them in the future.
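If you’d rather roll your own for the “untouched in 7 years” part, here’s a minimal Python sketch (the function name `find_stale_dirs` and the 7-year cutoff are just illustrative choices, not from any particular tool). It flags directories in which no file has a recent access time. Note that `os.stat` only reads metadata, so the scan itself shouldn’t bump atimes, though `noatime`/`relatime` mount options can make atime unreliable in the first place:

```python
import os
import time

def find_stale_dirs(root, years=7):
    """Yield directories under `root` whose files have all gone
    unaccessed for at least `years` years (by st_atime)."""
    cutoff = time.time() - years * 365 * 24 * 3600
    for dirpath, dirnames, filenames in os.walk(root):
        if not filenames:
            continue  # no files to judge by; skip empty/container dirs
        # Most recent access time of any file directly in this directory
        newest = max(
            os.stat(os.path.join(dirpath, f)).st_atime for f in filenames
        )
        if newest < cutoff:
            yield dirpath
```

You could run this per share and dump the results to a report for review before deleting anything; it deliberately skips directories that contain only subdirectories, since those are judged by their children.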