Having been involved in several upgrade projects over the last few years, one thing I've often noticed is the poor quality of the data in a large, long-running system. This causes problems for upgrading and usually means you have to spend quite some time fixing the data first.
Bad data makes upgrading difficult and causes regression tests to fail because:
Once the data has been corrected for the upgrade, the original system has much higher quality data, and other issues and inconsistencies have been solved along the way. In a recent system we also saw large performance improvements once duplicate and junk data had been removed. On another system the data improvements saved the operations staff many hours of work a week, as a large number of post-report corrections were no longer needed.
So why isn't this analysis done on a regular basis to help keep a system healthy? The main reason is simply that it's too hard for the operations staff to do. So when you're designing a system, take this into account and enable these kinds of maintenance tasks: provide reports that surface problematic data, and tools that can correct whole sets of it.
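To make that concrete, here is a minimal sketch in Python of what such a report might look like. The record structure, field names and validation rules are all hypothetical; in a real system you'd pull rows from your own database and encode checks that reflect your business domain.

```python
from collections import Counter

# Hypothetical records standing in for rows pulled from the real system.
records = [
    {"id": 1, "email": "anna@example.com", "country": "SE"},
    {"id": 2, "email": "anna@example.com", "country": "SE"},  # duplicate of id 1
    {"id": 3, "email": "", "country": "??"},                  # junk values
    {"id": 4, "email": "bob@example.com", "country": "NO"},
]

def data_quality_report(rows):
    """Summarise data problems in a way operations staff can act on."""
    problems = []

    # Duplicate detection: more than one record sharing the same email.
    email_counts = Counter(r["email"] for r in rows if r["email"])
    for email, count in email_counts.items():
        if count > 1:
            ids = [r["id"] for r in rows if r["email"] == email]
            problems.append(f"DUPLICATE: email {email!r} used by records {ids}")

    # Junk detection: empty or placeholder values that will break the upgrade.
    for r in rows:
        if not r["email"]:
            problems.append(f"MISSING: record {r['id']} has no email")
        if r["country"] not in {"SE", "NO", "DK", "FI"}:  # hypothetical valid set
            problems.append(f"INVALID: record {r['id']} has country {r['country']!r}")

    return problems

if __name__ == "__main__":
    for line in data_quality_report(records):
        print(line)
```

The point is not the specific checks but that the output reads in business terms, so the people who own the data can act on it without help.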
Some things to consider:
Please don't rely on database tools to do this: your operations staff probably won't know how to use them, and your DBAs won't understand the business domain well enough to analyse the data. You need tools at the appropriate level for the appropriate people, and you should consider the complete lifecycle of your product.
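As a contrast to hand-crafted SQL, a correction tool can be a small, domain-aware command that operations staff run and audit themselves. Again, the record shape, the merge rule and the audit format below are assumptions for illustration only, continuing the hypothetical records from the earlier sketch.

```python
import datetime

# Hypothetical duplicate records, as surfaced by the report above.
records = [
    {"id": 1, "email": "anna@example.com", "country": "SE"},
    {"id": 2, "email": "anna@example.com", "country": "SE"},
    {"id": 4, "email": "bob@example.com", "country": "NO"},
]

def merge_duplicate_records(rows):
    """Collapse records sharing an email, keeping the lowest id.

    Returns (cleaned rows, audit log). The audit entries exist so that
    operations staff can see and explain exactly what the tool changed.
    """
    kept, audit = {}, []
    for r in sorted(rows, key=lambda r: r["id"]):
        key = r["email"] or f"missing-{r['id']}"  # never merge records with no email
        if key in kept:
            audit.append(
                f"{datetime.datetime.now():%Y-%m-%d %H:%M} "
                f"merged record {r['id']} into record {kept[key]['id']} ({key})"
            )
        else:
            kept[key] = r
    return list(kept.values()), audit

if __name__ == "__main__":
    cleaned, audit = merge_duplicate_records(records)
    print(f"{len(records) - len(cleaned)} duplicate(s) removed")
    for entry in audit:
        print(entry)
```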