"Do the developers realize that the reason we store data is to report on it ?"
Louis Davidson in Episode 178 of SQL Data Partners (around 19min 10sec )
I was faced with this question again as I could only witness the same piece of information inserted in two different tables. I’m not talking about replication for business continuity purposes. No, 2 different tables in the same database, the same schema. One comes to display the full content, the second is used in a restricted query to display a summary.
I wonder if this is the results of a decision or an artifact due to the ORM/SQL tool being used. But somebody may have forgotten that duplication is not free and the consequences are questions you shouldn’t have to ask yourself inside the same relational database.
Which location to trust to build a report? Are they synchronised? How to resolve conflicts? It is just a question of time until they find their way.
Now that the things are as they are, can it be changed? Too often not, because of a fragile system or the decision is made that fixing an anti-pattern would cost to much time.
Building on fragile foundation is by itself fragile and I’ve yet to see a no cost solution to
- program the checks
- monitor the multiple clones
- merge conflicts
- repair the bug when software update forgot about the “need” for duplication … and all associated reports.
For the last year, I’ve had the luck to work with a developer willing to normalise the design and in an environment where we could correct clear anti-patterns. Unfortunately the present duplication issue seems out of our reach… and we will build report on it.
I would love the rule: you designed it you report it… but I’ve not seen it much applied. The person implementing the reports tend not to be the one making the software/database design.
In Codd we trust and may Codd helps us!