Practice Management

 View Only

Enterprise Data Warehouses - 101

By Steven Magnuson posted 11-15-2019 10:57


The concept of an enterprise data warehouse (EDW) has been around for decades.  However, building an enterprise data warehouse is extremely challenging. Despite this difficulty, the benefits far outweigh the challenges.  Having a centralized repository of information from an organization’s numerous disparate information systems provides many unique opportunities to present data in new and useful ways.  It greatly accelerates the speed to production for new reporting. It also enables an organization to put information into the hands of the business users, enabling them to create their own reporting.

Where do I start?

The first step to the implementation of an EDW is to determine the systems that “own” the data (oftentimes referred to as the system of record).  Frequently information, such as personnel information is maintained in multiple systems, and there can be discrepancies between them. Therefore, it is essential to determine which system is the correct one to reference for each field in order to get accurate information.  It is also important that this system be a managed system, meaning as information changes, the changes are captured properly in this system.

Once you have identified the systems of record, you should focus on which systems are the most important systems for your core business needs, providing immediate value.  At Ballard Spahr LLP we took a phased approach to our EDW implementation, focusing first on our core data elements: People, Clients, Matters, and Financials. We have subsequently added additional data from other secondary information systems, further bolstering our EDW.  Since these systems of record are constantly evolving to meet changing business needs, it’s important to focus on specific initial goals to avoid a never ending project.

EDW Architecture Best Practices

One of the most important aspects of an EDW is high availability.  It is extremely important that load processes (which can take hours) happen seamlessly in the background.  At Ballard we ensure that our EDW is always available by conducting our load process into staging tables. Once the load is complete we swap the staging tables with the production ones having the data switch in an instant.  The other advantage of this is that if there is a failure anywhere in the load process, it happens in the staging environment, allowing a developer to fix the issue while end users can still access accurate, albeit somewhat stale data.  

Another key feature of an EDW is high performance for reporting.  Think about your customers and how they work to determine all the ways data is commonly reported on.  Create fact tables that pre-aggregate and calculate to meet these needs. Although this can lead to lots of duplicative data, it is far more efficient to report on these pre-compiled tables than a transaction log with a query that does the math on the fly.

Lastly, management of an EDW can be a huge task in and of itself.  You should plan ahead so that you are prepared to meet changes in business.  At Ballard we purposely included a source agnostic layer between each of our source systems and the EDW.  This allows us to more easily adapt when the business changes a core system. Instead of going back to the drawing board, we simply setup the new integration, unplug the old integration, and plug the new one in.

My EDW is Built, Now What?

Once you’ve completed the initial build of your EDW you can start the fun stuff.  Now you have the ability to easily create new reports that look at your data in ways never before available.  At Ballard we have taken advantage of business intelligence tools like Qlik and PowerBI (two leaders in this industry) in order to present the data to our end users in a way that enables decision makers to quickly see information at a bird’s eye view, with the ability to drill in to discover new insights.  We are also using machine learning techniques and artificial intelligence tools to further enrich our data, providing insights and predictive analytics to our end users. All of these things are technically possible with siloed data, but a clean centralized repository makes these efforts exponentially easier, and can greatly reduce your time from idea to production.  The most important thing to remember now that you have an EDW is to be creative and have fun!

If you are interested in reading other items of interest, please visit'

#Data Analytics
1 comment