What is a Data Warehouse?

When you mention “data warehouse” to someone not acquainted with the term, the first image that springs to their minds is usually something like the one above. When in fact, it actually looks a little more like this:

A data warehouse is, on the highest level, a central repository of data. Just as big data has Doug Laney and his 3 V’s, data warehousing has Bill Inmon, and his much-less catchy but oft-cited definition:

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.

This might not be all that illuminating at first glance, but if you unpack it, there’s a lot of information to be found here.

“Subject-Oriented”: Contrary to what many people believe when first hearing the term, a data warehouse does not have to be one monolithic repository of all your data. A data warehouse can be used to store and analyse data on a specific subject.
“Integrated”- One of the key features of data warehouses is that they store and leverage data from multiple different sources.
“Time variant”- Data warehouses are the home of historical data. You can pull up data from a specific time period, be that 3 months or 3 years ago. Data warehouses often also contain detailed temporal data, showing changes to a dataset. For example, if you’re using your data warehouse to store key customer metrics, you can not only see a customer’s current address or recent purchase history, but all data previously associated with this customer & how it’s changed over time.
“Non-volatile”–The data contained within a data warehouse is not subject to change.

Another key defintion often cited when defining the term “data warehouse” comes courtesy of Ralph Kimball. He states:

A data warehouse is a copy of transaction data specifically structured for query and analysis.

The Data Warehousing Information Center add two caveats to Kimball’s defintion, which I think are worth highlighting:

“1) Sometimes non-transaction data are stored in a data warehouse – though probably 95-99% of the data usually are transaction data.
2) I say “querying and reporting” rather than “query and analysis” because the main output from data warehouse systems are either tabular listings (queries) with minimal formatting or highly formatted “formal” reports. Queries and reports generated from data stored in a data warehouse may or may not be used for analysis.”

They also raise the salient point that what Kimball purposefully doesn’t say is also worth taking into consideration. No mention is made about the form of the stored data, as a data’s form never qualifies or disqualifies it from being considered a data warehouse. A data warehouse can be a relational database, a multidimensional database, a hierarchical database, an object database- the list goes on. There is no one form, or set of forms, that a data warehouse is expected to make.

Some leaders in the field of data warehousing solutions include:

Exasol- Exasol offer a ready-to-run data warehouse appliance, tailored to your data volume and analysis requirements. They offer a free single-node copy of their software here, if you’d like to try before you buy.
Teradata- Teradata offer a broad spectrum of data warehousing products, from their fully inclusive Teradata Active Enterprise Data Warehouse, to the Teradata Database Express, which is free to download straight away. Features within the portfolio include support for JSON documents and a range of language plugins (including Ruby, Perl and Python)
Data Virtuality- Data Virtuality’s USP is that it claims to unite “all data, all systems and all sources” on one platform, automatically, in less than one week. It’s quite a claim, but reviewing their extensive list of connectors should convince you of their credibility.

Other important vendors in the data warehousing space include EMC, Greenplume, Splunk, Amazon Warehouse Services and Google Mesa. Stay tuned in the coming weeks for in-depth guides focused around these technologies.

(Image sources: Wikipedia, Sébastien Barré)

What is a Data Warehouse?

Related Posts

Why UI/UX design agencies are the key to success

Precision in the wild: AI-powered tools redefining hunting practices

The Smart Contract revolution – Transforming business operations across industries

Creative ways to enhance your documents and boost productivity

The future of data management: Exploring the rise of cloud platforms

How data science is revolutionizing prefabricated construction: A look at garage kits

LATEST NEWS

iPhone 15 Pro gets a secret AI upgrade you didn’t know about

Super Micro up 113% in 3 months analysts predict up to 108% more gains

Palantir’s valuation could plunge 50% by 2027: Here’s why

Applied AI in Product Engineering: Making Users Happy with Machine Learning

Humane AI Pin shutdown proves the hype was always fake

Want an RTX 5090? Nvidia decides who gets one, here’s how to apply

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

What is a Data Warehouse?

Related Posts

LATEST NEWS

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

Follow Us