Jump to content
Main menu
Main menu
move to sidebar
hide
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Special pages
Niidae Wiki
Search
Search
Appearance
Create account
Log in
Personal tools
Create account
Log in
Pages for logged out editors
learn more
Contributions
Talk
Editing
Data warehouse
(section)
Page
Discussion
English
Read
Edit
View history
Tools
Tools
move to sidebar
hide
Actions
Read
Edit
View history
General
What links here
Related changes
Page information
Appearance
move to sidebar
hide
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
===Dimensional versus normalized approach for storage of data=== The two most important approaches to store data in a warehouse are dimensional and normalized. The dimensional approach uses a [[star schema]] as proposed by [[Ralph Kimball]]. The normalized approach, also called the [[third normal form]] (3NF) is an entity-relational normalized model proposed by Bill Inmon.<ref>{{Cite journal|last1=Golfarelli|first1=Matteo|last2=Maio|first2=Dario|last3=Rizzi|first3=Stefano|date=1998-06-01|title=The dimensional fact model: a conceptual model for data warehouses|url=https://www.worldscientific.com/doi/abs/10.1142/S0218843098000118|journal=International Journal of Cooperative Information Systems|volume=07|issue=2n03|pages=215–247|doi=10.1142/S0218843098000118|issn=0218-8430}}</ref> ====Dimensional approach==== In a [[Star schema|dimensional approach]], [[transaction data]] is partitioned into "facts", which are usually numeric transaction data, and "[[dimension (data warehouse)|dimensions]]", which are the reference information that gives context to the facts. For example, a sales transaction can be broken up into facts such as the number of products ordered and the total price paid for the products, and into dimensions such as order date, customer name, product number, order ship-to and bill-to locations, and salesperson responsible for receiving the order. This dimensional approach makes data easier to understand and speeds up data retrieval.<ref name=":0" /> Dimensional structures are easy for business users to understand because the structure is divided into measurements/facts and context/dimensions. Facts are related to the organization's business processes and operational system, and dimensions are the context about them (Kimball, Ralph 2008). Another advantage is that the dimensional model does not involve a relational database every time. Thus, this type of modeling technique is very useful for end-user queries in data warehouse. The model of facts and dimensions can also be understood as a [[data cube]],<ref>{{cite web| url = http://www2.cs.uregina.ca/~dbd/cs831/notes/dcubes/dcubes.html| title = Introduction to Data Cubes}}</ref> where dimensions are the categorical coordinates in a multi-dimensional cube, the fact is a value corresponding to the coordinates. The main disadvantages of the dimensional approach are: # It is complicated to maintain the integrity of facts and dimensions, loading the data warehouse with data from different operational systems # It is difficult to modify the warehouse structure if the organization changes the way it does business. ====Normalized approach==== In the normalized approach, the data in the warehouse are stored following, to a degree, [[database normalization]] rules. Normalized relational database tables are grouped into ''subject areas'' (for example, customers, products and finance). When used in large enterprises, the result is dozens of tables linked by a web of joins.(Kimball, Ralph 2008). The main advantage of this approach is that it is straightforward to add information into the database. Disadvantages include that, because of the large number of tables, it can be difficult for users to join data from different sources into meaningful information and access the information without a precise understanding of the date sources and the [[data structure]] of the data warehouse. Both normalized and dimensional models can be represented in entity–relationship diagrams because both contain joined relational tables. The difference between them is the degree of normalization. These approaches are not mutually exclusive, and there are other approaches. Dimensional approaches can involve normalizing data to a degree (Kimball, Ralph 2008). In ''Information-Driven Business'',<ref>{{cite book|last=Hillard|first=Robert|title=Information-Driven Business|year=2010|publisher=Wiley|isbn=978-0-470-62577-4}}</ref> [[Robert Hillard (writer)|Robert Hillard]] compares the two approaches based on the information needs of the business problem. He concludes that normalized models hold far more information than their dimensional equivalents (even when the same fields are used in both models) but at the cost of usability. The technique measures information quantity in terms of [[Entropy (information theory)|information entropy]] and usability in terms of the Small Worlds data transformation measure.<ref>{{cite web|url=http://mike2.openmethodology.org/wiki/Small_Worlds_Data_Transformation_Measure |title=Information Theory & Business Intelligence Strategy - Small Worlds Data Transformation Measure - MIKE2.0, the open source methodology for Information Development |publisher=Mike2.openmethodology.org |access-date=2013-06-14}}</ref>
Summary:
Please note that all contributions to Niidae Wiki may be edited, altered, or removed by other contributors. If you do not want your writing to be edited mercilessly, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource (see
Encyclopedia:Copyrights
for details).
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Search
Search
Editing
Data warehouse
(section)
Add topic