{"id":1068,"date":"2015-11-10T17:53:43","date_gmt":"2015-11-10T23:53:43","guid":{"rendered":"http:\/\/www.designandexecute.com\/designs\/?p=1068"},"modified":"2024-08-31T10:32:40","modified_gmt":"2024-08-31T16:32:40","slug":"data-warehouse-design-patterns","status":"publish","type":"post","link":"https:\/\/www.designandexecute.com\/designs\/data-warehouse-design-patterns\/","title":{"rendered":"Data Warehouse Design Patterns"},"content":{"rendered":"<p>This post will not dive into each topic in detail but serve more like a curriculum of things to research for the\u00a0 Data Journey.\u00a0 Anyone who needs to get into the Data Warehouse (DW) space should have a handle on the following Design Patterns:<\/p>\n<h2>Connection Patterns<\/h2>\n<p>There are 4 Patterns that can be used between applications in the Cloud and on premise.\u00a0 The combinations are as follows<\/p>\n<ul>\n<li>on-premise caller to Cloud provider<\/li>\n<li>Cloud caller to on-premise provider<\/li>\n<li>Cloud caller to Cloud provider<\/li>\n<\/ul>\n<ol>\n<li><strong>Remote procedure calls<\/strong> (RPC) Connection Patterns<\/li>\n<li><strong>Asynchronous<\/strong> (fire and forget) Connection Patterns using Queues<\/li>\n<li><strong>Shared Database<\/strong> in cloud or on-premise<\/li>\n<li><strong>Data\/File synchronizing<\/strong> in Copying Data (ETL) flat file loads, database to database sources to targets.<\/li>\n<\/ol>\n<h2>Extract Transform Load (ETL) Patterns<\/h2>\n<p><strong>Truncate and Load Pattern (AKA full load):<\/strong> its good for small to medium volume data sets which can load pretty fast.\u00a0 it is good for staging areas and it is simple.\u00a0 The key benefit is that if there are deletions in the source then the target is updated pretty easy.\u00a0 The disadvantage is there is no history .kept and no tracking. CUID ie created, updated Inserted or Deleted cannot be tracked.<\/p>\n<p><strong>Slowly Changing Dimension Type 1 Pattern: <\/strong>This pattern is simple but it is very slow and should not be done for anything over 1000 rows.\u00a0 See the<a href=\"http:\/\/www.designandexecute.com\/designs\/basics-of-data-warehouse-dw\/#dimension\" target=\"_blank\" rel=\"noopener noreferrer\"> dimensions definition<\/a> for type 1<\/p>\n<p><strong>Slowly Changing Dimension Type 2 Pattern: <\/strong>This pattern is simple but it is very slow and should not be done for anything over 1000 rows.\u00a0 See the<a href=\"http:\/\/www.designandexecute.com\/designs\/basics-of-data-warehouse-dw\/#dimension\" target=\"_blank\" rel=\"noopener noreferrer\"> dimensions definition<\/a> for type 2<strong><br \/><\/strong><\/p>\n<h2>Declarative\/Adhoc SQL Query Patterns<\/h2>\n<ol>\n<li><strong>Join patterns: <\/strong>directional, inner or equijoin, left and right outer join, full outer joinA\u00a0<em>theta join<\/em> allows for arbitrary comparison relationships (such as \u2265 or between).\u00a0\u00a0An\u00a0<em>equijoin<\/em> is a theta join using the equality operator.\u00a0\u00a0A\u00a0<em>natural join<\/em>\u00a0is an equijoin on attributes that have the same name in each relationship<\/li>\n<li>Flattened <strong>Hierarchies<\/strong> which put all the levels on one row as columns vs Ragged <strong>hierarchies<\/strong> which like\u00a0unbalanced hierarchies, the branches of the\u00a0<b>hierarchies<\/b>\u00a0can descend to different levels.<\/li>\n<li><strong>Join Tables\/ Translation tables<\/strong> usually when putting two silo systems in the same context so the data can be merged<\/li>\n<li><strong>Parent\/ Child Tables<\/strong> and Cardinality (Fan traps that occur when using aggregate measures), the parent is a foreign key (FK) on the child record so the relationship creates data clusters, We have to ensure we do not writes queries that multiple the aggregation values.<\/li>\n<li><strong>Self Joins<\/strong> (aka Alias)\u00a0 in the SQL we can refer to the same table by another name and join to itself eg Manager is a type of Person so technically the person table can be self joined to get the manager&#8217;s info.<\/li>\n<\/ol>\n<h2>Query Performance Patterns<\/h2>\n<ol>\n<li><strong>Explain Plans, Indexing and Partitions, <\/strong>this is the bedrock of performance tuning in relational databases.\u00a0 This topic alone deserves its own post. It would be dependent on table storage and data types configurations at the Data Definition Language (DDL) setup. It will also need knowledge of the data cardinality to create balanced tree vs bitmap indexes and user query patterns to create covering indexes and getting more index range scans if query does not uniquely select the index or hit partitions to use much smaller data sets for faster queries.<\/li>\n<li><strong>ETL Aggregation<\/strong> and Aggregate awareness for multiple aggregation tables<\/li>\n<li><strong>Table Constraints<\/strong> in Data quality, including PK, FK and additional functions or regular expressions that can be put on columns to ensure the accurate data and not nulls are stored as needed.<\/li>\n<\/ol>\n<h2>Dashboard Design Patterns<\/h2>\n<ol>\n<li>Layout Patterns<\/li>\n<li>Leading Indicators Aggregation Pattern<\/li>\n<li>Drill Down Pattern<\/li>\n<li>Progressive Filtering Choice Pattern<\/li>\n<\/ol>\n<h2>Security Patterns<\/h2>\n<p><span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">I have a dedicated article on\u00a0<a href=\"http:\/\/www.designandexecute.com\/designs\/security-patterns\/\" target=\"_blank\" rel=\"noopener noreferrer\">security patterns,<\/a> which are getting increasingly complex as time progresses and new regulations.<\/span><\/p>\n\n\n<figure class=\"wp-block-embed-wordpress wp-block-embed is-type-wp-embed is-provider-design-and-execute\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"wp-embedded-content\" data-secret=\"y8BFC2EeQv\"><a href=\"https:\/\/www.designandexecute.com\/designs\/holy-trinity-of-analytics\/\">Holy Trinity of Analytics<\/a><\/blockquote><iframe loading=\"lazy\" class=\"wp-embedded-content\" sandbox=\"allow-scripts\" security=\"restricted\" style=\"position: absolute; clip: rect(1px, 1px, 1px, 1px);\" title=\"&#8220;Holy Trinity of Analytics&#8221; &#8212; Design and Execute\" src=\"https:\/\/www.designandexecute.com\/designs\/holy-trinity-of-analytics\/embed\/#?secret=y8BFC2EeQv\" data-secret=\"y8BFC2EeQv\" width=\"600\" height=\"338\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>This post will not dive into each topic in detail but serve more like a curriculum of things to research for the\u00a0 Data Journey.\u00a0 Anyone who needs to get into the Data Warehouse (DW) space should have a handle on the following Design Patterns: Connection Patterns There are 4 Patterns that can be used between [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":2718,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31],"tags":[],"class_list":["post-1068","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bi-data-warehouse"],"jetpack_featured_media_url":"https:\/\/www.designandexecute.com\/designs\/wp-content\/uploads\/2015\/11\/design-patterns.jpg","_links":{"self":[{"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/posts\/1068","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/comments?post=1068"}],"version-history":[{"count":6,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/posts\/1068\/revisions"}],"predecessor-version":[{"id":20753,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/posts\/1068\/revisions\/20753"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/media\/2718"}],"wp:attachment":[{"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/media?parent=1068"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/categories?post=1068"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.designandexecute.com\/designs\/wp-json\/wp\/v2\/tags?post=1068"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}