Data Warehouse

Data Warehouse

DR. ALVIN’S PUBLICATIONS DATA LAKE ARCHITECTURE BY BILL INMON A SUMMARY BY DR. ALVIN ANG COPYRIGHTED BY DR ALVIN ANG WWW.ALVINANG.SG CONTENTS Data Lake ................................................................................................................... 6 What is a Data Lake ............................................................................................................ 6 Problems with Data Lake .................................................................................................... 7 One Way Data Lake .................................................................................................................................7 Data Scientists.........................................................................................................................................7 Mountains of Irrelevant Information ......................................................................................................7 Inaccessible Metadata ............................................................................................................................8 Lost Data Relationships ...........................................................................................................................8 Data Ponds ................................................................................................................. 9 Pond Descriptor ............................................................................................................... 10 Update Frequency ................................................................................................................................10 Source Description ................................................................................................................................10 Volume of DAta .....................................................................................................................................10 Selection Criteria ...................................................................................................................................10 Summarization Criteria .........................................................................................................................11 Organization Criteria .............................................................................................................................11 Data Relationships ................................................................................................................................11 Pond Target ..................................................................................................................... 11 Pond Data ........................................................................................................................ 12 Pond Metadata ................................................................................................................ 13 Pond Metaprocess............................................................................................................ 14 Pond Transformation Criteria ........................................................................................... 15 Analog Data Pond ..................................................................................................... 17 Examples ......................................................................................................................... 17 Usage ............................................................................................................................... 18 Triggered By… .................................................................................................................. 18 Storage ............................................................................................................................ 18 Analog Data Issues ........................................................................................................... 19 Transforming / Conditioning Raw Analog Data .................................................................. 19 One way of Analog Data Conditioning = Data Reduction .....................................................................19 Another way of Analog Data Conditioning = Forming Data Relationships ...........................................20 Application Data Pond .............................................................................................. 22 Examples of Application Data ........................................................................................... 22 2 | PAG E COPYRIGHTED BY DR ALVIN ANG WWW.ALVINANG.SG Database Format of Application Data ............................................................................... 23 Integration Mapping ........................................................................................................ 25 An Example of Integration ...................................................................................................................26 Another Example of Simple Integration ...............................................................................................27 Another Example of Complex Integration ...........................................................................................27 Data Model ...................................................................................................................... 28 Textual Data Pond .................................................................................................... 29 Places Containing Valuable Textual Information ................................................................ 29 Digesting Textual Data using Computers ........................................................................... 29 Problems with Textual Data .............................................................................................. 30 Unstructured Data ................................................................................................................................30 Context .................................................................................................................................................30 Inline Contextualization ........................................................................................................................30 Proximity ...............................................................................................................................................31 Alternate spelling ..................................................................................................................................31 Homographic Resolution ......................................................................................................................31 Acronym Resolution ..............................................................................................................................32 Custom Variable Recognition................................................................................................................32 Taxonomy Resolution ...........................................................................................................................32 Date Standardization. ...........................................................................................................................32 Taxonomies and Ontologies.............................................................................................. 32 Textual Disambiguation .................................................................................................... 33 Example of Textual Disambiguation .....................................................................................................33 Sentiment Textual Disambiguation .......................................................................................................34 Example of Analyzing A Database ..................................................................................... 34 Visualizing the Results ...................................................................................................... 35 Negative Comments .............................................................................................................................35 Comments on Pricing + Promotions .....................................................................................................35 Archival Data Pond ................................................................................................... 36 Purpose of Archival Pond .................................................................................................. 36 Criteria For Storing / Removing Data from Analog / Application / Textual Data Ponds ........ 36 Structural Alteration......................................................................................................... 36 Analysis vs Analytics ................................................................................................. 38 Analysis ........................................................................................................................... 38 Analytics .......................................................................................................................... 39 The mere sorting of data ......................................................................................................................39 3 | PAG E COPYRIGHTED BY DR ALVIN ANG WWW.ALVINANG.SG Summarizing data .................................................................................................................................39

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    51 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us