Mining Frequent itemsets – Apriori Algorithm

Apriori algorithm is an algorithm for frequent item set mining and association rule learning over transaction databases. Its followed by identifying the frequent individual items in the database and extending them to larger and larger item sets as long as those item sets appear sufficiently often in the database. The frequent item sets determined by Apriori can be used to determine association rules which highlight general trends in the database. Read: Methods to Measure Data Dispersion 9 Laws Everyone In The Data Mining Should Use Various Data Mining Clustering Algorithms and Examples…

Continue ReadingMining Frequent itemsets – Apriori Algorithm
Comments Off on Mining Frequent itemsets – Apriori Algorithm

9 Laws Everyone In The Data Mining Should Use

DATA MINING is a powerful new technology with a great potential to help companies focus on more important information by extracting the hidden predictive information from large database in their data warehouses. There are some 9 data mining laws that miner should follow when performing mining on particular data sets. Data mining provides two types of results: Business Insights Predictive models, makes predictions automatically. It includes a various methods that include, clustering, classification and market basket analysis, etc. Read: 9 Laws Everyone In The Data Mining Should Use Mining Frequent itemsets - Apriori Algorithm…

Continue Reading9 Laws Everyone In The Data Mining Should Use
Comments Off on 9 Laws Everyone In The Data Mining Should Use

Slowly Changing Dimensions (SCD) in Data Warehouse

Slowly changing dimensions or SCD are dimensions that changes slowly over time, rather than regular bases.  In data warehouse environment, there may be a requirement to keep track of the change in dimension values and are used to report historical data at any given point of time. We can implement slowly changing dimensions (SCD) using various approaches, such as; Type 0: Always retains originalType 1 : Keeps latest data, old data is overwrittenType 2 : Keeps the history of old data by adding new rowType 3 : Adds new attribute to store changed valueType 4 : Uses…

Continue ReadingSlowly Changing Dimensions (SCD) in Data Warehouse
Comments Off on Slowly Changing Dimensions (SCD) in Data Warehouse

Design Slowly Changing Dimension Type 2 in SQL Cont

This is a continuation of the Design Slowly Changing Dimension (SCD) Type2 in SQL Step 4: Copy only updated records from LOAD table These are records which are updated in this load cycle. Since updated records are already in the LOAD table, you have  to compare the LOAD and TGT records on Patient ID or SK column and copy the record which is updated. You should also keep in mind to set EFF_FROM_DT and flag to indicate active record. The updated records will be active from the moment we load them so EFF_FR_DT…

Continue ReadingDesign Slowly Changing Dimension Type 2 in SQL Cont
2 Comments