Syllabus
Course Code: MS-20-31
Course Name: Data Mining and Analytics using R

MODULE NO / UNIT | COURSE SYLLABUS CONTENTS OF MODULE | NOTES |
---|---|---|
1 | Data Warehouse: A Brief History, Characteristics, Architecture for a Data Warehouse, Fact and Dimension Tables. Data Mining: Introduction, Motivation, Importance, Knowledge Discovery Process, Data Mining Functionalities, Interesting Patterns, Classification of Data Mining Systems, Major Issues. Data Preprocessing: Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation and Data Discretization, Data Visualization, Outliers. | |
2 | Data Mining Techniques: Statistical Perspective on Data Mining, Similarity Measures. Clustering: Requirements for Cluster Analysis, Clustering Methods. Decision Trees: Decision Tree Induction, Attribute Selection Measures, Tree Pruning. Association Rule Mining: Frequent Itemset Mining using the Apriori Algorithm. Nearest Neighbour Classification: Performance of Nearest Neighbour Classifiers. | |
3 | Data Analytics: Ways of Thinking About Data, Qualitative and Quantitative Data, Data Strategies, Conceptualizing Data Analysis as a Process, Managing the Data Analysis Process. Exploratory Data Analysis: Exploring a New Dataset, Summarizing Numeric Data, Anomalies in Numeric Data, Visualizing Relations between Variables. Working with External Data: Manual Data Entry, CSV Files, Other Files, Merging Data from Different Sources. | |
4 | R Programming: Advantages of R over Other Programming Languages, Working with Directories and Data Types in R, Control Statements, Loops, Data Manipulation and Integration in R. Exploring Data in R: Data Frames, R Functions for Data in Data Frames, Loading Data Frames. Decision Tree Packages in R, Issues in Decision Tree Learning, Hierarchical and K-means Clustering Functions in R, Mining Algorithm Interfaces in R. | |