Data Mining is a professional elective course that builds the data analysis knowledge students need in the era of big data and strengthens their professional skills. Building on advanced mathematics, probability and statistics, and basic linear algebra, the course focuses on collecting data from real-world scenarios and on applying statistical methods and data mining techniques to data preprocessing, analysis, and knowledge discovery. Students will develop the ability to use relevant software to solve data analysis problems.
The course covers the basic theory, classical algorithms, and cutting-edge applications of data mining. Its main parts are data definitions, the data mining process, data preprocessing, descriptive statistical analysis, data visualization, association analysis, clustering, classification, and numerical prediction. Data preprocessing covers data standardization, missing-value handling, noisy-data handling, and data reduction. Association analysis covers association rules, the Apriori algorithm, and the FP-growth algorithm. Clustering covers K-means, hierarchical clustering, and DBSCAN, among other methods. Classification covers decision trees, naive Bayes, and support vector machines, among other methods. Numerical prediction covers regression methods, regression trees and decision trees, and K-nearest-neighbor numerical prediction, among other methods. Through case studies and hands-on lab exercises, the course strengthens students' practical ability to apply data mining methods to real problems.
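As a rough illustration of two of the preprocessing topics listed above (missing-value handling and data standardization), the following is a minimal sketch rather than course material. The function names `impute_mean` and `standardize` and the toy column `raw` are hypothetical, and mean imputation with z-score scaling is only one common choice among several.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Z-score standardization using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical toy column with one missing value.
raw = [2.0, None, 4.0, 6.0]
filled = impute_mean(raw)
scaled = standardize(filled)
```

After these steps `scaled` has zero mean and unit population variance, which is often a useful starting point for distance-based methods such as K-means or K-nearest neighbors.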