实验报告聚类分析报告.docx

Experiment Report: Cluster Analysis

Principle: K-means clustering, K-medoids clustering, hierarchical clustering, and EM-algorithm clustering.
Task: Run a cluster analysis on the iris data set.
Requirements: Explore the basic characteristics of the iris data, apply several clustering methods, and state the main conclusions with a brief explanation.

Analysis report:

> data(iris)
> rm(list = ls())
> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 431730 23.1     929718 49.7   607591 32.5
Vcells 787605  6.1    8388608 64.0  1592403 12.2
> data(iris)
> data <- iris
> head(data)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

# K-means clustering
> newiris <- iris
> newiris$Species <- NULL
> (kc <- kmeans(newiris, 3))
> table(iris$Species, kc$cluster)

              1  2  3
  setosa      0 50  0
  versicolor 48  0  2
  virginica  14  0 36
> plot(newiris[c("Sepal.Length", "Sepal.Width")], col = kc$cluster)
> points(kc$centers[, c("Sepal.Length", "Sepal.Width")], col = 1:3, pch = 8, cex = 2)

# K-medoids clustering
> install.packages("cluster")
> library(cluster)
> iris.pam <- pam(…, 3)
> table(iris$Species, iris.pam$clustering)

              1  2  3
  setosa     50  0  0
  versicolor  0  3 47
  virginica   0 49  1
> layout(matrix(c(1, 2), 1, 2))
> plot(iris.pam)
> layout(matrix(1))

# hc (hierarchical clustering)
> iris.hc <- hclust(…)
> plot(iris.hc, hang = -1)
> plot(iris.hc, labels = FALSE, hang = -1)  # replaces plclust(), which is defunct in current R
> re <- …
> iris.id <- …
> sapply(unique(iris.id),
+        function(g) iris$Species[iris.id == g])
[[1]]
 [1] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa
[12] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa
[23] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa
[34] setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa setosa
[45] setosa setosa setosa setosa setosa setosa
Levels: setosa versicolor virginica

[[2]]
 [1] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
 [8] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[15] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[22] versicolor versicolor virginica  virginica  virginica  virginica  virginica
[29] virginica  virginica  virginica  virginica  virginica  virginica  virginica
[36] virginica  virginica  virginica  virginica  virginica  virginica  virginica
[43] virginica  virginica  virginica  virginica  virginica  virginica  virginica
[50] virginica  virginica  virginica  virginica  virginica  virginica  virginica
[57] virginica  virginica  virginica  virginica  virginica  virginica  virginica
[64] virginica  virginica  virginica  virginica  virginica  virginica  virginica
[71] virginica  virginica
Levels: setosa versicolor virginica

[[3]]
 [1] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
 [8] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[15] versicolor versicolor versicolor versicolor versicolor versicolor versicolor
[22] versicolor versicolor versicolor versicolor versicolor versicolor virginica
Levels: setosa versicolor virginica

> plot(iris.hc)
> rect.hclust(iris.hc, k = 4, border = "light grey")  # box the 4-cluster solution in light grey
> rect.hclust(iris.hc, k = 3, border = "dark grey")   # box the 3-cluster solution in dark grey
> rect.hclust(iris.hc, k = 7, which = c(2, 6), border = "dark grey")

# DBSCAN: density-based clustering
> install.packages("fpc")
> library(fpc)
> ds1 = dbscan(iris[, 1:4], eps = 1, MinPts = 5)  # radius (eps) 1, density threshold (MinPts) 5
> ds1
dbscan Pts=150 MinPts=5 eps=1
        1   2
border  0   1
seed   50  99
total  50 100
> ds2 = dbscan(iris[, 1:4], eps = 4, MinPts = 5)
> ds3 = dbscan(iris[, 1:4], eps = 4, MinPts = 2)
> ds4 = dbscan(iris[, 1:4], eps = 8, MinPts = 2)
> par(mfcol = c(2, 2))
> plot(ds1, iris[, 1:4], main = "1: MinPts=5 eps=1")
> plot(ds3, iris[, 1:4], main = "3: MinPts=2 eps=4")
> plot(ds2, iris[, 1:4], main = "2: MinPts=5 eps=4")
> plot(ds4, iris[, 1:4], main = "4: MinPts=2 eps=8")
> d = dist(iris[, 1:4])  # distance matrix d of the data set
> max(d); min(d)         # largest and smallest pairwise distance
[1] 7.085196
[1] 0
> install.packages("ggplot2")
> library(ggplot2)
> interval = cut_interval(d, 30)
> table(interval)
interval
    [0,0.236] (0.236,0.472] (0.472,0.709] (0.709,0.945]  (0.945,1.18]   (1.18,1.42]
           88           585           876           891           831           688
  (1.42,1.65]   (1.65,1.89]   (1.89,2.13]   (2.13,2.36]    (2.36,2.6]    (2.6,2.83]
          543           369           379           339           335           406
  (2.83,3.07]   (3.07,3.31]   (3.31,3.54]   (3.54,3.78]   (3.78,4.01]   (4.01,4.25]
          458           459           465           480           468           505
  (4.25,4.49]   (4.49,4.72]   (4.72,4.96]    (4.96,5.2]    (5.2,5.43]   (5.43,5.67]
          349           385           321           291           187           138
   (5.67,5.9]    (5.9,6.14]   (6.14,6.38]   (6.38,6.61]   (6.61,6.85]   (6.85,7.09]
           97            92            78            50            18             4
> which.max(table(interval))
(0.709,0.945]
            4
> for (i in 3:5) {
+   for (j in 1:10) {
+     ds = dbscan(iris[, 1:4], eps = i, MinPts = j)
+     print(ds)
+   }
+ }
dbscan Pts=150 MinPts=1 eps=3
        1
seed  150
total 150
(The other 29 runs, MinPts = 2 to 10 at eps = 3 and MinPts = 1 to 10 at eps = 4 and eps = 5, print the same result: a single cluster with seed 150 and total 150.)
# clustering results of the 30 dbscan runs
> ds5 = dbscan(iris[, 1:4], eps = 3, MinPts = 2)
> ds6 = dbscan(iris[, 1:4], eps = 4, MinPts = 5)
> ds7 = dbscan(iris[, 1:4], eps = 5, MinPts = 9)
> par(mfcol = c(1, 3))
> plot(ds5, iris[, 1:4], main = "1: MinPts=2 eps=3")
> plot(ds6, iris[, 1:4], main = "2: MinPts=5 eps=4")
> plot(ds7, iris[, 1:4], main = "3: MinPts=9 eps=5")

# EM (expectation-maximization) clustering
> install.packages("mclust")
> library(mclust)
> fit_EM = Mclust(iris[, 1:4])
fitting ...
  |==================================================| 100%
> summary(fit_EM)
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------

Mclust VEV (ellipsoidal, equal shape) model with 2 components:

 log.likelihood   n df       BIC       ICL
       -215.726 150 26 -561.7285 -561.7289

Clustering table:
  1   2
 50 100
> summary(fit_EM, parameters = TRUE)
----------------------------------------------------
Gaussian finite mixture model fitted by EM algorithm
----------------------------------------------------

Mclust VEV (ellipsoidal, equal shape) model with 2 components:

 log.likelihood   n df       BIC       ICL
       -215.726 150 26 -561.7285 -561.7289

Clustering table:
  1   2
 50 100

Mixing probabilities:
        1         2
0.3333319 0.6666681

Means:
                  [,1]     [,2]
Sepal.Length 5.0060022 6.261996
Sepal.Width  3.4280049 2.871999
Petal.Length 1.4620007 4.905992
Petal.Width  0.2459998 1.675997

Variances:
[,,1]
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length   0.15065114  0.13080115   0.02084463  0.01309107
Sepal.Width    0.13080115  0.17604529   0.01603245  0.01221458
Petal.Length   0.02084463  0.01603245   0.02808260  0.00601568
Petal.Width    0.01309107  0.01221458   0.00601568  0.01042365
[,,2]
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length    0.4000438  0.10865444    0.3994018  0.14368256
Sepal.Width     0.1086544  0.10928077    0.1238904  0.07284384
Petal.Length    0.3994018  0.12389040    0.6109024  0.25738990
Petal.Width     0.1436826  0.07284384    0.2573899  0.16808182
> plot(fit_EM)  # plot the EM clustering result
Model-based clustering plots:

1: BIC
2: classification
3: uncertainty
4: density

Selection: (the choices entered are listed below)
# choose 1
# choose 2
# choose 3
# choose 4
Selection: 0
> iris_BIC = mclustBIC(iris[, 1:4])
fitting ...
  |==================================================| 100%
> iris_BICsum = summary(iris_BIC, data = iris[, 1:4])
> iris_BICsum  # BIC values of the iris data set for each model and number of components
Best BIC values:
             VEV,2        VEV,3      VVV,2
BIC      -561.7285 -562.5522369 -574.01783
BIC diff    0.0000   -0.8237748  -12.28937

Classification table for model (VEV,2):
  1   2
 50 100
> iris_BIC
Bayesian Information Criterion (BIC):
         EII        VII        EEI        VEI        EVI        VVI     EEE
1 -1804.0854 -1804.0854 -1522.1202 -1522.1202 -1522.1202 -1522.1202 -829.97
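
The three species-by-cluster cross-tabulations in the report (K-means, K-medoids, and the hierarchical listing tallied per cluster) can be compared on one scale. The sketch below is an illustration, not part of the original analysis: it re-enters the reported counts by hand and computes cluster purity, crediting each cluster with its most frequent species. The helper names `make_tab` and `purity` are hypothetical.

```r
# Counts copied from the report (rows: species, columns: cluster id).
# The hclust table is tallied from the per-cluster species listing.
species <- c("setosa", "versicolor", "virginica")
make_tab <- function(values) {
  matrix(values, nrow = 3, byrow = TRUE,
         dimnames = list(species, c("1", "2", "3")))
}
tabs <- list(
  kmeans = make_tab(c( 0, 50,  0,   48,  0,  2,   14,  0, 36)),
  pam    = make_tab(c(50,  0,  0,    0,  3, 47,    0, 49,  1)),
  hclust = make_tab(c(50,  0,  0,    0, 23, 27,    0, 49,  1))
)

# Purity: sum of the largest count in every cluster column, over all points.
purity <- function(tab) sum(apply(tab, 2, max)) / sum(tab)

scores <- sapply(tabs, purity)
print(round(scores, 3))
```

With the reported counts this gives 134/150 for K-means, 146/150 for K-medoids, and 126/150 for the hierarchical solution. Purity only measures agreement with the species labels; it says nothing about how compact the clusters are.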
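
The per-cluster species listing from the hierarchical clustering step is easier to read as a table. A minimal sketch follows, under a loud assumption: the report's own `hclust()` call is not preserved, so this uses Euclidean distances on the four measurements and `hclust()`'s default complete linkage, which may differ from what the author ran.

```r
# Assumption: dist() on the four numeric columns and default (complete) linkage.
iris.hc <- hclust(dist(iris[, 1:4]))
iris.id <- cutree(iris.hc, k = 3)        # cut the dendrogram into three clusters

# One row per cluster, one column per species, instead of 150 printed labels.
tab <- table(cluster = iris.id, species = iris$Species)
print(tab)
```

If the assumptions match the original run, the row totals can be checked against the lengths of the three listings in the report (50, 72, and 28 observations).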
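
The distance histogram in the DBSCAN section also suggests why every run of the `eps = 3:5` loop collapses into one cluster: the most populated distance interval lies below 1, so those radii are several times the typical spacing between points. The sketch below rebuilds the 30-interval table with base R's `cut()` instead of `ggplot2::cut_interval()` (an assumed equivalent, using equal-width, right-closed intervals) and reports the share of point pairs that fall within each radius.

```r
dv <- as.vector(dist(iris[, 1:4]))               # all 150 * 149 / 2 pairwise distances
breaks <- seq(min(dv), max(dv), length.out = 31) # 30 equal-width intervals
interval <- cut(dv, breaks = breaks, include.lowest = TRUE)
counts <- table(interval)

modal <- which.max(counts)                       # position of the fullest interval
eps_values <- c(1, 3, 4, 5)
share_within <- sapply(eps_values, function(eps) mean(dv <= eps))
names(share_within) <- paste0("eps=", eps_values)

print(names(modal))
print(round(share_within, 3))
```

A radius taken from the lower end of the distance distribution, as in the first run with `eps = 1`, is the range in which DBSCAN can still separate groups in this data set.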
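
The BIC reported by `summary(fit_EM)` can be checked by hand from the log-likelihood, the parameter count, and the sample size. mclust uses the convention BIC = 2 * logLik - df * log(n), so larger values are better. The sketch below is a worked check rather than part of the original analysis; the parameter breakdown for the VEV model (one volume per component, one shared shape, one orientation per component) is spelled out as separate variables whose names are hypothetical.

```r
loglik <- -215.726   # log-likelihood printed by summary(fit_EM)
n <- 150             # number of observations
p <- 4               # number of variables
G <- 2               # number of mixture components

# Free parameters of a VEV Gaussian mixture.
df_mixing      <- G - 1                  # mixing proportions sum to 1
df_means       <- G * p                  # one mean vector per component
df_volume      <- G                      # volume varies by component
df_shape       <- p - 1                  # shape is shared
df_orientation <- G * p * (p - 1) / 2    # orientation varies by component
df <- df_mixing + df_means + df_volume + df_shape + df_orientation

bic <- 2 * loglik - df * log(n)          # mclust sign convention
print(df)
print(round(bic, 4))
```

This gives 26 parameters and, up to the rounding of the printed log-likelihood, the reported BIC of -561.7285.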
