Data Mining Lab Report (数据挖掘实验报告资料下载.pdf)

Given a set of points in some space, DBSCAN groups together points that are closely packed (points with many nearby neighbors) and marks as outliers the points that lie alone in low-density regions (whose nearest neighbors are too far away). It is one of the most common clustering algorithms and also one of the most cited in the scientific literature.

II. Experimental Design

1. K-Means

Algorithm idea: Pick any k points from the point set as the initial centers. Compare every point with the k centers and assign it to the cluster whose center is closest. Once all points are assigned, recompute the center of each cluster. Repeat this process until the centers no longer change.

Harbin Institute of Technology (哈尔滨工业大学)  Page 2 of 10  Designed by 谢浩哲

Program flowchart: [figure not included]

Core code:

    public class KMeans {
        public Cluster[] getClusters(int k, Point[] points) {
            // There must be more points than clusters
            if (k >= points.length) {
                return null;
            }

            Cluster[] clusters = getInitialClusters(k, points);
            Cluster[] newClusters = null;
            do {
                newClusters = getClusters(k, points, clusters);

                // Stop once an iteration leaves the clusters unchanged
                if (isClustersTheSame(clusters, newClusters)) {
                    break;
                }
                clusters = newClusters;
            } while (true);
            return clusters;
        }

        private Cluster[] getClusters(int k, Point[] points, Cluster[] cluster) {
            // Assignment step: attach every point to its closest cluster
            for (int i = 0; i < points.length; ++i) {
                Point currentPoint = points[i];
                Cluster c = getClosestClusters(currentPoint, cluster);
                c.points.add(currentPoint);
            }

            // Update step: recompute the centroid of every cluster
            Cluster[] newClusters = new Cluster[k];
            for (int i = 0; i < k; ++i) {
                Cluster c = cluster[i];
                int numberOfPointsInCluster = c.points.size();

                if (numberOfPointsInCluster == 0) {
                    // If the cluster is empty, reseed it with a random point
                    int randomIndex = (int) (Math.random() * points.length);
                    newClusters[i] = new Cluster(points[randomIndex]);
                } else {
                    // If the cluster is not empty, average its points
                    double newCentroidX = 0;
                    double newCentroidY = 0;
                    for (int j = 0; j < numberOfPointsInCluster; ++j) {
                        Point p = c.points.get(j);
                        newCentroidX += p.x;
                        newCentroidY += p.y;
                    }
                    newCentroidX /= numberOfPointsInCluster;
                    newCentroidY /= numberOfPointsInCluster;

                    Cluster newCluster = new Cluster(new Point(newCentroidX, newCentroidY));
                    newClusters[i] = newCluster;
                }
            }
            return newClusters;
        }
    }

2. AGNES (hierarchical clustering)

Algorithm idea: The algorithm uses Group Average as the merge criterion. In the first iteration it finds, among the n points, the pair with the smallest Group Average and merges them. The merged cluster is added to the list, the two original clusters are removed, and the Group Average between the new cluster and each of the other n - 2 clusters is recomputed. These steps are repeated until all clusters have been merged.

Program flowchart: [figure not included]

Core code:

    public class Agnes {
        public Cluster getCluster(List<Cluster> clusters) {
            while (clusters.size() > 1) {
                double minProximity = Double.MAX_VALUE;
                int minProximityIndex1 = 0, minProximityIndex2 = 0;

                // Find the pair of clusters with the smallest proximity
                for (int i = 0; i < clusters.size(); ++i) {
                    for (int j = i + 1; j < clusters.size(); ++j) {
                        double proximity = getProximity(clusters.get(i), clusters.get(j));

                        if (proximity < minProximity) {
                            minProximity = proximity;
                            minProximityIndex1 = i;
                            minProximityIndex2 = j;
                        }
                    }
                }

                Cluster c = new Cluster(clusters.get(minProximityIndex1), clusters.get(minProximityIndex2));
                clusters.add(c);
                // Remove the larger index first so the smaller index stays valid
                clusters.remove(minProximityIndex2);
                clusters.remove(minProximityIndex1);
            }
            return clusters.size() == 0 ? null : clusters.get(0);
        }
    }

3. DBSCAN

Algorithm idea: First identify the Core Points in the full point set (by counting the points inside each point's neighborhood), then identify the Border Points among the remaining points (points that lie inside the neighborhood of a Core Point). Next, if two Core Points are connected to each other they belong to the same cluster, so all Core Points are merged into a number of clusters. Finally, check every Border Point to see which Core Point's neighborhood it lies in, and merge it into the cluster that contains that Core Point.

Program flowchart: [figure not included]

The core code of the algorithm follows (only the parts that identify the Core Points and group the Core Points into clusters):

    public class Dbscan {
        public List<Cluster> getClusters(List<Point> points, int minPoints, double eps) {
            List<Point> corePoints = getCorePoints(points, minPoints, eps);
            // The map's type parameters were lost in extraction; Point keys are an assumption
            Map<Point, Cluster> clusters = getClustersOfCorePoints(corePoints, eps);

            List<Point> borderPoints = getBorderPoints(points, corePoints, minPoints, eps);
            getClustersOfBorderPoints(corePoints, borderPoints, clusters, eps);

            return new ArrayList<>(clusters.values());
        }

        private List<Point> getCorePoints(List<Point> points, int minPoints, double eps) {
            List<Point> corePoints = new ArrayList<>();

            for (int i = 0; i < points.size(); ++i) {
                Point currentPoint = points.get(i);
                ...
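The DBSCAN listing breaks off inside getCorePoints, so the Core Point test itself is not shown. The following is a minimal sketch of one common way to perform that step, not the report's own code. It assumes a 2-D Point with public x and y fields, Euclidean distance, and a neighborhood count that includes the point itself. The names CorePointSketch and distance are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: all names here are illustrative assumptions,
// not taken from the report.
final class CorePointSketch {
    /** Minimal 2-D point; the report's Point is assumed to expose x and y. */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Euclidean distance between two points (assumed metric). */
    static double distance(Point a, Point b) {
        return Math.hypot(a.x - b.x, a.y - b.y);
    }

    /**
     * Returns every point that has at least minPoints points
     * (itself included) within distance eps.
     */
    static List<Point> getCorePoints(List<Point> points, int minPoints, double eps) {
        List<Point> corePoints = new ArrayList<>();
        for (Point candidate : points) {
            int neighbors = 0;
            for (Point other : points) {
                if (distance(candidate, other) <= eps) {
                    ++neighbors;
                }
            }
            if (neighbors >= minPoints) {
                corePoints.add(candidate);
            }
        }
        return corePoints;
    }

    public static void main(String[] args) {
        List<Point> points = Arrays.asList(
                new Point(0, 0), new Point(0, 1), new Point(1, 0), new Point(10, 10));
        // The three points near the origin are mutual neighbors; (10, 10) is isolated.
        System.out.println(getCorePoints(points, 3, 1.5).size()); // prints 3
    }
}
```

This brute-force scan compares every pair of points, so it costs O(n^2) distance computations. That is usually acceptable for a lab-sized data set, while larger inputs would typically call for a spatial index.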
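For the AGNES section, the listing calls getProximity without showing it. Group Average is conventionally defined as the mean distance over all pairs made of one point from each cluster, and the sketch below computes that quantity. It is an illustration under assumptions (2-D points with x and y fields, Euclidean distance, clusters represented as plain lists of points), and the names GroupAverageSketch and groupAverage are hypothetical rather than the report's.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names and the list-of-points cluster representation
// are assumptions, not the report's Cluster class.
final class GroupAverageSketch {
    /** Minimal 2-D point with public coordinates (assumed shape). */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Group Average proximity: the mean Euclidean distance over all
     * pairs (p, q) with p taken from cluster a and q from cluster b.
     */
    static double groupAverage(List<Point> a, List<Point> b) {
        if (a.isEmpty() || b.isEmpty()) {
            throw new IllegalArgumentException("clusters must be non-empty");
        }
        double sum = 0;
        for (Point p : a) {
            for (Point q : b) {
                sum += Math.hypot(p.x - q.x, p.y - q.y);
            }
        }
        return sum / ((double) a.size() * b.size());
    }

    public static void main(String[] args) {
        List<Point> a = Arrays.asList(new Point(0, 0));
        List<Point> b = Arrays.asList(new Point(3, 4), new Point(6, 8));
        // Pairwise distances are 5 and 10, so the average is about 7.5.
        System.out.println(groupAverage(a, b));
    }
}
```

Because the measure averages over all cross-cluster pairs, it is symmetric in its two arguments and reduces to the plain point-to-point distance when both clusters hold a single point, which matches the first iteration described in the AGNES section.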
