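School of Electrical and Information Engineering
Foreign Literature Translation

English title: Data mining: clustering
Translated title: Data Mining: Cluster Analysis
Major: Automation
Name: 竺*
Class and student ID: U
Supervisor:
Source: Data Mining, by Ian H. Witten and Eibe Frank
Date: April 26, 2010

Clustering

5.1 INTRODUCTION

Clustering is similar to classification in that data are grouped. However, unlike classification, the groups are not predefined. Instead, the grouping is accomplished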
by finding similarities between data according to characteristics found in the actual data. The groups are called clusters. Some authors view clustering as a special type of classification. In this text, however, we follow a more conventional view in that the two are different. Many definitions for clusters have been proposed:

• Set of like elements. Elements from different clusters are not alike.
• The distance between points in a cluster is less than the distance between a point in the cluster and any point outside it.

A term similar to clustering is database segmentation, where like tuples (records) in a database are grouped together. This is done to partition or segment the database into components that then give the user a more general view of the data. In this text, we do not differentiate between segmentation and clustering. A simple example of clustering is found in Example 5.1. This example illustrates the fact that determining how to do the clustering is not straightforward.

As illustrated in Figure 5.1, a given set of data may be clustered on different attributes. Here a group of homes in a geographic area is shown. The first type of clustering is based on the location of the home. Homes that are geographically close to each other are clustered together. In the second clustering, homes are grouped based on the size of the house.

Clustering has been used in many application domains, including biology, medicine, anthropology, marketing, and economics. Clustering applications include plant and animal classification, disease classification, image processing, pattern recognition, and document retrieval. One of the first domains in which clustering was used was biological taxonomy. Recent uses include examining Web log data to detect usage patterns.

When clustering is applied to a real-world database, many interesting problems occur:

• Outlier handling is difficult. Here the elements do not naturally fall into any cluster. They can be viewed as solitary clusters. However, if a clustering algorithm attempts to find larger clusters, these outliers will be forced to be placed in some cluster. This process may result in the creation of poor clusters by combining two existing clusters and leaving the outlier in its own cluster.
• Dynamic data in the database implies that cluster membership may change over time.
• Interpreting the semantic meaning of each cluster may be difficult. With classification, the labeling of the classes is known ahead of time. However, with clustering, this may not be the case. Thus, when the clustering process finishes creating a set of clusters, the exact meaning of each cluster may not be obvious. Here is where a domain expert is needed to assign a label or interpretation for each cluster.
• There is no one correct answer to a clustering problem. In fact, many answers may be found. The exact number of clusters required is not easy to determine. Again, a domain expert may be required. For example, suppose we have a set of data about plants that have been collected during a field trip. Without any prior knowledge of plant classification, if we attempt to divide this set of data into similar groupings, it would not be clear how many groups should be created.
• Another related issue is what data should be used for clustering. Unlike learning during a classification process, where there is some a priori knowledge concerning what the attributes of each classification should be, in clustering we have no supervised learning to aid the process. Indeed, clustering can be viewed as similar to unsupervised learning.

We can then summarize some basic features of clustering (as opposed to classification):

• The (best) number of clusters is not known.
• There may not be any a priori knowledge concerning the clusters.
• Cluster results are dynamic.

The clustering problem is stated as shown in Definition 5.1. Here we assume that the number of clusters to be created is an input value, $k$. The actual content (and interpretation) of each cluster, $K_j$, $1 \le j \le k$, is determined as a result of the function definition. Without loss of generality, we will view the result of solving a clustering problem as the creation of a set of clusters: $K = \{K_1, K_2, \ldots, K_k\}$.

DEFINITION 5.1. Given a database $D = \{t_1, t_2, \ldots, t_n\}$ of tuples and an integer value $k$, the clustering problem is to define a mapping $f : D \to \{1, \ldots, k\}$ where each $t_i$ is assigned to one cluster $K_j$, $1 \le j \le k$. A cluster, $K_j$, contains precisely those tuples mapped to it; that is, $K_j = \{t_i \mid f(t_i) = j,\ 1 \le i \le n,\ \text{and } t_i \in D\}$.

A classification of the different types of clustering algorithms is shown in Figure 5.2. Clustering algorithms themselves may be viewed as hierarchical or partitional. With hierarchical clustering, a nested set of clusters is created. Each level in the hierarchy has a separate set of clusters. At the lowest level, each item is in its own unique cluster. At the highest level, all items belong to the same cluster. With hierarchical clustering, the desired number of clusters is not input. With partitional clustering, the algorithm creates only one set of clusters. These approaches use the desired number of clusters to drive how the final set is created.

Traditional clustering algorithms tend
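
Definition 5.1 and the homes example of Figure 5.1 can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the four homes, their coordinates and floor areas, the thresholds, and the helper name `clusters_from_mapping` are all hypothetical and do not come from the text. The mappings are written by hand, so this is not a clustering algorithm; it only shows how a mapping f from D to {1, ..., k} induces the clusters, and how the same data yields different clusters depending on the attribute used.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tuple layout: (name, position along a road in km, floor area in m^2)
Home = Tuple[str, float, int]


def clusters_from_mapping(
    database: List[Home], f: Callable[[Home], int], k: int
) -> Dict[int, List[Home]]:
    """Build K_1..K_k, where K_j holds exactly the tuples t with f(t) == j."""
    clusters: Dict[int, List[Home]] = {j: [] for j in range(1, k + 1)}
    for t in database:
        j = f(t)
        if j not in clusters:
            raise ValueError(f"f mapped {t!r} to {j}, outside 1..{k}")
        clusters[j].append(t)
    return clusters


# Made-up data, chosen so that the two groupings disagree.
homes: List[Home] = [
    ("A", 0.0, 90),
    ("B", 0.5, 250),
    ("C", 9.0, 95),
    ("D", 9.5, 240),
]


def by_location(home: Home) -> int:
    """Hand-written mapping on location (assumed threshold: 5 km)."""
    return 1 if home[1] < 5.0 else 2


def by_size(home: Home) -> int:
    """Hand-written mapping on house size (assumed threshold: 150 m^2)."""
    return 1 if home[2] < 150 else 2


location_clusters = clusters_from_mapping(homes, by_location, k=2)
size_clusters = clusters_from_mapping(homes, by_size, k=2)

print({j: [h[0] for h in members] for j, members in location_clusters.items()})
# prints {1: ['A', 'B'], 2: ['C', 'D']}
print({j: [h[0] for h in members] for j, members in size_clusters.items()})
# prints {1: ['A', 'C'], 2: ['B', 'D']}
```

Both results are valid clusterings of the same four tuples, which is the point made above: there is no single correct answer, and the choice of attribute changes the outcome.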