Data Mining Emotion in Social Network Communication:
Gender differences in MySpace
Mike Thelwall, David Wilkinson, Sukhvinder Uppal
Statistical Cybermetrics Research Group, School of Computing and Information Technology, University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1SB, UK.
E-mail: m.thelwall@wlv.ac.uk, d.wilkinson@wlv.ac.uk, s
Tel: +44 1902 321470 Fax: +44 1902 321478
Despite the rapid growth in social network sites and in data mining for emotion (sentiment analysis), little research has tied the two together and none has had social science goals. This article examines the extent to which emotion is present in MySpace comments, using a combination of data mining and content analysis, and exploring age and gender. A random sample of 819 public comments to or from U.S. users was manually classified for strength of positive and negative emotion. Two thirds of the comments expressed positive emotion but a minority (20%) contained negative emotion, confirming that MySpace is an extraordinarily emotion-rich environment. Females are likely to give and receive more positive comments than males, but there is no difference for negative comments. It is thus possible that females are more successful social network site users partly because of their greater ability to textually harness positive affect.
Introduction
The computer-aided detection, analysis and application of emotion, particularly in text, has been a growth area in recent years (Pang & Lee, 2008). Almost all of this research has focused on detecting opinions in large bodies of text. For example, a program might scan a large number of customer comments or reviews of a manufacturer’s products and report which aspects of which products tended to receive positive and negative feedback. Known as opinion mining (computer science) or sentiment analysis (computational linguistics), this approach typically works by identifying positive words or phrases in free text (e.g., “I like”, or “rocked!”) and tying them to the objects referred to (e.g., “the leather seats”, “the package of extras”). From a wider social perspective, emotion is important to human communication and life and so it seems that the time is ripe to exploit advances and intuitions from opinion mining in order to detect emotion in a wider variety of contexts and for primarily social rather than commercial goals. In particular, is it now possible to detect emotion in people’s textual communications and use this to gain deeper insights into issues for which emotion can play a role? For instance, how important is emotional expression for: effective communication between friends or acquaintances, winning an online argument, automatically detecting abusive communication patterns in chatrooms, or detecting predatory behaviour online?
This article begins the process of moving from opinion mining to emotion detection by using a case study of MySpace comments to demonstrate that it is possible to extract emotion-bearing comments on a large scale, to gain preliminary results about the social role of emotion and to identify key problems for the task of identifying emotion in informal textual communications online. Hence, although it is preliminary and exploratory, it is designed to report useful information for future emotion detection research and for those interested in social network communication. Large scale data collection and analysis from social network sites has already been used for social science research goals (Kleinberg, 2008) but not yet in combination with emotion detection.
Background
This section reviews several aspects of the background to automatic emotion detection in social network sites: opinion mining (i.e. automatic opinion detection); the psychology and sociology of emotion (because emotion is a complex construct); and social network communication and usage. Gender differences in emotion and language are also discussed.
Opinion Mining and Text Mining
Opinion mining or sentiment analysis is the automatic detection of opinions from free text. This research area has been partly motivated by the commercial goal of giving cheap, detailed and timely customer feedback to businesses (Pang & Lee, 2008). Before the Internet, businesses would have to rely upon relatively slow and expensive methods of gaining customer feedback, such as phone or mail surveys, interviews and focus groups. Online, however, they may be able to gain feedback from online customer reviews, blogs, comments and chatroom discussion, assuming that a computer program can filter out the relevant data from the rest of the web or a particular reviews web site. In this context, the goal of opinion mining is to identify positive and negative opinions in free text and to associate this opinion with relevant objects. The goal might be detail in the sense of identifying what is discussed and how (e.g., which aspects of a car are liked or disliked), or the goal might be a judgement in the sense of diagnosing the nature and strength of opinion (e.g., diagnosing how much a reviewer liked a film from their online review).
Opinion mining is often split into two consecutive tasks: detecting which text segments (e.g., sentences) contain opinions, and then detecting the polarity and perhaps strength of that opinion (Pang & Lee, 2008). A simple technique counts how often positive and negative words occur or how often they co-occur in sentences with given target terms (e.g., “engine reliability”). Whilst full machine comprehension of text is currently impossible, computational linguistics techniques can partly analyse the structure of text, using it to more accurately detect sentiment. This approach might incorporate negating words (Das & Chen, 2001) like “not”, booster words like “very” and grammatical structures common in sentiment-bearing sentences (Turney, 2002). It relies upon reasonably grammatically correct English to function effectively, however, which makes it less useful in environments like social network sites with much informal language. Many refinements of the above approaches have been proposed (e.g., Konig & Brill, 2006; Turney, 2002).
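The simple counting technique described above, extended with negating and booster words, can be illustrated with a minimal Python sketch. It is not the method used in this study or in any of the cited papers: the word lists, the strength weights, the two-word look-back window and the function names (`score_sentence`, `score_for_target`) are all assumptions chosen only for illustration.

```python
import re

# Hypothetical toy lexicons: the words and strength weights are assumptions,
# not taken from the article or from any published sentiment resource.
POSITIVE = {"like": 1, "love": 2, "great": 2, "rocked": 2}
NEGATIVE = {"hate": 2, "bad": 1, "awful": 2, "unreliable": 1}
NEGATORS = {"not", "never", "no"}
BOOSTERS = {"very", "really", "so"}


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_sentence(sentence):
    """Return (positive, negative) strength totals for one sentence."""
    pos = neg = 0
    tokens = tokenize(sentence)
    for i, tok in enumerate(tokens):
        if tok in POSITIVE:
            polarity, strength = "pos", POSITIVE[tok]
        elif tok in NEGATIVE:
            polarity, strength = "neg", NEGATIVE[tok]
        else:
            continue
        # Look back at most two words for boosters and negators (assumed window).
        window = tokens[max(0, i - 2):i]
        if any(w in BOOSTERS for w in window):
            strength += 1
        if any(w in NEGATORS for w in window):
            polarity = "neg" if polarity == "pos" else "pos"
        if polarity == "pos":
            pos += strength
        else:
            neg += strength
    return pos, neg


def score_for_target(text, target):
    """Sum sentence scores over the sentences that mention the target term."""
    pos = neg = 0
    for sentence in re.split(r"[.!?]+", text):
        if target.lower() in sentence.lower():
            p, n = score_sentence(sentence)
            pos += p
            neg += n
    return pos, neg
```

For instance, `score_for_target("The engine reliability is bad. I love the seats!", "engine reliability")` only scores the first sentence, because the second does not mention the target term. A sketch of this kind depends on exact word matches, so the misspellings, slang and loose grammar common in social network comments would defeat it, which is the limitation noted above.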
Text mining applications have also been developed in psychology, communication studies, management and corpus linguistics (for a review see: Pennebaker, Mehl, & Niederhoffer, 2003). For instance, some psychological disorders can be quite reliably diagnosed in patients based upon a simple word frequency analysis of speech (Oxman, Rosenberg, & Tucker, 1982); political statements (Hart, 2001) and business mission statements (Short & Palmer, 2008) have been analysed for the strength of variables including optimism; and a factor analysis across a wide range of text genres has identified that the degree of author involvement in a text as opposed to an informational orientation (arguably a weak expression of emotion) is something that tends to be constant within genres but varies between genres (Biber, 2003).
The psychology of emotion
Many different aspects of emotion can be measured, including: individuals’ self-reports of feelings, neurological changes, autonomic system reactions, and bodily actions, including facial movements (Mauss & Robinson, 2009). These seem to overlap between different emotions, leading them to be described as syndromes rather than clear sets of identifiable features. Ekman (1992) and others nevertheless argue that there are basic or fundamental emotions that are relatively universally recognised and apparently experienced by humans, and that these exist as a result of evolutionary pressure. For example, autonomic changes and cognitive processes during fear prepare a person to run away from danger. In support of this, there is scientific evidence that at least five different emotions (fear, disgust, anger, happiness, sadness) are demonstrably different in the sense of activating different combinations of brain regions (Murphy, Nimmo-Smith, & Lawrence, 2003); adding surprise gives Ekman’s (1992) main list of six basic emotions. Ekman’s (1992) evidence in support of emotions being basic is a set of six general characteristics common to all basic emotions (e.g., brief duration, presence in other primates) and three types of characteristic that exist but differ between emotions: signals (e.g., facial expressions); physiology (e.g., autonomic nervous system activity patterns); and antecedent events (e.g., a dangerous event occurring).
The above list excludes some emotions considered important by others, such as anxiety, guilt, shame, envy, jealousy, compassion and love (Lazarus, 1991, p. 122). Non-basic emotions are sometimes seen as combinations of basic emotions and seem to vary more between cultures. Emotion perception is culture-specific because some societies describe emotions never apparently experienced elsewhere (e.g., the oft-mentioned “state of being a wild pig” (Newman, 1964) in a New Guinea community).
From the perspective of felt human experiences rather than at the neurological or descriptive levels, it seems that there are two fundamental dimensions rather than a range of differing kinds of emotions (Fox, 2008, p. 120). First, the valence of an experienced emotion is the degree to which it is strongly positive or negative. Second, the level of arousal felt is the amount of energy perceived (e.g., from lethargic to hyperactive). This assertion apparently contradicts the neurological evidence above of at least five emotions and the linguistic evidence in the form of the existence of a wide range of non-synonymous terms for emotions. Nevertheless, research has shown that people describing the same traumatic event may use a wide range of different emotional terms (e.g., sad, angry, upset) almost indiscriminately (Barrett, 2006) and that the two dimensions of valence and arousal seem to be the key underlying factors. A consequence of this is that identifying valence and arousal is likely to be far easier and more r