Securing Data Analytics on SGX With Randomization

Authors:

Swarup Chandra, Vishal Karande, Zhiqiang Lin, Latifur Khan, Murat Kantarcioglu, and Bhavani Thuraisingham

From:

University of Texas at Dallas, Richardson, TX, USA

{swarup.chandra, vishal.karande, zhiqiang.lin, lkhan, muratk, bhavani.thuraisingham}@utdallas.edu

Abstract

Protection of data privacy and prevention of unwarranted information disclosure is an enduring challenge in cloud computing when data analytics is performed on an untrusted third-party resource. Recent advances in trusted processor technology, such as Intel SGX, have rejuvenated the efforts of performing data analytics on a shared platform where data security and trustworthiness of computations are ensured by the hardware. However, a powerful adversary may still be able to infer private information in this setting from side channels such as cache access, CPU usage and other timing channels, thereby threatening data and user privacy. Though studies have proposed techniques to hide such information leaks through carefully designed data-independent access paths, such techniques can be prohibitively slow on models with a large number of parameters, especially when employed in a real-time analytics application. In this paper, we introduce a defense strategy that can achieve higher computational efficiency with a small trade-off in privacy protection. In particular, we study a strategy that adds noise to traces of memory access observed by an adversary, with the use of dummy data instances. We quantitatively measure the privacy guarantee, and empirically demonstrate the effectiveness and limitations of this randomization strategy, using classification and clustering algorithms. Our results show a significant reduction in execution time overhead on real-world datasets, when compared to a defense strategy using only data-oblivious mechanisms.

Keywords:

Data Privacy, Analytics, Intel SGX, Randomization

1 Introduction

When computation involving data with sensitive information is outsourced to an untrusted third-party resource, data privacy and security are matters of grave concern to the data owner. For example, third-party services offering state-of-the-art predictive analytics platforms may be used on data containing private information such as health-care records. An adversary in this environment may control the third-party resource to obtain records of a specific user, or to identify sensitive patterns in data. Typically, data is protected from such external adversaries using cryptographically secure encryption schemes. However, direct computation on encrypted data, using techniques such as fully homomorphic encryption schemes [13], can be inefficient for many practical purposes [21], including data analytics, the focus of this paper.

Recent advances in hardware-based technology such as Intel SGX offer a cryptographically secure execution environment, called an Enclave, that isolates code and data from untrusted regions within a device. It is natural to leverage the confidentiality and trustworthiness provided by this mechanism, supported by an untrusted third-party server, to efficiently perform large-scale analytics over sensitive data which is decrypted within a secure region. An adversary controlling this server will neither have access to decrypted data, nor be able to modify computation involving it.

Unfortunately, studies have discovered the presence of side channels that may leak undesirable information from within an enclave. By observing resource access and timing, an adversary can design an attack to derive sensitive information from computation at runtime [14, 34]. Mechanisms to eliminate such information leaks typically rely on the software developer to hide access patterns with other non-essential or dummy resource accesses. These include balanced execution [31] and data-oblivious execution [26]. From the adversarial point of view, these mechanisms add noise to patterns emerging from the essential computation of a naive implementation. Although such defenses curb information leaks from an SGX enclave and guarantee data privacy, they add significant computational overhead on certain applications in data analytics, particularly in settings involving a large number of parameters and requiring real-time response [23].
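To make the cost of data-oblivious execution concrete, the following minimal sketch contrasts a naive table lookup with one that touches every slot. It is an illustration under stated assumptions, not the mechanism of [26]: the function names are hypothetical, the `trace` list is only a stand-in for the memory accesses an adversary might observe, and Python itself offers no real constant-time guarantees.

```python
from typing import List, Tuple


def naive_lookup(table: List[int], secret_index: int) -> Tuple[int, List[int]]:
    """Read only the requested slot; the trace exposes the secret index."""
    trace = [secret_index]  # the single address an observer would see
    return table[secret_index], trace


def oblivious_lookup(table: List[int], secret_index: int) -> Tuple[int, List[int]]:
    """Touch every slot in a fixed order, so the trace is the same for any index."""
    trace: List[int] = []
    result = 0
    for i, value in enumerate(table):
        trace.append(i)  # every slot is read, whether it is needed or not
        is_match = int(i == secret_index)
        # Arithmetic select instead of a branch on the secret value.
        result = is_match * value + (1 - is_match) * result
    return result, trace


table = [10, 20, 30, 40]
_, trace_a = oblivious_lookup(table, 1)
_, trace_b = oblivious_lookup(table, 3)
print(trace_a == trace_b)  # oblivious traces do not depend on the index
print(naive_lookup(table, 1)[1], naive_lookup(table, 3)[1])  # naive traces do
```

The oblivious version performs one access per table entry for every lookup, which hints at why such defenses become expensive when a model has many parameters.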

In this paper, we discuss a novel defense mechanism that can achieve lower computational overhead with a trade-off on privacy guarantee, when performing data analytics within an SGX enclave running on a third-party server. In particular, we focus on two classical problems in data analytics, i.e., data classification and clustering. Here, a statistical model is used to predict class labels of given data instances (in classification) or associate them to clusters (in clustering). We generate new dummy data instances and interleave them with user-given data instances before evaluation. Our proposed defense strategy leverages equivalence in resource access patterns observed by an adversary during evaluation of user-given and dummy data instances. This introduces uncertainty in observed side-channel information in a stochastic manner.

In short, we make the following contributions in this paper.

– We present a defense strategy against side-channel attacks on Intel SGX by randomizing information revealed to the attacker, and asymptotically guaranteeing data privacy.

– We illustrate its application on popular data analytics including decision tree and Naive Bayes classification, and k-means clustering techniques.

– We study the effect of privacy in terms of proportion of dummy data instances employed with respect to user-given data instances, and empirically demonstrate the effectiveness of our defense strategy.
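The interleaving of dummy and user-given instances described in the introduction can be sketched roughly as follows. This is a hedged illustration rather than the authors' implementation: `make_dummy`, `evaluate_with_dummies`, the `dummy_ratio` parameter, and the uniform sampling of dummy features are all assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Instance = Tuple[float, ...]


def make_dummy(real: Sequence[Instance], rng: random.Random) -> Instance:
    """Hypothetical dummy generator: sample each feature within its observed range."""
    columns = list(zip(*real))
    return tuple(rng.uniform(min(c), max(c)) for c in columns)


def evaluate_with_dummies(
    real: Sequence[Instance],
    predict: Callable[[Instance], int],
    dummy_ratio: float,
    rng: random.Random,
) -> List[int]:
    """Interleave dummies with real instances, evaluate all, keep real results only."""
    n_dummy = int(round(dummy_ratio * len(real)))
    # Tag each instance with its original position, or None for a dummy.
    batch = [(i, x) for i, x in enumerate(real)]
    batch += [(None, make_dummy(real, rng)) for _ in range(n_dummy)]
    rng.shuffle(batch)  # random interleaving, assumed to happen in the trusted region

    results: List[int] = [0] * len(real)
    for position, x in batch:
        label = predict(x)  # the model sees real and dummy inputs alike
        # Simplification: a real defense would also need to hide this branch.
        if position is not None:
            results[position] = label
    return results


def threshold_model(x: Instance) -> int:
    """Toy classifier used only to exercise the sketch."""
    return int(x[0] > 0.5)


real_instances = [(0.1, 2.0), (0.9, 1.0), (0.4, 3.0)]
print(evaluate_with_dummies(real_instances, threshold_model, 1.0, random.Random(0)))
```

In this sketch `dummy_ratio` is the proportion of dummy to user-given instances: raising it adds more noise to what an observer sees, at the cost of more model evaluations.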

The rest of the paper is organized as follows. We first provide relevant background on Intel SGX and data analytics in §2. We detail the threat model and our defense strategy in §3, and describe relevant implementation techniques in §4. We quantify the privacy guarantee of the proposed strategy with respect to the number of dummy data instances in §5, and then present empirical estimates of computational overhead using real-world datasets. We finally discuss related studies in §6, and conclude in §7.

2 Background

2.1 Intel SGX

Intel Software Guard Extensions (SGX) [2] is a set of additional processor instructions to the x86 family, with hardware support to create secure memory regions within existing address space. Such an isolated container is called an Enclave, while the rest of the address space is untrusted. Data within these memory regions can only be accessed by code running within the enclave. This access control is enforced by the hardware, using attestation and cryptographically secure keys [11] with a trusted processor. The new SGX instructions are used to load and initialize an enclave, as well as enter and exit the protected region. From a developer's perspective, an enclave is entered by calling trusted ecalls (enclave calls) from the untrusted application space. The enclave can invoke untrusted code in its host application by calling ocalls (outside calls) to exit the enclave. Data from the enclave is always encrypted when it is in memory, but there are cases in which the content should be securely saved outside the enclave. The process of exporting the secrets from an enclave is known as Sealing. The encrypted sealed data can only be decrypted by the enclave. Every SGX-enabled processor contains a secret hardware key from which other platform keys are derived. A remote party can verify that a specific enclave is running on SGX hardware by having the enclave perform remote attestation.
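The idea that sealing keys are derived from a per-processor secret, so that sealed data can only be opened by the enclave, can be illustrated with a toy key derivation. This is only a conceptual sketch: the real derivation happens inside the processor and is not reproduced here, and `derive_seal_key`, the `b"seal"` label, and the placeholder values are assumptions made for illustration.

```python
import hashlib
import hmac


def derive_seal_key(hardware_secret: bytes, enclave_measurement: bytes) -> bytes:
    """Toy derivation: bind a key to both the processor and the enclave identity."""
    return hmac.new(
        hardware_secret, b"seal" + enclave_measurement, hashlib.sha256
    ).digest()


# Placeholder for the secret hardware key; a real key never leaves the processor.
hardware_secret = b"\x01" * 32
# Stand-ins for the identities of two different enclaves.
enclave_a = hashlib.sha256(b"enclave A code").digest()
enclave_b = hashlib.sha256(b"enclave B code").digest()

key_a = derive_seal_key(hardware_secret, enclave_a)
print(key_a == derive_seal_key(hardware_secret, enclave_a))  # same enclave, same key
print(key_a == derive_seal_key(hardware_secret, enclave_b))  # other enclave, other key
```

Because the derived key depends on the enclave identity as well as the hardware secret, data encrypted under it would not be recoverable by a different enclave or on a different processor in this toy model.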

Attacks. While performing computations within the enclave, an adversary controlling the host OS may infer sensitive and confidential information from side c
