Optimizing a method for finding the M largest elements in an NxN array using C++


SourcePoint[M];

maxCoefficients = new SourcePoint*[

for (int j = 0; j < rows; j++) {
    for (int i = 0; i < cols; i++) {
        float sample = arr[i][j];
        if (sample > maxValues[0].value) {
            int q = 1;
            // Test q < M first so that maxValues[M] is never read.
            while (q < M && sample > maxValues[q].value) {
                maxValues[q-1] = maxValues[q];  // shuffle the values back
                q++;
            }
            maxValues[q-1].value = sample;
            maxValues[q-1].point = Point(i, j);
        }
    }
}

Point struct is just two ints - x and y.

This code basically does an insertion sort of the values coming in. maxValues[0] always contains the SourcePoint with the lowest value that still keeps it within the top M values encountered so far. This gives us a quick and easy bailout: if sample <= maxValues[0].value, we don't do anything. The issue I'm having is the shuffling every time a new, better value is found. It works its way all the way down maxValues until it finds its spot, shuffling all the elements in maxValues to make room for itself.

I'm getting to the point where I'm ready to look into SIMD solutions, or cache optimisations, since it looks like there's a fair bit of cache thrashing happening. Cutting the cost of this operation down will dramatically affect the performance of my overall algorithm, since this is called many, many times and accounts for 60-80% of my overall cost.

I've tried using a std::vector and make_heap, but I think the overhead for creating the heap outweighed the savings of the heap operations. This is likely because M and N generally aren't large. M is typically 10-20 and N 10-30 (NxN 100-900). The issue is this operation is called repeatedly, and it can't be precomputed.
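For reference, the heap approach mentioned above does not have to call make_heap over all NxN samples. A minimal sketch, assuming the data is available as a flat row-major buffer, keeps a min-heap of at most M entries and only touches it when a sample beats the current minimum. The names topMHeap and greaterValue, and the Point and SourcePoint definitions, are stand-ins invented for this example. Whether this beats the insertion loop for M around 10-20 would have to be measured.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the types used in the question.
struct Point { int x; int y; };
struct SourcePoint { Point point; float value; };

// Comparator that makes the std heap functions keep the SMALLEST value in front.
inline bool greaterValue(const SourcePoint& a, const SourcePoint& b) {
    return a.value > b.value;
}

// Returns the M largest samples of a rows x cols grid stored row-major in `data`
// (data.size() must be at least rows * cols). The result is in heap order, not sorted.
std::vector<SourcePoint> topMHeap(const std::vector<float>& data,
                                  int rows, int cols, std::size_t M) {
    std::vector<SourcePoint> heap;
    if (M == 0) return heap;
    heap.reserve(M);
    std::size_t k = 0;  // running index into the flat buffer
    for (int j = 0; j < rows; ++j) {
        for (int i = 0; i < cols; ++i) {
            const float sample = data[k++];
            if (heap.size() < M) {
                // Still filling: add the sample and restore the heap property.
                heap.push_back({{i, j}, sample});
                std::push_heap(heap.begin(), heap.end(), greaterValue);
            } else if (sample > heap.front().value) {
                // Move the current minimum to the back, overwrite it, re-heapify.
                std::pop_heap(heap.begin(), heap.end(), greaterValue);
                heap.back() = {{i, j}, sample};
                std::push_heap(heap.begin(), heap.end(), greaterValue);
            }
        }
    }
    return heap;
}
```

Each replacement costs O(log M) instead of the O(M) shuffle, but for small M the constant factors may well dominate.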

I just had a thought to pre-load the first M elements of maxValues, which may provide some small savings. In the current algorithm, the first M elements are guaranteed to shuffle themselves all the way down just to initially fill maxValues.
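The pre-loading idea could look roughly like the sketch below: copy the first M samples in unchanged, sort them once, then run the usual bailout-and-shuffle loop over the remainder. This assumes a flat row-major buffer. The function name topMPreloaded and the Point and SourcePoint definitions are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the types used in the question.
struct Point { int x; int y; };
struct SourcePoint { Point point; float value; };

// Returns the M largest samples sorted ascending (element 0 is the smallest kept).
// `data` is a flat row-major buffer with `cols` samples per row.
std::vector<SourcePoint> topMPreloaded(const std::vector<float>& data,
                                       std::size_t cols, std::size_t M) {
    const std::size_t total = data.size();
    const std::size_t kept = std::min(M, total);
    std::vector<SourcePoint> maxValues;
    if (kept == 0 || cols == 0) return maxValues;
    maxValues.reserve(kept);

    // Converts a flat index back into (x, y) coordinates.
    auto pointOf = [cols](std::size_t k) {
        return Point{static_cast<int>(k % cols), static_cast<int>(k / cols)};
    };

    // Pre-load: take the first `kept` samples as they come, then sort once.
    for (std::size_t k = 0; k < kept; ++k) {
        maxValues.push_back(SourcePoint{pointOf(k), data[k]});
    }
    std::sort(maxValues.begin(), maxValues.end(),
              [](const SourcePoint& a, const SourcePoint& b) { return a.value < b.value; });

    // Remaining samples: same bailout and shuffle as the original loop.
    for (std::size_t k = kept; k < total; ++k) {
        const float sample = data[k];
        if (sample <= maxValues[0].value) continue;  // quick bailout
        std::size_t q = 1;
        while (q < kept && sample > maxValues[q].value) {
            maxValues[q - 1] = maxValues[q];  // shuffle the values back
            ++q;
        }
        maxValues[q - 1] = SourcePoint{pointOf(k), sample};
    }
    return maxValues;
}
```

The saving is limited to the first M samples, so it is likely to matter most when M is a sizeable fraction of NxN.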

Any help from optimization gurus would be much appreciated :)

A few ideas you can try. In some quick tests with N=100 and M=15 I was able to get it around 25% faster in VC++ 2010, but test it yourself to see whether any of them help in your case. Some of these changes may have no or even a negative effect depending on the actual usage/data and compiler optimizations.

Don't allocate a new maxValues array each time unless you need to. Using a stack variable instead of dynamic allocation gets me +5%.

Changing g_Source[i][j] to g_Source[j][i] gains you a very little bit (not as much as I'd thought there would be).

Using the structure SourcePoint1 listed at the bottom gets me another few percent.

The biggest gain of around +15% was to replace the local variable sample with g_Source[j][i]. The compiler is likely smart enough to optimize out the multiple reads to the array, which it can't do if you use a local variable.
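Three of the suggestions above (a stack buffer, the [j][i] index order, and reading the array element directly) can be combined in one loop. The following is only a sketch under assumptions: it is not the answer's own code, which is cut off in this copy. It assumes a square built-in array and compile-time M, and the name topMOnStack and the Point and SourcePoint definitions are invented here. Any speed-up will depend on the compiler and the data.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Hypothetical stand-ins for the types used in the question.
struct Point { int x; int y; };
struct SourcePoint { Point point; float value; };

// Returns the M largest samples of an N x N array, sorted ascending.
// M is given explicitly; N is deduced from the argument.
template <std::size_t M, std::size_t N>
std::array<SourcePoint, M> topMOnStack(const float (&src)[N][N]) {
    static_assert(M > 0, "M must be positive");
    // Fixed-size buffer: no new/delete on each call.
    std::array<SourcePoint, M> maxValues;
    maxValues.fill(SourcePoint{{-1, -1}, std::numeric_limits<float>::lowest()});

    for (std::size_t j = 0; j < N; ++j) {
        for (std::size_t i = 0; i < N; ++i) {
            // src[j][i]: the inner loop walks the last index, i.e. contiguous memory.
            if (src[j][i] > maxValues[0].value) {
                std::size_t q = 1;
                while (q < M && src[j][i] > maxValues[q].value) {
                    maxValues[q - 1] = maxValues[q];  // shuffle the values back
                    ++q;
                }
                maxValues[q - 1].value = src[j][i];
                maxValues[q - 1].point = Point{static_cast<int>(i), static_cast<int>(j)};
            }
        }
    }
    return maxValues;
}
```

A call such as topMOnStack<15>(grid) deduces N from the array type. Slots that were never filled keep the sentinel point (-1, -1), which can only happen when M exceeds NxN.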

Trying a simple binary search netted me a small loss of a few percent. For larger M/Ns you'd likely see a benefit.
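One way the binary-search variant could be written is with std::lower_bound to find the slot and the three-argument std::move to shift the lower entries down. This is a sketch, not the code the answer measured. The helper name insertSample and the Point and SourcePoint definitions are assumptions made for the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for the types used in the question.
struct Point { int x; int y; };
struct SourcePoint { Point point; float value; };

// `maxValues` must be non-empty and sorted ascending by value.
// Inserts the sample if it beats the current minimum; returns true if it was inserted.
bool insertSample(std::vector<SourcePoint>& maxValues, float sample, Point where) {
    if (sample <= maxValues.front().value) return false;  // quick bailout

    // First slot after index 0 whose value is not less than the sample.
    auto pos = std::lower_bound(
        maxValues.begin() + 1, maxValues.end(), sample,
        [](const SourcePoint& sp, float v) { return sp.value < v; });

    // Shift [begin + 1, pos) down by one slot, overwriting the old minimum.
    std::move(maxValues.begin() + 1, pos, maxValues.begin());
    *(pos - 1) = SourcePoint{where, sample};
    return true;
}
```

The search drops to O(log M) comparisons, but the shift is still O(M) copies, which fits the observation that it only pays off for larger M.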

If possible, try to keep the source data in arr[][] sorted, even if only partially. Ideally you'd want to generate maxValues[] at the same time the source data is created.
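Generating maxValues[] while the source data is being produced could be done by feeding each sample to a small tracker as it is written. How the data is actually created is not shown in the question, so the class below is purely illustrative: TopMTracker, offer and values are invented names, and Point and SourcePoint are stand-ins.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the types used in the question.
struct Point { int x; int y; };
struct SourcePoint { Point point; float value; };

// Keeps the M largest samples offered so far, sorted ascending by value.
class TopMTracker {
public:
    explicit TopMTracker(std::size_t m) : capacity_(m) { kept_.reserve(m); }

    // Call once per sample, at the place where the source data is written.
    void offer(float sample, Point where) {
        if (capacity_ == 0) return;
        if (kept_.size() == capacity_) {
            if (sample <= kept_.front().value) return;  // not in the top M
            kept_.erase(kept_.begin());                 // drop the current minimum
        }
        // Insert before the first kept entry that is larger than the sample.
        auto pos = std::upper_bound(
            kept_.begin(), kept_.end(), sample,
            [](float v, const SourcePoint& sp) { return v < sp.value; });
        kept_.insert(pos, SourcePoint{where, sample});
    }

    const std::vector<SourcePoint>& values() const { return kept_; }

private:
    std::size_t capacity_;
    std::vector<SourcePoint> kept_;
};
```

This removes the separate pass over arr[][], at the cost of coupling the tracker to whatever code produces the data.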

Looking at how the data is created/stored/organized may give you patterns or information to reduce the amount of time to generate your maxValues[] array. For example, in the best case you could come up with a formula that gives you the top M coordinates without needing to iterate and sort.

Code for above:

structS
