SQL Server Integration Services SSISBest Practices.docx

上传人:b****5 文档编号:3413072 上传时间:2022-11-22 格式:DOCX 页数:11 大小:136.31KB
下载 相关 举报
SQL Server Integration Services SSISBest Practices.docx_第1页
第1页 / 共11页
SQL Server Integration Services SSISBest Practices.docx_第2页
第2页 / 共11页
SQL Server Integration Services SSISBest Practices.docx_第3页
第3页 / 共11页
SQL Server Integration Services SSISBest Practices.docx_第4页
第4页 / 共11页
SQL Server Integration Services SSISBest Practices.docx_第5页
第5页 / 共11页
点击查看更多>>
下载资源
资源描述

SQL Server Integration Services SSISBest Practices.docx

《SQL Server Integration Services SSISBest Practices.docx》由会员分享,可在线阅读,更多相关《SQL Server Integration Services SSISBest Practices.docx(11页珍藏版)》请在冰豆网上搜索。

SQL Server Integration Services SSISBest Practices.docx

SQLServerIntegrationServicesSSISBestPractices

SQLServerIntegrationServices(SSIS)-BestPractices

WinaKindlewith5SQLServereBooks

Problem

SQLServerIntegrationServices(SSIS)hasgrownalotfromitspredecessor(前任;前輩)DTS(DataTransformationServices)tobecomeanenterprisewideETL(Extraction,TransformationandLoading)productintermsof(就….而論;在……方面)itsusability,performance,parallelismetc.Apartfrom(除…..之外)beinganETLproduct,italsoprovidesdifferentbuilt-intaskstomanageaSQLServerinstance.AlthoughtheinternalarchitectureofSSIShasbeendesignedtoprovideahighdegreeofperformanceandparallelismtherearestillsomebestpracticestofurtheroptimizeperformance.Inthistipseries,IwillbetalkingaboutbestpracticestoconsiderwhileworkingwithSSISwhichIhavelearnedwhileworkingwithSSISforthepastcoupleofyears.

Solution

Asmentionedabove,SSISisthesuccessorofDTS(ofSQLServer7/2000).IfyouarecomingfromaDTSbackground,SSISpackagesmaylooksimilartoDTSpackages,butit'snotthecaseinreality.WhatImeanis,SSISisnotanenhancementtoDTSbutratheranewproductwhichhasbeenwrittenfromscratchtoprovidehighperformanceandparallelismandasaresultofthisitovercomesseverallimitationsofDTS.

SSIS2008hasfurtherenhancedtheinternaldataflowpipelineenginetoprovideevenbetterperformance,youmighthaveheardthenewsthatSSIS2008hassetanETLWorldrecordofuploading1TBofdatainlessthanhalfanhour.

ThebestpartofSSISisthatitisacomponentofSQLserver. ItcomesfreewiththeSQLServerinstallationandyoudon'tneedaseparatelicenseforit.Becauseofthis,alongwithhardcoreBIdevelopers,databasedevelopersanddatabaseadministratorsarealsousing ittotransferandtransformdata.

BestPractice#1-PullingHighVolumesofData

Recentlywehadtopulldatafromasourcetablewhichhad300millionsrecordstoanewtargettable.InitiallywhentheSSISpackagestarted,everythinglookedfine,datawasbeingtransferredasexpectedbutgraduallytheperformancedegradedandthedatatransferratewentdowndramatically.Duringanalysiswefoundthatthetargettablehadaprimaryclusteredkeyandtwonon-clusteredkeys.Becauseofthehighvolumeofdatainsertsintothetargettabletheseindexesgotfragmentedheavilyupto85%-90%.Weusedtheonlineindexrebuildingfeaturetorebuild/defragtheindexes,butagainthefragmentationlevelwasbackto90%afterevery15-20minutesduringtheload.Thiswholeprocessofdatatransferandparallelonlineindexrebuildstookalmost12-13hourswhichwasmuchmorethanourexpectedtimefordatatransfer.

Thenwecamewithanapproachtomakethetargettableaheapbydroppingalltheindexesonthetargettableinthebeginning,transferthedatatotheheapandondatatransfercompletion,recreateindexesonthetargettable.Withthisapproach,thewholeprocess(bydroppingindexes,transferringdataandrecreatingindexes)tookjust3-4hourswhichwaswhatwewereexpecting.

Thiswholeprocesshasbeengraphicallyshowninthebelowflowchart.Sotherecommendationistoconsiderdroppingyourtargettableindexesifpossiblebeforeinsertingdatatoitspeciallyifthevolumeofinsertsisveryhigh.

BestPractice#2-AvoidSELECT*

TheDataFlowTask(DFT)ofSSISusesabuffer(achunkofmemory)orientedarchitecturefordatatransferandtransformation.Whendatatravelsfromthesourcetothedestination,thedatafirstcomesintothebuffer,requiredtransformationsaredoneinthebufferitselfandthenwrittentothedestination.

Thesizeofthebufferisdependantonseveralfactors,oneofthemistheestimatedrowsize.Theestimatedrowsizeisdeterminedbysummingthemaximumsizeofallthecolumnsintherow.Sothemorecolumnsinarowmeanslessnumberofrowsinabufferandwithmorebufferrequirementstheresultisperformancedegradation.Henceitisrecommendedtoselectonlythosecolumnswhicharerequiredatdestination.

Evenifyouneedallthecolumnsfromthesource,youshouldusethecolumnnamespecificallyintheSELECTstatementotherwiseittakesanotherroundforthesourcetogathermeta-dataaboutthecolumnswhenyouareusingSELECT*.

Ifyoupullcolumnswhicharenotrequiredatdestination(orforwhichnomappingexists)SSISwillemitwarningslikethis.

[SSIS.Pipeline]Warning:

Theoutputcolumn"SalariedFlag"(64)onoutput"OLEDBSourceOutput"(11)andcomponent"OLEDBSource"

(1)isnotsubsequentlyusedintheDataFlowtask.RemovingthisunusedoutputcolumncanincreaseDataFlowtaskperformance.

[SSIS.Pipeline]Warning:

Theoutputcolumn"CurrentFlag"(73)onoutput"OLEDBSourceOutput"(11)andcomponent"OLEDBSource"

(1)isnotsubsequentlyusedintheDataFlowtask.RemovingthisunusedoutputcolumncanincreaseDataFlowtaskperformance.

Bewarewhenyouareusing"Tableorview"or"Tablenameorviewnamefromvariable"dataaccessmodeinOLEDBsource.ItbehaveslikeSELECT*andpullsallthecolumns,usethisaccessmodeonlyifyouneedallthecolumnsofthetableorviewfromthesourcetothedestination.

Tip:

Trytofitasmanyrowsintothebufferwhichwilleventuallyreducethenumberofbufferspassingthroughthedataflowpipelineengineandimproveperformance.

BestPractice#3-EffectofOLEDBDestinationSettings

TherearecouplesofsettingswithOLEDBdestinationwhichcanimpacttheperformanceofdatatransferaslistedbelow.

DataAccessMode–Thissettingprovidesthe'fastload'optionwhichinternallyusesaBULKINSERTstatementforuploadingdataintothedestinationtableinsteadofasimpleINSERTstatement(foreachsinglerow)asinthecaseforotheroptions.Sounlessyouhaveareasonforchangingit,don'tchangethisdefaultvalueoffastload.Ifyouselectthe'fastload'option,therearealsoacoupleofothersettingswhichyoucanuseasdiscussedbelow.

KeepIdentity–Bydefaultthissettingisuncheckedwhichmeansthedestinationtable(ifithasanidentitycolumn)willcreateidentityvaluesonitsown.Ifyoucheckthissetting,thedataflowenginewillensurethatthesourceidentityvaluesarepreservedandsamevalueisinsertedintothedestinationtable.

KeepNulls–Againbydefaultthissettingisuncheckedwhichmeansdefaultvaluewillbeinserted(ifthedefaultconstraintisdefinedonthetargetcolumn)duringinsertintothedestinationtableifNULLvalueiscomingfromthesourceforthatparticularcolumn.Ifyoucheckthisoptionthendefaultconstraintonthedestinationtable'scolumnwillbeignoredandpreservedNULLofthesourcecolumnwillbeinsertedintothedestination.

TableLock–Bydefaultthissettingischeckedandtherecommendationistoletitbecheckedunlessthesametableisbeingusedbysomeotherprocessatsametime.Itspecifiesatablelockwillbeacquiredonthedestinationtableinsteadofacquiringmultiplerowlevellocks,whichcouldturnintolockescalationproblems.

CheckConstraints–Againbydefaultthissettingischeckedandrecommendationistoun-checkitifyouaresurethattheincomingdataisnotgoingtoviolateconstraintsofthedestinationtable.Thissettingspecifiesthatthedataflowpipelineenginewillvalidatetheincomingdataagainsttheconstraintsoftargettable.Ifyouun-checkthisoptionitwillimprovetheperformanceofthedataload.

BestPractice#4-EffectofRowsPerBatchandMaximumInsertCommitSizeSettings

Rowsperbatch–Thedefaultvalueforthissettingis-1whichspecifiesallincomingrowswillbetreatedasasinglebatch.Youcanchangethisdefaultbehaviorandbreakallincomingrowsintomultiplebatches.Theallowedvalueisonlypositiveintegerwhichspecifiesthemaximumnumberofrowsinabatch.

Maximuminsertcommitsize–Thedefaultvalueforthissettingis'2147483647'(largestvaluefor4byteintegertype)whichspecifiesallincomingrowswillbecommittedonceonsuccessfulcompletion.Youcanspecifyapositivevalueforthissettingtoindicatethatcommitwillbedoneforthosenumberofrecords.Youmightbewondering,changingthedefaultvalueforthissettingwillputoverheadonthedataflowenginetocommitseveraltimes.Yesthatistrue,butatthesametimeitwillreleasethepressureonthetransactionlogandtempdbtogrowtremendouslyspecificallyduringhighvolumedatatransfers.

Theabovetwosettingsareveryimportanttounderstandtoimprovetheperformanceoftempdbandthetransactionlog.Forexampleifyouleave'Maxinsertcommitsize'toitsdefault,thetransactionlogandtempdbwillkeepongrowingduringtheextractionprocessandifyouaretransferringahighvolumeofdatathetempdbwillsoonrunoutofmemoryasaresultofthisyourextractionwillfail.Soitisrecommendedtosetthesevaluestoanoptimumvaluebasedonyourenvironment.

Note:

TheaboverecommendationshavebeendoneonthebasisofexperiencegainedworkingwithDTSandSSISforthelastcoupleofyears.Butasnotedbeforethereareotherfactorswhichimpacttheperformance,oneofthethemisinfrastructureandnetwork.Soyoushoulddothoroughtestingbeforeputtingthesechangesintoyourproductionenvironment.

Problem

Inthefirsttip(SQLServerIntegrationServices(SSIS)-BestPractices-Part1)ofthisseriesIwroteaboutSSISdesignbestpractices.Tocontinuedownthatpath,thistipisgoingtocoverrecommendationsrelatedtotheSQLServerDestinationAdapter,asynchronoustransformations,DefaultBufferMaxSizeandDefaultBufferMaxRows,BufferTempStoragePathandBLOBTempStoragePathaswellastheDelayValidationproperty.

Solution

Inthistipmyrecommendationsarerelatedto

展开阅读全文
相关资源
猜你喜欢
相关搜索

当前位置:首页 > 小学教育 > 学科竞赛

copyright@ 2008-2022 冰豆网网站版权所有

经营许可证编号:鄂ICP备2022015515号-1