超经典的SAS BASE的笔记3.docx
《超经典的SAS BASE的笔记3.docx》由会员分享,可在线阅读,更多相关《超经典的SAS BASE的笔记3.docx(41页珍藏版)》请在冰豆网上搜索。
![超经典的SAS BASE的笔记3.docx](https://file1.bdocx.com/fileroot1/2023-1/10/dbdcb043-5de3-4385-b9b8-a84018217fbd/dbdcb043-5de3-4385-b9b8-a84018217fbd1.gif)
超经典的SASBASE的笔记3
Topic:
ManagingData
1.ConditionallyIF-THEN-ELSEexecuteSASstatements
2.SORTobservationsinaSASdataset
3.KEEP,DROP,DELETE,RENAME,RETAIN,BY,FILE,PUTandOUTPUTStatements
4.PROCFORMAT
5.SASDO-LOOPS
6.SASArray
1.ConditionallyIF-THEN-ELSEexecuteSASstatements
ConditionalexecutionofdatastepprogramstatementsisimplementedusingtheIF/THEN/ELSEstatements.
Syntax:
IFexpressionTHENstatement1;
ObservethatIF/THENandELSEaretwoseparateSASstatements.EachtimetheIFstatementisexecutedtheexpressionfollowingtheIFisevaluated.Whentheexpressionistruefortheobservation,thestatementfollowingtheTHENisexecuted.TheELSEstatement,whichisoptional,canbeusedtocontrolaspecificactioniftheIFconditionisfalse.
Ortrytofullyunderstandthefollowingstatements:
TheinputstotheIF/ELSEstatementsare
expressionisanexpressionthatisevaluatedforbeingtrueorfalse.
statement1isastatementexecutedwhenexpressionistrue.
statement2isastatementexecutedwhenexpressionisfalse.
TheseexamplesshowdifferentwaysofspecifyingtheIF-THEN/ELSEstatement.
/*Example:
IF-THEN*/
dataif_01_a;
inputcode@@;
cards;
123
;
dataif_01_b;
lengthtype$8.;
setif_01_a;
ifcode=1thentype='Fix';
ifcode=2thentype='Variable';
ifcode^=1andcode^=2thentype='Unknown';
labeltype='TypesofMortgageRate';
run;
/*Example:
IF-THEN/ELSE*/
dataif_01_c;
lengthtype$8.;
setif_01_a;
ifcode=1thentype='Fix';
elseifcode=2thentype='Variable';
elseifcode^=1andcode^=2thentype='Unknown';
labeltype='TypesofMortgageRate';
run;
Letusseehowtodivideagegroup?
dataage_grp;
inputpat_idage@@;
cards;
2901566629987168280256642704566026215658
2632565526645653250656442512564325745647
2583564824460642249456412332563323765637
228356282226062221985621
;
run;
dataage_grp2;
lengthagegrp$5.;
setage_grp;
if20<=age<35thenagegrp='20-34';
elseif35<=age<50thenagegrp='35-49';
elseif50<=age<65thenagegrp='50-64';
elseif65<=agethenagegrp='65+';
run;
procfreq;
tablesagegrp/list;
*tablesagegrp/out=age_grp3;
run;
Advice:
AbetterwaytowritemultipleIFstatementistouseanELSEbeforeallbutthefirstIF.TheotherIF-THEN/ELSEstatementswouldlooklikethis:
Ifthen;
Elseifthen;
Elseifthen;
…………..;
TheeffectoftheELSEstatementsisthatwhenanyIFstatementistrue,allthefollowingELSEstatementsareskipped.Theadvantageistoreducecomputertime(sincealltheIFdonothavetobetested)andevenlytoavoidthefollowingtypeoferror.Wouldyouseewhatwillhappenwiththestatementsbelow?
dataer;
inputx@@;
cards;
12345
;
run;
dataer_2;
seter;
ifx=1thenx=5;
ifx=2thenx=4;
ifx=4thenx=2;
ifx=5thenx=1;
run;
dataer_3;
seter;
ifx=1thenx=5;
elseifx=2thenx=4;
elseifx=4thenx=2;
elseifx=5thenx=1;
run;
/*Example:
IF-THEN/DO*/
dataoss_test;
lengthcourse$8.school$4.;
inputcourseschoolscoreyear;
cards;
EnglishNT921998
EnglishNT941999
EnglishNT962000
EnglishNT912001
EnglishNSS881998
EnglishNSS801999
EnglishNSS842000
EnglishNSS822001
MathNT861998
MathNT881999
MathNT892000
MathNT862001
MathNT902002
MathNSS841998
MathNSS881999
MathNSS882000
MathNSS922001
MathNSS892002
;
dataoss_test1;
setoss_test;
ifcourse='English'thendo;
ifscore>84thenoutput;
end;
run;
dataoss_test2;
setoss_test;
ifschool='NT'thendo;
ifcourse='Math'thendo;
ifscore>86thenoutput;
end;
end;
run;
EarlieryoulearnedtoassignvaluesconditionallyusingIF-THEN/ELSEstatements.YoucanalsouseSELECTgroupsinDATAstepstoperformconditionalprocessing.ASELECTgroupcontainsthesestatements:
Thisstatement...
Performsthisaction...
SELECT
beginsaSELECTgroup.
WHEN
identifiesSASstatementsthatareexecutedwhenaparticularconditionistrue.
OTHERWISE(optional)
specifiesastatementtobeexecutedifnoWHENconditionismet.
END
endsaSELECTgroup.
Syntax:
SELECT<(select-expression)>;
WHEN-1(when-expression-1<...,when-expression-n>)statement;
WHEN-n(when-expression-1<...,when-expression-n>)statement;
END;
Example:
datasel;
inputidsalarygenderjob$;
cards;
128000CA
231000RN
326981ME
445501MD
538951TA
;
datasel2;
lengthoccupation$20Sex$7;
setsel;
select(id);
when
(1)income=salary*10;
when(3,4)income=salary*15;
otherwiseincome=salary*5;
end;
select(job);
when('CA')occupation="CharteredAccountant";
when('RN')occupation="RegistedNurse";
when('ME')occupation="MechanicI";
when('MD')occupation="Doctor";
otherwiseoccupation="Other";
end;
select(gender);
when(0)Sex="Female";
when
(1)Sex="Male";
otherwiseSex='Unknown';
end;
run;
2.SORTobservationsinaSASdataset
BasicConcept:
2.1.1SortingOrdersforNumericVariables
Fornumericvariables,thesmallest-to-largestcomparisonsequenceis
1.SASSystemmissingvalues(shownasaperiodorspecialmissingvalue)
2.negativenumericvalues
3.zero
4.positivenumericvalues.
datasorting_1;
inputx@@;
cards;
.10-23
;
procsortdata=sorting_1;
byx;
run;
orderofoutput:
.-2013;
2.1.2SortingOrdersforCharacterVariables
datasorting_2;
inputarea$@@;
cards;
torontoLondon535.hamilton
;
procsortdata=sorting_2;
byarea;
run;
Output:
‘blank’,535,London,hamilton,toronto
2.2WhatcanSORTdo?
-Specifytheinputdataset
-Createanoutputdataset
-Specifytheoutputorder
-EliminateduplicateobservationswithcommonBYvaluesandotheroptions
2.3ApplicationsofPROCSORTprocedure
ThesortproceduresortsobservationsinaSASdatasetbyoneormorecharacterornumericvariables,eitherreplacingtheoriginaldatasetorcreatinganew,sorteddataset.PROCSORTbyitselfproducesnoprintedoutput.
2.3.1ObservationsSortedbytheValuesofOneVariable
Inthisexample,PROCSORTreplacestheoriginaldataset,sortedalphabeticallybylastname,withadatasetthatissortedbyemployeeidentificationnumber.Thestatementsthatproducetheoutputfollow:
datasorting_3;
inputName$IDnumber;
datalines;
Arnsbarger5466
Belloit1988
Capshaw7338
Lemeux4210
Pierce5779
Wesley2092
;
procsort;
byidnumber;
run;
2.3.2ObservationsSortedbytheValuesofMultipleVariables
Thebusinessesinthisexamplearefirstsortedbytown,thenbydebtfromhighestamounttolowestamount,thenbyaccountnumber.
DESCENDINGoption:
reversesthesortorderforthevariablethatimmediatelyfollowsinthestatementsothatobservationsaresortedfromthelargestvaluetothesmallestvalue.
datasorting_4;
inputcompany$1-23town$24-36debtaccnt_num;
datalines;
ApexCateringApex37.959923
Bob'sBedsMorrisville119.954998
Boyd&SonsAccountingGarner312.494762
DeluxeHardwareGarner467.128941
ElwayPianoandOrganGarner65.795217
IceCreamDelightHollySprings299.982310
Pauline'sAntiquesMorrisville302.059112
Paul'sPizzaApex83.001019
Peter'sAutoPartsApex65.797288
StricklandIndustriesMorrisville657.221675
Tina'sPetShopApex37.955108
Tim'sBurgerStandHollySprings119.956335
WatsonTaborTravelApex37.953131
WorldWideElectronicsGarner119.951122
;
run;
procsort;
bytowndescendingdebt;
run;
2.3.3CreateOutputDataSetfortheSortedObservations
procsortdata=sorting_4out=sorting_5;
bytowndescendingdebt;
run;
2.3.4Eliminateduplicateobservations
-NODUPKEYoption
Inthisexample,PROCSORTcreatesanoutputdatasetthatcontainsonlythefirstobservationofeachBYgroup.TheNODUPKEYoptionremovesanobservationfromtheoutputdatasetwhenitsBYvalueisidenticaltothepreviousobservation'sBYvalue.Theresultingreportcontainsoneobservationforeachtownwherethebusinessesarelocated.ItautomaticallyeliminatesmultipleobservationswheretheByvariableshavethesamevalue.
optionsnodatepageno=1linesize=80pagesize=60;
dataaccount;
inputCompany$1-22Debt25-30AccountNumber33-36
Town$39-51;
datalines;
Paul'sPizza83.001019Apex
WorldWideElectronics119.951122Garner
StricklandIndustries657.221675Morrisville
IceCreamDelight299.982310HollySprings
WatsonTaborTravel37.953131Apex
Boyd&SonsAccounting312.494762Garner
Bob'sBeds119.954998Morrisville
Tina'sPetShop37.955108Apex
ElwayPianoandOrgan65.795217Garner
Tim'sBurgerStand119.956335HollySprings
Peter'sAutoParts65.797288Apex
DeluxeHardware467.128941Garner
Pauline'sAntiques302.059112Morrisville
ApexCatering37.959923Apex
;
procsortdata=accountout=townsnodupkey;
bytown;
run;
procprintdata=towns;
vartowncompanydebtaccountnumber;
title'TownsofCustomerswithPast-DueAccounts';
run;
-NODUPLICATE/NODUP/NODUPRECSoption
Inthisexample,theNODUPLICATEoptionremovesobservationsthathaveduplicatevalueswithBYvalue.
Example:
datasorting_6;
inputpat_idage;
cards;
29015666
29015666
28025664
28015664
26215658
26325655
;
run;
procsortdata=sorting_6noduplicateout=sorting_7;
byage;
run;
***Nodupkey,nodup,andnoduplicate***;
datadup;
inputaccount_idvisitmmddyy10.checking_balcomma9.2;
datalines;
20118911/11/19987,865.28
20118911/28/19985,724.02
20118912/08/19986,908.98
20236911/11/19984,405.18
20418911/28/19985,724.02
20418912/05/19988,054.32
22518911/28/19983,632.85
225189