The Little SAS Book: A Primer, Fifth Edition. SAS Institute Inc. (c) 2012. Copying prohibited.

Chapter 6: Modifying and Combining SAS Data Sets

“I usually say, ‘The computer is the dumbest thing on campus. It does exactly what you tell it to; not necessarily what you want. Logic is up to you.’”
Necia A. Black, R.N., Ph.D.
From the SAS-L Listserv, May 6, 1994. Reprinted by permission of the author.

6.1 Modifying a Data Set Using the SET Statement

The SET statement in the DATA step allows you to read a SAS data set so you can add new variables, create a subset, or otherwise modify the data set. If you were short on disk space, for example, you might not want to store your computed variables in a permanent SAS data set. Instead, you might want to calculate them as needed for analysis. Likewise, to save processing time, you might want to create a subset of a SAS data set when you only want to look at a small portion of a large data set. The SET statement brings a SAS data set, one observation at a time, into the DATA step for processing.¹

To read a SAS data set, start with the DATA statement specifying the name of the new data set. Then follow with the SET statement specifying the name of the old data set you want to read. If you don't want to create a new data set, you can specify the same name in the DATA and SET statements. Then the results of the DATA step will overwrite the old data set named in the SET statement.² The following shows the general form of the DATA and SET statements:

   DATA new-data-set;
      SET data-set;

Any assignment, subsetting IF, or other DATA step statements usually follow the SET statement. For example, the following creates a new data set, FRIDAY, which is a replica of the SALES data set, except FRIDAY has only the observations for Fridays, and it has an additional variable, Total:

   DATA friday;
      SET sales;
      IF Day = 'F';
      Total = Popcorn + Peanuts;
   RUN;

Example

The Fun Times Amusement Park is collecting data about their train ride. They can add more cars on the train during peak hours to shorten the wait, or take them off when they're not needed to save fuel costs. The raw data file contains data for the time of day, the number of cars on the train, and the total number of people on the train:

   10:10 6 21
   12:15 10 56
   15:30 10 25
   11:30 8 34
   13:15 8 12
   10:45 6 13
   20:30 6 32
   23:15 6 12

The data are read into a permanent SAS data set, TRAINS, stored in the MySASLib directory on the park's central computer by means of the following program:

   * Create permanent SAS data set trains;
   DATA 'c:\MySASLib\trains';
      INFILE 'c:\MyRawData\Train.dat';
      INPUT Time TIME5. Cars People;
   RUN;

This example uses direct referencing to tell SAS where to store the permanent SAS data set, but you could use a LIBNAME statement instead.

Each train car holds a maximum of six people. After collecting the data, the Fun Times management decides they want to know the average number of people per car on each ride. This number was not calculated in the original DATA step, which created the permanent SAS data set, but it can be calculated by the following program:

   * Read the SAS data set trains with a SET statement;
   DATA averagetrain;
      SET 'c:\MySASLib\trains';
      PeoplePerCar = People / Cars;
   RUN;
   PROC PRINT DATA = averagetrain;
      TITLE 'Average Number of People per Train Car';
      FORMAT Time TIME5.;
   RUN;

The DATA statement defines a new temporary SAS data set named AVERAGETRAIN. Then the SET statement reads the permanent SAS data set TRAINS, and an assignment statement creates the new variable PeoplePerCar. Here are the results of the PROC PRINT:

   Average Number of People per Train Car

   Obs    Time     Cars    People    PeoplePerCar
     1    10:10       6        21         3.50000
     2    12:15      10        56         5.60000
     3    15:30      10        25         2.50000
     4    11:30       8        34         4.25000
     5    13:15       8        12         1.50000
     6    10:45       6        13         2.16667
     7    20:30       6        32         5.33333
     8    23:15       6        12         2.00000

¹ The MODIFY statement also allows you to modify a single data set. See the SAS Help and Documentation for more information.
² By default, SAS will not overwrite a data set in a DATA step that has errors.

6.2 Stacking Data Sets Using the SET Statement

The SET statement with one SAS data set allows you to read and modify the data. With two or more data sets, in addition to reading and modifying the data, the SET statement concatenates or stacks the data sets one on top of the other. This is useful when you want to combine data sets with all or most of the same variables but different observations. You might, for example, have data from two different locations or data taken at two separate times, but you need the data together for analysis.

In a DATA step, first specify the name of the new SAS data set in the DATA statement, then list the names of the old data sets you want to combine in the SET statement:³

   DATA new-data-set;
      SET data-set-1 data-se
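
As a rough illustration of the stacking described in this section, the sketch below concatenates two small temporary data sets with a single SET statement. The data set names MORNING, EVENING, and ALLDAY are hypothetical and do not come from the book; the four data rows are borrowed from the train raw data shown earlier, split in two purely for this illustration.

```sas
* Hypothetical data sets for illustration only;
* Data lines start in column 1 so the TIME5. informat reads columns 1 to 5;
DATA morning;
   INPUT Time TIME5. Cars People;
   DATALINES;
10:10 6 21
11:30 8 34
;
RUN;

DATA evening;
   INPUT Time TIME5. Cars People;
   DATALINES;
20:30 6 32
23:15 6 12
;
RUN;

* Stack the two data sets one on top of the other;
DATA allday;
   SET morning evening;
RUN;

PROC PRINT DATA = allday;
   TITLE 'Morning and Evening Rides Stacked';
   FORMAT Time TIME5.;
RUN;
```

Under these assumptions, ALLDAY should contain the two observations from MORNING followed by the two from EVENING, in the order the data sets are listed in the SET statement.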