A data step is a type of SAS statement that allows you to manipulate SAS data sets.

Copying a data set (with new variables).

The following code simply creates a data set in the work library called "j" that is a copy of the data set jjj located in the mat013 library.

To concatenate two data sets (as shown pictorially) we use the following syntax: data

The following code concatenates the jjj and mmm data sets as shown.

To merge two data sets (as shown pictorially) we use the following syntax: data

Note that the two data sets must be sorted on the merge variable prior to merging. The following code would merge the two data sets first_data_set and other_data_set in the mat013 library as shown.

Merge mat013.first_data_set mat013.other_data_set;

Data steps can be used in conjunction with the where statement to select certain observations. The following code selects only the elements of the above data set that start with a D. The result is shown (note that the above code makes use of the substr function that we will see later).

SAS is able to handle very large data sets because of the way data steps work. In this section we'll explain how it uses the "program data vector" (PDV) to efficiently handle data. The basic steps of compiling a data step are as follows:

- SAS checks the data step for any unrecognized keywords and syntax errors.
- SAS creates a PDV to store the information for all the variables required from the data step.
- SAS reads in the data line by line using the PDV. (If a "by" statement is used, for example when merging two data sets, the PDV does not empty if there are still observations with the same value of the "by" variable.)
- SAS creates the descriptive portion of the SAS data set (viewable using the "contents" procedure).

An example of how this works with concatenation and an example of how this works with merging are shown.

Creating new variables using various arithmetic and/or string relationships is relatively straightforward in SAS. The following code creates a new data set called MMM_with_BMI, with a new variable "BMI" as a function of the height and weight variables in the MMM data set in the mat013 library.

Some of the arithmetic functions are shown.

We can also do operations on strings. The following code replaces the variable "Sex" with the first character of "Sex" (which gets rid of the Male - M and Female - F issue).

It's worth checking the web for a full list of the various SAS functions (there are a huge number of them).

In this section we'll take a quick look at two simple ways of improving the efficiency of a data step.

Recalling how SAS handles a data step (using the PDV as described previously), one immediate way of improving efficiency is to ensure that the PDV only "transports" the variables we require. We do this with the "drop" or "keep" statement.

Let us consider the previous example and assume that we want our MMM_with_BMI data set without the weight and height variables. We use a "drop" statement to get rid of those variables:

data mat013.MMM_with_BMI_nhw(drop=weight_in_kg height_in_metres)

Note that the following code would not give the required output, because it tries to drop the variables from the original data set, and we need those variables to calculate the BMI:

data mat013.MMM_with_BMI_nhw
Set mat013.MMM(drop=weight_in_kg height_in_metres)

The keep statement (basically) does the same thing as the drop statement but in reverse, by only keeping the variables we have specified. Which one to use depends simply on whether you want to drop or keep more variables. Note that you should not use a drop statement and a keep statement in the same data step.

The following code will create a data set with just the BMI variable.
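
The data steps discussed in this post can be sketched in SAS roughly as follows. This is an illustrative sketch, not the original listings: it assumes the mat013 library has already been assigned with a libname statement, that BMI is computed with the standard formula (weight in kg divided by height in metres squared), and that "Sex" is shortened with substr. The merge variable `id` and the work data set names (`jjj_and_mmm`, `first_sorted`, `other_sorted`, `merged`, `bmi_only`) are hypothetical names chosen for illustration.

```sas
/* Copy mat013.jjj into the work library as "j". */
data work.j;
    set mat013.jjj;
run;

/* Concatenate jjj and mmm: listing both on one set statement
   stacks the observations of mmm underneath those of jjj. */
data work.jjj_and_mmm;
    set mat013.jjj mat013.mmm;
run;

/* Merge requires both inputs to be sorted on the merge variable.
   "id" is a hypothetical merge variable; out= keeps the originals intact. */
proc sort data=mat013.first_data_set out=work.first_sorted;
    by id;
run;
proc sort data=mat013.other_data_set out=work.other_sorted;
    by id;
run;
data work.merged;
    merge work.first_sorted work.other_sorted;
    by id;
run;

/* New variables: BMI from weight and height, and Sex reduced to its
   first character. drop= is placed on the OUTPUT data set, so weight
   and height are still read in and available for the calculation. */
data mat013.MMM_with_BMI_nhw(drop=weight_in_kg height_in_metres);
    set mat013.MMM;
    BMI = weight_in_kg / (height_in_metres ** 2);
    Sex = substr(Sex, 1, 1);
run;

/* The keep= counterpart: only BMI is written to the output data set. */
data work.bmi_only(keep=BMI);
    set mat013.MMM;
    BMI = weight_in_kg / (height_in_metres ** 2);
run;

/* View the descriptive portion of the result. */
proc contents data=mat013.MMM_with_BMI_nhw;
run;
```

Placing drop= or keep= on the data statement controls what is written out, whereas placing it on the set statement controls what is read in; the latter would remove weight and height before BMI could be calculated.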