Dummy variables in SAS

Creating indicator (dummy) variables from a factor variable is not straightforward in the SAS programming language.

The macro below assumes an integer factor variable (factor levels with values 0, 1, 2, 3, ...) with no missing values and no attached format. To make sure this assumption holds, add the following lines to an appropriate data step:

if missing(factor) then factor=0; 
format factor;
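
As a minimal sketch of where these two lines go, the data step below copies a dataset and cleans the factor on the way. The dataset names olddataset and prepared are placeholders chosen for illustration. Since the macro further down loops over levels starting at 1, level 0 gets no dummy variable of its own, so recoded missing values end up with all dummies equal to 0.

```sas
/* Hypothetical names: olddataset (input), prepared (output), factor (the variable). */
data prepared;
  set olddataset;
  if missing(factor) then factor = 0;  /* recode missing values to level 0 */
  format factor;                       /* remove any format attached to factor */
run;
```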

To convert a non-numeric factor variable to an integer factor variable, create a format that maps each level to a number and use it to construct the new variable:

proc format library=WORK;
value $factor
'firstlevel'='1'
'secondlevel'='2'
...
'finallevel'='n'
;
run;

data newdataset;
set olddataset;
newfactor=input(put(factor,$factor.),8.);
run;
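
A worked instance of the template above may help. It is only a sketch: the variable color, its three levels, the format name $colorf and the dataset name colors are all made up for illustration. The OTHER range sends any unlisted value to 0.

```sas
/* Hypothetical character factor "color" with three levels. */
proc format library=WORK;
  value $colorf
    'red'   = '1'
    'green' = '2'
    'blue'  = '3'
    other   = '0';   /* anything unlisted becomes level 0 */
run;

data colors;
  length color $5;
  input color $;
  /* PUT maps the text level to '1', '2' or '3'; INPUT turns that into a number. */
  colornum = input(put(color, $colorf.), 8.);
  datalines;
red
blue
green
red
;
run;
```

Here colornum takes the values 1, 3, 2, 1 for the four rows.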

You are now able to define and run the macro.

/* Macro to generate dummy variables. */

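/* Helper macro: for every level I from 1 to N that occurs in the data
   (its macro variable is nonzero), emit the IF/ELSE pair that creates
   the dummy variable for that level. */
%MACRO GETCON; 
  %LOCAL I;
  %DO I = 1 %TO &N; 
  %IF &&M&I = 0 %THEN %GOTO OUT; 
    IF &factor = &&M&I THEN &factor&I = 1; 
    ELSE &factor&I = 0; 
  %OUT: %END; 
%MEND GETCON; 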

%macro indicator(factor,dataset);
PROC SORT DATA=&dataset OUT=UNIQUE NODUPKEY; 
  BY &factor; 
RUN; 
/* Assign the largest value of the factor to the macro variable N. */ 
DATA _NULL_; 
  SET UNIQUE END=LAST; 
  IF LAST THEN CALL SYMPUT('N', PUT(&factor, 8.)); 
RUN;
/* Assign the initial value 0 to all macro variables. */  
DATA _NULL_; 
  DO I = 1 TO &N; 
    CALL SYMPUT('M'||LEFT(PUT(I, 8.)), '0'); 
  END; 
RUN; 
/* Assign each observed factor value to its corresponding macro variable. */ 
DATA _NULL_; 
  SET UNIQUE; 
  CALL SYMPUT('M'||LEFT(PUT(&factor, 8.)), PUT(&factor, 8.));  
RUN; 
/* Create dummy variables. */ 
DATA &dataset; 
  SET &dataset; 
  %GETCON 
RUN; 
%mend;
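
To make the mechanics concrete, here is a sketch of what the final data step amounts to for a hypothetical call %indicator(grp, mydata), where grp only takes the values 1 and 3. The names mydata and grp are assumptions for illustration. In that case N is 3, M1 is 1, M2 stays 0 and M3 is 3, so no code is generated for level 2.

```sas
/* Hypothetical input: factor grp with levels 1 and 3 (level 2 never occurs). */
data mydata;
  input grp;
  datalines;
1
3
3
1
;
run;

/* Equivalent of the step generated by %indicator(grp, mydata). */
data mydata;
  set mydata;
  if grp = 1 then grp1 = 1;
  else grp1 = 0;
  /* nothing is emitted for level 2: M2 is still 0, so %GOTO OUT skips it */
  if grp = 3 then grp3 = 1;
  else grp3 = 0;
run;
```

Submitting options mprint; before calling the macro writes the generated statements to the log, which is a convenient way to compare the real output with this sketch.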

*Example data;
DATA TESTDATA; 
  INPUT CON; 
  CARDS; 
    1 
    7 
   34 
  115 
    7 
    1 
  487 
   34 
  506 
   57 
    7 
   43 
  ; 
RUN; 

*Test run;
%indicator(con,testdata);

The macro above is based on an example in SUGI paper 052-29.
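
One way to check the test run, offered as a sketch rather than as part of the original example: the example data contain eight distinct values, so the macro should add the variables con1, con7, con34, con43, con57, con115, con487 and con506 to TESTDATA. Summing each dummy gives the number of rows at that level.

```sas
/* Assumes TESTDATA has already been processed by %indicator(con,testdata). */
proc print data=testdata;
run;

/* The sum of each dummy variable equals the frequency of its level. */
proc means data=testdata sum;
  var con1 con7 con34 con43 con57 con115 con487 con506;
run;
```

For the twelve example rows the sums should be 2, 3, 2, 1, 1, 1, 1 and 1, in the order listed on the VAR statement.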
