
Creating dummy variables in R

Randy Zwitch has a blog entry on creation of dummy variables from factor levels.

example <- span=""> as.data.frame(c("A", "A", "B", "F", "C", "G", "C", "D", "E", "F"))
names(example) <- span=""> "strcol"

#For every unique value in the string column, create a new 1/0 column
#This is what Factors do "under-the-hood" automatically when passed to function requiring numeric data
for(level in unique(example$strcol)){
  example[paste("dummy", level, sep = "_")] <- ifelse(example$strcol == level, 1, 0)
}
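To see the factor expansion that the code comments refer to, the loop can be compared with base R's model.matrix. This is a minimal sketch on the same example values, with the column stored explicitly as a factor (an assumption made here so the result does not depend on the R version's stringsAsFactors default). The "- 1" in the formula drops the intercept so that every level gets its own column.

```r
# Same values as above, with the column stored explicitly as a factor
example <- data.frame(strcol = factor(c("A", "A", "B", "F", "C", "G", "C", "D", "E", "F")))

# model.matrix expands the factor into 0/1 indicator columns;
# "- 1" removes the intercept so no level is dropped as the reference
mm <- model.matrix(~ strcol - 1, data = example)

colnames(mm)   # one column per level: strcolA, strcolB, ...
dim(mm)        # 10 rows, 7 columns
```

Each row of mm contains exactly one 1, in the column of that row's level.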
Level values often contain special characters, in which case you can clean the new column names with gsub and a regular expression:
example <- span=""> as.data.frame(c("AÆ", "AÆ", "B", "FÅ", "C", "G", "C", "D", "E", "FÅ"))
names(example) <- span=""> "strcol"

#For every unique value in the string column, create a new 1/0 column
#This is what Factors do "under-the-hood" automatically when passed to function requiring numeric data
for(level in unique(example$strcol)){
  example[gsub('[^a-zA-Z0-9_]', "", paste("dummy", level, sep = "_"), fixed = FALSE)] <- ifelse(example$strcol == level, 1, 0)
} 
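One thing to watch when stripping characters is that two different levels can collapse to the same column name, in which case the later assignment silently overwrites the earlier one. The sketch below uses hypothetical levels "A+" and "A-" (not taken from the data above) to show the collision and one possible way around it with make.unique.

```r
# Hypothetical levels with ASCII special characters (not from the original data)
lv <- c("A+", "A-", "B")

# Strip everything except letters, digits and underscores
nm <- gsub("[^a-zA-Z0-9_]", "", paste("dummy", lv, sep = "_"))
nm                      # "dummy_A" "dummy_A" "dummy_B"

# Two levels now share a name, so the second assignment would overwrite the first
anyDuplicated(nm) > 0   # TRUE

# make.unique appends a numeric suffix to repeated names
nm_unique <- make.unique(nm)
nm_unique               # "dummy_A" "dummy_A.1" "dummy_B"
```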
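You may also use levels instead of unique, combined with subsetting, e.g. levels(example$strcol)[-1], to create dummy variables for every level except the first, so that the reference level is absorbed into the baseline/intercept of your regression model. Note that levels only works when the column is a factor.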
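A minimal sketch of the reference-level variant, assuming a small factor column with hypothetical values "A", "B" and "C":

```r
# levels() needs a factor; a plain character column returns NULL
example <- data.frame(strcol = factor(c("A", "A", "B", "C", "B")))

# Drop the first level ("A"), which becomes the reference category
for (level in levels(example$strcol)[-1]) {
  example[paste("dummy", level, sep = "_")] <- ifelse(example$strcol == level, 1, 0)
}

names(example)   # "strcol" "dummy_B" "dummy_C"
```

Rows with strcol equal to "A" have 0 in both dummy columns, which is how the reference level is encoded.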
Model formulas can be built as strings with the paste function:
paste("somevar ~",paste(names(dataframe),sep="",collapse="+"))
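The pasted string still has to be turned into a formula object, and the response should normally be kept out of the right-hand side. A minimal sketch, assuming a hypothetical data frame with response y and predictors x1 and x2:

```r
# Hypothetical data frame: y is the response, x1 and x2 are predictors
dataframe <- data.frame(y = c(1, 2, 3, 4), x1 = c(2, 1, 4, 3), x2 = c(0, 1, 0, 1))

# Exclude the response from the right-hand side
predictors <- setdiff(names(dataframe), "y")
f_string <- paste("y ~", paste(predictors, collapse = " + "))
f_string                      # "y ~ x1 + x2"

# Convert the string to a formula object before passing it to lm()
f <- as.formula(f_string)
fit <- lm(f, data = dataframe)

# reformulate() builds the same formula without manual pasting
f2 <- reformulate(predictors, response = "y")
```

reformulate is part of base R's stats package and avoids quoting mistakes when the predictor names are generated, for example from dummy columns.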


Popular posts from this blog

HackRF on Windows 8

This technical note is based on an extract from a thread. I have made several changes and added recommendations. I have experienced a lot of latency using GnuRadio and HackRF on Pentoo Linux, so I wanted to try out GnuRadio on Windows.

HackRF One is a transceiver, so besides its SDR receive capabilities it can also transmit signals, including sweeps over a given range and uniform and Gaussian signals. Pentoo Linux provides the most direct access to HackRF and its toolboxes. Install Pentoo Linux on a separate drive; you can then use osmocom_siggen from a terminal to transmit signals such as near-field GSM bursts, which will only be detectable within a meter.

Installation of MinGW and CMake: Download and install the following packages:
- MinGW Setup (Go to the Installer directory and download setup file)
- CMake (I am using CMake 3.2.2 and I installed it in C:\CMake, this path is important in the commands we must send in the MinGW shell)
Download and extract the packages respectively in the path C:\MinGW\msys\…

Example: Beeswarm plot in R

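library(foreign)   # read.dta() for Stata files

data <- read.dta("C:/Users/hellmund/Documents/MyStataDataFile.dta")
names(data)

# install.packages("beeswarm")   # run once if the package is not installed
library(beeswarm)

levels(data$group)

png(file = "C:/Users/hellmund/Documents/il6.png", bg = "transparent")

# One point per observation, coloured by Gender (factor codes 1, 2, ...)
beeswarm(il6 ~ group, data = data, method = "swarm", pch = 16,
         pwcol = as.integer(data$Gender), xlab = "", ylab = "il6", ylim = c(0, 20))

# Legend colours follow the same factor codes as pwcol
legend("topright", legend = levels(data$Gender), title = "Gender", pch = 16,
       col = seq_along(levels(data$Gender)))

# Overlay semi-transparent boxplots; the three empty names assume three groups
boxplot(il6 ~ group, data = data, add = TRUE, names = c("", "", ""), col = "#0000ff22")

dev.off()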

RPITX - a transmitter along frequencies from 130kHz to 500MHz

Great news. Raspberry Pi can now emit RF signals over a much wider range than previously publicized. It is three years since a team at the Imperial College London robotics society wrote PiFM, an FM transmitter written in C and Python for the Raspberry Pi. PiFM enabled even novice Raspberry Pi users to set up an FM transmitter. RPITX by Evariste Okcestbon sets a new standard and may see a wider range of applications than PiFM. By design it is both more versatile and closer to the needs of a radio amateur. It can transmit at lower frequencies than even the HackRF, as it covers frequencies between 130kHz and 1MHz, though it cannot transmit above 500MHz at this point. Innovations in electronics are still possible on Raspberry Pi, a modest platform for the auteur with a modest budget and a deep understanding of a well-documented interface and architecture. RPITX is available from GitHub. A third argument to the open function (declared in fcntl.h) was missing in several source files (9th November …