### Logistic regression with Cox PH procedures

Illustration of the Stata stcox command used for logistic regression estimation.

Results are compared to output from the logit procedure.

Background:

In SAS, the ties=discrete option in the model statement of PROC PHREG can be used for moderately sized logistic regression analyses or, more generally, for reasonably sized data sets when time is truly discrete.
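Why this works can be sketched as follows (the notation is mine, not taken from the SAS or Stata documentation). If every subject is given the same artificial time, all $d$ events are tied at that time and the risk set $R$ is the whole sample. The exact (discrete) partial likelihood is then the probability that the observed set $D$ of events is the one that occurred, given that $d$ subjects in $R$ had an event:

$$
L(\beta) = \frac{\exp\left(\sum_{i \in D} x_i^\top \beta\right)}{\sum_{S \subseteq R,\ |S| = d} \exp\left(\sum_{j \in S} x_j^\top \beta\right)}
$$

This is the conditional logistic likelihood with a single stratum. It conditions on the number of events instead of estimating an intercept, so the estimates should be close to, but not identical with, those from an ordinary (unconditional) logistic regression.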

The method is illustrated on sas.com for a conditional logistic regression, which adds a strata statement; in Stata this corresponds to naming the stratification variable in the strata() option of stcox.
See the example on support.sas.com.

"Extra memory is needed for certain TIES= options. Let k be the maximum multiplicity of tied times.
The TIES=DISCRETE option requires extra memory (in bytes) of
4k*(p^2+4p)"

Here k is the maximum multiplicity of tied times and p is the number of predictors (explanatory variables).
Source of the quotation: sas.com.
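To get a feeling for the size of this memory requirement, here is a minimal Python sketch that evaluates the quoted formula. The function name and the example values are hypothetical and only illustrate the arithmetic. Note that when a logistic regression is fitted this way, all events share one time, so k would be the total number of events.

```python
def ties_discrete_extra_bytes(k: int, p: int) -> int:
    """Extra memory in bytes according to the quoted formula 4k*(p^2+4p).

    k: maximum multiplicity of tied times
    p: number of explanatory variables
    """
    if k < 0 or p < 0:
        raise ValueError("k and p must be non-negative")
    return 4 * k * (p ** 2 + 4 * p)


# Hypothetical example: 100 tied events and 5 explanatory variables.
# 4 * 100 * (25 + 20) = 18000 bytes
print(ties_discrete_extra_bytes(100, 5))
```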

The equivalent option in Stata's stcox is exactp, which, however, cannot be combined with the vce(cluster) option. Can we mend this with the _robust command?

Stata code follows

*Demonstration
use http://www.ats.ucla.edu/stat/stata/dae/binary.dta, clear

*Benchmark estimates with the logit command

logit admit gre, or

/* Result
------------------------------------------------------------------------------
       admit | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   1.003589   .0009895     3.63   0.000     1.001651     1.00553
------------------------------------------------------------------------------
*/

logit admit gre gpa i.rank, or

/*Result
------------------------------------------------------------------------------
       admit | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   1.002267   .0010965     2.07   0.038      1.00012    1.004418
         gpa |   2.234545   .7414652     2.42   0.015     1.166122    4.281877
             |
        rank |
          2  |   .5089309   .1610714    -2.13   0.033     .2736922    .9463578
          3  |   .2617923   .0903986    -3.88   0.000     .1330551    .5150889
          4  |   .2119375   .0885542    -3.71   0.000     .0934435    .4806919
------------------------------------------------------------------------------
*/

logit admit gre, or vce(cluster rank)

/*Result
                                   (Std. Err. adjusted for 4 clusters in rank)
------------------------------------------------------------------------------
             |               Robust
       admit | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   1.003589   .0005782     6.22   0.000     1.002456    1.004723
------------------------------------------------------------------------------
*/

*Now similar logistic regressions with stcox

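capture stset, clear
*Assumption: binary.dta has no time variable, so create one artificial time shared by all subjects
generate byte time = 1
stset time, failure(admit==1) origin(time 0)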
stcox gre, exactp

/*Result
------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |    1.00358   .0009882     3.63   0.000     1.001644    1.005518
------------------------------------------------------------------------------
*/

stcox gre gpa i.rank, exactp

/*Result
------------------------------------------------------------------------------
          _t | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   1.002261    .001095     2.07   0.039     1.000117     1.00441
         gpa |   2.229813   .7389021     2.42   0.016     1.164668    4.269082
             |
        rank |
          2  |   .5099792   .1611582    -2.13   0.033     .2745141    .9474148
          3  |   .2627833   .0906066    -3.88   0.000     .1336925    .5165218
          4  |   .2128309   .0888118    -3.71   0.000     .0939374     .482204
------------------------------------------------------------------------------
*/

*Note: it is unwise to rearrange the data and 'fix' clustered data through the stset command, because of memory limitations. A proper solution would use postestimation, which would also make comparison with the unadjusted s.e.'s easy.

### Age/correct century from the CPR number

Most people who work with registers or databases often face the problem that age is not recorded while the CPR number is known. How do you calculate it?

The following rule applies: if the seventh digit is 0, 1, 2 or 3, the person was born in the 20th century (the 1900s). The same holds if the seventh digit is 4 or 9 and the year (fifth and sixth digits) is greater than or equal to 37. Finally, the person was born in the 19th century (the 1800s) if the seventh digit is 5, 6, 7 or 8 and the year is greater than or equal to 58. In all remaining cases the person was born in the 21st century (the 2000s).

Below you will find an example in SAS code: a small macro that, besides the date of birth, also computes gender and the exact age given a date variable.

Source: Opbygning af CPR nummeret, cpr.dk

proc format library=work;
  value gender
    0="Female"
    1="Male"
  ;
run;

%macro agefromCPR(cpr,datevar=inddto,birthvar=birth,agevar=age);
  dy_temp=input(substrn(&cpr,1,2),2.);
  mt_temp=input(substrn(&cpr,3,2),2.);
  yr_temp=input(substrn(&cpr,5,2),
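The century rule can also be sketched in a few lines of Python. This is not a translation of the SAS macro, which is cut off above. It is a minimal illustration under these assumptions: the function names are hypothetical, the CPR number is passed as a string of the form DDMMYYSSSS (an optional hyphen is ignored), and the 2000s branch for the remaining combinations is my reading of the rule.

```python
from datetime import date


def birth_date_from_cpr(cpr: str) -> date:
    """Return the date of birth encoded in a 10-digit CPR number."""
    digits = cpr.replace("-", "")
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError("expected a 10-digit CPR number")
    day, month, yy = int(digits[0:2]), int(digits[2:4]), int(digits[4:6])
    seventh = int(digits[6])

    if seventh <= 3:
        century = 1900
    elif seventh in (4, 9):
        century = 1900 if yy >= 37 else 2000  # 2000 is the assumed fallback
    else:  # seventh digit is 5, 6, 7 or 8
        century = 1800 if yy >= 58 else 2000  # 2000 is the assumed fallback
    return date(century + yy, month, day)


def age_on(cpr: str, when: date) -> int:
    """Age in completed years on the date `when`."""
    born = birth_date_from_cpr(cpr)
    # Subtract one year if the birthday has not yet occurred in `when`'s year.
    return when.year - born.year - ((when.month, when.day) < (born.month, born.day))
```

For example, with made-up numbers, `birth_date_from_cpr("0101404000")` falls in 1940, while `birth_date_from_cpr("0101054000")` falls in 2005.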

### Comorbidity indexes in SQL

Generating the Elixhauser comorbidity index from the Danish National Health Register stored as a relational database (ICD-10 coding in SAS).

A lookup-table based version of the Charlson comorbidity index that I made in SQL. A similar approach can be applied to Elixhauser.

SELECT V_CPR,
  MAX(EI1)+MAX(EI2)+MAX(EI3)+MAX(EI4)+MAX(EI5)+
  MAX(EI6)+MAX(EI7)+MAX(EI8)+MAX(EI9)+MAX(EI10)+
  MAX(EI11)+MAX(EI12)+MAX(EI13)+MAX(EI14)+MAX(EI15)+
  MAX(EI16)+MAX(EI17)+MAX(EI18)+MAX(EI19)+MAX(EI20)+
  MAX(EI21)+MAX(EI22)+MAX(EI23)+MAX(EI24)+MAX(EI25)+
  MAX(EI26)+MAX(EI27)+MAX(EI28)+MAX(EI29)+MAX(EI30)+MAX(EI31) AS Elixhauser
FROM (SELECT V_CPR,
  -- Congestive Heart Failure
  CASE WHEN DIAG LIKE 'DI099%' OR DIAG LIKE 'DI110%' OR DIAG LIKE 'DI130%'
    OR DIAG LIKE 'DI132%' OR DIAG LIKE 'DI255%' OR DIAG LIKE 'DI420%'
    OR DIAG LIKE 'DI425%' OR DIAG LIKE 'DI426%' OR DIAG LIKE 'DI427%'
    OR DIAG LIKE 'DI428%' OR DIAG LIKE 'DI429%' OR D
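The pattern in the query, one 0/1 flag per comorbidity category, MAX per patient so that repeated diagnoses count once, and a sum of the maxima, can be tried on toy data. The sketch below uses Python's built-in sqlite3 module. Since the query above is cut off, the `THEN 1 ELSE 0` arms, the `GROUP BY`, the table name `diagnoses`, the patient ids and the placeholder code `DX999` are all my assumptions. Only `V_CPR`, `DIAG` and the two congestive heart failure prefixes come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diagnoses (V_CPR TEXT, DIAG TEXT)")
conn.executemany(
    "INSERT INTO diagnoses VALUES (?, ?)",
    [
        ("p1", "DI110"),   # matches the first flag
        ("p1", "DI110A"),  # same category again; must not count twice
        ("p1", "DX999"),   # matches the hypothetical second flag
        ("p2", "DZ000"),   # matches nothing
    ],
)

query = """
SELECT V_CPR, MAX(EI1) + MAX(EI2) AS score
FROM (SELECT V_CPR,
        CASE WHEN DIAG LIKE 'DI099%' OR DIAG LIKE 'DI110%'
             THEN 1 ELSE 0 END AS EI1,
        CASE WHEN DIAG LIKE 'DX999%'
             THEN 1 ELSE 0 END AS EI2
      FROM diagnoses) AS flags
GROUP BY V_CPR
ORDER BY V_CPR
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('p1', 2), ('p2', 0)]
```

Taking MAX before summing is what keeps a patient with several diagnoses in the same category from being counted more than once for that category.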