Extended transmission disequilibrium test
Dave Curtis and Pak Sham, July 1995.
Sham PC and Curtis D. An extended transmission/disequilibrium test (TDT) for multi-allele marker loci. Ann Hum Genet, 1995.
The transmission disequilibrium test
[Figure: the A allele is transmitted to affected offspring four times out of five.]
Spielman RS, McGinnis RE, Ewens WJ. Transmission disequilibrium test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet, 1993, 52: 506-516.
The transmission disequilibrium test was proposed by Spielman et al. as a robust test for association that arises because two loci are tightly linked. To test whether a marker allele exhibits transmission disequilibrium with a disease, one examines the parents of affected subjects. If parents who are heterozygous for the allele transmit it to affected offspring on more than 50% of occasions, this is evidence for both linkage and linkage disequilibrium between the marker and disease loci.
Features of TDT

Possible drawbacks of TDT
The fact that TDT can deal conceptually with only one associated allele poses problems for modern multiallelic markers. If one tests each allele in turn against the rest then one must introduce a correction for multiple testing. Additionally, if two alleles are both associated with the disease then each may serve to mask the other and true association may be missed. 
The extended transmission disequilibrium test
If a marker has multiple alleles then each may have a certain degree of association with the susceptibility allele at the disease locus. Information pertinent to transmission disequilibrium to affected subjects may be summarised in a table such as this:

                              Transmitted
                          1     2     3     4
                    1     -     4     4     2
  Not transmitted   2    13     -    15     5
                    3     9    12     -     3
                    4     6     3     6     -

Here the entry in each cell is the number of times a heterozygous parent transmitted the allele corresponding to the cell column to an affected offspring, while not transmitting the allele corresponding to the cell row. Thus the 4 in the second column of the first row indicates that 4 parents with genotype 1/2 transmitted allele 2 to an affected offspring, while the 13 in the first column of the second row indicates that 13 parents with genotype 1/2 transmitted allele 1 instead. If there were no transmission disequilibrium we would expect elements that mirror each other across the diagonal (such as this 4 and 13) to be equal, apart from sampling variation.
Consideration of individual alleles
If we wished to consider alleles individually then the information in the table could be summarised as follows:

                       1     2     3     4
  Transmitted         28    19    25    10
  Not transmitted     10    33    24    15

These totals are obtained simply by summing over the columns and rows of the original table. However, although this format may be useful for examining the behaviour of individual alleles, one cannot analyse the table as a whole because each parent is counted twice (once for the allele transmitted and once for the allele not transmitted).
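The summation described above can be sketched in a few lines of Python. The variable names are illustrative only and are not taken from the ETDT program; the counts are those of the example table, with the empty diagonal entered as 0.

```python
# Rows: allele NOT transmitted; columns: allele transmitted (alleles 1..4).
# The diagonal is 0 because only heterozygous parents are counted.
table = [
    [0, 4, 4, 2],
    [13, 0, 15, 5],
    [9, 12, 0, 3],
    [6, 3, 6, 0],
]

# Column sums: how often each allele was transmitted.
transmitted = [sum(row[j] for row in table) for j in range(4)]
# Row sums: how often each allele was not transmitted.
not_transmitted = [sum(row) for row in table]

print("Transmitted:    ", transmitted)
print("Not transmitted:", not_transmitted)
```

Both lists add up to the same total because every heterozygous parent contributes one transmitted and one untransmitted allele, which is the double counting mentioned above.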
A model-fitting approach
An alternative way of displaying the information in the original table would be as follows:

  1  2    17    13
  1  3    13     9
  1  4     8     6
  2  3    27    12
  2  4     8     3
  3  4     9     6

Here, the first two columns indicate the genotypes of parents of affected subjects, the third column indicates the number of times that the genotype occurs and the fourth column indicates the number of times the first allele is transmitted to the affected subject. In terms of logistic regression analysis, the third column denotes the number of "trials" and the fourth column the number of "successes".
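As a minimal sketch, the genotype rows can be derived from the transmitted/not-transmitted table as follows. The layout of `table` (rows for the allele not transmitted, columns for the allele transmitted) and all names are assumptions made for this illustration.

```python
from itertools import combinations

# Rows: allele NOT transmitted; columns: allele transmitted (alleles 1..4).
table = [
    [0, 4, 4, 2],
    [13, 0, 15, 5],
    [9, 12, 0, 3],
    [6, 3, 6, 0],
]

rows = []
for i, j in combinations(range(4), 2):  # every heterozygous genotype, i < j
    first_transmitted = table[j][i]     # allele i passed on, allele j not
    second_transmitted = table[i][j]    # allele j passed on, allele i not
    trials = first_transmitted + second_transmitted
    rows.append((i + 1, j + 1, trials, first_transmitted))

for row in rows:
    print(*row)
```

Each output row is (first allele, second allele, trials, successes), in the same order as the table above.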
Genotype-wise analysis
One test for transmission disequilibrium would be to allow a separate parameter to denote the probability of a "success" for each of the observed heterozygous parental genotypes, that is the probability of transmitting the first allele of the genotype, and to examine whether these probabilities differ from 50%. This would yield a likelihood ratio test based on comparing the likelihood maximised over separate transmission probabilities for each genotype with the likelihood assuming all these probabilities are 50% (the null hypothesis, H0). We will refer to the former as a saturated model, and denote it as H2. To compare H2 to H0, we take twice the natural logarithm of this likelihood ratio, 2ln(LR), and treat this as a chi-squared statistic with degrees of freedom equal to the number of observed heterozygous parental genotypes. If all possible parental genotypes occur, then for a marker with m alleles the chi-squared statistic will have m(m-1)/2 degrees of freedom.
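The genotype-wise statistic can be sketched directly from the trials and successes of the example. This is an illustrative calculation, not the ETDT program's code; it assumes the log likelihoods omit the constant binomial coefficients, which cancel in the ratio.

```python
import math

# (trials, successes) for genotypes 1/2, 1/3, 1/4, 2/3, 2/4, 3/4.
data = [(17, 13), (13, 9), (8, 6), (27, 12), (8, 3), (9, 6)]

def binom_loglik(n, s, p):
    """Log likelihood of s successes in n trials, without the binomial coefficient."""
    ll = 0.0
    if s > 0:
        ll += s * math.log(p)
    if n - s > 0:
        ll += (n - s) * math.log(1.0 - p)
    return ll

# H0: every transmission probability is 0.5.
L0 = sum(binom_loglik(n, s, 0.5) for n, s in data)
# H2 (saturated): each genotype has its own probability, estimated as s / n.
L2 = sum(binom_loglik(n, s, s / n) for n, s in data)

chi2 = 2.0 * (L2 - L0)
df = len(data)  # one parameter per observed heterozygous genotype

print(f"L0 = {L0:.6f}, L2 = {L2:.6f}, chi-squared = {chi2:.6f}, df = {df}")
```

With four alleles and all six heterozygous genotypes observed, df equals m(m-1)/2 = 6.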
Allele-wise analysis
If we consider each genotype individually we risk losing potentially valuable information. In the example shown, allele 1 is transmitted preferentially with respect to alleles 2, 3 and 4, but a genotype-wise analysis would not take any account of this. Instead, we may attempt to fit a more parsimonious model to the observed data, such that each allele has its own parameter which reflects the extent to which it is associated with the disease allele. Consider a situation where a marker is in linkage disequilibrium with a disease, such that for each pair of marker alleles $i$ and $j$ the log odds for allele $i$ to be transmitted from a parent with genotype $G_{ij}$ is given by:

$$\ln\left[\frac{P(T_i \mid G_{ij})}{P(T_j \mid G_{ij})}\right] = B_i - B_j \qquad (1)$$

Here $B_i$ and $B_j$ are simply parameters which pertain to the alleles and which tell us something about the extent to which each allele is associated with the disease allele. These allow the application of a standard logistic regression analysis to the data shown above. Feeding the number of "trials" (parental genotypes) and "successes" (first allele transmitted) into a standard logistic regression package allows maximum-likelihood estimation of the allele-specific parameters B. Using these parameters provides a more parsimonious test. We can term the hypothesis that each allele is associated (positively or negatively) with the disease to a certain extent as H1. To carry out an allele-wise test for transmission disequilibrium, we compare the likelihood maximised over these allele-specific parameters to the likelihood when all parameters are equal (H0), and then 2ln(LR) will yield a chi-squared statistic with m-1 degrees of freedom.
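The text describes fitting the allele parameters with a standard logistic regression package. The sketch below takes a different route to the same maximum-likelihood estimates: a model in which the log odds equal B_i - B_j has the form of a Bradley-Terry paired-comparison model, so a simple fixed-point (MM) iteration on exp(B_i) can be used with only the standard library. This is an illustrative substitute rather than the method used by the ETDT program, and all names are hypothetical. The last allele's parameter is fixed at 0.

```python
import math

# (first allele, second allele, trials, successes) from the example data.
data = [(1, 2, 17, 13), (1, 3, 13, 9), (1, 4, 8, 6),
        (2, 3, 27, 12), (2, 4, 8, 3), (3, 4, 9, 6)]
m = 4

wins = [0] * m                    # times each allele was transmitted
n = [[0] * m for _ in range(m)]   # heterozygous parents per allele pair
for i, j, trials, successes in data:
    wins[i - 1] += successes
    wins[j - 1] += trials - successes
    n[i - 1][j - 1] = n[j - 1][i - 1] = trials

# MM update on pi_i = exp(B_i):
#   pi_i <- wins_i / sum_j [ n_ij / (pi_i + pi_j) ]
pi = [1.0] * m
for _ in range(10000):
    new = [wins[i] / sum(n[i][j] / (pi[i] + pi[j]) for j in range(m) if j != i)
           for i in range(m)]
    new = [v / new[-1] for v in new]  # rescale so that the last B is 0
    change = max(abs(a - b) for a, b in zip(new, pi))
    pi = new
    if change < 1e-12:
        break

B = [math.log(v) for v in pi]

def p_first(i, j):
    """Fitted probability that allele i, not allele j, is transmitted."""
    return pi[i - 1] / (pi[i - 1] + pi[j - 1])

L1 = sum(s * math.log(p_first(i, j)) + (t - s) * math.log(1.0 - p_first(i, j))
         for i, j, t, s in data)
L0 = sum(t for _, _, t, _ in data) * math.log(0.5)
chi2 = 2.0 * (L1 - L0)
df = m - 1

print("B =", [round(b, 4) for b in B])
print(f"L1 = {L1:.4f}, chi-squared = {chi2:.4f}, df = {df}")
```

The iteration requires every allele to be transmitted at least once and the genotypes to connect all alleles, which holds for this example. The estimates should agree approximately with those of a logistic regression fit, since both maximise the same likelihood.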
Goodness of fit
We can examine how well the allele-wise model fits the data by comparing it to the model in which each genotype can have a separate transmission probability. If we compare the likelihoods under H2 and H1 then 2ln(LR) forms a chi-squared statistic with (m-1)(m-2)/2 degrees of freedom (assuming that all possible parental genotypes occur).
Output from multi-allele analysis
Below is a summary of the results obtained by analysing the example data above.
Results of fitting parameters:

  trials   successes   fitted
    17        13       12.974995
    13         9        9.010517
     8         6        6.001652
    27        12       11.123588
     8         3        3.858492
     9         6        5.136907

This section shows the predicted number of times the first allele would be transmitted (third column) compared to the observed number of times (second column). In this case, fitting one parameter for each allele produces a very close approximation to the observed data.

Fitted allele parameters with SE's:

               value        SE
  Allele 1:    1.101505    0.503740
  Allele 2:   -0.070830    0.463126
  Allele 3:    0.285143    0.459913

Correlation matrix of parameters:

  1.000000   0.639812   0.629211
  0.639813   1.000000   0.740540
  0.629212   0.740540   1.000000

These are the actual values of the fitted parameters, together with their correlations. The parameter for the last allele is arbitrarily fixed at 0.

  Log likelihood under null hypothesis: L0 = -56.838066
  Log likelihood under parsimonious (allele-wise) hypothesis: L1 = -51.785629
  Log likelihood using saturated (genotype-wise) model: L2 = -51.367027
  Chi-squared for allele-wise TDT = 2*(L1-L0) = 10.104874, 3 df, p = 0.017695
  Chi-squared for genotype-wise TDT = 2*(L2-L0) = 10.942078, 6 df, p = 0.090183
  Chi-squared for goodness-of-fit of allele-wise model = 2*(L2-L1) = 0.837204, 3 df, p = 0.840549

Comparisons of the likelihoods under H0, H1 and H2 using chi-squared tests show significant evidence for transmission disequilibrium using the allele-wise analysis. However, the genotype-wise analysis incorporates additional degrees of freedom and does not yield a much higher likelihood, so it is not statistically significant. The parsimonious allele-wise model is shown to fit the data well.
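The three likelihood ratio tests can be reproduced from the reported log likelihoods with a short sketch. The log likelihoods are entered as negative numbers (L0 equals 82 ln 0.5 for the 82 heterozygous parents). The helper `chi2_sf` is a hypothetical name written for this illustration; it computes the upper tail of the chi-squared distribution for integer degrees of freedom using only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

L0 = -56.838066   # null hypothesis
L1 = -51.785629   # allele-wise model
L2 = -51.367027   # saturated genotype-wise model
m = 4             # number of alleles

tests = {
    "allele-wise": (2 * (L1 - L0), m - 1),
    "genotype-wise": (2 * (L2 - L0), m * (m - 1) // 2),
    "goodness-of-fit": (2 * (L2 - L1), (m - 1) * (m - 2) // 2),
}
p_values = {name: chi2_sf(stat, df) for name, (stat, df) in tests.items()}

for name, (stat, df) in tests.items():
    print(f"{name}: chi-squared = {stat:.6f}, {df} df, p = {p_values[name]:.6f}")
```

The degrees of freedom follow the formulas given earlier, on the assumption that all six heterozygous genotypes are observed.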
Transmissions for individual alleles:

                    1        2        3        4
  Passed:          28       19       25       10
  Not passed:      10       33       24       15
  Chi-squared:   8.526    3.769    0.020    1.000
  p values:     0.0035   0.0522   0.8864   0.3173
  (these p values should be corrected for multiple testing)

Once the overall analysis has produced evidence for linkage disequilibrium between the loci, it may be helpful to examine which alleles appear to contribute most strongly to this. If one were simply to consider each allele separately without performing the overall analysis first, one would need to carry out a Bonferroni correction for a number of tests equal to the number of alleles. Here, the table helps focus attention on the fact that allele 1 appears to be preferentially transmitted over the other alleles.
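The per-allele figures are consistent with the usual single-allele TDT statistic, (b - c)^2 / (b + c) on 1 degree of freedom, where b and c are the numbers of times the allele was and was not passed on. The sketch below assumes that formula and adds a Bonferroni correction for the number of alleles; the names are illustrative.

```python
import math

passed = [28, 19, 25, 10]
not_passed = [10, 33, 24, 15]

results = []
for allele, (b, c) in enumerate(zip(passed, not_passed), start=1):
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of chi-squared with 1 df, expressed through erfc.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    # Bonferroni correction: multiply by the number of alleles tested.
    p_corrected = min(1.0, p * len(passed))
    results.append((allele, chi2, p, p_corrected))

for allele, chi2, p, p_corrected in results:
    print(f"Allele {allele}: chi-squared = {chi2:.3f}, "
          f"p = {p:.4f}, Bonferroni p = {p_corrected:.4f}")
```

After correction only allele 1 remains below the conventional 0.05 threshold, in line with the interpretation above.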
Dealing with missing parental genotypes
If one parent has not been genotyped, one may still use information from the other parent under certain conditions. The typed parent must be heterozygous, and obviously if the affected child has the same genotype then one cannot tell which allele has been transmitted. If the child is homozygous, then one can deduce which allele has been transmitted, but the pair must still be discarded from the analysis because otherwise one will introduce a bias in favour of commoner alleles. If a parent is missing, one should only incorporate the remaining parent-child pair if both are heterozygous and have different genotypes.
Curtis D and Sham PC. A note on the application of the transmission disequilibrium test when a parent is missing. Am J Hum Genet, 1995, 56: 811-812.
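The inclusion rule for a single typed parent can be written as a small function. This is a sketch of the rule as stated above, with a hypothetical function name and genotypes represented as pairs of allele labels; it also rejects pairs that share no allele, which would indicate a genotyping or pedigree inconsistency.

```python
def transmitted_allele(parent, child):
    """Return the allele the typed parent transmitted, or None if the
    parent-child pair should not be used when the other parent is missing."""
    p, c = set(parent), set(child)
    if len(p) != 2:
        return None  # typed parent must be heterozygous
    if len(c) != 2:
        return None  # homozygous child: discarded to avoid bias
    if p == c:
        return None  # same genotype: transmitted allele is ambiguous
    shared = p & c
    if len(shared) != 1:
        return None  # no allele in common: inconsistent pair
    return shared.pop()

print(transmitted_allele((1, 2), (1, 3)))  # usable pair, allele 1 transmitted
print(transmitted_allele((1, 2), (2, 1)))  # same genotype, not usable
print(transmitted_allele((1, 2), (1, 1)))  # homozygous child, not usable
```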
Comparison with Terwilliger's test
Joe Terwilliger has recently described another method which allows TDT analysis of multiallelic markers (Terwilliger JD. A powerful likelihood method for the analysis of linkage disequilibrium between trait loci and one or more polymorphic loci. Am J Hum Genet, 1995, 56: 777-787). His test makes an explicit assumption about the nature of the transmission disequilibrium, in particular that there is one associated allele and that the probability of each allele being the associated one is equal to its frequency in the general population. His test can be applied to a number of loci simultaneously to find a maximum-likelihood map position. The test described here makes no prior assumption about the nature of the association between the disease and marker alleles, and for example allows two or more alleles to be positively associated. On the other hand, the test described here can only be applied to one marker at a time.
Program availability
A simple DOS program to carry out all the analyses for the multi-allele TDT is available from John Attwood's ftp site at ftp.gene.ucl.ac.uk in /pub/packages/dcurtis, with filename etdt.zip.