Model-free linkage analysis

Dave Curtis and Pak Sham, August 1995.

Curtis D and Sham PC. Model-free linkage analysis using likelihoods. Am J Hum Genet, 1995.
Problems with linkage analysis

The lod score method uses all available information and is powerful, but it produces false negative lod scores if the transmission model is misspecified. This is especially the case at small recombination fractions and in multipoint analyses.

Attempts to reduce reliance on model specification
Developing a model-free method of analysis

We attempted to produce a new method of analysis with the following aims:
The general single locus linkage model

The situation of a biallelic susceptibility locus and its relation with a marker can be described using the following parameters:
Parameter estimation and hypothesis testing

1. Segregation analysis

The likelihood is maximised over the four transmission model parameters (f0, f1, f2 and q). Marker genotypes are to be ignored, so the recombination fraction between the disease and marker loci is fixed at 0.5.

Testing for the presence of a susceptibility locus:

LR = L(D | T) / L(D | T0)
   = L(D | f0,f1,f2,q) / L(D | f0=f1=f2 or q=0 or q=1)
   = L(D,M | f0,f1,f2,q,theta=0.5) / L(D,M | f0=f1=f2 or q=0 or q=1,theta=0.5)
Parameter estimation and hypothesis testing

2. Classical linkage analysis

The likelihood is maximised over different values of theta, assuming a fixed transmission model.

Testing for linkage:

LR = L(D,M | theta<0.5,{q,F}) / L(D,M | theta=0.5,{q,F})
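As an illustration of maximising a lod score over theta, here is a minimal sketch for the simplest textbook situation: a set of phase-known, fully informative meioses in which recombinants can be counted directly. This is a simplification chosen for illustration only; real pedigree likelihoods under a transmission model are computed by programs such as MLINK. The function names and the grid search are assumptions, not part of the method described here.

```python
import math

def lod_phase_known(theta, n_meioses, n_recombinants):
    """Lod score log10[L(theta) / L(0.5)] for counted recombinants.

    Assumes phase-known, fully informative meioses (hypothetical toy case).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = n_recombinants, n_meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

def max_lod(n_meioses, n_recombinants, steps=500):
    """Grid search over theta in (0, 0.5]; returns (best theta, maximum lod)."""
    grid = [0.5 * i / steps for i in range(1, steps + 1)]
    best = max(grid, key=lambda t: lod_phase_known(t, n_meioses, n_recombinants))
    return best, lod_phase_known(best, n_meioses, n_recombinants)

if __name__ == "__main__":
    theta_hat, lod = max_lod(10, 1)
    print(f"theta = {theta_hat:.3f}, lod = {lod:.3f}")
```

With one recombinant in ten meioses the maximum falls near theta = 0.1, the observed recombinant fraction, as expected for this toy likelihood.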
Parameter estimation and hypothesis testing

3. Linkage analysis incorporating admixture

Classical linkage analysis can be extended to allow for locus heterogeneity by allowing alpha, the proportion of families in which the disease locus is linked to the marker, to be less than 1.

Testing for linkage with admixture:

LR = L(D,M | theta<0.5,alpha>0,{q,F}) / L(D,M | theta=0.5 or alpha=0,{q,F})
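The admixture likelihood can be sketched as follows, assuming the standard mixture form in which each family's likelihood is alpha times its likelihood under linkage at the chosen theta plus (1 - alpha) times its likelihood at theta = 0.5. Expressed in lod units, each family contributes log10(alpha * 10^lod_i + 1 - alpha). The per-family lod scores used below are made-up inputs, and the function names and grid search over alpha are illustrative assumptions.

```python
import math

def admixture_lod(alpha, family_lods):
    """Total lod under admixture, given per-family lods at a fixed theta."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return sum(math.log10(alpha * 10.0 ** z + (1.0 - alpha)) for z in family_lods)

def maximise_alpha(family_lods, steps=100):
    """Grid search over alpha in [0, 1]; returns (best alpha, maximum lod)."""
    grid = [i / steps for i in range(steps + 1)]
    best = max(grid, key=lambda a: admixture_lod(a, family_lods))
    return best, admixture_lod(best, family_lods)

if __name__ == "__main__":
    # Hypothetical per-family lod scores at one value of theta.
    lods = [1.0, -2.0]
    alpha_hat, hlod = maximise_alpha(lods)
    print(f"alpha = {alpha_hat:.2f}, admixture lod = {hlod:.3f}")
```

At alpha = 1 the function reduces to the ordinary summed lod score, and at alpha = 0 it is zero, so the maximised value can never be negative.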
Parameter estimation and hypothesis testing

4. Testing for an effect on susceptibility at a given position

To test a specified position, we fix theta and use alpha as the sole parameter with which to test for linkage. To make the test model-free, we no longer fix the transmission model parameters in advance; they are maximised over separately in the numerator and the denominator.

Testing for a susceptibility locus at position theta=t:

LR = L(D,M | alpha>0,theta=t,q,F) / L(D,M | alpha=0,theta=t,q,F)

Because there is one more free parameter in the numerator than in the denominator, log(LR) provides a test with one degree of freedom and is comparable with a standard lod score.
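To show how such a statistic might be put on a familiar scale, the sketch below converts two maximised natural-log likelihoods into a lod score and then into an approximate p-value. It assumes the usual large-sample result that 2 ln(LR) follows a chi-square distribution with one degree of freedom, and it ignores any effect of alpha lying on the boundary of its range under the null hypothesis. The function names are hypothetical.

```python
import math

def lod_from_log_likelihoods(loglik_alt, loglik_null):
    """Convert natural-log likelihoods into a lod score, log10(LR)."""
    return (loglik_alt - loglik_null) / math.log(10.0)

def chi2_1df_p_value(lod):
    """Approximate p-value, treating 2 ln(LR) as chi-square with 1 df.

    For one degree of freedom the upper tail probability of x
    equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * math.log(10.0) * max(lod, 0.0)
    return math.erfc(math.sqrt(stat / 2.0))

if __name__ == "__main__":
    lod = lod_from_log_likelihoods(math.log(1000.0), 0.0)
    print(f"lod = {lod:.2f}, approximate p = {chi2_1df_p_value(lod):.2e}")
```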
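Constraining transmission model parameters

As described, the test may have no power to detect linkage. If only affected sib pairs are used, the transmission model parameters will take values such as f2=1, q=1 or f0=f1=f2=1. Under such a model every individual is affected with certainty, so the phenotypes are fitted perfectly whatever the marker data show, and the numerator and denominator of the likelihood ratio are equal. In other samples this effect may be less extreme but may still reduce the power of the method.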
Constraining to produce correct prevalence

To reduce the effect of selection bias, which may yield unrealistic estimates of the transmission model parameters, we can constrain those parameters to yield the correct population prevalence for the disease, K. This constraint can be denoted [q,F]. The test for linkage can then be written as follows:

LR = L(D,M | alpha>0,theta=t,[q,F]) / L(D,M | alpha=0,theta=t,[q,F])

If we impose the additional constraint f0<=f1<=f2 then we can draw a polyhedron which encloses all possible values for F. This polyhedron has a vertex at the point (K,K,K), corresponding to T0, which models the locus having no effect on susceptibility. At all other points there is a single value for q which produces the correct value of K, so that q becomes a function of F.
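The dependence of q on F can be sketched numerically. The example below assumes Hardy-Weinberg genotype frequencies and takes f0, f1 and f2 to be the penetrances for zero, one and two copies of the susceptibility allele, whose frequency is q, so that K = (1-q)^2 f0 + 2q(1-q) f1 + q^2 f2. Under f0<=f1<=f2 this expression does not decrease as q increases, so a simple bisection finds the single solution. The function names and the use of bisection are illustrative assumptions rather than a description of how MFLINK does this.

```python
def prevalence(q, f0, f1, f2):
    """Population prevalence under Hardy-Weinberg proportions (assumed)."""
    p = 1.0 - q
    return p * p * f0 + 2.0 * p * q * f1 + q * q * f2

def allele_freq_for_prevalence(K, f0, f1, f2, tol=1e-12):
    """Find the q in [0, 1] that gives prevalence K for penetrances F."""
    if not f0 <= f1 <= f2:
        raise ValueError("requires f0 <= f1 <= f2")
    if f0 == f2:
        raise ValueError("null effect model: q is not determined by K")
    if not f0 <= K <= f2:
        raise ValueError("K must lie between f0 and f2")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prevalence(mid, f0, f1, f2) < K:
            lo = mid  # prevalence too low: need a more common allele
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Mendelian recessive model with K = 0.01: q^2 = 0.01.
    print(allele_freq_for_prevalence(0.01, 0.0, 0.0, 1.0))
```

For the Mendelian recessive model the constraint reduces to q^2 = K, and for the Mendelian dominant model to 1 - (1-q)^2 = K, which gives a quick check on the numerical result.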
Further constraints on model parameters

To make the procedure less computationally demanding, one may restrict consideration to a smaller subset of models: those represented in the figure by the dotted lines joining the Mendelian recessive model, at (0,0,1), through the null effect model, at (K,K,K), to the Mendelian dominant model, at (0,1,1). It can be seen that these lines pass close to most points within the allowable volume. If only the models on these lines are used, then f0 and f2 both become functions of f1 and the transmission model is completely defined by the choice of f1.
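One way to picture this one-parameter family is sketched below. It assumes that the dotted lines are straight segments in (f0,f1,f2) space, one from (0,0,1) to (K,K,K) and one from (K,K,K) to (0,1,1); that reading of the figure is an assumption, and the function name is hypothetical.

```python
def model_from_f1(f1, K):
    """Return (f0, f1, f2) on the assumed straight-line segments.

    For f1 <= K the point lies between the recessive model (0,0,1) and
    the null effect model (K,K,K); for f1 > K it lies between (K,K,K)
    and the dominant model (0,1,1).
    """
    if not 0.0 < K < 1.0:
        raise ValueError("K must lie strictly between 0 and 1")
    if not 0.0 <= f1 <= 1.0:
        raise ValueError("f1 must lie in [0, 1]")
    if f1 <= K:
        f0 = f1
        f2 = 1.0 - (f1 / K) * (1.0 - K)
    else:
        f2 = f1
        f0 = K * (1.0 - f1) / (1.0 - K)
    return f0, f1, f2

if __name__ == "__main__":
    K = 0.01
    for f1 in (0.0, 0.005, K, 0.5, 1.0):
        print(f1, model_from_f1(f1, K))
```

Every point produced this way satisfies f0<=f1<=f2, so for each one the allele frequency q that reproduces the prevalence K is determined as described in the previous section.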
Implementation

MFLINK is a program which automatically sets up appropriate data files and then calls MLINK to carry out the likelihood calculations. It will shortly be available from John Attwood's ftp site at ftp.gene.ucl.ac.uk in /pub/packages/dcurtis.
Evaluation

The method was applied to pedigree and sib pair data generated under a wide variety of transmission models. Performance was compared with an IBD sib pair analysis and with the lod score method using the correct transmission model. When applied to the affected sib pair data set, all three methods gave similar results. With the pedigree data, the methods sometimes gave similar results to each other, but for some transmission models the two likelihood-based methods were considerably more powerful than the sib pair method.