by R. K. Chan
Advances in DNA sequencing technology have made possible the rapid sequencing of whole genes, of hundreds of genes at a time (PMID 21228398), and even of the entire DNA sequence of an individual (PMID 20435227). Such large-scale sequencing across a population is likely to reveal many different rare variants of unknown significance. To make sense of this glut of genetic information, it is necessary to determine whether any of these variants have the potential to cause disease. Geneticists define a mutation as any variation or change in the sequence of DNA (PMID 18456578). Whether a mutation is neutral, deleterious, or even beneficial depends on its consequence, or phenotype. Mutations that impair the function of essential genes and cause disease are called pathogenic mutations. Diseases caused by a mutation in a single gene are called monogenic diseases.
To illustrate how pathogenic mutations are evaluated and identified, we will use the example of cystic fibrosis, a monogenic disease. Over 1800 variants in the cystic fibrosis transmembrane conductance regulator gene (CFTR) are now known. However, only the 23 mutations in the American College of Medical Genetics cystic fibrosis screening panel (PMID 15371902) and a few other mutations have been rigorously examined for their pathogenicity or disease-causing potential (PMID 18456578). Sosnay et al. (PMID 21547743) have suggested criteria for evaluating the pathogenicity of the remaining variants in the CFTR gene:
- For an autosomal recessive disease, people with two copies of the defective gene should show the disease.
- A pathogenic mutation should always be associated with the disease.
- A pathogenic mutation should alter physiological and cellular function.
- A pathogenic mutation should impact protein structure or gene expression.
The first two criteria are the gold standard for establishing the pathogenicity of a high-penetrance mutation, that is, one for which every person with two copies of the mutation shows the disease. However, information about the association between the mutation and disease will not be available if only one copy of a new variant in a gene for an autosomal recessive disease has been identified. In such cases, evidence about the in vivo effects of the mutation and about its genetic nature is useful for evaluating its potential consequences. Now let us discuss these criteria in more detail.
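As a rough illustration of how the four criteria fit together, the checklist can be sketched as a small data structure. This is only a sketch under stated assumptions: the class name, field names, and summary labels are hypothetical and are not taken from Sosnay et al. or from any published scoring scheme.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantEvidence:
    """Evidence for one variant; None means no data is available.

    Field names are hypothetical labels for the four criteria above.
    """
    disease_with_two_copies: Optional[bool] = None        # criterion 1
    always_with_disease: Optional[bool] = None            # criterion 2
    alters_function: Optional[bool] = None                # criterion 3
    affects_protein_or_expression: Optional[bool] = None  # criterion 4


def summarize(ev: VariantEvidence) -> str:
    """Return an illustrative label; the labels are assumptions, not a standard."""
    clinical = (ev.disease_with_two_copies, ev.always_with_disease)
    supporting = (ev.alters_function, ev.affects_protein_or_expression)
    if all(v is True for v in clinical):
        return "strong clinical evidence"
    if any(v is False for v in clinical):
        return "conflicting clinical evidence"
    # Clinical data are missing or partial: fall back on criteria 3 and 4.
    if any(v is True for v in supporting):
        return "supporting evidence only"
    return "incomplete evidence"


print(summarize(VariantEvidence(True, True)))          # strong clinical evidence
print(summarize(VariantEvidence(alters_function=True)))  # supporting evidence only
```

The ordering mirrors the text: the two clinical criteria are checked first, and the functional and sequence-based criteria are consulted only when clinical information is missing.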
Must Both Copies of the Gene Be Defective?
Cystic fibrosis (CF) is an autosomal recessive disease, so the phenotype of a mutation can only be observed in a person with two defective copies of the CFTR gene. An example is an affected person with two copies of the common deltaF508 mutation, which is a well-known disease-causing mutation. In the case of a person with one copy of deltaF508 and one copy of a new variant, if clinical signs of CF are present then the variant is more likely to be disease-causing. It is also important to determine whether the individual carries additional mutations in the CFTR gene, which would complicate the interpretation of the variant's pathogenicity.
Is It Always Associated with the Disease?
Finding a variant in the CFTR gene in a person with CF is just the first step in proving that the variant causes CF. It is necessary to rule out the possibility that the variant is an innocent bystander with no deleterious effect on the CFTR gene and that the real causative mutation has been overlooked. This can be done by screening a healthy population to verify that the putative CF mutation is absent. If the putative CF mutation is found in a healthy population, there are two possibilities: either it is not disease-causing, or it has low penetrance and does not cause disease in every individual. In addition, high-penetrance mutations, which cause disease in every individual, should always be found in family members with the disease (PMID 10612817). Indeed, the presence of the same mutation in unrelated families with similar disease symptoms is good evidence for a causal relationship between the mutation and a disease (PMID 12610532).
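The reasoning about healthy populations and penetrance can be made concrete with a short sketch. It assumes that each screened person is recorded as a pair of flags: whether they are affected, and whether they have the at-risk genotype (for a recessive disease such as CF, that would mean two defective copies). The function name, record layout, and returned labels are hypothetical.

```python
def interpret_screen(people):
    """Interpret a screen of (affected, has_at_risk_genotype) pairs.

    Returns an illustrative label and, where it can be computed, the
    fraction of at-risk individuals who are affected (a crude penetrance
    estimate). Assumption: the caller decides what counts as "at risk".
    """
    affected_with = sum(1 for affected, at_risk in people if at_risk and affected)
    healthy_with = sum(1 for affected, at_risk in people if at_risk and not affected)

    total_with = affected_with + healthy_with
    if total_with == 0:
        return "variant not observed", None

    penetrance = affected_with / total_with
    if affected_with == 0:
        return "no association with disease observed", penetrance
    if healthy_with == 0:
        return "consistent with high penetrance", penetrance
    return "not disease-causing or low penetrance", penetrance


# Toy data: three affected people and one healthy person have the genotype.
toy = [(True, True), (True, True), (True, True), (False, True), (False, False)]
print(interpret_screen(toy))  # ('not disease-causing or low penetrance', 0.75)
```

A real analysis would also need sample sizes large enough to make absence from the healthy group meaningful, which this sketch does not attempt to assess.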
Does It Alter Physiological and Cellular Function?
Functional testing is useful for evaluating variants with no family genetic information. Cystic fibrosis is caused by defects in the CFTR gene, which codes for a chloride channel that regulates the salt content of fluids on the surface of the nose and lung as well as the salt content of sweat. Individuals with CF have more chloride in their sweat, and the changes in salt content also affect the electrical potential difference across the lining of the nose. Thus, functional testing of a CFTR variant would include determining the sweat chloride concentration and measuring the electrical potential difference across the lining of the nose of a person with that variant.
Does It Affect Protein Structure or Gene Expression?
In the absence of family genetic information or of functional information about the variant, one can always consider how the variant changes the DNA sequence of the gene. In an analysis of the mutations cataloged in the Human Gene Mutation Database, Botstein and Risch (PMID 12610532) concluded that the most frequent type of monogenic disease mutation (59% of the total) was an alteration in the sequence that coded for protein. For CFTR, some mutations are more likely to cause severe changes to the structure or expression of the CFTR protein and so are more likely to be pathogenic. Nonsense mutations cause production of a shorter-than-normal CFTR protein, while deletion mutations can remove essential parts of the CFTR protein. Both of these types of genetic changes are more likely to damage CFTR function and hence cause CF (PMID 21547743, PMID 12610532). Mutations deleting or modifying RNA splicing sites may also affect the resulting CFTR protein. In addition to changing the protein structure, mutations may affect the amount of protein made as well as the processing, folding, or cellular localization of the protein.
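To show what distinguishing these mutation types looks like in practice, here is a minimal sketch that compares a reference coding sequence with a variant one. It uses the standard genetic code; the short sequences are toy examples, not real CFTR sequence, and the function names and category labels are assumptions. It ignores splice-site and expression effects, which cannot be read from the coding sequence alone.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(ref_cds, alt_cds):
    """Give a coarse, illustrative label for how alt_cds differs from ref_cds."""
    if len(alt_cds) < len(ref_cds):
        return "deletion"
    if len(alt_cds) > len(ref_cds):
        return "insertion"
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if alt_protein == ref_protein:
        return "synonymous"
    if len(alt_protein) < len(ref_protein):
        return "nonsense"   # premature stop codon shortens the protein
    if len(alt_protein) > len(ref_protein):
        return "stop-loss"
    return "missense"


ref = "ATGGAATTTGGTTAA"                  # toy sequence: M E F G stop
print(classify(ref, "ATGGAAGGTTAA"))     # deletion (the TTT codon is removed)
print(classify(ref, "ATGTAATTTGGTTAA"))  # nonsense (GAA becomes TAA)
print(classify(ref, "ATGGATTTTGGTTAA"))  # missense (E becomes D)
```

The toy deletion removes a single phenylalanine codon, loosely echoing the kind of change described for deltaF508, but the sequence itself is invented for illustration.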
Finally, the location of the variant in the gene may give clues to its importance. For proteins that have been studied in several different species, the most critical features of a gene and its protein are indicated by those sequences that have been conserved in evolution. Thus, mutations in conserved regions of the gene are more likely to inactivate the protein and to be pathogenic (PMID 10612817, PMID 12610532).
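One simple way to look for conserved regions is to line up the same protein from several species and see which positions never change. The sketch below assumes the sequences have already been aligned to equal length (with "-" marking gaps); the function name and the three short sequences are hypothetical toy data, not real CFTR orthologs.

```python
def conserved_positions(aligned):
    """Return 0-based column indices where every aligned sequence agrees.

    Assumes pre-aligned sequences of equal length; gap columns ("-")
    are never reported as conserved.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    first = aligned[0]
    return [
        i for i in range(length)
        if first[i] != "-" and all(seq[i] == first[i] for seq in aligned)
    ]


# Toy alignment of one protein fragment from three hypothetical species.
species = ["MKTFGLA", "MKSFGLV", "MRTFGIA"]
print(conserved_positions(species))  # [0, 3, 4]
```

A variant falling at one of the reported positions would, by the argument above, deserve closer scrutiny than one at a position that differs between species.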
Pathway’s Process for Evaluating Mutations
We have used mutations in the cystic fibrosis CFTR gene to illustrate the process used to determine whether a variant is likely to be a pathogenic mutation. This is the same process that Pathway’s curation team uses when they scour the literature to identify mutations that can be added to Pathway’s carrier status genetic test. We look for evidence in the peer-reviewed literature concerning the clinical phenotype of people with two copies of the defective gene, the genetic linkage of the mutation with disease, the physiological and cellular defects in people with the mutation, and the predicted effects of the mutation on protein structure and gene expression. Only after evaluating the quality of this scientific evidence do we make recommendations on which mutations to test. This process is ongoing, as Pathway’s curation team is always checking the latest medical and genetic research for new information that can be used to update and improve the genetic tests that we offer.