Main Page
Cox Lab Projects
TO DO LIST:
LINKS OF INTEREST
CURRENT PROJECTS
Genome-Wide Association Methods
Methods (U01 HL084715) – link to all software for us and others from the project (Dan, Jonathan, Mary Sara, Mark, Carole, Craig) and to the ENDGAMe web site and wiki pages
Computation (like at bottom)
BioGen
Genome Wide Association (Papers)
NEW! Cookbook on HapMap & HaploView
Mesothelioma
Malignant mesothelioma is a form of cancer that develops in the mesothelium, a membranous tissue that lines body cavities including the pleura, peritoneum, and pericardium. Malignant mesothelioma causes approximately 3,000 deaths per year in the United States, with increasing incidence worldwide. Median survival of affected individuals is 9 months, and treatment prolongs survival by an average of 3 months. Environmental exposure to asbestos is the single most important contributing factor to risk of malignant mesothelioma in the industrialized world; however, less than 5% of heavily asbestos-exposed individuals develop malignant mesothelioma. A mineral fiber, erionite, is recognized as the most potent carcinogen in the pathogenesis of this disease, but erionite is present only in certain regions, such as Cappadocia, Turkey, and Nevada, Oregon, and California in the US. Thus, the effects of erionite are limited to specific geographic areas. Additionally, a virus, SV40, has a confirmed association with malignant mesothelioma, but the contribution of this virus to the pathogenesis of the disease remains controversial.
Our ongoing research falls into three projects, which collectively aim to elucidate the mechanisms of pathogenesis of malignant mesothelioma. We hypothesize that genetic predisposition and SV40 infection of mesothelial cells are two important factors in determining individual susceptibility to the environmental carcinogens asbestos and erionite. We are currently identifying genes that, together with erionite, have caused an epidemic of malignant mesothelioma in Cappadocia, and we will investigate whether the same genes cause familial malignant mesothelioma in American families. We will also determine whether these genes contribute to sporadic occurrence of the disease. In addition, we are testing the hypothesis that asbestos and SV40 are co-carcinogens in sporadic malignant mesothelioma in the Western world, and are working to identify the mechanisms of carcinogenesis and co-carcinogenesis. Mechanistically, we will focus our investigations on the AKT and MAPK (ERK) signaling cascades, and on the tumor suppressors NF2, p16(INK4a), and p14/p19(ARF), because preliminary studies have implicated these pathways as the targets of asbestos, erionite, and SV40 infection in carcinogenesis.
Status of samples sent to Vanderbilt.
GWA for Diabetes in Starr County Mexican-Americans
OBJECTIVE: To identify DNA polymorphisms associated with type 2 diabetes (T2D) in a Mexican American (MA) population.
RESEARCH DESIGN AND METHODS: We genotyped 116,204 single nucleotide polymorphisms (SNPs) in 281 MA with T2D and 280 random MA from Starr County, Texas, using the Affymetrix GeneChip® Human Mapping 100K Set. Allelic association exact tests were calculated. Our most significant SNPs were compared to results from other T2D genome-wide association studies (GWAS). To rule out spurious association due to population substructure, proportions of African, European, and Asian ancestry were estimated for each individual from the HapMap samples using the program structure.
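As a rough illustration of an allelic exact test, the sketch below computes a two-sided Fisher exact p-value for a 2x2 table of allele counts (cases vs. controls by allele). It is only a minimal example under stated assumptions: the function name `fisher_exact_two_sided` and the toy counts are hypothetical, and this is not necessarily the procedure or software used for the Starr County analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cases and controls; columns are counts of allele 1 and allele 2.
    (Hypothetical helper for illustration only.)
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x copies of allele 1 in cases,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


if __name__ == "__main__":
    # Toy allele counts, not real data.
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

With two alleles per person, each row total would be twice the number of individuals in that group.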
TUNA results for Mexican American data
- with CEU and ASN target samples, and r^2 = 0.8: ASN_CEU_MexAm.r2.0.8.xls
- with CEU and ASN target samples, and r^2 = 0.95: ASN_CEU_MexAm.r2.0.95.xls
GoKinD
The long-term objective of the proposed research is to identify the genetic variation contributing to risk of diabetic nephropathy and related quantitative phenotypes. Diabetic nephropathy is the most serious long-term complication of diabetes, accounting for about 40% of cases of end-stage renal disease in the U.S. The clustering of diabetic nephropathy in families and segregation analyses suggest a strong genetic component in susceptibility to diabetic kidney disease. However, genetic studies to date have had only modest success in identifying genetic variation that may affect risk. These previous studies generally involved relatively small numbers of cases and controls or trios and have probably been uniformly underpowered to detect genes with modest effect. In addition, they examined only a small number of genes, albeit plausible candidates. Here, we propose to carry out a genome-wide association study (GWA) to identify genetic variation affecting susceptibility to diabetic nephropathy and related phenotypes using a panel of > 300,000 single nucleotide polymorphisms (SNPs) in the Genetics of Kidneys in Diabetes (GoKinD) collection, one of the largest collections available for genetic studies of diabetic nephropathy. To this end, we propose:
- To genotype all probands and parents in the GoKinD collection (2,807 individuals) using the Affymetrix Genome-Wide Human SNP Array 5.0 platform.
- To conduct both standard and novel genetic analyses of the data to map genes associated with diabetic nephropathy and related phenotypes.
- To verify genotyping and carry out fine-mapping studies in genes or regions showing association with diabetic nephropathy, related quantitative phenotypes, and other measures of diabetic complications measured in the GoKinD samples.
We expect that identification of the genes and the specific genetic variants that contribute to the development of diabetic nephropathy will lead to new approaches for preventing and treating this common long-term complication of diabetes.
Stuttering
Specific Aims: Recent studies have suggested a significant sex-specific component to the genetic architecture of many complex phenotypes (Stone et al 2004; Weiss et al 2005; Cantor et al 2005; Weiss et al 2006). Linkage studies conducted on the largest cohort of families to undergo linkage mapping for the speech/language disorder of stuttering to date mapped only sex-specific linkage signals meeting genome-wide criteria for significance (Suresh et al 2006), although several regions had nominally significant evidence for linkage in the overall sample. Few linkage-based sex-specific signals have undergone sufficient follow up to lead to gene identification and an understanding of the nature of the sex specificity underlying the signal. We propose here to conduct fine mapping of three regions identified in our previous linkage mapping studies on stuttering. Sex-specific linkage analyses led to identification of a region on chromosome 7q with genome-wide significant evidence for linkage in males and to a region on chromosome 21 with genome-wide significant evidence for linkage in females. The third region to be examined (2q) has high priority for follow up because of its near-perfect overlap with a region implicated in studies of a language subphenotype of autism. Our specific aims are:
- to conduct fine mapping over these 3 regions (chromosomes 7, 21, and 2) in our cohort of 100 families of European descent with at least two individuals classified with the “ever stuttered” phenotype. We will use a map-based strategy to choose SNPs across the region defined by a 1-LOD confidence interval, with initial density of SNP genotyping decreasing with distance from the peak evidence for linkage in these regions.
- to investigate the regions on chromosomes 7 and 21 with the sex-specific evidence for linkage to stuttering with linkage and association studies in CEPH cell lines phenotyped for gene expression. We will utilize information from previously published studies of gene expression in CEPH families as well as genotype data appropriate for linkage mapping studies and HapMap data on CEPH lines from these families to maximize our ability to identify linkage and association signals related to sex-specific expression of genes in the two regions showing significant evidence for sex-specific linkage to stuttering.
- to genotype a second set of SNPs chosen from the 3 regions of interest. This second set of SNPs will be prioritized based on results of the studies conducted for specific aims 1 and 2, but does not require that signals of any particular threshold be obtained for either of the aims.
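For readers unfamiliar with the 1-LOD confidence interval mentioned in the first aim, here is a minimal sketch of how such an interval can be read off a linkage scan: take the peak LOD score and extend outward while the score stays within 1 LOD unit of the peak. The function name `one_lod_interval` and the toy scores are hypothetical; the sketch works on marker positions only (no interpolation between markers) and is not the lab's mapping pipeline.

```python
def one_lod_interval(positions, lods, drop=1.0):
    """Return (start, end) positions of the contiguous region around the
    peak LOD score where LOD >= peak - drop.

    positions: marker positions sorted along the chromosome
    lods:      LOD score at each marker (same length as positions)
    """
    if not positions or len(positions) != len(lods):
        raise ValueError("positions and lods must be non-empty and equal length")

    # Index of the marker with the highest LOD score.
    peak = max(range(len(lods)), key=lods.__getitem__)
    threshold = lods[peak] - drop

    # Walk left and right from the peak while scores stay above the threshold.
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[right]


if __name__ == "__main__":
    # Toy positions (cM) and LOD scores, not real data.
    print(one_lod_interval([0, 10, 20, 30, 40, 50],
                           [0.5, 2.1, 3.0, 2.4, 1.2, 2.5]))
```

Only the region contiguous with the peak is reported, so a secondary bump elsewhere on the chromosome that exceeds the threshold is not included.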
Stopping Stuttering on ABC7 news on August 16, 2007
UChicago GWA
P50 / CODICTUM Project
PROJECT DUE DATES:
NOW: Send out / post to CODICTUM wiki page document re. diagnosing stuttering;
submit biosketch;
fill in table numbers (talk with Nikki re. consent, N, phenotyping)
June 1: Good, rough draft of project/core write-ups + budgets posted to wiki page
June 9: Conference call
June 17: Reviews of others' drafts posted to wiki page
July 1: Revisions posted to wiki page
July 7: Conference call
Aug. 1: Review & polishing => final (?) draft
Link to the CODICTUM wiki page
Specific Language Impairment
Pharmacogenetics Projects
Genetic predictions of secondary leukemia
Heart failure in African-American (Pharmacogenetics)
Alternative splicing analysis
In addition to the differences between populations in transcriptional and translational regulation of genes, alternative pre-mRNA splicing is also likely to play an important role in regulating gene expression and generating variation in mRNA and protein isoforms. The differences in alternative splicing patterns between the full set of HapMap lymphoblastoid cell lines derived from individuals of European and African ancestry were evaluated using the Affymetrix GeneChip® Human Exon 1.0 ST Array. Certain biological processes, such as histone acetylation and nuclear mRNA splicing via spliceosome, were found to be enriched among the differential probesets or exons. The genetic contribution to the population differences in alternative splicing patterns was then evaluated by a genome-wide association analysis using the HapMap SNPs. The results suggest that local and distant genetic variants account for a substantial fraction of the observed differences in alternative splicing patterns between populations.
Exon chip expression data
Gene Expression Assessment. RNA from 87 CEU and 89 YRI cell lines was extracted, prepared, and hybridized to the Affymetrix GeneChip® Human Exon 1.0 ST Array per the manufacturer’s recommendation (see Affymetrix website for additional information). Hybridized arrays were washed and stained on a GeneChip® Fluidics Station 450, and scanned on a GCS3000 Scanner (Affymetrix, Inc.). Resulting probe signal intensities were sketch quantile normalized using a subset of the 1.4 million probe sets. Gene expression levels were summarized using the robust multi-array average (RMA). A constant of 16 was added for variance stabilization and summarized signals were log2 transformed. This was done with signals generated on a core set of well-annotated exons (~200,000) within the Affymetrix Exon Array Computational Tool (ExACT) software package. To prevent confounding interpretations of gene expression variation, we removed data from exons for which probesets contained 2 or more probes harboring SNPs before summarizing expression. All raw exon array data have been deposited into GEO (GSE7761).
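Two of the steps above are simple enough to sketch directly: dropping probesets that contain 2 or more SNP-harboring probes, and applying log2(signal + 16). The snippet below is only an illustration. The function names and the input layout (a dict of probeset ID to SNP-probe count) are hypothetical, the sketch assumes the constant is added to the already summarized signal, and the actual processing was done in ExACT rather than with code like this.

```python
import math


def keep_probesets(snp_probe_counts, max_snp_probes=1):
    """Return the IDs of probesets with at most `max_snp_probes` probes
    harboring SNPs (i.e. drop those with 2 or more by default).

    snp_probe_counts: dict mapping probeset ID -> number of SNP-harboring probes
    """
    return {pid for pid, n in snp_probe_counts.items() if n <= max_snp_probes}


def stabilize_and_log2(signals, constant=16.0):
    """Add a constant for variance stabilization, then log2 transform."""
    return [math.log2(s + constant) for s in signals]


if __name__ == "__main__":
    # Toy probeset IDs and signals, not real array data.
    counts = {"ps_a": 0, "ps_b": 1, "ps_c": 2, "ps_d": 3}
    print(sorted(keep_probesets(counts)))
    print(stabilize_and_log2([0.0, 16.0, 48.0]))
```

Adding the constant before taking logs keeps very low signals from producing large negative values that would inflate variance at the low end.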
GWA for 80303
GWA for Low Birth Weight -- Collaboration with NU
Summary: Low birth weight is associated with both increased perinatal morbidity and mortality and an increased risk for adult metabolic diseases. Size at birth is largely determined by an interaction between the fetus’s genetic potential for growth and maternal metabolism through its impact on the intrauterine environment. Maternal metabolism is itself modulated by maternal and fetal genotype and environmental factors. The goal of the proposed studies is to use the unique resources of the HAPO Study and GWA mapping to identify genes that account for the genetic underpinnings of fetal growth and maternal metabolism and the interaction of those genes with the environment in determination of phenotype. Identification of these genes will provide novel information about pathways that regulate fetal growth and maternal metabolism as well as important insight into susceptibility genes for chronic diseases like type 2 diabetes. Improved understanding of fetal growth and maternal metabolism is critical for the development of interventions to prevent low birth weight and to give infants with low birth weight a better chance to survive and develop in the early postnatal period, and to lead a healthy life as adults by avoiding chronic diseases associated with low birth weight.
Read the full proposal here.
Asthma
Autism
SPORE
Tourette Syndrome (GTS)
SCAN: SNP and CNV Annotations
Genome-wide association studies generate genotype information on hundreds of thousands of SNPs for a phenotype. Making sense of the data requires a number of steps related to the prioritization of variants (SNPs or CNVs) showing association to disease. Having as much information as possible about the variants is critical to sensible prioritization. For each SNP, the SCAN database is intended to provide:
- Summary information available from public databases, such as physical and functional annotations, frequencies in HapMap reference samples, FST values, and whether the SNP is on (or tags) a haplotype implicated in genomic studies identifying the signature of natural selection
- Linkage Disequilibrium (LD) information, including which genes have variation in strong LD (pairwise or multi-locus LD) with the variant, and how well the SNP is interrogated (i.e. multi-locus LD measure) by SNPs on each of the high-throughput platforms
- Summary information from analyses conducted to characterize HapMap SNP associations to gene expression in the full set of HapMap lymphoblastoid cell lines derived from individuals of European (CEU) and African (YRI) ancestry for 9,156 transcript clusters evaluated using the Affymetrix GeneChip® Human Exon 1.0 ST Array. This information would be summarized both globally (all transcript clusters showing association at a user-specified threshold with the chosen variant) and locally (associations to local transcript clusters with at least nominal significance, including information on rank for local associations)
- Summary information from other genome-wide association studies (e.g. all phenotypes showing association at a user-specified threshold)
In addition, for each gene, we provide annotations on:
- All SNPs showing association with the transcript cluster relevant for the gene (user-specified thresholds for cis- and trans-regulators)
- How well all variants in the HapMap at that gene are interrogated on each high-throughput platform (average multi-locus LD coefficient for each SNP within and up to 2 kbp from the gene)
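The gene-level platform annotation in the last item amounts to averaging a per-SNP LD coefficient over the SNPs that fall within the gene or up to 2 kbp on either side. A minimal sketch follows, assuming each SNP is given as a (position, LD coefficient) pair; the function name `gene_platform_coverage` and the toy values are hypothetical, and this is not the SCAN implementation.

```python
def gene_platform_coverage(snps, gene_start, gene_end, flank=2000):
    """Average LD coefficient over SNPs within the gene or up to `flank`
    base pairs on either side.

    snps: iterable of (position_bp, ld_coefficient) pairs for one platform
    Returns None when no SNP falls in the window.
    """
    window_start = gene_start - flank
    window_end = gene_end + flank
    values = [ld for pos, ld in snps if window_start <= pos <= window_end]
    if not values:
        return None
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy positions and LD coefficients, not real annotations.
    toy_snps = [(900, 0.5), (1500, 1.0), (5000, 0.75), (8000, 0.1)]
    print(gene_platform_coverage(toy_snps, gene_start=3000, gene_end=4000))
```

Running the same function once per genotyping platform would give one coverage value per gene per platform.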
Think about including CNVR information.
- Check database.
Genotype and Copy Number calling
NOTES
OUTREACH
Ataxia (Chris Gomez, Neurology)
100K Kidney GWA (Jay Koyner, Medicine)
Ken Onel, Pediatrics
Slow-wave activity in Sleep (Esra Tasali, Medicine)
Interview with Leslie Stahl on 60 Minutes
Full 3-part 60 Minutes series on sleep research
Analysis of 6.0 data for genomic abnormalities in LAM (Lucia Schuger, Wayne State)
Michael Maitland, Medicine
Diabetes Research & Training Center (DRTC) Consultation work
*Neonatal Diabetes (Graeme Bell, Medicine)
*Bioinformatics/Mouse expression data (M. Roe, Medicine)
*Completed DRTC Projects
MISCELLANEOUS PROJECTS
Cox Lab Cluster
A computer cluster of about 10 dedicated two-processor workstations in w615.
Computer Lab
QC Analysis
FINISHED PROJECTS
Mystery Data
PERL CLASS
NEW!
Scripts and homework will be posted here (soon!) --Croe 11:49, 18 February 2008 (CST)
