Projects
Our group works on the following projects:
MHC haplotype sequencing: An integrated approach to common disease (NGFN-Plus IG/Central Research Project)

The human major histocompatibility complex (MHC) is recognized as the most important genetic region in relation to common human diseases, including inflammatory, infectious and autoimmune diseases, and to transplant medicine (Lechler and Warren, 2000). Major national and international genome research networks, including the largest whole-genome association study to date (WTCCC, Nature 447: 661, 2007), have now demonstrated associations between the MHC and numerous disease phenotypes of interest, such as inflammatory bowel disease (IBD), psoriasis, sarcoidosis, atopic eczema, susceptibility to sepsis, asthma, type 1 diabetes, and many more. The question now is how to move from the regions of association to the underlying causal variants for functional analysis and translation into diagnostics and therapeutics. While clinically highly informative, the complex nature of the MHC presents major challenges to genetic analysis: structural variation in the form of copy-number variations, insertions, deletions and inversions, coupled with unprecedented levels of single nucleotide polymorphisms and differing degrees of recombination and linkage disequilibrium, has made the MHC the most variable and plastic region in the human genome and has severely hampered the hunt for disease genes. Genomic sequencing is by far the most efficient and possibly the only means by which this extraordinary genetic complexity can be unravelled. The potential of this approach has already been demonstrated by sequencing a small number of haplotypes (Stewart et al., 2004; Horton et al., 2008), which led to the identification of a novel susceptibility locus for multiple sclerosis (Yeo et al., 2007).
These studies also showed that, despite the large number (>70,000) of MHC variations already known, no variation plateau has yet been reached, suggesting that many more potentially disease-causing variants must exist. The recent advent of new sequencing platforms has now created the opportunity to capture and clinically harness the full variation content of the MHC by sequencing disease-associated haplotypes at the population level. Our major goal therefore is to sequence complete MHC haplotypes conferring risk to specific diseases of prime interest to the National Genome Research Networks. This will deliver sets of ‘candidate causal variations’, the missing link between association and disease gene, and will enable research groups to track down the causative variants. We are in a leading position worldwide to embark on such an effort because of our unique Haploid Reference Resource (HRR) and the availability of second-generation sequencing technologies. The HRR consists of 100 fosmid libraries representing 200 haploid genomes, including 200 ‘homozygous’ MHC haplotypes. We have demonstrated the feasibility and validity of mapping and sequencing HRR MHC clones using NGS technology. In addition, Affymetrix 1000 K genotype data are available for all 100 HRR DNA samples, allowing the mapping of clones into haplotypes. Moreover, four-digit HLA typing of all HRR samples confirms the presence of a broad spectrum of both risk and protective haplotypes for many common MHC-related diseases analyzed by the National Genome Research Networks and by national and international collaborators.

Haplotype approaches to disease gene discovery: A systematic investigation and establishment of reference resources (NGFN2 Systematic Methodological Platform DNA-Project)

Haplotype-based approaches to disease gene discovery have become a central theme. The ‘International HapMap Project’ has been launched (Nature 425: 758-9, 2003; Nature 426: 789-96, 2003), specifying strategies and resources to be made available to the international community.
The HapMap Project relies on the assumption that the human genome can be resolved into ‘blocks’ of common haplotypes, with only a few haplotypes per block and few SNPs necessary to tag each block, allowing genome-wide association and candidate gene studies at much higher efficiency. With a view to future lines of investigation, it has now been recognized that the evaluation of high-resolution genetic variation data will be an important and necessary next step in order to 1) assist optimisation of SNP selection, analysis of linkage disequilibrium (LD) and haplotype structures, and extraction of tags, and 2) systematically assess the ‘completeness of the information’ (Nature 426: 793, 2003). In-depth knowledge of the amount and nature of the information that will be added will be indispensable: it will critically reflect on the power of current haplotype approaches to represent underlying LD and haplotype structures and on their validity as a tool to map causative variants. It will, moreover, critically guide the design of the ultimately successful approaches to haplotype-based disease gene discovery. Finally, it will provide the basis for informed decisions on meaningful future investments in this line of research.

We
will perform a first systematic investigation in this direction, analysing high-resolution genetic variation data in comparison with the data provided by the HapMap Project. We rely on the following prerequisites: 1) high-resolution data sets obtained by the comparative sequence analysis of nuclear loci in an average of several hundred individuals, including cases and controls, a significantly greater depth than achieved previously; 2) a novel, highly efficient haplotyping technology (CSH), which allows the genome-wide determination of the molecular haplotype structures of any gene or chromosomal region of interest; and 3) an (inter)national network of leading experts in haplotype analysis. We will establish a reference resource of haploid clone pools from a total of 250 individuals (500 haploid genomes) from a representative German population. We will type the same loci in the same sample of haploid genomes, using the high-resolution-derived SNPs on the one hand and the HapMap-derived SNPs on the other. We will then systematically analyse and compare the LD and haplotype structures and tag SNPs derived by these two approaches.

Major objectives are: 1) to evaluate to what extent the HapMap-derived SNPs and tag SNPs in fact capture the LD and haplotype structures given at the ultimate level of resolution, the DNA sequence, and, in particular, candidate gene-related haplotype structures; 2) to assess which types of information will be added at increasing levels of resolution; 3) to test on given data sets whether the disease associations derived by high-resolution analyses could have been captured by HapMap-derived SNPs, and to what extent evaluation of rare (disease-related) haplotypes may be of relevance. Moreover, the proposed haploid reference system provides the basis to systematically assess the correspondence of haplotype structures (both phase and block decomposition) predicted in silico with their molecular correlates. Thus, it will serve as a basis to comparatively evaluate, develop, optimise and validate algorithms, an issue of increasing importance in view of the increasingly complex data sets expected in the future.

Undoubtedly,
the proposed project will provide essential information for all present and future disease gene discovery projects in the NGFN that rely on genetic variation/haplotype approaches. The results will have important implications for the development of successful strategies and the investment of resources. Importantly, this project implies the establishment of a ‘community resource’, accessible to any collaborators from the national/international genome networks: a resource for the validation of haplotype structures, a reference system for the development of algorithms, and a ‘permanent’ control group and reference resource for all NGFN2 SNP and haplotype-based disease association studies. Moreover, access to the molecular haplotypes of any gene/potential drug target in a population of substantial size will provide essential information to pharmaceutical and biotech companies, which will help elucidate individually different drug response and facilitate processes of drug target evaluation, prioritisation and clinical trials. Thus, it represents a key resource for pharmacogenomic approaches to drug development. The proposed haploid reference resource represents the basis 1) to test for the existence of numerous individually different forms of a gene and 2) to provide the templates for their in vitro functional characterisation. This represents a key step in the evaluation of gene function, dysfunction, the molecular basis of drug response and disease processes.
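The first objective of this project, evaluating how well a sparse SNP panel captures the variation seen at sequence-level resolution, can be illustrated with a small sketch. It assumes phased, biallelic haplotypes coded as 0/1 and uses the common pairwise r² measure with a threshold of 0.8; the function names, the threshold and the toy data are illustrative assumptions and are not taken from the project itself.

```python
def r_squared(haplotypes, i, j):
    """Squared allelic correlation (r^2) between sites i and j.

    haplotypes: list of equal-length tuples of 0/1 alleles,
    one tuple per phased chromosome.
    """
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        # Monomorphic site: r^2 is undefined; treated as 0 in this sketch.
        return 0.0
    d = p_ij - p_i * p_j  # linkage disequilibrium coefficient D
    return d * d / denom


def captured_fraction(haplotypes, panel, threshold=0.8):
    """Fraction of all sites that are in the panel or in LD
    (r^2 >= threshold) with at least one panel site."""
    n_sites = len(haplotypes[0])
    captured = 0
    for site in range(n_sites):
        if site in panel or any(
            r_squared(haplotypes, site, tag) >= threshold for tag in panel
        ):
            captured += 1
    return captured / n_sites


# Toy data (hypothetical): six phased chromosomes, four variant sites.
# Sites 0 and 1 are perfectly correlated; site 3 is a rare variant.
haplotypes = [
    (0, 0, 0, 0),
    (0, 0, 1, 0),
    (1, 1, 0, 0),
    (1, 1, 1, 0),
    (1, 1, 0, 1),
    (0, 0, 1, 0),
]

sparse_panel = [0]  # stands in for a sparse, HapMap-like SNP selection
print(captured_fraction(haplotypes, sparse_panel))
```

In this toy example the single panel SNP captures itself and its perfectly correlated neighbour but misses the independent site and the rare variant, which is the kind of information loss the comparison is designed to quantify.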
Comparative Candidate Gene Sequencing, Haplotype Analysis and Genetic Risk Profile Identification (longstanding line of research, recently funded by the NGFN Optimization Funds)

The identification of genes predisposing to human diseases is of paramount importance for understanding the molecular basis of disease and of individually different drug response, and will establish new routes to diagnosis and therapeutic advances of immense medical benefit. A key step in all strategies for disease gene identification is the comparative sequence analysis of candidate genes in patients and controls to identify those specific sequence variations (SNPs) associated with common, complex disease. The importance of haplotype-based analysis over single-SNP scoring of disease gene candidates has at last been established.

The present work of the group relies on longstanding lines of research and development (since 1990, see http://www.molgen.mpg.de/~genetic-variation/ProjectProposal1990 and ‘Background’) that have focussed on the systematic analysis of human interindividual DNA sequence differences and their potential functional implications. The underlying, implicit concept that has been pursued is that of whole-gene ‘causal haplotypes’: since it is the entire gene and its encoded protein that act as the units of function which potentially affect a phenotype and ultimately allow first conclusions on disease mechanisms, we have to analyse the entire sequences of the individual genes, including their regulatory and critical intronic regions. It is therefore essential in diploid organisms to determine the specific combinations of given gene sequence variants for each of the chromosomes, defined here as haplotypes. Only the correct determination of the underlying haplotypes will allow the establishment of meaningful relationships between gene variants (SNPs), gene function and phenotype.

It has now become evident that genes and the human genome are much more variable than previously thought. Our studies, as well as others that have systematically compared individual candidate gene sequences, have revealed that single genes may contain multiple SNPs. This abundant gene variability presents major challenges to the analysis and establishment of complex genotype/haplotype-phenotype relationships against a background of high natural genome sequence diversity.
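Why haplotype determination becomes harder as gene variability grows can be made concrete with a short sketch. An unphased diploid genotype that is heterozygous at k sites is compatible with 2^(k-1) distinct haplotype pairs, so every additional heterozygous SNP doubles the ambiguity. The genotype coding (0, 1 or 2 copies of the variant allele per site) and the function name are assumptions made for this illustration.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate the unordered haplotype pairs consistent with an
    unphased genotype.

    genotype: sequence of 0, 1 or 2 (copies of the variant allele per site).
    Returns a list of (haplotype_a, haplotype_b) tuples of 0/1 alleles.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed on both chromosomes: 0 -> allele 0, 2 -> allele 1.
    base = [g // 2 for g in genotype]
    if not het_sites:
        hap = tuple(base)
        return [(hap, hap)]

    first, rest = het_sites[0], het_sites[1:]
    pairs = []
    # Fixing the first heterozygous site (allele 0 on haplotype A) avoids
    # listing each unordered pair twice.
    for alleles in product((0, 1), repeat=len(rest)):
        hap_a, hap_b = list(base), list(base)
        hap_a[first], hap_b[first] = 0, 1
        for site, allele in zip(rest, alleles):
            hap_a[site] = allele
            hap_b[site] = 1 - allele
        pairs.append((tuple(hap_a), tuple(hap_b)))
    return pairs


# Hypothetical genotype: heterozygous at three of five sites.
for pair in compatible_haplotype_pairs((1, 0, 1, 2, 1)):
    print(pair)
```

With three heterozygous sites the genotype above is compatible with four different haplotype pairs; only molecular haplotyping or statistical inference can decide which pair an individual actually carries.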
Past and present lines of research and development include:

- The development of high-throughput (HT) resequencing technologies to compare candidate gene sequence information in multiple individuals, specifically (automated) ‘Multiplex Sequence Comparison’, which allowed the simultaneous sequence analysis of 5 to 55 PCR products in one reaction tube; later, HT capillary sequencing was implemented.
- The prediction of haplotypes from numerous variants, first by development of a haplotype program (MULTIHAP) based on the EM algorithm, which predicts the most likely haplotype pair for each genotype in a given sample and can process a high number of variants. Additional available programs have been implemented. Recently, a program version has been developed that improves on popular methods by introducing a general complete-data-likelihood framework (Zhang J et al., in press).
- The validation of the genetic haplotypes by application of molecular genetic techniques, for instance by a combination of allele-specific PCR and generation of allele-specific products, or by (cosmid) cloning and subsequent DNA sequencing or DNA marker typing.
- The development of approaches to reduce haplotype complexity, for instance by classification of haplotypes into functionally related (or ideally functionally equivalent) groups. A hierarchical cluster analysis procedure has been applied for classification; additional clustering methods are available. Genotypes are being analysed/classified accordingly.
- The development and application of approaches to perform haplotype-based association studies and identify those specific sequence variants, or combinations of variants (risk pattern(s)), that are associated with the disease phenotype or individually different drug response.
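The EM-based haplotype prediction mentioned in the list above can be sketched in its generic textbook form. This is not the MULTIHAP implementation, whose details are not given here; it is a minimal illustration that assumes biallelic sites, genotypes coded as 0/1/2 copies of the variant allele, no missing data and Hardy-Weinberg proportions. All function names and the toy sample are hypothetical.

```python
from collections import defaultdict
from itertools import product


def phase_options(genotype):
    """All unordered haplotype pairs compatible with an unphased genotype."""
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]
    if not het:
        return [(tuple(base), tuple(base))]
    pairs = []
    for alleles in product((0, 1), repeat=len(het) - 1):
        a, b = list(base), list(base)
        a[het[0]], b[het[0]] = 0, 1
        for site, allele in zip(het[1:], alleles):
            a[site], b[site] = allele, 1 - allele
        pairs.append((tuple(a), tuple(b)))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    options = [phase_options(g) for g in genotypes]
    haplotypes = {h for pairs in options for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in options:
            # E-step: posterior weight of each compatible pair. The factor 2
            # for heterozygous pairs is constant within a genotype and cancels.
            weights = [freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq


def most_likely_pair(genotype, freq):
    """Most probable haplotype pair for one genotype under the estimates."""
    return max(phase_options(genotype), key=lambda p: freq[p[0]] * freq[p[1]])


# Toy sample (hypothetical): unambiguous homozygotes plus two double heterozygotes.
sample = [(0, 0)] * 3 + [(2, 2)] * 3 + [(1, 1)] * 2
freq = em_haplotype_frequencies(sample)
print(most_likely_pair((1, 1), freq))
```

In this toy sample the unambiguous homozygotes pull the estimate towards the two haplotypes they carry, so the double heterozygotes are resolved into that same pair rather than the alternative phase. Real data sets with many variants need a more careful treatment of the rapidly growing number of compatible pairs.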
At present, substantial data sets from the comparative candidate gene sequence analysis in an average of several hundred individuals are being tested and analysed in collaboration with international institutions with respect to given genetic variation, underlying haplotype structures and the extraction of phenotypically relevant patterns of variants.
Genetics and Pharmacogenomics of Obesity (BMBF ‘BioProfile Nutrigenomics Program’ Potsdam/Berlin)

The World Health Organization (WHO) has declared obesity the largest global chronic health problem in adults. The burden it imposes on public health and the economy is enormous. There is therefore a tremendous need for more effective drugs and diagnostics that allow optimised therapy, prediction and prevention of this disease, as well as prediction of individual treatment outcome.

The major goals of this project are: 1) the identification of disease genes/key molecules involved in the development of obesity, specifically its molecular phenotypes; 2) the identification of novel drug targets that allow the development of innovative, highly efficient therapeutics and, in consequence, disease prevention; 3) the identification of genetic markers that allow prediction of specific molecular disease phenotypes and/or individual responsiveness to pharmaceutical drugs, dietary habits/nutrition and surgical measures for food restriction, which will provide the basis for the development of diagnostic procedures/chips; 4) the development of highly cost-efficient high-throughput genotyping technologies that facilitate disease gene discovery and whole-genome association studies. To achieve these goals, unique resources and expertise from investigators at major research and clinical institutions have been combined with complementary industrial resources, technologies and know-how. Competitive advantages include: 1) unique, high-quality clinical material and a notable body of data to draw from; these point to specific molecules/functional pathways involved and promise innovative approaches to drug target identification and treatment, such as the modulation of pre-adipocyte differentiation and adipocyte biology; 2) key/high-throughput technologies in all relevant areas of molecular genetic variation analysis, functional genomics and bioinformatics.
Importantly, the collaborators constituting this network have already shown that they are capable of generating positive results in terms of potential genetic risk factors, drug target candidates and genetic variants of predictive value. Thus, we anticipate that the necessary spectrum of marketable results, products and technologies can be generated within the funding period for the purpose of subsequent commercialisation. The proposed lines of research and development are designed to prepare the ground for the establishment of a company that will integrate the following components: 1) a patient-oriented ‘Service/Competence Center’ delivering molecular phenotyping, genotyping and diagnostic services; 2) a ‘Medical Consulting Center’ for disease treatment and prevention; 3) an extensive clinical DNA resource as the basis for industrial co-operations and disease gene discovery; 4) a ‘Center for Clinical Development and Drug Trials’; and 5) a genotyping/sequencing/high-throughput technology platform, serving industrial co-operations and in-house disease gene discovery. First steps towards the foundation of an ‘Integrated Medical Genomics’ company have been taken. A strong interest in several of these components has already been expressed by both pharmaceutical and biotech companies.
External funding

- BMBF/NGFN-Plus: MHC Haplotype Sequencing: An integrated approach to common disease.
- BMBF/NGFN2: Haplotype approaches to disease gene discovery: A systematic investigation and establishment of reference resources.
- BMBF/NGFN (Optimization Funds): Comparative Candidate Gene Sequencing, Haplotype Analysis and Genetic Risk Profile Identification.
- BMBF BioProfile Potsdam Berlin: Genetics and Pharmacogenomics of Obesity.
- German-Israeli Foundation (GIF): Haplotyping and Association Algorithms and their Applications to Model Disease Genes.
- BMBF BioProfile Potsdam Berlin: Verbundvorhaben ‘Innovation des Therapiekonzeptes für das Metabolische Syndrom’ – Teilprojekt: Haplotype analysis.
- GlaxoSmithKline Award: Analysis of high-resolution genetic variation data with particular emphasis on haplotype structures and LD patterns.