Silhouette scores for assessment of SNP genotype clusters
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Molecular tools.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences.
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Molecular Medicine.
2005 (English). In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 6, article id 35. Article in journal (Refereed). Published
Abstract [en]

Background: High-throughput genotyping of single nucleotide polymorphisms (SNPs) generates large amounts of data. In many SNP genotyping assays, the genotype assignment is based on scatter plots of signals corresponding to the two SNP alleles. In a robust assay, the three clusters that define the genotypes are well separated and the distances between the data points within a cluster are short. "Silhouettes" is a graphical aid for the interpretation and validation of data clusters that provides a measure of how well a data point was classified when it was assigned to a cluster. Thus "Silhouettes" can potentially be used as a quality measure for SNP genotyping results and for objective comparison of the performance of SNP assays under different conditions.
Results: We created a program (ClusterA) for calculating "Silhouette scores" and applied it to assess the quality of SNP genotype clusters obtained by single nucleotide primer extension ("minisequencing") in the Tag-microarray format. A Silhouette score condenses the quality of the genotype assignment for each SNP assay into a single numeric value, which ranges from 1.0, when the genotype assignment is unequivocal, down to -1.0, when the genotype assignment has been arbitrary. In the present study we applied Silhouette scores to compare the performance of four DNA polymerases in our minisequencing system by analyzing 26 SNPs in both DNA polarities in 16 DNA samples. We found that Silhouettes provide a relevant measure of the quality of SNP assays under different reaction conditions, as illustrated here by the four DNA polymerases. According to our results, the genotypes can be assigned unequivocally without manual inspection when the Silhouette score for a SNP assay is > 0.65. All four DNA polymerases performed satisfactorily in our Tag-array minisequencing system.
Conclusion: "Silhouette scores" are convenient for evaluating the quality of SNP genotype assignment and provide an objective, numeric measure for comparing the performance of SNP assays. The program we created for calculating Silhouette scores is freely available and can be used for quality assessment of the results from all genotyping systems in which the genotypes are assigned by cluster analysis using scatter plots.
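
The scoring idea described in the abstract can be illustrated with a minimal sketch. It uses the standard silhouette definition, s = (b - a) / max(a, b), where a is the mean distance from a point to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. It is not the ClusterA program: the function names, the Euclidean distance, the use of the plain mean over all points as the per-assay score, and the signal coordinates are all assumptions made for illustration. Only the 0.65 threshold comes from the abstract.

```python
from math import dist
from statistics import mean


def silhouette_values(points, labels):
    """Return the silhouette value s(i) for every point (standard definition)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouettes need at least two clusters")

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (dist(p, p) is 0, so dividing by len(own) - 1 excludes the point itself)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q in other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def assay_score(points, labels):
    """Assumed per-assay score: the mean silhouette value over all points."""
    return mean(silhouette_values(points, labels))


# Hypothetical allele-signal coordinates for one SNP assay (not data from the paper).
points = [
    (0.90, 0.10), (1.00, 0.12), (0.95, 0.08),   # homozygous for allele 1
    (0.50, 0.50), (0.55, 0.48), (0.48, 0.52),   # heterozygous
    (0.10, 0.90), (0.08, 1.00), (0.12, 0.95),   # homozygous for allele 2
]
labels = ["AA"] * 3 + ["AB"] * 3 + ["BB"] * 3

THRESHOLD = 0.65  # cut-off reported in the abstract
score = assay_score(points, labels)
print(f"Silhouette score: {score:.2f}; passes threshold: {score > THRESHOLD}")
```

With tight, well-separated clusters such as these, the score lies close to 1.0. Assigning the same points to genotypes arbitrarily drives the score toward or below zero, which is the behaviour the abstract describes.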

Place, publisher, year, edition, pages
2005. Vol. 6, article id 35
National Category
Medical Biotechnology (with a focus on Cell Biology (including Stem Cell Biology), Molecular Biology, Microbiology, Biochemistry or Biopharmacy)
Identifiers
URN: urn:nbn:se:uu:diva-92070
DOI: 10.1186/1471-2164-6-35
ISI: 000228002500001
PubMedID: 000228002500001
OAI: oai:DiVA.org:uu-92070
DiVA, id: diva2:165022
Available from: 2004-09-15 Created: 2004-09-15 Last updated: 2017-12-14 Bibliographically approved
In thesis
1. Methods for Analysis of Disease Associated Genomic Sequence Variation
2004 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In molecular medicine, a wide range of methods is applied to analyze the genome to find genetic predictors of human disease. Apart from predisposing to disease, genetic variations may also serve as genetic markers in the search for factors underlying complex diseases. Additionally, they provide a means to distinguish between species, analyze evolutionary relationships and subdivide species into strains.

The development and improvement of laboratory techniques and computational methods was a spin-off effect of the Human Genome Project. The same techniques for analyzing genomic sequence variations may be used regardless of the organism or the source of DNA or RNA. In this thesis, methods for high-throughput analysis of sequence variations were developed, evaluated and applied.

The performance of several genotyping assays was investigated prior to genotyping 4000 samples in a co-operative genetic epidemiological study. Sequence variations in the estrogen receptor alpha gene were found to be associated with an increased risk of breast and endometrial cancer in Swedish women.

Whole genome amplification (WGA) enables large-scale genetic analysis of biobanked DNA samples that are available only in small amounts. The performance of two WGA methods was evaluated using four-color minisequencing on tag-arrays. Our in-house developed assay and "array of arrays" format allow up to 80 samples to be analyzed in parallel on a single microscope slide. Multiple displacement amplification by the Φ29 DNA polymerase gave essentially the same genotyping results as genomic DNA. To facilitate accurate method comparisons, a cluster quality assessment approach was established and applied to assess the performance of four commercially available DNA polymerases in the tag-array minisequencing assay.

A microarray method for genotyping human group A rotavirus (HRV) was developed and applied to an epidemiological survey of infectious HRV strains in Nicaragua. The method combines specific capture of amplified viral sequences on microarrays with genotype-specific DNA-polymerase mediated extension of capture oligonucleotides with fluorescent dNTPs.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2004. p. 89
Series
Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine, ISSN 0282-7476 ; 1371
Keywords
Molecular medicine, microarray, single nucleotide polymorphism, whole genome amplification, breast cancer, endometrial cancer, human rotavirus
National Category
Medical Genetics
Identifiers
urn:nbn:se:uu:diva-4525 (URN)
91-554-6027-5 (ISBN)
Public defence
2004-10-08, Rudbecksalen, Rudbecklaboratoriet, Dag Hammarskjölds väg 20, Uppsala, 09:15
Opponent
Supervisors
Available from: 2004-09-15 Created: 2004-09-15 Last updated: 2018-01-13 Bibliographically approved

Open Access in DiVA

fulltext (270 kB), 23 downloads
File information
File name: FULLTEXT01.pdf
File size: 270 kB
Checksum (SHA-512): b191bc531670762759053cc991278e9204f6ca5fb479d213d118672d2ba30bd786eeb7752255d9718a32f5dfc98ca002f9f58a8ad9e82b1f04c73f46baea5d8b
Type: fulltext
Mimetype: application/pdf


Search in DiVA

By author/editor
Syvänen, Ann-Christine
By organisation
Department of Medical Sciences; Molecular tools; Molecular Medicine