deCODEme and 23andme material

This page contains material, and links to material, useful for analyzing deCODEme and 23andMe test results, especially for the Y chromosome.

Merged Y-file

David Reynolds and Adriano Squecco maintain a repository of Y-chromosome deCODEme and 23andMe results:
Y-Chromosome Genome Comparison

An older repository was kept by Ann Turner. The file will not be updated. However, it is still very useful because it also contains the first version of the HapMap Y-chromosome data and a list of phylogenetically relevant SNPs on deCODEme and 23andMe:

I have merged her worksheets for deCODEme, 23andMe (versions 1 and 2), and HapMap into one big CSV file:

(The file has not been updated for a while.) I did the merge in Matlab:
Matlab files

Note for the matlab programs
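
For readers without Matlab, the merge step can be sketched in Python. This is only an illustration, not a translation of the Matlab files above. It assumes each worksheet has one row per SNP and a shared identifier column, here called "snp"; the column names, the worksheet labels, and the functions merge_worksheets and to_csv are all hypothetical.

```python
import csv
import io


def merge_worksheets(sheets, key="snp"):
    """Outer-join several worksheets on a shared SNP identifier column.

    sheets maps a worksheet label to a list of dict rows. Every column
    other than the key is prefixed with its worksheet label, so samples
    from different sources cannot collide.
    """
    columns = [key]
    merged = {}
    for label, rows in sheets.items():
        for row in rows:
            out = merged.setdefault(row[key], {key: row[key]})
            for col, value in row.items():
                if col == key:
                    continue
                full = f"{label}:{col}"
                if full not in columns:
                    columns.append(full)
                out[full] = value
    # Sort by SNP identifier so the output order is reproducible.
    return columns, [merged[snp] for snp in sorted(merged)]


def to_csv(columns, rows):
    """Render merged rows as CSV text; SNPs missing from a sheet stay blank."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A SNP that is tested by only one company ends up with empty cells in the other company's columns, which is why an outer join is used here rather than keeping only the shared SNPs.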

Regarding the output CSV file: I also have a smaller version to which some cleaning has been applied:

This version deletes most HapMap SNPs in the recombining part of the Y chromosome (that is, SNPs that can show two different alleles). It then tries to change some of the HapMap SNP values so that they are compatible with deCODEme and 23andMe, and it estimates the haplogroup for the HapMap samples. Beware: none of this has been checked carefully. The Matlab code used to create the file is included, so the data cleaning can be checked.

Simple admixture

Dienekes has created a program to compute a simple admixture model of NW European, SE European, and Ashkenazi Jewish ancestry (see his blog entry for an explanation). The model uses frequency data from Price 2007 to estimate the likely percentage of each of these three populations in one's genome. I have replicated his computation in Matlab:
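
To give a feel for how allele frequencies can be turned into ancestry percentages, here is a minimal Python sketch of a generic three-population mixture fit. It is not necessarily the calculation used by Dienekes or by the Matlab replication. It assumes that, for each SNP, we know the frequency of one chosen allele in each of the three populations and how many copies (0, 1, or 2) the tested person carries. The function names and the grid-search approach are illustrative choices.

```python
import math


def log_likelihood(weights, freqs, genotypes):
    """Binomial log-likelihood of the genotypes under mixed allele frequencies.

    freqs[i] holds the frequency of the counted allele at SNP i in each
    population; genotypes[i] is the number of copies (0, 1, or 2) carried.
    """
    total = 0.0
    for pop_freqs, copies in zip(freqs, genotypes):
        p = sum(w * f for w, f in zip(weights, pop_freqs))
        p = min(max(p, 1e-6), 1 - 1e-6)  # keep the logarithms finite
        total += copies * math.log(p) + (2 - copies) * math.log(1 - p)
    return total


def estimate_admixture(freqs, genotypes, steps=20):
    """Grid search over three non-negative weights that sum to one."""
    best_ll, best_weights = None, None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            weights = (i / steps, j / steps, (steps - i - j) / steps)
            ll = log_likelihood(weights, freqs, genotypes)
            if best_ll is None or ll > best_ll:
                best_ll, best_weights = ll, weights
    return best_weights
```

With steps=20 the weights are resolved in 5% increments. A finer grid or a proper optimizer would be needed for anything beyond a rough guess, and the result is only as good as the frequency data supplied.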

Chromosome extraction

Since the result file is huge, this Matlab code extracts the data for each chromosome (deCODEme only, for now) and saves it to a separate CSV file. Note that the program takes a long time to run.
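
The same idea can be sketched in Python as a single pass over the file. This is an illustration, not the Matlab program above. It assumes the export is a CSV file with a header row and a column named "Chromosome"; that name, the output file naming, and the function split_by_chromosome are assumptions to adjust to the actual file.

```python
import csv
import os


def split_by_chromosome(in_path, out_dir, chrom_col="Chromosome"):
    """Stream a genotype CSV once, writing one output file per chromosome.

    Returns the sorted list of chromosome labels found in the file.
    """
    handles = {}
    writers = {}
    with open(in_path, newline="") as src:
        reader = csv.DictReader(src)
        try:
            for row in reader:
                chrom = row[chrom_col]
                if chrom not in writers:
                    # First row seen for this chromosome: open its output file.
                    path = os.path.join(out_dir, f"chr{chrom}.csv")
                    handles[chrom] = open(path, "w", newline="")
                    writers[chrom] = csv.DictWriter(
                        handles[chrom], fieldnames=reader.fieldnames
                    )
                    writers[chrom].writeheader()
                writers[chrom].writerow(row)
        finally:
            for handle in handles.values():
                handle.close()
    return sorted(handles)
```

Because rows are written as they are read, the whole result file never has to fit in memory, and the input is scanned only once rather than once per chromosome.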