This directory contains genotypes inferred using the method described in

Burdick JT, Chen WM, Abecasis GR and Cheung VG (2006). In silico method for inferring genotypes in pedigrees. Nat Genet 38:1002-4.
http://genomics.med.upenn.edu/genotypeinference/Nat.Genet.vol.38-pp.1002.pdf

To do the inference, we started with HapMap genotypes for 60 CEU individuals (the ten families which had HapMap genotypes for four grandparents and two parents), using HapMap build 16c1 and build 21. We omitted chromosomes X and Y.

Next, we computed IBD sharing in those ten families, using genotypes from The SNP Consortium (TSC) for 1,978 markers, combined with a set of genotypes for 4,586 markers generated using the Illumina platform.

Then, we looked for cases in which we knew which grandparent had transmitted an allele and could determine which allele was transmitted. For instance, if the IBD information indicates that the maternal and paternal grandmothers each transmitted an allele to a child, and we can tell from the HapMap genotypes that "C" and "T" were transmitted, then the child's genotype must be "CT". (Of course, if a parent's genotype is "CC", then we know that parent transmitted a "C" to the child, even if we lack IBD information at that location.)

The resulting genotypes are for 78 CEU children in rel16c1 and rel21, or 73 CEU children in rel22, in those ten families (see the corresponding sample lists).

Missing alleles are represented by "N". Note that in some cases the inference program could infer only one allele. So, for instance, genotypes of "NT" or "TN" mean that only one allele ("T") could be determined. (This could be because of missing IBD information, or because we couldn't determine which allele was transmitted.) "NN" represents cases in which the inference program couldn't determine either allele.

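As an illustration only (this is not the inference program itself), the rules described above can be sketched in Python. The function names transmitted_allele and child_genotype are hypothetical, and the order in which the two alleles are written is an assumption; the files contain both "NT" and "TN".

```python
from typing import Optional

MISSING = "N"  # missing alleles are written as "N" in the genotype files


def transmitted_allele(parent_genotype: str,
                       allele_from_ibd: Optional[str] = None) -> Optional[str]:
    """Return the allele a parent passed to the child, or None if unknown.

    allele_from_ibd is the allele identified through IBD information
    (hypothetical parameter); pass None when IBD information is lacking.
    """
    if allele_from_ibd is not None and allele_from_ibd != MISSING:
        return allele_from_ibd
    # Without IBD information, only a homozygous parent is unambiguous:
    # a "CC" parent must have transmitted "C".
    if len(parent_genotype) == 2:
        first, second = parent_genotype
        if first == second and first != MISSING:
            return first
    return None


def child_genotype(allele_a: Optional[str], allele_b: Optional[str]) -> str:
    """Combine two transmitted alleles, writing "N" for any unknown one."""
    return (allele_a or MISSING) + (allele_b or MISSING)


# Both alleles known: "C" and "T" were transmitted.
both_known = child_genotype("C", "T")
# One parent is heterozygous with no IBD information, the other is "TT".
one_known = child_genotype(transmitted_allele("CT"), transmitted_allele("TT"))
# Neither allele can be determined.
none_known = child_genotype(None, None)
```

Here both_known is "CT", one_known is "NT", and none_known is "NN", matching the three kinds of genotype described above.
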
The program used to infer genotypes is available at
http://genomics.med.upenn.edu/genotypeinference/

Josh Burdick
jburdick@gradient.cis.upenn.edu
20061009

==

The genotype files contain the following columns:

snp_rsid: The SNP rsid
alleles: The alleles according to dbSNP
chrom: chromosome
pos: position on the chromosome
strand: strand relative to the genome
build: genome build
source_center: The center which provided the original genotypes in HapMap
source_assayLSID: The assayLSID of the original genotypes in HapMap
protLSID: protocol LSID for these inferred genotypes

These columns are followed by the genotypes for all the inferred individuals.

==

The file 'sample_list.txt' contains information on the individuals in the list. It has the following columns:

sample_name: Sample name (note that this is the cell line identifier; for comparison to HapMap ids, substitute 'NA' for 'GM')
ceph_family: CEPH family number
ceph_individual: individual number in CEPH family
gender: sex of individual
relation: familial relationship within the pedigree
inferred: 'yes' for those with inferred genotypes; 'no' marks the family members typed within the HapMap project

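As a small sketch of the sample name substitution described above, the following Python converts a cell line identifier to the corresponding HapMap id and picks out the individuals with inferred genotypes. The function names are hypothetical, the sample names shown are made-up placeholders rather than entries from 'sample_list.txt', and the rows are assumed to have already been read into dictionaries keyed by column name (the file's delimiter is not specified here).

```python
from typing import Dict, List


def to_hapmap_id(sample_name: str) -> str:
    """Replace a leading 'GM' (cell line identifier) with 'NA' (HapMap id)."""
    if sample_name.startswith("GM"):
        return "NA" + sample_name[2:]
    return sample_name


def inferred_hapmap_ids(rows: List[Dict[str, str]]) -> List[str]:
    """Return HapMap-style ids for rows whose 'inferred' column is 'yes'."""
    return [to_hapmap_id(row["sample_name"])
            for row in rows
            if row["inferred"] == "yes"]


# Placeholder rows for illustration only; not real sample names.
example_rows = [
    {"sample_name": "GM00001", "inferred": "no"},
    {"sample_name": "GM00002", "inferred": "yes"},
]
example_ids = inferred_hapmap_ids(example_rows)
```

With these placeholder rows, example_ids contains only "NA00002", since the first individual is marked as typed within the HapMap project.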