What is imputation?

Try to solve the puzzle by filling in the missing letters in the sentence below.

Wh_ is my g_nome inco_plete?

If you were able to complete the puzzle, you already understand how genotype imputation works. Genotype imputation uses a method similar to the one your brain used to fill in the missing letters above. You relied on context clues from the surrounding letters and words, as well as your knowledge of the words in the English dictionary. In the same way, genotype imputation relies on a reference population of many individuals who have essentially no missing letters, and it looks for haplotypes shared between the reference sequences and your sequence. Where your sequence matches a reference haplotype at the positions that were measured, the reference can suggest what belongs at the positions that were not. Imputation algorithms provide a probability for each possible genotype at every imputed locus in the genome.
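The idea of matching against a reference and reporting probabilities can be sketched in a few lines of Python. This is a toy illustration only, not the model used by IMPUTE2, which is based on a hidden Markov model and works on genotypes rather than single sequences. The function name impute_site, the five-letter sequences, and the reference_panel list are all made up for this example.

```python
from collections import Counter


def impute_site(target, reference, missing="_"):
    """Toy imputation: return {position: {allele: probability}}.

    Assumption for illustration: a reference sequence "matches" if it
    agrees with the target at every observed (non-missing) position.
    """
    # Keep reference sequences that agree with the target wherever
    # the target has an observed letter.
    matches = [
        ref for ref in reference
        if len(ref) == len(target)
        and all(t == missing or t == r for t, r in zip(target, ref))
    ]

    result = {}
    for pos, allele in enumerate(target):
        if allele != missing:
            continue
        # Count which letters the matching references carry at this gap.
        counts = Counter(ref[pos] for ref in matches)
        total = sum(counts.values())
        # Convert counts to probabilities; empty dict if nothing matched.
        result[pos] = {a: n / total for a, n in counts.items()} if total else {}
    return result


# Hypothetical reference panel with no missing letters.
reference_panel = ["ACGTA", "ACGTA", "ACCTA", "TCGTG"]

# Target sequence with one missing letter at position 2.
target = "AC_TA"

probabilities = impute_site(target, reference_panel)
print(probabilities)
```

Three of the four reference sequences agree with the target at the observed positions. Two of them carry G at the gap and one carries C, so the sketch reports G as more likely than C instead of committing to a single answer, which mirrors the per-genotype probabilities described above.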


The imputation pipeline is built with genipe.
Within genipe:
PLINK is used for data cleaning and format conversion before phasing.
SHAPEIT is used for phasing.
IMPUTE2 is used for imputation, with the 1000 Genomes Phase 3 data as the reference panel.

