Joint PCA (with a reference)

The joint PCA method works by first merging (joining), by locus and alleles, the input dataset with the reference dataset. This is followed by “normal” PCA on the merged dataset

Below is a code on how you can run joint PCA via the command-line or inside a Python script Use

  1. Command line

    pca --data-dirname data/ --data-basename 1kg_annotated --out-dir data/ --input-type hail --reference grch37 --pca-type joint
    
  2. Python (inside a Python script)

    import gwaspy.pca as pca
    pca.pca.pca(data_dirname="data/", data_basename="1kg_annotated",  out_dir="data/",
                input_type="hail", reference="GRCh37", pca_type="joint")