Selecting Homologs
Homologs are related genes. The command below writes their raw (unaligned) sequences to the output directory.
$ eti homologs -i data/apes-115 --outdir apes_homologs --ref human --coord_names 22 --limit 5
Homolog search ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
Extracting 🧬 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
$ ls apes_homologs
ENSG00000093072.fa
ENSG00000100312.fa
ENSG00000100412.fa
ENSG00000128165.fa
ENSG00000128274.fa
logs
md5
not_completed
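As a quick sanity check on the output, the sketch below counts the sequences in each `.fa` file. It assumes the files are plain FASTA, with one `>` header line per sequence. The helper names `count_fasta_records` and `summarise` are hypothetical and are not part of eti.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting '>' header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarise(outdir):
    """Map each *.fa file name in outdir to its number of sequences."""
    return {
        fa.name: count_fasta_records(fa)
        for fa in sorted(Path(outdir).glob("*.fa"))
    }


if __name__ == "__main__":
    # "apes_homologs" is the --outdir value used in the command above
    for name, num_seqs in summarise("apes_homologs").items():
        print(name, num_seqs)
```

Only the `.fa` files are inspected; other entries in the directory, such as `logs`, `md5` and `not_completed`, are ignored.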
By default, "protein_coding" genes are selected and --homology_type (see compara summary) is set to ortholog_one2one. You can specify different gene biotypes (see species summary) by providing a delimited file to --ref_genes. This file must contain a "stableid" column whose values are Ensembl stable IDs. To get the full gene list for the reference species, see the eti dump-genes command.
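A file for --ref_genes could be produced as in the following minimal sketch. It assumes a tab-delimited file is accepted; the text above only says "delimited", so check which delimiters eti supports. The file name `ref_genes.tsv` and the helper `write_ref_genes` are hypothetical, and the two stable IDs are taken from the listing above purely for illustration.

```python
import csv
from pathlib import Path


def write_ref_genes(stable_ids, path, delimiter="\t"):
    """Write a delimited file with a single 'stableid' column.

    The tab delimiter is an assumption, not a documented eti requirement.
    """
    path = Path(path)
    with path.open("w", newline="") as out:
        writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
        writer.writerow(["stableid"])  # required column name
        for gene_id in stable_ids:
            writer.writerow([gene_id])
    return path


if __name__ == "__main__":
    # illustrative Ensembl stable IDs for the reference species
    ids = ["ENSG00000093072", "ENSG00000100312"]
    print(write_ref_genes(ids, "ref_genes.tsv"))
```

The resulting file path would then be passed as the value of --ref_genes.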