Programs and Optimal Solutions for the Haplotype Assembly Problem
- The new program,
an example matrix file, and
the simulated dataset Simu95_MP1.
To run the new program, proceed as follows.
- Install CPLEX on your computer in any folder you like.
- Uncompress the program in any folder you like.
- Go to the program folder.
- Copy the matrix file you want to solve to the program folder.
- Modify the Makefile so that it contains the correct path to your CPLEX installation.
- Further modify the Makefile by replacing example.matrix with the name of the matrix file you want to solve.
(Note: To see what each command-line argument of "java HapAssemble" means,
type "java HapAssemble" on its own.)
- Type "make run" to start solving the matrix.
- Note on the output: If the input matrix has a column in which no 0 (respectively, no 1)
appears, then the output haplotype is always 1 (respectively, 0) in that column,
even when the program is used to solve the all-heterozygous case. To avoid this in the
all-heterozygous case, add a length-1 read that contains only a 0 (respectively, 1)
in that column.
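The workaround in the note above can be prepared with a small helper script. The following is only a sketch, not part of the distributed program: it assumes the reads have already been loaded as column-aligned strings over the symbols 0 and 1 plus a gap symbol "-" (the gap symbol is an assumption; check the example matrix file for the actual encoding), and the function names one_sided_columns and padding_read are hypothetical.

```python
def one_sided_columns(reads, gap="-"):
    """Map each column index to the symbol ("0" or "1") that never appears in it.

    Columns that contain both symbols, or only gaps, are not reported.
    `reads` is a list of column-aligned strings (an assumed representation).
    """
    width = max((len(r) for r in reads), default=0)
    missing = {}
    for j in range(width):
        # Collect the non-gap symbols that actually occur in column j.
        symbols = {r[j] for r in reads if j < len(r) and r[j] != gap}
        if symbols == {"1"}:
            missing[j] = "0"
        elif symbols == {"0"}:
            missing[j] = "1"
    return missing


def padding_read(column, symbol, width, gap="-"):
    """Build a length-1 read: `symbol` at `column`, gaps everywhere else."""
    return gap * column + symbol + gap * (width - column - 1)


if __name__ == "__main__":
    reads = ["01-", "-10", "11-"]
    for column, symbol in sorted(one_sided_columns(reads).items()):
        print(column, padding_read(column, symbol, width=3))
```

Each reported column is one where the note applies; the string returned by padding_read still has to be written into the matrix file in whatever row format that file uses.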
- The old program and
an example matrix file.
To compile and run the old program, proceed as follows.
- Install CPLEX on your computer in any folder you like.
- Uncompress the program in any folder you like.
- Go to the program folder.
- Modify the Makefile so that it contains the correct path to your CPLEX installation.
- Compile the program by typing the command "make CPLEX".
- Copy the matrix file you want to solve to the program folder.
(Note:
Your matrix file must have the same format as the example matrix file.
That is, each row of the file must start with the ID of the row, followed by
two spaces and then a read. Moreover, when you solve an instance stored in
a file named Prefix.Suffix, the same directory must contain no other files
whose names start with Prefix.)
- To solve the general problem (respectively, the all-heterozygous case) for the matrix,
type the command "sh hapAssembly4.sh matrix_file_name 0 -1"
(respectively, "sh hapAssembly2.sh matrix_file_name 0").
(Note: To see how to run the shell scripts hapAssembly4.sh
and hapAssembly2.sh, read the first paragraph of each file.)
- Note on the output: If the input matrix has a column in which no 0 (respectively, no 1)
appears, then the output haplotype is always 1 (respectively, 0) in that column,
even when the program is used to solve the all-heterozygous case. To avoid this in the
all-heterozygous case, add a length-1 read that contains only a 0 (respectively, 1)
in that column.
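The two requirements on the matrix file stated above (row format and no other files sharing the prefix) can be checked before running the shell scripts. The following is a minimal sketch under stated assumptions, not part of the distributed program: it assumes that neither the row ID nor the read contains whitespace, and that Prefix is the part of the file name before the last dot. The function names malformed_rows and prefix_conflicts are hypothetical.

```python
import os
import re

# Assumed row shape: an ID, exactly two spaces, then a read (no whitespace in either).
ROW_PATTERN = re.compile(r"^\S+  \S+$")


def malformed_rows(lines):
    """Return the 1-based numbers of lines that do not look like 'ID<2 spaces>read'."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not ROW_PATTERN.match(line.rstrip("\n")):
            bad.append(number)
    return bad


def prefix_conflicts(matrix_path):
    """List other files in the same directory whose names start with the prefix."""
    directory = os.path.dirname(os.path.abspath(matrix_path))
    name = os.path.basename(matrix_path)
    # Assumption: Prefix is everything before the last dot of the file name.
    prefix = os.path.splitext(name)[0]
    return sorted(
        entry for entry in os.listdir(directory)
        if entry.startswith(prefix) and entry != name
    )


if __name__ == "__main__":
    import sys
    path = sys.argv[1]
    with open(path) as handle:
        print("malformed rows:", malformed_rows(handle))
    print("conflicting files:", prefix_conflicts(path))
```

If the script is saved under a name of your choice and run with the matrix file name as its only argument, two empty lists indicate that both requirements are met under the assumptions above.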
- The supplementary material.
- Optimal Solutions for the HuRef Dataset in the All-Heterozygous Case:
chromosome 1,
chromosome 2,
chromosome 3,
chromosome 4,
chromosome 5,
chromosome 6,
chromosome 7,
chromosome 8,
chromosome 9,
chromosome 10,
chromosome 11,
chromosome 12,
chromosome 13,
chromosome 14,
chromosome 15,
chromosome 16,
chromosome 17,
chromosome 18,
chromosome 19,
chromosome 20,
chromosome 21,
chromosome 22.
- Optimal Solutions for the HuRef Dataset in the General Case:
chromosome 1,
chromosome 2,
chromosome 3,
chromosome 4,
chromosome 5,
chromosome 6,
chromosome 7,
chromosome 8,
chromosome 9,
chromosome 10,
chromosome 11,
chromosome 12,
chromosome 13,
chromosome 14,
chromosome 15,
chromosome 16,
chromosome 17,
chromosome 18,
chromosome 19,
chromosome 20,
chromosome 21,
chromosome 22.