Bioinformatics – Optimal Leaf Ordering

For my Bioinformatics project, I designed, with the help of Dr. Ivaylo Illinkin, a program that takes a set of numerical sequences (i.e. a list of lists) and orders the inner lists by how close they are to one another, as measured by their pairwise distances. The project was based on the paper “Fast Optimal Leaf Ordering for Hierarchical Clustering” by Ziv Bar-Joseph et al. The basic algorithm first builds a clustering tree of the sequences using the UPGMA algorithm. It then finds the leaf order with the minimum distance score by computing, for each subtree, the best score with a particular leaf in the very first position and a particular leaf in the very last position of the list. The algorithm checks all possible combinations of these leaves for the subtrees generated by the UPGMA clustering and therefore runs in O(n^4) time, where n is the number of leaves.
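The two steps described above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`upgma`, `best_orders`, `optimal_leaf_order`) are hypothetical, Euclidean distance is assumed as the distance measure, the score is assumed to be the sum of distances between adjacent leaves, and only orderings obtained by flipping subtrees of the UPGMA tree are considered. At least one input row is assumed.

```python
import math
from itertools import combinations

def upgma(dist):
    """Average-linkage (UPGMA) clustering; returns a tree of nested pairs of leaf indices."""
    clusters = [(i, [i]) for i in range(len(dist))]  # (subtree, its leaves)

    def avg(pair):
        # Mean distance between all leaves of the two candidate clusters.
        la, lb = clusters[pair[0]][1], clusters[pair[1]][1]
        return sum(dist[i][j] for i in la for j in lb) / (len(la) * len(lb))

    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2), key=avg)
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0]

def best_orders(tree, dist):
    """Map (first_leaf, last_leaf) -> (score, order) for the subtree `tree`."""
    if isinstance(tree, int):  # a single leaf
        return {(tree, tree): (0.0, [tree])}
    left, right = (best_orders(child, dist) for child in tree)
    table = {}
    # Try every (first, middle-left, middle-right, last) combination: the naive O(n^4) search.
    for (u, m), (cost_l, order_l) in left.items():
        for (k, w), (cost_r, order_r) in right.items():
            cost = cost_l + dist[m][k] + cost_r
            if cost < table.get((u, w), (math.inf,))[0]:
                table[(u, w)] = (cost, order_l + order_r)
                table[(w, u)] = (cost, (order_l + order_r)[::-1])  # mirrored order, same score
    return table

def optimal_leaf_order(rows):
    """Return (order, score): row indices in optimal leaf order and the total adjacent distance."""
    dist = [[math.dist(a, b) for b in rows] for a in rows]
    score, order = min(best_orders(upgma(dist), dist).values())
    return order, score

# Example: order four one-dimensional sequences.
order, score = optimal_leaf_order([[0], [10], [1], [11]])
print(order, score)
```

Keeping a full ordering in every table entry makes the sketch easy to read but costs extra memory; storing only scores and backtracking afterwards would be the more economical choice.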

Here is a demonstration of what the program actually does.

[Figure: optimal leaf ordering example]

Please visit this site on Optimal Leaf Ordering to learn more about what the program does or to use the program itself.