Getting ready for my Transfer of Status and the trouble with MPJ-Express and rJava

I am coming to the end of my time as a Probationer Research Student and I have about 2.5 months to put together a 15-30 page document to convince the Dept. that I should be allowed to become a D.Phil student proper and continue my research. I am hopeful about the process but also very stressed out about it. The basic idea is to see whether I can produce original research and whether I can communicate what I find. I know I can do both. I started doing original research as part of my undergraduate degree and I don’t really see a problem with striking out on my own. I have written enough papers and essays to be confident that, after a number of drafts, I will have a document that will pass muster.

But that doesn’t stop the nerves. I am going to be tested and I will have to answer questions about my research in front of a panel for an hour or two. I need to remember that I can do this and that I am a capable researcher.

It doesn’t help that today I have had a series of code failures to deal with. I have moved to a new set of servers operated by the Oxford Supercomputing Centre. This means that I have access to up to 128 cores at a time, but I have had to teach myself how to code for supercomputers and I am regretting not taking that class as part of my undergraduate degree. I have been teaching myself Message Passing Interface (MPI) in the form of MPJ-Express so that I can continue to code in Java and just pass my .jar files to the server for execution. My main task with MPJ-Express is to parallelise Niall Cardin’s treesim. Treesim is the program that I use to create trees from HapMap and 1000 Genomes data, which I then analyse to determine where positive selection is occurring. It has taken me about 2 weeks to get this up and running. The MPJ-Express part itself wasn’t too hard: I managed to get the wrapper up and running fairly quickly, if not perfectly. It was controlling treesim from Java that proved to be the big problem. I was trying to use Java’s Runtime.exec() or ProcessBuilder to launch and manage the single-threaded treesim process, but I had to replace that with the Apache Commons Exec library (btw THANK YOU ASF for being there!).
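For anyone fighting the same thing, here is a minimal sketch of launching an external program from Java and waiting for it to finish. It is not the Commons Exec code I ended up with: it uses only the standard library's ProcessBuilder, and it runs `java -version` as a stand-in because treesim's command line is not shown here. The class and method names (ExternalRun, run, javaVersionCommand) are made up for illustration. One thing that commonly bites people with Runtime.exec() and ProcessBuilder is a child process stalling because nobody reads its output, so the sketch merges stderr into stdout and drains it before calling waitFor(). I am not claiming that was the cause of my own trouble.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs one external command and captures its output.
final class ExternalRun {
    final int exitCode;
    final List<String> outputLines;

    ExternalRun(int exitCode, List<String> outputLines) {
        this.exitCode = exitCode;
        this.outputLines = outputLines;
    }

    static ExternalRun run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so there is only one stream to drain.
        builder.redirectErrorStream(true);
        Process process = builder.start();

        List<String> lines = new ArrayList<>();
        // Read until the child closes its output; an unread pipe can fill up
        // and leave the child blocked on a write.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int exitCode = process.waitFor();
        return new ExternalRun(exitCode, lines);
    }

    // Placeholder command (an assumption): the java binary of the running JVM.
    static List<String> javaVersionCommand() {
        String javaBinary = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        return Arrays.asList(javaBinary, "-version");
    }
}
```

Calling ExternalRun.run(ExternalRun.javaVersionCommand()) returns the exit code and the captured lines; swapping in the real command is just a matter of passing a different list of strings.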

Currently my MPJ-Express code launches all the daemons and then the head node sends a series of non-blocking commands to all the cores. Each core then takes its command and runs an instance of treesim. I would like to change this so that when a core is free it polls the head node for the next available command, but I am just happy that it is working at this point in time.
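The polling idea can be sketched without MPI at all. The example below is an illustration under assumptions, not my MPJ-Express code: plain Java threads stand in for the cores, a shared queue stands in for the head node, and "running" a command is just recording it. The names (PullDispatcher, runAll) are hypothetical. The point is only the shape of the loop: a worker asks for the next command whenever it is free, so fast workers naturally pick up more work than slow ones.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of pull-based scheduling with threads instead of MPI ranks.
final class PullDispatcher {

    static List<String> runAll(List<String> commands, int workers) throws InterruptedException {
        // The "head node": a thread-safe queue of commands still to be run.
        Queue<String> pending = new ConcurrentLinkedQueue<>(commands);
        List<String> completed = Collections.synchronizedList(new ArrayList<>());

        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int rank = w + 1; // rank 0 would be the head node in MPI terms
            Thread worker = new Thread(() -> {
                String command;
                // Poll for the next command whenever this worker is free;
                // a null result means the queue is empty and the worker stops.
                while ((command = pending.poll()) != null) {
                    // Placeholder for actually launching the external program.
                    completed.add("worker " + rank + " ran: " + command);
                }
            });
            threads.add(worker);
            worker.start();
        }
        for (Thread worker : threads) {
            worker.join();
        }
        return completed;
    }
}
```

With real MPI the queue would live on the head node and each poll would become a request message followed by a reply, but the worker loop would keep the same structure.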

I have also been learning rJava to deal with the output of another Java application that I have written. rJava is a bit of a nightmare because I don’t think enough people blog about it, so it took me a long time to find a simple command that fixed my problem. I was having problems translating a Java matrix into an R matrix, but the following code sorted out the problem.

This is the signature of the Java method:

public double[][] getData()

I tested that within a Java environment and it was working fine. But when I moved it to an R environment it wouldn’t translate. The following code is what was needed to fix the problem of moving a 2D array from Java to R in rJava. Your .class file needs to be in the R/library/rJava/java directory of your install, or you need to add its location with .jaddClassPath() in R.

library("rJava")  # loads the rJava library
.jinit(parameters = "-Xmx10240m")  # starts the JVM with -Xmx10240m since I needed 10GB of memory for my process
s <- .jarray("string", "args")  # creates a String[]
javaobj <- .jnew("NameOfClass", s)  # creates the object by calling the constructor that takes a String[]
array <- .jcall(javaobj, "[[D", "getData")  # executes the method of your choice; "[[D" is the JNI signature for double[][]
array <- sapply(array, .jevalArray)  # this is what I was missing: converts each double[] row into an R vector

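After sapply I could use the matrix normally. Note that sapply puts each Java row into its own column, so wrap the result in t() if you want the same orientation as the Java array.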
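The reason this needs an extra step is that a Java double[][] is an array of separate double[] row objects, whereas an R matrix is a single vector stored column by column. An alternative I have not used here, sketched under that assumption, is to flatten the array on the Java side in column-major order and hand R a plain double[]. The class and method names (MatrixExport, toColumnMajor) are hypothetical, and the sketch rejects ragged arrays because they have no matrix equivalent.

```java
// Hypothetical helper: flattens a rectangular double[][] into column-major order,
// which is the layout R uses for matrices.
final class MatrixExport {

    static double[] toColumnMajor(double[][] rows) {
        int nRow = rows.length;
        int nCol = nRow == 0 ? 0 : rows[0].length;
        double[] flat = new double[nRow * nCol];
        for (int i = 0; i < nRow; i++) {
            if (rows[i].length != nCol) {
                throw new IllegalArgumentException("ragged array at row " + i);
            }
            for (int j = 0; j < nCol; j++) {
                // Element (i, j) goes to position j * nRow + i.
                flat[j * nRow + i] = rows[i][j];
            }
        }
        return flat;
    }
}
```

On the R side, a vector laid out like this can be turned back into a matrix with matrix(flat, nrow = nRow), provided the row count is passed across as well.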

In other news, I ran across one of my students and he told me how he used my class and some of my advice to replace a program he was using with a better one he wrote himself. It felt very good. I have been in touch with ITLP at Oxford and they want me to run the Perl class again next term but they also want me to put together a new distance learning Python class. I am very excited about this.
