Categories

Rocketship Overview


Background

Many years ago I decided that I liked the idea of UML.  The main drawback was that it imposed some pretty strict restraints when you were only trying to get the initial ideas presented.  It failed for what I was trying to accomplish, namely loose specification of a piece of software before I began writing it.   I tried to use it as a sort of "scratch pad" to flesh out my ideas so I could come up with the general approach.   While it did not fit this role very well, the documentation it presented made it easy to refer to when looking back on things I had done.  There was no denying the benefits of good documentation.

There are several tools out there that will automatically generate documentation about your code for you.  Doxygen is one of my personal favorites, generating multiple diagrams as well as the full documentation you add alongside your code.  However, it only provides a limited set of diagrams and graphs (dependency graphs, inheritance diagrams and collaboration diagrams).   It is designed around the idea of extracting limited information from the code and creating useful documentation based on that information.

One of the often overlooked UML diagrams is the Activity Diagram.   The usefulness of activity diagrams is quickly lost if they are not constantly maintained with every code change.   They describe the actual behavior of a system, easily representing complex behaviors in a straightforward manner.  While it would be great if they could be generated by Doxygen, it requires semantic knowledge of language internals to be able to automatically generate the diagrams.  Doxygen does not have the framework in place for the level of analysis required.

Introducing RocketShip

RocketShip was created as a way to extract more useful documentation from the actual code that has been written.  It will currently generate a directed graph that is very similar to UML activity diagrams from the code itself.   It is implemented as an analysis pass for LLVM's optimizer.   This allows it to be used to automatically generate the corresponding graphs every time the code is compiled (providing you load the RocketShip module and pass the -rocketship flag to the optimizer).

Key Features

Current Limitations

Release

RocketShip is available at http://github.com/ismarc/RocketShip.  As I have not decided on a license yet, you are free to download and use the software for evaluation, but please refrain from distribution.   Once I have decided on a license, the repository will be updated (and the license will be an Open Source license, I just wanted to make sure the source was available before I finished weighing the pros of each one).

Example

What new product would be complete without a demonstration of capabilities? Rather than a contrived set of code to generate a graph that looks wonderful, I have been testing RocketShip against code I had laying around from working on Project Euler problems. Below is the source code for the function, the assembly representation of the bitcode and the subsequent graph that was generated automatically.

Original Source

void multiplyBigInt(int first[], int second[], int destination[])
{
    int i;
    int temp[2000] = { 0 };
    int temp_two[2000] = { 0 };

    for (i = 0; i < 2000; i++) {
        assignBigInt(temp, 0);
        multiplyBigDigit(second, temp, first[i]);
        shiftBigDigit(temp, i);
        addBigInts(temp_two, temp, destination);
        copyBigInt(destination, temp_two);
    }
}

LLVM Assembly bitcode representation

define void @multiplyBigInt(i32* %first, i32* %second, i32* %destination) nounwind {
entry:
  %first_addr = alloca i32*                       ;  [#uses=2]
  %second_addr = alloca i32*                      ;  [#uses=2]
  %destination_addr = alloca i32*                 ;  [#uses=3]
  %i = alloca i32                                 ;  [#uses=6]
  %j = alloca i32                                 ;  [#uses=0]
  %temp = alloca [2000 x i32]                     ; <[2000 x i32]*> [#uses=5]
  %temp_two = alloca [2000 x i32]                 ; <[2000 x i32]*> [#uses=3]
  %"alloca point" = bitcast i32 0 to i32          ;  [#uses=0]
  store i32* %first, i32** %first_addr
  store i32* %second, i32** %second_addr
  store i32* %destination, i32** %destination_addr
  %temp1 = bitcast [2000 x i32]* %temp to i8*     ;  [#uses=1]
  call void @llvm.memset.i64(i8* %temp1, i8 0, i64 8000, i32 4)
  %temp_two2 = bitcast [2000 x i32]* %temp_two to i8* ;  [#uses=1]
  call void @llvm.memset.i64(i8* %temp_two2, i8 0, i64 8000, i32 4)
  store i32 0, i32* %i, align 4
  br label %bb9

bb:                                               ; preds = %bb9
  %temp3 = bitcast [2000 x i32]* %temp to i32*    ;  [#uses=1]
  call void @assignBigInt(i32* %temp3, i32 0) nounwind
  %0 = load i32** %first_addr, align 8            ;  [#uses=1]
  %1 = load i32* %i, align 4                      ;  [#uses=1]
  %2 = sext i32 %1 to i64                         ;  [#uses=1]
  %3 = getelementptr inbounds i32* %0, i64 %2     ;  [#uses=1]
  %4 = load i32* %3, align 1                      ;  [#uses=1]
  %5 = load i32** %second_addr, align 8           ;  [#uses=1]
  %temp4 = bitcast [2000 x i32]* %temp to i32*    ;  [#uses=1]
  call void @multiplyBigDigit(i32* %5, i32* %temp4, i32 %4) nounwind
  %temp5 = bitcast [2000 x i32]* %temp to i32*    ;  [#uses=1]
  %6 = load i32* %i, align 4                      ;  [#uses=1]
  call void @shiftBigDigit(i32* %temp5, i32 %6) nounwind
  %temp_two6 = bitcast [2000 x i32]* %temp_two to i32* ;  [#uses=1]
  %temp7 = bitcast [2000 x i32]* %temp to i32*    ;  [#uses=1]
  %7 = load i32** %destination_addr, align 8      ;  [#uses=1]
  call void @addBigInts(i32* %temp_two6, i32* %temp7, i32* %7) nounwind
  %8 = load i32** %destination_addr, align 8      ;  [#uses=1]
  %temp_two8 = bitcast [2000 x i32]* %temp_two to i32* ;  [#uses=1]
  call void @copyBigInt(i32* %8, i32* %temp_two8) nounwind
  %9 = load i32* %i, align 4                      ;  [#uses=1]
  %10 = add nsw i32 %9, 1                         ;  [#uses=1]
  store i32 %10, i32* %i, align 4
  br label %bb9

bb9:                                              ; preds = %bb, %entry
  %11 = load i32* %i, align 4                     ;  [#uses=1]
  %12 = icmp sle i32 %11, 1999                    ;  [#uses=1]
  br i1 %12, label %bb, label %bb10

bb10:                                             ; preds = %bb9
  br label %return

return:                                           ; preds = %bb10
  ret void
}

Generated Graph

multiplyBigInt

Discussion