As I discussed in my last blog post, we now have two engine-specific constants, `read_time` and `scan_time`, and two global constants, `time_for_compare` and `time_for_compare_rowid`.
To determine the time taken by the operations related to each constant, we count how many times each of these operations is performed. This has been implemented for the constants mentioned above.
After each query, we write the data gathered into a file. To avoid collisions, we maintain different files for different threads.
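To make the logging step concrete, here is a minimal sketch in Python. It is not the actual server code: the class `QueryCounters`, the helpers `log_path` and `write_row`, and the file name pattern `coefficients_<thread id>.txt` are all hypothetical, and the row holds only the four named counters rather than the full per-engine layout.

```python
import os
import threading

# Hypothetical counter names, taken from the four constants discussed above.
COUNTERS = ("read_time", "scan_time", "time_for_compare", "time_for_compare_rowid")


class QueryCounters:
    """Counts how often each costed operation runs during one query."""

    def __init__(self):
        self.counts = {name: 0 for name in COUNTERS}

    def bump(self, name, n=1):
        self.counts[name] += n

    def as_row(self, total_time):
        # One line per query: the counts followed by the measured total time.
        values = [self.counts[name] for name in COUNTERS] + [total_time]
        return " ".join(str(v) for v in values)


def log_path(directory, thread_id=None):
    # One file per thread, so concurrent queries never append to the same file.
    if thread_id is None:
        thread_id = threading.get_ident()
    return os.path.join(directory, "coefficients_%d.txt" % thread_id)


def write_row(directory, counters, total_time, thread_id=None):
    # Append, so each finished query adds one equation to this thread's file.
    with open(log_path(directory, thread_id), "a") as f:
        f.write(counters.as_row(total_time) + "\n")
```

The per-thread files can simply be concatenated afterwards, since every line is an independent equation.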
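The next step is to solve for t_{1}, t_{2}, …, t_{130}. Each query gives one equation, in which a_{i} is the number of times operation i was performed, t_{i} is the time taken by one such operation, and t_{total} is the total time measured for the query:
> a_{1}t_{1} + a_{2}t_{2} + … + a_{130}t_{130} = t_{total}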

Note that we have 130 constants, `(2 + 2*64)`, where 64 is the maximum number of storage engines allowed. Next, I needed a set of equations, so I gathered some data by running `mtr`. Most of the coefficients are 0, since we only use a few engines.
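As an illustration of why the rows are so sparse, here is a small Python sketch that builds one equation row. The column layout (the two global constants first, then a `read_time`/`scan_time` pair per engine slot) is an assumption made for this example, and `column_for` and `build_row` are hypothetical helpers, not functions from my code.

```python
MAX_ENGINES = 64          # maximum number of storage engines, as stated above
GLOBAL_CONSTANTS = 2      # time_for_compare, time_for_compare_rowid
PER_ENGINE_CONSTANTS = 2  # read_time, scan_time
NUM_CONSTANTS = GLOBAL_CONSTANTS + PER_ENGINE_CONSTANTS * MAX_ENGINES  # 130


def column_for(engine_slot, which):
    """Column index of an engine-specific constant (assumed layout).

    which is 0 for read_time and 1 for scan_time.
    """
    if not 0 <= engine_slot < MAX_ENGINES:
        raise ValueError("engine slot out of range")
    return GLOBAL_CONSTANTS + PER_ENGINE_CONSTANTS * engine_slot + which


def build_row(compare, compare_rowid, engine_counts, total_time):
    """Return the 130 coefficients followed by t_total for one query.

    engine_counts maps an engine slot to (read_count, scan_count).
    """
    row = [0] * NUM_CONSTANTS
    row[0] = compare
    row[1] = compare_rowid
    for slot, (reads, scans) in engine_counts.items():
        row[column_for(slot, 0)] = reads
        row[column_for(slot, 1)] = scans
    return row + [total_time]


# A query that touches a single engine fills only 3 of the 130 columns.
row = build_row(compare=40, compare_rowid=0, engine_counts={3: (12, 1)}, total_time=0.8)
print(len(row), sum(1 for a in row[:-1] if a != 0))  # prints "131 3"
```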
I started with the least squares method to solve these equations. Here is a simple Octave script that does that for me:

```
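% Each row holds the coefficients followed by the total time.
A = dlmread('coefficients.txt', ' ');
X = A(:, 1:end-1);   % coefficient matrix
Y = A(:, end);       % total time per query
% Ordinary least squares estimate of the constants.
S = ols(Y, X);
dlmwrite('solutions.txt', S);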
```

The results are not promising: I get negative values for some of the constants, which cannot be right for time costs. This may be caused by outliers in my sample data, introduced by a bug in my code. I will fix it soon and then try again. I will also try out other methods, some of which will be specific to the case of sparse matrix input. You can download the dataset from here.
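One consequence of the sparsity is easy to check: a column that is zero in every row says nothing about its constant, so that constant cannot be estimated from the data. The sketch below, in Python, drops such columns before fitting and then flags negative estimates. It is an illustration under assumptions, not the method I used above: `used_columns`, `solve_least_squares` and `fit` are hypothetical helpers, the tiny data set is made up, and the normal equations are solved by plain Gaussian elimination, which is less numerically robust than a library routine.

```python
def used_columns(rows):
    """Indices of coefficient columns that are non-zero in at least one row."""
    n = len(rows[0]) - 1  # the last entry of each row is t_total
    return [j for j in range(n) if any(r[j] != 0 for r in rows)]


def solve_least_squares(X, y):
    """Solve the normal equations (X^T X) s = X^T y by Gauss-Jordan elimination."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    M = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(n)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("columns are linearly dependent; more varied queries needed")
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(n):
            if k != col:
                f = M[k][col] / M[col][col]
                M[k] = [a - f * b for a, b in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit(rows):
    """Map each used column index to its least squares estimate."""
    cols = used_columns(rows)
    X = [[r[j] for j in cols] for r in rows]
    y = [r[-1] for r in rows]
    return dict(zip(cols, solve_least_squares(X, y)))


# Made-up data: four columns, of which columns 1 and 3 are never used.
rows = [
    [2, 0, 1, 0, 3.0],
    [4, 0, 0, 0, 2.0],
    [1, 0, 3, 0, 6.5],
]
estimates = fit(rows)
negatives = {j: v for j, v in estimates.items() if v < 0}
print(estimates, negatives)
```

This only removes constants that the data cannot determine at all. It does not force the remaining estimates to be non-negative, so it is a sanity check rather than a fix.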

As usual, you can find the code at my repo. Comments and suggestions are welcome.
