Matrix Transpose

I multiplied two 800x800 matrices ten thousand times. Each element is an integer in the range one to five.

yrchen gave me a hint yesterday afternoon: the matrices should be transposed before a huge computation. In today's experiment I wrote two versions of the code, one without any transpose loop (called m1) and one with it (called m1_t). Both were compiled with gcc at -O3. The results are:

real 5m31.300s
user 5m31.245s
sys 0m0.020s

real 5m30.802s
user 5m30.265s
sys 0m0.008s

The two runs differ by only about half a second out of five and a half minutes, so it looks as though -O3 already did the transpose before the huge matrix multiplication.
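One way to probe that guess is to keep both loop shapes in a single program and time them side by side. The following is only a minimal sketch, not the code used for the measurements above: the size N = 200 and the helper names fill, mul_plain, mul_transposed, checksum and time_variant are all assumptions made for illustration. The checksum is there so that the result of each multiplication is actually used.

```c
#include <assert.h>
#include <time.h>

#define N 200  /* hypothetical size, smaller than the 800 used in the post */

typedef void (*mul_fn)(int a[][N], int b[][N], int c[][N]);

/* Fill a matrix with integers from 1 to 5 using a small deterministic generator. */
void fill(int m[][N], unsigned seed)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            seed = seed * 1103515245u + 12345u;
            m[i][j] = (int)((seed >> 16) % 5) + 1;
        }
}

/* Plain triple loop: the inner loop walks down a column of b. */
void mul_plain(int a[][N], int b[][N], int c[][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;
        }
}

/* Transposed variant: copy b into bt first, then walk along rows of both operands. */
void mul_transposed(int a[][N], int b[][N], int c[][N])
{
    static int bt[N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            bt[i][j] = b[j][i];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++)
                sum += a[i][k] * bt[j][k];
            c[i][j] = sum;
        }
}

/* Sum of all elements of c, so that the product is actually consumed. */
long checksum(int c[][N])
{
    long total = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            total += c[i][j];
    return total;
}

/* CPU seconds spent on `reps` calls of one variant. */
double time_variant(mul_fn f, int a[][N], int b[][N], int c[][N], int reps)
{
    assert(reps > 0);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        f(a, b, c);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Called from a main that fills a and b with fill, the two values returned by time_variant can be compared directly at different optimization levels, and equal checksums confirm that both variants compute the same product.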


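/* m1_t: the version with the transpose loop */
#include <stdio.h>
#include <stdlib.h>

#define DIE 800

void gen(int a[][DIE]);
void mul(int a[][DIE], int b[][DIE]);

int main(int argc, char *argv[])
{
    /* static: each 800x800 int array is about 2.5 MB, too much to keep several on the stack */
    static int a[DIE][DIE];
    static int b[DIE][DIE];
    int i;

    /* fill both operands with integers from 1 to 5 */
    gen(a);
    gen(b);

    for (i = 0; i < 10000; i++) {
        mul(a, b);
    }

    return 0;
}

void gen(int a[][DIE])
{
    int i, j;

    for (i = 0; i < DIE; i++)
        for (j = 0; j < DIE; j++)
            a[i][j] = random() % 5 + 1;
}

void mul(int a[][DIE], int b[][DIE])
{
    int i, j, k;
    static int c[DIE][DIE];
    static int bt[DIE][DIE];    /* transpose of b */

    /* transpose b so that the inner loop reads both operands row by row */
    for (i = 0; i < DIE; i++)
        for (j = 0; j < DIE; j++)
            bt[i][j] = b[j][i];

    for (i = 0; i < DIE; i++) {
        for (j = 0; j < DIE; j++) {
            c[i][j] = 0;
            for (k = 0; k < DIE; k++)
                c[i][j] += a[i][k] * bt[j][k];
            //printf("%d\t", c[i][j]);
        }
    }
}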
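As for why the transpose hint is plausible at all: C stores a two-dimensional array row by row, so stepping down a column jumps a whole row ahead in memory, while stepping along a row moves to the adjacent element. The sketch below just measures those two step sizes for an 800x800 int array. The function names column_step_bytes, row_step_bytes and check_steps are hypothetical, and the arrays b and bt are local to the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define DIE 800

/* Bytes between b[k][j] and b[k+1][j]: the step of an inner loop that walks down a column. */
size_t column_step_bytes(void)
{
    static int b[DIE][DIE];
    return (size_t)((char *)&b[1][0] - (char *)&b[0][0]);
}

/* Bytes between bt[j][k] and bt[j][k+1]: the step of an inner loop that walks along a row. */
size_t row_step_bytes(void)
{
    static int bt[DIE][DIE];
    return (size_t)((char *)&bt[0][1] - (char *)&bt[0][0]);
}

/* The column step is exactly DIE times the row step (3200 vs 4 bytes if int is 4 bytes). */
void check_steps(void)
{
    assert(row_step_bytes() == sizeof(int));
    assert(column_step_bytes() == DIE * sizeof(int));
}
```

With the transpose in place, the inner loop over k takes the small step for both operands, which is the access pattern the hint is aiming for.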