Benchmark

miniBench, an open source benchmark

miniBench is part of OpenSourceMark, an open source benchmark project. Despite the sound of the name, it's not platform independent (maybe the source code is cross-platform). I have only tried it on Windows, and I don't know if the source will work on another OS.
miniBench mainly benchmarks the CPU, but it also benchmarks I/O and memory.
miniBench writes its output directly to the command line, so it's best to capture the output in a file. Open the command line, go to the directory containing minibench.exe, then type this:
minibench>filename.txt
On the command line, any program>filename writes the program's output to a file. To append to the file, use >> instead of >.

And now, just wait. Don't let the computer run anything else, because it affects the accuracy of the benchmark.
Below is the ThinkPad T60 with a Core Duo T2500 2GHz CPU and 1GB RAM. Faster CPUs have a smaller elapsed time. The LCD looks strange, doesn't it? That's because once upon a time, alcohol was poured into it.

        *** SUMMARY  ***
=====================================
Aes encrypt 8kB test 	888.55
Aes decrypt 8kB test 	701.85
Aes encrypt 16kB test 	901.23
Aes decrypt 16kB test 	703.75
Aes encrypt 32kB test 	901.23
Aes decrypt 32kB test 	701.85
Aes encrypt 64kB test 	894.84
Aes decrypt 64kB test 	701.85
Aes encrypt 128kB test 	894.84
Aes decrypt 128kB test 	701.85
Aes encrypt 256kB test 	897.41
Aes decrypt 256kB test 	701.31
Aes encrypt 512kB test 	894.20
Aes decrypt 512kB test 	701.18
Aes encrypt 1,024kB test 	895.06
Aes decrypt 1,024kB test 	701.81
Aes encrypt 2,048kB test 	891.53
Aes decrypt 2,048kB test 	699.77
Aes encrypt 4,096kB test 	897.96
Aes decrypt 4,096kB test 	704.82
Aes encrypt 8,192kB test 	901.53
Aes decrypt 8,192kB test 	704.82
Aes encrypt 16,384kB test 	898.80
Aes decrypt 16,384kB test 	705.61
Dhrystone test 	1895.73
Double-precision floating-point arithmetic test 	1061.05
Fib( 43 ) test 	966.49
FFT test 	747.44
FlopsFpuFloatAdd (MFLOPS) 	324.08
FlopsSseFloatAdd (MFLOPS) 	1624.37
FlopsFpuFloatMul (MFLOPS) 	316.02
FlopsSseFloatMul (MFLOPS) 	1600.00
FlopsFpuFloatDiv (MFLOPS) 	61.80
FlopsSseFloatDiv (MFLOPS) 	233.78
FlopsSseFloatMulAdd (MFLOPS) 	2825.61
FlopsFpuDoubleAdd (MFLOPS) 	839.45
FlopsSse2DoubleAdd (MFLOPS) 	839.45
FlopsSse2DoubleAddMax (MFLOPS) 	877.51
FlopsFpuDoubleMul (MFLOPS) 	930.23
FlopsSse2DoubleMul (MFLOPS) 	853.79
FlopsSse2DoubleMulMax (MFLOPS) 	786.89
FlopsFpuDoubleDiv (MFLOPS) 	64.08
FlopsSse2DoubleDiv (MFLOPS) 	64.08
FlopsFpuDoubleMulAdd3 (MFLOPS) 	1428.57
FlopsSse2DoubleMulAdd (MFLOPS) 	1295.55
FlopsSse2DoubleMulAdd2 (MFLOPS) 	1536.00
FlopsSse2DoubleMulAdd3 (MFLOPS) 	1428.57
FlopsSse2DoubleMulAdd4 (MFLOPS) 	1545.89
FlopsSse2DoubleMulAdd5 (MFLOPS) 	1464.53
Heap sort test 	725.93
Integer arithmetic test 	887.82
Integer matrix multiplication test 	695.05
Linpack (FLOPS DP Rolled) 	1102.00
Linpack (FLOPS DP Unrolled) 	1105.00
Pi test 	857.37
Queen test 	989.29
Sha-1 small strings score 	972.97
Sha-1 50k string score 	995.02
Sha-1 1,000,000 string score 	998.64
Sha-1 10,000,000 score 	995.02
Sha-1 overall score 	1010.06
Sha-256 small strings score 	1130.96
Sha-256 50k string score 	1168.59
Sha-256 1,000,000 string score 	1176.72
Sha-256 10,000,000 score 	1173.69
Sha-256 overall score 	1160.85
Sieve test 	1070.96
String concatenation test 	830.00
Transcendental floating-point function test 	526.32
Whetstone test 	1170.96
I/O test 	24.90
Random Assignment test (MB/s) 	40.21
Stream copy bandwidth (MB/s)  	2123.91
Stream scale bandwidth (MB/s) 	2101.02
Stream add bandwidth (MB/s)   	2334.82
Stream triad bandwidth (MB/s) 	2334.95
Memory Bandwidth Integer Read test (MB/s) 	2414.73
Memory Bandwidth Integer Read Reverse test (MB/s) 	2285.71
Memory Bandwidth Integer Read BP64 test (MB/s) 	3977.53
Memory Bandwidth Integer Read PREFETCHNTA test (MB/s) 	4330.72
Memory Bandwidth Integer Write test (MB/s) 	1889.47
Memory Bandwidth Integer Write Unroll test (MB/s) 	1260.31
Memory Bandwidth Integer Copy test (MB/s) 	2197.50
Memory Bandwidth Integer Add test (MB/s) 	2370.37
Memory Bandwidth Integer Scale test (MB/s) 	2085.51
Memory Bandwidth Integer Triad test (MB/s) 	2359.65
Memory Bandwidth Integer Copy BP64 test (MB/s) 	1740.19
Memory Bandwidth Double Read test (MB/s) 	3250.71
Memory Bandwidth Double Read Reverse test (MB/s) 	2909.09
Memory Bandwidth Double Write test (MB/s) 	1314.49
Memory Bandwidth Double Copy test (MB/s) 	2204.46
Memory Bandwidth memcpy test (MB/s) 	2098.36
=====================================
=====================================
Total elapsed time: 594.67 s.

The other computer, a Dell Inspiron 1200 with a Celeron M 1.4GHz and 256MB memory, gives a much poorer result.

         *** SUMMARY  ***
=====================================
Aes encrypt 8kB test 	596.52
Aes decrypt 8kB test 	486.51
Aes encrypt 16kB test 	626.39
Aes decrypt 16kB test 	489.37
Aes encrypt 32kB test 	624.89
Aes decrypt 32kB test 	490.35
Aes encrypt 64kB test 	624.79
Aes decrypt 64kB test 	490.29
Aes encrypt 128kB test 	624.89
Aes decrypt 128kB test 	489.37
Aes encrypt 256kB test 	596.85
Aes decrypt 256kB test 	488.86
Aes encrypt 512kB test 	624.83
Aes decrypt 512kB test 	488.92
Aes encrypt 1,024kB test 	620.79
Aes decrypt 1,024kB test 	487.11
Aes encrypt 2,048kB test 	620.79
Aes decrypt 2,048kB test 	487.17
Aes encrypt 4,096kB test 	625.27
Aes decrypt 4,096kB test 	470.79
Aes encrypt 8,192kB test 	625.27
Aes decrypt 8,192kB test 	489.63
Aes encrypt 16,384kB test 	624.57
Aes decrypt 16,384kB test 	491.07
Dhrystone test 	1280.00
Double-precision floating-point arithmetic test 	734.10
Fib( 43 ) test 	658.63
FFT test 	361.28
FlopsFpuFloatAdd (MFLOPS) 	208.96
FlopsSseFloatAdd (MFLOPS) 	868.15
FlopsFpuFloatMul (MFLOPS) 	206.45
FlopsSseFloatMul (MFLOPS) 	890.37
FlopsFpuFloatDiv (MFLOPS) 	41.51
FlopsSseFloatDiv (MFLOPS) 	162.29
FlopsSseFloatMulAdd (MFLOPS) 	1863.17
FlopsFpuDoubleAdd (MFLOPS) 	453.00
FlopsSse2DoubleAdd (MFLOPS) 	492.31
FlopsSse2DoubleAddMax (MFLOPS) 	602.26
FlopsFpuDoubleMul (MFLOPS) 	445.19
FlopsSse2DoubleMul (MFLOPS) 	457.14
FlopsSse2DoubleMulMax (MFLOPS) 	495.87
FlopsFpuDoubleDiv (MFLOPS) 	43.84
FlopsSse2DoubleDiv (MFLOPS) 	44.52
FlopsFpuDoubleMulAdd3 (MFLOPS) 	690.15
FlopsSse2DoubleMulAdd (MFLOPS) 	711.11
FlopsSse2DoubleMulAdd2 (MFLOPS) 	1041.21
FlopsSse2DoubleMulAdd3 (MFLOPS) 	975.61
FlopsSse2DoubleMulAdd4 (MFLOPS) 	1050.04
FlopsSse2DoubleMulAdd5 (MFLOPS) 	952.38
Heap sort test 	413.71
Integer arithmetic test 	459.98
Integer matrix multiplication test 	506.86
Linpack (FLOPS DP Rolled) 	567.00
Linpack (FLOPS DP Unrolled) 	559.00
Pi test 	590.96
Queen test 	687.82
Sha-1 small strings score 	674.46
Sha-1 50k string score 	695.32
Sha-1 1,000,000 string score 	688.47
Sha-1 10,000,000 score 	656.42
Sha-1 overall score 	692.07
Sha-256 small strings score 	785.62
Sha-256 50k string score 	802.67
Sha-256 1,000,000 string score 	815.67
Sha-256 10,000,000 score 	815.86
Sha-256 overall score 	804.27
Sieve test 	725.45
String concatenation test 	33.59
Transcendental floating-point function test 	399.07
Whetstone test 	813.45
I/O test 	16.93
Random Assignment test (MB/s) 	24.07
Stream copy bandwidth (MB/s)  	983.69
Stream scale bandwidth (MB/s) 	1036.84
Stream add bandwidth (MB/s)   	1199.33
Stream triad bandwidth (MB/s) 	1200.60
Memory Bandwidth Integer Read test (MB/s) 	1510.29
Memory Bandwidth Integer Read Reverse test (MB/s) 	1497.01
Memory Bandwidth Integer Read BP64 test (MB/s) 	2164.80
Memory Bandwidth Integer Read PREFETCHNTA test (MB/s) 	2200.00
Memory Bandwidth Integer Write test (MB/s) 	1004.02
Memory Bandwidth Integer Write Unroll test (MB/s) 	619.91
Memory Bandwidth Integer Copy test (MB/s) 	1065.53
Memory Bandwidth Integer Add test (MB/s) 	1179.71
Memory Bandwidth Integer Scale test (MB/s) 	998.07
Memory Bandwidth Integer Triad test (MB/s) 	1157.52
Memory Bandwidth Integer Copy BP64 test (MB/s) 	854.89
Memory Bandwidth Double Read test (MB/s) 	1793.52
Memory Bandwidth Double Read Reverse test (MB/s) 	1799.57
Memory Bandwidth Double Write test (MB/s) 	585.48
Memory Bandwidth Double Copy test (MB/s) 	1076.75
Memory Bandwidth memcpy test (MB/s) 	1006.86
=====================================
=====================================
Total elapsed time: 904.39 s.

My sister's laptop

=== miniBench benchmark version 1.0===
         *** SUMMARY  ***
=====================================
Aes encrypt 8kB test 	491.15
Aes decrypt 8kB test 	469.98
Aes encrypt 16kB test 	478.01
Aes decrypt 16kB test 	442.62
Aes encrypt 32kB test 	488.21
Aes decrypt 32kB test 	448.14
Aes encrypt 64kB test 	462.21
Aes decrypt 64kB test 	448.14
Aes encrypt 128kB test 	414.10
Aes decrypt 128kB test 	461.34
Aes encrypt 256kB test 	481.84
Aes decrypt 256kB test 	470.50
Aes encrypt 512kB test 	484.76
Aes decrypt 512kB test 	480.91
Aes encrypt 1,024kB test 	477.88
Aes decrypt 1,024kB test 	472.93
Aes encrypt 2,048kB test 	467.24
Aes decrypt 2,048kB test 	452.64
Aes encrypt 4,096kB test 	468.73
Aes decrypt 4,096kB test 	481.33
Aes encrypt 8,192kB test 	475.41
Aes decrypt 8,192kB test 	484.28
Aes encrypt 16,384kB test 	481.55
Aes decrypt 16,384kB test 	479.34
Dhrystone test 	1603.21
Double-precision floating-point arithmetic test 	1159.57
Fib( 43 ) test 	1333.60
FFT test 	538.92
FlopsFpuFloatAdd (MFLOPS) 	185.46
FlopsSseFloatAdd (MFLOPS) 	453.77
FlopsFpuFloatMul (MFLOPS) 	171.80
FlopsSseFloatMul (MFLOPS) 	287.30
FlopsFpuFloatDiv (MFLOPS) 	78.00
FlopsSseFloatDiv (MFLOPS) 	193.14
FlopsSseFloatMulAdd (MFLOPS) 	2485.44
FlopsFpuDoubleAdd (MFLOPS) 	379.87
FlopsSse2DoubleAdd (MFLOPS) 	236.34
FlopsSse2DoubleAddMax (MFLOPS) 	279.72
FlopsFpuDoubleMul (MFLOPS) 	371.57
FlopsSse2DoubleMul (MFLOPS) 	214.53
FlopsSse2DoubleMulMax (MFLOPS) 	389.61
FlopsFpuDoubleDiv (MFLOPS) 	82.31
FlopsSse2DoubleDiv (MFLOPS) 	95.32
FlopsFpuDoubleMulAdd3 (MFLOPS) 	810.13
FlopsSse2DoubleMulAdd (MFLOPS) 	505.21
FlopsSse2DoubleMulAdd2 (MFLOPS) 	759.49
FlopsSse2DoubleMulAdd3 (MFLOPS) 	707.44
FlopsSse2DoubleMulAdd4 (MFLOPS) 	488.36
FlopsSse2DoubleMulAdd5 (MFLOPS) 	1412.80
Heap sort test 	549.20
Integer arithmetic test 	360.28
Integer matrix multiplication test 	387.53
Linpack (FLOPS DP Rolled) 	827.00
Linpack (FLOPS DP Unrolled) 	813.00
Pi test 	456.84
Queen test 	967.82
Sha-1 small strings score 	814.92
Sha-1 50k string score 	839.37
Sha-1 1,000,000 string score 	824.74
Sha-1 10,000,000 score 	832.07
Sha-1 overall score 	844.27
Sha-256 small strings score 	1160.68
Sha-256 50k string score 	1238.79
Sha-256 1,000,000 string score 	1235.87
Sha-256 10,000,000 score 	1232.60
Sha-256 overall score 	1216.92
Sieve test 	890.98
String concatenation test 	581.44
Transcendental floating-point function test 	547.34
Whetstone test 	1194.74
I/O test 	87.18
Random Assignment test (MB/s) 	27.79
Stream copy bandwidth (MB/s)  	2083.04
Stream scale bandwidth (MB/s) 	1974.68
Stream add bandwidth (MB/s)   	1535.04
Stream triad bandwidth (MB/s) 	1569.99
Memory Bandwidth Integer Read test (MB/s) 	1721.17
Memory Bandwidth Integer Read Reverse test (MB/s) 	1692.41
Memory Bandwidth Integer Read BP64 test (MB/s) 	3072.75
Memory Bandwidth Integer Read PREFETCHNTA test (MB/s) 	2861.05
Memory Bandwidth Integer Write test (MB/s) 	1544.70
Memory Bandwidth Integer Write Unroll test (MB/s) 	1563.00
Memory Bandwidth Integer Copy test (MB/s) 	2026.86
Memory Bandwidth Integer Add test (MB/s) 	2051.28
Memory Bandwidth Integer Scale test (MB/s) 	1976.28
Memory Bandwidth Integer Triad test (MB/s) 	2051.28
Memory Bandwidth Integer Copy BP64 test (MB/s) 	1927.53
Memory Bandwidth Double Read test (MB/s) 	2039.26
Memory Bandwidth Double Read Reverse test (MB/s) 	2039.00
Memory Bandwidth Double Write test (MB/s) 	1231.34
Memory Bandwidth Double Copy test (MB/s) 	2097.40
Memory Bandwidth memcpy test (MB/s) 	2011.06
=====================================
=====================================
Total elapsed time: 712.45 s.

More on Loops--Fusion and Unwinding

I wonder why it's not fusion and defusion...
Loop fusion should be a very common thing in PHP programming.
For example, take 2 arrays:

for($i= 0;$i<30;++$i){
	$a[$i]++;
}
for($i = 0;$i<30;++$i){
	$b[$i]++;
}

FUSION!!!
for($i = 0; $i<30; ++$i){
	$a[$i]++;
	$b[$i]++;
}

On the other hand, loop unwinding is rarely seen in PHP.
Unwinding reduces the loop overhead by hand-coding the next few iterations into the loop body.
For example:
for($i=0;$i<101;++$i){
echo $i;
}

unwind
for($i=0;$i<101;$i+=4){
echo $i;
echo $i+1;
echo $i+2;
echo $i+3;
}

This technique is known to give some speed gain, but unwinding is not all-powerful; it comes with a cost.
For example, your code will be longer and harder to read.
Unwinding also requires some prior knowledge about the loop: the number of iterations has to be a multiple of the unwinding factor. The loop above runs 101 times, which is not a multiple of 4, so the unwound version echoes 0 through 103, three items too many. If the loop ran 100 times instead, the two versions would output the same items.
It is possible to treat this by working out the remaining iterations and running them in another loop, but this makes the code more complex than it should be.
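As a rough sketch of that "another loop" idea: the function names below are made up for illustration, and the values are collected into an array instead of echoed so the two versions can be compared.

```php
<?php
// Hypothetical helper: the plain loop, collecting values instead of echoing.
function plain_loop($n) {
    $out = array();
    for ($i = 0; $i < $n; ++$i) {
        $out[] = $i;
    }
    return $out;
}

// Hypothetical helper: the same loop unwound by 4, plus a clean-up loop.
function unwound_loop($n) {
    $out = array();
    // Largest multiple of 4 that is not greater than $n.
    $limit = $n - ($n % 4);
    for ($i = 0; $i < $limit; $i += 4) {
        $out[] = $i;
        $out[] = $i + 1;
        $out[] = $i + 2;
        $out[] = $i + 3;
    }
    // Clean-up loop for the 0 to 3 leftover iterations.
    for (; $i < $n; ++$i) {
        $out[] = $i;
    }
    return $out;
}

var_dump(unwound_loop(101) === plain_loop(101)); // bool(true)
```

The clean-up loop is exactly the extra complexity mentioned above: it is only there for the iterations that do not fill a whole group of 4.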

How do you loop through a non-associative array?

Often, I see people loop through non-associative arrays like this:

for($i=0;$i<count($array);$i++){
//do stuff to $array[$i];
}

List of my initial thoughts

So I would rewrite it with count() called only once, before the loop:

$i = 0;
$count = count($array);
while($i<$count){
	//do stuff to $array[$i]
	++$i;
}

OK, now many people would start moaning about how the while loop is evil, since a for loop is easier to understand and takes fewer lines. Use a for loop if you want...

Usually, I would stop optimizing here, since I can't see anything else that could increase the speed. This morning, I had to test one of my thoughts about looping through the array backward, so I got something like this:

$count = count($array);
$i = $count;
while($i){
	--$i;
	//do stuff to $array[$i]
}

This is the most optimized loop yet: it needs one less comparison ($i<$count).
BIG DEAL!
I benchmarked the non-optimized code against the last one, looping through a 10000-key array. The non-optimized loop is 3 times slower than the most optimized version.
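A minimal sketch of how such a comparison could be set up, assuming a 10000-element array of integers and 100 repetitions per version (both numbers and the function names are made up for illustration). The timings depend on the PHP version and the machine, so none are shown here.

```php
<?php
// Assumed test data: 10000 integer keys, 0 to 9999.
$array = range(1, 10000);

// Version 1: count() is called on every iteration.
function loop_naive($array) {
    $sum = 0;
    for ($i = 0; $i < count($array); $i++) {
        $sum += $array[$i];
    }
    return $sum;
}

// Version 2: count() is called once, before the loop.
function loop_cached($array) {
    $sum = 0;
    $i = 0;
    $count = count($array);
    while ($i < $count) {
        $sum += $array[$i];
        ++$i;
    }
    return $sum;
}

// Version 3: walk the array backward, testing only $i itself.
function loop_backward($array) {
    $sum = 0;
    $i = count($array);
    while ($i) {
        --$i;
        $sum += $array[$i];
    }
    return $sum;
}

foreach (array('loop_naive', 'loop_cached', 'loop_backward') as $fn) {
    $start = microtime(true);
    for ($run = 0; $run < 100; ++$run) {
        $sum = $fn($array);
    }
    // Print the elapsed time and the sum, which must match across versions.
    printf("%-14s %.6f s (sum %d)\n", $fn, microtime(true) - $start, $sum);
}
```

Each version adds up the elements so that the loop body does some real work and the three results can be checked against each other.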

imagedestroy VS unset

Boring days drive the sanest people to do the craziest jobs.
Glad I'm not one of them. Still, I benchmarked the imagedestroy() and unset() functions with a little script:

<?php
    // baseline: load one image and print the memory usage
    $cool = imagecreatefrompng('imagefile.png');
    echo memory_get_usage(),'<br />';

    // start time as "seconds.microseconds"
    $timeparts = explode(' ',microtime());
    $starttime = $timeparts[1].substr($timeparts[0],1);

    $cool = imagecreatefrompng('primespiral2000.png');
    // swap the comments on the next two lines to test imagedestroy() instead
    //imagedestroy($cool);
    unset($cool);

    $timeparts = explode(' ',microtime());
    $endtime = $timeparts[1].substr($timeparts[0],1);
    // bcsub() needs the bcmath extension
    echo bcsub($endtime,$starttime,6),'<br />';
    echo memory_get_usage(),'<br />';
?>

In this script, unset() ends up using less memory, so it's the better choice. The lower memory use is most likely because unset() actually deletes the variable, while imagedestroy() only cleans up the image structure inside GD.

$cool = imagecreatefrompng('imagefile.png');
imagedestroy($cool);
echo isset($cool); //1

$cool = imagecreatefrompng('imagefile.png');
unset($cool);
echo isset($cool); //echos nothing...

Sacrifice a slight amount of memory for the amazing speed

My last statement on array_shift, that there is no point in using array_shift, brought some controversy. Jeff Standen pointed out that a queue and a stack can be done with array_shift() and array_pop() at a reasonable speed.
After some testing, I agree that array_pop() is reasonable, and it's the best fit for implementing a stack because the popping action runs in O(1) time. I still have to disagree with using array_shift() for a queue, though.
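For the stack half of that, a minimal sketch with a native array (the values are made up for illustration): items are pushed with $stack[] and taken off the end with array_pop().

```php
<?php
$stack = array();

// push three items onto the end of the array
$stack[] = 'first';
$stack[] = 'second';
$stack[] = 'third';

// pop: removes and returns the last item
$top = array_pop($stack);

echo $top, "\n";          // third
echo count($stack), "\n"; // 2
```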

To support what I said before, I created a queue system that uses slightly more memory but is much faster than the array_shift() implementation.
Here is the queue class I put together in a few minutes:

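class queue{
	var $q = array();	//the stored items
	var $p = 0;		//index of the next item to dequeue
 
	function enqueue($stuff=null){
		$this->q[]=$stuff;
	}
	function dequeue(){
		//empty queue: return null and leave the pointer alone
		if(!array_key_exists($this->p,$this->q)){
			return null;
		}
		$a = $this->q[$this->p];
		unset($this->q[$this->p++]);
		return $a;
	}
}
$queue = new queue;
//add things to queue 
$queue->enqueue('stuff');
//take things out
$queue->dequeue();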

The usual native array queue:

$queue = array();
//add something in the queue
$queue[] = 'stuff';
//take something out
array_shift($queue);

The queue class I put together is slightly slower in the enqueue process, but each dequeue runs in O(1) time, while the native way runs in O(n) time.
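A rough way to compare the two, as a sketch only: the PointerQueue class is a hypothetical stand-in for the queue class above, and the size of 10000 items is an assumption. The timings depend on the PHP version and the machine, so none are shown here.

```php
<?php
// Hypothetical stand-in for the queue class above.
class PointerQueue {
    private $q = array();
    private $p = 0; // index of the next item to hand out

    public function enqueue($stuff = null) {
        $this->q[] = $stuff;
    }

    public function dequeue() {
        if (!array_key_exists($this->p, $this->q)) {
            return null; // queue is empty
        }
        $a = $this->q[$this->p];
        unset($this->q[$this->p]);
        ++$this->p;
        return $a;
    }
}

$n = 10000; // assumed queue size

// Native array: array_shift() re-indexes the remaining items every time.
$native = range(1, $n);
$start = microtime(true);
$nativeOut = array();
while ($native) {
    $nativeOut[] = array_shift($native);
}
$nativeTime = microtime(true) - $start;

// Pointer queue: each dequeue only touches one element.
$queue = new PointerQueue();
for ($i = 1; $i <= $n; ++$i) {
    $queue->enqueue($i);
}
$start = microtime(true);
$queueOut = array();
for ($i = 0; $i < $n; ++$i) {
    $queueOut[] = $queue->dequeue();
}
$queueTime = microtime(true) - $start;

// Both must hand the items out in the same first-in, first-out order.
var_dump($nativeOut === $queueOut); // bool(true)
printf("array_shift: %.6f s, pointer queue: %.6f s\n", $nativeTime, $queueTime);
```

Only the dequeue side is timed, since that is where the two approaches differ.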

Time to say my most famous line of the past month [1]:
I can't think of any time that you need to use array_shift().

[1] "I live in Long Island state" comes 2nd.