I'm trying to generate the sum of a collection of strings by accumulating each character's byte value into a sum:
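$sum = 0;
foreach( $array as $item ) {
    // unpack( 'C*', ... ) returns a 1-indexed array of unsigned byte values
    $bytes = unpack( 'C*', $item );
    for( $i = 1; $i <= count( $bytes ); $i++ ) {
        if( isset( $bytes[$i+1] ) ) {
            // add the difference between this byte and the next one
            $sum += $bytes[$i] - $bytes[$i+1];
        } else {
            // last byte of the string: subtract it from the sum
            $sum -= $bytes[$i];
        }
    }
}
return $sum;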
The goal here is to compare past sums with newly generated sums (that is to say, check if there has been a new addition to the collection, suppose Item4) and perform actions if yes.
As such, it's very important that:
- The algorithm can compute the sum irrespective of the order of the items.
- The algorithm doesn't get confused by a case where, let's say, Item3 now becomes temI3 (and therefore a plain sum of the byte values is still the same, even though it's clearly not the same Item).
The whole secondary loop is there to guard against exactly that: loop through each byte (character) of each Item$i and add to the final sum the difference between the current byte and the next one. If there isn't a next one and we are at the end of the string, simply subtract it from the whole sum.
As such, the following:
Input(s): ['Item1', 'Item2', 'Item3'] / ['Item1', 'Item2', 'Item3'], output: the same int. Whereas ['Item1', 'Item2', 'Ite3m'] outputs a different int.
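To make the expected behaviour concrete, here is a minimal sketch that wraps the loop in a function and compares the results. The function name collection_sum() is hypothetical, and the reordered array is an assumption added only to illustrate the order requirement.

```php
<?php
// Hypothetical wrapper around the loop from the question; the name is an assumption.
function collection_sum(array $array): int
{
    $sum = 0;
    foreach ($array as $item) {
        $bytes = unpack('C*', $item); // 1-indexed array of byte values
        for ($i = 1; $i <= count($bytes); $i++) {
            if (isset($bytes[$i + 1])) {
                $sum += $bytes[$i] - $bytes[$i + 1];
            } else {
                $sum -= $bytes[$i];
            }
        }
    }
    return $sum;
}

$a = collection_sum(['Item1', 'Item2', 'Item3']);
$b = collection_sum(['Item3', 'Item1', 'Item2']); // same items, different order (illustrative)
$c = collection_sum(['Item1', 'Item2', 'Ite3m']); // last item scrambled

var_dump($a === $b); // bool(true)
var_dump($a === $c); // bool(false)
```

Because every item's contribution is simply added to the total, the order of the items in the array does not affect the result.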
The performance as of now is as follows:
100000 items across 100 runs: 0.18s / 1000000 items across 100 runs: 2.05s
Although I understand that parsing and doing these calculations for a million items in 2s is rather fast for PHP, I still think that, given how little the code actually does, it's slow.
Any way to speed this up?
Why call count( $bytes ) over and over during each iteration of the nested loop? Try calling it once. (I guess this is a review and can be an answer, but it feels pretty meager.) Or how about ++$i instead of $i++? Can you decrement instead of increment to somehow avoid isset()?
=== to compare $bytes arrays?
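The first comment's suggestions could be sketched as follows. This is only one possible reading: the function name collection_sum_cached() is hypothetical, count() is called once per item, ++$i replaces $i++, and the last byte is handled after the loop so that isset() is not needed (a different approach from the decrementing loop the comment asks about). Whether any of this gives a measurable speed-up has not been benchmarked here.

```php
<?php
// Hypothetical variant of the question's loop; the name is an assumption.
function collection_sum_cached(array $array): int
{
    $sum = 0;
    foreach ($array as $item) {
        $bytes = unpack('C*', $item); // 1-indexed array of byte values
        $n = count($bytes);           // computed once per item, not per iteration
        if ($n === 0) {
            continue;                 // nothing to add for an empty item
        }
        // Every byte except the last has a successor, so no isset() check is needed.
        for ($i = 1; $i < $n; ++$i) {
            $sum += $bytes[$i] - $bytes[$i + 1];
        }
        $sum -= $bytes[$n];           // last byte: subtract it from the sum
    }
    return $sum;
}

var_dump(collection_sum_cached(['Item1', 'Item2', 'Item3'])); // int(-81)
```

The loop body performs the same additions and subtractions as the original, so the returned value is unchanged for non-empty strings.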