Code pondering – Number 1: Swapping two integers

Having done a fair few academic computer science courses and various coding jobs in a past-life I found that it was often thinking about the smallest code snippets carefully that you pick-up the raw building blocks for good, safe and efficient code. I haven’t written much code recently so have been brushing up my basics.

Compilers are now so good and computers so powerful that in practice you can write pretty terrible code and still make a perfectly performant and functional application.

One famous code problem is to swap two numbers without using an additional temporary variable. So you use only the memory already storing the numbers without requiring extra for the temporary variable. Consider….

 

Method 1: Using a temporary variable

int temp;

int a = 6;

int b = 5;

 

temp = a;

a = b;

b = temp;

 

Method 2: Avoiding a temporary variable

int a = 6;

int b = 5;

 

a = a + b;

b = a – b;

a = a – b;

 

OK now let’s think about this harder, which method is better?:

  • Method 2 seems clever and no unnecessary memory used

But:

  • It’s far less readable
  • If you really are just swapping two integers the extra memory for a single integer is negligible
  • the sum (a+b) can larger than the addressable space of an integer so overflow could occur, i.e. (a+b) could wrap around and the algorithm break!
  • If using real or another data type you could pick up machine level errors
  • I suspect many modern compilers would take care of this for you
  • With method 1; a is safely stored in temp should something go wrong
  • Both methods use the same number of copies but the amount of checking required to ensure no overflow occurs on (a+b) and similar really is more trouble than it’s worth and would make the code look messy

So when would you avoid a temporary variable:

  • When you are certain your data is going to be well below half the capacity of the data type (e.g. int) – more relevant if looking at facetted data in real format; i.e. scenarios where you can safely avoid extra checks and code
  • When you don’t care about machine level noise
  • When it isn’t just two pieces of data but a vast array of data perhaps representing a frame buffer or screen

Of course, we can now go on to ponder how this code could be made fail safe…. But it’s amazing how three lines of code make you think a bit harder…. And another discussion is using pointers …

I’d be interested to hear whether people actually use Method 2 (other than those using EDSAC2) and how much can be relied on to be taken away by compilers… and whether people still think hard about such things…