Vegastrike 0.5.1 rc1  1.0
Original sources for Vegastrike Evolved
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
RadixSort Class Reference

#include <IceRevisitedRadix.h>

Public Member Functions

 RadixSort ()
 
 ~RadixSort ()
 
RadixSortSort (const udword *input, udword nb, RadixHint hint=RADIX_SIGNED)
 
RadixSortSort (const float *input, udword nb)
 
inline_ const udwordGetRanks () const
 Access to results. mRanks is a list of indices in sorted order, i.e. in the order you may further process your data. More...
 
inline_ udwordGetRecyclable () const
 mIndices2 gets trashed on calling the sort routine, but otherwise you can recycle it the way you want. More...
 
udword GetUsedRam () const
 
inline_ udword GetNbTotalCalls () const
 Returns the total number of calls to the radix sorter. More...
 
inline_ udword GetNbHits () const
 Returns the number of eraly exits due to temporal coherence. More...
 

Detailed Description

Revisited Radix Sort. This is my new radix routine:

  • it uses indices and doesn't recopy the values anymore, hence wasting less ram
  • it creates all the histograms in one run instead of four
  • it sorts words faster than dwords and bytes faster than words
  • it correctly sorts negative floating-point values by patching the offsets
  • it automatically takes advantage of temporal coherence
  • multiple keys support is a side effect of temporal coherence
  • it may be worth recoding in asm... (mainly to use FCOMI, FCMOV, etc) [it's probably memory-bound anyway]

History:

  • 08.15.98: very first version
  • 04.04.00: recoded for the radix article
  • 12.xx.00: code lifting
  • 09.18.01: faster CHECK_PASS_VALIDITY thanks to Mark D. Shattuck (who provided other tips, not included here)
  • 10.11.01: added local ram support
  • 01.20.02: bugfix! In very particular cases the last pass was skipped in the float code-path, leading to incorrect sorting......
  • 01.02.02: - "mIndices" renamed => "mRanks". That's a rank sorter after all.
    • ranks are not "reset" anymore, but implicit on first calls
  • 07.05.02: - offsets rewritten with one less indirection.
  • 11.03.02: - "bool" replaced with RadixHint enum
Author
Pierre Terdiman
Version
1.4
Date
August, 15, 1998

Definition at line 26 of file IceRevisitedRadix.h.

Constructor & Destructor Documentation

RadixSort::RadixSort ( )

Constructor.

Definition at line 171 of file IceRevisitedRadix.cpp.

171  : mCurrentSize(0), mRanks(null), mRanks2(null), mTotalCalls(0), mNbHits(0)
172 {
173 #ifndef RADIX_LOCAL_RAM
174  // Allocate input-independent ram
175  mHistogram = new udword[256*4];
176  mOffset = new udword[256];
177 #endif
178  // Initialize indices
180 }
RadixSort::~RadixSort ( )

Destructor.

Definition at line 187 of file IceRevisitedRadix.cpp.

188 {
189  // Release everything
190 #ifndef RADIX_LOCAL_RAM
191  DELETEARRAY(mOffset);
192  DELETEARRAY(mHistogram);
193 #endif
194  DELETEARRAY(mRanks2);
195  DELETEARRAY(mRanks);
196 }

Member Function Documentation

inline_ udword RadixSort::GetNbHits ( ) const
inline

Returns the number of eraly exits due to temporal coherence.

Definition at line 47 of file IceRevisitedRadix.h.

47 { return mNbHits; }
inline_ udword RadixSort::GetNbTotalCalls ( ) const
inline

Returns the total number of calls to the radix sorter.

Definition at line 45 of file IceRevisitedRadix.h.

45 { return mTotalCalls; }
inline_ const udword* RadixSort::GetRanks ( ) const
inline

Access to results. mRanks is a list of indices in sorted order, i.e. in the order you may further process your data.

Definition at line 37 of file IceRevisitedRadix.h.

Referenced by Opcode::SweepAndPrune::Init().

37 { return mRanks; }
inline_ udword* RadixSort::GetRecyclable ( ) const
inline

mIndices2 gets trashed on calling the sort routine, but otherwise you can recycle it the way you want.

Definition at line 40 of file IceRevisitedRadix.h.

40 { return mRanks2; }
udword RadixSort::GetUsedRam ( ) const

Gets the ram used.

Returns
memory used in bytes

Definition at line 514 of file IceRevisitedRadix.cpp.

515 {
516  udword UsedRam = sizeof(RadixSort);
517 #ifndef RADIX_LOCAL_RAM
518  UsedRam += 256*4*sizeof(udword); // Histograms
519  UsedRam += 256*sizeof(udword); // Offsets
520 #endif
521  UsedRam += 2*CURRENT_SIZE*sizeof(udword); // 2 lists of indices
522  return UsedRam;
523 }
RadixSort & RadixSort::Sort ( const udword input,
udword  nb,
RadixHint  hint = RADIX_SIGNED 
)

Main sort routine. This one is for integer values. After the call, mRanks contains a list of indices in sorted order, i.e. in the order you may process your data.

Parameters
input[in] a list of integer values to sort
nb[in] number of values to sort, must be < 2^31
hint[in] RADIX_SIGNED to handle negative values, RADIX_UNSIGNED if you know your input buffer only contains positive values
Returns
Self-Reference

Definition at line 239 of file IceRevisitedRadix.cpp.

Referenced by Opcode::SweepAndPrune::Init().

240 {
241  // Checkings
242  if(!input || !nb || nb&0x80000000) return *this;
243 
244  // Stats
245  mTotalCalls++;
246 
247  // Resize lists if needed
248  CheckResize(nb);
249 
250 #ifdef RADIX_LOCAL_RAM
251  // Allocate histograms & offsets on the stack
252  udword mHistogram[256*4];
253 // udword mOffset[256];
254  udword* mLink[256];
255 #endif
256 
257  // Create histograms (counters). Counters for all passes are created in one run.
258  // Pros: read input buffer once instead of four times
259  // Cons: mHistogram is 4Kb instead of 1Kb
260  // We must take care of signed/unsigned values for temporal coherence.... I just
261  // have 2 code paths even if just a single opcode changes. Self-modifying code, someone?
262  if(hint==RADIX_UNSIGNED) { CREATE_HISTOGRAMS(udword, input); }
263  else { CREATE_HISTOGRAMS(sdword, input); }
264 
265  // Compute #negative values involved if needed
266  udword NbNegativeValues = 0;
267  if(hint==RADIX_SIGNED)
268  {
269  // An efficient way to compute the number of negatives values we'll have to deal with is simply to sum the 128
270  // last values of the last histogram. Last histogram because that's the one for the Most Significant Byte,
271  // responsible for the sign. 128 last values because the 128 first ones are related to positive numbers.
272  udword* h3= &mHistogram[768];
273  for(udword i=128;i<256;i++) NbNegativeValues += h3[i]; // 768 for last histogram, 128 for negative part
274  }
275 
276  // Radix sort, j is the pass number (0=LSB, 3=MSB)
277  for(udword j=0;j<4;j++)
278  {
280 
281  // Sometimes the fourth (negative) pass is skipped because all numbers are negative and the MSB is 0xFF (for example). This is
282  // not a problem, numbers are correctly sorted anyway.
283  if(PerformPass)
284  {
285  // Should we care about negative values?
286  if(j!=3 || hint==RADIX_UNSIGNED)
287  {
288  // Here we deal with positive values only
289 
290  // Create offsets
291 // mOffset[0] = 0;
292 // for(udword i=1;i<256;i++) mOffset[i] = mOffset[i-1] + CurCount[i-1];
293  mLink[0] = mRanks2;
294  for(udword i=1;i<256;i++) mLink[i] = mLink[i-1] + CurCount[i-1];
295  }
296  else
297  {
298  // This is a special case to correctly handle negative integers. They're sorted in the right order but at the wrong place.
299 
300  udword i;
301  // Create biased offsets, in order for negative numbers to be sorted as well
302 // mOffset[0] = NbNegativeValues; // First positive number takes place after the negative ones
303  mLink[0] = &mRanks2[NbNegativeValues]; // First positive number takes place after the negative ones
304 // for(udword i=1;i<128;i++) mOffset[i] = mOffset[i-1] + CurCount[i-1]; // 1 to 128 for positive numbers
305  for(i=1;i<128;i++) mLink[i] = mLink[i-1] + CurCount[i-1]; // 1 to 128 for positive numbers
306 
307  // Fixing the wrong place for negative values
308 // mOffset[128] = 0;
309  mLink[128] = mRanks2;
310 // for(i=129;i<256;i++) mOffset[i] = mOffset[i-1] + CurCount[i-1];
311  for(i=129;i<256;i++) mLink[i] = mLink[i-1] + CurCount[i-1];
312  }
313 
314  // Perform Radix Sort
315  ubyte* InputBytes = (ubyte*)input;
316  InputBytes += j;
317  if(INVALID_RANKS)
318  {
319 // for(udword i=0;i<nb;i++) mRanks2[mOffset[InputBytes[i<<2]]++] = i;
320  for(udword i=0;i<nb;i++) *mLink[InputBytes[i<<2]]++ = i;
322  }
323  else
324  {
325  udword* Indices = mRanks;
326  udword* IndicesEnd = &mRanks[nb];
327  while(Indices!=IndicesEnd)
328  {
329  udword id = *Indices++;
330 // mRanks2[mOffset[InputBytes[id<<2]]++] = id;
331  *mLink[InputBytes[id<<2]]++ = id;
332  }
333  }
334 
335  // Swap pointers for next pass. Valid indices - the most recent ones - are in mRanks after the swap.
336  udword* Tmp = mRanks; mRanks = mRanks2; mRanks2 = Tmp;
337  }
338  }
339  return *this;
340 }
RadixSort & RadixSort::Sort ( const float input2,
udword  nb 
)

Main sort routine. This one is for floating-point values. After the call, mRanks contains a list of indices in sorted order, i.e. in the order you may process your data.

Parameters
input[in] a list of floating-point values to sort
nb[in] number of values to sort, must be < 2^31
Returns
Self-Reference
Warning
only sorts IEEE floating-point values

Definition at line 352 of file IceRevisitedRadix.cpp.

353 {
354  // Checkings
355  if(!input2 || !nb || nb&0x80000000) return *this;
356 
357  // Stats
358  mTotalCalls++;
359 
360  udword* input = (udword*)input2;
361 
362  // Resize lists if needed
363  CheckResize(nb);
364 
365 #ifdef RADIX_LOCAL_RAM
366  // Allocate histograms & offsets on the stack
367  udword mHistogram[256*4];
368 // udword mOffset[256];
369  udword* mLink[256];
370 #endif
371 
372  // Create histograms (counters). Counters for all passes are created in one run.
373  // Pros: read input buffer once instead of four times
374  // Cons: mHistogram is 4Kb instead of 1Kb
375  // Floating-point values are always supposed to be signed values, so there's only one code path there.
376  // Please note the floating point comparison needed for temporal coherence! Although the resulting asm code
377  // is dreadful, this is surprisingly not such a performance hit - well, I suppose that's a big one on first
378  // generation Pentiums....We can't make comparison on integer representations because, as Chris said, it just
379  // wouldn't work with mixed positive/negative values....
380  { CREATE_HISTOGRAMS(float, input2); }
381 
382  // Compute #negative values involved if needed
383  udword NbNegativeValues = 0;
384  // An efficient way to compute the number of negatives values we'll have to deal with is simply to sum the 128
385  // last values of the last histogram. Last histogram because that's the one for the Most Significant Byte,
386  // responsible for the sign. 128 last values because the 128 first ones are related to positive numbers.
387  udword* h3= &mHistogram[768];
388  udword i;
389  for(i=128;i<256;i++) NbNegativeValues += h3[i]; // 768 for last histogram, 128 for negative part
390 
391  // Radix sort, j is the pass number (0=LSB, 3=MSB)
392  for(udword j=0;j<4;j++)
393  {
394  // Should we care about negative values?
395  if(j!=3)
396  {
397  // Here we deal with positive values only
399 
400  if(PerformPass)
401  {
402  // Create offsets
403 // mOffset[0] = 0;
404  mLink[0] = mRanks2;
405 // for(udword i=1;i<256;i++) mOffset[i] = mOffset[i-1] + CurCount[i-1];
406  for(udword i=1;i<256;i++) mLink[i] = mLink[i-1] + CurCount[i-1];
407 
408  // Perform Radix Sort
409  ubyte* InputBytes = (ubyte*)input;
410  InputBytes += j;
411  if(INVALID_RANKS)
412  {
413 // for(i=0;i<nb;i++) mRanks2[mOffset[InputBytes[i<<2]]++] = i;
414  for(udword i=0;i<nb;i++) *mLink[InputBytes[i<<2]]++ = i;
416  }
417  else
418  {
419  udword* Indices = mRanks;
420  udword* IndicesEnd = &mRanks[nb];
421  while(Indices!=IndicesEnd)
422  {
423  udword id = *Indices++;
424 // mRanks2[mOffset[InputBytes[id<<2]]++] = id;
425  *mLink[InputBytes[id<<2]]++ = id;
426  }
427  }
428 
429  // Swap pointers for next pass. Valid indices - the most recent ones - are in mRanks after the swap.
430  udword* Tmp = mRanks; mRanks = mRanks2; mRanks2 = Tmp;
431  }
432  }
433  else
434  {
435  // This is a special case to correctly handle negative values
437 
438  if(PerformPass)
439  {
440  // Create biased offsets, in order for negative numbers to be sorted as well
441 // mOffset[0] = NbNegativeValues; // First positive number takes place after the negative ones
442  mLink[0] = &mRanks2[NbNegativeValues]; // First positive number takes place after the negative ones
443 // for(udword i=1;i<128;i++) mOffset[i] = mOffset[i-1] + CurCount[i-1]; // 1 to 128 for positive numbers
444  for(i=1;i<128;i++) mLink[i] = mLink[i-1] + CurCount[i-1]; // 1 to 128 for positive numbers
445 
446  // We must reverse the sorting order for negative numbers!
447 // mOffset[255] = 0;
448  mLink[255] = mRanks2;
449 // for(i=0;i<127;i++) mOffset[254-i] = mOffset[255-i] + CurCount[255-i]; // Fixing the wrong order for negative values
450  for(i=0;i<127;i++) mLink[254-i] = mLink[255-i] + CurCount[255-i]; // Fixing the wrong order for negative values
451 // for(i=128;i<256;i++) mOffset[i] += CurCount[i]; // Fixing the wrong place for negative values
452  for(i=128;i<256;i++) mLink[i] += CurCount[i]; // Fixing the wrong place for negative values
453 
454  // Perform Radix Sort
455  if(INVALID_RANKS)
456  {
457  for(udword i=0;i<nb;i++)
458  {
459  udword Radix = input[i]>>24; // Radix byte, same as above. AND is useless here (udword).
460  // ### cmp to be killed. Not good. Later.
461 // if(Radix<128) mRanks2[mOffset[Radix]++] = i; // Number is positive, same as above
462 // else mRanks2[--mOffset[Radix]] = i; // Number is negative, flip the sorting order
463  if(Radix<128) *mLink[Radix]++ = i; // Number is positive, same as above
464  else *(--mLink[Radix]) = i; // Number is negative, flip the sorting order
465  }
467  }
468  else
469  {
470  for(udword i=0;i<nb;i++)
471  {
472  udword Radix = input[mRanks[i]]>>24; // Radix byte, same as above. AND is useless here (udword).
473  // ### cmp to be killed. Not good. Later.
474 // if(Radix<128) mRanks2[mOffset[Radix]++] = mRanks[i]; // Number is positive, same as above
475 // else mRanks2[--mOffset[Radix]] = mRanks[i]; // Number is negative, flip the sorting order
476  if(Radix<128) *mLink[Radix]++ = mRanks[i]; // Number is positive, same as above
477  else *(--mLink[Radix]) = mRanks[i]; // Number is negative, flip the sorting order
478  }
479  }
480  // Swap pointers for next pass. Valid indices - the most recent ones - are in mRanks after the swap.
481  udword* Tmp = mRanks; mRanks = mRanks2; mRanks2 = Tmp;
482  }
483  else
484  {
485  // The pass is useless, yet we still have to reverse the order of current list if all values are negative.
486  if(UniqueVal>=128)
487  {
488  if(INVALID_RANKS)
489  {
490  // ###Possible?
491  for(udword i=0;i<nb;i++) mRanks2[i] = nb-i-1;
493  }
494  else
495  {
496  for(udword i=0;i<nb;i++) mRanks2[i] = mRanks[nb-i-1];
497  }
498 
499  // Swap pointers for next pass. Valid indices - the most recent ones - are in mRanks after the swap.
500  udword* Tmp = mRanks; mRanks = mRanks2; mRanks2 = Tmp;
501  }
502  }
503  }
504  }
505  return *this;
506 }

The documentation for this class was generated from the following files: