Distributed Memory Radix Sort

Distributed Memory Radix Sort

Project reference: 1915

A Radix Sort is an array sorting algorithm that can run in O(N) time, compared with typical sorting algorithms like QuickSort which run in O(Nlog(N)) time. Typical sorting algorithms sort elements using pairwise comparisons to determine ordering, meaning they can be easily adapted to any weakly ordered data types. The Radix Sort, however, uses the bitwise representation of the data and some bookkeeping to determine ordering, limiting its scope. In the case of numerical simulation, it is typically a combination of integer and floating point data being sorted, meaning a bitwise representation is accessible. Another potential issue is the stack size used. However, with a sufficiently clever implementation, it may be possible to avoid this issue.

This project aims to implement a Radix Sort in MPI for an array distributed across a number of processes. Initial implementations will look at sorting fixed-width integers, and then moving on to arbitrary bit-wise representable data. Following on, supporting various array distribution and chunking approaches can be investigated. Upon achieving sufficient usefulness,  functionality, and performance, the project can be wrapped up into a library (finding immediate use in the ExSeisDat project).

Project Mentor:  Paddy Ó Conbhuí

Project Co-mentor: /

Site Co-ordinator: Simon Wong

Participant: Jordy Innocentius Ajanohoun

Learning Outcomes:
Solving problems using parallel and recursive decomposition in MPI. Writing clean and reusable C for use in other HPC projects.

Student Prerequisites (compulsory):
Intermediate C, Basic familiarity with MPI.

Student Prerequisites (desirable):

  • Familiarity with recursive algorithms.
  • Familiarity with sorting algorithms.
  • Familiarity with Scan, Reduction, and AllToAll algorithms in MPI.
  • Familiarity with creating sub-communicators in MPI.
  • Experience with Linux.

Training Materials:
C++Now 2017: M. Skarupke “Sorting in less than O(n log n): Generalizing and optimizing radix sort” https://www.youtube.com/watch?v=zqs87a_7zxw


  • Week 1: Training
  • Week 2: Learn about Radix Sort, serial implementations, distributed memory approaches. Set up an initial project, attempt initial implementation.
  • Week 3: Write Plan
  • Week 4: Implement Radix Sort for int8, int16, int32, and int64
  • Week 5: Implement Radix Sort for float, double, and user-defined functions returning a bitwise representation of their data.
  • Week 6: Profile + Optimize implementations.
  • Week 7: Write Report
  • Week 8: Write Report

Final Product Description:
A reusable library with a clean interface that uses Radix Sort to sort a distributed array using MPI.

Adapting the Project: Increasing the Difficulty:
The student could attempt to incorporate it into the ExSeisDat project, or go quite deep into the optimization of the sorting and communications.

Adapting the Project: Decreasing the Difficulty:
The student could stop at implementing int8, or int16, or anywhere along that path, and move on to profiling.


  • A desk. There’s one available next to Dr. Ó Conbhuí.
  • A computer. The student should bring their own.
  • Access to a distributed memory machine for profiling the performance of the sorting routine. (Can access to / time on Kay be arranged?)

Irish Centre for High-End Computing

Tagged with: , ,

Leave a Reply

Your email address will not be published. Required fields are marked *


This site uses Akismet to reduce spam. Learn how your comment data is processed.