I am using the histogram kernel because it is simple and clearly demonstrates several important concepts in parallel programming: thread spawning, critical sections, atomic operations, barriers, false sharing, and thread joining. Here is our problem statement:
Problem: Count the number of times each ASCII character occurs on a page of text.
Input: ASCII text stored as an array of characters.
Output: A histogram with 128 buckets, one for each ASCII character, where each entry stores the number of occurrences of the corresponding ASCII character on the page.
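Before parallelizing anything, it helps to pin the problem statement down as a serial baseline. The following is a minimal sketch in C, not a definitive implementation: the names `histogram_serial` and `NUM_BUCKETS` and the `unsigned long` counter type are my own choices for illustration, and skipping bytes outside the 0 to 127 range is an assumption, since the problem statement promises ASCII input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_BUCKETS 128 /* one bucket per 7-bit ASCII code (hypothetical name) */

/* Serial reference: count how often each ASCII character occurs in
 * text[0..len). The caller supplies the output array; it is zeroed first. */
void histogram_serial(const char *text, size_t len,
                      unsigned long hist[NUM_BUCKETS])
{
    assert(text != NULL || len == 0);
    assert(hist != NULL);

    memset(hist, 0, NUM_BUCKETS * sizeof hist[0]);

    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned so bytes >= 128 never become negative indices. */
        unsigned char c = (unsigned char)text[i];
        if (c < NUM_BUCKETS) { /* assumption: silently skip non-ASCII bytes */
            hist[c]++;
        }
    }
}
```

Calling `histogram_serial(page, strlen(page), hist)` on a page fills `hist` so that, for example, `hist['a']` holds the number of times `a` occurs. A single thread owns every bucket here, so no synchronization is needed; any parallel version should produce exactly the same counts.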