Advanced Sorting
Due: July 7, 2019 @ 11:59PM
Grab the handout tarfile from here: http://cs.millersville.edu/~wkillian/2019/summer/files/csci362/lab7b-advanced-sorts.tar
This lab is split into two parts. The first part (worth 50 points) is implementing various sorting algorithms
h_sort
shell_sort
quick_sort
merge_sort
heap_sort
radix_sort
counting_sort
The second part is writing a "driver" application which will run the various sorts, time them, and report the total number of comparisons, swaps, and/or moves (if relevant).
namespace SortUtils
{
struct Statistics {
size_t compares, swaps, moves;
...
};
// ----------------------------------------------------------------------
// [5] h-sort on a range
// - implement what was discussed in class
// - Update "stat" -- number of compares and swaps should increase.
// - You will need to split the compound logical test into two parts
//
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
h_sort (Iter begin, Iter end, Statistics& stat, size_t h = 1);
// [5] -- Shell sort!
// - Use the Knuth sequence (as discussed in class)
// - ultimately call h_sort a bunch
//
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
shell_sort (Iter begin, Iter end, Statistics& stat);
// ----------------------------------------------------------------------
// extend your implementation of median3 from the last lab to count the
// number of comparisons and swaps.
//template<typename Iter>
Iter
median3 (Iter begin, Iter end, Statistics& stat);
// extend your implementation of partition from the last lab to count the
// number of comparisons and swaps.
//
template<typename Iter, typename Value>
std::pair<Iter, Iter>
partition (Iter first, Iter last, Value const& pivot, Statistics& stat)
// [5] -- recursive implementation of quicksort.
// - Special case to add:
// If distance(begin, end) <= 10, just call insertion sort (h_sort = 1)
//
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
quick_sort (Iter begin, Iter end, Statistics& stat);
// ----------------------------------------------------------------------
// extend your implementation of merge from the last lab to count the
// number of comparisons and MOVES
//
// NOTE: whenever you have an assignment *out = *iter, you will want to say:
// *out = std::move(*iter);
//
template<typename Iter, typename OIter,
Requires (concepts::RandomAccessIterator, Iter),
Requires (concepts::RandomAccessIterator, OIter)>
OIter
merge (Iter first1, Iter last1, Iter first2, Iter last2, OIter out, Statistics& stat);
// [5] -- recursive implementation of mergesort.
//
// NOTE: after the call to merge, remember to MOVE the range [out, out + N) back to begin
// see: std::move (begin, end, dest) -- similar to std::copy()
// Be sure to add "N" to stat's count of moves
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
merge_sort (Iter begin, Iter end, Iter out, Statistics& stat);
// ----------------------------------------------------------------------
// [5] -- heap sort
// use std::make_heap and std::pop_heap to implement heap_sort
//
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
heap_sort (Iter begin, Iter end);
// ----------------------------------------------------------------------
// [10] LSD (least significant digit) radix sort.
//
// - Use a std::array of 10 std::vector's to perform the sort (one for
// each decimal digit).
//
// - BONUS: make it work for any base by including the following line
// near the top [+2pts]:
// static constexpr size_t BASE = 10 /* but could be changed*/;
//
// Note: you should only reference BASE and not "10". e.g. I should
// be able to change the 10 to 2, 16, or any other number and it
// should still properly sort.
//
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
radix_sort (Iter begin, Iter end, size_t d);
// ----------------------------------------------------------------------
// [10] Counting sort!
//
// - Do a SINGLE pass through to find the minimum and maximum values
// - figure out the size of the vector (hint: [min, max] contains how many?
// - allocate a vector of the appropriate size of type "int"
// - iterate through each value and increment the appropriate index (subtracting min)
// - compute the partial sum of the counts to determine which index to
// place values
// - reverse iterate through the range [end - 1, begin] and:
// 1. determine the index of the value (hint: same as first loop)
// 2. get the value stored in the counts array at the index -- this is where
// you need to copy the value
// 3. decrement the counts array element
// 4. assign the dereferenced iterator to "output"'s index
// - copy all elements from output back to [begin, end)
// - ONLY 2*(N-1) total comparisons of the values should be performed
//
template<typename Iter, Requires (concepts::RandomAccessIterator, Iter)>
void
counting_sort (Iter begin, Iter end);
} // end namespace SortUtils
N
denoting the size of the vector
to populate/generateseed
denoting the seed to use for random number generationmax
denoting the maximum value for random numbers. We will generate numbers in the range [0, max)
d
denoting the number of digits to use for radix sortWe will create random numbers by combining two pieces found within the C++ Standard Library
std::mt19937_t
(Mersenne Twister engine). The fancy number (19937) denotes the period of the random numbers generated $2^19937 - 1$std::uniform_int_distribution
std::mt19937_t gen (seed); // constructs a generator to use
std::uniform_int_distribution<int> dist(1, 6); // fair dice roll
int roll = dist(gen); // generates a single number [1, 6]
HINT: I recommend creating a function called "populateVector" that takes in the size and seed and returns a vector.
I've included a Timer
class for you to use. It's API is pretty simple:
class Timer {
public:
Timer(); // default construct -- also "starts" the timer
void start(); // starts the timer
void stop(); // stops the timer
double elapsedMs(); // gets the elapsed total time in milliseconds
};
For any sorting algorithm you wish to time, you'll need to do the following:
Statistics
instance by calling .reset()
Timer
Timer
I recommend creating a scoped block:
// running std::sort
{
std::vector<int> copy (original);
Statistics stat;
Timer t;
std::sort (copy.begin(), copy.end());
t.stop();
// print code goes here
}
// repeat above block for other algorithms
Algorithm | Time | Comparisons | Swaps | Moves |
---|---|---|---|---|
std::sort |
Yes | |||
shell_sort |
Yes | Yes | Yes | |
quick_sort |
Yes | Yes | Yes | |
merge_sort |
Yes | Yes | Yes | |
heap_sort |
Yes | |||
radix_sort |
Yes | |||
counting_sort |
Yes |
Note: whenever there is something in {}
, I want you to print the value
N ===> [read from user here]
seed ===> [read from user here]
max ===> [read from user here]
d ===> [read from user here]
Sorting {N} elements generated with a seed of {seed} in the range [0, {max})
std::sort
Time (ms): xxxx.xx
shell_sort
Time (ms): xxxx.xx
Comps: xxxx
Swaps: xxxx
quick_sort
Time (ms): xxxx.xx
Comps: xxxx
Swaps: xxxx
merge_sort
Time (ms): xxxx.xx
Comps: xxxx
Moves: xxxx
heap_sort
Time (ms): xxxx.xx
radix_sort (d={d})
Time (ms): xxxx.xx
counting_sort
Time (ms): xxxx.xx
With the driver written, I want you to run your driver with a few different settings:
N = 10'000'000 seed = 161 max = 10'000 d = 5
N = 10'000'000 seed = 162 max = 100'000 d = 6
N = 10'000'000 seed = 362 max = 1'000'000 d = 7
N = 10'000'000 seed = 330 max = 10'000'000 d = 8
N = 20'000'000 seed = 340 max = 20'000'000 d = 8
N = 40'000'000 seed = 370 max = 40'000'000 d = 8
N = 80'000'000 seed = 380 max = 80'000'000 d = 8
Discuss your results. Specifically talk about how each algorithm performs with regard to (1) size and (2) max. Mention swaps/moves and comparisons if relevant. Include copy-pasted output in your discussion. Include your discussion in WRITEUP.txt
Invoking make
should create the autograder.
There can be a lot of output shown from the autograder. Some of it could be overwhelming at times. I recommend adding a couple of arguments to the autograder to help ease the pain.
./autograder -a
Will run the autograder and stop as soon as the first FAILURE is found
./autograder -t
Will list all of the tags for each of the tests. If you want to run a specific subset of tags you can do so by stating the following:
./autograder [access],[iterator]
Will run all tests that have the access and iterator tags
./autograder --failed
Will list all failed tests at the end of the autograder output
./autograder --passed
Will list all passed tests at the end of the autograder output
Use good programming style:
Since this consists of multiple files, a tarball will be required for submission.
invoking make handin
should create handin.tar
Please submit handin.tar
to autolab
This lab is graded out of 100 points.
[45pts] Unit tests
[35pts] Driver program -- this must have clean and clear formatting. Use good style!
[20pts] Writeup
NOTE: if your program does not compile/run, the highest score you will earn will be a 20/100