Here are three C++ programs to find the median of a set of numbers. The first is short and simple. The last is fiddly and fast.
Simple but not linear (n*log n time):
Sort using STL sort (presumably quicksort), then find the middle value:
#include
#include
int main() {
int A[7] = {23, 1, 33, -20, 6, 6, 9};
std::sort(A,A+7);
std::cout << "The medium is: " << A[3] << std::endl;
}
Linear time (But too much memory used in recursion):
The running time of the following algorithm is linear rather than logarithmic. The reasoning behind this is that each recursive call takes linear time (on average), and each recursive call reduces the size of the problem by a constant factor:
#include
#include
using namespace std;
// Find Kth element using recursion
int getKElem(vector A,int K) {
int a = *A.begin(); // Pick randomly a number a from A = {a1, ..., an}.
// gather all numbers smaller, equal and bigger than a.
vector S, E, B;
for(vector::const_iterator ci = A.begin(); ci!=A.end(); ci++) {
if (*ci < a) S.push_back(*ci);
else if (*ci > a) B.push_back(*ci);
else E.push_back(*ci);
}
if (S.size() >= K) return getKElem(S,K);
else if (S.size()+E.size() >= K) return a;
else return getKElem(B,K-S.size()-E.size());
}
int main() {
int myints[7] = {23, 1, 33, -20, 6, 6, 9};
vector A (myints, myints + sizeof(myints) / sizeof(int) );
cout << "Element " << A.size()/2 << " is: " << getKElem(A,A.size()/2) << endl;
}
Linear time with decent memory footprint:
Here is the algorithm with the recursion unraveled:
#include
#include
using namespace std;
// Find Kth element without recusion
int findKMedian(vector A,int K) {
int l,m;
l=0;
m=A.size()-1;
while (l A (myints, myints + sizeof(myints) / sizeof(int) );
cout << "Element " << A.size()/2 << " is: " << findKMedian(A,A.size()/2) << endl;
}
These algorithms do not run in linear time.
First of all,
int a = *A.begin();
— is a bad choice: for a sorted array this algorithm will find median in quadratic(!) time.
int a = A[random()%A.size()];
— a bit better, but worst case is still quadratic, though expected time is linear.
See wikipedia for better (and more complex) choices.
Or you can just use the BFPRT algorithm. guaranteed linear time, yet still elegant.
What joe said: https://en.wikipedia.org/wiki/Median_of_medians