{"id":205,"date":"2007-08-12T22:00:37","date_gmt":"2007-08-12T21:00:37","guid":{"rendered":"https:\/\/www.tomfotherby.com\/blog\/index.php\/2007\/08\/finding-the-median-in-linear-time\/"},"modified":"2013-03-05T16:33:25","modified_gmt":"2013-03-05T16:33:25","slug":"finding-the-median-in-linear-time","status":"publish","type":"post","link":"https:\/\/www.tomfotherby.com\/blog\/index.php\/2007\/08\/finding-the-median-in-linear-time\/","title":{"rendered":"Finding the Median in Linear time"},"content":{"rendered":"<p>\n<img data-recalc-dims=\"1\" decoding=\"async\" alt=\"some eggs\" src=\"https:\/\/i0.wp.com\/www.tomfotherby.com\/Images\/median.jpeg?w=620&#038;ssl=1\" class=\"alignleft\" \/><br \/>\nHere are three C++ programs to find the median of a set of numbers. The first is short and simple. The last is fiddly and fast.<\/p>\n<h4>Simple but not linear (n*log n time):<\/h4>\n<p>Sort using STL sort (presumably quicksort), then find the middle value:<\/p>\n<pre><code>\r\n#include <iostream>\r\n#include <algorithm>\r\n\r\nint main() {\r\n  int A[7] = {23, 1, 33, -20, 6, 6, 9};\r\n  std::sort(A,A+7);\r\n  std::cout << \"The medium is: \" << A[3] << std::endl;\r\n}\r\n<\/code><\/pre>\n<h4>Linear time (But too much memory used in recursion):<\/h4>\n<p>The running time of the following algorithm is linear rather than logarithmic. The reasoning behind this is that each recursive call takes linear time (on average), and each recursive call reduces the size of the problem by a constant factor:<\/p>\n<pre><code>\r\n#include <iostream>\r\n#include <vector>\r\nusing namespace std;\r\n\r\n\/\/ Find Kth element using recursion\r\nint getKElem(vector<int> A,int K) {\r\n  int a = *A.begin();  \/\/ Pick randomly a number a from A = {a1, ..., an}.\r\n\r\n  \/\/ gather all numbers smaller, equal and bigger than a.\r\n  vector<int> S, E, B;\r\n  for(vector<int>::const_iterator ci = A.begin(); ci!=A.end(); ci++) {\r\n    if (*ci < a)      S.push_back(*ci);\r\n    else if (*ci > a) B.push_back(*ci);\r\n    else              E.push_back(*ci);\r\n  }\r\n\r\n  if (S.size() >= K) return getKElem(S,K);\r\n  else if (S.size()+E.size() >= K) return a;\r\n  else return getKElem(B,K-S.size()-E.size());\r\n}\r\n\r\nint main() {\r\n  int myints[7] = {23, 1, 33, -20, 6, 6, 9};\r\n  vector<int> A (myints, myints + sizeof(myints) \/ sizeof(int) );\r\n\r\n  cout << \"Element \" << A.size()\/2 << \" is: \" << getKElem(A,A.size()\/2) << endl;\r\n}\r\n<\/code><\/pre>\n<h4>Linear time with decent memory footprint:<\/h4>\n<p>Here is the algorithm with the recursion unraveled:<\/p>\n<pre><code>\r\n#include <iostream>\r\n#include <vector>\r\nusing namespace std;\r\n\r\n\/\/ Find Kth element without recusion\r\nint findKMedian(vector<int> A,int K) {\r\n  int l,m;\r\n  l=0;\r\n  m=A.size()-1;\r\n  while (l<m) {\r\n    int x=A[K];\r\n    int i=l;\r\n    int j=m;\r\n    do {\r\n      while (A[i]<x) i++;\r\n      while (x<A[j]) j--;\r\n      if (i<=j) {\r\n        int t = A[i]; A[i]= A[j]; A[j] = t; \/\/ Swap A[i] &#038; A[j]\r\n        i++; j--;\r\n      }\r\n    } while (i<=j);\r\n    if (j<K) l=i;\r\n    if (K<i) m=j;\r\n  }\r\n  return A[K];\r\n}\r\n\r\nint main() {\r\n  int myints[7] = {23, 1, 33, -20, 6, 6, 9};\r\n  vector<int> A (myints, myints + sizeof(myints) \/ sizeof(int) );\r\n\r\n  cout << \"Element \" << A.size()\/2 << \" is: \" << findKMedian(A,A.size()\/2) << endl;\r\n}\r\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Here are three C++ programs to find the median of a set of numbers. The first is short and simple. The last is fiddly and fast. Simple but not linear (n*log n time): Sort using STL sort (presumably quicksort), then find the middle value: #include #include int main() { int A[7] = {23, 1, 33, [&hellip;]<\/p>\n","protected":false},"author":52,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[3],"tags":[],"class_list":["post-205","post","type-post","status-publish","format-standard","hentry","category-techjournal"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/205","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/users\/52"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=205"}],"version-history":[{"count":3,"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/205\/revisions"}],"predecessor-version":[{"id":1951,"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/205\/revisions\/1951"}],"wp:attachment":[{"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=205"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=205"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tomfotherby.com\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=205"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}