function varargout = histc(data, edges, dim) %HISTC histogram counts of data falling in bins defined by edges. % % [counts, bins] = histc(data, edges, dim) % % Inputs: % data LxNxMx... arbitrary array % edges Kx1 kth bin defined by edges(k) <= x < edges(k+1) % except last bin, which counts only x == edges(K) % dim 1x1 dimension D of data from which to extract vectors % % Outputs: % dimension D is K long, other dimensions match size(data) % | % counts LxKxMx... Each vector of counts gives the number of items from the % corresponding data vector falling into each bin. That is: % counts(l,k,m,...) == sum(bins(l,:,m,...)==k) % bins LxNxMx... integers 1..K giving the data's bin-assignments % % Clone of Matlab's histc for use with Octave, which currently has no % implementation. It works in Matlab too, but its standard library's mex file is % much faster, so use that! I was too lazy to write a .oct file, but that is the % right way to go for this (harder to maintain though). % % Note: sum(counts) might not equal prod(size(data)), even with extremal edges % of -Inf and Inf because of NaN's, which don't go in any bin. % % ----------------------------------------------------------------------- % % These 5 points, beyond extreme edges, are ignored % \ / % \ | | | | | / % data: xx | x x xx | | xxx| xx x % | | | | | % edge: 1 2 3 4 5 % % cum_counts: 2 3 6 6 9 % counts: 1 3 0 3 0 % \_ special count: no data exactly on % final boundary % % NOTE, Nov 2009: I never got around to making this pass all of my own weird % testcases. Then Octave got it's own histc. That passes even fewer test cases, % but if I ever find a tuit, I should fix probably that version. % Iain Murray, November 2007, July 2008 edges = edges(:); % Matlab seems to do this if any(isnan(edges)) error('This implementation of histc forbids NaNs in edges. It is unclear what you might want in this case.'); end if any(diff(edges) < 0) error('Edges must be in increasing order'); end if nargin < 3 dim = find(size(data)~=1, 1); % first non-singleton dimension (can be zero) if isempty(dim) dim = 2; % This is just what Matlab does end end if ~isempty(data) varargout = cell(1, max(1, nargout)); [varargout{:}] = dimfun(@(x) histc_vec(x, edges, dim), dim, data); else siz = size(data); siz(length(siz)+1:dim) = 1; siz(dim) = length(edges); counts = zeros(siz); ids = zeros(size(data)); varargout = {counts, ids}; end function [counts, bins] = histc_vec(data, edges, dim) K = length(edges); N = length(data); get_bins = nargout > 1; % Set up and filling last bin counts = zeros([ones(1, dim-1), K, 1]); if get_bins bins = zeros([ones(1, dim-1), N, 1]); end if isempty(edges) return % prevent crashes for this trivial case end the_end = (data == edges(end)); counts(end) = sum(the_end); if get_bins bins(the_end) = K; end % Valid ones valid = (edges(1) <= data) & (data < edges(end)); if any(valid) if get_bins [counts(1:end-1), bins(valid)] = histc_helper(data(valid), edges, 1); else counts(1:end-1) = histc_counts_only_helper(data(valid), edges); end end function [counts, bins] = histc_helper(data, edges, base) K = length(edges); N = length(data); if K == 2 counts = N; bins = base*ones(N, 1); else mid = ceil(K/2); set1 = (data < edges(mid)); set2 = ~set1; counts = zeros(K-1, 1); bins = zeros(N, 1); [counts(1:mid-1), bins(set1)] = histc_helper(data(set1), edges(1:mid), base); [counts(mid:end), bins(set2)] = histc_helper(data(set2), edges(mid:end), base + mid - 1); end function counts = histc_counts_only_helper(data, edges) % Just the above function with bins stuff stripped out for efficiency K = length(edges); N = length(data); if K == 2 counts = N; else mid = ceil(K/2); set1 = (data < edges(mid)); set2 = ~set1; counts = zeros(K-1, 1); counts(1:mid-1) = histc_counts_only_helper(data(set1), edges(1:mid)); counts(mid:end) = histc_counts_only_helper(data(set2), edges(mid:end)); end