#157 Reducing Overlapping Ranges
Looking at solutions for how to aggregate a series of ranges while excluding any overlaps (e.g. set of date ranges)
Notes
Here’s the essential problem: given a set of ranges, calculate the net range coverage, excluding any overlap or duplication. The specific case I have in mind is a set of date ranges and I need to calculate the total duration without duplication/double counting.
The Ruby Range class offers some tantilizing functions
such as cover?, and Rails adds some spice e.g. overlaps?,
but they don’t get much closer to the goal.
Overlap of a date with an array of date intervals from stackoverflow is getting closer, and gave me a pointer to the range_operators that looks promising.
A Naïve Algorithm
Here’s the thinking:
- the data pairs are sorted by their first element
- then each subsquent pair will fall into one of three cases:
- start after the last end item, so can be added to the result in full
- end after the last end item, so the difference between the last end and the “new” end is added to the result
- end before the end of the last item, so can be ignored (already covered)
And here’s the implementation I used:
def naive_calculation(data)
data.sort_by { |pair| pair.first }.each_with_object(
{ duration: 0, last_end: 0 }
) do |pair, memo|
if pair.first > memo[:last_end]
memo[:duration] += pair.last - pair.first + 1
memo[:last_end] = pair.last
elsif pair.last > memo[:last_end]
memo[:duration] += pair.last - memo[:last_end]
memo[:last_end] = pair.last
end
end[:duration]
end
This passes all the test cases I’ve thrown at it so far, and actually turns out to be relatively straight-forward.
Using the Range Operators Gem
The range_operators gem offers a very useful method for this problem:
rangify. It reduces a set of ranges down to the smallest set of non-overlapping ranges.
e.g.
[5..10, 8..12, 6..8, 20..22].rangify
=> [5..12, 20..22]
So here’s the equivalent algorithm. The rangify call is the core, surrounded:
- pre: convert the 2-D array into an array of ranges
- post: sum the sizes of the resulting ranges
def range_operators_calculation(data)
data.map do |pair|
Range.new *pair
end.rangify.collect(&:size).sum
end
Test Run
See examples.rb for the implementation of both the naïve and range operators versions, tested with a couple of different datasets. Executing the file runs the test suite:
$ ./examples.rb
Run options: --seed 32192
# Running:
....
Finished in 0.001889s, 2117.5228 runs/s, 2117.5228 assertions/s.
4 runs, 4 assertions, 0 failures, 0 errors, 0 skips
Credits and References
- Range
- range_operators gem
- Overlap of a date with an array of date intervals from stackoverflow