Requirement: based on starting date and ending date, calculate all months (truncated dates like yyyy-mm-01
) that the range spans. Should include the month even if it’s not fully within the range.
Existing solution: iterate over all days within the range, truncate, and then remove duplicates using Unique():
Sequence(thisRow.[Start date], thisRow.[End date]).FormulaMap(
DateTimeTruncate(CurrentValue, "month")
).Unique()
My better solution: why iterate over every single day if we can safely skip 27 days? This way we’ll avoid unnecessary iterations and will only rarely have collisions, resulting in much fewer Unique()
checks. Note that we still need to append the final date in case it gets “jumped over”:
Sequence(thisRow.[Start date], thisRow.[End date], 28).ListCombine(thisRow.[End date]).FormulaMap(
DateTimeTruncate(CurrentValue, "month")
).Unique()
Performance improvement: 20x!
Moral of the story: always look for low-hanging performance optimization tricks
Even better (but more cumbersome) solution:
Both of the above are at best of O(N log N)
and at worst of O(N^2)
complexity. This is still not the most optimal, because the required calculation is pretty much linear, and we can avoid using the expensive unique check. If you have larger ranges of dates and want to optimize further, here’s what you can do:
- Calculate starting month as
Year * 12 + Month - 1
- Same for ending month
- Iterate over the range of month-indices, and in
FormulaMap
reconstrunct the date asDate(RoundDown(CurrentValue / 12), Remainder(CurrentValue, 12) + 1, 1)
This will be of proper O(N)
complexity.
Playground doc (see Date range section):