Using Unix timestamps to generate a date series based on intervals such as
monthly or quarterly can be difficult. How many days does a month have? 30?
31? 28? 29? No matter which number you pick, it will inevitably lead to the
problem that I call time swaying.
NOTE: I will be using the dd/mm/yyyy format for dates.
If we were to define a month as 31 days, the dates would sway forwards. Why
the sway happens is left as an exercise for the reader to figure out. Similar
problems arise when generating a weekly series, quarterly series, yearly
series, etc.
In this writeup, I will define a set of algorithms and operations that will help us generate date series that are more intuitive to humans, at the expense of accuracy.
Motivation
Generating a date series is required whenever a schedule is to be defined. For example, “schedule a weekly meeting”, or “scan the systems every month”, or “remind me to clean the room every week”.
We humans tend to ignore the nuances of the calendar when saying things like “every month” or “every week”, in order to reduce the cognitive burden it places on us. I believe computers should reflect this tendency and embrace our human nature.
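So when generating a monthly series I would expect it to look like: 01/01/2024, 01/02/2024, 01/03/2024, ...
or a weekly series: 01/01/2024, 08/01/2024, 15/01/2024, 22/01/2024, 01/02/2024, ...
or a month’s end series: 31/01/2024, 29/02/2024, 31/03/2024, 30/04/2024, ...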
Notice that these series are not accurate, that is, the gap between the dates changes. However, the dates are much more intuitive, as in, I can easily memorize them and anticipate them without thinking much.
The rest of this writeup is motivated by the desire to find a way to generate series that have this property.
Date
The most common method of defining a date is using a Unix timestamp. However, the Unix timestamp mashes the day, month and year into a single number, which will get in our way. So we will define a separate data structure for keeping dates.
We will use an ordered triple of (year, month, day) where the day and month will be 0-indexed. That is, 01/01/2024 would be represented as (2024, 0, 0).
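A minimal sketch of such a structure in Rust could look like the following. The struct layout, the `from_dmy` constructor and the `Display` impl are my own assumptions for illustration. `from_dmy` assumes valid 1-indexed input and does no validation.

```rust
use std::fmt;

/// A calendar date as the triple (year, month, day).
/// `month` and `day` are 0-indexed, so 01/01/2024 is (2024, 0, 0).
/// Deriving `Ord` in this field order gives chronological ordering.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

impl Date {
    /// Build a date from the 1-indexed day and month humans write down.
    fn from_dmy(day: u32, month: u32, year: i32) -> Date {
        Date { year, month: month - 1, day: day - 1 }
    }
}

impl fmt::Display for Date {
    /// Print in the dd/mm/yyyy format used throughout this writeup.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:02}/{:02}/{:04}", self.day + 1, self.month + 1, self.year)
    }
}
```

Keeping the fields separate is the whole point: later operations can touch the month without having to decode it from a single number first.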
Delta: Change in time
Our series is essentially an arithmetic progression defined as:
a[n] = a[0] + n * d
where a is the series we want, a[0] is the starting point, and d is the difference between each date.
Let’s only consider the case of n = 1, that is, a[1] = a[0] + d.
Here, as an example, both a[0] and d can be integers, as the operation + is defined on integers. For our case we will clearly have a[0] as a Date, but we can’t have d be a date, as it wouldn’t make logical sense to define Date + Date.
Therefore, we must define d as, say, Delta; and we must define the operation Date + Delta to return a Date.
Using the number of milliseconds or days as Delta will force us to define 1 month, which we know is problematic, so we would like to avoid that.
So we will define Delta in a similar fashion to our Date. We will use the ordered triple (d_years, d_months, d_days).
Now, we can naively (and incorrectly) define Date + Delta as (year + d_years, month + d_months, day + d_days). We can build upon that by taking into account that the month and day are cyclic and wrap around within their ranges. We can use modular arithmetic to help us with that.
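A rough sketch of that modular version, still with a hardcoded 31-day month, might look like this. The `Date` and `Delta` structs, `DAYS_PER_MONTH` and `add_naive` are illustrative names of my choosing, and deltas are assumed to be non-negative.

```rust
/// (year, month, day) with 0-indexed month and day.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Date {
    year: i32,
    month: u32,
    day: u32,
}

/// (d_years, d_months, d_days); non-negative in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Delta {
    years: i32,
    months: u32,
    days: u32,
}

/// Hardcoded month length, which is exactly the simplification we want to drop later.
const DAYS_PER_MONTH: u32 = 31;

/// Date + Delta with wrap-around: days carry into months, months carry into years.
fn add_naive(date: Date, delta: Delta) -> Date {
    let day_sum = date.day + delta.days;
    let month_sum = date.month + delta.months + day_sum / DAYS_PER_MONTH;
    Date {
        year: date.year + delta.years + (month_sum / 12) as i32,
        month: month_sum % 12,
        day: day_sum % DAYS_PER_MONTH,
    }
}
```

Adding one month to 31/01/2024, that is (2024, 0, 30), yields (2024, 1, 30), which would print as the non-existent 31/02/2024.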
This is better. Aside from defining a month to be 31 days, we have another edge case. Consider adding 1 month to 31/01/2024. Based on our algorithm, we would get 31/02/2024, which clearly doesn’t exist.
For both of these issues we will have to face our biggest enemy.
The Gregorian Calendar
If you have ever seen a calendar, it was most likely the Gregorian calendar. For our purposes, we will look at it as the root of all evil. In the Gregorian calendar, each month is assigned a fixed number of days, except for February, for which the number of days depends on the year.
Let’s define a function that returns the number of days for a given (month, year) pair.
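One possible shape for this function is sketched below, using the standard Gregorian leap-year rule and a 0-indexed month to match our Date triple. The `is_leap` helper is a name I introduced for readability.

```rust
/// Gregorian leap-year rule: divisible by 4, except centuries not divisible by 400.
fn is_leap(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in a 0-indexed `month` (0 = January) of `year`.
fn days_in(month: u32, year: i32) -> u32 {
    match month {
        1 => {
            if is_leap(year) {
                29
            } else {
                28
            }
        }
        // April, June, September, November
        3 | 5 | 8 | 10 => 30,
        _ => 31,
    }
}
```

Only February needs the year at all, but taking it as a parameter everywhere keeps the call sites uniform.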
With this knowledge we can now go back and try to improve the way we get our day. I have chosen to use logic similar to std::num::Saturating, saturating towards days_in(month, year) instead of simply wrapping around, so as to avoid potentially skipping months.
The reason we recompute day at the end, after evaluating month and year, is to avoid edge cases such as adding the delta (0, 1, 1) to 29/01/2024.
Finally, we can get rid of the hardcoded number of days in a month by replacing it with days_in(month, year).
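Putting the last few steps together, Date + Delta could be sketched roughly as follows. This is my reading of the description above rather than a definitive implementation: the struct and function names are assumptions, deltas are assumed non-negative, and the day carry is computed against the length of the starting month only, so it is only meaningful for day deltas shorter than about a month.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Date {
    year: i32,
    month: u32, // 0-indexed
    day: u32,   // 0-indexed
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Delta {
    years: i32,
    months: u32,
    days: u32,
}

/// Number of days in a 0-indexed month of the Gregorian calendar.
fn days_in(month: u32, year: i32) -> u32 {
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    match month {
        1 => if leap { 29 } else { 28 },
        3 | 5 | 8 | 10 => 30,
        _ => 31,
    }
}

/// Date + Delta: wrap days and months, then saturate the day last.
fn add(date: Date, delta: Delta) -> Date {
    // Days wrap within the length of the starting month and carry into months.
    let month_len = days_in(date.month, date.year);
    let day_sum = date.day + delta.days;
    let month_sum = date.month + delta.months + day_sum / month_len;
    let year = date.year + delta.years + (month_sum / 12) as i32;
    let month = month_sum % 12;
    // Recompute the day once month and year are known: clamp it to the
    // last valid day of the target month instead of spilling over.
    let day = (day_sum % month_len).min(days_in(month, year) - 1);
    Date { year, month, day }
}
```

With this sketch, 31/01/2024 plus one month lands on 29/02/2024, and 29/01/2024 plus (0, 1, 1) also lands on 29/02/2024 instead of an invalid 30/02/2024.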
With this, we now have a fairly decent definition of Date + Delta and we can now calculate a[1] for any given delta and a[0]. Now it should be trivial to extend to arbitrary n, right? …right?
Associativity of Date + Delta + Delta + ...
Consider the case of n = 2. We can formulate a[2] as:
a[2] = (a[0] + d) + d
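This works for deltas like d = (0, 1, 0) or d = (0, 0, 1), but with d = (0, 0, 2) or d = (0, 0, 7) we run into the same time swaying problem as before.
For example, with d = (0, 0, 7) and a[0] = 01/01/2024, repeated addition gives 08/01/2024, 15/01/2024, 22/01/2024 and 29/01/2024, followed by 05/02/2024 rather than 01/02/2024.
This is due to the fact that the range of days_in is {28, 29, 30, 31}, and 2 or 7 doesn’t evenly divide most of them.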
We can pick the number of days in a month depending on the number of days in our delta to make sure they fit together perfectly. But that is a step back.
Another approach will be to add the deltas together and modulate them instead.
This introduces the worst concept in this entire system:
Cycles
Cycles are objects that build upon deltas to capture the concept of something repeating. In our system, they are simply the definition of addition for two deltas. But they also allow us to specify how to modulate the day component in our delta.
Let’s use an ordered tuple of (delta, cycle), where delta is the delta we are gonna add another delta to and cycle is the number of days we are gonna modulate the day over. For example, a weekly cycle will be the tuple ((0, 0, 7), 28).
Aside: Here cycle can be derived implicitly from the day component of the delta, but I am opting to keep it explicit until I am sure there’s no edge case to consider.
We can add a delta to this cycle with the following definition.
We can also define some popular cycles like:
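As a hedged sketch of what such a definition and a couple of predefined cycles could look like: the `Cycle` struct, its `add` method and the `FORTNIGHTLY` constant below are assumptions of mine. Only the weekly tuple ((0, 0, 7), 28) comes from the text above. The idea is that the day component wraps over `cycle` days and every full wrap carries one month.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Delta {
    years: i32,
    months: u32,
    days: u32,
}

/// The tuple (delta, cycle): an accumulated delta plus the number of
/// days its day component is taken modulo. `cycle` must be non-zero.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Cycle {
    delta: Delta,
    cycle: u32,
}

impl Cycle {
    /// Cycle + Delta: add component-wise, wrap days over `cycle` (carrying
    /// into months) and wrap months over 12 (carrying into years).
    fn add(self, step: Delta) -> Cycle {
        let day_sum = self.delta.days + step.days;
        let month_sum = self.delta.months + step.months + day_sum / self.cycle;
        Cycle {
            delta: Delta {
                years: self.delta.years + step.years + (month_sum / 12) as i32,
                months: month_sum % 12,
                days: day_sum % self.cycle,
            },
            cycle: self.cycle,
        }
    }
}

/// The weekly cycle from the text: ((0, 0, 7), 28).
const WEEKLY: Cycle = Cycle {
    delta: Delta { years: 0, months: 0, days: 7 },
    cycle: 28,
};

/// Hypothetical fortnightly cycle, chosen because 14 divides 28.
const FORTNIGHTLY: Cycle = Cycle {
    delta: Delta { years: 0, months: 0, days: 14 },
    cycle: 28,
};
```

Adding the weekly step to `WEEKLY` three times accumulates 14, then 21, then 28 days, and the last one wraps to exactly one month, (0, 1, 0).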
At this point it is trivial to extend from n = 2 to arbitrary n.
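One way to make that extension concrete is an iterator that keeps accumulating the delta over a cycle and always adds the accumulated total to a[0], instead of chaining additions. The `Series` type and the `show` helper below are hypothetical, and the whole block is a sketch under my reading of the sections above, assuming non-negative deltas and a non-zero cycle.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Date { year: i32, month: u32, day: u32 } // month and day are 0-indexed

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Delta { years: i32, months: u32, days: u32 }

fn days_in(month: u32, year: i32) -> u32 {
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    match month {
        1 => if leap { 29 } else { 28 },
        3 | 5 | 8 | 10 => 30,
        _ => 31,
    }
}

/// Date + Delta with wrap-around and a saturating day.
fn add(date: Date, delta: Delta) -> Date {
    let month_len = days_in(date.month, date.year);
    let day_sum = date.day + delta.days;
    let month_sum = date.month + delta.months + day_sum / month_len;
    let year = date.year + delta.years + (month_sum / 12) as i32;
    let month = month_sum % 12;
    let day = (day_sum % month_len).min(days_in(month, year) - 1);
    Date { year, month, day }
}

/// Hypothetical series: yields a[0], then a[0] plus the accumulated delta.
struct Series { start: Date, step: Delta, cycle: u32, total: Delta }

impl Series {
    fn new(start: Date, step: Delta, cycle: u32) -> Series {
        Series { start, step, cycle, total: Delta { years: 0, months: 0, days: 0 } }
    }
}

impl Iterator for Series {
    type Item = Date;

    fn next(&mut self) -> Option<Date> {
        let current = add(self.start, self.total);
        // Accumulate the step: days wrap over `cycle` and carry into months.
        let day_sum = self.total.days + self.step.days;
        let month_sum = self.total.months + self.step.months + day_sum / self.cycle;
        self.total = Delta {
            years: self.total.years + self.step.years + (month_sum / 12) as i32,
            months: month_sum % 12,
            days: day_sum % self.cycle,
        };
        Some(current)
    }
}

/// Format as dd/mm/yyyy.
fn show(d: Date) -> String {
    format!("{:02}/{:02}/{:04}", d.day + 1, d.month + 1, d.year)
}
```

Because every element is computed from the original a[0], a month's end series that starts on 31/01/2024 returns to the 31st in March even though February clamps it to the 29th. For a delta with no day component the cycle length is never exercised, so any non-zero value works there.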
Putting it all together
I have placed all of this together in a crate and wrapped it in a nice iterator interface, so that a date series can be consumed like any other iterator.
Conclusion
We were able to define some methods to achieve the objective that we set out for, but along the way we noticed there are a bunch of edge cases to take care of. And because this is not a formal proof of the method, I can’t guarantee there are no other edge cases in our system.
However, I am quite sure that exposing this idea to the public will bring in some extremely valuable scrutiny, criticism and improvements. I welcome you, reader, to scrutinize the concepts stated here so that together we can get applications to adopt more human-centric methods of auto-scheduling.
To save you all the trouble, I will be releasing an implementation of these methods as a crate in the near future.
Thank you for reading.