Discussion:
The path to a Time type
Milktrader
2012-11-27 22:19:21 UTC
Permalink
What work is being done to create a Time type?

I did find two time-related functions in base/libc.jl ...

julia> lastday = strptime("%Y-%m-%d","2012-12-21")
1.356066e9

julia> typeof(lastday)
Float64

julia> strftime(lastday)
"Fri Dec 21 00:00:00 2012"

So presumably lastday is a Float64 representing seconds since Unix epoch.
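
One way to check that reading is to redo the round trip outside Julia. The sketch below uses Python's standard library purely for illustration (it is not the base/libc.jl API, and the helper names are made up) and treats the date as UTC.

```python
import calendar
import time

def to_epoch_seconds(text, fmt="%Y-%m-%d"):
    """Parse a date string as UTC and return seconds since the Unix epoch."""
    return float(calendar.timegm(time.strptime(text, fmt)))

def from_epoch_seconds(seconds):
    """Format epoch seconds as a UTC calendar string."""
    return time.strftime("%a %b %d %H:%M:%S %Y", time.gmtime(seconds))

utc_midnight = to_epoch_seconds("2012-12-21")
print(utc_midnight)                      # 1356048000.0
print(from_epoch_seconds(utc_midnight))  # Fri Dec 21 00:00:00 2012
# The value in the post (1.356066e9) is 18000 s (5 hours) larger, which is
# what local midnight in a UTC-5 zone would give instead of UTC midnight.
print(1356066000 - utc_midnight)         # 18000.0
```

So the Float64 does look like seconds since the epoch, with the string interpreted in the local timezone rather than in UTC.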

My end goal is a type similar to zoo or xts in R, which is basically a matrix
indexed by a vector of a time class.
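
To make the idea concrete, here is a minimal sketch of such a structure in Python, assuming timestamps are plain epoch seconds. The class and method names are hypothetical and this is not how zoo or xts are implemented; it only shows rows kept sorted by time so that lookups can use binary search.

```python
from bisect import bisect_left, bisect_right

class TimeIndexedMatrix:
    """Rows of numeric data keyed by a sorted vector of timestamps."""

    def __init__(self, times, rows):
        if len(times) != len(rows):
            raise ValueError("need one timestamp per row")
        # Sort rows by timestamp so lookups can use binary search.
        order = sorted(range(len(times)), key=lambda i: times[i])
        self.times = [times[i] for i in order]
        self.rows = [rows[i] for i in order]

    def at(self, t):
        """Return the row stamped exactly at t, or raise KeyError."""
        i = bisect_left(self.times, t)
        if i == len(self.times) or self.times[i] != t:
            raise KeyError(t)
        return self.rows[i]

    def between(self, start, stop):
        """Return rows with start <= timestamp <= stop."""
        lo = bisect_left(self.times, start)
        hi = bisect_right(self.times, stop)
        return self.rows[lo:hi]

day = 86400
base = 1356048000  # 2012-12-21T00:00:00 UTC
prices = TimeIndexedMatrix(
    [base + 2 * day, base, base + day],
    [[12.0, 3], [10.0, 1], [11.0, 2]],
)
print(prices.at(base))                   # [10.0, 1]
print(prices.between(base, base + day))  # [[10.0, 1], [11.0, 2]]
```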

Is anyone working on this and if not, what is a good place for me to start?

Regards,

Dan

--
Avik Sengupta
2012-11-28 01:22:14 UTC
Permalink
There have been some efforts towards dates and times

There is a very simple datetime class
at https://github.com/aviks/julia/blob/dates/base/date.jl

and a more complex implementation is
at https://github.com/JeffreySarnoff/julia/tree/jtm/extras/jtm . However,
all this is work in progress.

Regards
-
Avik
--
Milktrader
2012-11-28 14:11:12 UTC
Permalink
Thanks, this is a good start.

Dan
--
Stefan Karpinski
2012-11-28 15:13:46 UTC
Permalink
This post might be inappropriate. Click to display it.
John Myles White
2012-11-28 16:22:25 UTC
Permalink
I feel like we should just implement something simple (maybe using Avik's work) and get people to push on it until it breaks, then propose a clean, finalized design after we see where pain points exist.

-- John
--
Stefan Karpinski
2012-11-28 16:43:09 UTC
Permalink
I think the NumPy datetime64 thing is pretty well thought through, not that
complicated, and good for high-performance numerical stuff – which the
"time components" style representations are not good for (not compact,
operations are very slow). Since the design work is already done, once
someone groks that, implementing it shouldn't be all that hard in Julia as
the language lets you add bits types so easily.

Let's at least explore the NumPy-compatible option first. Dan, would you be
willing to do some research on the NumPy datetime64 type and figure out how
it works? My impression is that it represents timestamps as integers at
some unit of granularity. I'm unclear on whether it has flexible units of
granularity or if it's fixed. I haven't found a real spec yet either (after
a very cursory bit of googling). We should maybe drop Wes McKinney a line
since I'm pretty sure he knows all about this.
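
For anyone following along, the "integer count at some unit of granularity" idea can be sketched like this. It is written in Python with only the standard library, it is not NumPy's implementation, and the unit table and function name are assumptions made for the example.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Ticks per second for a few sub-day units (an illustrative subset).
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6}

def to_ticks(moment, unit):
    """Count whole ticks of `unit` between the Unix epoch and `moment`."""
    delta = moment - EPOCH
    micros = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
    return micros * TICKS_PER_SECOND[unit] // 10**6

moment = datetime(2012, 12, 21, 0, 0, 0, 500000, tzinfo=timezone.utc)
print(to_ticks(moment, "s"))   # 1356048000
print(to_ticks(moment, "ms"))  # 1356048000500
print(to_ticks(moment, "us"))  # 1356048000500000
```

The same instant becomes a different integer depending on the unit, which is why the unit has to be part of the type or stored alongside the value.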
Post by Stefan Karpinski
I've thought about this a bit and these are the time-related bits types I
1. A 64-bit timestamp compatible with NumPy's still-experimental
datetime64 type <http://docs.scipy.org/doc/numpy-dev/reference/arrays.datetime.html>.
They made the decision to ignore leap seconds for simplicity reasons which
I feel rather strongly to be a poor choice and a bit of a copout. So I
would advocate handling leap seconds correctly in the Julia implementation
and otherwise being compatible.
2. A 128-bit timestamp with a similar design to the 64-bit one but
with much greater simultaneous range and precision.
3. Corresponding timedelta types.
The first step here is to understand the NumPy datetime64 design better.
--
Milktrader
2012-11-28 17:23:33 UTC
Permalink
Okay, I'll look into it. Pandas (Wes) and xtime (Jeff Ryan) should be good
places to look also.
--
Stefan Karpinski
2012-11-28 21:06:33 UTC
Permalink
So I looked into datetime64 a bit more and found this "NEP" for it:

https://github.com/numpy/numpy/blob/master/doc/neps/datetime-proposal.rst

The proposal purports to add two types – datetime64 and timedelta64 – but
actually adds 28: a datetime64 and a timedelta64 type for each supported
time unit, from years down to attoseconds. Moreover, there's no good way to
implement code for these generically, because there's a fixed set of units
and the ones greater than seconds don't always consist of a fixed number of
seconds (e.g. there's a variable number of seconds in a year). I was hoping
for something significantly less complicated.

A slightly different (partially compatible) design would allow a single
parametric type to be at the core of all time operations. Something like
this:

bitstype 64 TimeDelta{p}


This is a bit like a NumPy timedelta64 except that the units are
seconds*10^-p. That way, you can generically write all the code for
TimeDeltas instead of having to write code for each unit separately. It's
also more flexible since you don't need to write any new code to get
precisions like zeptoseconds or yoctoseconds. Since 2^63 seconds is 292
billion years and the universe has only been around for 14 billion or so
years, it seems unlikely that there will be much call for computations with
p < 0, although we can, of course, write the code to handle that too.
As a bonus, unix timestamps have the identical in-memory representation as
TimeDelta{0} values. For units that are smaller than seconds, NumPy
datetime64 values have identical in-memory representation as corresponding
TimeDelta types.
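
To make the proposal easier to poke at, here is a rough model of it in Python. It is not Julia and not a bits type: the precision p is an ordinary field rather than a type parameter, and the 64-bit limit is emulated with a range check. All names are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_JULIAN_YEAR = 31_557_600  # 365.25 days of 86400 s

@dataclass(frozen=True)
class TimeDelta:
    """A duration stored as an integer count of 10**-p seconds."""
    ticks: int
    p: int = 0

    def __post_init__(self):
        # Emulate the 64-bit signed storage of the proposed bits type.
        if not -2**63 <= self.ticks < 2**63:
            raise OverflowError("ticks do not fit in 64 signed bits")

    def seconds(self):
        """Duration in seconds as a float (may lose precision)."""
        return self.ticks / 10**self.p

def max_span_years(p):
    """Largest positive duration, in Julian years, representable at precision p."""
    return (2**63 - 1) / 10**p / SECONDS_PER_JULIAN_YEAR

print(TimeDelta(1_500, p=3).seconds())  # 1.5
print(round(max_span_years(0) / 1e9))   # 292 (billion years)
print(round(max_span_years(9)))         # 292 (years, at nanosecond ticks)
```

Each extra digit of precision costs a factor of ten in range, so the 292 billion years available at p = 0 shrink to about 292 years at p = 9.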

[Aside: I would argue that letting "1 year" be a time delta is
fundamentally broken because "1 year" is not any fixed length of time. If
you want to add or subtract years, you are really doing integer arithmetic
and then interpreting the result as a number of years, not actually adding
duration of time.]

Since an absolute time is really just a time delta relative to a fixed
time, you can pretty easily define Time{p} in terms of TimeDelta{p} and
inherit all the functionality, just changing the presentation (how they
print, etc.) and a few other behaviors. In essence, all time values are
deltas, but some of those deltas are "free" and some are "fixed" relative
to a specific point in time, namely the UNIX epoch. The fixed ones know how
to display themselves as human-readable times whereas the free ones display
themselves as durations.

Operations with TimeDeltas are dirt simple: they just work like integers.
Operations with mixed p values have to be a little more tricky, but not too
badly so. NumPy opted for a "keep the precision" rule when using mixed
values of p and that seems sane so we might as well do the same thing. The
hard part of all of this is the presentation: "what are the year, month,
and day, hour, minute, and second portions of that number of attoseconds?"
If we can write the code to compute those things, then I think we're
basically good. Oh yeah, and time zones. Those bastards.
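
A sketch of the mixed-precision rule and of the easy half of the presentation problem, in Python for illustration. Durations are plain (ticks, p) pairs with p >= 0, the function names are hypothetical, and "keep the precision" is taken to mean "convert both operands to the finer unit".

```python
def rescale(ticks, p_from, p_to):
    """Convert a count of 10**-p_from seconds to 10**-p_to seconds."""
    if p_to < p_from:
        raise ValueError("rescaling to a coarser unit would lose precision")
    return ticks * 10 ** (p_to - p_from)

def add(a, b):
    """Add two (ticks, p) durations, keeping the finer precision."""
    (ticks_a, p_a), (ticks_b, p_b) = a, b
    p = max(p_a, p_b)
    return (rescale(ticks_a, p_a, p) + rescale(ticks_b, p_b, p), p)

def split_duration(ticks, p):
    """Split a non-negative duration into days, hours, minutes, seconds, sub-second ticks."""
    seconds, fraction = divmod(ticks, 10**p)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return days, hours, minutes, seconds, fraction

total = add((90_061, 0), (250, 3))  # 90061 s + 250 ms
print(total)                        # (90061250, 3)
print(split_duration(*total))       # (1, 1, 1, 1, 250)
```

This only covers the fixed-size pieces. Year and month need calendar rules on top of the day count, and that is before leap seconds or time zones come into it.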
--
Avik Sengupta
2012-11-28 22:42:43 UTC
Permalink
For what it's worth, I've created a pkg with my dates code, at
http://github.com/aviks/SimpleDate.jl . It's far from complete, and thus
does not exist in METADATA, but will hopefully help in the discussion.

That implementation uses a fixed delta of 1 day (actually a Julian Day),
but with parameterisable precision. It chooses a basis of a day since that
is the basis of most astronomical calendars.
Post by Stefan Karpinski
The hard part of all of this is the presentation: "what are the year,
month, and day, hour, minute, and second portions of that number of
attoseconds?"
The code to do this is within my package above. Again, the code is based on
a julian day, but you can easily convert to and from, eg, unix epochs.
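
As an illustration of that conversion (this is not the SimpleDate.jl code, just a Python sketch with hypothetical function names), a Julian Day and a Unix timestamp differ only by a scale of 86400 seconds per day and a fixed offset, since 1970-01-01T00:00:00 UTC is Julian Day 2440587.5. Leap seconds are ignored here, as they are in Unix time.

```python
from datetime import datetime, timezone

UNIX_EPOCH_JD = 2440587.5  # Julian Day of 1970-01-01T00:00:00 UTC
SECONDS_PER_DAY = 86400

def unix_to_julian_day(seconds):
    """Convert Unix epoch seconds to a fractional Julian Day."""
    return seconds / SECONDS_PER_DAY + UNIX_EPOCH_JD

def julian_day_to_unix(jd):
    """Convert a fractional Julian Day back to Unix epoch seconds."""
    return (jd - UNIX_EPOCH_JD) * SECONDS_PER_DAY

def components(seconds):
    """Year, month, day, hour, minute, second of a Unix timestamp in UTC."""
    m = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return (m.year, m.month, m.day, m.hour, m.minute, m.second)

jd = unix_to_julian_day(1356048000)
print(jd)                                  # 2456282.5
print(components(julian_day_to_unix(jd)))  # (2012, 12, 21, 0, 0, 0)
```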
leap seconds
Among other things, leap seconds cannot be determined algorithmically,
they'll need a lookup. Hence, they can cause a significant overhead in time
calculations. My code does not do leap seconds, but I believe Jeffrey's
does.
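
To show what such a lookup might look like, here is a small Python sketch. The table is only an excerpt of the published TAI-UTC offsets, the function names are hypothetical, and a real implementation would need to load and keep updating the full list.

```python
from bisect import bisect_right
from datetime import date

# Excerpt of the TAI-UTC table: (first UTC date the offset applies, offset in s).
LEAP_TABLE = [
    (date(2006, 1, 1), 33),
    (date(2009, 1, 1), 34),
    (date(2012, 7, 1), 35),
    (date(2015, 7, 1), 36),
    (date(2017, 1, 1), 37),
]
_STARTS = [start for start, _ in LEAP_TABLE]

def tai_minus_utc(day):
    """Look up the TAI-UTC offset in force on a UTC date via binary search."""
    i = bisect_right(_STARTS, day) - 1
    if i < 0:
        raise ValueError("date precedes the excerpted table")
    return LEAP_TABLE[i][1]

def elapsed_seconds(start, stop):
    """SI seconds between two UTC midnights, counting inserted leap seconds."""
    naive = (stop - start).days * 86400
    return naive + tai_minus_utc(stop) - tai_minus_utc(start)

print(tai_minus_utc(date(2012, 12, 21)))                   # 35
print(elapsed_seconds(date(2012, 6, 30), date(2012, 7, 1)))  # 86401
```

The binary search is cheap, but it is still a table lookup on every conversion, which is the overhead mentioned above.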
Time Zone
If timezone support is complicated, DST is yet another level! :)

One of the issues I've struggled with is the fact that timezone is an
additional attribute you need to store with a time, and so, without
bitmasking tricks, you end up using a composite type rather than a bits
type for a timezone recognising datetime object.
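
For the sake of discussion, the bitmasking trick could look something like the Python sketch below. The layout is entirely made up (12 low bits for a timezone id, 52 bits for a signed tick count); it is not taken from any of the implementations mentioned in this thread.

```python
TZ_BITS = 12                  # hypothetical: room for 4096 zone ids
TZ_MASK = (1 << TZ_BITS) - 1
TICK_BITS = 64 - TZ_BITS      # 52 bits left for the signed tick count
WORD_MASK = (1 << 64) - 1

def pack(ticks, tz_id):
    """Pack a signed tick count and a zone id into one unsigned 64-bit word."""
    if not 0 <= tz_id <= TZ_MASK:
        raise ValueError("timezone id out of range")
    half = 1 << (TICK_BITS - 1)
    if not -half <= ticks < half:
        raise OverflowError("ticks do not fit in the remaining bits")
    return ((ticks << TZ_BITS) | tz_id) & WORD_MASK

def unpack(word):
    """Recover (ticks, tz_id) from a packed 64-bit word."""
    tz_id = word & TZ_MASK
    ticks = word >> TZ_BITS
    if ticks >= 1 << (TICK_BITS - 1):  # restore the sign of negative counts
        ticks -= 1 << TICK_BITS
    return ticks, tz_id

print(unpack(pack(1356048000, 42)))  # (1356048000, 42)
print(unpack(pack(-5, 7)))           # (-5, 7)
```

The price is range: every bit given to the timezone halves the span of representable times.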

However, we already have julia code to query the unix timezone db.

--
Stefan Karpinski
2012-11-29 17:16:20 UTC
Permalink
This post might be inappropriate. Click to display it.
Patrick O'Leary
2012-11-29 17:21:40 UTC
Permalink
Post by Stefan Karpinski
A time is a fixed actual point in time (UTC) which can be presented in
different ways, including translating it into different timezones.
Why UTC? If we're picking a base time reference, shouldn't we pick one
which doesn't have discontinuities?

--
Stefan Karpinski
2012-11-29 17:54:16 UTC
Permalink
You're right. I started looking into this right after I wrote that and
realized it's a horrible mistake. The basic thing to do here is TimeDelta
which is just an amount of time and doesn't need to be tied to any place or
frame of reference, so that's simple at least. Time becomes harder because
it needs a frame of reference and then conversion to something like UTC
requires handling leap seconds, etc. But the TimeDelta definition is at
least simple and clear.


--
Jeffrey Sarnoff
2012-11-30 06:01:36 UTC
Permalink
Hi, all -- I have been traveling,

Yes -- I have a flock of code for times and dates all timezone ready and
accompanying datafiles for handling timezones.

I have had a datetime/timezone 64-bit bitstype with requisite scaffolding and
algorithmic support for a while.
At present I am re-modularizing source to comport with recent releases
(relying on modules without trepidation) and generally simplifying.
I am reworking the in-memory datastore for timezone specifics, to reduce
the aggregate size of the datafiles required.

Regarding TimeDelta:

There are a number of easily overlooked subtleties involved in getting
interconversions and round-trip evaluations to be stable and correct.
I wrote some notes about implementing a TimeDelta bits type at
https://github.com/JuliaLang/julia/pull/698#issuecomment-10858431.
Additional complexity is introduced if one would like to handle Julia's
high-resolution timer within a 64bit TimeDelta bitstype.
The high resolution timer uses nanosecond resolution. The most utilitarian
TimeDelta resolution is 1/10 of a microsecond
(this resolution is explained in the issuecomment linked above).

The additional 1/100 scale required for nanosecond resolution bumps up
against encoding many years unless 128bit Ints are used.
The only way to get 1000s of years of nanoseconds in 64 bits is to
leverage Julia's support for parameterized bitstypes.
At present, I am trying to fold that in an intelligent way.
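
The arithmetic behind that constraint can be checked with a few lines of Python. This is a sketch with a hypothetical helper name; it assumes a Julian year of 365.25 days and one bit reserved for the sign.

```python
SECONDS_PER_JULIAN_YEAR = 31_557_600  # 365.25 days of 86400 s

def signed_bits_needed(years, ticks_per_second):
    """Bits required to store +/- `years` as a signed integer tick count."""
    ticks = years * SECONDS_PER_JULIAN_YEAR * ticks_per_second
    return ticks.bit_length() + 1  # one extra bit for the sign

print(signed_bits_needed(1000, 10**7))  # 60: 1000 years of 100 ns ticks fit in 64 bits
print(signed_bits_needed(1000, 10**9))  # 66: 1000 years of 1 ns ticks do not
```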

I am preparing a coherent source post for datetime handling with support
for some common timezones preloaded.
The source to handle all standard timezones in a relatively compact way
requires more refactoring.
To assist the momentous efforts of others, I will post whatever is clean on
Monday.
--
Jeffrey Sarnoff
2012-12-04 03:59:00 UTC
Permalink
I am handwashing files and letting them dry at
https://github.com/jsarnoff/jtm.
The migration has started. The flock is large and pauses to feed its young.
--

Avik Sengupta
2012-11-29 17:39:59 UTC
Permalink
Post by Stefan Karpinski
Seems like a very reasonable choice for dates. I wonder if it makes sense
for date calculations and time calculations to be separated. They are
awfully different applications.
Yes, they are indeed very different use cases. I think it is reasonable to
have separate date and time classes, especially if you want arbitrary
precision times.
Post by Stefan Karpinski
However, we already have julia code to query the unix timezone db.
That is nice. Is that good enough? Jeffrey, you seem to have done a ton of
work on this stuff. Maybe we can ship with a very distilled timezone
database and leap second table? Depending on the system for this kind of
thing seems suboptimal.
Yes of course, I actually meant reading the IANA source db, not the
locally installed unix binary files.

--
Jacek Generowicz
2012-11-29 06:41:05 UTC
Permalink
I would argue that letting "1 year" be a time delta is fundamentally
broken because "1 year" is not any fixed length of time.
But it is a culturally important one.

Its implementation would be nontrivial and irregular, but whether it is
broken or not is up to the implementor.

--
Stefan Karpinski
2012-11-29 16:59:31 UTC
Permalink
Presenting an interface to this is a no-brainer – people want to add and
subtract years. We can even wrap Time objects in some higher-level type
that shows everything in terms of years. Or does whatever else is
convenient. Baking a broken concept into your core data types, however, is
a completely different matter. Obviously, it's not a show stopper since the
NumPy datetime64 stuff works just fine, but I suspect this is probably a
source of huge complication.

Basically, what I'm saying is that "year" shouldn't be a fundamental time
unit because there is no fixed time duration corresponding to a year.
Instead, it should be a presentation matter that is built on top of time
units that have an actual duration. Fortunately, with 64 bits, you're not
sacrificing anything by doing that because you can represent as many years
as you would ever need (585 billion of them) with time-precision down to
seconds.
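
To illustrate the distinction, here is a Python sketch that treats "add a year" as arithmetic on the calendar year field rather than as adding a fixed duration. The function names and the Feb 29 clamping policy are assumptions chosen for the example, not a proposed design.

```python
import calendar
from datetime import date

def days_in_year(year):
    """Number of days in a Gregorian calendar year."""
    return 366 if calendar.isleap(year) else 365

def add_years(day, years):
    """Shift the calendar year field; clamp Feb 29 to Feb 28 when needed."""
    target = day.year + years
    if day.month == 2 and day.day == 29 and not calendar.isleap(target):
        return day.replace(year=target, day=28)
    return day.replace(year=target)

print(days_in_year(2012), days_in_year(2013))  # 366 365
print(add_years(date(2012, 2, 29), 1))         # 2013-02-28
# The same "1 year" covers a different number of days depending on the start.
print((add_years(date(2011, 3, 1), 1) - date(2011, 3, 1)).days)  # 366
print((add_years(date(2012, 3, 1), 1) - date(2012, 3, 1)).days)  # 365
```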

--