Am programmer. Idk wtf that is. But if it converts easily to a datetime object, or if I can easily parse the parts out of it, I’m all for it. Idgaf if it’s easy to read as-is. Just make it efficient and make it sort predictably, and I’m all for it lol.
A single separator is better than a choice of separators to mean the same thing.
A space is not as apparent in a large log of data as a capital T.
Human language is not as strict as a programming language. There is a reason you see people still using “alot” and “a lot”. That just proves it’s easy to overlook and commonly happens.
This is the killer for me. Most people promote ISO 8601 as a “definitive” date structure, when it actually supports a lot of different formats. What they actually want is usually RFC 3339.
2023-12-12T21:18-06 for Central time. The UTC offset at the end just tells you where the time is taken from. Usually Z is used since, well, it’s “universal,” but having a +13 or -06 or whatever else brings context, and allows computers to convert the string of text into a comparable time for event logs and such.
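For what it's worth, here is a quick Python sketch of the "comparable time" part. It assumes Python 3.7+ and the standard datetime module, and the two timestamps are made up for illustration:

```python
from datetime import datetime, timezone

# Two made-up log timestamps for the same instant, written with different offsets.
local = datetime.fromisoformat("2023-12-12T21:18:00-06:00")
utc = datetime.fromisoformat("2023-12-13T03:18:00+00:00")

# Offset-aware datetimes compare by the instant they represent, not by their text.
same_instant = local == utc

# Normalizing to UTC gives one canonical form that is convenient for event logs.
local_as_utc = local.astimezone(timezone.utc).isoformat()

print(same_instant, local_as_utc)
```

Without the offset in the string, the parser would hand back a naive datetime and the comparison above would not be meaningful.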
I’d rather have an explicit time zone any time a datetime is being passed around code as a string. Communicating it to a human is relatively safe since even if there’s a mistake, it’s directly visible. Before that last step, incorrect time zone parsing or implicit time zone assumptions in code that was written by “who knows” in the year “who knows” can be really annoying.
Thanks for the link. Reading it gave me a headache. Not because of the proposal, but because of the very clear explanation it includes of just how annoying time zones are. I never even thought about the fact that a time relative to a UTC timestamp isn’t uniquely associated with another UTC timestamp because the local UTC offset can change. It’s obvious when you say it, but now I’m wondering if I have more time zone bugs somewhere.
It only just hit me a month or two ago just what a timezone, as described by IANA, actually is.
I’m from the eastern half of the US state of North Dakota. We run on what we’d colloquially call “central time”, often abbreviated CST. That’s UTC-6:00 in winter and UTC-5:00 in summer (technically CDT, but whatever).
Long ago I had it passed down to me from on high that the IANA timezone indicator I should use for my local time is America/Chicago. Ok. Easy enough. Why Chicago, though? I long guessed because it happens to be one of the largest localities in the CST block? That is in fact the answer if you read the rationale of the tz database, but I did not know this at the time.
What threw me off, though, is that there are other localities that seemingly map to the same time zone block. Like America/Mexico_City, or America/Indianapolis. What’s up with those? When I set my computer system clock to them, they behave just like America/Chicago does. Why are these here? And why these cities, specifically?
Then, imagine the loop I was thrown for when I discovered three timezone definitions exclusive to North Dakota. Those being America/North_Dakota/Beulah, …/…/Center, and …/…/New_Salem. What the fuck…?? These are literal nowhere towns. Midwest America is the middle of nowhere. North Dakota is the middle of nowhere within the Midwest. And these three towns are the middle of nowhere to the rest of us in North Dakota. What is going on? Why are there three tiny timezones in the middle of nowhere in the middle of nowhere in the middle of nowhere? And they’re all right next to each other!
Then, it clicked. What do these three places have in common? These towns all used to be in the next timezone over (“Mountain Time”, MST), but later decided to jump over to CST.
There’s a humorous story for why this happened. Supposedly, drinkers in the capital city, Bismarck, would stay to bar close. Then, they’d all hop in their cars and drunk drive to the sister city across the river, Mandan, for an extra hour of fun, causing untold chaos in the process. The jump was allegedly to curb this. Sadly, that story is apocryphal. In reality, it was just because it was economically favorable to be time-aligned with the state capital city. But I digress…
If you were, say, looking over historic records of events recorded in both Bismarck and Beulah, where records are always taken simultaneously, and your data happened to span back before this switchover, there would be an inexplicable point in time where after it the timestamps would match, but before it, they’d be offset. So, to encode that, Beulah gets its own unique timezone all to itself that indicates this historical switchover exists.
It also explains why there are three tiny timezones all right next to one another. Three counties participated in this switchover, and to make it happen, each one had to individually pass laws to enact it. These laws all took effect on slightly different dates. Thus, if we wish to capture the nuanced time shifts in all three counties, each county needs its own bespoke timezone.
IANA timezones aren’t just representations of all the time zones that currently exist. They are representations of every unique permutation of historic clock changes for every place on Earth. That’s fucking nuts! Knowing that, I went from being shocked that there are so many timezones to being shocked that the list of timezones is as short as it is!
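If you want to see one of these historical switchovers for yourself, here is a rough Python sketch. It assumes Python 3.9+ with zoneinfo and a tz database available on the system (or the tzdata package installed). The helper name offset_hours and the choice of 2010 and 2011 as the years to compare are mine, picked on the assumption that they straddle the Beulah change as recorded in the tz database:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # needs the system tz database or the tzdata package


def offset_hours(zone_name: str, year: int) -> float:
    """UTC offset in hours at noon on January 15 of the given year."""
    moment = datetime(year, 1, 15, 12, 0, tzinfo=ZoneInfo(zone_name))
    return moment.utcoffset() / timedelta(hours=1)


# Compare a mid-winter date in two years for both zones.
for zone in ("America/Chicago", "America/North_Dakota/Beulah"):
    print(zone, offset_hours(zone, 2010), offset_hours(zone, 2011))
```

The two zones agree today, but the zone name carries the history, which is why old records only line up if they are tagged with the right one.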
What he means is, if you want to download the document from ISO that describes the standard, you have to pay a fee. Here’s their store page: click.
It’s about 190 USD for a 38 page document describing the rules of the standard. There’s another document with extensions for a similar price. Quite pricey for a PDF file obviously, and the RFC is free to download.
On the other hand, no one in the history of time has gone “hmm, I don’t know how ISO-8601 works, let me go buy this document from the ISO store to figure it out.” Most people just call datetime.isoformat() or whatever their library function is called.
This is about the old argument around how date strings are formatted.
MMDDYYYY vs YYYYMMDD, spaces or hyphens may differ. It’s an old and passionate argument (mostly due to the American approach of starting with the month being insane)
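The practical side of that argument is sorting. A small Python sketch with made-up dates, showing that plain string sorting is chronological for YYYYMMDD but not for MMDDYYYY:

```python
from datetime import datetime

# The same three made-up dates in both layouts.
ymd = ["20231212", "20240105", "20230301"]
mdy = ["12122023", "01052024", "03012023"]

# Plain lexicographic sort, which is what ls, sort, and most databases do with strings.
ymd_sorted = sorted(ymd)
mdy_sorted = sorted(mdy)

# The true chronological order for the MMDDYYYY strings needs real date parsing.
mdy_chronological = sorted(mdy, key=lambda s: datetime.strptime(s, "%m%d%Y"))

print(ymd_sorted)
print(mdy_sorted, mdy_chronological)
```

With year first, the most significant field leads the string, so text order and time order coincide.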
That’s a certain kind of skill I wouldn’t want the need to have. I just copy paste those timestamps into a terminal with date -d @ (and always forget the right syntax for that :D)
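For anyone who would rather not remember the shell syntax either, a minimal Python equivalent might look like this. The helper name epoch_to_iso is just for illustration, and it assumes the timestamp is in whole seconds since the Unix epoch:

```python
from datetime import datetime, timezone


def epoch_to_iso(seconds: int) -> str:
    """Render a Unix timestamp (seconds) as an ISO 8601 / RFC 3339 style UTC string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


print(epoch_to_iso(1702415880))  # 2023-12-12T21:18:00+00:00
```

Passing tz explicitly matters here: without it, fromtimestamp returns a naive datetime in whatever the machine's local zone happens to be.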
ISO standards need to be purchased to be viewed, RFCs are freely available requests for comment. The RFC 3339 format is effectively the same as the ISO format, except RFC 3339 allows for a space between the date and time components whereas the ISO format uses a “T” character to separate date and time components.
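In practice a lot of parsers accept both separators. A short Python sketch, assuming Python 3.7+ (where datetime.fromisoformat accepts any single character between the date and the time) and a made-up timestamp:

```python
from datetime import datetime

# The same made-up instant, once with "T" and once with a space.
with_t = datetime.fromisoformat("2023-12-12T21:18:00+00:00")
with_space = datetime.fromisoformat("2023-12-12 21:18:00+00:00")

both_equal = with_t == with_space

# isoformat() emits "T" by default; sep=" " gives the space-separated variant.
canonical = with_space.isoformat()
spaced = with_t.isoformat(sep=" ")

print(both_equal, canonical, spaced)
```

So the difference mostly bites when the string is handled as text (splitting, regexes, shell arguments) rather than when it goes through a proper parser.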
If you want to get real weird, RFCs are not standards but rather a request for other participants to comment on the proposal. RFCs tend to be pointed towards as de facto standards though, even before they become a BCP or STD.
This is the most junior developer comment I’ve seen in a while.
Nobody that’s competent thinks that shit is hard. That’s not the point.
The point is, it makes it easy to make mistakes. Somebody might see all of one type of strings, assume that’s the format, and forget to enclose the thing in quotes, causing mysterious bugs years later when a differently created date filters into the system. You might have a regex error, you might split incorrectly, you might make a query that works the wrong way and gives an incorrect aggregate, and none of that is due to lack of skill. It’s due to not knowing it’s the rfc standard, not the iso. It could be due to not even realizing the rfc allows for that or is different.
Software engineering in practice is not about making sure there is at least some way for people to use your library/standard/pattern. It’s about making sure the way to do it that’s most intuitive/obvious is also foolproof, easy, and efficient. Adding the space makes debugging harder and adds footguns which is exactly what good software engineers want to stay away from. Otherwise we’d all be writing in assembly. But since you aren’t, maybe you are the one with a skill issue. Either that or you really misunderstand this field.
The amount of things allowed by ISO 8601 is even more than what’s allowed by RFC 3339, if you take the time to look at ijmacd.github.io/rfc3339-iso8601/
On the command line, space is what separates each argument. If a path contains a space, you either have to quote the entire path, or use an escape character (e.g. the \ character in most shells, the backtick in PowerShell because Microsoft is weird, or the character’s hexadecimal value), otherwise the path will be passed to the command as separate arguments. For example, cat hello world.txt would try to print the files hello and world.txt.
It is a good practice to minimize the character set used by filenames, and best to only use English alphanumeric characters and certain symbols like -, _, and .. Non-printable characters (like the ASCII control characters), weird diacritics (like ő or ű), ligatures, or any characters that could be misinterpreted by a program should be avoided.
This is why byte-safe encodings, like base64 or percent-encoding, are important. Transmitting data directly as text runs the risk of mangling the characters because some program misinterpreted them.
but what does the command line matter for dates? sure every once in a while you’ll have to pass a date as an argument on the command line but I think usually that kind of data is handled by APIs without human intervention, so once these are set up properly, I don’t see the problem
rsync -a "somedir" "somedir_backup_$(date)"
If the date command returns an RFC-3339-formatted string, the filename will contain a space. If, for example, you want to iterate over the files using for d in $(find…) and forget to set $IFS properly, it can cause issues.
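One way to sidestep the problem is to build the name from a timestamp format that has no spaces at all. A minimal Python sketch, where the helper name backup_name and the compact YYYYMMDDTHHMMSSZ layout are my own choices for illustration rather than anything the thread prescribes:

```python
from datetime import datetime, timezone


def backup_name(base: str, when: datetime) -> str:
    """Hypothetical helper: append a compact UTC timestamp with no spaces or colons."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{base}_backup_{stamp}"


print(backup_name("somedir", datetime.now(timezone.utc)))
```

Names built this way also sort chronologically, and they survive an unquoted for loop because there is nothing for the shell to split on.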
Yeah? I once spent an entire week debugging a plaintext database because the software expected the record identifiers to be tokenized a certain way, but the original data source had spaces in those strings.
The software was the ISC DHCP server, the industry standard for decades and only EOL’d a year ago.
Sounds like a weekend that you could have saved if the software was just implemented properly and accepted spaces.
Something being an industry standard does not necessarily mean it’s good. Sometimes it just means it was the cheapest, or sometimes even just because it was used for so long. How long did it take for Torx to somewhat replace Phillips head screws despite being better in most cases?
I think date strings are made for human and machine readability. Similar to XML or JSON. So, why not improve systems so that we can have more human readable date strings? If you don’t care about human readability and want to make sure there is no confusion with spaces, you can just use epoch timestamps.
But $(date) does return a string with spaces, at least on every system I’ve ever used. And what’s so bad about the possibility of spaces in filenames? They’re slightly inconvenient in a command line, but I haven’t used a computer this century that didn’t support spaces in filenames.
Ok, I just reread it. I don’t see what you think I’m missing. You mean an improperly written find command misbehaving? The fact that a different date format could prevent a bug from manifesting doesn’t seem like much of an argument.
Spaces can exist in filenames. The only problem is that they have to be escaped. As the comment that you reread explained, cat hello world.txt would print the files hello and world.txt. If you wanted to print the file “hello world.txt” you’d either need to quote it (cat "hello world.txt") or escape the space (cat hello\ world.txt).
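You can watch the splitting happen without touching a shell. Python's shlex module follows POSIX-like quoting rules, so it is only an approximation of what a real shell does, but it shows the three cases above:

```python
import shlex

# Unquoted: the space splits the filename into two arguments.
unquoted = shlex.split("cat hello world.txt")

# Quoted or backslash-escaped: the filename stays one argument.
quoted = shlex.split('cat "hello world.txt"')
escaped = shlex.split(r"cat hello\ world.txt")

# shlex.quote produces a form that is safe to paste into a POSIX shell command.
safe = shlex.quote("hello world.txt")

print(unquoted, quoted, escaped, safe)
```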
will process the space-separated parts of each path as separate items. I had to work around this issue just two days ago, it’s an obscure thing that not everyone will keep in mind.
I’m not exactly fond of the space either, but man, the T is noisy. They could’ve gone with an underscore or something, so it actually looks like two different sections.