dating

The naming of data is a practical matter. It is an ordinary game.

We can give our files names that the computer unambiguously sorts in date and time order. This is probably the best method for matching data with notebook entries and for finding experiments in practice.

There are simple rules regarding these numbers. To keep our files ordered, use a consistent format of fixed-width digits, from the largest unit to the smallest.

No delimiters are necessary because the number of digits is fixed. A year is always four digits (at least into the useful future). Each month and each day is always two digits, in the range of 01 to 12 and 01 to 31, respectively. The time is similar; hours are always two digits in the range of 00 to 23 and minutes are always two digits in the range of 00 to 59.

Another way to express the format is YYYYMMDDHHMM. Seconds could be added. They could be expressed as four digits, as milliseconds, if required, or you could go to even finer units, but I have not been doing this.
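As a minimal sketch, a YYYYMMDDHHMM prefix can be generated rather than typed. This assumes Python and its standard datetime module; the function name date_prefix is made up for this example and is not part of any library.

```python
from datetime import datetime


def date_prefix(moment: datetime, with_time: bool = True) -> str:
    """Return a fixed-width YYYYMMDDHHMM (or YYYYMMDD) prefix.

    Hypothetical helper for illustration only.
    """
    # %m, %d, %H and %M are always zero-padded to two digits;
    # %Y gives four digits for the years we care about here.
    fmt = "%Y%m%d%H%M" if with_time else "%Y%m%d"
    return moment.strftime(fmt)


example = datetime(2021, 3, 4, 9, 5)
print(date_prefix(example))                   # 202103040905
print(date_prefix(example, with_time=False))  # 20210304
```

Passing datetime.now() instead of a fixed date would stamp a file with the current local time.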

Padding single digits with zeros is required. 198911 is always November 1989. It is never 1 January 1989; this would be 19890101. Note that removing finer numbers is consistent with this system. Omitting hours and minutes in one file name and including them in another works fine for sorting.

Because these numbers are ordered from largest to smallest, from the year at the left to the next smaller unit, months, down to the smallest, minutes, they can easily be sorted alphabetically, or numerically when they all have the same number of digits, which keeps them in chronological order. This is a critical aspect of the system; in addition to having no ambiguity, it is trivial for a computer to keep all files in order.
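The sorting claim can be checked with a short sketch. This assumes Python, and the file names are invented for illustration. Plain string sorting is enough. One caveat: when names mix coarse and fine prefixes, the prefixes should be compared as text, because as integers a shorter prefix is always a smaller number.

```python
# Hypothetical file names, invented for illustration.
names = [
    "20210304 Western blots",
    "198911 conference notes",
    "202103041530 confocal session",
    "19890101 new year",
    "20210304 GFP Hela cells confocal",
]

# Plain string sorting compares character by character from the left,
# so the year decides first, then the month, then the day, and so on.
ordered = sorted(names)
for name in ordered:
    print(name)

# Treating mixed-length prefixes as integers does not work:
# 198911 (November 1989) is a smaller number than 19890101 (1 January 1989).
print(int("198911") < int("19890101"))  # True, although November comes later
```

The names come out with 19890101 first, then 198911, then the two date-only entries for 20210304, and finally the 202103041530 entry.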

After the date, which should be the first part of the name for this to work, anything else can be added. For instance, all the following data from 4 March 2021 will be grouped and easy to find:
20210304 GFP Hela cells confocal
20210304 personal pics from phone
20210304 Western blots
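
Finding everything from one day, or one month, then reduces to matching the start of the name. A minimal sketch in Python; the helper names_for_date and the two extra file names are hypothetical and only there to show the filtering.

```python
def names_for_date(names, prefix):
    """Return the names that start with the given date prefix, sorted.

    Hypothetical helper for illustration only.
    """
    return sorted(name for name in names if name.startswith(prefix))


names = [
    "20210304 Western blots",
    "20210305 plate reader",  # invented example
    "20210304 GFP Hela cells confocal",
    "20210211 primer order",  # invented example
    "20210304 personal pics from phone",
]

march_4 = names_for_date(names, "20210304")  # one day
march = names_for_date(names, "202103")      # the whole month
print(march_4)
print(len(march))
```

A shorter prefix selects a longer stretch of time, which is the same property that lets coarse and fine dates share one system.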

This is not the ISO standard, but the underlying idea is the same. Name the data in a consistent, unambiguous manner that facilitates sorting. A Google search returns that the most common ISO date format is yyyy-MM-dd, for example "2000-10-31". The hyphens make it easier to read, but are not necessary for the computer. I argue that hyphens unnecessarily make the names longer, but really all that matters is consistency; always use hyphens or never use hyphens. (Never use periods.)

The ISO standard 8601 is meant for international data and, therefore, includes time zones and text cues for variability. Practically, we do not need these. For instance, we do not need "T" to signify where the days stop and hours begin. Also, we are not dating based on weeks. And we are breaking time into months and days; an alternative method is the year followed by a three-digit day, but this is not how we normally think. A problem with the strictest ISO format, such as "2022-04-03T10:08:16+00:00", is that it includes special characters, and not all of these are safe in file names on our computer operating systems. ":" and "+" should not be used, but "20220403T100816Z" or "202204031008" would work for our purposes, or change the time to be a count of seconds or milliseconds if you need that resolution.
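
If data arrive with a strict ISO timestamp, the conversion to a name-safe form can be sketched as follows. This assumes Python 3.7 or later and an input that carries an explicit offset. The function name is hypothetical, and converting to UTC before formatting is my assumption here, not a rule of the system.

```python
from datetime import datetime, timezone


def iso_to_name_stamp(iso_text: str, keep_seconds: bool = False) -> str:
    """Turn a strict ISO 8601 timestamp into a stamp without ':' or '+'.

    Hypothetical helper; expects an explicit offset such as +00:00.
    """
    moment = datetime.fromisoformat(iso_text)
    # Assumption: normalise to UTC so stamps from different zones sort together.
    moment_utc = moment.astimezone(timezone.utc)
    if keep_seconds:
        return moment_utc.strftime("%Y%m%dT%H%M%SZ")
    return moment_utc.strftime("%Y%m%d%H%M")


print(iso_to_name_stamp("2022-04-03T10:08:16+00:00", keep_seconds=True))  # 20220403T100816Z
print(iso_to_name_stamp("2022-04-03T10:08:16+00:00"))                     # 202204031008
```

Whichever form is chosen, the point about consistency still applies: pick one and use it for every file.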

The naming rules to keep our files ordered: a consistent format of fixed-width digits, from the largest unit to the smallest.

-----------

I learned this method not in a scientific context, but in a historical one. To recreate a timeline of historical events, each fact could be dated and easily sorted by date to try to suss out cause and effect. In the novel The Chaneysville Incident by David Bradley, published in 1981, the protagonist recreates events by keeping facts on index cards, each one numbered. This is also how the chapters, which tell the tale, are named. (He also uses an additional element, the name of the day of the week, but this is superfluous and inconsistent for our purposes, and I forgot about it until I reopened the book 40 years after first reading it.)

I realized early on that the key to organizing data was a strict time ordering of events or, even more important than the time ordering of the events, the ordering of when the data were recorded. In 1982, what I consciously took from this incredibly complex novel on race in America was an image of a man running in the woods, how his girlfriend held him, and a set of rules for recording data. The blurb on the cover by Charles R Larson of the Detroit News says, "A book which will have a remarkable effect on generations to come." It is now 20220403 and I'm still pushing generations to name their files by following these rules. It's 20220403 and I'm rereading the novel. I recommend reading it for its other qualities.

Bradley, David.
Chaneysville incident.
New York : Harper & Row, ©1981; Avon softcover May 1982, a.k.a. 198205 or 1982-05
ISBN: 0060104910 9780060104917

For those of us old enough to remember money as cash, this is how we count money. We start with the largest bills and work our way down to smaller denominations.

This is also how we count. We use digits from left to right, from biggest to smallest. In base 10, 231 is
2*(10^2) + 3*(10^1) + 1*(10^0)
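
The same expansion, written as a small Python sketch. The function name positional_value is made up for this example.

```python
def positional_value(digits: str, base: int = 10) -> int:
    """Sum each digit times the base raised to its position, counted from the right.

    Hypothetical helper for illustration only.
    """
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


# 2*(10^2) + 3*(10^1) + 1*(10^0)
print(positional_value("231"))  # 231
```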

Naming is discrete and incremental. Naming counts.
