QUESTION :
The question of how to handle dates stored as text in Excel is common, and so are the answers. But can anyone explain why it happens — and in particular, whose “fault” is it?
I use a tool that can export financial reports to Excel, and the dates in the resulting Excel data have this problem. But is the problem being caused by the tool not exporting sensibly; or is there perhaps something quirky about Excel that makes this seemingly simple data representation job very difficult?
I am inclined to go for the former and blame the exporting tool. I’d have thought that what it should export is the date’s Excel serial number. (Of course Excel may still need to be told to display that as a date, but at least the data would be accurate.) But instead what the tool exports is the string corresponding to the date formatted according to whatever formatting options I am using (in the exporting tool). So, for example, I display dates in the big endian ISO 8601 format, in which today would be 2019-09-27. And so I’d have thought the tool should export that to Excel as the number 43735.0. But it doesn’t. Instead what I find in the relevant Excel cell is, literally, the ten-character string, “2019-09-27”.
Thing is, Excel does know how to handle that on its own — if you give it a kick. So if I select the string-filled cell, position my cursor in the Formula Bar at the rightmost end of the content, and just hit Return, then the text immediately gets reinterpreted as a date, the string gets replaced by the appropriate serial number, and it even gets formatted as Date (albeit not in yyyy-mm-dd — instead it gets formatted in the default according to my region settings).
So, which is the culprit: exporting tool, or Excel?
(If it’s Excel…well whatchagonnado. But if it’s the tool, then someone’s getting a cuff around the ear. Please give me permission to dispense said cuffing.)
ANSWER :
So, which is the culprit: exporting tool, or Excel?
In one word, both.
- Any export utility that creates an Excel formatted file (xls, xlsx, etc) should use Excel formats but far too many don’t and we all have to live with the consequences. Cuffing dispensary is authorized.
But it really takes two to Tango…
- Excel has a horrible reputation for making it difficult to work CSV files. Most people open a CSV in Excel by double clicking the file. That is understandable because that is how we’ve been conditioned to do that through repetition; and it’s also wrong. Excel makes makes assumptions about files opened this way and it usually does poorly. The correct way is to open Excel and then import the CSV from the data tab. This gives you the opportunity to explicitly declare the data type and format. It is more cumbersome than double clicking but it solves many of the problems people complain about. Excel is not blameless, but in most cases, it is the user that needs the cuffing.
There’s an amusing cycle of annoyance that frequently occurs when lesser informed people are tasked with editing a CSV. They open the file by double clicking, make thier edits, grumble about fixing dates and currencies, save the file as a CSV, and then go ballistic after they open the file to review it and see they’ve lost all the format changes.
They will often repeat the same process two, three, who knows how many times!
They expect the CSV file to retain Excel’s formatting and that will never happen. Even a CSV that was edited in Excel must be imported back into Excel. Sometimes it’s hard to convince them because they know they changed the date format. Those are the times I particularly enjoy, when I am there to open the perfectly formatted CSV in Notepad and see the look on thier face as they start counting the hours they’ve wasted on a non-existent problem. The smarter among them them cuff themselves, the others are not worth cuffing.