German Umlaute with Perl Text::CSV

Why are my umlauts missing?


The following code is simple and easy to understand, but it took me a while to figure out what was wrong with it. It was late at night, a quick production hack was waiting, and a few other things were going on at the time. I was using Text::CSV to write a well-formed CSV file with some formatting options.

In short - I had the following (sample) data:

my @data = (
      { a => 'Eklige Heisswasseraufbruehsuppe',    b => 'Fiat Punto' },
      { a => 'Drogenhölle Kleintierzüchterverein', b => 'Lada Niva' },
      { a => 'Konsumrausch',                       b => 'Audi A4 Cabrio' },
);

Interestingly enough - the following script will persist that data into a csv file.

  #!/usr/bin/env perl
  use strict;
  use warnings;
  use Text::CSV;
  my $csv = Text::CSV->new();
  open my $fh, ">", "new.csv" or die "new.csv: $!";
  $csv->eol ("\n");
  foreach my $row (@data) {
    $csv->print($fh, [ $row->{a}, $row->{b} ]);
  }
  close $fh or die "new.csv: $!";

If you check the resulting file, you’ll find that the row with the umlauts is missing. Text::CSV doesn’t say anything; the row is simply not there. The script never checks the return value of print, so the failure goes unnoticed. Checking the documentation brings something to light:


Important Note: The default behavior is to only accept ASCII characters. This means that
fields can not contain newlines. If your data contains newlines embedded in fields, or
characters above 0x7e (tilde), or binary data, you *must* set binary => 1 in the call
to new ()

Fun fact - umlauts sit above the tilde character and are therefore outside the range Text::CSV accepts unless you pass binary => 1 to the constructor. Another fun fact - the author puts it like this: you want this option always. I had equated binary with images, so it never occurred to me that this option had anything to do with my missing umlauts.

  my $csv = Text::CSV->new({binary => 1});
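For anyone who would rather see the failure than silently lose a row, here is a minimal sketch of the same script with binary => 1 and with the return values checked. The `use utf8` pragma, the `:encoding(UTF-8)` output layer and the `auto_diag` option are my additions for illustration and were not part of the original quick hack. The exact behaviour without binary => 1 may also differ between Text::CSV versions and backends.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                      # the source file itself contains umlauts
use Text::CSV;

# Sample rows as an array of hash references.
my @data = (
    { a => 'Eklige Heisswasseraufbruehsuppe',    b => 'Fiat Punto' },
    { a => 'Drogenhölle Kleintierzüchterverein', b => 'Lada Niva' },
    { a => 'Konsumrausch',                       b => 'Audi A4 Cabrio' },
);

# binary => 1 allows characters outside printable ASCII.
# auto_diag => 1 (an addition for this sketch) makes Text::CSV report errors.
my $csv = Text::CSV->new({ binary => 1, auto_diag => 1, eol => "\n" })
    or die "Cannot create Text::CSV object: " . Text::CSV->error_diag();

# Write through an encoding layer so the umlauts end up as UTF-8 bytes.
open my $fh, '>:encoding(UTF-8)', 'new.csv' or die "new.csv: $!";

foreach my $row (@data) {
    # print returns false on failure, so check it instead of ignoring it.
    $csv->print($fh, [ $row->{a}, $row->{b} ])
        or die "Could not write row: " . $csv->error_diag();
}

close $fh or die "new.csv: $!";
```

Checking the result of print is what would have turned the missing row into a visible error in the first place.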

Yes, it’s stated in the documentation, but it’s easy to overlook even though it’s written in bold. A single remark in the documentation such as "(umlauts are well above the tilde)" would have fixed that. Well, it was late at night.
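The "umlauts are above the tilde" part is easy to verify. The following sketch uses only core Perl and prints the code points of the tilde and of a few German characters. The helper `has_char_above_tilde` and the list of characters are hypothetical names and choices of mine, purely for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                             # umlauts appear literally in this source
binmode STDOUT, ':encoding(UTF-8)';   # print them as UTF-8

my $tilde = ord '~';                  # 0x7e, the limit quoted in the docs

# Hypothetical helper: true if any character in the string lies above the tilde.
sub has_char_above_tilde {
    my ($text) = @_;
    return scalar grep { ord($_) > $tilde } split //, $text;
}

# Show the code point of the tilde and of some German characters.
printf "%s = 0x%02x\n", $_, ord($_) for '~', 'ä', 'ö', 'ü', 'ß';

# Classify two of the sample fields.
for my $field ('Konsumrausch', 'Drogenhölle Kleintierzüchterverein') {
    printf "%-36s %s\n", $field,
        has_char_above_tilde($field) ? 'needs binary => 1' : 'plain ASCII';
}
```

The tilde is 0x7e, while ä, ö, ü and ß come out as 0xe4, 0xf6, 0xfc and 0xdf, so every one of them falls into the range the documentation warns about.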

Just in case someone googles this in the same situation - smile, you are not alone.
