How to use rsync on a Mac / macOS to sync files with a locally mounted FAT32 disk, and deal sensibly with Unicode characters, umlauts, accents, etc.

There’s a lot of discussion of the above problem: you are on a Mac, you are trying to rsync files from a source to a destination, and rsync complains about umlauts, complex quote characters, or other Unicode in the filenames.

If you are like me, you’ll waste time trying to work out which encodings and code pages apply to the FAT32 filesystem, assuming that that’s the correct approach; it turns out not to be. Some people will also claim that FAT32 has native code-page support, but that its Long File Name support apparently requires UTF-16, or perhaps UCS-2.

You’ll also run into people earnestly explaining that Mac HFS requires every filename to be put into “fully” decomposed Unicode form (an accented letter stored as a base letter plus a combining mark) before it is written, and that this is the problem. But then they stop short of giving you a solution.
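
To make the "decomposed" business concrete, here is a minimal sketch in plain shell. It only assumes a POSIX `printf` and `wc`; the variable names are mine. It builds the letter "ü" twice, once as a single precomposed code point and once as "u" plus a combining diaeresis, and shows that the two look identical on screen but are different byte strings, which is why a naive filename comparison treats them as different files.

```shell
# "ü" precomposed (NFC): U+00FC, UTF-8 bytes C3 BC (octal 303 274)
nfc=$(printf '\303\274')
# "ü" decomposed (NFD): "u" + U+0308 combining diaeresis, UTF-8 bytes 75 CC 88
nfd=$(printf 'u\314\210')

# Count the bytes in each form (tr strips the padding some wc versions add)
nfc_len=$(printf '%s' "$nfc" | wc -c | tr -d ' ')
nfd_len=$(printf '%s' "$nfd" | wc -c | tr -d ' ')
echo "NFC bytes: $nfc_len, NFD bytes: $nfd_len"

# A plain string comparison sees two different names
if [ "$nfc" = "$nfd" ]; then
  same=yes
else
  same=no
fi
echo "byte-identical: $same"
```

Both variables print as "ü" in a terminal, yet the comparison reports that they differ.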

You may even find people talking about solving a similar problem on Mac-to-Linux systems (e.g. a NAS) – or in passing you will learn that time resolution on FAT32 is about 2 seconds, so you can’t sensibly compare modification times without adding some kind of time-fuzz – but neither of these will solve the filename issue.
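
For what it's worth, the time-fuzz idea looks roughly like this. The sketch below is only arithmetic: it assumes (my assumption, the rounding direction may depend on the driver) that the FAT32 side keeps modification times truncated to even seconds, and the timestamp is a made-up example value. rsync has a real option for this kind of tolerance, `--modify-window=NUM`, shown in the comment; it plays no part in the filename fix.

```shell
# Example source mtime in seconds since the epoch (made-up value)
mtime=1700000001

# Assumption: FAT32 keeps only even seconds, so the stored time is truncated
fat_mtime=$(( mtime / 2 * 2 ))

# How far apart the two sides now appear to be
diff=$(( mtime - fat_mtime ))

# Allow times to differ by up to this many seconds
window=1
if [ "$diff" -le "$window" ]; then
  verdict=unchanged
else
  verdict=changed
fi
echo "source=$mtime fat32=$fat_mtime diff=$diff verdict=$verdict"

# The rsync equivalent of this tolerance would be something like:
#   rsync -av --modify-window=1 /from/dir/ /to/fat32dir/
```

With an exact comparison the file would look modified on every run; with a one-second window it is treated as unchanged.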

You are probably also annoyed by half a dozen blogposts by geeks who hack around this with Perl, or who attack the questioner to say WHY HAVEN’T YOU JUST REFORMATTED THE DISK TO BE <ext4, zfs, ntfs> YOU ENORMOUS BABOON? (tip: because the camera still has to read it, too)

The actual answer – the one you were looking for – is really simple.

  1. brew install rsync # because the macOS rsync is old (--iconv needs rsync 3.0 or later)
  2. rsync -av --iconv=utf-8-mac,utf-8-mac --size-only /from/dir/ /to/fat32dir/
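
Before running the real thing it may be worth checking that the rsync on your PATH is actually the new one. This is a rough sketch, not an official check: `has_iconv` is a helper name I made up, and it relies on `rsync --version` printing a capabilities list that contains either "iconv" or "no iconv".

```shell
# Hypothetical helper: takes the text of `rsync --version` as its argument
# and succeeds only if the capabilities list advertises iconv support.
has_iconv() {
  case $1 in
    *"no iconv"*) return 1 ;;  # built without iconv
    *iconv*)      return 0 ;;  # iconv listed as a capability
    *)            return 1 ;;  # too old, not rsync, or not installed
  esac
}

if has_iconv "$(rsync --version 2>/dev/null)"; then
  echo "rsync supports --iconv"
else
  echo "rsync lacks --iconv; check that the Homebrew rsync comes first in PATH"
fi
```

If it reports support, adding `-n` (dry run) to the command in step 2 lists what would be copied without touching the FAT32 disk.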

It seems to basically boil down to convincing rsync to decompose the filenames that it [previously created and] reads back from the FAT32 filesystem, before comparing them to what it can see in the HFS filesystem.

So, tell it on both sides that it has to use “utf-8-mac”, thereby levelling the filename-comparison playing field. Suddenly everything works fast again. It might be possible to tune it even further, to simplify more, but I’m so pleased with this that I just want to share.

Leave a comment if this worked for you and you appreciated the writeup.

Comments

  1. Yes, thank you. Writeups like this are awesome.

  2. sending incremental file list
    [sender] cannot convert filename: Pearl Jam/Yield/08 ?.mp3 (Illegal byte sequence)
    IO error encountered -- skipping file deletion

    it’s always something

    i added --exclude '*?*' and it stopped breaking

  3. Eamonn Webster

    Code blog 101: turn off smart formatting. An en-dash might look prettier than two hyphens, but it doesn’t work.

  4. Thanks, really. I was trying with --iconv=UTF-8-MAC,UTF-8 (which works in a different context, macOS to a remote NTFS filesystem mounted on a Raspberry Pi), but I was frustrated by the apparently simpler task of syncing to the USB key that I use in my car. I’d never have figured out this solution.

  5. Years later, this helped me also! I’ve been trying to reduce the number of files that re-sync needlessly, and changing my code from --iconv=utf-8-mac,utf-8 to --iconv=utf-8-mac,utf-8-mac cleared nearly all my duplicates.

    (There are still eight files that re-sync every time and that’s a mystery I haven’t solved yet, but still, that’s a MASSIVE improvement over the extra re-copies that were happening before the utf value change.)
