1 Mar 2010

Tip: Changing file names recursively from the command line

I recently found a very nice tool for renaming files recursively and fixing file name encodings. The tool is called convmv. It can be very helpful if you are migrating from older systems (or if you had a bad Samba configuration).

Here is the exact description of convmv from the project site:

convmv converts filenames (not file content), directories, and even whole filesystems to a different encoding. This comes in very handy if, for example, one switches from an 8-bit locale to an UTF-8 locale or changes charsets on Samba servers. It has some smart features: it automagically recognises if a file is already UTF-8 encoded (thus partly converted filesystems can be fully moved to UTF-8) and it also takes care of symlinks. Additionally, it is able to convert from normalization form C (UTF-8 NFC) to NFD and vice-versa. This is important for interoperability with Mac OS X, for example, which uses NFD, while Linux and most other Unixes use NFC. Though it's primary written to convert from/to UTF-8 it can also be used with almost any other charset encoding. Convmv can also be used for case conversion from upper to lower case and vice versa with virtually any charset. Note that this is a command line tool which requires at least Perl version 5.8.0. This tool is not available in all distributions by default.

How to lowercase all file names in a directory

convmv --lower -r /path/to/your/files/
Depending on the files, you might have to add your charset here, too:
convmv --lower --nosmart -r -f utf8 /path/to/your/files/
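Note that convmv runs in test mode by default: it only prints the renames it would make, and changes nothing until --notest is added. A minimal sketch of that preview-then-apply workflow follows. The scratch directory and the file names in it are made up for illustration, and the script skips the convmv calls if the tool is not installed.

```shell
#!/bin/sh
# Scratch tree with hypothetical mixed-case names, so no real files are touched.
demo_dir=$(mktemp -d)
target="$demo_dir/files"
mkdir -p "$target/Sub"
touch "$target/README.TXT" "$target/Sub/Photo.JPG"

if command -v convmv >/dev/null 2>&1; then
    # Test mode (the default): convmv only prints the renames it would make.
    convmv --lower --nosmart -r -f utf8 "$target"
    # Adding --notest makes convmv actually rename the files.
    convmv --lower --nosmart -r -f utf8 --notest "$target"
else
    echo "convmv is not installed; nothing was renamed" >&2
fi

# List the tree to see the result.
find "$target"
```

Checking the test-mode output before adding --notest is a cheap way to catch a wrong charset or path before any file is renamed.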

How to fix file name encoding

convmv --lower --nosmart -r -f iso8859-2 -t utf8 /path/to/your/files/
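To get a feeling for what a conversion from iso8859-2 to utf8 does to a name, the same re-encoding can be tried on a single string with iconv. This is only a sketch: the name below is made up, and nothing on the file system is touched.

```shell
#!/bin/sh
# Hypothetical name: "złoty.txt" as ISO-8859-2 bytes (ł is the single byte 0xB3, octal 263).
legacy_name=$(printf 'z\263oty.txt')

# Reinterpret those bytes as ISO-8859-2 and re-encode them as UTF-8.
utf8_name=$(printf '%s' "$legacy_name" | iconv -f ISO-8859-2 -t UTF-8)

# Show the bytes before and after: b3 becomes the two-byte sequence c5 82.
printf '%s' "$legacy_name" | od -An -tx1
printf '%s' "$utf8_name" | od -An -tx1
```

If the result looks wrong for one of your own file names, the source charset given with -f is probably not the right one.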
Other options which may be useful:
-f enc
encoding *from* which should be converted
-t enc
encoding *to* which should be converted
-r
recursively go through directories
--lowmem
keep memory footprint low
--nosmart
ignore if files already seem to be UTF-8 and convert if possible
--notest
actually do rename the files
--replace
overwrite an existing target file if its content is identical
--unescape
convert%20ugly%20escape%20sequences
--upper
turn to upper case
--lower
turn to lower case

1 comment:

  1. If you have problems with Polish characters in file names, try using:

    convmv -r --nosmart -f cp1250 -t utf8 path/to/files/
