How to use different character encodings

Officially, FTP (RFC 959) supports only 7-bit ASCII characters. This means that file and directory names sent across the control channel can contain only ASCII characters.

However, most FTP servers do support extended (i.e. 8-bit) ASCII. Unfortunately there are many types of extended ASCII, and there is no standard for which one should be used. As a result, the meaning of characters 0 to 127 is well-defined, but the meaning of characters 128 to 255 varies from one server to another. For example, one server might interpret character 193 as an accented A, whereas another might interpret it as an accented E.
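The effect can be reproduced without an FTP server. The following is a minimal sketch that uses only the standard Java charset classes; the class name ExtendedAsciiDemo and the choice of charsets are illustrative assumptions. It decodes the same byte value, 193, under three different 8-bit encodings and prints the resulting Unicode code point for each.

```java
import java.nio.charset.Charset;

class ExtendedAsciiDemo {
    // Decode a single byte value (0-255) using the named charset.
    static String decode(int code, String charsetName) {
        byte[] raw = { (byte) code };
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Three 8-bit encodings that agree on 0-127 but not on 128-255.
        String[] names = { "ISO-8859-1", "ISO-8859-7", "windows-1251" };
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + " is not available in this JVM");
                continue;
            }
            String s = decode(193, name);
            // ISO-8859-1:   U+00C1 (Latin capital A with acute)
            // ISO-8859-7:   U+0391 (Greek capital alpha)
            // windows-1251: U+0411 (Cyrillic capital be)
            System.out.printf("%-12s -> U+%04X%n", name, (int) s.charAt(0));
        }
    }
}
```

Byte 65, by contrast, decodes to the letter A under all three encodings, because it lies in the 7-bit range.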

By default, FTPClient, SSLFTPClient, and ProFTPClient support only 7-bit ASCII. Any character whose code is in the range 128 to 255 is represented as a question mark. The setControlEncoding method allows the developer to select an 8-bit character encoding that matches that of the server. Unfortunately, many servers do not state which 8-bit character set they use, so it is often necessary to find out by trial and error. Two common character encodings to try for western European languages are windows-1252 and ISO-8859-1.
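The question-mark substitution is the usual behaviour of Java's own ASCII encoder, and it can be seen in isolation. The sketch below is not the library's internal code; the class name ControlEncodingDemo and the sample file name are assumptions made for illustration. It encodes a name to bytes and decodes it again with the same charset, roughly what happens to a name that crosses the control channel.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ControlEncodingDemo {
    // Encode a name to bytes and decode it again with the same charset.
    static String roundTrip(String name, Charset cs) {
        return new String(name.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // "cafe.txt" with an acute accent on the e (code 233 in ISO-8859-1).
        String name = "caf\u00e9.txt";

        // 7-bit ASCII cannot represent code 233, so it is replaced by '?'.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII)); // caf?.txt

        // An 8-bit encoding that contains the character preserves the name.
        System.out.println(roundTrip(name, StandardCharsets.ISO_8859_1).equals(name)); // true
    }
}
```

The name survives only when both ends use the same 8-bit encoding, which is why the encoding selected on the client has to match the server's.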

For example, to set the control encoding to windows-1252:

ftp.setControlEncoding("Windows-1252");

For Chinese, try an encoding such as Big5 (Traditional Chinese) or GBK (Simplified Chinese).
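When the encoding has to be found by trial and error, the list of candidates can sometimes be narrowed before any name is passed to setControlEncoding. The following is a rough sketch that assumes the raw bytes of a file name are already at hand (for example, copied from a network trace); the class EncodingCandidates and its method plausible are hypothetical helpers, not part of the library. It keeps only the candidate charsets that the running JVM supports and that decode the bytes without error.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.ArrayList;
import java.util.List;

class EncodingCandidates {
    // Return the candidates that this JVM supports and that can decode
    // the raw bytes without hitting a malformed or unmappable sequence.
    static List<String> plausible(byte[] raw, String... candidates) {
        List<String> ok = new ArrayList<>();
        for (String name : candidates) {
            if (!Charset.isSupported(name)) {
                continue; // e.g. Big5 may be absent from a trimmed-down runtime
            }
            try {
                Charset.forName(name).newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(raw));
                ok.add(name);
            } catch (CharacterCodingException e) {
                // These bytes are not valid in this charset; drop the candidate.
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        byte[] raw = { 'c', 'a', 'f', (byte) 0xE9 }; // hypothetical sample bytes
        System.out.println(plausible(raw, "UTF-8", "ISO-8859-1", "windows-1252", "Big5"));
    }
}
```

A check like this can only rule encodings out. Single-byte encodings accept almost any byte sequence, so the surviving candidates still need to be confirmed by looking at the decoded names.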