UNIX/DOS newline translation for Samba

drh.net

David Harris, dharris@

Created July 1997
Posted December 4, 1998.

This document describes a patch to Samba version 1.9.16 which adds support for automatic newline translation between the UNIX and DOS worlds. The reader should have experience with building Samba from a source release, patching source code, and running Samba.



1. Usage

What this patch does

This patch modifies Samba version 1.9.16 (and possibly later patchlevel releases) for automatic newline translation between the UNIX and DOS worlds.

The problem is that UNIX and DOS encode a newline differently: UNIX uses LF and DOS uses CRLF. So a DOS file viewed in the UNIX world has a stray CR at the end of every line, and a UNIX file viewed in the DOS world appears as one long line.

When transferring a text file between the UNIX and DOS worlds, one wants to do a newline translation. This is what the ASCII mode in FTP does. However, one does not want to modify binary files, such as JPEGs, because that would corrupt the binary format. This is what BINARY mode in FTP does.
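As a rough illustration of what a newline translation involves (this is not code from the patch), the two directions can be sketched in C as below. The function names `lf_to_crlf` and `crlf_to_lf` and the buffer-based interface are assumptions made for this example.

```c
#include <stddef.h>

/* Convert UNIX newlines (LF) to DOS newlines (CRLF).
 * Writes at most out_cap bytes to out and returns the number of bytes
 * the full result needs, so a caller can detect truncation.
 * A LF that is already preceded by CR is left alone. */
size_t lf_to_crlf(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        if (in[i] == '\n' && (i == 0 || in[i - 1] != '\r')) {
            if (n < out_cap)
                out[n] = '\r';
            n++;
        }
        if (n < out_cap)
            out[n] = in[i];
        n++;
    }
    return n;
}

/* Convert DOS newlines (CRLF) to UNIX newlines (LF).
 * Only a CR immediately followed by LF is dropped; a lone CR is kept. */
size_t crlf_to_lf(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        if (in[i] == '\r' && i + 1 < in_len && in[i + 1] == '\n')
            continue;
        if (n < out_cap)
            out[n] = in[i];
        n++;
    }
    return n;
}
```

Applying either function to a binary file would insert or drop bytes, which is exactly why binary files must be left untouched.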

So, if Samba is going to do this translation it needs to know if a file is a text file or a binary file. Then, if it is a text file the newline format must be detected and the change made.
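To make the two detection steps concrete, here is a minimal sketch of one possible heuristic. It is an assumption for illustration only and is not the detection method used in the patch: it calls a sample binary if it contains a NUL byte, and otherwise decides between DOS and UNIX text by looking at how each LF is preceded. The names `file_kind` and `classify_sample` are hypothetical.

```c
#include <stddef.h>

enum file_kind { KIND_BINARY, KIND_TEXT_UNIX, KIND_TEXT_DOS };

/* Hypothetical heuristic, NOT the patch's actual detection method.
 * Binary: the sample contains a NUL byte.
 * DOS text: at least one LF, and every LF is preceded by CR.
 * UNIX text: everything else. */
enum file_kind classify_sample(const unsigned char *buf, size_t len)
{
    size_t crlf = 0, bare_lf = 0;

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0)
            return KIND_BINARY;
        if (buf[i] == '\n') {
            if (i > 0 && buf[i - 1] == '\r')
                crlf++;
            else
                bare_lf++;
        }
    }
    if (crlf > 0 && bare_lf == 0)
        return KIND_TEXT_DOS;
    return KIND_TEXT_UNIX;
}
```

Any heuristic of this kind can guess wrong, particularly when it only looks at a sample of the file.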

My patch to Samba does exactly this. It assumes that the local text file format is UNIX and the remote text file format is DOS. As a part of every file transfer, it detects the file format, and, if needed, changes it. Therefore, the user does not have to worry about messing up the newline formats.

Annoyances & Bugs

There are a few annoyances and bugs in this patch. However, I believe it remains quite usable.

  • I've encountered a few files where the file format (text vs. binary) was detected incorrectly. This can happen with a tar file that contains both text and binary files but begins with a text file.

    I only had trouble with binary files being mistaken for text, never with text files being mistaken for binary. So I set up an unpatched copy of Samba on the same machine, just on a different IP address. This way I have an SMB interface without newline translation, and I can use it to transfer binary files that would have been mistaken for text.

  • I've experienced a problem where the directory listing takes a few seconds to show up instead of appearing instantly. My patch did not touch any of the directory listing code, so I don't know how this came about, or whether it is even caused by my code.

  • Because of the file copying and newline translation step, there can be a pause when opening a file for reading or closing a file through Samba. This is when the actual newline translation is done. For some applications this can be a slight problem. For example, Codewright, my Windows 95 text editor, was opening and closing a file three times whenever I clicked on a file listing in my project pane. This caused a noticeable delay, so I found a way to disable this behavior in Codewright.

  • The version of Samba upon which this is based, 1.9.16, is now old. I believe it is also susceptible to a bad security hole that was fixed in later versions. This does not bother me because I run it on a private LAN behind a firewall. However, I would not recommend putting this on a directly internet-accessible machine.

I've been using this for over a year and it has served me well.

2. Theory of operation

Here is my comment from server.c giving an overview of what I did:

/*
 * This program was modified by David R. Harris (DRH) on and slightly
 * before the date 7/17/97. The ability to convert between Unix and
 * Dos file formats was added. The following modifications were made:
 *
 * Changes in server.c:
 *  - Modified the function open_file to become open_file_wrapped 
 *  - Modified the function close_file to become close_file_wrapped 
 *  - Created the function open_file.
 *  - Created the function close_file.
 *  - Slightly modified the function write_file.
 *
 * Changes in smb.h:
 *  - Added to the structure files_struct.
 *
 * Changes in Makefile:
 *  - Added newlinelib.o to the dependencies of server.o.
 *
 * New files:
 *  - newlinelib.c to do the newline format translation and detection.
 *  - newlinelib.h header file for newlinelib.c
 */

Now the details:

Whenever Samba gets a request to open a file, I catch it. If the file is being opened for reading and no newline translation needs to be done, the open request is just passed on normally. But if it is being opened for reading and a newline translation needs to be done, the file is translated into a temporary file and Samba opens that temp file for the user. If the file is being opened for writing or for reading and writing, then I have a temp file opened for writing.
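The open-time decision described above can be summarized as a small pure function. This is a sketch only; the enum and function names are hypothetical and do not appear in the patch, and the sketch does not model whether the temp file is pre-filled when a file is opened for reading and writing, since the text does not say.

```c
#include <stdbool.h>

enum open_mode { OPEN_READ, OPEN_WRITE, OPEN_READWRITE };

enum open_action {
    OPEN_REAL_FILE,        /* pass the request through unchanged */
    OPEN_TRANSLATED_TEMP,  /* translate into a temp file, open that */
    OPEN_TEMP_FOR_WRITING  /* hand the client a temp file to write to */
};

/* Hypothetical helper mirroring the three cases in the text. */
enum open_action choose_open_action(enum open_mode mode, bool needs_translation)
{
    if (mode == OPEN_READ)
        return needs_translation ? OPEN_TRANSLATED_TEMP : OPEN_REAL_FILE;
    return OPEN_TEMP_FOR_WRITING;
}
```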

I also catch file close requests. If a temp file was created and written to, then its newline format is detected and it is copied (with possible newline translations) to the real file. If the temp file was only read from, it is deleted.
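The matching close-time decision can be sketched the same way. Again, the names are hypothetical, and the sketch only captures the cases stated in the text.

```c
#include <stdbool.h>

enum close_action {
    CLOSE_ONLY,         /* no temp file was involved */
    COPY_TEMP_TO_REAL,  /* detect format, copy temp back with translation */
    DELETE_TEMP         /* temp was only read from, so discard it */
};

/* Hypothetical helper: has_temp says whether a temp file stands in for
 * the real file, was_written whether anything was written to it. */
enum close_action choose_close_action(bool has_temp, bool was_written)
{
    if (!has_temp)
        return CLOSE_ONLY;
    return was_written ? COPY_TEMP_TO_REAL : DELETE_TEMP;
}
```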

I catch these close file requests and open file requests with the open_file and close_file functions. I wrote my own open_file and close_file functions and renamed the previous functions to open_file_wrapped and close_file_wrapped. My functions handle the newline translation and the tmp file, and call the wrapped functions to do the actual open and close.
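The rename-and-wrap pattern looks roughly like the sketch below. The signatures here are deliberately simplified and are assumptions, not the real Samba 1.9.16 prototypes; the `use_temp` parameter and the `.tmp` suffix are also invented for the example, and the wrapped functions are stubs that only record what they were asked to do.

```c
#include <stdio.h>

/* Stubs standing in for the renamed original functions. */
static char last_opened[256];
static int wrapped_close_calls;

static int open_file_wrapped(const char *path)
{
    snprintf(last_opened, sizeof last_opened, "%s", path);
    return 0;
}

static void close_file_wrapped(int fnum)
{
    (void)fnum;
    wrapped_close_calls++;
}

/* New open_file: decide which path to really open, then delegate. */
int open_file(const char *path, int use_temp)
{
    char temp_path[300];

    if (use_temp) {
        /* Hypothetical temp naming; translation into the temp file
         * would happen here before the real open. */
        snprintf(temp_path, sizeof temp_path, "%s.tmp", path);
        return open_file_wrapped(temp_path);
    }
    return open_file_wrapped(path);
}

/* New close_file: copy-back or temp cleanup would happen here,
 * then the original close does the actual work. */
void close_file(int fnum)
{
    close_file_wrapped(fnum);
}
```

Because callers keep using the names `open_file` and `close_file`, the rest of the server does not need to change.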

I modified the function write_file to tell me when a file has been written to. I need to know this in the close_file function to decide whether the temp file must be copied back to the real file.

I also modified files_struct in smb.h to include extra file state information required by all of this.
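A sketch of the kind of per-file state this needs, together with a write hook that records that a write happened, is shown below. The struct name, its fields, and `note_write` are all hypothetical; the actual fields added to files_struct are in the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extra per-file state, NOT the real files_struct fields. */
struct newline_state {
    bool uses_temp;       /* a temp file stands in for the real file */
    bool written;         /* set once any write goes through */
    char real_path[256];  /* where the data finally belongs */
    char temp_path[256];  /* where the client actually reads/writes */
};

/* Hypothetical hook called from the write path: remember that the file
 * was written to, so close time knows a copy-back is needed. */
size_t note_write(struct newline_state *st, size_t nbytes)
{
    st->written = true;
    return nbytes;
}
```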

I created newlinelib.c and newlinelib.h to implement a newline library, which handles the detection and translation of the newline formats. For details of the file format detection method, see the source code.
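For copying between a temp file and the real file, a stream-based translation routine is a natural fit. The following is a minimal sketch of the DOS-to-UNIX direction using standard C streams; `copy_dos_to_unix` is a hypothetical name and is not claimed to match any function in newlinelib.c.

```c
#include <stdio.h>

/* Copy src to dst, collapsing each CRLF pair into a single LF.
 * A CR that is not followed by LF is copied unchanged.
 * Returns 0 on success, -1 on an I/O error. */
int copy_dos_to_unix(FILE *src, FILE *dst)
{
    int c;

    while ((c = getc(src)) != EOF) {
        if (c == '\r') {
            int next = getc(src);
            if (next == '\n')
                c = '\n';           /* CRLF becomes LF */
            else if (next != EOF)
                ungetc(next, src);  /* lone CR: keep it, re-read next */
        }
        if (putc(c, dst) == EOF)
            return -1;
    }
    return ferror(src) ? -1 : 0;
}
```

Both streams should be opened in binary mode so that the C library does not perform its own newline translation on top of this one.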

3. Distribution & Installation

I'm making this available in a few formats to make things easy (because I just had a horrible time applying a patch from someone else).

Here they are:

Just download and apply the patch.

4. Request from author

If you use this, I'd love to hear about it.