The inspiration for this article was one written a few weeks ago entitled Working with Files in Go. In that article, the author details numerous ways of interacting with files, highlighting the capabilities of Go. I thought I would write a companion piece, this time detailing how to interact with files using the D programming language. Interacting with files is a fundamental task in any programming language and, while such tasks are commonplace, it’s not entirely obvious how to achieve certain file-related tasks using D. Hopefully this article will change that and show the simplicity and power of the D language when working with files.
Some of the following code examples make good use of D’s uniform function call syntax (UFCS). Don’t be put off by this; a very simple explanation can be found here.
The following code shows how to open and close a file in a safe way. Generally D does not attempt to provide thin wrappers over equivalent functions from the C standard library, but manipulating such file handles directly is unsafe and error-prone in many ways. The File type ensures safe manipulation, automatic file closing, and a lot of convenience. The underlying handle is maintained in a reference-counted manner, such that as soon as the last File variable goes out of scope, the underlying handle is automatically closed.
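A minimal sketch of this pattern follows. The file name open_close_example.txt and the helper openAndClose are assumptions made for illustration; only File, isOpen and close come from std.stdio.

```d
import std.file : exists, remove;
import std.stdio : File, writeln;

/// Hypothetical helper: creates `path`, reopens it read-only and reports
/// whether the handle was open inside the inner scope.
bool openAndClose(string path)
{
    // "w" creates (or empties) the file so the read-only open below succeeds.
    File(path, "w").close();

    bool wasOpen;
    {
        auto file = File(path, "r"); // throws ErrnoException on failure
        wasOpen = file.isOpen;
    } // the only File referring to the handle leaves scope: handle closed
    return wasOpen;
}

void main()
{
    enum path = "open_close_example.txt"; // assumed name, for illustration
    scope (exit) if (exists(path)) remove(path);
    writeln(openAndClose(path));
}
```

An explicit call to close is optional here; the inner scope is enough to release the handle.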
If an exception is thrown, there has been an error accessing the file, and the errno property of the exception can be examined to find out what went wrong. Because the File type is a wrapper over the C standard library's file handle, the error number will equal one of the constants defined in core.stdc.errno. This is the most common way of accessing files and handling any errors that occur.
Extended information can be gleaned from a file using the std.file.getAttributes function, which returns an unsigned integer. This integer contains several bit flags that are set in an operating-system-specific manner. More information about these flags can be found here.
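As a rough illustration of both points, the sketch below catches the ErrnoException thrown by a failed open and then decodes an attribute value with the portable helpers attrIsDir and attrIsFile from std.file. The helper openErrno and the missing file name are hypothetical.

```d
import core.stdc.errno : ENOENT;
import std.exception : ErrnoException;
import std.file : attrIsDir, attrIsFile, getAttributes, tempDir;
import std.stdio : File, writeln;

/// Hypothetical helper: returns the errno reported when opening `path`
/// for reading fails, or 0 when the open succeeds.
int openErrno(string path)
{
    try
    {
        auto file = File(path, "r");
        return 0;
    }
    catch (ErrnoException e)
    {
        return cast(int) e.errno;
    }
}

void main()
{
    // Assumed not to exist; any missing path would do.
    immutable code = openErrno("no_such_file_example.txt");
    if (code == ENOENT)
        writeln("file not found");

    // The temporary directory should exist on any system.
    immutable attrs = getAttributes(tempDir());
    writeln("directory: ", attrIsDir(attrs), ", file: ", attrIsFile(attrs));
}
```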
Sometimes it’s necessary to move to a particular place in a file before you start reading or writing. The following example shows how to move to an offset from different starting points.
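One possible sketch, assuming a scratch file that the helper creates itself. The helper name seekPositions is hypothetical; seek, tell and the SEEK_ constants are the standard ones.

```d
import core.stdc.stdio : SEEK_CUR, SEEK_END, SEEK_SET;
import std.file : remove;
import std.stdio : File, writeln;

/// Hypothetical helper: writes ten bytes to `path`, then seeks from the
/// three possible origins and records the position after each seek.
ulong[3] seekPositions(string path)
{
    auto file = File(path, "w+b"); // create for reading and writing
    ubyte[] data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    file.rawWrite(data);

    ulong[3] positions;
    file.seek(4, SEEK_SET);  // 4 bytes from the start
    positions[0] = file.tell;
    file.seek(2, SEEK_CUR);  // 2 bytes forward from the current position
    positions[1] = file.tell;
    file.seek(-3, SEEK_END); // 3 bytes back from the end
    positions[2] = file.tell;
    return positions;
}

void main()
{
    enum path = "seek_example.bin"; // assumed name, for illustration
    scope (exit) remove(path);
    writeln(seekPositions(path)); // [4, 6, 7]
}
```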
This example shows how to write bytes to a file.
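A small sketch using File.rawWrite, which writes the buffer as-is. The helper writeBytes and the file name are illustrative assumptions.

```d
import std.file : getSize, remove;
import std.stdio : File, writeln;

/// Hypothetical helper: writes `data` to `path` through a File handle,
/// replacing any existing contents.
void writeBytes(string path, const(ubyte)[] data)
{
    auto file = File(path, "wb");
    file.rawWrite(data);
} // `file` leaves scope here, which flushes and closes the handle

void main()
{
    enum path = "bytes_example.bin"; // assumed name, for illustration
    scope (exit) remove(path);

    ubyte[] data = [0xDE, 0xAD, 0xBE, 0xEF];
    writeBytes(path, data);
    writeln(getSize(path), " bytes written");
}
```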
Sometimes it’s nice to just dump a buffer of data to a file and let the library take care of opening and closing it. This example shows you how.
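A sketch using std.file.write, which handles opening, writing and closing in a single call. The wrapper dumpBuffer and the file name are hypothetical.

```d
import std.file : read, remove, write;

/// Hypothetical helper: writes `data` to `path` in one call.
void dumpBuffer(string path, const(void)[] data)
{
    write(path, data); // std.file.write opens, writes and closes the file
}

void main()
{
    enum path = "dump_example.bin"; // assumed name, for illustration
    scope (exit) remove(path);

    ubyte[] bytes = [1, 2, 3, 4];
    dumpBuffer(path, bytes);
    assert(cast(ubyte[]) read(path) == bytes);
}
```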
This is writing an array of bytes to the file but it could just as easily be a string.
When dealing with files there’s always a lot of reading and writing of strings. This example shows the different methods available for writing a string to a file.
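The sketch below exercises the four usual methods on File: write, writeln, writef and writefln. The helper writeStrings and the sample text are assumptions for illustration.

```d
import std.file : readText, remove;
import std.stdio : File, writeln;

/// Hypothetical helper: writes sample text to `path` using four methods.
void writeStrings(string path)
{
    auto file = File(path, "w");
    file.write("alpha");              // no newline appended
    file.writeln(" beta");            // newline appended
    file.writef("%s-%d", "gamma", 1); // formatted, no newline
    file.writefln(" %s", "done");     // formatted, newline appended
}

void main()
{
    enum path = "strings_example.txt"; // assumed name, for illustration
    scope (exit) remove(path);
    writeStrings(path);
    writeln(readText(path));
}
```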
These methods can be used to provide a bit more convenience depending on different scenarios.
As an optimisation technique, sometimes it’s necessary to write to a buffer in memory before writing it to disk to save time on disk IO. This example shows one of many ways of creating and writing such a buffer.
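One way to sketch this uses OutBuffer from std.outbuffer to accumulate data in memory and std.file.write to flush it in a single call. The helper writeBuffered and the generated lines are illustrative assumptions; an Appender from std.array would work similarly.

```d
import std.conv : text;
import std.file : readText, remove, write;
import std.outbuffer : OutBuffer;
import std.stdio : writeln;

/// Hypothetical helper: collects `count` lines in memory, then writes
/// them to `path` with a single call.
void writeBuffered(string path, int count)
{
    auto buffer = new OutBuffer();
    foreach (i; 0 .. count)
        buffer.write(text("line ", i, "\n")); // memory only, no disk IO yet
    write(path, buffer.toBytes());            // one write to disk
}

void main()
{
    enum path = "buffered_example.txt"; // assumed name, for illustration
    scope (exit) remove(path);
    writeBuffered(path, 3);
    writeln(readText(path));
}
```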
Using a buffer this way enables data to be written to memory very quickly before being dumped to disk. This avoids the cost of many small writes and decreases the wear on a drive.
This example shows how to read bytes from a file.
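A minimal sketch using File.rawRead with a preallocated buffer. The helper readBytes and the scratch file are assumptions for illustration.

```d
import std.file : remove, write;
import std.stdio : File, writeln;

/// Hypothetical helper: reads up to 1024 bytes from the start of `path`.
ubyte[] readBytes(string path)
{
    auto file = File(path, "rb");
    auto buffer = new ubyte[](1024); // preallocated; also the read limit
    return file.rawRead(buffer);     // slice of `buffer` holding the data read
}

void main()
{
    enum path = "read_example.bin"; // assumed name, for illustration
    ubyte[] data = [10, 20, 30];
    write(path, data);
    scope (exit) remove(path);
    writeln(readBytes(path));
}
```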
When reading bytes like this you have to provide a buffer which receives the read data. This example uses a dynamic array for such a buffer and preallocates 1024 bytes before reading. The rawRead method fills the buffer with data and returns a slice of that buffer. The buffer length is the maximum number of bytes that will be read.
Sometimes it’s nice to just read data from a file and let the library take care of opening and closing it. This example shows you how.
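A sketch using std.file.read, which returns the whole file as a void array. The helper readAll and the file name are hypothetical.

```d
import std.file : read, remove, write;
import std.stdio : writeln;

/// Hypothetical helper: reads the whole of `path` and views it as bytes.
ubyte[] readAll(string path)
{
    return cast(ubyte[]) read(path); // read returns void[]
}

void main()
{
    enum path = "read_all_example.bin"; // assumed name, for illustration
    ubyte[] data = [1, 2, 3, 4, 5];
    write(path, data);
    scope (exit) remove(path);
    writeln(readAll(path));
}
```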
The returned data is typed as a void array. This can be cast to a more useful type and in the above example it’s cast to an array of bytes.
This example uses the read function again but this time uses the second parameter to define a limit of bytes to read. If the file is smaller than the defined limit, only the data in the file will be returned.
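A sketch of the limited read, assuming a small scratch file. The helper readUpTo is hypothetical; the second argument to std.file.read is the upper limit in bytes.

```d
import std.file : read, remove, write;
import std.stdio : writeln;

/// Hypothetical helper: reads at most `limit` bytes from `path`.
ubyte[] readUpTo(string path, size_t limit)
{
    return cast(ubyte[]) read(path, limit);
}

void main()
{
    enum path = "read_limit_example.bin"; // assumed name, for illustration
    ubyte[] data = [1, 2, 3, 4, 5];
    write(path, data);
    scope (exit) remove(path);

    writeln(readUpTo(path, 3));   // the first three bytes
    writeln(readUpTo(path, 100)); // limit exceeds the size: all five bytes
}
```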
As before, the data returned can be cast to different array types.
This example reads a file in 1024 byte chunks.
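A sketch using File.byChunk. The helper chunkSizes and the 2500-byte scratch file are assumptions chosen so that the last chunk is shorter than the rest.

```d
import std.file : remove, write;
import std.stdio : File, writeln;

/// Hypothetical helper: reads `path` in 1024-byte chunks and returns the
/// length of each chunk.
size_t[] chunkSizes(string path)
{
    size_t[] sizes;
    foreach (ubyte[] chunk; File(path, "rb").byChunk(1024))
        sizes ~= chunk.length; // `chunk` is only valid during this iteration
    return sizes;
}

void main()
{
    enum path = "chunk_example.bin"; // assumed name, for illustration
    write(path, new ubyte[](2500));
    scope (exit) remove(path);
    writeln(chunkSizes(path)); // [1024, 1024, 452]
}
```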
The byChunk method returns an input range of bytes which reads from the file handle a chunk at a time. In this case, each call will return a maximum of 1024 bytes. The buffer is reused for every call so if you need the data to persist between calls, copies must be made.
These examples show how to read strings from a file.
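A first sketch uses the readln overload that returns a freshly allocated string for every line. The helper readLines and the sample file are hypothetical.

```d
import std.file : remove, write;
import std.stdio : File, writeln;

/// Hypothetical helper: reads every line of `path` with File.readln.
string[] readLines(string path)
{
    auto file = File(path, "r");
    string[] lines;
    // readln returns an empty string once the end of the file is reached.
    for (auto line = file.readln(); line.length > 0; line = file.readln())
        lines ~= line; // each line keeps its trailing newline
    return lines;
}

void main()
{
    enum path = "readln_example.txt"; // assumed name, for illustration
    write(path, "one\ntwo\n");
    scope (exit) remove(path);
    writeln(readLines(path));
}
```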
While the above example is convenient for reading strings from a file, it has a downside: readln allocates a new buffer for every line read. Because of this potential performance issue there is an overloaded method which takes a buffer as a parameter, like this.
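A sketch of the buffer-reusing overload, which returns the number of characters read and 0 at the end of the file. The helper countChars and the sample file are assumptions for illustration.

```d
import std.file : remove, write;
import std.stdio : File, writeln;

/// Hypothetical helper: totals the characters in `path`, reusing a single
/// line buffer across all reads.
size_t countChars(string path)
{
    auto file = File(path, "r");
    char[] buffer; // grown as needed and reused for every line
    size_t total;
    while (file.readln(buffer)) // 0 is returned at end of file
        total += buffer.length; // includes the line terminator
    return total;
}

void main()
{
    enum path = "readln_buffer_example.txt"; // assumed name, for illustration
    write(path, "ab\ncde\n");
    scope (exit) remove(path);
    writeln(countChars(path)); // 7
}
```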
This buffer can then be reused for each string read (which increases performance) but on the downside you have to take copies if you need the data to persist between calls. D leaves you to make the decision which is preferred.
Reading a file as a range allows you to use many generic algorithms defined in Phobos. This example shows how.
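As a sketch, the lines of a file can be piped through filter and map from std.algorithm. The helper nonEmptyLines and the sample file are hypothetical; the idup call makes the copy that byLine's reused buffer requires.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.file : remove, write;
import std.stdio : File, writeln;

/// Hypothetical helper: returns the non-empty lines of `path`.
string[] nonEmptyLines(string path)
{
    return File(path, "r")
        .byLine()                         // line terminators are stripped
        .filter!(line => line.length > 0) // skip blank lines
        .map!(line => line.idup)          // copy out of the reused buffer
        .array;
}

void main()
{
    enum path = "byline_example.txt"; // assumed name, for illustration
    write(path, "a\n\nb\n");
    scope (exit) remove(path);
    writeln(nonEmptyLines(path)); // ["a", "b"]
}
```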
The byLine method returns an input range which reads from the file handle one line at a time. Internally a buffer is reused for every line so if you need this data to persist between calls, copies must be made. There is a convenience method called byLineCopy which does this automatically.
This example shows how to read an entire text file into a string.
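A short sketch using std.file.readText. The helper loadText and the sample file are assumptions for illustration.

```d
import std.file : readText, remove, write;
import std.stdio : writeln;

/// Hypothetical helper: loads `path` as a UTF-8 string.
string loadText(string path)
{
    return readText(path); // throws if the contents are not valid UTF-8
}

void main()
{
    enum path = "text_example.txt"; // assumed name, for illustration
    write(path, "Hello, D!\n");
    scope (exit) remove(path);
    writeln(loadText(path));
}
```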
This reads and validates a text file. No character width conversion is performed. If the width of the characters in the file doesn’t match the specified string type, the file will fail validation.
This creates a new file (if one doesn’t exist) when initialising a File struct. If a file with the same name already exists, its contents are discarded and the file is treated as a new empty file.
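A sketch of that behaviour, assuming the "w" open mode. The helper createEmpty and the file name are hypothetical.

```d
import std.file : getSize, remove, write;
import std.stdio : File, writeln;

/// Hypothetical helper: creates `path`, or empties it if it already exists.
void createEmpty(string path)
{
    auto file = File(path, "w"); // "w" creates the file or discards its contents
}

void main()
{
    enum path = "create_example.txt"; // assumed name, for illustration
    write(path, "old contents");
    scope (exit) remove(path);

    createEmpty(path);
    writeln(getSize(path)); // 0
}
```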
This simply checks if a file exists.
This renames and/or moves a file. If the destination file exists, it is overwritten.
This copies a file. If the destination file exists, it is overwritten.
This simply deletes a file.
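The four operations above map onto exists, rename, copy and remove in std.file. The sketch below runs them in sequence on scratch files; the helper fileLifecycle and the file names are assumptions for illustration.

```d
import std.file : copy, exists, remove, rename, write;
import std.stdio : writeln;

/// Hypothetical helper: creates `a`, renames it to `b`, copies `b` to `c`,
/// deletes both, and records what exists after each step.
bool[4] fileLifecycle(string a, string b, string c)
{
    bool[4] seen;
    write(a, "data");
    seen[0] = exists(a);
    rename(a, b);                       // `a` is gone, `b` now holds the data
    seen[1] = !exists(a) && exists(b);
    copy(b, c);                         // `b` is kept, `c` is a duplicate
    seen[2] = exists(b) && exists(c);
    remove(b);
    remove(c);
    seen[3] = !exists(b) && !exists(c);
    return seen;
}

void main()
{
    // Assumed scratch names, for illustration.
    writeln(fileLifecycle("life_a.txt", "life_b.txt", "life_c.txt"));
}
```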
This gets information for a particular file, similar to what you’d get from stat on a Posix system. The following code shows only cross-platform information; more is available for individual operating systems by decoding the attributes member.
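One way to sketch this is with the DirEntry struct from std.file, whose properties cover the portable details. The helper describe and the sample file are hypothetical.

```d
import std.file : DirEntry, remove, write;
import std.format : format;
import std.stdio : writeln;

/// Hypothetical helper: summarises the cross-platform details of `path`.
string describe(string path)
{
    auto entry = DirEntry(path); // throws FileException if `path` is missing
    return format("%s: %s bytes, file=%s, dir=%s, symlink=%s, modified=%s",
        entry.name, entry.size, entry.isFile, entry.isDir,
        entry.isSymlink, entry.timeLastModified);
}

void main()
{
    enum path = "info_example.txt"; // assumed name, for illustration
    write(path, "12345");
    scope (exit) remove(path);
    writeln(describe(path));
    writeln("raw attributes: ", DirEntry(path).attributes);
}
```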
This truncates an existing file to a maximum of 100 bytes. If the existing file’s size is less, no truncation takes place.
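Phobos has no dedicated truncate function that I can point to with confidence, so the sketch below takes a portable approach as an assumption: read at most the first 100 bytes and write them back. The helper truncateTo is hypothetical, and this is not an in-place truncation.

```d
import std.file : getSize, read, remove, write;
import std.stdio : writeln;

/// Hypothetical helper: shortens `path` to at most `limit` bytes by
/// rewriting it; smaller files are left untouched.
void truncateTo(string path, size_t limit)
{
    if (getSize(path) > limit)
        write(path, read(path, limit));
}

void main()
{
    enum path = "truncate_example.bin"; // assumed name, for illustration
    write(path, new ubyte[](250));
    scope (exit) remove(path);

    truncateTo(path, 100);
    writeln(getSize(path)); // 100
}
```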
Building upon earlier examples, the following shows how to create a zip archive.
This example shows how to read a zip archive.
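A round-trip sketch using std.zip covers both directions. The helpers createZip and readZip, the archive name and the member name are all assumptions for illustration.

```d
import std.file : read, remove, write;
import std.stdio : writeln;
import std.zip : ArchiveMember, CompressionMethod, ZipArchive;

/// Hypothetical helper: writes a zip archive containing a single member.
void createZip(string zipPath, string memberName, string contents)
{
    auto member = new ArchiveMember();
    member.name = memberName;
    member.expandedData(cast(ubyte[]) contents.dup); // uncompressed data
    member.compressionMethod = CompressionMethod.deflate;

    auto archive = new ZipArchive();
    archive.addMember(member);
    write(zipPath, archive.build()); // build returns the archive bytes
}

/// Hypothetical helper: maps each member name to its expanded contents.
string[string] readZip(string zipPath)
{
    auto archive = new ZipArchive(read(zipPath));
    string[string] files;
    foreach (name, member; archive.directory)
        files[name] = cast(string) archive.expand(member).idup;
    return files;
}

void main()
{
    enum path = "example.zip"; // assumed name, for illustration
    createZip(path, "test.txt", "Hello World!");
    scope (exit) remove(path);
    writeln(readZip(path));
}
```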
This example shows how to compress data before writing to a file.
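A sketch using compress from std.zlib together with std.file.write. The helper writeCompressed, the file name and the sample text are hypothetical.

```d
import std.file : exists, getSize, remove, write;
import std.stdio : writeln;
import std.zlib : compress;

/// Hypothetical helper: compresses `data` and writes the result to `path`.
void writeCompressed(string path, const(void)[] data)
{
    write(path, compress(data));
}

void main()
{
    enum path = "compressed_example.dat"; // assumed name, for illustration
    scope (exit) remove(path);
    writeCompressed(path, "Hello World! Hello World! Hello World!");
    writeln(getSize(path), " bytes on disk");
}
```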
In the above example a string is used but any data can be compressed. Internally the std.zlib module uses the zlib C library.
This shows how to read compressed data from a file.
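A sketch of the reverse direction using uncompress from std.zlib. The block first writes a compressed scratch file so that it is self-contained; the helper readCompressed and the file name are assumptions for illustration.

```d
import std.file : read, remove, write;
import std.stdio : writeln;
import std.zlib : compress, uncompress;

/// Hypothetical helper: reads `path` and returns its uncompressed text.
string readCompressed(string path)
{
    auto raw = cast(char[]) uncompress(read(path)); // uncompress returns void[]
    return raw.idup;
}

void main()
{
    enum path = "uncompress_example.dat"; // assumed name, for illustration
    write(path, compress("Hello World!"));
    scope (exit) remove(path);
    writeln(readCompressed(path));
}
```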
This changes file access rights on a Posix system such as Linux or Mac OS. There is no cross-platform Phobos solution for this task so we can only use a Posix specific system call.
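A Posix-only sketch using the chmod binding in core.sys.posix.sys.stat. The helper setMode and the file name are hypothetical, and the octal template from std.conv is used to spell the mode.

```d
import core.sys.posix.sys.stat : chmod;
import core.sys.posix.sys.types : mode_t;
import std.conv : octal;
import std.file : getAttributes, remove, write;
import std.stdio : writeln;
import std.string : toStringz;

/// Hypothetical helper (Posix only): sets the permission bits of `path`
/// and reports whether the call succeeded.
bool setMode(string path, mode_t mode)
{
    return chmod(path.toStringz, mode) == 0;
}

void main()
{
    enum path = "chmod_example.txt"; // assumed name, for illustration
    write(path, "data");
    scope (exit) remove(path);

    if (setMode(path, octal!600)) // read and write for the owner only
        writeln((getAttributes(path) & octal!777) == octal!600);
}
```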
The chmod system call functions in exactly the same way as the chmod shell command. A file name is specified along with its new access rights (expressed as an octal number). When modifying a file in this way, you also need permission to actually perform the operation. This can be accomplished by owning the file or by being a super user.
This changes the ownership of a file on a Posix system. Once a file is owned, file access rights can be changed without being a super user.
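A Posix-only sketch using the chown binding in core.sys.posix.unistd. To stay runnable without super user rights, it assigns the file to the current user and group, which is an assumption made purely for the demonstration; the helper setOwner and the file name are hypothetical.

```d
import core.sys.posix.sys.types : gid_t, uid_t;
import core.sys.posix.unistd : chown, getgid, getuid;
import std.file : remove, write;
import std.stdio : writeln;
import std.string : toStringz;

/// Hypothetical helper (Posix only): changes the owner and group of `path`
/// and reports whether the call succeeded.
bool setOwner(string path, uid_t owner, gid_t group)
{
    return chown(path.toStringz, owner, group) == 0;
}

void main()
{
    enum path = "chown_example.txt"; // assumed name, for illustration
    write(path, "data");
    scope (exit) remove(path);

    // Reassigning to the current user and group needs no special rights.
    writeln(setOwner(path, getuid(), getgid()));
}
```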
The chown system call functions in exactly the same way as the chown shell command. A file name is specified along with its new owner and group. Your program will need super user permissions to change the owner.
Sometimes it’s necessary to create hard links or symbolic links on a Posix system. The following example first shows you how to create a hard link.
To create a symbolic link, replace line 9 with the following line.
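A Posix-only sketch covering both kinds of link, using the link and symlink bindings in core.sys.posix.unistd. The helpers hardLink and symbolicLink and the file names are assumptions for illustration and are not taken from the original listing.

```d
import core.sys.posix.unistd : link, symlink;
import std.file : isSymlink, readText, remove, write;
import std.stdio : writeln;
import std.string : toStringz;

/// Hypothetical helper (Posix only): creates a hard link to `target`.
bool hardLink(string target, string linkPath)
{
    return link(target.toStringz, linkPath.toStringz) == 0;
}

/// Hypothetical helper (Posix only): creates a symbolic link to `target`.
bool symbolicLink(string target, string linkPath)
{
    return symlink(target.toStringz, linkPath.toStringz) == 0;
}

void main()
{
    // Assumed scratch names, for illustration.
    enum target = "link_target.txt";
    enum hard = "link_hard.txt";
    enum soft = "link_soft.txt";

    write(target, "shared contents");
    scope (exit) remove(target);

    if (hardLink(target, hard))
    {
        scope (exit) remove(hard);
        writeln(readText(hard)); // same data, reached through a second name
    }
    if (symbolicLink(target, soft))
    {
        scope (exit) remove(soft);
        writeln(isSymlink(soft));
    }
}
```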
There is rarely one canonical way of interacting with files and developers like to perform different file related tasks in their own particular way. Hopefully this article demonstrates the power and convenience of D and highlights convenient functions in the standard library for use when working with files.