Files and I/O
Goals
- UNIX I/O Design Concepts
- High-level File I/O: Streams
- Low-level File I/O: File Descriptors
- How and Why of High-level File I/O
- Process State for File Descriptors
- Common Pitfalls with OS Abstractions
UNIX I/O Design Concepts
- Uniformity: Everything is a “File”
- file operations, device I/O, and interprocess communication through open, read/write, close
- Identical interface for:
- Regular files on the disk
- Devices (terminals, printers, etc.)
- Networking (sockets)
- Local interprocess communication (pipes, sockets)
- Based on the system calls
open()
read()
write()
close()
- Additional: ioctl() for custom configuration that doesn’t quite fit
- Open before use
- Provide opportunity for access control and arbitration
- Sets up the underlying machinery, i.e., data structures
- Explicitly close
- Byte-oriented
- Even if blocks are transferred, addressing is in bytes
- OS responsible for hiding the fact that real devices may not work this way (e.g. hard drive stores data in blocks)
- Kernel buffered reads
- Part of making everything byte-oriented
- Process is blocked while waiting for device
- Let other processes run while gathering result
- Kernel buffered writes
- Complete in background
- Return to user when data is “handed off” to kernel
The File System Abstraction
- File
- Named collection of data in a file system
- POSIX File data: sequence of bytes
- Could be text, binary, serialized objects, …
- File Metadata: information about the file
- size, modification time, owner, security info, etc.
- Directory
- “Folder” containing files & directories
- Hierarchical (graphical) naming
- Path through the directory graph
- Uniquely identifies a file or directory
/home/ff/cs162/public_html/fa18/index.html
- Links and Volumes
I/O and Storage Layers
High-Level File API – Streams
Operate on “streams” - sequence of bytes, whether text or data, with a position
The fopen function returns a pointer to a FILE data structure. A null pointer is returned if there is an error.
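Because fopen signals failure only through a null pointer, callers usually check the result before using the stream. The following is a minimal sketch; the helper name open_or_report is hypothetical and not part of the C library.

```c
#include <stdio.h>

// Hypothetical helper: opens `path` with `mode` and reports a failure on stderr.
// Returns the FILE pointer, or NULL if fopen failed.
FILE *open_or_report(const char *path, const char *mode) {
    FILE *f = fopen(path, mode);
    if (f == NULL) {
        perror(path);  // prints the path followed by the reason taken from errno
    }
    return f;
}
```

A caller would write something like `FILE *f = open_or_report("input.txt", "r");` and stop early when the result is NULL.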
Standard Streams and C APIs
Three predefined streams are opened implicitly when a program/process is executed:
- FILE *stdin: normal source of input, can be redirected
- FILE *stdout: normal destination of output, can be redirected
- FILE *stderr: diagnostics and errors, can be redirected
The standard streams stdin and stdout enable composition in UNIX. All of them can be redirected, for instance, using the pipe symbol |:
# `cat`'s `stdout` goes to `grep`'s `stdin`
cat hello.txt | grep "World"
- A file copy example:
#include <stdio.h>
#define BUFFER_SIZE 1024
int main(void) {
    FILE* input = fopen("input.txt", "r");
    FILE* output = fopen("output.txt", "w");
    // fopen returns NULL on failure, so check before using the streams
    if (input == NULL || output == NULL) {
        perror("fopen");
        return 1;
    }
    char buffer[BUFFER_SIZE];
    size_t length;
    // read up to BUFFER_SIZE bytes; fread returns the number of items read
    length = fread(buffer, sizeof(char), BUFFER_SIZE, input);
    while (length > 0) {
        fwrite(buffer, sizeof(char), length, output);
        // read the next chunk, until reaching the end of the file
        length = fread(buffer, sizeof(char), BUFFER_SIZE, input);
    }
    fclose(input);
    fclose(output);
    return 0;
}
- C API for positioning the file pointer:
int fseek(FILE* stream, long int offset, int whence);
long int ftell(FILE* stream);
void rewind(FILE* stream);
For fseek(), the offset is interpreted based on the whence argument:
- SEEK_SET: offset is interpreted from the beginning (position 0)
- SEEK_END: offset is interpreted relative to the end of the file (a negative offset moves backwards)
- SEEK_CUR: offset is interpreted from the current position
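As an illustration of the positioning calls, the sketch below measures a file by seeking to its end and reads a single byte at a chosen offset. The helper names file_size and byte_at are hypothetical, and the sketch assumes a regular file opened in binary mode.

```c
#include <stdio.h>

// Hypothetical helper: returns the size of the file at `path` in bytes,
// or -1 on error.
long file_size(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0) {  // move to the end of the file
        size = ftell(f);               // the position at the end is the size
    }
    fclose(f);
    return size;
}

// Hypothetical helper: returns the byte at `offset` from the beginning,
// or EOF if the offset is past the end or an error occurs.
int byte_at(const char *path, long offset) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return EOF;
    int c = EOF;
    if (fseek(f, offset, SEEK_SET) == 0) {  // absolute position from byte 0
        c = fgetc(f);
    }
    fclose(f);
    return c;
}
```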
What’s in a FILE?
- File descriptor (from the call to the low-level open API)
- An array buffer
- Lock (in case multiple threads use the FILE concurrently)
- Some other stuff…
When you call fwrite, what happens to the data you provided?
- It gets written to the FILE’s buffer
- If the FILE’s buffer is full, then it is flushed, meaning it’s written to the underlying file descriptor
- The C standard library may flush the FILE more frequently
- e.g., if it sees a certain character in the stream
- When you write code, make the weakest possible assumptions about how data is flushed from FILE buffers
char x = 'c';
FILE* f1 = fopen("file.txt", "w");
fwrite("b", sizeof(char), 1, f1);
FILE* f2 = fopen("file.txt", "r");
fread(&x, sizeof(char), 1, f2);
The call to fread might see the latest write 'b', or it might miss it and see end of file, in which case x will remain 'c'.
The first fwrite might not have reached the kernel, depending on whether the buffer was flushed. To make the write visible, explicitly flush the buffer after fwrite:
fwrite("b", sizeof(char), 1, f1);
fflush(f1);
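Put together, the flushed variant can be sketched as one function that writes through one stream and reads back through another. The function name and the path parameter are illustrative assumptions.

```c
#include <stdio.h>

// Writes one byte through one stream, flushes it, then reads it back through
// a second stream on the same file. Returns the byte read, or EOF on failure.
int write_flush_read(const char *path) {
    FILE *f1 = fopen(path, "w");
    if (f1 == NULL) return EOF;
    fwrite("b", sizeof(char), 1, f1);
    fflush(f1);  // push the stream's buffer down to the file descriptor

    FILE *f2 = fopen(path, "r");
    if (f2 == NULL) {
        fclose(f1);
        return EOF;
    }
    int c = fgetc(f2);  // the write has reached the kernel, so 'b' is visible
    fclose(f2);
    fclose(f1);
    return c;
}
```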
Low-Level File API: File Descriptors
- The integer returned from open() is a file descriptor. It refers to an OS object representing the state of an open file, and the user can use it as the “handle” to the file.
- Operations on file descriptors
- The open system call creates an open file description entry in the system-wide table of open files
- The open file description object in the kernel represents an instance of an open file
- System default file descriptors for stdin, stdout, stderr
#include <unistd.h>
STDIN_FILENO ‐ macro has value 0
STDOUT_FILENO ‐ macro has value 1
STDERR_FILENO ‐ macro has value 2
- Get the file descriptor inside a FILE *
int fileno(FILE *stream);
- Make a FILE * from a descriptor
FILE *fdopen(int filedes, const char* opentype);
- Read data from open file using file descriptor:
// Reads up to maxsize bytes
// returns bytes read, 0 => EOF, ‐1 => error
ssize_t read (int filedes, void *buffer, size_t maxsize)
- Write data to open file using file descriptor
// returns bytes written
ssize_t write (int filedes, const void *buffer, size_t size)
- Reposition the file offset in the kernel (this is independent of any position held by a high-level FILE stream for this file!)
off_t lseek (int filedes, off_t offset, int whence)
- Wait for i/o to finish
int fsync (int fildes)
// wait for ALL to finish
void sync (void)
- A little example
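#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
int main(void) {
    char buf[1000];
    // The mode argument is only used with O_CREAT, so it is omitted here
    int fd = open("lowio.c", O_RDONLY);
    if (fd < 0) return 1;
    // read may return fewer bytes than requested; rd holds the actual count
    ssize_t rd = read(fd, buf, sizeof(buf));
    int err = close(fd);
    if (rd < 0 || err < 0) return 1;
    ssize_t wr = write(STDOUT_FILENO, buf, rd);
    return wr < 0 ? 1 : 0;
}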
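The example above issues a single read, so it handles at most 1000 bytes. Since read and write may both transfer fewer bytes than requested, a more complete copy loops until EOF. The sketch below shows one way to do that; the function name copy_fd and the 4096-byte chunk size are assumptions.

```c
#include <unistd.h>

// Copies everything from descriptor `in` to descriptor `out`.
// Returns the total number of bytes copied, or -1 on error.
ssize_t copy_fd(int in, int out) {
    char buf[4096];
    ssize_t total = 0;
    for (;;) {
        ssize_t rd = read(in, buf, sizeof(buf));
        if (rd == 0) break;     // 0 => EOF
        if (rd < 0) return -1;  // -1 => error
        ssize_t off = 0;
        while (off < rd) {      // write may accept fewer bytes than asked
            ssize_t wr = write(out, buf + off, (size_t)(rd - off));
            if (wr < 0) return -1;
            off += wr;
        }
        total += rd;
    }
    return total;
}
```

Calling `copy_fd(fd, STDOUT_FILENO)` after a successful open would print a whole file regardless of its size.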
Other Low-Level APIs
- Operations specific to terminals, devices, networking
- e.g., ioctl
- Duplicating descriptors
int dup2 (int old, int new)
int dup (int old)
- Pipes - channel
int pipe(int pipefd[2])
writes to pipefd[1] can be read from pipefd[0]
- Memory mapped files
- File Locking
- Asynchronous I/O
- Generic I/O Control Operations
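Of the calls listed above, pipe is the easiest to try in isolation. The sketch below sends a short message through a pipe inside a single process; the function name pipe_roundtrip is hypothetical, and the message is assumed to be small enough to fit in the pipe's kernel buffer so that the write does not block.

```c
#include <string.h>
#include <unistd.h>

// Sends `msg` through a pipe and reads it back into `out`.
// Returns the number of bytes read, or -1 on error.
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsize) {
    int pipefd[2];
    if (pipe(pipefd) < 0) return -1;
    ssize_t wr = write(pipefd[1], msg, strlen(msg));  // pipefd[1] is the write end
    close(pipefd[1]);
    // pipefd[0] is the read end; data comes out in the order it went in
    ssize_t rd = (wr < 0) ? -1 : read(pipefd[0], out, outsize);
    close(pipefd[0]);
    return rd;
}
```

In practice the two ends usually live in different processes, with the pipe created before fork().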
High-Level vs. Low-Level File APIs
Compared with read(), fread() does more work before going into the kernel. Internally, the stream maintains a buffer. When fread() has to read from the kernel, it pulls a large chunk into this local memory buffer, and the subsequent fread() calls you make for a while just look in that buffer and grab the next bytes without going into the kernel, because entering the kernel is expensive.
Streams vs. File Descriptors
Streams are buffered in user memory
printf("Beginning of line ");
sleep(10); // sleep for 10 seconds
printf("and end of line\n");
The printf function writes to the buffered stdout stream. When stdout is a terminal, the buffer is flushed on the newline character, so everything is printed at once after the 10 seconds.
However, if we use low-level C API calls, operations on file descriptors are visible immediately
write(STDOUT_FILENO, "Beginning of line ", 18);
sleep(10);
write(STDOUT_FILENO, "and end of line\n", 16);
This outputs “Beginning of line ” 10 seconds earlier than “and end of line”. There is no user-level buffering on the file descriptor path, while the stream path buffers in user memory. System/kernel-level buffering also exists, but it is completely transparent to users. You won’t notice it.
Why Buffering in Userspace?
- Avoid system call overhead
- Time to copy registers, transition to kernel mode, jump to system call handler, etc.
- Minimum syscall time:
- syscalls are 25x more expensive than function calls (~100 ns)
- The blue bars are user level function calls
- The green bars are all system calls for getting the PID with getpid()!
- Don’t make a syscall if we can avoid it
- Read/write a file byte by byte?
- With the syscall APIs, the max throughput is ~10MB/second
- With fgetc, the speed can keep up with your SSD
- System call operations less capable
- Simplifies operating system
- Example: No “read until new line” operation
- Solution: Make a big read syscall, find first new line in userspace
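The “big read, then search in userspace” idea can be sketched as follows. The function name first_line and the 4096-byte chunk size are assumptions, and unlike a real stream implementation this sketch discards whatever follows the first newline instead of keeping it for the next call.

```c
#include <string.h>
#include <unistd.h>

// Reads one chunk from `fd` with a single read() call, then finds the first
// newline in user space and copies the line (without '\n') into `line`.
// Returns the line length, or -1 on error, EOF, a chunk without a newline,
// or a caller buffer that is too small.
ssize_t first_line(int fd, char *line, size_t linesize) {
    char chunk[4096];
    ssize_t rd = read(fd, chunk, sizeof(chunk));  // one big read syscall
    if (rd <= 0) return -1;
    char *nl = memchr(chunk, '\n', (size_t)rd);   // search without the kernel
    if (nl == NULL) return -1;
    size_t len = (size_t)(nl - chunk);
    if (len + 1 > linesize) return -1;
    memcpy(line, chunk, len);
    line[len] = '\0';
    return (ssize_t)len;
}
```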
State Maintained by the kernel
When open() is called successfully, the kernel creates an open file description and returns a file descriptor that refers to it. For each process, the kernel maintains a mapping from file descriptor to open file description.
On future system calls (e.g., read()), the kernel looks up the open file description using the file descriptor and uses it to service the system call.
So what does an Open File Description look like?
An internal Data Structure describing everything about the file, such as
- where it resides
- its status
- file descriptor number
- How to access it
The two most important things are
- where to find the file data on disk
- The current position within the file
Abstract Representation of a Process
- Suppose that we are executing open("foo.txt"), and the return value is 3.
- In the kernel space, we have this file descriptor table that maps a file descriptor (3 in this example) to an open file description.
- Next, after we open the file, we execute read(3, buf, 100) and the return value is 100. The position in the open file description will become 100.
- Finally, after close(3), the file descriptor (3) is removed from the table.
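The position kept in the open file description can be queried with lseek(fd, 0, SEEK_CUR), which reports the offset without moving it. The sketch below follows the steps above; the function name and the 256-byte cap on the read size are assumptions.

```c
#include <fcntl.h>
#include <unistd.h>

// Opens `path`, reads up to `n` bytes, and returns the resulting file offset
// as reported by the kernel, or -1 on error.
off_t offset_after_read(const char *path, size_t n) {
    char buf[256];
    if (n > sizeof(buf)) n = sizeof(buf);
    int fd = open(path, O_RDONLY);  // new open file description, position 0
    if (fd < 0) return -1;
    ssize_t rd = read(fd, buf, n);  // advances the position by the bytes read
    off_t pos = (rd < 0) ? -1 : lseek(fd, 0, SEEK_CUR);
    close(fd);                      // removes the descriptor from the table
    return pos;
}
```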
fork()
What if we don’t call close(3), and instead we call fork()?
- We have forked a child process (#2). Now the file descriptor table gets duplicated. Both the parent and the child process point to the same (shared) open file description, meaning either of them can read the file.
- Next, if the parent process reads 100 bytes, the position in the open file description will become 200. Now, if the child process reads 100 bytes, since the open file description is shared, the position will become 300.
- Finally, if the parent process executes close(3), the descriptor is removed from the parent’s table, but the child process still holds a reference to the open file description, so the file won’t be closed.
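The shared position can be observed from the parent even when only the child reads. The following sketch assumes a regular file with at least `n` bytes; the function name and the 256-byte cap are illustrative.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Opens `path`, forks, lets the child read up to `n` bytes, and returns the
// offset the parent observes afterwards. Returns -1 on error.
off_t offset_after_child_read(const char *path, size_t n) {
    char buf[256];
    if (n > sizeof(buf)) n = sizeof(buf);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        // Child: its copy of the descriptor refers to the same open file description
        ssize_t rd = read(fd, buf, n);
        _exit(rd < 0 ? 1 : 0);
    }
    int status;
    waitpid(pid, &status, 0);
    // The parent never read, yet the shared position has advanced
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```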
dup and dup2
The dup and dup2 functions let you duplicate file descriptors. For example, dup(3) will create a new file descriptor (4, if that is the lowest unused number) that points to the same open file description as 3. dup2(3, 162) lets you specify the new file descriptor (162) when duplicating the original file descriptor (3).
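Because both descriptors refer to one open file description, reading through the duplicate moves the position seen through the original. The sketch below illustrates this with dup; the function name and the 256-byte cap are assumptions.

```c
#include <fcntl.h>
#include <unistd.h>

// Duplicates a descriptor, reads up to `n` bytes through the duplicate, and
// returns the offset seen through the original descriptor, or -1 on error.
off_t offset_via_dup(const char *path, size_t n) {
    char buf[256];
    if (n > sizeof(buf)) n = sizeof(buf);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    int copy = dup(fd);  // gets the lowest unused descriptor number
    if (copy < 0) {
        close(fd);
        return -1;
    }
    ssize_t rd = read(copy, buf, n);  // advances the shared position
    off_t pos = (rd < 0) ? -1 : lseek(fd, 0, SEEK_CUR);
    close(copy);
    close(fd);
    return pos;
}
```

A common use of dup2 follows the same principle: `dup2(fd, STDOUT_FILENO)` makes descriptor 1 refer to the file behind `fd`, which is how shells implement output redirection.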