I/O
Basics¶
Types of I/O¶
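Polling: if the I/O device is busy, the CPU busy-waits, repeatedly checking the device status until it is ready.
Interrupt: after issuing an I/O request, the CPU does not wait for it and goes straight back to executing code; when the request completes, the hardware raises an interrupt, and only then does the CPU handle the result.
The problem with polling is that the CPU has to keep waiting whenever the device is busy. The problem with interrupts is that when I/O requests are very frequent, the CPU can end up spending nearly all its time handling interrupts and hardly running any program code.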
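The trade-off can be made concrete with a small user-space simulation. This is only a sketch: toy_device, toy_device_tick, poll_read, interrupt_read and on_interrupt are hypothetical names, and the "device" is a plain struct driven by a tick counter rather than real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulated device: it becomes ready after a fixed number of ticks. */
struct toy_device {
    int  ticks_until_ready;          /* remaining "time" before data is available */
    bool ready;                      /* status bit the CPU can inspect */
    int  data;                       /* value delivered once ready */
    void (*irq_handler)(int data);   /* called on completion, may be NULL */
};

/* Advance simulated time by one tick; raise the "interrupt" on completion. */
void toy_device_tick(struct toy_device *dev)
{
    assert(dev != NULL);
    if (dev->ready)
        return;
    if (--dev->ticks_until_ready <= 0) {
        dev->ready = true;
        if (dev->irq_handler)
            dev->irq_handler(dev->data);
    }
}

/* Polling: spin on the status bit. Returns the number of wasted iterations. */
int poll_read(struct toy_device *dev, int *out)
{
    assert(dev != NULL && out != NULL);
    int spins = 0;
    while (!dev->ready) {
        toy_device_tick(dev);   /* stands in for time passing on real hardware */
        spins++;
    }
    *out = dev->data;
    return spins;
}

/* State written by the simulated interrupt handler. */
static int  irq_result;
static bool irq_seen;

void on_interrupt(int data)
{
    irq_result = data;
    irq_seen = true;
}

/* Interrupt-driven: do other work until the handler fires. Returns work units done. */
int interrupt_read(struct toy_device *dev, int *out)
{
    assert(dev != NULL && out != NULL);
    int work_done = 0;
    irq_seen = false;
    dev->irq_handler = on_interrupt;
    while (!irq_seen) {
        work_done++;            /* the CPU runs unrelated code here */
        toy_device_tick(dev);
    }
    *out = irq_result;
    return work_done;
}
```

In both functions the loop runs for the same number of ticks; the difference is that the polling loop counts wasted spins, while the interrupt-driven loop counts iterations that were available for other work.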
DMA¶
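DMA (direct memory access) is a technique that lets an I/O device read and write memory without going through the CPU. This allows CPU execution and the I/O transfer to proceed asynchronously.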
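As shown in the figure:
- The process issues an ioctl system call; the resulting trap hands the request to the corresponding device driver
- The device driver notifies the corresponding drive controller
- The drive controller tells the DMA controller to start the DMA transfer, which writes the data directly into memory over the PCIe bus and the CPU memory bus. This is what "direct memory access" means
- Once the write has finished, the DMA controller sends an interrupt to the CPU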
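The hand-off in these steps can be mimicked in user space. The following is a minimal sketch under stated assumptions: dma_descriptor, driver_setup_transfer, dma_controller_run and cpu_handle_dma_irq are hypothetical names, memcpy stands in for the bus transfer, and a boolean flag stands in for the interrupt line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transfer descriptor that a driver would hand to a DMA controller. */
struct dma_descriptor {
    const void *src;   /* device-side buffer */
    void       *dst;   /* destination in main memory */
    size_t      len;   /* number of bytes to move */
};

/* Simulated interrupt line from the DMA controller to the CPU. */
static volatile bool dma_irq_pending = false;

/* Driver side: describe the transfer. The CPU touches no payload bytes here. */
struct dma_descriptor driver_setup_transfer(const void *dev_buf, void *mem, size_t len)
{
    struct dma_descriptor d = { dev_buf, mem, len };
    return d;
}

/* Controller side: move the data, then signal completion with an interrupt. */
void dma_controller_run(const struct dma_descriptor *d)
{
    assert(d != NULL);
    memcpy(d->dst, d->src, d->len);   /* stands in for the bus transfer */
    dma_irq_pending = true;
}

/* CPU side: acknowledge a completed transfer. Returns true if one was pending. */
bool cpu_handle_dma_irq(void)
{
    if (!dma_irq_pending)
        return false;
    dma_irq_pending = false;
    return true;
}
```

The point of the split is that the CPU only fills in a descriptor and later handles one interrupt; the byte-by-byte copy happens entirely on the controller side.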
Characteristics of I/O Devices¶
aspect | variation | example |
---|---|---|
data-transfer mode | character, block | terminal, disk |
access method | sequential, random | modem, CD-ROM |
transfer schedule | synchronous, asynchronous | tape, keyboard |
sharing | dedicated, sharable | tape, keyboard |
device speed | latency, seek time, transfer rate, delay between operations | |
I/O direction | read only, write only, read-write | CD-ROM, graphics controller, disk |
- Broadly, I/O devices can be grouped by the OS into
    - block I/O: read, write, seek
    - character I/O (stream)
    - memory-mapped file access
    - network sockets
- OSs usually have an escape/back door that passes arbitrary I/O commands from an application to a device
    - Linux's ioctl call sends commands to a device driver
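As an illustration of the ioctl escape hatch, the sketch below asks the driver behind a file descriptor how many bytes are waiting to be read, a question that read and write cannot express. It assumes Linux, where the FIONREAD request is available through sys/ioctl.h; bytes_available is a hypothetical helper name.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Ask the driver behind fd how many bytes are queued for reading.
 * FIONREAD is a request code interpreted by the driver, not by generic code.
 * Returns the byte count, or -1 if the ioctl fails.
 */
int bytes_available(int fd)
{
    assert(fd >= 0);
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

For example, after writing 5 bytes into a pipe, calling bytes_available on the read end reports 5 without consuming any data.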
Synchronous I/O vs Asynchronous I/O
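As shown in the figure, the left panel is synchronous I/O and the right panel is asynchronous I/O.
In the left panel, the call does not return until the I/O hardware has finished; in the right panel, the call returns immediately after the I/O command has been issued.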
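The "returns immediately" behaviour can be observed from user space. The sketch below uses a non-blocking file descriptor (O_NONBLOCK) as a stand-in: strictly speaking this is non-blocking rather than fully asynchronous I/O, since nothing notifies the caller later, but it shows the same contrast with a blocking read. set_nonblocking and try_read are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch fd to non-blocking mode. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try to read without waiting.
 * Returns the number of bytes read, or -1 on error. When the only problem is
 * that no data is ready yet, *would_block is set to 1; otherwise it is 0.
 */
ssize_t try_read(int fd, void *buf, size_t len, int *would_block)
{
    assert(would_block != NULL);
    ssize_t n = read(fd, buf, len);
    *would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```

On an empty pipe, a plain read would block until a writer shows up, whereas try_read on a non-blocking descriptor returns at once with would_block set, leaving the caller free to do other work and retry later.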
Kernel I/O Subsystem¶
A monolithic kernel includes the I/O subsystem. This subsystem can do the following:
- spooling: buffering of output
I/O Protection¶
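Keyboard events must not be observable by every program. Probably only the program that currently has focus should be able to receive them (otherwise passwords would inevitably leak). Keyboard events therefore have to be privileged.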
UNIX I/O Kernel Structure¶
Kernel source code
struct task_struct {
    // ...
    /* Filesystem information: */
    struct fs_struct *fs;
    /* Open file information: */
    struct files_struct *files;
    // ...
};
struct files_struct *files is the per-process open-file table:
/*
 * Open file table structure
 */
struct files_struct {
    /*
     * read mostly part
     */
    atomic_t count;
    bool resize_in_progress;
    wait_queue_head_t resize_wait;
    struct fdtable __rcu *fdt;
    struct fdtable fdtab;
    /*
     * written part on a separate cache line in SMP
     */
    spinlock_t file_lock ____cacheline_aligned_in_smp;
    unsigned int next_fd;
    unsigned long close_on_exec_init[1];
    unsigned long open_fds_init[1];
    unsigned long full_fds_bits_init[1];
    struct file __rcu * fd_array[NR_OPEN_DEFAULT];
};
Here, fd_array (declared as struct file __rcu * fd_array[NR_OPEN_DEFAULT]) is an array of pointers, each pointing to the record of one open file.
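The idea of a descriptor being an index into an array of record pointers can be sketched outside the kernel. Everything below is hypothetical (toy_file, toy_files, toy_fd_install, toy_fd_lookup, toy_fd_close, TOY_NR_OPEN) and leaves out what the real structure needs, such as locking, RCU and table resizing.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_OPEN 8   /* hypothetical size; the kernel uses NR_OPEN_DEFAULT */

/* Hypothetical open-file record, standing in for the kernel's struct file. */
struct toy_file {
    const char *path;
    long        offset;
};

/* Per-process table: slot i holds the record for file descriptor i, or NULL. */
struct toy_files {
    struct toy_file *fd_array[TOY_NR_OPEN];
};

/* Install f in the lowest free slot and return its descriptor, or -1 if full. */
int toy_fd_install(struct toy_files *t, struct toy_file *f)
{
    assert(t != NULL && f != NULL);
    for (int fd = 0; fd < TOY_NR_OPEN; fd++) {
        if (t->fd_array[fd] == NULL) {
            t->fd_array[fd] = f;
            return fd;
        }
    }
    return -1;
}

/* Translate a descriptor into its record; NULL if out of range or closed. */
struct toy_file *toy_fd_lookup(const struct toy_files *t, int fd)
{
    assert(t != NULL);
    if (fd < 0 || fd >= TOY_NR_OPEN)
        return NULL;
    return t->fd_array[fd];
}

/* Close a descriptor by clearing its slot so that it can be reused. */
void toy_fd_close(struct toy_files *t, int fd)
{
    assert(t != NULL);
    if (fd >= 0 && fd < TOY_NR_OPEN)
        t->fd_array[fd] = NULL;
}
```

Handing out the lowest free slot mirrors the familiar UNIX behaviour in which a newly opened file receives the smallest unused descriptor number.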
I/O Request to Hardware¶
Consider reading a file from disk for a process:
- determine device holding file
- translate name to device representation
    - FAT, UNIX: major/minor
- physically read data from disk into buffer
- make data available to requesting process
- return control to process
Lifecycle of An I/O Request
How to Reduce I/O Overhead¶
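In one sentence: make devices smarter.
Concretely, use DMA wherever it is available, and let the device controller take care of whatever it is capable of handling, rather than leaving every decision to the CPU.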
Question: How to Register a Device on Linux¶
Taking tty as an example:
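From this we can roughly see that a device has to be registered. Consequently, hot-plugging a device raises an interrupt, after which the device is registered or unregistered.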
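What "registering" amounts to can be sketched as a lookup table keyed by major number. This is not the Linux tty registration code: toy_driver, toy_register_driver, toy_unregister_driver, toy_find_driver and TOY_MAX_MAJOR are all hypothetical, and the sketch ignores minor numbers, locking and reference counting.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MAJOR 16   /* hypothetical table size */

/* Hypothetical driver record; a real kernel stores far more state. */
struct toy_driver {
    const char *name;
};

/* Registry: slot i holds the driver for major number i, or NULL. */
static const struct toy_driver *toy_drivers[TOY_MAX_MAJOR];

/* Register drv under a major number. Returns 0, or -1 if invalid or already taken. */
int toy_register_driver(unsigned int major, const struct toy_driver *drv)
{
    assert(drv != NULL);
    if (major >= TOY_MAX_MAJOR || toy_drivers[major] != NULL)
        return -1;
    toy_drivers[major] = drv;
    return 0;
}

/* Remove the registration, for example when a hot-plugged device is removed. */
void toy_unregister_driver(unsigned int major)
{
    if (major < TOY_MAX_MAJOR)
        toy_drivers[major] = NULL;
}

/* Find the driver responsible for a major number; NULL if none is registered. */
const struct toy_driver *toy_find_driver(unsigned int major)
{
    return major < TOY_MAX_MAJOR ? toy_drivers[major] : NULL;
}
```

In this picture, plugging a device in leads to a toy_register_driver call and unplugging it leads to toy_unregister_driver; requests for an unregistered major number simply find no driver.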
Linux I/O Implementation¶
write¶
write -> ... -> vfs_write -> (indirect call) tty_write
ioctl¶
ioctl -> vfs_ioctl -> (indirect call) tty_ioctl
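The "(indirect call)" step in both chains is a call through a function pointer stored in a per-file operation table. The sketch below shows the pattern in plain C, in the spirit of Linux's struct file_operations; all toy_ names are hypothetical, and the real vfs_write and tty_write do considerably more (permission checks, locking, line discipline handling).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Hypothetical per-driver operation table. */
struct toy_file_ops {
    long (*write)(struct toy_file *f, const char *buf, size_t len);
    long (*ioctl)(struct toy_file *f, unsigned int cmd, unsigned long arg);
};

struct toy_file {
    const struct toy_file_ops *ops;  /* chosen when the file is opened */
    char   out[64];                  /* toy "terminal" output buffer */
    size_t used;
};

/* Driver-specific implementations, standing in for tty_write and tty_ioctl. */
long toy_tty_write(struct toy_file *f, const char *buf, size_t len)
{
    if (len > sizeof f->out - f->used)
        len = sizeof f->out - f->used;   /* short write when the buffer is full */
    memcpy(f->out + f->used, buf, len);
    f->used += len;
    return (long)len;
}

long toy_tty_ioctl(struct toy_file *f, unsigned int cmd, unsigned long arg)
{
    (void)arg;
    if (cmd == 1) {          /* hypothetical command 1: discard buffered output */
        f->used = 0;
        return 0;
    }
    return -1;               /* unknown command */
}

const struct toy_file_ops toy_tty_ops = { toy_tty_write, toy_tty_ioctl };

/* Generic layer: knows nothing about the device and only follows the pointers. */
long toy_vfs_write(struct toy_file *f, const char *buf, size_t len)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->write ? f->ops->write(f, buf, len) : -1;
}

long toy_vfs_ioctl(struct toy_file *f, unsigned int cmd, unsigned long arg)
{
    assert(f != NULL && f->ops != NULL);
    return f->ops->ioctl ? f->ops->ioctl(f, cmd, arg) : -1;
}
```

Because the generic layer only dereferences ops, supporting another kind of device means supplying a different table; toy_vfs_write and toy_vfs_ioctl stay unchanged.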