
I/O

Basics

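Types of I/O

Polling: if the I/O device is busy, the CPU busy-waits until it is ready.

Interrupt: after issuing an I/O request, the CPU does not wait for it and goes straight back to executing code; when the request completes, the hardware raises an interrupt, and only then does the CPU handle the result.

The problem with polling is that the CPU has to keep waiting whenever the device is busy. The problem with interrupts is that when there are many I/O requests, the program can spend all its time handling interrupts and never get to run its own code.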
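The difference can be sketched with a simulated device. This is only an illustration: `sim_device`, `device_tick`, `read_polling` and `read_interrupt` are hypothetical names, and the "interrupt" is just a callback invoked by the simulated hardware rather than a real IRQ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulated device: its data becomes ready after ticks_left ticks. */
struct sim_device {
    int ticks_left;                            /* remaining "hardware time" */
    bool ready;                                /* status register */
    int data;                                  /* data register */
    void (*irq_handler)(int data, void *ctx);  /* NULL means no interrupt wired up */
    void *irq_ctx;
};

/* Advance the simulated hardware by one tick; raise the "interrupt" on completion. */
static void device_tick(struct sim_device *dev)
{
    if (dev->ready)
        return;
    if (dev->ticks_left > 0)
        dev->ticks_left--;
    if (dev->ticks_left == 0) {
        dev->ready = true;
        if (dev->irq_handler)
            dev->irq_handler(dev->data, dev->irq_ctx);
    }
}

/* Polling: spin on the status register; *spins counts the wasted iterations. */
static int read_polling(struct sim_device *dev, long *spins)
{
    *spins = 0;
    while (!dev->ready) {
        device_tick(dev);
        (*spins)++;
    }
    return dev->data;
}

struct irq_result {
    bool done;
    int value;
};

/* Interrupt handler: records the result delivered by the device. */
static void on_irq(int data, void *ctx)
{
    struct irq_result *r = ctx;
    r->value = data;
    r->done = true;
}

/* Interrupt-driven: keep doing other work until the handler has fired.
 * Returns how many units of other work were done in the meantime. */
static long read_interrupt(struct sim_device *dev, struct irq_result *res)
{
    long useful_work = 0;

    assert(!dev->ready);   /* the request must still be in flight */
    res->done = false;
    dev->irq_handler = on_irq;
    dev->irq_ctx = res;
    while (!res->done) {
        useful_work++;     /* stands in for running unrelated code */
        device_tick(dev);  /* the hardware makes progress on its own */
    }
    return useful_work;
}
```

With polling, every loop iteration is spent checking the device; with the interrupt version, the same iterations are available for other work, at the cost of running a handler for each completed request.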

DMA

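DMA is a technique that lets I/O devices read and write memory without going through the CPU. This allows CPU execution and I/O to proceed asynchronously.

As the figure shows, the steps are:

  1. The process issues an ioctl system call
  2. The trap hands the request to the corresponding device driver
  3. The device driver notifies the corresponding drive controller
  4. The drive controller tells the DMA controller to start the DMA transfer, which writes the data straight into memory over the PCIe bus and the CPU memory bus. This is what "direct memory access" means
  5. Once the write has finished, the DMA controller sends an interrupt to the CPU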
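The hand-off in steps 3 to 5 can be sketched as a toy model. All names here (`dma_request`, `dma_controller_run`, `driver_read`) are hypothetical, and the `memcpy` only stands in for the transfer that real hardware would perform without the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor that the driver hands to the DMA controller. */
struct dma_request {
    const void *device_buf;          /* source: data held by the device */
    void *mem_buf;                   /* destination: main memory */
    size_t len;                      /* number of bytes to move */
    void (*on_complete)(void *ctx);  /* stands in for the completion interrupt */
    void *ctx;
};

/* Simulated DMA controller: move the data, then signal completion.
 * In this toy the copy runs on the CPU; real DMA hardware does it instead. */
static void dma_controller_run(const struct dma_request *req)
{
    assert(req->len == 0 || (req->device_buf && req->mem_buf));
    memcpy(req->mem_buf, req->device_buf, req->len);
    if (req->on_complete)
        req->on_complete(req->ctx);
}

/* "Interrupt handler": just records that the transfer has finished. */
static void mark_done(void *ctx)
{
    *(bool *)ctx = true;
}

/* Driver side: describe the transfer, start it, and report whether the
 * completion signal arrived. */
static bool driver_read(const void *device_buf, void *mem_buf, size_t len)
{
    bool done = false;
    struct dma_request req = { device_buf, mem_buf, len, mark_done, &done };

    dma_controller_run(&req);
    return done;
}
```

The point of the structure is that the driver only fills in a descriptor and waits for the completion signal; it never touches the individual bytes.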

Characteristics of I/O Devices

aspect              variation                  example
data-transfer mode  character                  terminal
                    block                      disk
access method       sequential                 modem
                    random                     CD-ROM
transfer schedule   synchronous                tape
                    asynchronous               keyboard
sharing             dedicated                  tape
                    sharable                   keyboard
device speed        latency
                    seek time
                    transfer rate
                    delay between operations
I/O direction       read only                  CD-ROM
                    write only                 graphics controller
                    read-write                 disk

  • Broadly, I/O devices can be grouped by the OS into
    • block I/O: read, write, seek
    • character I/O (Stream)
    • memory-mapped file access
    • network sockets
  • OSs usually have an escape/back door that passes arbitrary I/O
    commands from an application to a device
  • On Linux, this is the ioctl call, which sends commands to a device driver

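As a small illustration of the ioctl back door, the sketch below asks the kernel how many unread bytes are waiting on a file descriptor. It assumes Linux, where the `FIONREAD` request is available through `<sys/ioctl.h>`; the helper name `bytes_pending` is made up for this example.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Return the number of unread bytes queued on fd, or -1 on error.
 * FIONREAD is a device-specific request passed through ioctl rather than
 * through the regular read/write interface. */
static int bytes_pending(int fd)
{
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;
    return n;
}
```

For example, after writing 5 bytes into a pipe, `bytes_pending` on the read end reports 5 without consuming them.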
Synchronous I/O vs Asynchronous I/O

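In the figure, the left side shows synchronous I/O and the right side shows asynchronous I/O.

On the left, the call does not return until the I/O hardware has finished; on the right, the call returns immediately after the I/O command has been issued.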
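The "returns immediately" behaviour can be approximated from user space with non-blocking mode. This is a sketch, not full asynchronous I/O: `O_NONBLOCK` makes `read` return at once when no data is ready, but nothing notifies the caller later. The helper names `set_nonblocking` and `try_read` are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Switch fd to non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Try to read without waiting. Returns the number of bytes read, or -1.
 * *would_block is set when the -1 only means "no data yet". */
static ssize_t try_read(int fd, void *buf, size_t len, bool *would_block)
{
    ssize_t n;

    assert(would_block != NULL);
    n = read(fd, buf, len);
    *would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```

On a default (blocking) descriptor the same `read` would behave like the left side of the figure and not return until data arrives.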

Kernel I/O Subsystem

A monolithic kernel includes an I/O subsystem. This subsystem can do the following:

  • spooling: buffering output destined for a device

I/O Protection

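Keyboard events must not be visible to every program. Typically only the application that has focus should receive them (otherwise passwords would inevitably leak). Keyboard I/O therefore has to be privileged.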

UNIX I/O Kernel Structure

Kernel source code
struct task_struct {
    // ...

    /* Filesystem information: */
    struct fs_struct        *fs;

    /* Open file information: */
    struct files_struct     *files;

    // ...
};

struct files_struct *files is the per-process open-file table:

/*
 * Open file table structure
 */
struct files_struct {
  /*
   * read mostly part
   */
    atomic_t count;
    bool resize_in_progress;
    wait_queue_head_t resize_wait;

    struct fdtable __rcu *fdt;
    struct fdtable fdtab;
  /*
   * written part on a separate cache line in SMP
   */
    spinlock_t file_lock ____cacheline_aligned_in_smp;
    unsigned int next_fd;
    unsigned long close_on_exec_init[1];
    unsigned long open_fds_init[1];
    unsigned long full_fds_bits_init[1];
    struct file __rcu * fd_array[NR_OPEN_DEFAULT];
};

Here, struct file * fd_array is an array of pointers to the individual open-file records, indexed by file descriptor.
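To make the role of fd_array and next_fd concrete, here is a toy model of a per-process table. It is a simplified sketch, not kernel code: `toy_file`, `toy_files`, `TOY_NR_OPEN` and the three helpers are hypothetical, and there is no locking, RCU, or table resizing.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_OPEN 8  /* hypothetical fixed size for this sketch */

/* Stands in for struct file: one open-file record. */
struct toy_file {
    const char *path;
    long pos;
};

/* Stands in for struct files_struct. */
struct toy_files {
    unsigned int next_fd;                    /* hint: lowest fd that may be free */
    struct toy_file *fd_array[TOY_NR_OPEN];  /* fd -> open-file record */
};

/* Install f in the lowest free slot at or above the hint.
 * Returns the new fd, or -1 if the table is full. */
static int toy_fd_install(struct toy_files *files, struct toy_file *f)
{
    assert(files != NULL && f != NULL);
    for (unsigned int fd = files->next_fd; fd < TOY_NR_OPEN; fd++) {
        if (files->fd_array[fd] == NULL) {
            files->fd_array[fd] = f;
            files->next_fd = fd + 1;
            return (int)fd;
        }
    }
    return -1;
}

/* Translate an fd into its open-file record; NULL if invalid or closed. */
static struct toy_file *toy_fd_lookup(const struct toy_files *files, int fd)
{
    if (fd < 0 || fd >= TOY_NR_OPEN)
        return NULL;
    return files->fd_array[fd];
}

/* Free the slot and pull the hint back so the fd can be reused. */
static void toy_fd_close(struct toy_files *files, int fd)
{
    if (fd < 0 || fd >= TOY_NR_OPEN)
        return;
    files->fd_array[fd] = NULL;
    if ((unsigned int)fd < files->next_fd)
        files->next_fd = (unsigned int)fd;
}
```

A file descriptor is therefore nothing more than an index: closing fd 0 and opening another file hands out 0 again, because the lowest free slot is reused.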

I/O Request to Hardware

Consider reading a file from disk for a process:

  • determine device holding file
  • translate name to device representation
    • FAT, UNIX: major/minor
  • physically read data from disk into buffer
  • make data available to requesting process
  • return control to process

Lifecycle of An I/O Request

How to Reduce I/O Overhead

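In one sentence: make devices smarter.

Concretely, use DMA wherever it is available, and let the device's controller take care of whatever it is capable of handling, instead of having the CPU make every decision.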

Question: How to Register a Device on Linux

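Taking tty as an example:

Roughly speaking, a device has to be registered. So when a device is hot-plugged, an interrupt is raised, and the device is then registered or unregistered accordingly.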
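The register/unregister idea can be sketched as a small table of drivers. This is a toy model under stated assumptions, not the kernel's interface (the kernel's tty layer has its own `tty_register_driver` and `tty_unregister_driver`); every name below, including `toy_driver` and the `major` field value used in examples, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_DRIVERS 4  /* hypothetical table size */

/* Hypothetical driver record: a name and a major number. */
struct toy_driver {
    const char *name;
    int major;
};

static struct toy_driver *toy_drivers[TOY_MAX_DRIVERS];

/* Add drv to the table. Returns 0 on success, -1 if the name is already
 * registered or the table is full. Would run when a device is plugged in. */
static int toy_register_driver(struct toy_driver *drv)
{
    int free_slot = -1;

    assert(drv != NULL && drv->name != NULL);
    for (int i = 0; i < TOY_MAX_DRIVERS; i++) {
        if (toy_drivers[i] == NULL) {
            if (free_slot == -1)
                free_slot = i;
        } else if (strcmp(toy_drivers[i]->name, drv->name) == 0) {
            return -1;
        }
    }
    if (free_slot == -1)
        return -1;
    toy_drivers[free_slot] = drv;
    return 0;
}

/* Remove drv from the table. Would run when the device is unplugged. */
static void toy_unregister_driver(struct toy_driver *drv)
{
    for (int i = 0; i < TOY_MAX_DRIVERS; i++) {
        if (toy_drivers[i] == drv)
            toy_drivers[i] = NULL;
    }
}

/* Look a driver up by name; NULL if it is not registered. */
static struct toy_driver *toy_find_driver(const char *name)
{
    for (int i = 0; i < TOY_MAX_DRIVERS; i++) {
        if (toy_drivers[i] && strcmp(toy_drivers[i]->name, name) == 0)
            return toy_drivers[i];
    }
    return NULL;
}
```

Only registered drivers can be found by later lookups, which is why a hot-plug event has to end in a register or unregister call.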

Linux I/O Implementation

write

write -> ... -> vfs_write -> (indirect call) tty_write

ioctl

ioctl -> vfs_ioctl -> (indirect call) tty_ioctl
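The "(indirect call)" step in both chains is a call through a table of function pointers attached to the open file. The sketch below imitates that pattern in user space; `demo_file`, `demo_file_ops`, `DEMO_IOC_CLEAR` and the `demo_*` functions are hypothetical stand-ins, far simpler than the kernel's real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_file;

/* Hypothetical, simplified analogue of a per-file operations table. */
struct demo_file_ops {
    long (*write)(struct demo_file *f, const char *buf, size_t len);
    long (*ioctl)(struct demo_file *f, unsigned int cmd, unsigned long arg);
};

struct demo_file {
    const struct demo_file_ops *f_op;  /* filled in by the driver at open time */
    char out[32];                      /* pretend terminal output buffer */
    size_t out_len;
};

#define DEMO_IOC_CLEAR 1u  /* made-up command: discard buffered output */

/* Driver-specific write: append to the buffer, truncating if it is full. */
static long demo_tty_write(struct demo_file *f, const char *buf, size_t len)
{
    size_t room = sizeof f->out - f->out_len;

    if (len > room)
        len = room;
    memcpy(f->out + f->out_len, buf, len);
    f->out_len += len;
    return (long)len;
}

/* Driver-specific ioctl: understands only DEMO_IOC_CLEAR. */
static long demo_tty_ioctl(struct demo_file *f, unsigned int cmd, unsigned long arg)
{
    (void)arg;
    if (cmd == DEMO_IOC_CLEAR) {
        f->out_len = 0;
        return 0;
    }
    return -1;
}

static const struct demo_file_ops demo_tty_ops = {
    .write = demo_tty_write,
    .ioctl = demo_tty_ioctl,
};

/* Generic layer: knows nothing about ttys and dispatches through f_op. */
static long demo_vfs_write(struct demo_file *f, const char *buf, size_t len)
{
    assert(f != NULL);
    if (!f->f_op || !f->f_op->write)
        return -1;
    return f->f_op->write(f, buf, len);
}

static long demo_vfs_ioctl(struct demo_file *f, unsigned int cmd, unsigned long arg)
{
    assert(f != NULL);
    if (!f->f_op || !f->f_op->ioctl)
        return -1;
    return f->f_op->ioctl(f, cmd, arg);
}
```

The generic layer stays the same for every device type; supporting a new device only means supplying another operations table.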