Lec 21-22: Network Programming
Linux 哲学:一切皆文件
文件类型标识 | 文件类型 |
---|---|
- |
普通文件 |
d |
目录文件 |
l |
符号链接 |
c (伪文件) |
字符设备(character device) |
b (伪文件) |
块设备(block device) |
s (伪文件) |
套接字文件(socket) |
p (伪文件) |
命名管道文件(pipe) |
如上图,Linux 的哲学就是,一切皆文件。
其中:
-
字符设备和块设备
-
字符设备只能以字节为最小单位访问,而块设备以块为单位访问,例如 512 字节,1024 字节等。
-
块设备可以随机访问,但是字符设备不可以
-
套接字文件
-
使用套接字(socket)除了可以实现网络间不同主机间的通信外,还可以实现同一主机的不同进程间的通信,且建立的通信是双向的通信。
I/O 通路硬件实现
如图,网络信息通过网络适配器进行解译,并通过 I/O 总线传到 I/O 桥,进而进入内存/CPU。
Internet Protocol
How is it possible to send bits across incompatible LANs and WANs?
Solution: protocol software running on each host and router
- Protocol is a set of rules that governs how hosts and routers should cooperate when they transfer data from network to network.
- Smooths out the differences between the different networks
What does an internet protocol do?
-
Provides a naming scheme
-
An internet protocol defines a uniform format for host addresses
-
Each host (and router) is assigned at least one of these internet addresses that uniquely identifies it
-
Provides a delivery mechanism
-
An internet protocol defines a standard transfer unit (packet)
-
Packet consists of header and payload
- Header: contains info such as packet size, source and destination addresses
- Payload: contains data bits sent from source host
Global IP Internet
- Most famous example of an internet
- Based on the TCP/IP protocol family
- IP (Internet Protocol):
- Provides basic naming scheme and unreliable delivery capability of packets (datagrams) from host-to-host
- UDP (Unreliable Datagram Protocol):
- Uses IP to provide unreliable datagram delivery from process-to-process
- TCP (Transmission Control Protocol):
- Uses IP to provide reliable byte streams from process-to-process over connections
- Accessed via a mix of Unix file I/O and functions from the sockets interface
Internet Connections
- Clients and servers communicate by sending streams of bytes over connections. Each connection is:
- Point-to-point: connects a pair of processes.
- Full-duplex: data can flow in both directions at the same time,
- Reliable: stream of bytes sent by the source is eventually received by the destination in the same order it was sent.
- Note: A connection must be RELIABLE (e.g. socket via TCP), otherwise it's not a connection (e.g. socket via UDP, ICMP, etc).
- A socket is an endpoint of a connection
- Socket address is an
IPaddress:port
pair - A port is a 16-bit integer that identifies a process:
- Ephemeral port: Assigned automatically by client kernel when client makes a connection request.
- Well-known port: Associated with some service provided by a server (e.g., port 80 is associated with Web servers)
Anatomy of a Connection
A connection is uniquely identified by the socket addresses of its endpoints (socket pair)
(cliaddr:cliport, servaddr:servport)
Socket Interface
如图,这就是 socket 的一般执行流程。
其中
getaddrinfo
用于 DNS lookupsocket
用于建立一个抽象的套接字- 对于内核而言,是建立了一个 endpoint of communication
- 对于用户而言,是建立了一个 socket 文件,以便用户直接通过文件 I/O 进行网络通信
bind
用于将该套接字文件和对应的套接字addr
bind 在一起- 也就是设置这个 socket 的参数
listen
用于将该套接字文件设置为 listen 模式,从而阻塞进程,直到接收到 connection request