内核的同步以及锁机制

1.前言

为什么要考虑内核同步

在单一CPU 的情况下，中断或者内核代码明确调度时，多个执行线程并发访问共享数据。目前多处理器以及抢占式内核的存在，更需要注意保护共享资源。

造成并发的原因

Interrupts— An interrupt can occur asynchronously at almost any time, interrupting the currently executing code.

Softirqs and tasklets— The kernel can raise or schedule a softirq or tasklet at
almost any time, interrupting the currently executing code.

Kernel preemption— Because the kernel is preemptive, one task in the kernel
can preempt another.

Sleeping and synchronization with user-space— A task in the kernel can
sleep and thus invoke the scheduler, resulting in the running of a new process.

Symmetrical multiprocessing— Two or more processors can execute kernel
code at exactly the same time.

临界区

临界区( Critical Regions ) 指的是访问内存中共享数据的 代码段，编程者需要保证这一部分代码段执行期间不被打断。如果两个线程在同一个临界区执行，就会有Bug。避免这种情况发生称之为 同步(synchronization).

例子

比如银行取款的代码，同一个账号同时在用两种方式取款，需要调用同一个取款函数，这时候必须考虑同步。有两种方法来解决这个问题，一个是原子性，一个是加锁。

将取款的整个函数原子化,使得每一个事务是原子性并且不可打断的。CPU 和内核提供了原子操作的接口（算术运算或比较大小）。
有时候共享数据是一个复杂的数据结构，例如队列等不定长度的数据类型,这种复杂的情况没有办法原子化,就需要用到加锁

2.加锁

锁的机制就像一个房间的门锁一样，房间内只能有一个线程存在。后来的线程需要在门外排队。
linux 自身实现了几种不同的锁，主要体现在锁在被占用的情况下的表现。
加锁的过程是采用原子操作实现的，和体系结构相关。一般 0 意味着开锁。
伪并发与真并发 都存在竞争，都需要保护。
- 伪并发是指单处理器系统中，一个程序在运行中被调度到另一个程序，两个程序执行在一个临界区。
- 真并发是指在多处理器系统中，两个程序同时执行，执行在一个临界区。
加锁的方法并不难，难的是找到哪些数据和临界区需要加锁，在编写代码的时候就要考虑到
- Is the data global? Can a thread of execution other than the current one access it?
- Is the data shared between process context and interrupt context? Is it shared between two different interrupt handlers?
- If a process is preempted while accessing this data, can the newly scheduled process access the same data?
- Can the current process sleep (block) on anything? If it does, in what state does that leave any shared data?
- What prevents the data from being freed out from under me?
- What happens if this function is called again on another processor?
- Given the proceeding points, how am I going to ensure that my code is safe from
  concurrency?
不需要加锁的数据
- 局部变量，只被特定进程访问的数据
需要加锁的数据
- 大多数内核的数据结构，有其他线程可以访问的数据
内核在编译时是可以配置的，可以不设置SMP 以及抢占的模块，减少开销。

3.死锁

死锁的产生条件：一个或多个执行线程，和一个或多个共享数据。 ==彼此需要的资源被彼此锁住了== 。

自死锁，等待自己已经拿到的锁
避免死锁的规则：
- 按顺序加锁，多个线程按照同样的顺序获得多个锁， A - B -C 这样第二个进程就不会先获得C 造成死锁。
- 防止发生饥饿
- 不重复请求同一个锁
- 设计应简单
内核提供了调试工具在运行时检测死锁

4.扩展性

并不是多CPU 就是成倍的增加性能，因为共享数据的锁，需要排队等待锁机制。
锁的加锁 ==粒度==，是对一个大块数据加锁还是对大块数据的一个元素加锁。
- 可以对(一整个链表/链表的一个节点/节点的一个元素)进行加锁。
- 锁的增用严重的话，就需要使加锁的粒度更精细，但是粒度过于精细也会使得系统开销增大
- 调度队列 O(1) 到 CFS 就提高了精细度