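When developing device drivers or embedded system applications, we often declare variables with volatile to tell the compiler that the variable may be modified by hardware or by an interrupt function, so code optimization must leave every operation on a volatile variable intact. (For example, when the compiler's analysis finds that a variable is only read and never written, it may reorder or remove those reads to improve performance.) Although this is basic knowledge for most developers, we recently stumbled into a small pitfall anyway, so this post revisits the definition of volatile through the C11 standard and addresses the questions that tend to come up.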
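As a minimal sketch of the interrupt case described above, the following uses a standard C signal handler as a stand-in for an interrupt function. The names event_pending, on_event, and run_event_demo are hypothetical, and SIGINT is only a convenient signal for the demonstration, not something the original setup is known to use.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag shared between normal code and an asynchronous handler. */
static volatile sig_atomic_t event_pending = 0;

/* Stands in for an interrupt function; it only sets the flag. */
static void on_event(int signum)
{
    (void)signum;
    event_pending = 1;
}

/* Installs the handler, triggers it once, and returns the flag value.
 * Returns -1 if the handler could not be installed. */
int run_event_demo(void)
{
    event_pending = 0;
    if (signal(SIGINT, on_event) == SIG_ERR)
        return -1;

    assert(event_pending == 0);   /* nothing has fired yet */
    raise(SIGINT);                /* the handler runs before raise() returns */

    while (!event_pending) {
        /* volatile forces a fresh load of event_pending on every iteration */
    }

    signal(SIGINT, SIG_DFL);      /* restore the default disposition */
    return (int)event_pending;
}
```

Without volatile, the compiler would be allowed to read event_pending once and treat the loop condition as constant, which is exactly the class of optimization the qualifier is meant to rule out.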
volatile in the C11 standard
First, let's look at how the C11 standard defines the volatile qualifier (Linux Kernel Moving Ahead With Going From C89 To C11 Code).
5.1.2.3 Program execution
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.
The key points of 5.1.2.3 are the definition of side effects and the evaluation order that must be respected.
6.7.3 Type qualifiers
If an attempt is made to refer to an object defined with a volatile-qualified type through use of an lvalue with non-volatile-qualified type, the behavior is undefined.
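The example below is related to this rule. It does not itself trigger the undefined behavior, because number is read through a volatile-qualified lvalue; what it shows is that the value copied into copy_number is no longer volatile-qualified: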
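int test(void) {
    // example: 1
    volatile int number = 2;
    volatile int *number_ptr = &number;
    // Volatile read; the value stored in copy_number is a plain int.
    int copy_number = *number_ptr;

    // Ordinary store to a non-volatile local, free to be optimized.
    copy_number = 30;

    return copy_number;
}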
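Compiled with -O2 (x86_64 gcc 12.1), the store to and load from number are kept, but after lvalue conversion the value no longer carries the volatile qualifier, so the write to copy_number is folded into the return value: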
test:
        mov     DWORD PTR [rsp-4], 2
        mov     eax, DWORD PTR [rsp-4]
        mov     eax, 30
        ret
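Because lvalue conversion yields an unqualified value, copying a volatile object into an ordinary local is a common way to take a single snapshot of it. The sketch below contrasts that with reading the object twice; the function names are hypothetical and the code is only an illustration of the rule quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Reads the volatile object exactly once and works on an ordinary copy. */
int double_snapshot(volatile int *source)
{
    assert(source != NULL);
    int snapshot = *source;       /* one volatile access; snapshot is a plain int */
    return snapshot + snapshot;   /* no further accesses to *source */
}

/* Reads the volatile object twice; on real hardware the values may differ.
 * Separate statements keep the order of the two reads fixed. */
int sum_two_reads(volatile int *source)
{
    assert(source != NULL);
    int first = *source;          /* first volatile access */
    int second = *source;         /* second volatile access */
    return first + second;
}
```

When nothing modifies the object between the reads, both functions return the same result; they differ only in how many volatile accesses the compiler has to emit.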
An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. What constitutes an access to an object that has volatile-qualified type is implementation-defined.
A volatile declaration may be used to describe an object corresponding to a memory-mapped input/output port or an object accessed by an asynchronously interrupting function. Actions on objects so declared shall not be "optimized out" by an implementation or reordered except as permitted by the rules for evaluating expressions.
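The memory-mapped I/O case from the passage above can be sketched as a polling loop over a pointer to a volatile-qualified register. STATUS_READY and wait_until_ready are hypothetical names, and the register is simulated by whatever object the caller passes in; on real hardware the pointer would be derived from the device's documented address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_READY 0x1u   /* hypothetical "ready" bit in the status register */

/* Polls the status register until READY is set or max_polls reads were made.
 * Returns the number of reads performed, or -1 if READY never appeared. */
int wait_until_ready(volatile uint32_t *status_reg, int max_polls)
{
    assert(status_reg != NULL);
    for (int polls = 1; polls <= max_polls; ++polls) {
        /* volatile: the register is loaded again on every iteration */
        if ((*status_reg & STATUS_READY) != 0)
            return polls;
    }
    return -1;
}
```

The bounded loop is a design choice for the sketch: it keeps the function from spinning forever when the simulated register never changes.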
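volatile qualifiers in GNU GCC
The C11 standard leaves some of this behavior to the implementation, so each compiler documents how it fills in those parts. Taking GNU GCC as an example, for what constitutes "an access to an object that has volatile-qualified type", the GCC documentation describes the concrete behavior in 6.46 When is a Volatile Object Accessed?.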
volatile declaration
// Example 1:
struct s {
    volatile int a; // only a is a volatile-qualified object
    int b;
};

// Example 2:
struct z {
    int a;
    int b;
};

// all members of the aggregate type are volatile-qualified
volatile struct z example;
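One way to check the two declarations above at compile time is a C11 generic selection on the type of a pointer to each member. This is only a sketch: POINTS_TO_VOLATILE_INT, partly_volatile, and fully_volatile are hypothetical names introduced for the illustration.

```c
#include <assert.h>

struct s { volatile int a; int b; };
struct z { int a; int b; };

/* Evaluates to 1 for "pointer to volatile int" and 0 for "pointer to int". */
#define POINTS_TO_VOLATILE_INT(ptr) \
    _Generic((ptr), volatile int *: 1, int *: 0)

struct s partly_volatile;            /* only member a is volatile-qualified */
volatile struct z fully_volatile;    /* every member is volatile-qualified */

/* Compile-time checks: the build fails if any of these does not hold. */
static_assert(POINTS_TO_VOLATILE_INT(&partly_volatile.a) == 1, "s.a is volatile");
static_assert(POINTS_TO_VOLATILE_INT(&partly_volatile.b) == 0, "s.b is not volatile");
static_assert(POINTS_TO_VOLATILE_INT(&fully_volatile.a) == 1, "z.a inherits volatile");
static_assert(POINTS_TO_VOLATILE_INT(&fully_volatile.b) == 1, "z.b inherits volatile");
```

The controlling expression of _Generic is not evaluated, so the checks only inspect types and perform no volatile access.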
volatile-qualified type casting
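The rules for casts involving volatile-qualified types were once contentious in GNU GCC 4.x. "gcc -O2 discards cast to volatile" describes how, after -O2 code optimization, an object's volatile qualifier was lost and the compiled program misbehaved. The main cause was an implementation difference in how conversions between pointer to volatile-qualified type and pointer to non-volatile-qualified type were handled; the problem has been fixed in later versions.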
pointer to volatile-qualified type -> pointer to non-volatile-qualified type
struct data {
    int a;
    int b;
};

void volatile_test(void)
{
    volatile struct data d = {10, 20};
    volatile struct data *d_ptr = &d;

    // the explicit cast discards 'volatile' (GCC reports this with -Wcast-qual)
    struct data *non_d_ptr = (struct data *)d_ptr;
}
pointer to non-volatile-qualified type -> pointer to volatile-qualified type
struct data {
    int a;
    int b;
};

int volatile_test(void)
{
    struct data d = {10, 20};
    struct data *d_ptr = &d;

    // implicit conversion that adds 'volatile' to the pointed-to type
    volatile struct data *vol_d_ptr = d_ptr;

    // the read goes through a volatile-qualified lvalue
    return vol_d_ptr->a;
}
Compiled with x86_64 gcc 12.1 (-O2), the generated assembly is shown below. The value is obtained through a real load, so the conversion to the volatile-qualified type is honored.
volatile_test:
        mov     DWORD PTR [rsp-12], 10
        mov     eax, DWORD PTR [rsp-12]
        ret
Compiled with x86_64 gcc 4.6.4 (-O2), however, the assembly shows that the value is not obtained through a load; the volatile qualifier has been lost.
volatile_test:
        mov     eax, 10
        ret
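The conversion in this direction is what "read once" style helpers build on: they access an ordinary object through a volatile-qualified lvalue to request a real load or store. The sketch below uses the hypothetical names read_int_once and write_int_once, and it assumes the compiler treats an access through a volatile-qualified lvalue as a volatile access, which is what the gcc 12.1 output above shows and the gcc 4.6.4 output does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reads *addr through a volatile-qualified lvalue. */
static inline int read_int_once(const int *addr)
{
    assert(addr != NULL);
    return *(const volatile int *)addr;
}

/* Hypothetical helper: writes *addr through a volatile-qualified lvalue. */
static inline void write_int_once(int *addr, int value)
{
    assert(addr != NULL);
    *(volatile int *)addr = value;
}
```

Whether such helpers actually produce a load or store in the generated code is worth confirming in the assembly for the compiler version in use, given the history described above.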
Bug
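With volatile reviewed, the root cause of the bug we actually hit turns out to have little to do with how volatile is defined or implemented. During code optimization, GNU GCC prefers to allocate local variables in registers, and when the object holding a volatile-qualified member happens to be small enough to fit in a register, the volatile qualifier is lost as well.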
struct data {
    volatile int a;   // only this member is volatile-qualified
    int b;
};

int volatile_test(void)
{
    struct data data = {10, 20};
    // expected: two separate loads of data.a
    return data.a + data.a;
}
Compiled with x86_64 gcc 12.1 (-O2), the generated assembly is:
volatile_test:
        mov     eax, 20
        ret
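This assembly does not match the behavior we expect: data.a is volatile-qualified, so the code should load the value from the address of data.a each time instead of folding the initial values into the constant 20.
However, adding one more member, which grows the structure to 12 bytes, causes the variable to be allocated on the stack, and the compiled result is then correct.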
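struct data {
    volatile int a;   // only this member is volatile-qualified
    int b;
    int c;            // extra member: the structure is now 12 bytes
};

int volatile_test(void)
{
    struct data data = {10, 20};
    // expected: two separate loads of data.a
    return data.a + data.a;
}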
Compiled with x86_64 gcc 12.1 (-O2), the generated assembly is:
volatile_test:
        mov     DWORD PTR [rsp-12], 10
        mov     eax, DWORD PTR [rsp-12]
        mov     edx, DWORD PTR [rsp-12]
        add     eax, edx
        ret
Here the statement return data.a + data.a; loads the value at the address of data.a twice, which matches the result we expect.