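foreach is one of the most commonly used control statements in PHP for looping over arrays.
Because it is so convenient and easy to use, it naturally hides a fairly complex implementation behind the scenes (one that is transparent to the user).
Today, let's analyze how foreach actually implements the traversal of arrays (and objects).
As we know, PHP is a scripting language: the PHP code a user writes is ultimately interpreted and executed by the PHP interpreter.
More specifically, all user-written PHP code is translated into virtual instructions (OPCODES) for PHP's virtual machine, the Zend Engine (ZE). Leaving the details aside, this means that every PHP script we write ends up as a sequence of instructions, each of which is executed by a corresponding function written in C.
So what does foreach get translated into?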
    foreach ($arr as $key => $val) {
        echo $key . '=>' . $val . "\n";
    }
During lexical analysis, foreach is recognized as a single token: T_FOREACH.
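The lexing step can be observed from userland. The following is a minimal sketch, assuming the tokenizer extension (which provides `token_get_all` and `token_name`) is enabled, as it is in default builds; the sample string and the `$names` variable are only illustrative.

```php
<?php
// List the token names that PHP's lexer produces for a foreach statement.
$code = '<?php foreach ($arr as $key => $val) { echo $key; }';

$names = array();
foreach (token_get_all($code) as $token) {
    // Multi-character tokens come back as arrays (id, text, line);
    // single characters such as "(" or ";" come back as plain strings.
    if (is_array($token)) {
        $names[] = token_name($token[0]);
    }
}

echo implode(' ', $names), "\n";
```

The printed list should include T_FOREACH, T_AS and T_DOUBLE_ARROW, which are the terminals that the grammar rule for foreach is built from.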
During parsing, it is matched by the following grammar rule:
    unticked_statement:    // a statement with no ticks bound to it
            // (omitted)
        |   T_FOREACH '(' variable T_AS
            { zend_do_foreach_begin(&$1, &$2, &$3, &$4, 1 TSRMLS_CC); }
            foreach_variable foreach_optional_arg ')'
            { zend_do_foreach_cont(&$1, &$2, &$4, &$6, &$7 TSRMLS_CC); }
            foreach_statement
            { zend_do_foreach_end(&$1, &$4 TSRMLS_CC); }
        |   T_FOREACH '(' expr_without_variable T_AS
            { zend_do_foreach_begin(&$1, &$2, &$3, &$4, 0 TSRMLS_CC); }
            variable foreach_optional_arg ')'
            { zend_check_writable_variable(&$6); zend_do_foreach_cont(&$1, &$2, &$4, &$6, &$7 TSRMLS_CC); }
            foreach_statement
            { zend_do_foreach_end(&$1, &$4 TSRMLS_CC); }
            // (omitted)
    ;
Looking closely at this grammar rule, we can see that the statement:
    foreach ($arr as $key => $val) {
        echo $key . '=>' . $val . "\n";
    }
is parsed as:
    T_FOREACH '(' variable T_AS
    { zend_do_foreach_begin('foreach', '(', $arr, 'as', 1 TSRMLS_CC); }
    foreach_variable foreach_optional_arg(T_DOUBLE_ARROW foreach_variable) ')'
    { zend_do_foreach_cont('foreach', '(', 'as', $key, $val TSRMLS_CC); }
    foreach_statement
    { zend_do_foreach_end('foreach', 'as'); }
Next, let's look at foreach_statement.
It is simply a code block, which in our case corresponds to echo $key . '=>' . $val . "\n";
    T_ECHO expr;
Clearly, the core of the foreach implementation consists of the following three functions:
- zend_do_foreach_begin
- zend_do_foreach_cont
- zend_do_foreach_end
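Of these, zend_do_foreach_begin (the code is too long, so it is given here as pseudocode) mainly does the following:
1. Records the current opline number (saved for later jumps)
2. RESETs the array (points its internal pointer at the first element)
3. Allocates a temporary variable ($val)
4. Emits the FE_FETCH opcode that fetches the element, storing the result in the temporary variable from step 3
5. Records the opline number of that fetch opcode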
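zend_do_foreach_cont, in turn:
1. Checks foreach_variable's u.EA.type to determine whether the value is taken by reference
2. Adjusts the fetch mode of the FE_FETCH generated in zend_do_foreach_begin accordingly
3. Uses the fetch opline number recorded in zend_do_foreach_begin to initialize the loop (mainly to handle loops nested inside the loop: do_begin_loop)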
Finally, zend_do_foreach_end:
1. Emits a ZEND_JMP opcode, using the opline number recorded in zend_do_foreach_begin as its target
2. Uses the current opline number to set the opline that follows the loop body, which is the target for leaving the loop
3. Ends the loop (handling nested loops: do_end_loop)
4. Cleans up the temporary variable
Of course, between zend_do_foreach_cont and zend_do_foreach_end, the parser fills in the code for foreach_statement.
This completes the sequence of OPCODES for foreach.
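For example, a script like the sample at the beginning compiles to the OPCODES below (judging from the dump itself, the dumped script first assigns range(1, 100) to $arr and joins the key and value with '-' rather than '=>'):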
    filename:       /home/huixinchen/foreach.php
    function name:  (null)
    number of ops:  17
    compiled vars:  !0 = $arr, !1 = $key, !2 = $val
    line     #  op                     fetch  ext  return  operands
    -------------------------------------------------------------------------------
       2     0  SEND_VAL                                   1
             1  SEND_VAL                                   100
             2  DO_FCALL                        2          'range'
             3  ASSIGN                                     !0, $0
       3     4  FE_RESET                           $2      !0, ->14
             5  FE_FETCH                           $3      $2, ->14
             6  ZEND_OP_DATA                       ~5
             7  ASSIGN                                     !2, $3
             8  ASSIGN                                     !1, ~5
       4     9  CONCAT                             ~7      !1, '-'
            10  CONCAT                             ~8      ~7, !2
            11  CONCAT                             ~9      ~8, '%0A'
            12  ECHO                                       ~9
       5    13  JMP                                        ->5
            14  SWITCH_FREE                                $2
       7    15  RETURN                                     1
            16* ZEND_HANDLE_EXCEPTION
Notice that op2 of FE_FETCH is 14, i.e. the opline right after JMP. In other words, once the last array element has been fetched and FE_FETCH fails, execution jumps to opline 14, which ends the loop.
Meanwhile, op1 of opline 13 (JMP) points back at FE_FETCH: it jumps unconditionally to opline 5, which is what makes the loop repeat.
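The control flow in the dump can be imitated in PHP itself with goto. This is only a sketch of the jump structure, not of how the engine iterates: the function name `foreach_as_opcodes`, the labels, and the use of `array_keys()` with a `$pos` counter as a stand-in for the internal iterator are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: mimics the FE_RESET / FE_FETCH / JMP control flow.
// The real engine walks the hash table directly; array_keys() and $pos
// merely stand in for its internal iterator.
function foreach_as_opcodes(array $arr)
{
    $out  = '';
    $keys = array_keys($arr);
    $pos  = 0;                        // FE_RESET: rewind to the first element

    fe_fetch:                         // opline 5 in the dump
    if ($pos >= count($keys)) {
        goto loop_end;                // FE_FETCH failed: jump past the JMP
    }
    $key = $keys[$pos];               // ZEND_OP_DATA + ASSIGN: the key
    $val = $arr[$key];                // ASSIGN: the value
    $pos++;

    $out .= $key . '-' . $val . "\n"; // loop body (three CONCATs and an ECHO)

    goto fe_fetch;                    // JMP ->5

    loop_end:                         // opline 14: clean up and carry on
    return $out;
}

echo foreach_as_opcodes(array('a' => 1, 'b' => 2));
// prints "a-1" and "b-2", one per line
```

The forward jump taken when the fetch fails and the unconditional backward jump at the end of the body are the two jumps the dump shows as `->14` and `->5`.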
Appendix:
    void zend_do_foreach_begin(znode *foreach_token, znode *open_brackets_token, znode *array, znode *as_token, int variable TSRMLS_DC)
    {
        zend_op *opline;
        zend_bool is_variable;
        zend_bool push_container = 0;
        zend_op dummy_opline;

        if (variable) { // the operand is a variable rather than an anonymous (temporary) array
            if (zend_is_function_or_method_call(array)) { // is it a function return value?
                is_variable = 0;
            } else {
                is_variable = 1;
            }
            /* use the bracket token to record the opline number for FE_RESET */
            open_brackets_token->u.opline_num = get_next_op_number(CG(active_op_array));
            zend_do_end_variable_parse(BP_VAR_W, 0 TSRMLS_CC); // fetch the array/object; pairs with zend_do_begin_variable_parse
            if (CG(active_op_array)->last > 0 &&
                CG(active_op_array)->opcodes[CG(active_op_array)->last-1].opcode == ZEND_FETCH_OBJ_W) {
                /* Only lock the container if we are fetching from a real container and not $this */
                if (CG(active_op_array)->opcodes[CG(active_op_array)->last-1].op1.op_type == IS_VAR) {
                    CG(active_op_array)->opcodes[CG(active_op_array)->last-1].extended_value |= ZEND_FETCH_ADD_LOCK;
                    push_container = 1;
                }
            }
        } else {
            is_variable = 0;
            open_brackets_token->u.opline_num = get_next_op_number(CG(active_op_array));
        }

        foreach_token->u.opline_num = get_next_op_number(CG(active_op_array)); // record the opline number of the array reset

        opline = get_next_op(CG(active_op_array) TSRMLS_CC); // emit the opcode that resets the array
        opline->opcode = ZEND_FE_RESET;
        opline->result.op_type = IS_VAR;
        opline->result.u.var = get_temporary_variable(CG(active_op_array));
        opline->op1 = *array;
        SET_UNUSED(opline->op2);
        opline->extended_value = is_variable ? ZEND_FE_RESET_VARIABLE : 0;

        dummy_opline.result = opline->result;
        if (push_container) {
            dummy_opline.op1 = CG(active_op_array)->opcodes[CG(active_op_array)->last-2].op1;
        } else {
            znode tmp;

            tmp.op_type = IS_UNUSED;
            dummy_opline.op1 = tmp;
        }
        zend_stack_push(&CG(foreach_copy_stack), (void *) &dummy_opline, sizeof(zend_op));

        as_token->u.opline_num = get_next_op_number(CG(active_op_array)); // record the start of the loop

        opline = get_next_op(CG(active_op_array) TSRMLS_CC);
        opline->opcode = ZEND_FE_FETCH;
        opline->result.op_type = IS_VAR;
        opline->result.u.var = get_temporary_variable(CG(active_op_array));
        opline->op1 = dummy_opline.result; // the array being iterated
        opline->extended_value = 0;
        SET_UNUSED(opline->op2);

        opline = get_next_op(CG(active_op_array) TSRMLS_CC);
        opline->opcode = ZEND_OP_DATA; // auxiliary operand used when a key is requested; ignored when the foreach has no key
        SET_UNUSED(opline->op1);
        SET_UNUSED(opline->op2);
        SET_UNUSED(opline->result);
    }

    void zend_do_foreach_cont(znode *foreach_token, const znode *open_brackets_token, const znode *as_token, znode *value, znode *key TSRMLS_DC)
    {
        zend_op *opline;
        znode dummy, value_node;
        zend_bool assign_by_ref=0;

        opline = &CG(active_op_array)->opcodes[as_token->u.opline_num]; // get the FE_FETCH opline
        if (key->op_type != IS_UNUSED) {
            znode *tmp; // swap key and val

            tmp = key;
            key = value;
            value = tmp;

            opline->extended_value |= ZEND_FE_FETCH_WITH_KEY; // both key and val must be fetched
        }

        if ((key->op_type != IS_UNUSED) && (key->u.EA.type & ZEND_PARSED_REFERENCE_VARIABLE)) { // the key cannot be taken by reference
            zend_error(E_COMPILE_ERROR, "Key element cannot be a reference");
        }

        if (value->u.EA.type & ZEND_PARSED_REFERENCE_VARIABLE) { // the value is taken by reference
            assign_by_ref = 1;
            if (!(opline-1)->extended_value) { // the extended value of the opline before FE_FETCH (the array reset) tells whether the array is anonymous
                zend_error(E_COMPILE_ERROR, "Cannot create references to elements of a temporary array expression");
            }
            opline->extended_value |= ZEND_FE_FETCH_BYREF; // fetch by reference
            CG(active_op_array)->opcodes[foreach_token->u.opline_num].extended_value |= ZEND_FE_RESET_REFERENCE; // reset the original array
        } else {
            zend_op *foreach_copy;
            zend_op *fetch = &CG(active_op_array)->opcodes[foreach_token->u.opline_num];
            zend_op *end = &CG(active_op_array)->opcodes[open_brackets_token->u.opline_num];

            /* Change "write context" into "read context" */
            fetch->extended_value = 0; /* reset ZEND_FE_RESET_VARIABLE */
            while (fetch != end) {
                --fetch;
                if (fetch->opcode == ZEND_FETCH_DIM_W && fetch->op2.op_type == IS_UNUSED) {
                    zend_error(E_COMPILE_ERROR, "Cannot use [] for reading");
                }
                fetch->opcode -= 3; /* FETCH_W -> FETCH_R */
            }
            /* prevent double SWITCH_FREE */
            zend_stack_top(&CG(foreach_copy_stack), (void **) &foreach_copy);
            foreach_copy->op1.op_type = IS_UNUSED;
        }

        value_node = opline->result;

        if (assign_by_ref) {
            zend_do_end_variable_parse(value, BP_VAR_W, 0 TSRMLS_CC); // fetch the value (by reference)
            zend_do_assign_ref(NULL, value, &value_node TSRMLS_CC); // marks the value node's type as IS_VAR
        } else {
            zend_do_assign(&dummy, value, &value_node TSRMLS_CC); // fetch a copy of the value
            zend_do_free(&dummy TSRMLS_CC);
        }

        if (key->op_type != IS_UNUSED) {
            znode key_node;

            opline = &CG(active_op_array)->opcodes[as_token->u.opline_num+1];
            opline->result.op_type = IS_TMP_VAR;
            opline->result.u.EA.type = 0;
            opline->result.u.opline_num = get_temporary_variable(CG(active_op_array));
            key_node = opline->result;

            zend_do_assign(&dummy, key, &key_node TSRMLS_CC);
            zend_do_free(&dummy TSRMLS_CC);
        }

        do_begin_loop(TSRMLS_C);

        INC_BPC(CG(active_op_array));
    }

    void zend_do_foreach_end(znode *foreach_token, znode *as_token TSRMLS_DC)
    {
        zend_op *container_ptr;
        zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC); // emit the JMP opcode

        opline->opcode = ZEND_JMP;
        opline->op1.u.opline_num = as_token->u.opline_num; // make JMP target the FE_FETCH opline
        SET_UNUSED(opline->op1);
        SET_UNUSED(opline->op2);

        CG(active_op_array)->opcodes[foreach_token->u.opline_num].op2.u.opline_num = get_next_op_number(CG(active_op_array)); // set the opline to jump to when leaving the loop
        CG(active_op_array)->opcodes[as_token->u.opline_num].op2.u.opline_num = get_next_op_number(CG(active_op_array)); // same as above

        do_end_loop(as_token->u.opline_num, 1 TSRMLS_CC); // set up for nested loops

        zend_stack_top(&CG(foreach_copy_stack), (void **) &container_ptr);
        generate_free_foreach_copy(container_ptr TSRMLS_CC);
        zend_stack_del_top(&CG(foreach_copy_stack));

        DEC_BPC(CG(active_op_array)); // for PHP's interactive mode
    }
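It is also worth paying attention to whether foreach hands out values or references.
In PHP you can traverse an array with either for or foreach. The foreach syntax is foreach ($arr as $k => $v): it walks the array, assigning each index to $k and each element value to $v. So is that assignment by value or by reference? Consider the following example: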
    $arr = array(
        array('id' => 1, 'name' => 'name1'),
        array('id' => 2, 'name' => 'name2'),
    );
    foreach ($arr as $obj) {
        $obj['id'] = $obj['id'];
        $obj['name'] = $obj['name'] . '-modify';
    }
    print_r($arr);

    // Output:
    Array
    (
        [0] => Array
            (
                [id] => 1
                [name] => name1
            )

        [1] => Array
            (
                [id] => 2
                [name] => name2
            )

    )
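As the output shows, modifying $obj inside the foreach loop had no effect on the elements of $arr, so the assignment here is by value, not by reference. What if we do need to modify the elements of $arr? Put an "&" in front of the loop variable, for example: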
    foreach ($arr as &$obj) {
        $obj['id'] = $obj['id'];
        $obj['name'] = $obj['name'] . '-modify';
    }
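To confirm that the by-reference form really writes through to the array, here is a small self-contained sketch based on the same data. The `unset($obj)` after the loop is an addition that the original example does not have: the loop variable remains a reference to the last element once the loop ends, so it is commonly unset to avoid overwriting that element by accident later.

```php
<?php
$arr = array(
    array('id' => 1, 'name' => 'name1'),
    array('id' => 2, 'name' => 'name2'),
);

foreach ($arr as &$obj) {
    $obj['name'] = $obj['name'] . '-modify';
}
// $obj still references the last element here; break the link so that a
// later assignment to $obj cannot silently overwrite $arr[1].
unset($obj);

echo $arr[0]['name'], "\n"; // name1-modify
echo $arr[1]['name'], "\n"; // name2-modify
```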
Now consider another example, in which the array holds objects:
    $arr = array(
        (object)(array('id' => 1, 'name' => 'name1')),
        (object)(array('id' => 2, 'name' => 'name2')),
    );
    foreach ($arr as $obj) {
        $obj->name = $obj->name . '-modify';
    }
    print_r($arr);

    // Output:
    Array
    (
        [0] => stdClass Object
            (
                [id] => 1
                [name] => name1-modify
            )

        [1] => stdClass Object
            (
                [id] => 2
                [name] => name2-modify
            )

    )
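This time the objects in the original array have been modified, so here the assignment appears to be by reference rather than by value.
To sum up: when the array holds ordinary values (scalars or arrays), foreach assigns by value; when it holds objects, what gets copied is the object handle (in effect, the address), so property changes made through the loop variable show up in the original array.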
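The difference between sharing an object and being a true reference can be checked with a short sketch, assuming PHP 5 or later, where object variables hold handles. Changing a property through the loop variable reaches the object stored in the array, whereas assigning a new object to the loop variable only rebinds that variable and leaves the array element alone.

```php
<?php
$arr = array(
    (object) array('id' => 1, 'name' => 'name1'),
    (object) array('id' => 2, 'name' => 'name2'),
);

foreach ($arr as $obj) {
    $obj->name = $obj->name . '-modify'; // goes through the handle: visible in $arr
    $obj = new stdClass();               // rebinds only the local variable $obj
}

echo $arr[0]->name, "\n";     // name1-modify
var_dump(isset($arr[1]->id)); // bool(true): the element itself was not replaced
```

To replace the elements themselves rather than change their properties, the loop would still need the "&" form.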