
Redis Source Code Analysis: Cluster Manual Failover and Slave Migration Explained

2020-03-17 12:36:01
Contributed by: a site user

1: Manual failover

Redis Cluster supports manual failover. Sending the CLUSTER FAILOVER command to a slave makes it start the failover procedure while its master is still online: the slave is promoted to be the new master, and the old master is demoted to a slave.

To avoid losing data, the following steps take place after CLUSTER FAILOVER is sent to a slave:

a: On receiving the command, the slave sends a CLUSTERMSG_TYPE_MFSTART packet to its master;
b: On receiving that packet, the master pauses all of its clients, that is, it stops processing client commands for 10 seconds; the heartbeat packets it sends during this period carry the CLUSTERMSG_FLAG0_PAUSED flag;
c: When the slave receives a heartbeat from its master that carries CLUSTERMSG_FLAG0_PAUSED, it reads the master's current replication offset from the packet. Only once its own replication offset has reached that value does it start the failover procedure: start an election, count the votes, win the election, promote itself to master and update the configuration;

The CLUSTER FAILOVER command accepts two options, FORCE and TAKEOVER, which change the procedure described above.

With FORCE, the slave does not interact with the master and the master does not pause its clients. The slave starts the failover procedure immediately: start an election, count the votes, win the election, promote itself to master and update the configuration.

TAKEOVER is blunter still: the slave does not start an election at all. It promotes itself to master directly, takes over the slots of its old master, bumps its own configEpoch and then updates the configuration.

Consequently, with FORCE or TAKEOVER the master may already be offline, whereas a plain CLUSTER FAILOVER without options requires the master to be online.
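The differences between the three variants can be summarized in a few lines of C. This is only a sketch of the description above: the `mf_mode` and `mf_plan` types and the `mf_plan_for` function are hypothetical names introduced here for illustration and do not exist in Redis.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: none of these types are part of Redis. */
typedef enum { MF_DEFAULT, MF_FORCE, MF_TAKEOVER } mf_mode;

typedef struct {
    bool contacts_master;    /* sends MFSTART; the master pauses its clients */
    bool runs_election;      /* must win a vote before being promoted */
    bool needs_master_alive; /* command is rejected if the master is down */
} mf_plan;

/* Map a CLUSTER FAILOVER variant to the steps it performs. */
mf_plan mf_plan_for(mf_mode mode) {
    mf_plan p = { false, false, false };
    switch (mode) {
    case MF_DEFAULT:
        p.contacts_master = true;
        p.runs_election = true;
        p.needs_master_alive = true;
        break;
    case MF_FORCE:
        /* No coordination with the master, but still an election. */
        p.runs_election = true;
        break;
    case MF_TAKEOVER:
        /* No coordination and no election. */
        break;
    }
    return p;
}
```

Calling `mf_plan_for(MF_TAKEOVER)` yields a plan with all three fields false, which matches the "no election, no coordination" behaviour described for TAKEOVER.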

In the clusterCommand function, the part that handles the CLUSTER FAILOVER command is as follows:

    else if (!strcasecmp(c->argv[1]->ptr,"failover") &&
             (c->argc == 2 || c->argc == 3)) {
        /* CLUSTER FAILOVER [FORCE|TAKEOVER] */
        int force = 0, takeover = 0;

        if (c->argc == 3) {
            if (!strcasecmp(c->argv[2]->ptr,"force")) {
                force = 1;
            } else if (!strcasecmp(c->argv[2]->ptr,"takeover")) {
                takeover = 1;
                force = 1; /* Takeover also implies force. */
            } else {
                addReply(c,shared.syntaxerr);
                return;
            }
        }

        /* Check preconditions. */
        if (nodeIsMaster(myself)) {
            addReplyError(c,"You should send CLUSTER FAILOVER to a slave");
            return;
        } else if (myself->slaveof == NULL) {
            addReplyError(c,"I'm a slave but my master is unknown to me");
            return;
        } else if (!force &&
                   (nodeFailed(myself->slaveof) ||
                    myself->slaveof->link == NULL))
        {
            addReplyError(c,"Master is down or failed, "
                            "please use CLUSTER FAILOVER FORCE");
            return;
        }
        resetManualFailover();
        server.cluster->mf_end = mstime() + REDIS_CLUSTER_MF_TIMEOUT;

        if (takeover) {
            /* A takeover does not perform any initial check. It just
             * generates a new configuration epoch for this node without
             * consensus, claims the master's slots, and broadcast the new
             * configuration. */
            redisLog(REDIS_WARNING,"Taking over the master (user request).");
            clusterBumpConfigEpochWithoutConsensus();
            clusterFailoverReplaceYourMaster();
        } else if (force) {
            /* If this is a forced failover, we don't need to talk with our
             * master to agree about the offset. We just failover taking over
             * it without coordination. */
            redisLog(REDIS_WARNING,"Forced failover user request accepted.");
            server.cluster->mf_can_start = 1;
        } else {
            redisLog(REDIS_WARNING,"Manual failover user request accepted.");
            clusterSendMFStart(myself->slaveof);
        }
        addReply(c,shared.ok);
    }

First, the code checks whether the last argument of the command is FORCE or TAKEOVER; any other third argument is rejected with a syntax error.

If the current node is a master, or it is a slave whose master is unknown, or its master is failed or its link is down and the command carries neither FORCE nor TAKEOVER, an error is returned to the client and the function returns;

Next, resetManualFailover is called to reset the manual failover state;

mf_end is set to the current time plus 5 seconds. This field is the timeout of the manual failover, and a non-zero value also indicates that a manual failover is in progress;

If the last argument is TAKEOVER, the slave that received the command skips the election, takes over its master's slots directly and becomes the new master. It therefore first calls clusterBumpConfigEpochWithoutConsensus to produce a new configEpoch for the later configuration update, then calls clusterFailoverReplaceYourMaster to turn itself into the new master and broadcast this change to all nodes in the cluster;

If the last argument is FORCE, the slave may start the election right away instead of first catching up with the master's replication offset. mf_can_start is therefore set to 1, so that clusterHandleSlaveFailover can start the failover procedure even though the master is not failed or the slave's replicated data is somewhat stale;

If the last argument is neither FORCE nor TAKEOVER, the slave must first send a CLUSTERMSG_TYPE_MFSTART packet to its master, so it calls clusterSendMFStart to send that packet;

When the master receives the CLUSTERMSG_TYPE_MFSTART packet, it is handled in clusterProcessPacket like this:

    else if (type == CLUSTERMSG_TYPE_MFSTART) {
        /* This message is acceptable only if I'm a master and the sender
         * is one of my slaves. */
        if (!sender || sender->slaveof != myself) return 1;
        /* Manual failover requested from slaves. Initialize the state
         * accordingly. */
        resetManualFailover();
        server.cluster->mf_end = mstime() + REDIS_CLUSTER_MF_TIMEOUT;
        server.cluster->mf_slave = sender;
        pauseClients(mstime()+(REDIS_CLUSTER_MF_TIMEOUT*2));
        redisLog(REDIS_WARNING,"Manual failover requested by slave %.40s.",
            sender->name);
    }

If the sender cannot be found in the node dictionary, or the sender's master is not the current node, the function returns immediately;

resetManualFailover is called to reset the manual failover state;

mf_end is then set to the current time plus 5 seconds. This field is the timeout of the manual failover, and a non-zero value also indicates that a manual failover is in progress;

mf_slave is then set to sender; this field identifies the slave that is performing the manual failover;

Finally pauseClients is called, so that all clients are paused for the next 10 seconds;
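The two deadlines set by this handler are related by a factor of two. The arithmetic can be checked with a small sketch; the constant value of 5000 ms is an assumption inferred from the "5 seconds" stated above, and `mf_deadlines` and `mf_deadlines_at` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 5000 ms, inferred from the 5 seconds mentioned in the text. */
#define MF_TIMEOUT_MS 5000

typedef struct {
    int64_t mf_end;      /* failover deadline; 0 means "not in progress" */
    int64_t pause_until; /* clients stay paused until this time */
} mf_deadlines;

/* Compute both deadlines from the current time in milliseconds,
 * mirroring the two expressions in the MFSTART handler. */
mf_deadlines mf_deadlines_at(int64_t now_ms) {
    mf_deadlines d;
    d.mf_end = now_ms + MF_TIMEOUT_MS;
    d.pause_until = now_ms + MF_TIMEOUT_MS * 2;
    return d;
}
```

For a packet processed at time 1000 ms, this gives a failover deadline of 6000 ms and a client pause that lasts until 11000 ms.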

When the master builds the header of a heartbeat packet and finds that a manual failover is in progress, it adds the CLUSTERMSG_FLAG0_PAUSED flag to the header:

    void clusterBuildMessageHdr(clusterMsg *hdr, int type) {
        ...
        /* Set the message flags. */
        if (nodeIsMaster(myself) && server.cluster->mf_end)
            hdr->mflags[0] |= CLUSTERMSG_FLAG0_PAUSED;
        ...
    }

 

The slave processes incoming packets in clusterProcessPacket. As soon as it sees a packet from its master carrying the CLUSTERMSG_FLAG0_PAUSED flag, it records the master's replication offset in server.cluster->mf_master_offset:

    int clusterProcessPacket(clusterLink *link) {
        ...
        /* Check if the sender is a known node. */
        sender = clusterLookupNode(hdr->sender);
        if (sender && !nodeInHandshake(sender)) {
            ...
            /* Update the replication offset info for this node. */
            sender->repl_offset = ntohu64(hdr->offset);
            sender->repl_offset_time = mstime();
            /* If we are a slave performing a manual failover and our master
             * sent its offset while already paused, populate the MF state. */
            if (server.cluster->mf_end &&
                nodeIsSlave(myself) &&
                myself->slaveof == sender &&
                hdr->mflags[0] & CLUSTERMSG_FLAG0_PAUSED &&
                server.cluster->mf_master_offset == 0)
            {
                server.cluster->mf_master_offset = sender->repl_offset;
                redisLog(REDIS_WARNING,
                    "Received replication offset for paused "
                    "master manual failover: %lld",
                    server.cluster->mf_master_offset);
            }
        }
    }

In the cluster timer function clusterCron, the slave calls clusterHandleManualFailover. Once the slave's own replication offset has reached server.cluster->mf_master_offset, this function sets server.cluster->mf_can_start to 1, so that clusterHandleSlaveFailover, which is called next, starts the failover procedure immediately.

The code of clusterHandleManualFailover is as follows:

    void clusterHandleManualFailover(void) {
        /* Return ASAP if no manual failover is in progress. */
        if (server.cluster->mf_end == 0) return;
        /* If mf_can_start is non-zero, the failover was already triggered so the
         * next steps are performed by clusterHandleSlaveFailover(). */
        if (server.cluster->mf_can_start) return;
        if (server.cluster->mf_master_offset == 0) return; /* Wait for offset... */
        if (server.cluster->mf_master_offset == replicationGetSlaveOffset()) {
            /* Our replication offset matches the master replication offset
             * announced after clients were paused. We can start the failover. */
            server.cluster->mf_can_start = 1;
            redisLog(REDIS_WARNING,
                "All master replication stream processed, "
                "manual failover can start.");
        }
    }

 

Both slaves and masters call manualFailoverCheckTimeout from the cluster timer function clusterCron. As soon as the manual failover timeout has expired, the manual failover state is reset, which aborts the procedure. The code of manualFailoverCheckTimeout is as follows:

    /* If a manual failover timed out, abort it. */
    void manualFailoverCheckTimeout(void) {
        if (server.cluster->mf_end && server.cluster->mf_end < mstime()) {
            redisLog(REDIS_WARNING,"Manual failover timed out.");
            resetManualFailover();
        }
    }
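Taken together, the offset check and the timeout check behave like a small state machine that is advanced on every timer tick. The following is a simplified, self-contained model of the slave side, not the Redis implementation: `mf_state`, `mf_status` and `mf_tick` are hypothetical names, the clock and the slave's offset are passed in as parameters, and running the timeout check before the offset check is a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int64_t mf_end;           /* deadline in ms; 0 = no manual failover */
    int64_t mf_master_offset; /* offset sent by the paused master; 0 = unknown */
    int     mf_can_start;     /* set once the slave has caught up */
} mf_state;

typedef enum { MF_IDLE, MF_WAITING, MF_READY, MF_TIMED_OUT } mf_status;

/* One timer tick on the slave: timeout check first, then offset check. */
mf_status mf_tick(mf_state *s, int64_t now_ms, int64_t my_offset) {
    if (s->mf_end == 0) return MF_IDLE;
    if (s->mf_end < now_ms) {
        /* Deadline passed: abort by clearing all manual failover state. */
        s->mf_end = 0;
        s->mf_master_offset = 0;
        s->mf_can_start = 0;
        return MF_TIMED_OUT;
    }
    if (s->mf_can_start) return MF_READY;
    if (s->mf_master_offset == 0) return MF_WAITING; /* offset not known yet */
    if (s->mf_master_offset == my_offset) {
        s->mf_can_start = 1; /* caught up with the paused master */
        return MF_READY;
    }
    return MF_WAITING;
}
```

A slave that learns a master offset of 100 keeps returning `MF_WAITING` while its own offset is 90 and switches to `MF_READY` on the first tick where its offset is exactly 100, provided the deadline has not passed.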

2: Slave migration

To improve availability, each master in a Redis cluster is normally configured with several slaves. If these master-slave relationships were fixed, however, an orphaned master could appear over time, that is, a master with no remaining slave available for failover. Once such a master goes offline, the whole cluster becomes unavailable.

Redis Cluster therefore provides slave migration. In brief: as soon as an orphaned master is detected in the cluster, some slave A automatically becomes a slave of that orphaned master. Slave A satisfies the following conditions: A's master has the largest number of attached slaves, and among those slaves A has the smallest node ID (The acting slave is the slave among the masters with the maximum number of attached slaves, that is not in FAIL state and has the smallest node ID).

This feature is implemented in the cluster timer function clusterCron. The relevant code is as follows:

    void clusterCron(void) {
        ...
        orphaned_masters = 0;
        max_slaves = 0;
        this_slaves = 0;
        di = dictGetSafeIterator(server.cluster->nodes);
        while((de = dictNext(di)) != NULL) {
            clusterNode *node = dictGetVal(de);
            now = mstime(); /* Use an updated time at every iteration. */
            mstime_t delay;

            if (node->flags &
                (REDIS_NODE_MYSELF|REDIS_NODE_NOADDR|REDIS_NODE_HANDSHAKE))
                    continue;

            /* Orphaned master check, useful only if the current instance
             * is a slave that may migrate to another master. */
            if (nodeIsSlave(myself) && nodeIsMaster(node) && !nodeFailed(node)) {
                int okslaves = clusterCountNonFailingSlaves(node);

                /* A master is orphaned if it is serving a non-zero number of
                 * slots, have no working slaves, but used to have at least one
                 * slave. */
                if (okslaves == 0 && node->numslots > 0 && node->numslaves)
                    orphaned_masters++;
                if (okslaves > max_slaves) max_slaves = okslaves;
                if (nodeIsSlave(myself) && myself->slaveof == node)
                    this_slaves = okslaves;
            }
            ...
        }
        ...
        if (nodeIsSlave(myself)) {
            ...
            /* If there are orphaned slaves, and we are a slave among the masters
             * with the max number of non-failing slaves, consider migrating to
             * the orphaned masters. Note that it does not make sense to try
             * a migration if there is no master with at least *two* working
             * slaves. */
            if (orphaned_masters && max_slaves >= 2 && this_slaves == max_slaves)
                clusterHandleSlaveMigration(max_slaves);
        }
        ...
    }

The code iterates over the dictionary server.cluster->nodes. Every node that is not the current node and is neither in the REDIS_NODE_NOADDR state nor in the handshake state is processed as follows:

If the current node is a slave, and node is a master that is not flagged as failed, clusterCountNonFailingSlaves is called first to count node's non-failing slaves, okslaves. If okslaves is 0, the master serves a non-zero number of slots, and it has at least one slave attached (node->numslaves is non-zero), then node is an orphaned master and orphaned_masters is incremented. If okslaves is greater than max_slaves, max_slaves is set to okslaves, so max_slaves ends up holding the largest number of non-failing slaves owned by any single master. If the current node is one of node's slaves, okslaves is recorded in this_slaves. All of this is preparation for the slave migration that follows;

After all nodes have been visited, if there is at least one orphaned master, max_slaves is greater than or equal to 2, and the current node is a slave of a master that owns this maximum number of non-failing slaves, clusterHandleSlaveMigration is called. If its conditions are met, the migration is performed, that is, the current slave becomes a slave of an orphaned master.
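The bookkeeping done by this loop can be reproduced on a plain array. The sketch below is a simplified model: `master_view`, `scan_result`, `scan_masters` and `should_consider_migration` are hypothetical names, and each entry is assumed to already hold the count of non-failing slaves that clusterCountNonFailingSlaves would return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one master as seen by the scanning slave. */
typedef struct {
    int failed;    /* the master itself is flagged as failed */
    int numslots;  /* number of slots it serves */
    int numslaves; /* slaves attached, failing ones included */
    int okslaves;  /* slaves that are currently not failing */
    int is_mine;   /* 1 if this is the scanning slave's own master */
} master_view;

typedef struct {
    int orphaned_masters;
    int max_slaves;
    int this_slaves;
} scan_result;

/* Collect the three counters used to decide whether to migrate. */
scan_result scan_masters(const master_view *m, size_t n) {
    scan_result r = { 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (m[i].failed) continue;
        /* Orphaned: serves slots, had slaves, none of them is working. */
        if (m[i].okslaves == 0 && m[i].numslots > 0 && m[i].numslaves)
            r.orphaned_masters++;
        if (m[i].okslaves > r.max_slaves) r.max_slaves = m[i].okslaves;
        if (m[i].is_mine) r.this_slaves = m[i].okslaves;
    }
    return r;
}

/* The condition under which a migration is considered at all. */
int should_consider_migration(scan_result r) {
    return r.orphaned_masters && r.max_slaves >= 2 &&
           r.this_slaves == r.max_slaves;
}
```

Note that a master with slots but `numslaves == 0` is not counted as orphaned in this model, which mirrors the "used to have at least one slave" condition in the code comment.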

The code of clusterHandleSlaveMigration is as follows:

    void clusterHandleSlaveMigration(int max_slaves) {
        int j, okslaves = 0;
        clusterNode *mymaster = myself->slaveof, *target = NULL, *candidate = NULL;
        dictIterator *di;
        dictEntry *de;

        /* Step 1: Don't migrate if the cluster state is not ok. */
        if (server.cluster->state != REDIS_CLUSTER_OK) return;

        /* Step 2: Don't migrate if my master will not be left with at least
         *         'migration-barrier' slaves after my migration. */
        if (mymaster == NULL) return;
        for (j = 0; j < mymaster->numslaves; j++)
            if (!nodeFailed(mymaster->slaves[j]) &&
                !nodeTimedOut(mymaster->slaves[j])) okslaves++;
        if (okslaves <= server.cluster_migration_barrier) return;

        /* Step 3: Idenitfy a candidate for migration, and check if among the
         * masters with the greatest number of ok slaves, I'm the one with the
         * smaller node ID.
         *
         * Note that this means that eventually a replica migration will occurr
         * since slaves that are reachable again always have their FAIL flag
         * cleared. At the same time this does not mean that there are no
         * race conditions possible (two slaves migrating at the same time), but
         * this is extremely unlikely to happen, and harmless. */
        candidate = myself;
        di = dictGetSafeIterator(server.cluster->nodes);
        while((de = dictNext(di)) != NULL) {
            clusterNode *node = dictGetVal(de);
            int okslaves;

            /* Only iterate over working masters. */
            if (nodeIsSlave(node) || nodeFailed(node)) continue;
            /* If this master never had slaves so far, don't migrate. We want
             * to migrate to a master that remained orphaned, not masters that
             * were never configured to have slaves. */
            if (node->numslaves == 0) continue;
            okslaves = clusterCountNonFailingSlaves(node);

            if (okslaves == 0 && target == NULL && node->numslots > 0)
                target = node;

            if (okslaves == max_slaves) {
                for (j = 0; j < node->numslaves; j++) {
                    if (memcmp(node->slaves[j]->name,
                               candidate->name,
                               REDIS_CLUSTER_NAMELEN) < 0)
                    {
                        candidate = node->slaves[j];
                    }
                }
            }
        }
        dictReleaseIterator(di);

        /* Step 4: perform the migration if there is a target, and if I'm the
         * candidate. */
        if (target && candidate == myself) {
            redisLog(REDIS_WARNING,"Migrating to orphaned master %.40s",
                target->name);
            clusterSetMaster(target);
        }
    }

 

If the cluster state is not REDIS_CLUSTER_OK, the function returns immediately; if the current slave has no master, it also returns immediately;

Next, it counts okslaves, the number of slaves of the current slave's master that are neither failed nor timed out. If okslaves is less than or equal to the migration barrier server.cluster_migration_barrier, the function returns;

It then iterates over the dictionary server.cluster->nodes and handles each node as follows:

If node is a slave or is flagged as failed, it is skipped; if node has no slaves attached, it is skipped as well, because the goal is to migrate to a master that has become orphaned, not to one that was never configured with slaves;

clusterCountNonFailingSlaves is called to count okslaves, the number of non-failing slaves of node. If okslaves is 0 and node's numslots is greater than 0, this master has slaves but all of them are down, so an orphaned master has been found; the first such master is stored in target;

If okslaves equals the max_slaves argument, node is one of the masters with the most non-failing slaves. The node IDs of all its slaves are compared with the ID of candidate, which is initially the current node, and whenever a slave has a smaller ID, candidate is set to that slave. (Strictly speaking, once candidate is no longer the current node, the function could return right away.)

After all nodes have been visited, if an orphaned master was found and the current node has the smallest node ID, clusterSetMaster is called to make target the master of the current node and start replication from it.
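The "smallest node ID wins" rule relies on a byte-wise comparison of the 40-character node IDs with memcmp. Below is a minimal sketch of that selection step in isolation. `master_info` and `is_migration_candidate` are hypothetical names, each master is assumed to have at most four slaves, and every ID string is assumed to be at least 40 bytes long.

```c
#include <assert.h>
#include <string.h>

#define NAMELEN 40 /* node IDs are 40 characters long */

/* Hypothetical, simplified description of one master and its slaves. */
typedef struct {
    const char *slave_ids[4]; /* IDs of attached slaves, NAMELEN bytes each */
    int numslaves;
    int okslaves;             /* slaves that are currently not failing */
} master_info;

/* Return 1 if `me` has the smallest ID among all slaves of the masters
 * that have exactly max_slaves working slaves, 0 otherwise. */
int is_migration_candidate(const char *me, const master_info *m, int n,
                           int max_slaves) {
    const char *candidate = me;
    for (int i = 0; i < n; i++) {
        if (m[i].okslaves != max_slaves) continue;
        for (int j = 0; j < m[i].numslaves; j++) {
            /* A strictly smaller ID replaces the current candidate. */
            if (memcmp(m[i].slave_ids[j], candidate, NAMELEN) < 0)
                candidate = m[i].slave_ids[j];
        }
    }
    return candidate == me;
}
```

Slaves of masters that do not have `max_slaves` working slaves are ignored, so a slave with an even smaller ID elsewhere in the cluster does not prevent the migration.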

3: configEpoch collisions

Within a cluster, it is harmless in itself for masters responsible for different slots to share the same configEpoch. However, manual intervention or a bug may cause masters with the same configEpoch to claim the same slots, which is a fatal problem in a distributed system. Redis therefore requires every node in the cluster to have a different configEpoch.

When a slave is promoted to master through an election, it obtains a new configEpoch that is greater than the configEpoch of every other node, so no duplicate configEpoch is created (two slaves cannot both win the same election). At the end of a resharding started by an administrator, however, the node that imported the slots updates its own configEpoch without needing the agreement of other nodes. A manual forced failover likewise lets a slave update its configEpoch without the agreement of other nodes. Both cases can leave several masters with the same configEpoch.

An algorithm is therefore needed to guarantee that all nodes in the cluster have different configEpoch values. It works as follows: when a master receives a heartbeat packet from another master and finds that the configEpoch in the packet equals its own, it calls clusterHandleConfigEpochCollision to resolve the collision.

The code of clusterHandleConfigEpochCollision is as follows:

    void clusterHandleConfigEpochCollision(clusterNode *sender) {
        /* Prerequisites: nodes have the same configEpoch and are both masters. */
        if (sender->configEpoch != myself->configEpoch ||
            !nodeIsMaster(sender) || !nodeIsMaster(myself)) return;
        /* Don't act if the colliding node has a smaller Node ID. */
        if (memcmp(sender->name,myself->name,REDIS_CLUSTER_NAMELEN) <= 0) return;
        /* Get the next ID available at the best of this node knowledge. */
        server.cluster->currentEpoch++;
        myself->configEpoch = server.cluster->currentEpoch;
        clusterSaveConfigOrDie(1);
        redisLog(REDIS_VERBOSE,
            "WARNING: configEpoch collision with node %.40s."
            " configEpoch set to %llu",
            sender->name,
            (unsigned long long) myself->configEpoch);
    }

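If the sender's configEpoch differs from the current node's configEpoch, or the sender is not a master, or the current node is not a master, the function returns immediately;

If the sender's node ID is smaller than or equal to the current node's ID, the function returns immediately;

So it is the node with the smaller ID that obtains the larger configEpoch: it first increments its own currentEpoch, then assigns currentEpoch to its configEpoch.

In this way, even if several nodes share the same configEpoch, in the end only the node with the largest node ID keeps its configEpoch unchanged. All the others bump their configEpoch, and to different values. The node with the smallest node ID often ends up with the largest configEpoch, although the exact outcome depends on the order in which heartbeats arrive.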
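The convergence argument can be tried out on a toy model. The sketch below is not the Redis implementation: `master_node`, `handle_collision`, `deliver_heartbeat` and `settle` are hypothetical names, all nodes are assumed to be masters, and the model additionally assumes that every heartbeat carries the sender's currentEpoch, which the receiver adopts when it is larger than its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NAMELEN 40

typedef struct {
    char name[NAMELEN + 1]; /* node ID, zero padded */
    uint64_t config_epoch;
    uint64_t current_epoch;
} master_node;

/* Model of the collision handler: `me` bumps its epoch only when it
 * collides with a sender whose node ID is larger. Returns 1 on change. */
int handle_collision(master_node *me, const master_node *sender) {
    if (sender->config_epoch != me->config_epoch) return 0;
    if (memcmp(sender->name, me->name, NAMELEN) <= 0) return 0;
    me->current_epoch++;
    me->config_epoch = me->current_epoch;
    return 1;
}

/* Assumption of this sketch: the receiver first adopts the sender's
 * currentEpoch if it is larger, then checks for a collision. */
int deliver_heartbeat(master_node *receiver, const master_node *sender) {
    if (sender->current_epoch > receiver->current_epoch)
        receiver->current_epoch = sender->current_epoch;
    return handle_collision(receiver, sender);
}

/* Exchange heartbeats between all pairs until nothing changes. */
void settle(master_node *nodes, int n) {
    int changed;
    do {
        changed = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (i != j)
                    changed |= deliver_heartbeat(&nodes[i], &nodes[j]);
    } while (changed);
}
```

Starting three masters named "a", "b" and "c" at configEpoch 5 and calling `settle` leaves "c", the largest ID, at 5 and gives the other two distinct, larger values. Which of the two ends up highest depends on the delivery order chosen in `settle`.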

Summary

That concludes this walkthrough of manual failover, slave migration and configEpoch collision handling in the Redis cluster source code. If you spot any shortcomings, please point them out in the comments.

