

Examples of forward and backward maximum matching word segmentation in Python

2020-02-15 23:40:55
Source: reposted
Contributed by: a site user

Forward maximum matching

# -*- coding:utf-8 -*-
# Note: this listing targets Python 2 (print statements, the unicode type).

CODEC = 'utf-8'


def u(s, encoding):
  'Convert a byte string in the given encoding to unicode.'
  if isinstance(s, unicode):
    return s
  else:
    return unicode(s, encoding)


def fwd_mm_seg(wordDict, maxLen, text):
  'Forward maximum matching segmentation.'
  wordList = []
  segStr = text
  segStrLen = len(segStr)

  # Debug output: dump the dictionary entries.
  for word in wordDict:
    print 'word: ', word
  print "\n"

  while segStrLen > 0:
    # Take a window of at most maxLen characters from the front.
    if segStrLen > maxLen:
      wordLen = maxLen
    else:
      wordLen = segStrLen
    subStr = segStr[0:wordLen]
    print "subStr: ", subStr

    # Shrink the window from the right until it is a dictionary word
    # or only one character is left.
    while wordLen > 1:
      if subStr in wordDict:
        print "subStr1: %r" % subStr
        break
      else:
        print "subStr2: %r" % subStr
        wordLen = wordLen - 1
        subStr = subStr[0:wordLen]
    # print "subStr3: ", subStr
    wordList.append(subStr)
    segStr = segStr[wordLen:]
    segStrLen = segStrLen - wordLen

  for wordstr in wordList:
    print "wordstr: ", wordstr
  return wordList


def main():
  # The dictionary file is read one word per line and decoded as UTF-8.
  fp_dict = open('words.dic')
  wordDict = {}
  for eachWord in fp_dict:
    wordDict[u(eachWord.strip(), CODEC)] = 1
  fp_dict.close()
  segStr = u'你好世界hello world'
  print segStr
  wordList = fwd_mm_seg(wordDict, 10, segStr)
  print "==".join(wordList)


if __name__ == '__main__':
  main()
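For readers on Python 3, the same left-to-right scan can be sketched more compactly. This is a minimal sketch, not the original author's code: the function name `forward_max_match` and the in-memory set used as the dictionary are assumptions made for illustration (the original loads its words from `words.dic`).

```python
def forward_max_match(text, dictionary, max_len):
    """Greedy left-to-right segmentation that prefers the longest dictionary word."""
    words = []
    pos = 0
    while pos < len(text):
        # Start with the longest window allowed, then shrink it from the right.
        size = min(max_len, len(text) - pos)
        while size > 1 and text[pos:pos + size] not in dictionary:
            size -= 1
        # If nothing longer matched, size is 1 and the single character is emitted as-is.
        words.append(text[pos:pos + size])
        pos += size
    return words


# Hypothetical dictionary standing in for the contents of words.dic.
demo_dict = {"你好", "世界", "hello", "world"}
print("==".join(forward_max_match("你好世界hello world", demo_dict, 10)))
# prints: 你好==世界==hello== ==world
```

The space between `hello` and `world` is not in the dictionary, so it falls through to the single-character case and appears as its own token.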

Backward maximum matching

# -*- coding:utf-8 -*-
# Note: this listing targets Python 2 (print statements, the unicode type).

CODEC = 'utf-8'


def u(s, encoding):
  'Convert a byte string in the given encoding to unicode.'
  if isinstance(s, unicode):
    return s
  else:
    return unicode(s, encoding)


def bwd_mm_seg(wordDict, maxLen, text):
  'Backward maximum matching segmentation.'
  wordList = []
  segStr = text
  segStrLen = len(segStr)

  # Debug output: dump the dictionary entries.
  for word in wordDict:
    print 'word: ', word
  print "\n"

  while segStrLen > 0:
    # Take a window of at most maxLen characters from the end.
    if segStrLen > maxLen:
      wordLen = maxLen
    else:
      wordLen = segStrLen
    subStr = segStr[-wordLen:None]
    print "subStr: ", subStr

    # Shrink the window from the left until it is a dictionary word
    # or only one character is left.
    while wordLen > 1:
      if subStr in wordDict:
        print "subStr1: %r" % subStr
        break
      else:
        print "subStr2: %r" % subStr
        wordLen = wordLen - 1
        subStr = subStr[-wordLen:None]
    # print "subStr3: ", subStr
    wordList.append(subStr)
    segStr = segStr[0: -wordLen]
    segStrLen = segStrLen - wordLen

  # Words were collected from right to left, so restore reading order.
  wordList.reverse()
  for wordstr in wordList:
    print "wordstr: ", wordstr
  return wordList


def main():
  # The dictionary file is read one word per line and decoded as UTF-8.
  fp_dict = open('words.dic')
  wordDict = {}
  for eachWord in fp_dict:
    wordDict[u(eachWord.strip(), CODEC)] = 1
  fp_dict.close()
  segStr = u'你好世界hello world'
  print segStr
  wordList = bwd_mm_seg(wordDict, 10, segStr)
  print "==".join(wordList)


if __name__ == '__main__':
  main()
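The right-to-left scan can likewise be sketched in Python 3. Again this is only an illustration under stated assumptions: the name `backward_max_match`, the in-memory set, and the sample sentence are not from the original article. The sample is chosen because the scan direction changes the result.

```python
def backward_max_match(text, dictionary, max_len):
    """Greedy right-to-left segmentation that prefers the longest dictionary word."""
    words = []
    end = len(text)
    while end > 0:
        # Start with the longest window ending at `end`, then shrink it from the left.
        size = min(max_len, end)
        while size > 1 and text[end - size:end] not in dictionary:
            size -= 1
        # If nothing longer matched, size is 1 and the single character is emitted as-is.
        words.append(text[end - size:end])
        end -= size
    # Words were collected from right to left, so restore reading order.
    words.reverse()
    return words


# Hypothetical dictionary chosen to expose an ambiguity.
demo_dict = {"研究", "研究生", "生命", "起源"}
print("==".join(backward_max_match("研究生命起源", demo_dict, 3)))
# prints: 研究==生命==起源
```

With the same dictionary, a left-to-right scan would match 研究生 first and leave 命 stranded as a single character, which is one reason the backward variant is often tried alongside the forward one.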

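That is everything for this example of forward and backward maximum matching word segmentation in Python. We hope it serves as a useful reference, and we hope you will continue to support 武林站長站.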
