Most SAX parsing examples found online assign values to the target object inside the Handler's characters method. Sample code:
    private TransportFile parseXML(String xml) {
        SAXParserFactory saxfac = SAXParserFactory.newInstance();
        try {
            SAXParser saxparser = saxfac.newSAXParser();
            InputStream is = new ByteArrayInputStream(xml.getBytes());
            MySAXHandler handler = new MySAXHandler();
            saxparser.parse(is, handler);
            return handler.getData();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    private class MySAXHandler extends DefaultHandler {
        String currentTagName = "";
        TransportFile mData = null;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            currentTagName = qName;
            if ("file".equals(qName)) {
                mData = new TransportFile();
            }
        }

        // Assigns directly from each characters() call; this is the pattern discussed below.
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            String str = new String(ch, start, length);
            if ("guid".equals(currentTagName)) {
                mData.guid = str;
            } else if ("name".equals(currentTagName)) {
                mData.name = str;
            } else if ("type".equals(currentTagName)) {
                mData.type = str;
            } else if ("length".equals(currentTagName)) {
                mData.length = Long.parseLong(str);
            } else if ("index".equals(currentTagName)) {
                mData.index = Integer.parseInt(str);
            } else if ("count".equals(currentTagName)) {
                mData.count = Integer.parseInt(str);
            } else if ("data".equals(currentTagName)) {
                mData.data = Base64.decode(str);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            currentTagName = "";
        }

        public TransportFile getData() {
            return mData;
        }
    }
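In ordinary use the code above works fine, but when the content of one element in the XML is very long, it exposes a bug. In practice I found that the SAX parser hands over only about 1 KB of data per characters call, and anything beyond that is delivered in several separate calls. This is permitted behavior: the SAX contract lets a parser return contiguous character data in a single chunk or split it into several, so the chunk size depends on the implementation. The problem follows directly: if the value is assigned inside characters, each later chunk overwrites what the earlier chunk stored in the returned object, and data is lost. So for any value assigned to the returned object, the characters stage may only append, and the actual conversion and assignment must wait until endElement. Otherwise the bug in the online code surfaces as soon as the content gets large. Here is my code: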
    private class MySAXHandler extends DefaultHandler {
        String currentTagName = "";
        TransportFile mData = null;
        private StringBuilder mStringBuilder;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            currentTagName = qName;
            // Start a fresh buffer for every element.
            mStringBuilder = new StringBuilder();
            if ("file".equals(qName)) {
                mData = new TransportFile();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // Only accumulate here; the text may arrive in several chunks.
            mStringBuilder.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            // The element's text is complete now, so convert and assign it.
            String str = mStringBuilder.toString();
            if ("guid".equals(currentTagName)) {
                mData.guid = str;
            } else if ("name".equals(currentTagName)) {
                mData.name = str;
            } else if ("type".equals(currentTagName)) {
                mData.type = str;
            } else if ("length".equals(currentTagName)) {
                mData.length = Long.parseLong(str);
            } else if ("index".equals(currentTagName)) {
                mData.index = Integer.parseInt(str);
            } else if ("count".equals(currentTagName)) {
                mData.count = Integer.parseInt(str);
            } else if ("data".equals(currentTagName)) {
                mData.data = Base64.decode(str);
            }
            currentTagName = "";
        }

        public TransportFile getData() {
            return mData;
        }
    }
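To see the chunking for yourself, here is a minimal, self-contained sketch using only the JDK's built-in SAX parser. The names `ChunkDemo`, `CountingHandler`, `buildXml`, and the 100,000-character payload are assumptions made up for this demo and are not part of the code above. It counts how many times characters is called for one `<data>` element and compares the last chunk with the accumulated text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ChunkDemo {

    /** Hypothetical demo handler: records every characters() call made inside <data>. */
    static class CountingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder(); // accumulated content of <data>
        int calls = 0;                                  // number of characters() calls inside <data>
        String lastChunk = "";                          // what "assign in characters()" would keep
        private boolean inData = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            inData = "data".equals(qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inData) {
                calls++;
                lastChunk = new String(ch, start, length);
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            inData = false;
        }
    }

    /** Builds a small document whose <data> element holds payloadLength characters. */
    static String buildXml(int payloadLength) {
        StringBuilder payload = new StringBuilder(payloadLength);
        for (int i = 0; i < payloadLength; i++) {
            payload.append((char) ('A' + i % 26));
        }
        return "<file><data>" + payload + "</data></file>";
    }

    /** Parses the XML and returns the handler so the caller can inspect the counters. */
    static CountingHandler parse(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            CountingHandler handler = new CountingHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        CountingHandler h = parse(buildXml(100_000));
        System.out.println("characters() calls: " + h.calls);
        System.out.println("last chunk length:  " + h.lastChunk.length());
        System.out.println("accumulated length: " + h.text.length());
    }
}
```

The number of calls and the size of each chunk depend on the parser implementation, so the first two printed values are not fixed. The accumulated length, however, should always equal the payload length, which is the reason to append in characters and convert in endElement.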
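A note on the parameters of characters: ch is the character array the parser is currently working on, not precisely the content of the element. Below is the output of ch, start, and length inside characters while parsing the first element: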
===========characters ch: <?xml version='1.0' encoding='utf-8' standalone='yes' ?><file><guid>678c6f92-d617-40af-bb87-a80c3b2be91f</guid><name>0CAQLTZGO.jpg</name><type>image</type><length>71374</length><index>0</index><count>1</count><data>/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAYEBQYFBAYGBQYHBwYIChAKCgkJChQODwwQFxQYGBcUFhYaHSUfGhsjHBYWICwgIyYnKSopGR8tMC0oMCUoKSj/2wBDAQcHBwoIChMKChMoGhYaKCgoKCgoK.....
===========characters start:31
===========characters length:36
The data actually needed at that point is the length characters of the ch array beginning at index start.
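As a small illustration of that point, the sketch below contrasts reading the whole buffer with reading only the slice. The class name `SliceDemo`, the helper `slice`, and the sample buffer are hypothetical and chosen just for this demo; a real parser's buffer will look different.

```java
class SliceDemo {

    /** Returns only the part of the buffer that the parser reported: length chars from start. */
    static String slice(char[] ch, int start, int length) {
        return new String(ch, start, length);
    }

    public static void main(String[] args) {
        // Made-up buffer: the parser's array may hold much more than the current element's text.
        char[] buffer = "<guid>abc-123</guid><name>x.jpg</name>".toCharArray();
        int start = 6;   // index of the first character after "<guid>"
        int length = 7;  // number of characters in "abc-123"

        System.out.println(new String(buffer));           // wrong: the entire buffer, markup included
        System.out.println(slice(buffer, start, length)); // right: prints "abc-123"
    }
}
```

The same three-argument form is what `StringBuilder.append(ch, start, length)` uses, so appending with those arguments picks up only the reported slice.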