Reading and Writing the XML DOM with PHP
<books>
  <book>
    <author>Jack Herrington</author>
    <title>PHP Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
  <book>
    <author>Jack Herrington</author>
    <title>Podcasting Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
</books>
<?php
// Load the XML file into a DOM tree.
$doc = new DOMDocument();
$doc->load('books.xml');

// Walk every <book> element and pull out the text of its children.
$books = $doc->getElementsByTagName("book");
foreach ($books as $book) {
    $authors = $book->getElementsByTagName("author");
    $author = $authors->item(0)->nodeValue;

    $publishers = $book->getElementsByTagName("publisher");
    $publisher = $publishers->item(0)->nodeValue;

    $titles = $book->getElementsByTagName("title");
    $title = $titles->item(0)->nodeValue;

    echo "$title - $author - $publisher\n";
}
?>
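The script first creates a new DOMDocument object and loads the books XML into it with the load method. It then calls getElementsByTagName to get a list of all elements with the given name.
Inside the loop over the book nodes, the script calls getElementsByTagName again to find the author, publisher, and title tags and reads the nodeValue of each. The nodeValue is the text inside the node. The script then prints those values.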
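The same steps can be wrapped in reusable functions. The following is a minimal sketch, not part of the original script: the helper names firstText and readBooks are hypothetical, the XML is parsed from a string with loadXML instead of from a file, and a missing child element yields an empty string rather than an error.

```php
<?php
// Hypothetical helper: text of the first descendant element named $tag,
// or '' when $parent contains no such element.
function firstText(DOMElement $parent, string $tag): string
{
    $node = $parent->getElementsByTagName($tag)->item(0);
    return $node === null ? '' : $node->nodeValue;
}

// Hypothetical helper: parses a books document into a list of arrays.
function readBooks(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml); // loadXML() parses a string; load() reads a file

    $books = [];
    foreach ($doc->getElementsByTagName('book') as $book) {
        $books[] = [
            'title'     => firstText($book, 'title'),
            'author'    => firstText($book, 'author'),
            'publisher' => firstText($book, 'publisher'),
        ];
    }
    return $books;
}

// Usage: this sample book has no <publisher>, so that field comes back empty.
$xml = '<books><book><author>Jack Herrington</author>'
     . '<title>PHP Hacks</title></book></books>';
print_r(readBooks($xml));
```

Guarding against a null item(0) matters because the original loop assumes every book has all three child elements.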
You can run the PHP script from the command line like this:
% php e1.php
PHP Hacks - Jack Herrington - O'Reilly
Podcasting Hacks - Jack Herrington - O'Reilly
%
As you can see, each book block produces one line of output. That is a good start. But what if you do not have access to the XML DOM library?
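Reading the XML with the SAX parser:

<?php
$g_books = array();
$g_elem = null;

// Called at every opening tag. Element names arrive upper-cased
// because case folding is enabled by default.
function startElement($parser, $name, $attrs)
{
    global $g_books, $g_elem;
    if ($name == 'BOOK') $g_books[] = array();
    $g_elem = $name;
}

// Called at every closing tag.
function endElement($parser, $name)
{
    global $g_elem;
    $g_elem = null;
}

// Called for text content; stores it under the current element name.
function textData($parser, $text)
{
    global $g_books, $g_elem;
    if ($g_elem == 'AUTHOR' || $g_elem == 'PUBLISHER' || $g_elem == 'TITLE') {
        $g_books[count($g_books) - 1][$g_elem] = $text;
    }
}

$parser = xml_parser_create();

xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "textData");

// Feed the file to the parser in 4 KB chunks.
$f = fopen('books.xml', 'r');
while ($data = fread($f, 4096)) {
    xml_parse($parser, $data);
}
fclose($f);

xml_parser_free($parser);

foreach ($g_books as $book) {
    echo $book['TITLE'] . " - " . $book['AUTHOR'] . " - ";
    echo $book['PUBLISHER'] . "\n";
}
?>

Reading the XML with regular expressions:

<?php
// Read the whole file into one string.
$xml = "";
$f = fopen('books.xml', 'r');
while ($data = fread($f, 4096)) {
    $xml .= $data;
}
fclose($f);

// Grab every <book>...</book> block, then the fields inside each block.
preg_match_all("/\<book\>(.*?)\<\/book\>/s", $xml, $bookblocks);

foreach ($bookblocks[1] as $block) {
    preg_match_all("/\<author\>(.*?)\<\/author\>/", $block, $author);
    preg_match_all("/\<title\>(.*?)\<\/title\>/", $block, $title);
    preg_match_all("/\<publisher\>(.*?)\<\/publisher\>/", $block, $publisher);
    echo($title[1][0] . " - " . $author[1][0] . " - " . $publisher[1][0] . "\n");
}
?>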
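Notice how short this code is. It starts by reading the file into one large string. It then uses a regex function to pull out each book item. Finally, a foreach loop walks over the book blocks and extracts the author, title, and publisher from each.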
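So where is the flaw? The problem with using regular expression code to read XML is that it never checks that the XML is well-formed, so you cannot know whether the input is valid before you read it. In addition, some well-formed XML will not match the regular expressions, so you will have to modify them later.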
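Both flaws are easy to reproduce. The sketch below wraps the author pattern in a hypothetical function, authorsByRegex, and runs it on four inputs invented for illustration: three well-formed variants of the same data and one document that is not well-formed.

```php
<?php
// Hypothetical helper: the author pattern from the regex script above.
function authorsByRegex(string $xml): array
{
    preg_match_all('/<author>(.*?)<\/author>/', $xml, $m);
    return $m[1];
}

// The same content written three well-formed ways.
$plain     = '<book><author>Jack Herrington</author></book>';
$attribute = '<book><author id="1">Jack Herrington</author></book>';
$multiline = "<book><author>Jack\nHerrington</author></book>";

var_dump(count(authorsByRegex($plain)));     // int(1)
var_dump(count(authorsByRegex($attribute))); // int(0): the attribute breaks the literal <author>
var_dump(count(authorsByRegex($multiline))); // int(0): "." does not match a newline without /s

// Input that is not well-formed is still accepted without complaint.
$broken = '<book><author>Jack Herrington</author>'; // <book> is never closed
var_dump(count(authorsByRegex($broken)));    // int(1)
```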
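I never recommend using regular expressions to read XML, but sometimes it is the most compatible approach, because the regular expression functions are always available. Do not use regular expressions to read XML that comes directly from users, because you cannot control the format or structure of that XML. Always read XML from users with the DOM library or a SAX parser.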
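One way to follow that advice with the DOM library is to reject anything that does not parse before reading any values. This is a sketch under assumptions: the function name loadUntrustedXml is hypothetical, and the LIBXML_NONET flag (which disables network access during parsing) is a precaution added here, not something the original scripts use.

```php
<?php
// Hypothetical helper: returns a DOMDocument for well-formed input,
// or null when the input is empty or cannot be parsed.
function loadUntrustedXml(string $xml): ?DOMDocument
{
    if ($xml === '') {
        return null; // loadXML() rejects an empty string
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml, LIBXML_NONET);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $doc : null;
}

// Usage: the second document never closes <book>, so it is rejected.
var_dump(loadUntrustedXml('<books><book/></books>') !== null);
var_dump(loadUntrustedXml('<books><book></books>') !== null);
```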
Writing XML with the DOM:

<?php
$books = array();
$books[] = array(
    'title' => 'PHP Hacks',
    'author' => 'Jack Herrington',
    'publisher' => "O'Reilly"
);
$books[] = array(
    'title' => 'Podcasting Hacks',
    'author' => 'Jack Herrington',
    'publisher' => "O'Reilly"
);

$doc = new DOMDocument();
$doc->formatOutput = true; // indent the output

// Root element.
$r = $doc->createElement("books");
$doc->appendChild($r);

// One <book> element per array entry, with a text node in each child.
foreach ($books as $book) {
    $b = $doc->createElement("book");

    $author = $doc->createElement("author");
    $author->appendChild($doc->createTextNode($book['author']));
    $b->appendChild($author);

    $title = $doc->createElement("title");
    $title->appendChild($doc->createTextNode($book['title']));
    $b->appendChild($title);

    $publisher = $doc->createElement("publisher");
    $publisher->appendChild($doc->createTextNode($book['publisher']));
    $b->appendChild($publisher);

    $r->appendChild($b);
}

echo $doc->saveXML();
?>

% php e4.php
<?xml version="1.0"?>
<books>
  <book>
    <author>Jack Herrington</author>
    <title>PHP Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
  <book>
    <author>Jack Herrington</author>
    <title>Podcasting Hacks</title>
    <publisher>O'Reilly</publisher>
  </book>
</books>
%
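A side effect of building the document through createTextNode is that special characters in the data are escaped when the document is serialized. The sketch below illustrates this with a hypothetical function, booksToXml, and an invented title containing & and <.

```php
<?php
// Hypothetical helper: serializes a list of book arrays with the DOM.
function booksToXml(array $books): string
{
    $doc = new DOMDocument();
    $root = $doc->createElement('books');
    $doc->appendChild($root);

    foreach ($books as $book) {
        $b = $doc->createElement('book');
        foreach (['author', 'title', 'publisher'] as $field) {
            $el = $doc->createElement($field);
            // The text node holds the raw value; escaping happens in saveXML().
            $el->appendChild($doc->createTextNode($book[$field]));
            $b->appendChild($el);
        }
        $root->appendChild($b);
    }
    return $doc->saveXML();
}

// Usage: the & and < in the title are written as entities in the output.
echo booksToXml([[
    'author'    => 'Jack Herrington',
    'title'     => 'Tips & Tricks <2nd ed.>',
    'publisher' => "O'Reilly",
]]);
```

Loading the result back with loadXML returns the original title unchanged, so no manual escaping is needed on either side.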
Writing XML with a PHP text template:

<?php
$books = array();
$books[] = array(
    'title' => 'PHP Hacks',
    'author' => 'Jack Herrington',
    'publisher' => "O'Reilly"
);
$books[] = array(
    'title' => 'Podcasting Hacks',
    'author' => 'Jack Herrington',
    'publisher' => "O'Reilly"
);
?>
<books>
<?php
foreach ($books as $book) {
?>
  <book>
    <title><?php echo($book['title']); ?></title>
    <author><?php echo($book['author']); ?></author>
    <publisher><?php echo($book['publisher']); ?></publisher>
  </book>
<?php
}
?>
</books>

The same template with each value escaped before it is written:

<books>
<?php
foreach ($books as $book) {
    // Note: htmlentities() emits HTML named entities (for example &eacute;)
    // for non-ASCII characters, and XML does not define those names.
    $title = htmlentities($book['title'], ENT_QUOTES);
    $author = htmlentities($book['author'], ENT_QUOTES);
    $publisher = htmlentities($book['publisher'], ENT_QUOTES);
?>
  <book>
    <title><?php echo($title); ?></title>
    <author><?php echo($author); ?></author>
    <publisher><?php echo($publisher); ?></publisher>
  </book>
<?php
}
?>
</books>
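For template output, an alternative worth considering is htmlspecialchars with the ENT_XML1 flag, which escapes only the characters that are special in XML and leaves other characters as they are. This is a swapped-in technique, not the article's own; the helper name xmlEscape is hypothetical, and the sketch assumes the data is UTF-8.

```php
<?php
// Hypothetical helper: escapes a value for use as XML element text
// or inside a quoted attribute value. Assumes UTF-8 input.
function xmlEscape(string $value): string
{
    return htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Usage inside a template: the & in the publisher name is escaped.
echo '<publisher>' . xmlEscape("O'Reilly & Associates") . "</publisher>\n";
```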