🧡 📱 🍡 PHP：如何解析复杂的XML文件而不会淹没在本机代码中 🛏️ 🧑‍🤝‍🧑 👩🏼‍🏫

一天中的好时光！

XML格式的范围非常广泛。与CSV，JSON和其他格式一起，XML是表示数据以在不同服务，程序和站点之间交换的最常用方法之一。例如，我们可以引用CommerceML格式在1C“贸易管理”和在线商店之间交换商品和订单。

因此，几乎每个不时创建Web服务的人都必须处理解析XML文档的需求。在我的文章中，我提出了一种关于如何使用XMLReader尽可能清楚和透明地执行此操作的方法。

PHP提供了几种使用XML格式的方法。在不赘述的情况下，我将原则上分为两类：

将整个XML文档作为一个对象加载到内存中并使用该对象
在标签，属性和文本内容级别上逐步读取XML字符串

第一种方法更直观，代码看起来更透明。此方法适用于小文件。

第二种方法是较低级别的方法，它为我们提供了许多优势，同时也使生活蒙上了阴影。让我们更详细地讨论它。优点：

解析速度。您可以在此处详细阅读。
消耗更少的RAM。我们不会以内存中非常昂贵的对象的形式存储所有数据。

但是：我们牺牲了代码的可读性。例如，如果我们解析的目的是使用简单的结构来计算XML内某些位置的值之和，那么就没有问题。
但是，如果文件结构很复杂，那么即使使用数据也要依赖于此数据的完整路径，并且结果应包含许多参数，因此这里的代码很混乱。

因此，我写了一节课，使我的生活更轻松。它的使用简化了规则的编写，并大大提高了程序的可读性，它们的大小变得小很多倍，并且代码更加美观。

主要思想是这样的：我们将在单个数组中存储XML的架构以及如何使用它，该数组仅重复我们所需标签的层次结构。同样，对于同一数组中的任何标签，我们将能够注册所需的函数，这些函数是用于打开标签，关闭标签，读取属性或读取文本或全部处理的处理程序。因此，我们将XML结构和处理程序存储在一个地方。只需看一下我们的处理结构就足以了解我们对XML文件所做的工作。我会保留一点，在简单的任务上（如下面的示例所示），可读性方面的优势很小，但是当使用结构相对复杂的文件（例如，使用1C的交换格式）时，这一点将显而易见。

现在的细节。这是我们的课：

调试版本（带有$ debug参数）：

XMLReaderStruct类-单击以展开

class XMLReaderStruct extends XMLReader { public function xmlStruct($xml, $structure, $encoding = null, $options = 0, $debug = false) { $this->xml($xml, $encoding, $options); $stack = array(); $node = &$structure; $skipToDepth = false; while ($this->read()) { switch ($this->nodeType) { case self::ELEMENT: if ($skipToDepth === false) { //       ,     ,  :      //  ,   ,      ,      .  //  ,    ,        . if (isset($node[$this->name])) { if ($debug) echo "[  ]: ",$this->name," -   .   .\r\n"; $stack[$this->depth] = &$node; $node = &$node[$this->name]; if (isset($node["__open"])) { if ($debug) echo "    ",$this->name," - .\r\n"; if (false === $node["__open"]()) return false; } if (isset($node["__attrs"])) { if ($debug) echo "    ",$this->name," - .\r\n"; $attrs = array(); if ($this->hasAttributes) while ($this->moveToNextAttribute()) $attrs[$this->name] = $this->value; if (false === $node["__attrs"]($attrs)) return false; } if ($this->isEmptyElement) { if ($debug) echo "  ",$this->name," .   .\r\n"; if (isset($node["__close"])) { if ($debug) echo "    ",$this->name," - .\r\n"; if (false === $node["__close"]()) return false; } $node = &$stack[$this->depth]; } } else { $skipToDepth = $this->depth; if ($debug) echo "[  ]: ",$this->name," -    .        ",$skipToDepth,".\r\n"; } } else { if ($debug) echo "(  ): ",$this->name," -    .\r\n"; } break; case self::TEXT: if ($skipToDepth === false) { if ($debug) echo "[  ]: ",$this->value," -  .\r\n"; if (isset($node["__text"])) { if ($debug) echo "    - .\r\n"; if (false === $node["__text"]($this->value)) return false; } } else { if ($debug) echo "(  ): ",$this->value," -    .\r\n"; } break; case self::END_ELEMENT: if ($skipToDepth === false) { //  $skipToDepth  ,   ,        , //       . if ($debug) echo "[  ]: ",$this->name," -   .   .\r\n"; if (isset($node["__close"])) { if ($debug) echo "    ",$this->name," - .\r\n"; if (false === $node["__close"]()) return false; } $node = &$stack[$this->depth]; } elseif ($this->depth === $skipToDepth) { //  $skipToDepth ,   ,    ,         . if ($debug) echo "[  ]: ",$this->name," -   ",$skipToDepth,".    .\r\n"; $skipToDepth = false; } else { if ($debug) echo "(  ): ",$this->name," -    .\r\n"; } break; } } return true; } }

发行版本（不带$ debug参数和注释）：

XMLReaderStruct类-单击以展开

 class XMLReaderStruct extends XMLReader { public function xmlStruct($xml, $structure, $encoding = null, $options = 0) { $this->xml($xml, $encoding, $options); $stack = array(); $node = &$structure; $skipToDepth = false; while ($this->read()) { switch ($this->nodeType) { case self::ELEMENT: if ($skipToDepth === false) { if (isset($node[$this->name])) { $stack[$this->depth] = &$node; $node = &$node[$this->name]; if (isset($node["__open"]) && (false === $node["__open"]())) return false; if (isset($node["__attrs"])) { $attrs = array(); if ($this->hasAttributes) while ($this->moveToNextAttribute()) $attrs[$this->name] = $this->value; if (false === $node["__attrs"]($attrs)) return false; } if ($this->isEmptyElement) { if (isset($node["__close"]) && (false === $node["__close"]())) return false; $node = &$stack[$this->depth]; } } else { $skipToDepth = $this->depth; } } break; case self::TEXT: if ($skipToDepth === false) { if (isset($node["__text"]) && (false === $node["__text"]($this->value))) return false; } break; case self::END_ELEMENT: if ($skipToDepth === false) { if (isset($node["__close"]) && (false === $node["__close"]())) return false; $node = &$stack[$this->depth]; } elseif ($this->depth === $skipToDepth) { $skipToDepth = false; } break; } } return true; } }

如您所见，我们的类扩展了标准XMLReader类的功能，并向其中添加了一种方法：

 xmlStruct($xml, $structure, $encoding = null, $options = 0, $debug = false)

参数：

$ xml，$编码，$选项 ：如XMLReader :: xml（）
$ structure ：一个关联数组，完整描述我们应该如何处理文件。可以理解，它的外观是事先已知的，我们确切地知道应该使用什么标签以及应该做什么。
$ debug ：（仅用于Debug版本）是否输出调试信息（默认情况下-关闭）。

$结构参数。

这是一个关联数组，其结构重复XML文件中标记的层次结构，此外，如有必要，每个结构元素都可以具有处理函数（定义为具有相应键的字段）：

“ __open” -打开标签时的功能-功能（）
“ __attrs” -处理标签属性的函数（如果有）-函数（$ assocArray）
“ __text” -存在标记的文本值的函数-函数（$文本）
“ __close” -关闭标签时的功能-功能（）

如果任何处理程序返回false，则解析将中止，而xmlStruct（）函数将返回false。以下示例说明如何构造$结构参数：

示例1显示了调用处理程序的顺序

假设有一个XML文件：

 <?xml version="1.0" encoding="UTF-8"?> <root> <a attr_1="123" attr_2="456">Abc</a> <b> <x>This is node x inside b</x> </b> <c></c> <d> <x>This is node x inside d</x> </d> <e></e> </root>

 $structure = array( 'root' => array( 'a' => array( "__attrs" => function($array) { echo "ATTR ARRAY IS ",json_encode($array),"\r\n"; }, "__text" => function($text) use (&$a) { echo "TEXT a {$text}\r\n"; } ), 'b' => array( "__open" => function() { echo "OPEN b\r\n"; }, "__close" => function() { echo "CLOSE b\r\n"; }, 'x' => array( "__open" => function() { echo "OPEN x\r\n"; }, "__text" => function($text) { echo "TEXT x {$text}\r\n"; }, "__close" => function() { echo "CLOSE x\r\n"; } ) ) ) ); $xmlReaderStruct->xmlStruct($xml, $structure);

处理程序将被调用（按时间顺序）：

根属性-> a
文本字段root-> a
开头root-> b
开头root-> b-> x
文本根-> b-> x
闭根-> b-> x
闭根-> b

其余字段将不被处理（包括root-> d-> x将被忽略，因为它在结构之外）

示例2说明了一个简单的实际任务

假设有一个XML文件：

 <?xml version="1.0" encoding="UTF-8"?> <shop> <record> <id>0</id> <type>product</type> <name>Some product name. ID:0</name> <qty>0</qty> <price>0</price> </record> <record> <id>1</id> <type>service</type> <name>Some product name. ID:1</name> <qty>1</qty> <price>15</price> </record> <record> <id>2</id> <type>product</type> <name>Some product name. ID:2</name> <qty>2</qty> <price>30</price> </record> <record> <id>3</id> <type>service</type> <name>Some product name. ID:3</name> <qty>3</qty> <price>45</price> </record> <record> <id>4</id> <type>product</type> <name>Some product name. ID:4</name> <qty>4</qty> <price>60</price> </record> <record> <id>5</id> <type>service</type> <name>Some product name. ID:5</name> <qty>5</qty> <price>75</price> </record> </shop>

这是一张包含商品和服务的银行本票。

每个检查记录均包含记录标识符，类型（产品“产品”或服务“服务”），名称，数量和价格。

任务：计算支票的金额，但单独计算商品和服务的金额。

 include_once "xmlreaderstruct.class.php"; $x = new XMLReaderStruct(); $productsSum = 0; $servicesSum = 0; $structure = array( 'shop' => array( 'record' => array( 'type' => array( "__text" => function($text) use (&$currentRecord) { $currentRecord['isService'] = $text === 'service'; } ), 'qty' => array( "__text" => function($text) use (&$currentRecord) { $currentRecord['qty'] = (int)$text; } ), 'price' => array( "__text" => function($text) use (&$currentRecord) { $currentRecord['price'] = (int)$text; } ), '__open' => function() use (&$currentRecord) { $currentRecord = array(); }, '__close' => function() use (&$currentRecord, &$productsSum, &$servicesSum) { $money = $currentRecord['qty'] * $currentRecord['price']; if ($currentRecord['isService']) $servicesSum += $money; else $productsSum += $money; } ) ) ); $x->xmlStruct(file_get_contents('example.xml'), $structure); echo 'Overal products price: ', $productsSum, ', Overal services price: ', $servicesSum;

PHP：如何解析复杂的XML文件而不会淹没在本机代码中

More articles: