PHP 5.5.20 is available

The DOMXPath class

(PHP 5)

Introduction

Supports XPath 1.0

Class synopsis

DOMXPath {
/* Properties */
/* Methods */
public __construct ( DOMDocument $doc )
public mixed evaluate ( string $expression [, DOMNode $contextnode [, bool $registerNodeNS = true ]] )
public DOMNodeList query ( string $expression [, DOMNode $contextnode [, bool $registerNodeNS = true ]] )
public bool registerNamespace ( string $prefix , string $namespaceURI )
public void registerPhpFunctions ([ mixed $restrict ] )
}

Properties

document

Table of Contents

add a note add a note

User Contributed Notes 4 notes

up
13
Mark Omohundro, ajamyajax dot com
6 years ago
<?php
// to retrieve selected html data, try these DomXPath examples:

$file = $DOCUMENT_ROOT. "test.html";
$doc = new DOMDocument();
$doc->loadHTMLFile($file);

$xpath = new DOMXpath($doc);

// example 1: for everything with an id
//$elements = $xpath->query("//*[@id]");

// example 2: for node data in a selected id
//$elements = $xpath->query("/html/body/div[@id='yourTagIdHere']");

// example 3: same as above with wildcard
$elements = $xpath->query("*/div[@id='yourTagIdHere']");

if (!
is_null($elements)) {
  foreach (
$elements as $element) {
    echo
"<br/>[". $element->nodeName. "]";

   
$nodes = $element->childNodes;
    foreach (
$nodes as $node) {
      echo
$node->nodeValue. "\n";
    }
  }
}
?>
up
-2
archimedix32783262 at mailinator dot com
3 months ago
Note that evaluate() will use the same encoding as the XML document.

So if you have a UTF-16 XML, you will have to query using UTF-16 strings.

You can use iconv() to convert from your code's encoding to the target encoding for better legibility.
up
-6
dhz
3 years ago
I just spent far too much time chasing this one....

When running an xpath query on a table be careful about table internal nodes (ie: <tr></tr>, and <td></td>).  If the master <table> tag is missing, then query() (and likely evaluate() also) will return unexpected results.

I had a DOMNode with a structure like this:

<td>
    <table></table>
    <table>
        <tr>
            <td></td>
        </tr>
        <tr>
            <td></td>
            <td></td>
        </tr>
    </table>
</td>

Upon which I was trying to do a relative query (ie: <?php $xpath_obj->query('my/x/path', $relative_node); ?>).

But because of the lone outer <td></td> tags, the inner tags were being invalidated, while the nodes were still recognized.  Meaning that the following query would work:

<?php $xpath_obj->query('*[2]/*[*[2]]', $relative_node); ?>

But when replacing any of the "*" tokens with the corresponding (and valid) "table", "tr", or "td" tokens the query would inexplicably break.
up
-22
david at lionhead dot nl
5 years ago
When using DOMXPath and having a default namespace. Consider using an intermediate function to add the default namespace to all queries:

<?php
// The default namespace: x:xmlns="http://..."
$path="/Book/Title";
$path=preg_replace("\/([a-zA-Z])","/x:$1",$path);

// Result: /x:Book/x:Title
?>
To Top