Formatting Perl’s XML::DOM output

Written by James McDonald

February 14, 2009

I have been pulling output from our ERP system into a web form, with the data ultimately destined for resubmission to the ERP.

XML and its relationship to HTML make it perfect for the job.

Therefore I have been using Perl's XML::DOM to read and write some XML.

One issue I have found is that the Perl XML::DOM methods $blah->toString and $blah->printToFile("File.xml") write the document as one long string of non-indented output, as follows:


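<?xml version="1.0"?><root><element1><element2 anAttribute="attr value 0">0 text node</element2><element2 anAttribute="attr value 1">1 text node</element2><element2 anAttribute="attr value 2">2 text node</element2><element2 anAttribute="attr value 3">3 text node</element2><element2 anAttribute="attr value 4">4 text node</element2></element1></root>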

xmllint is one simple way to work around the issue:

xmllint --format yourxmlfile

Hence:

xmllint --format test2.xml > youroutputfile.xml


  
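<?xml version="1.0"?>
<root>
  <element1>
    <element2 anAttribute="attr value 0">0 text node</element2>
    <element2 anAttribute="attr value 1">1 text node</element2>
    <element2 anAttribute="attr value 2">2 text node</element2>
    <element2 anAttribute="attr value 3">3 text node</element2>
    <element2 anAttribute="attr value 4">4 text node</element2>
  </element1>
</root>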

If you wish to display it in Windows, converting the resulting output to Windows line endings is simple:

sudo apt-get install tofrodos

# this is not a default package under Ubuntu 8.04 but very handy if you work on Windows and Linux/Unix.

unix2dos youroutputfile.xml
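
If tofrodos is not available, the same conversion can be sketched with awk. This is a minimal sketch, not a replacement for unix2dos: the function name to_crlf is hypothetical, and unlike unix2dos it reads standard input and writes to standard output instead of converting the file in place.

```shell
#!/bin/sh
# to_crlf (hypothetical helper name): convert Unix line endings (LF)
# to Windows line endings (CRLF).
to_crlf() {
    # Strip any carriage return already at the end of the line so the
    # conversion is safe to run twice, then print the line followed by CR LF.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

For example: to_crlf < youroutputfile.xml > youroutputfile-dos.xml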

Formatting XML for display in WordPress
Incidentally, when formatting this XML to display in WordPress I needed to replace a lot of the special characters with their HTML entities:

./createXML.pl.txt | xmllint --format - | sed -e 's/ /\&nbsp;/g;s/</\&lt;/g;s/>/\&gt;/g'

The command above means:
Run createXML.pl and pipe the output to the standard input of xmllint.
xmllint's --format option will reformat and indent the XML document. The "-" is required to take the input from the pipe "|" on the left.
From xmllint, pipe the output to sed and perform three find and replace operations:

  1. find spaces and replace them with &nbsp;
  2. find < and replace with &lt;
  3. find > and replace with &gt;
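
The three replacements listed above can be sketched as a small shell function. The name escape_for_html is hypothetical, and the extra first step that escapes "&" is my own assumption (it is not one of the three operations above); it keeps any entity already in the XML from being rendered by the browser. Note that sed treats a bare "&" in the replacement as "the matched text", so it is written as "\&" here.

```shell
#!/bin/sh
# escape_for_html (hypothetical helper name): make XML safe to paste
# into an HTML page, reading standard input and writing standard output.
escape_for_html() {
    # Order matters: escape "&" first so the entities added by the
    # later steps are not escaped a second time.
    sed -e 's/&/\&amp;/g' \
        -e 's/</\&lt;/g' \
        -e 's/>/\&gt;/g' \
        -e 's/ /\&nbsp;/g'
}
```

It would slot into the end of the pipeline in place of the inline sed call: ./createXML.pl.txt | xmllint --format - | escape_for_html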

Creating and Formatting XML without leaving Perl

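#!/usr/bin/perl
# Build a small XML document with XML::DOM and pipe it through xmllint for indenting.

use strict;
use warnings;
use XML::DOM;

my $doc = XML::DOM::Document->new;
my $root = $doc->createElement("root");

my $xmldecl = $doc->createXMLDecl('1.0');
my $body = $doc->createElement("element1");

$root->appendChild($body);

# Add five child elements, each with an attribute and a text node
for my $i (0 .. 4) {
	my $h1 = $doc->createElement('element2');

	$h1->setAttribute("anAttribute", "attr value $i");

	$body->appendChild($h1);

	my $text = $doc->createTextNode("$i text node");

	$h1->appendChild($text);

}

my $output_doc = "myout.xml";

# Open a pipe to xmllint: it reads the XML from standard input ("-")
# and the shell redirects the indented result into $output_doc
open(my $xmllint, '|-', "xmllint --format - > $output_doc") or die "can't open xmllint: $!\n";

print {$xmllint} $xmldecl->toString . $root->toString . "\n";

close($xmllint) or die "xmllint failed\n";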
