Pretty Printing XML on the Command-Line
A couple of times recently I've needed to pretty-print XML, be that HTML or actual XML, but I hadn't found a great way to do it from the command-line.
Fortunately DuckDuckGo has an HTML Beautify setup, but that's not safe for proprietary content, nor does it help with automating the pretty-printing.
But as I found today, it turns out that the xmllint command can save us here.
For instance, take the following XML file that I've purposefully uglified:
<?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>me.jvt.www</groupId> <artifactId>www-api</artifactId> <version>0.5.0-SNAPSHOT</version> <modules> <module>www-api-web</module> <module>www-api-acceptance</module> <module>www-api-core</module> <module>indieauth-spring-security</module> </modules> <packaging>pom</packaging> </project>
If we feed this through xmllint --format:
$ xmllint --format in.xml
We then get the following pretty-printed XML:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>me.jvt.www</groupId>
  <artifactId>www-api</artifactId>
  <version>0.5.0-SNAPSHOT</version>
  <modules>
    <module>www-api-web</module>
    <module>www-api-acceptance</module>
    <module>www-api-core</module>
    <module>indieauth-spring-security</module>
  </modules>
  <packaging>pom</packaging>
</project>
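Since automating this was part of the motivation, here is a minimal sketch of how it might be scripted. The helper name pretty_xml, the temporary directory and the tiny sample document are all made up for illustration. It relies on xmllint reading standard input when given - as the file name, and on the XMLLINT_INDENT environment variable, which sets the indent string.

```shell
#!/bin/sh
# Hypothetical helper: pretty-print whatever XML arrives on standard input.
pretty_xml() {
  # The trailing "-" asks xmllint to read from stdin instead of a file
  xmllint --format -
}

# Work in a throwaway directory so no existing files are overwritten
tmp=$(mktemp -d)

# A tiny sample document, created only to keep the sketch self-contained
printf '<a><b>1</b></a>' > "$tmp/in.xml"

# Pretty-print via the helper and save the result to a new file
pretty_xml < "$tmp/in.xml" > "$tmp/out.xml"
cat "$tmp/out.xml"

# XMLLINT_INDENT overrides the default two-space indent (four spaces here)
XMLLINT_INDENT='    ' xmllint --format "$tmp/in.xml"
```

Writing to a separate output file, rather than redirecting back onto the input, avoids truncating the file before xmllint has read it.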
If you are using this to pretty-print HTML, you can use:
$ xmllint --html --format in.html
But note that it will not silently ignore non-standard HTML elements, or anything else that it doesn't understand: depending on your version, it may print parser errors for them alongside the formatted output.
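If those messages get in the way, one option is to discard standard error with an ordinary shell redirect. This is only a sketch: the sample page and the temporary path are invented for illustration, and throwing away stderr also hides genuine problems, so it is best kept for input you already trust.

```shell
#!/bin/sh
# Throwaway directory and sample page, both made up for this example
html_tmp=$(mktemp -d)
printf '<html><body><nav><p>hi</p></nav></body></html>' > "$html_tmp/in.html"

# 2>/dev/null drops any parser messages written to stderr.
# "|| true" keeps the script going in case xmllint exits non-zero.
html_out=$(xmllint --html --format "$html_tmp/in.html" 2>/dev/null) || true

printf '%s\n' "$html_out"
```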