XML is a widely used format for representing structured data. Reading data out of an XML document is called parsing, and it is something every developer should know how to do.

Have you heard of XPath? XPath is an expression language for selecting nodes and computing values from an XML document. You can find more details on XPath at w3schools.
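To give a feel for what an XPath expression looks like, here is a minimal sketch that evaluates a few expressions with the standard javax.xml.xpath API. The class name XPathBasics, the eval() helper and the tiny book document are made up for this illustration and are not part of the component described in this post.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathBasics
{
    // Hypothetical sample document, used only for this illustration.
    static final String XML = "<book><title>XML Basics</title><price>10</price></book>";

    // Evaluates an XPath expression against XML and returns the result as a string.
    static String eval(String expression) throws XPathExpressionException
    {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception
    {
        // Absolute location path: the <title> child of the root <book>.
        System.out.println(eval("/book/title"));
        // Descendant search: the first <price> anywhere in the document.
        System.out.println(eval("//price"));
        // Predicate: the title of a book whose price is greater than 5.
        System.out.println(eval("/book[price > 5]/title"));
    }
}
```

Each expression reads like a path through the document, much like a file system path, with optional conditions in square brackets.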

I've developed a small component that makes parsing XML easier. It uses XPath expressions to pull values and nodes out of a document.

ParseXML.java
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/**
 * ParseXML is a component to read and parse XML using XPath
 * @author SANTHOSH REDDY MANDADI
 * @since 25-April-2013
 * @version 1.0
 */
public class ParseXML
{
    private DocumentBuilder builder;
    private Document document;

    // Parses XML held in a string. Reading through a StringReader avoids
    // converting the text to bytes with the platform default charset.
    public ParseXML(String xmlData) throws Exception
    {
        builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        document = builder.parse(new InputSource(new StringReader(xmlData)));
    }

    // Parses an XML file. The parser opens the file itself, so there is
    // no FileInputStream left for us to close.
    public ParseXML(File file) throws Exception
    {
        builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        document = builder.parse(file);
    }

    // Returns the string value of the first node matched by the expression,
    // or an empty string when nothing matches.
    public String getValue(String xpathExpression) throws Exception
    {
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr = xpath.compile(xpathExpression);
        return (String) expr.evaluate(document, XPathConstants.STRING);
    }

    // Returns the first node matched by the expression,
    // or null when nothing matches.
    public Node getNode(String xpathExpression) throws Exception
    {
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression expr = xpath.compile(xpathExpression);
        return (Node) expr.evaluate(document, XPathConstants.NODE);
    }
}
Explanation
  • ParseXML has two private fields, which hold the DocumentBuilder and the parsed Document
  • ParseXML has two constructors, each taking one parameter. One accepts the XML as a string, the other accepts an XML file
  • getValue() evaluates the XPath expression passed to it and returns the result as a String
  • getNode() evaluates the XPath expression passed to it and returns the matching Node
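The difference between the two return types is easiest to see side by side. The sketch below is a standalone illustration, not part of ParseXML: the class name ReturnTypeDemo and its helper methods are hypothetical, but they make the same XPathConstants.STRING and XPathConstants.NODE requests that getValue() and getNode() make. It also shows what comes back when the expression matches nothing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ReturnTypeDemo
{
    // Parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception
    {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Same kind of request as getValue(): ask for the result as a String.
    static String asString(Document doc, String expression) throws Exception
    {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (String) xpath.evaluate(expression, doc, XPathConstants.STRING);
    }

    // Same kind of request as getNode(): ask for the result as a Node.
    static Node asNode(Document doc, String expression) throws Exception
    {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception
    {
        // Hypothetical one-element document, used only for this illustration.
        Document doc = parse("<person><age>28</age></person>");

        System.out.println("age as string: " + asString(doc, "/person/age"));
        System.out.println("age as node: " + asNode(doc, "/person/age").getNodeName());

        // There is no <email> element here, so the string result is empty
        // and the node result is null.
        System.out.println("missing string is empty: " + asString(doc, "/person/email").isEmpty());
        System.out.println("missing node is null: " + (asNode(doc, "/person/email") == null));
    }
}
```

In practice this means a caller of getNode() should check for null before using the result, while a caller of getValue() should treat an empty string as "not found".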

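Let's use the ParseXML component to parse the XML below. Store this XML file in the same folder as the Java classes and run the program from that folder, so that the relative path "Person.xml" resolves.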

Person.xml

<?xml version="1.0"?>
<person>
    <first-name>Santhosh Reddy</first-name>
    <last-name>Mandadi</last-name>
    <gender>Male</gender>
    <age>28</age>
    <email>msreddy61184@gmail.com</email>
    <phone>9884707729</phone>
    <address>
        <address-line1>H.No. 4/17</address-line1>
        <address-line2>Aswin Hari Building, Manapakkam</address-line2>
        <city>Chennai</city>
        <pin>600089</pin>
    </address>
</person>
ReadXML.java
import java.io.File;
import org.w3c.dom.Node;

public class ReadXML
{
    public static void main(String[] args)
    {
        File xml = new File("Person.xml");
        try
        {
            ParseXML parseXML = new ParseXML(xml);

            // Read individual values with getValue()
            System.out.println("Name: " + parseXML.getValue("/person/first-name") + " " + parseXML.getValue("/person/last-name"));
            System.out.println("Gender: " + parseXML.getValue("/person/gender"));
            System.out.println("Age: " + parseXML.getValue("/person/age"));
            System.out.println("Email: " + parseXML.getValue("/person/email"));
            System.out.println("Phone: " + parseXML.getValue("/person/phone"));

            // Fetch the root element as a Node and inspect it
            Node node = parseXML.getNode("/person");
            System.out.println("Node name: " + node.getNodeName());
            System.out.println("Node type: " + node.getNodeType());
            System.out.println("Child nodes available: " + node.hasChildNodes());
            System.out.println("hasAttributes: " + node.hasAttributes());
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
        }
    }
}
Output
@santhosh> java ReadXML 
Name: Santhosh Reddy Mandadi
Gender: Male
Age: 28
Email: msreddy61184@gmail.com
Phone: 9884707729
Node name: person
Node type: 1
Child nodes available: true
hasAttributes: false
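
Person.xml also contains a nested address element that ReadXML does not print. As a rough sketch of how nested values could be read, the example below first fetches the address node and then evaluates short relative expressions against it. It is written against the standard XPath API directly so that it runs on its own. The class name AddressDemo and the method readCityAndPin() are hypothetical, and the inline XML is a cut-down copy of Person.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AddressDemo
{
    // Cut-down copy of Person.xml, kept inline so the example is self-contained.
    static final String XML =
        "<person><address><city>Chennai</city><pin>600089</pin></address></person>";

    // Returns { city, pin } read relative to the <address> node.
    static String[] readCityAndPin(String xml) throws Exception
    {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Fetch the nested node once, as getNode("/person/address") would.
        Node address = (Node) xpath.evaluate("/person/address", doc, XPathConstants.NODE);

        // Relative expressions are resolved against the node passed in.
        String city = xpath.evaluate("city", address);
        String pin = xpath.evaluate("pin", address);
        return new String[] { city, pin };
    }

    public static void main(String[] args) throws Exception
    {
        String[] result = readCityAndPin(XML);
        System.out.println("City: " + result[0]);
        System.out.println("Pin: " + result[1]);
    }
}
```

With the ParseXML component itself, the same values should also be reachable through full paths such as getValue("/person/address/city").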