Java / XML

1. What is XML?

XML stands for Extensible Markup Language, which means you can extend it to suit your needs. Unlike markup languages with a fixed vocabulary, such as HTML, XML lets you define your own custom tags. The structure of an XML document can be standardized using a DTD or an XML Schema. XML is mostly used to transfer data from one system to another, for example between client and server in enterprise applications.

2. Difference between DTD and XML Schema.

DTD stands for Document Type Definition and is the legacy way to define the structure of XML documents. XML Schema was designed after DTD and offers a richer set of data types for describing the content of XML documents.

A DTD is not written in XML syntax, whereas an XML Schema is itself an XML document, which means existing XML tools such as XML parsers can be used to work with XML Schemas.

3. What is XPath?

XPath is an XML technology used to retrieve elements from XML documents. Because XML documents are structured, XPath expressions can be used to locate and retrieve elements, attributes, or values from XML files.

4. How to select elements in XPath based on attribute and element value?

Consider the following XML document:

<employees>
        <employee id="1">
                <name>John</name>
                <age>32</age>
        </employee>
        <employee id="2">
                <name>Adam</name>
                <age>36</age>
        </employee>
</employees>

In this example, the XPath expression to extract the age of the employee named Adam is /employees/employee[name/text()='Adam']/age.

Here text() is a node test that selects the text content of the name element, and the square brackets [] define a predicate (condition) that compares that text with 'Adam'.

Another example: the XPath expression to find the age of the employee with id 1 is /employees/employee[@id='1']/age, and the result is 32. Here [] again encloses the predicate, and @ selects an attribute.
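
The two expressions above can be tried from Java with the standard javax.xml.xpath API. The following is a minimal sketch, not a definitive implementation; the class name XPathDemo and the helper method evaluate are hypothetical names chosen for illustration, and the employee document is inlined as a string.

```java
import java.io.StringReader;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.xml.sax.InputSource;

public class XPathDemo {

	// The employee document from the question above, inlined as a string.
	public static final String XML =
			"<employees>"
			+ "<employee id=\"1\"><name>John</name><age>32</age></employee>"
			+ "<employee id=\"2\"><name>Adam</name><age>36</age></employee>"
			+ "</employees>";

	// Evaluates an XPath expression against XML and returns the string value of the first match.
	public static String evaluate(String expression) {
		try {
			XPath xpath = XPathFactory.newInstance().newXPath();
			return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
		} catch (XPathExpressionException e) {
			throw new IllegalArgumentException("Could not evaluate: " + expression, e);
		}
	}

	public static void main(String[] args) {
		// Select by element value: prints 36
		System.out.println(evaluate("/employees/employee[name/text()='Adam']/age"));
		// Select by attribute value: prints 32
		System.out.println(evaluate("/employees/employee[@id='1']/age"));
	}
}
```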

5. What is XSLT?

XSLT is a popular XML technology for transforming an XML file into another XML document, HTML, or any other text format. The XSLT language specifies its own syntax, functions, and operators for transforming XML documents. The transformation is usually performed by an XSLT engine, which reads instructions written in XSLT syntax from XSL stylesheet files. A typical use of XSLT is displaying data stored in XML files as HTML pages.
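
As an illustration of the XML-to-HTML use case, the sketch below runs a small stylesheet through the XSLT engine bundled with the JDK (javax.xml.transform). The class name XsltDemo, the sample employee data, and the stylesheet itself are assumptions made up for this example; the exact whitespace of the output may vary between XSLT engines.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltDemo {

	// Hypothetical input data.
	public static final String XML =
			"<employees><employee><name>John</name></employee>"
			+ "<employee><name>Adam</name></employee></employees>";

	// Hypothetical stylesheet: renders every employee name as an HTML list item.
	public static final String XSL =
			"<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
			+ "<xsl:output method=\"html\" indent=\"no\"/>"
			+ "<xsl:template match=\"/employees\">"
			+ "<ul><xsl:for-each select=\"employee\">"
			+ "<li><xsl:value-of select=\"name\"/></li>"
			+ "</xsl:for-each></ul>"
			+ "</xsl:template>"
			+ "</xsl:stylesheet>";

	// Applies the stylesheet to the XML and returns the transformed text.
	public static String transform(String xml, String xsl) {
		try {
			Transformer transformer = TransformerFactory.newInstance()
					.newTransformer(new StreamSource(new StringReader(xsl)));
			StringWriter out = new StringWriter();
			transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
			return out.toString();
		} catch (TransformerException e) {
			throw new IllegalStateException("Transformation failed", e);
		}
	}

	public static void main(String[] args) {
		// Prints an HTML fragment containing one <li> per employee.
		System.out.println(transform(XML, XSL));
	}
}
```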

6. Difference between DOM and SAX XML Parser in Java.

The DOM parser loads the entire XML document into memory, while SAX holds only the part of the document it is currently processing.

Parsing with SAX is generally faster and uses less memory because no tree is built. DOM is faster for repeated or random access to the document once it has been loaded, because the complete tree is already in memory.

The SAX parser is more suitable than the DOM parser for large XML files because it does not require much memory.

The DOM parser works on the Document Object Model, while SAX is an event-based XML parser.

Prefer the DOM parser if the XML file is small; use the SAX parser if the file is large or if you do not know the size of the XML files to be processed.
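
The difference in programming model can be sketched by counting employee elements both ways: DOM builds the tree first and then queries it, while SAX reacts to events as the parser streams through the input. The class name DomVsSaxDemo, the method names, and the sample data are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DomVsSaxDemo {

	public static final String XML =
			"<employees><employee id=\"1\"/><employee id=\"2\"/></employees>";

	// DOM: the whole document is loaded into a tree, which is then queried.
	public static int countWithDom(String xml) {
		try {
			Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
					.parse(new InputSource(new StringReader(xml)));
			return doc.getElementsByTagName("employee").getLength();
		} catch (ParserConfigurationException | SAXException | IOException e) {
			throw new IllegalStateException("DOM parsing failed", e);
		}
	}

	// SAX: the handler receives one callback per start tag; no tree is kept in memory.
	public static int countWithSax(String xml) {
		final int[] count = {0};
		DefaultHandler handler = new DefaultHandler() {
			@Override
			public void startElement(String uri, String localName, String qName, Attributes attributes) {
				if ("employee".equals(qName)) {
					count[0]++;
				}
			}
		};
		try {
			SAXParserFactory.newInstance().newSAXParser()
					.parse(new InputSource(new StringReader(xml)), handler);
			return count[0];
		} catch (ParserConfigurationException | SAXException | IOException e) {
			throw new IllegalStateException("SAX parsing failed", e);
		}
	}

	public static void main(String[] args) {
		System.out.println(countWithDom(XML)); // prints 2
		System.out.println(countWithSax(XML)); // prints 2
	}
}
```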

7. What is XML namespace?

An XML namespace is similar to a package in Java: it provides a way to avoid conflicts between two XML tags that have the same name but come from different sources. A namespace is declared using the xmlns attribute, usually on the root element of the XML document, with the syntax xmlns:prefix="URI". The prefix is then used along with the actual tag name in the XML document.

<root xmlns:javapedia="http://javapedia.net/javapedia-Example">
  <javapedia:email>
      <javapedia:id>javatutorials2016@gmail.com</javapedia:id>
   </javapedia:email>
</root>
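
A namespace-aware parser identifies elements by namespace URI and local name rather than by prefix. The sketch below reads the id element from the document above; the class name NamespaceDemo and the method readId are hypothetical. Note that DocumentBuilderFactory is not namespace-aware by default, so setNamespaceAware(true) is required.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class NamespaceDemo {

	public static final String NS = "http://javapedia.net/javapedia-Example";

	public static final String XML =
			"<root xmlns:javapedia=\"" + NS + "\">"
			+ "<javapedia:email><javapedia:id>javatutorials2016@gmail.com</javapedia:id></javapedia:email>"
			+ "</root>";

	// Looks up the <id> element by namespace URI and local name, independent of the prefix used.
	public static String readId(String xml) {
		try {
			DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
			factory.setNamespaceAware(true); // off by default
			Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
			NodeList ids = doc.getElementsByTagNameNS(NS, "id");
			return ids.item(0).getTextContent();
		} catch (ParserConfigurationException | SAXException | IOException e) {
			throw new IllegalStateException("Parsing failed", e);
		}
	}

	public static void main(String[] args) {
		System.out.println(readId(XML)); // prints javatutorials2016@gmail.com
	}
}
```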

8. What is XML data Binding in Java?

XML data binding in Java refers to creating Java classes and objects from XML documents and then reading or modifying those documents through Java objects. JAXB, a Java API for XML binding, provides a convenient way to bind XML documents to Java objects.

9. Can you create an XML document using SAX parser?

Not recommended. SAX is a read-only, event-based API intended for parsing XML documents, not for writing them. It is better to use a StAX parser for creating XML documents.
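
A minimal sketch of creating a document with StAX, using XMLStreamWriter from javax.xml.stream. The class name StaxWriterDemo, the method buildEmployee, and the element names are hypothetical; the exact form of the XML declaration written at the top may differ between StAX implementations.

```java
import java.io.StringWriter;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StaxWriterDemo {

	// Writes a single <employee> element with an id attribute and a <name> child.
	public static String buildEmployee(String id, String name) {
		StringWriter out = new StringWriter();
		try {
			XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
			writer.writeStartDocument();
			writer.writeStartElement("employee");
			writer.writeAttribute("id", id);
			writer.writeStartElement("name");
			writer.writeCharacters(name); // special characters are escaped by the writer
			writer.writeEndElement(); // </name>
			writer.writeEndElement(); // </employee>
			writer.writeEndDocument();
			writer.close();
		} catch (XMLStreamException e) {
			throw new IllegalStateException("Could not write XML", e);
		}
		return out.toString();
	}

	public static void main(String[] args) {
		System.out.println(buildEmployee("1", "John"));
	}
}
```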

10. What is SAX parser?

SAX stands for Simple API for XML. A SAX parser is an event-based parser for XML documents.

11. Difference between JAXB and DOM/SAX Parsers.

The Java DOM and SAX parsing APIs are lower-level APIs to parse XML documents, while JAXB is a higher-level API for converting XML elements and attributes to a Java object hierarchy (and vice versa).

Implementations of JAXB typically use a lower-level parser such as SAX behind the scenes to do the actual parsing of the XML input data.

JAXB is usually easier to use and requires less code than the DOM or SAX parsing APIs.

12. How do I validate an XML file against XSD file?

The Java runtime library supports schema validation through the javax.xml.validation API; the JDK's built-in implementation is based on the Apache Xerces parser.

import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;

import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.SAXException;

public class XMLValidation {

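	public static void main(String[] args) throws MalformedURLException {
		// Placeholder URL: replace host, port and filename with the real location of the XSD.
		URL schemaFile = new URL("http://host:port/filename.xsd");
		Source xmlFile = new StreamSource(new File("web.xml"));
		SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
		try {
			Schema schema = schemaFactory.newSchema(schemaFile);
			Validator validator = schema.newValidator();
			// validate() throws SAXException if the document violates the schema.
			validator.validate(xmlFile);
			System.out.println(xmlFile.getSystemId() + " is valid.");
		} catch (SAXException e) {
			System.out.println(xmlFile.getSystemId() + " is NOT valid XML. The reason is: " + e);
		} catch (IOException e) {
			// Report I/O failures (for example an unreadable file) instead of silently ignoring them.
			System.out.println("Could not read " + xmlFile.getSystemId() + ": " + e);
		}
	}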

}

13. Explain XEE/XXE XML security attacks.

XML Entity Expansion (XEE), also known as XML Expansion or an XML bomb, is an attack that makes the parser consume an unexpectedly large amount of memory. The attacker usually exploits the XML feature that allows one entity to reference another, so that a small document expands enormously when parsed.

In a DTD/XML External Entity (XXE) attack, the victim is made to disclose sensitive or privileged information through weaknesses in the XML parser and the attacker's ability to modify the XML used by the service.

Mitigation strategies:

  • Don't use DTDs: Disable DTDs in your parser.
  • XML Schema validation: Validate incoming XML against an XML Schema. To do so, pass the XML data file and the XML Schema definition (XSD) file to a validating XML parser.
  • Whitelist definitions: Allow only known-good definitions, and block everything that doesn't fit your syntax or regular expression match.
  • Blacklist: Block known bad patterns.

14. How do you prevent injection attacks?

Defending against SQL injection with JDBC prepared statements: The Java platform includes a defensive measure against SQL injection attacks, namely JDBC prepared statements. A prepared statement works by precompiling the SQL query, typically into a database-specific binary format, with placeholders in place of user data.

Prior to execution, user data is bound to the precompiled query, and finally the query is executed. Any reserved control characters or words passed by an attacker are treated as user data and not as part of the SQL statement. The following is a safer SQL example:

//Prevent SQL Injection
String query = "SELECT * FROM users WHERE Name=?";
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1,userName);
ResultSet results = pstmt.executeQuery(); 

Encoding reserved control sequences within untrusted input: Data must be properly encoded to mitigate untrusted-input attacks, because a user may submit malicious script or code.

To defend against these attacks, reserved character sequences must not be conflated with user data. Specifically, the less-than and greater-than characters, which are reserved in HTML, must be replaced with their corresponding HTML entity references, &lt; and &gt; respectively. In URLs, reserved characters such as ;/?:@=& and unsafe characters such as the space and "<>#%{}|\^~[]` require encoding.
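
The HTML part of this rule can be sketched as a small escaping helper. This is an illustrative sketch only: the class name HtmlEscaper and the method escapeHtml are hypothetical, the helper covers just the five characters most commonly escaped in HTML text and attribute values, and a maintained encoding library is usually a better choice in production code.

```java
public class HtmlEscaper {

	// Replaces characters that are reserved in HTML with their entity references.
	public static String escapeHtml(String input) {
		StringBuilder sb = new StringBuilder(input.length());
		for (int i = 0; i < input.length(); i++) {
			char c = input.charAt(i);
			switch (c) {
				case '<':  sb.append("&lt;");   break;
				case '>':  sb.append("&gt;");   break;
				case '&':  sb.append("&amp;");  break;
				case '"':  sb.append("&quot;"); break;
				case '\'': sb.append("&#39;");  break;
				default:   sb.append(c);
			}
		}
		return sb.toString();
	}

	public static void main(String[] args) {
		// prints &lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;
		System.out.println(escapeHtml("<script>alert('x')</script>"));
	}
}
```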

XML parser defence: XML external entity (XXE) attacks are used to exfiltrate sensitive data, execute server-side port scanning and perform denial-of-service and other attacks by leveraging an XML external entity reference.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<foo>&xxe;</foo>

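For example, the XML fragment above can be used to exfiltrate the password file: when a vulnerable parser expands the &xxe; entity, the contents of /etc/passwd are included in the parsed document and may be returned to the attacker, for instance in the response shown in the browser.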

The best way to defend against XXE attacks is to leverage the security features provided by XML parsers. Specific defenses vary between XML parser implementations, but the platform offers a provision to configure security settings by passing arguments to DocumentBuilderFactory.setFeature().
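
A minimal sketch of such a configuration follows. It assumes the parser behind DocumentBuilderFactory is the JDK's built-in, Xerces-derived implementation, which recognizes the disallow-doctype-decl feature; other parsers may need different feature names. The class name SafeXmlParser and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class SafeXmlParser {

	// Builds a parser that refuses any document containing a DOCTYPE declaration.
	public static DocumentBuilder newHardenedBuilder() {
		try {
			DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
			// Assumption: Xerces-style feature name, supported by the JDK's default parser.
			factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
			factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
			return factory.newDocumentBuilder();
		} catch (ParserConfigurationException e) {
			throw new IllegalStateException("Parser does not support the requested features", e);
		}
	}

	// Returns true if the hardened parser rejects the given XML.
	public static boolean isRejected(String xml) {
		try {
			newHardenedBuilder().parse(new InputSource(new StringReader(xml)));
			return false;
		} catch (SAXException e) {
			return true;
		} catch (IOException e) {
			throw new UncheckedIOException(e);
		}
	}

	public static void main(String[] args) {
		String attack = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]><foo>&xxe;</foo>";
		System.out.println(isRejected(attack));          // prints true
		System.out.println(isRejected("<foo>ok</foo>")); // prints false
	}
}
```

Rejecting the DOCTYPE outright also covers entity expansion (XEE) attacks, since both rely on entity declarations inside a DTD.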
