How to read data from xml file on Google App Engine – Part 1, Setup & Prerequisites

Google App Engine is perhaps one of the best things to have happened to the small-budget app developer (who dreams of making the most successful web app of the century) as far as cloud computing is concerned (IMHO). It takes a lot of complexity out of your development cycle and lets you focus on your main aim: building that killer app! And you don't have to worry if your web app starts receiving hundreds of thousands of hits a day; Google App Engine will automatically scale your app for you.

But not everything is as rosy as it sounds. Google has restricted various features that would otherwise be available had you chosen a normal hosting provider and a standard database. The topic of this blog entry is one such restriction: the inability to read from a file stored locally in the application.

There are several scenarios in which you may need to read data from an XML file stored in the application’s war. In my case, I wanted to populate the datastore from a particular XML file at the application’s startup, so that this data (which happens to be the bare minimum required for my app) is in the datastore from the start. The other way was to create a form for inserting this data, which is an even more horrific idea than it sounds.

There are various parsers available for parsing XML files. I personally like JDOM, even though it is heavy on resources (memory). As of this moment I have been unable to use JDOM for parsing XML data on Google App Engine (this is because JDOM uses the Java IO library, which does not work on Google App Engine).

Finally I decided to write my own handler for reading the data from XML using the basic SAXParser. I shall create a sample project to describe how to read the XML data.

The example that I have taken assumes that you are using JDO for persistence and wish to create JDO objects and populate them with the values stored in the xml file.

The image below shows the directory structure of my demo project (XML Parsing Demo Project).

Directory Structure of Demo Project

Important Note: App Engine considers all files inside WEB-INF as dynamic files and the files outside it as static files. Hence I have placed the Data.xml file directly inside the war.

Now let's have a look at the Data.xml file.

Data.xml

<?xml version="1.0" encoding="utf-8"?>
<data>
	<countries>
		<country name="India">
			<capital>New Delhi</capital>
		</country>
		<country name="Japan">
			<capital>Tokyo</capital>
		</country>
	</countries>
</data>

It's a pretty simple XML file that doesn't follow any XML schema (and hence you need to be extra careful about the placement of your tags). Each country has a name and a capital. The name is an attribute of the country tag, and the capital is a tag nested inside country. Finally, the countries are grouped in a countries tag.
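To make that structure concrete, here is a minimal sketch of how a SAX handler could walk it. This is not the CustomXMLParser from Part 2; the names CountrySaxSketch and CountryHandler are made up for illustration, and the XML is fed in as a string so the sketch runs anywhere with a plain JDK.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
class CountrySaxSketch {

    // Collects country name -> capital pairs as the SAX events arrive.
    static class CountryHandler extends DefaultHandler {
        final Map<String, String> capitals = new LinkedHashMap<String, String>();
        private String currentCountry; // value of the name attribute of the open <country>
        private StringBuilder text;    // non-null only while inside <capital>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("country".equals(qName)) {
                currentCountry = attrs.getValue("name");
            } else if ("capital".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // characters() may be called several times for one element, so append
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("capital".equals(qName)) {
                capitals.put(currentCountry, text.toString().trim());
                text = null;
            }
        }
    }

    // Parses an XML string shaped like Data.xml and returns the collected pairs.
    static Map<String, String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        CountryHandler handler = new CountryHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.capitals;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<data><countries>"
                + "<country name=\"India\"><capital>New Delhi</capital></country>"
                + "<country name=\"Japan\"><capital>Tokyo</capital></country>"
                + "</countries></data>";
        System.out.println(parse(xml)); // prints {India=New Delhi, Japan=Tokyo}
    }
}
```

Since there is no schema, the handler only reacts to the tag names it knows (country and capital) and silently ignores everything else, which is why a misplaced tag would go unnoticed.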

Next shown below is the web.xml file.

web.xml

<?xml version="1.0" encoding="utf-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://java.sun.com/xml/ns/javaee"
xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" version="2.5">
	<servlet>
		<servlet-name>XMLDemoServlet</servlet-name>
		<servlet-class>com.shank.xml.demo.servlets.XMLDemoServlet</servlet-class>
	</servlet>
	<servlet-mapping>
		<servlet-name>XMLDemoServlet</servlet-name>
		<url-pattern>/xmlDemo</url-pattern>
	</servlet-mapping>
	<welcome-file-list>
		<welcome-file>index.html</welcome-file>
	</welcome-file-list>
</web-app>

The listing below shows our JDO data class for a country. Please note that I have omitted the getters and setters to keep the code to a minimum.

Country class

package com.shank.xml.demo.jdo;

import java.util.logging.Logger;

import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;
import javax.jdo.annotations.PrimaryKey;

/**
 *
 * @author shank
 *
 */
@PersistenceCapable
public class Country {

 // the logger for this class
 private static final Logger logger = Logger.getLogger(Country.class
   .getName());

 // the id
 @PrimaryKey
 @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
 private Long id;

 // the name of the country
 @Persistent
 private String name;

 // the name of the capital of this country
 @Persistent
 private String capitalName;

 // constructors
 public Country(String countryName) {
  this.name = countryName;
 }

 // getters and setters...
}

Next in line, the listing below is for the PMF class, which applies the singleton pattern to the PersistenceManagerFactory instance. This is advantageous because creating a PersistenceManagerFactory is a very heavy task, whereas getting a PersistenceManager from an existing factory is cheap.

PMF class

package com.shank.xml.demo.jdo;

import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManagerFactory;

/**
 * @author shank
 * 
 */
public final class PMF {

	private static final PersistenceManagerFactory pmfInstance = JDOHelper
			.getPersistenceManagerFactory("transactions-optional");

	private PMF() {
	}

	public static PersistenceManagerFactory get() {
		return pmfInstance;
	}

}
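The effect of this pattern can be seen without JDO. The sketch below is only an illustration: ExpensiveFactory and FactoryHolder are hypothetical stand-ins for PersistenceManagerFactory and PMF, and the counter exists just to show that the expensive object is built once no matter how many times get() is called.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an object that is expensive to build,
// playing the role that PersistenceManagerFactory plays in PMF.
final class ExpensiveFactory {
    static final AtomicInteger created = new AtomicInteger();

    ExpensiveFactory() {
        created.incrementAndGet(); // count how many times an instance is built
    }
}

// Same shape as PMF: one eagerly created instance, a private constructor
// and a static accessor.
final class FactoryHolder {
    private static final ExpensiveFactory INSTANCE = new ExpensiveFactory();

    private FactoryHolder() {
    }

    static ExpensiveFactory get() {
        return INSTANCE;
    }
}

class SingletonSketch {
    public static void main(String[] args) {
        ExpensiveFactory a = FactoryHolder.get();
        ExpensiveFactory b = FactoryHolder.get();
        System.out.println(a == b);                         // prints true
        System.out.println(ExpensiveFactory.created.get()); // prints 1
    }
}
```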

Next let us have a look at the DatastoreUtilities class. This class has a method that checks whether a JDO object of a specific class with a given name already exists in the datastore. Below is the listing for this class.

DatastoreUtilities class

package com.shank.xml.demo.jdo;

import java.util.List;
import java.util.logging.Logger;

import javax.jdo.PersistenceManager;
import javax.jdo.Query;

/**
 * This is the common datastore utilities class.
 * 
 * The functions in this class are static and are the common functions that are required
 * for operating on the datastore
 * 
 * @author shank
 *
 */
public class DatastoreUtilities {
	
	private static Logger logger = Logger.getLogger("CommonDatastoreUtilities");
	
	/**
	 * Generic helper that checks whether an entity with the given name already
	 * exists among the stored entities of the given class. It obtains its own
	 * persistence manager from PMF and closes it before returning.
	 * 
	 * @param <T> the entity type
	 * @param nameToBeChecked the value to look for in the entity's "name" field
	 * @param entityClass the persistent class to query
	 * @return true if at least one matching entity exists, false otherwise
	 */
	@SuppressWarnings("unchecked")
	public static <T> boolean checkIfEntityExists(String nameToBeChecked,
			Class<T> entityClass) {
		logger.info("inside checkIfEntityExists() : looking for "+nameToBeChecked+" in "+entityClass.getSimpleName());
		
		PersistenceManager pm = PMF.get().getPersistenceManager();
		try {
			Query query = pm.newQuery(entityClass, "name == nameParam");
			query.declareParameters("String nameParam");
			
			List<T> existingEntities = (List<T>)query.execute(nameToBeChecked);
			logger.info("Existing Entities with the same name are : "+existingEntities);
			
			// any match counts as "exists", not only exactly one
			return !existingEntities.isEmpty();
		} finally {
			// always release the persistence manager, even if the query fails
			pm.close();
		}
	}
		
}
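A check like this is handy when the seeding request can be made more than once. The sketch below shows the "check, then insert" idea under that assumption; it is not code from the demo project. InMemoryCountryStore is a hypothetical in-memory stand-in for the datastore, so the example runs without JDO, and its exists() method plays the role of checkIfEntityExists().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the datastore, so the sketch runs without JDO.
class InMemoryCountryStore {
    private final List<String> names = new ArrayList<String>();

    // Plays the role of DatastoreUtilities.checkIfEntityExists(name, Country.class).
    boolean exists(String name) {
        return names.contains(name);
    }

    // Inserts only when no entity with this name is stored yet.
    // Returns true if a new entry was added.
    boolean saveIfAbsent(String name) {
        if (exists(name)) {
            return false;
        }
        names.add(name);
        return true;
    }

    int size() {
        return names.size();
    }
}

class SeedSketch {
    public static void main(String[] args) {
        InMemoryCountryStore store = new InMemoryCountryStore();
        String[] fromXml = {"India", "Japan"};

        // Simulate the seeding request being made twice.
        for (int run = 0; run < 2; run++) {
            for (String name : fromXml) {
                store.saveIfAbsent(name);
            }
        }
        System.out.println(store.size()); // prints 2, not 4
    }
}
```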

We shall now look at the servlet that triggers reading the data from our Data.xml file. The reading itself is done by the custom XML parser that we shall see in Part 2 of this post. The custom parser then creates Country objects from this data and persists them to the datastore.

XMLDemoServlet class

package com.shank.xml.demo.servlets;

import java.io.IOException;
import java.util.logging.Logger;

import javax.servlet.http.*;

import com.shank.xml.CustomXMLParser;

public class XMLDemoServlet extends HttpServlet {
	
	//the logger for this class
	private Logger logger = Logger.getLogger(this.getClass().getName());
	
	public void doGet(HttpServletRequest req, HttpServletResponse resp)
			throws IOException {
		logger.info("inside doGet()");
				
		logger.info("calling customParser.doParsing()");
		
		//call the doParsing method on our CustomXMLParser
		new CustomXMLParser().doParsing();
		
		resp.setContentType("text/plain");
		resp.getWriter().println("Parsing Done !  Check the datastore");
	}
}

Next shown below is the index.html page, which is the default welcome page for the application. It contains a link titled “Parse Data.xml”. When this link is clicked, control goes to the XMLDemoServlet, as defined by the /xmlDemo mapping in the web.xml file.

index.html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- The HTML 4.01 Transitional DOCTYPE declaration-->
<!-- above set at the top of the file will set     -->
<!-- the browser's rendering engine into           -->
<!-- "Quirks Mode". Replacing this declaration     -->
<!-- with a "Standards Mode" doctype is supported, -->
<!-- but may lead to some differences in layout.   -->

<html>
  <head>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
    <title>Hello App Engine</title>
  </head>

  <body>
    <h1>Hello App Engine!</h1>
	
    <table>
      <tr>
        <td colspan="2" style="font-weight:bold;">Available Servlets:</td>        
      </tr>
      <tr>
        <td><a href="xmlDemo">Parse Data.xml</a></td>
      </tr>
    </table>
  </body>
</html>

That is all for part-1 of this post. In part-2 we shall have a look at the CustomXMLParser code that shall do the actual parsing work.
