You may want to look at VTD-XML ("The Future of XML Processing"), which is ideally suited to the task you outlined.