535. XML Parsing Using Visual Basic
Rev. 1.0
This module introduces the MSXML 4 parser and the two main Visual
Basic APIs for parsing XML documents: SAX and the DOM. Students learn the basic
MSXML architecture and how to create parsers that expose SAX or DOM APIs in VB
code, and how to configure parsers according to the SAX features and properties
specification. SAX parsing is covered, working from simple SAX event handling
through patterns for understanding document content from event sequences, to
error handling and document validation. Students then learn how to read
document information using the DOM’s tree model and API, and move on to using
the DOM to modify existing documents and to create new documents and information nodes.
LEARNING OBJECTIVES
· Understand the use of SAX and DOM APIs for XML parsing.
· Use MSXML to write XML parsing code in Visual Basic.
· Parse element and attribute content, processing instructions, and other document information using SAX.
· Parse documents using the DOM.
· Modify, create and delete information in an XML document using the DOM.
Course Duration: 2 days
Prerequisites: Experience in Visual Basic programming. Basic knowledge of, though
not necessarily fluency in, reading and writing well-formed XML documents, and an
understanding of the concepts of valid documents and XML vocabularies. Both DTDs
and XML Schema are used in the module, and a structural understanding of either
will be very helpful.
1. The Microsoft XML Parser (MSXML)
Pure XML
Parsing XML
SAX and DOM
Comparison of SAX and DOM
What the W3C Says
What the W3C Doesn’t Say
MSXML Introduction
MSXML Deployment
2. The Simple API for XML (SAX)
Origins of SAX
The SAX Parser
The SAXReader CoClass
The SAX Event Model
The SAX Event Model – Interactions
The ContentHandler Interface
Reading Document Content
Handling Namespaces
SAX Features for Namespaces
Parsing Attributes
Error Handling
Handling Processing Instructions
DTD Validation
Schema Validation
XML for Object Persistence
Serialization with SAX
3. The Document Object Model (DOM)
Origins of the DOM
DOM Levels
DOM2 Structure
The DOMDocument CoClass
DOM Tree Model
DOM Interfaces
Document, Node and NodeList Interfaces
Element and Text Interfaces
Finding Elements By Name
Walking the Child List
The Attribute Interface
Namespaces and the DOM
Error Handling
The ProcessingInstruction Interface
The DOM and the XML InfoSet
Combining SAX and DOM
Object Serialization with the DOM
4. Manipulating XML Information with the DOM
Modifying Documents
Modifying Elements
Modifying Attributes
Managing Children
Cloning
Splitting Text and Normalizing
Creating New Documents
Object Persistence with the DOM
Adapting Object Models to the DOM
Learning Resources
Quick Reference: XML and DTD Grammar
System Requirements
Software for this module can be installed and run on Windows systems
only. Required tools are Visual Basic 6 and the MSXML parser, version 4.0 or
higher.
Hardware requirements are modest: a good minimal system for this
module would have a 500 MHz Pentium or equivalent CPU, 256 MB of RAM and at
least 500 MB of free disk space for tools installation and lab software.