Description
The GetTextNormalize method returns the text data contained within the current PBDOM_CHARACTERDATA object, with all leading and trailing whitespace characters removed and each internal run of whitespace characters normalized to a single space.
Syntax
pbdom_chardata_name.GetTextNormalize()
Return value
String.
The following table lists the return values, based on the type of DOM object contained within PBDOM_CHARACTERDATA.
DOM Object | Return Value |
---|---|
PBDOM_TEXT | Suppose you have the following element: <abc> MY TEXT </abc> If there is a PBDOM_TEXT object to represent the text node " MY TEXT ", then calling GetTextNormalize on the PBDOM_TEXT returns the string MY TEXT. |
PBDOM_CDATA | Suppose there is the following CDATA section: <![CDATA[ They're saying "x < y" & that "z > y" so I guess that means that z > x ]]> If there is a PBDOM_CDATA to represent this CDATA section, then calling GetTextNormalize on it returns the string: They're saying "x < y" & that "z > y" so I guess that means that z > x Note that the initial spaces before "They're" and the trailing space after the last "x" are removed. Additionally, the spaces between the words "guess" and "that" are reduced to just one space. |
PBDOM_COMMENT | Suppose there is the following comment: <!--This is a comment --> Calling GetTextNormalize on this comment returns: This is a comment |
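The difference between the raw and the normalized text can be seen with a standalone PBDOM_TEXT object. The following is a minimal sketch, not taken from the original example: the variable names and the sample string are illustrative, and it assumes that a PBDOM_TEXT created with the CREATE statement can be given text with SetText before it is added to any document.

```powerscript
PBDOM_TEXT pbdom_txt
string ls_raw, ls_norm

TRY
    // Standalone text object with leading, trailing, and repeated
    // internal whitespace (illustrative sample data).
    pbdom_txt = Create PBDOM_TEXT
    pbdom_txt.SetText("   MY    TEXT   ")

    ls_raw  = pbdom_txt.GetText()           // text as stored
    ls_norm = pbdom_txt.GetTextNormalize()  // trimmed and collapsed

    // Brackets make the whitespace visible in the message boxes.
    MessageBox("GetText()", "[" + ls_raw + "]")
    MessageBox("GetTextNormalize()", "[" + ls_norm + "]")

    Destroy pbdom_txt
CATCH (PBDOM_Exception except)
    MessageBox("Exception Occurred", except.Text)
END TRY
```

Based on the description of the method, the second message box should show the two words separated by a single space with nothing between the text and the brackets.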
Throws
EXCEPTION_PBDOM_OBJECT_INVALID_FOR_USE -- If this PBDOM_CHARACTERDATA is not a reference to an object derived from PBDOM_CHARACTERDATA.
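One situation that appears to match this condition is calling the method on a direct instance of the base class rather than on a PBDOM_TEXT, PBDOM_CDATA, or PBDOM_COMMENT. The sketch below assumes that Create PBDOM_CHARACTERDATA itself succeeds and that the exception is raised only when a method is called; the variable names are illustrative.

```powerscript
PBDOM_CHARACTERDATA pbdom_chrdata
string ls_text

TRY
    // The object is the base class itself, not one of the derived
    // classes PBDOM_TEXT, PBDOM_CDATA, or PBDOM_COMMENT.
    pbdom_chrdata = Create PBDOM_CHARACTERDATA

    // Assumption: this call raises
    // EXCEPTION_PBDOM_OBJECT_INVALID_FOR_USE.
    ls_text = pbdom_chrdata.GetTextNormalize()
CATCH (PBDOM_Exception except)
    MessageBox("Exception Occurred", except.Text)
FINALLY
    // Clean up whether or not the exception was thrown.
    if IsValid(pbdom_chrdata) then Destroy pbdom_chrdata
END TRY
```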
Examples
This example demonstrates:
1. Using an external general parsed entity.
2. Using a single line statement to obtain the children PBDOM_OBJECTs of an element.
3. Obtaining the text of the three separate types of PBDOM_CHARACTERDATA objects: PBDOM_TEXT, PBDOM_COMMENT, and PBDOM_CDATA.
4. Obtaining the normalized text of the same three separate types of PBDOM_CHARACTERDATA objects.
5. The difference between the two types of text retrieved in 3 and 4.
Suppose the file C:\entity_text.txt contains the following string:
	 Some External  	 Text 	
The example creates a PBDOM_DOCUMENT pbdom_doc based on the following DOM tree, which is in the file C:\inputfile.txt:
    <!DOCTYPE abc [<!ENTITY text1 SYSTEM "c:\entity_text.txt" >]>
    <abc>
      <data>
        &text1;
        <!-- &text1;-->
        <![CDATA[&text1;]]>
      </data>
    </abc>
The Document Type Declaration defines an external general parsed entity text1.
The example obtains the root element, uses it to obtain the data child element, and then obtains an array of the child element's own children. PBDOM collects all the PBDOM_OBJECTs that are the children of data and stores them in the PBDOM_OBJECT array pbdom_obj_array.
Next, the FOR loop iterates through all the items in pbdom_obj_array and stores each item in the PBDOM_CHARACTERDATA array pbdom_chardata. This step is not required -- the pbdom_obj_array can be used to manipulate the data element's children. It is done to demonstrate that you can cast each item into a PBDOM_CHARACTERDATA object by assigning it into a PBDOM_CHARACTERDATA array. This is possible if and only if each PBDOM_OBJECT is also derived from PBDOM_CHARACTERDATA. If a PBDOM_OBJECT is not derived from PBDOM_CHARACTERDATA, the PowerBuilder VM throws an exception.
The next FOR loop iterates through all the items of the pbdom_chardata array and calls the GetText and GetTextNormalize methods on each. Each of the returned strings from GetText and GetTextNormalize is delimited by "[" and "]" characters so that the complete text content displays clearly in the message boxes.
The first child of data is the PBDOM_TEXT &text1;, which has been declared as an external general parsed entity whose content is the content of the file c:\entity_text.txt. The &text1; entity reference and the entity references it contains are expanded by the parser. GetText returns the expanded text with its whitespace intact, whereas GetTextNormalize strips the leading and trailing whitespace and reduces each internal run of whitespace to a single space, leaving Some External Text.
The second child of data is the PBDOM_COMMENT <!-- &text1;--> and the third child is the PBDOM_CDATA <![CDATA[&text1;]]>. Entity references within comments and CDATA sections are never expanded. Both GetText and GetTextNormalize return &text1;.
    PBDOM_Builder pbdombuilder_new
    pbdom_document pbdom_doc
    PBDOM_CHARACTERDATA pbdom_chardata[]
    PBDOM_OBJECT pbdom_obj_array[]
    integer iFileNum1
    long l = 0

    TRY
      pbdombuilder_new = Create PBDOM_Builder
      pbdom_doc = pbdombuilder_new.BuildFromFile &
         ("C:\inputfile.txt")

      // Obtain the children of the data element in one statement
      pbdom_doc.GetRootElement(). &
         GetChildElement("data"). &
         GetContent(pbdom_obj_array)

      // Cast each child to PBDOM_CHARACTERDATA by assignment
      for l = 1 to UpperBound(pbdom_obj_array)
         pbdom_chardata[l] = pbdom_obj_array[l]
      next

      // Show the raw and the normalized text of each child
      for l = 1 to UpperBound(pbdom_chardata)
         MessageBox(pbdom_chardata[l]. &
            GetObjectClassString() + " GetText()", &
            "[" + pbdom_chardata[l].GetText() + "]")
         MessageBox(pbdom_chardata[l]. &
            GetObjectClassString() + " GetTextNormalize()", &
            "[" + pbdom_chardata[l].GetTextNormalize() + "]")
      next

      Destroy pbdombuilder_new

    CATCH (PBDOM_Exception except)
      MessageBox("Exception Occurred", except.Text)
    END TRY
Usage
If no textual value exists for the current PBDOM_OBJECT, or if only whitespace characters exist, an empty string is returned.
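Because whitespace-only content also yields an empty string, the return value can serve as a quick test for whether a node carries any meaningful text. A minimal sketch, assuming a standalone PBDOM_TEXT populated with SetText; the variable names and the sample string are illustrative:

```powerscript
PBDOM_TEXT pbdom_txt
string ls_value

TRY
    pbdom_txt = Create PBDOM_TEXT
    // Only spaces, a tab, and a carriage return/line feed pair.
    pbdom_txt.SetText("  ~t ~r~n ")

    ls_value = pbdom_txt.GetTextNormalize()

    if Len(ls_value) = 0 then
        MessageBox("GetTextNormalize()", "No meaningful text")
    else
        MessageBox("GetTextNormalize()", "[" + ls_value + "]")
    end if

    Destroy pbdom_txt
CATCH (PBDOM_Exception except)
    MessageBox("Exception Occurred", except.Text)
END TRY
```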
See also