Unicode code points in the following ranges are valid in XML 1.0 documents:
- U+0009, U+000A, U+000D: these are the only C0 controls accepted in XML 1.0;
- U+0020–U+D7FF, U+E000–U+FFFD: this excludes some (not all) non-characters in the BMP (all surrogates, U+FFFE and U+FFFF are forbidden);
- U+10000–U+10FFFF: this includes all code points in supplementary planes, including non-characters.
The preceding code point ranges also contain the following controls, which are valid only in certain contexts in XML 1.0 documents and whose use is restricted and highly discouraged:
- U+007F–U+0084, U+0086–U+009F: this includes DEL (U+007F) and every C1 control except U+0085 (NEL).
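import java.util.regex.Pattern;
// A u-escape takes exactly four hex digits, so supplementary code points need \x{...} (Java 7+).
Pattern p = Pattern.compile("[^\\u0009\\u000A\\u000D\\u0020-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]+");
requestXml = p.matcher(requestXml).replaceAll("");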
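For reference, here is a minimal self-contained sketch of the same idea. The class name `Xml10Chars` and its method names are made up for illustration. It assumes Java 7 or later, because it writes the supplementary planes with the `\x{h...h}` regex escape (Java's `\u` regex escape takes exactly four hex digits). It also adds a code-point predicate for the valid ranges and one for the discouraged controls listed above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from any library.
final class Xml10Chars {
    // Matches runs of code points that fall outside the valid XML 1.0 ranges.
    private static final Pattern INVALID = Pattern.compile(
        "[^\\u0009\\u000A\\u000D\\u0020-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]+");

    private Xml10Chars() {}

    /** True if the code point lies in one of the valid XML 1.0 ranges. */
    static boolean isValid(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True for the valid but discouraged controls U+007F-U+0084 and U+0086-U+009F. */
    static boolean isDiscouraged(int cp) {
        return (cp >= 0x7F && cp <= 0x84) || (cp >= 0x86 && cp <= 0x9F);
    }

    /** Removes every code point that is not valid in XML 1.0. */
    static String strip(String s) {
        return INVALID.matcher(s).replaceAll("");
    }
}
```

Calling `Xml10Chars.strip(requestXml)` removes forbidden controls and unpaired surrogates, while well-formed surrogate pairs (supplementary characters) pass through, since Java's regex engine matches by code point. Discouraged controls are kept by `strip`; `isDiscouraged` only lets you detect them if you want to treat them separately.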