Having problems after Security Update for .Net 2.0 (KB928365)?
At least we had. One of our customers reported a strange error in one of our web apps running .Net 2.0. They got an error message when the web app (.Net 2.0) called a web service (running on .Net 1.1), and we just couldn't figure it out. We and our customer started to investigate, and our customer found that after uninstalling the Security Update for .Net 2.0 (KB928365) things started to work again! Nice work, customer!
We then started to investigate what caused the problem and realized that a week earlier we had come across an error related to parsing Unicode. The problem is related to how the framework (or rather the UTF8Encoding class) handles invalid bytes (see KB940521), only we didn't know that at the time. Here's the short version of the KB article:
…the behavior of the UTF8Encoding class,
the UnicodeEncoding class, and the UTF32Encoding class changes to
comply with the Unicode 5.0 requirements for Unicode encodings. Invalid bytes
are not removed. Instead, the invalid bytes are replaced by the Unicode
character U+FFFD.
We had an invalid byte at the beginning of one of our XML documents that we manually created and returned as a string using UTF8Encoding. Don't ask why we convert it to a string; it's old code that we haven't replaced with XmlNode yet. Since the invalid byte is now (in 2.0 with the security update) replaced with a Unicode character instead of being dropped, it takes up more space and we had to remove more bytes. Since this class was written in .Net 1.1 but is also used in 2.0, we made a quick fix that checks which framework is running and changes the behavior accordingly.
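For anyone facing the same thing, here is a minimal sketch of how such a fix could look. It is not our actual code: the class and method names are made up for illustration, and instead of checking the framework version it probes how the runtime decodes a known invalid byte, since the behavior depends on the security update and not only on the framework version. It also assumes the XML always starts with '<'.

```csharp
using System;
using System.Text;

// Hypothetical helper, not the original fix from this post.
public static class XmlPrefixCleaner
{
    // Returns true if this runtime's UTF8Encoding replaces an invalid
    // byte with U+FFFD instead of silently dropping it.
    public static bool ReplacesInvalidBytes()
    {
        byte[] probe = { 0xFF, 0x41 }; // invalid byte followed by 'A'
        string decoded = new UTF8Encoding().GetString(probe);
        return decoded.IndexOf('\uFFFD') >= 0;
    }

    // Removes everything before the first '<', so the result is the same
    // whether the invalid byte was dropped or replaced by U+FFFD.
    public static string StripBeforeRoot(string xml)
    {
        int start = xml.IndexOf('<');
        return start > 0 ? xml.Substring(start) : xml;
    }

    public static void Main()
    {
        byte[] data = { 0xFF, 0x3C, 0x61, 0x2F, 0x3E }; // invalid byte + "<a/>"
        string raw = new UTF8Encoding().GetString(data);
        Console.WriteLine(ReplacesInvalidBytes());
        Console.WriteLine(StripBeforeRoot(raw));
    }
}
```

Stripping up to the first '<' avoids having to count how many characters to remove, which is what broke for us when the invalid byte stopped disappearing.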
The KB article states:
Earlier versions of the .NET Framework 2.0
followed the latest available Unicode standard, Unicode 4.1. The specifications
for Unicode 4.1 disallowed the passing of invalid UTF code points. Any invalid
data that was encountered was dropped. This behavior was considered to have
minimal effect on current programs.
And then:
Before this change, invalid characters in the
middle of text strings would be silently removed. For example, the string
“Ad\xD800min\xDC00istrator” would change to “Administrator”
because the Unicode characters U+D800 and U+DC00 are invalid. This could cause
a security problem for some programs. After you install security bulletin
MS07-040, this string now becomes “Ad\xFFFDmin\xFFFDistrator.” This
string is decoded to “Ad�min�istrator,” where the � is the Unicode
replacement character.
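To see the difference between the two behaviors side by side, here is a small sketch. It uses explicit decoder fallbacks to imitate "drop" and "replace"; that is only an approximation of what the framework did before and after the update, and the class and method names are invented for the example.

```csharp
using System;
using System.Text;

// Illustrative only: imitates the old and new decoding behavior
// with explicit fallbacks rather than relying on the installed patch level.
public static class InvalidByteDemo
{
    // Invalid bytes become U+FFFD (behavior after the security update).
    public static string DecodeReplacing(byte[] data)
    {
        Encoding enc = Encoding.GetEncoding(
            "utf-8",
            new EncoderReplacementFallback("\uFFFD"),
            new DecoderReplacementFallback("\uFFFD"));
        return enc.GetString(data);
    }

    // Invalid bytes are dropped (approximates the behavior before the update).
    public static string DecodeDropping(byte[] data)
    {
        Encoding enc = Encoding.GetEncoding(
            "utf-8",
            new EncoderReplacementFallback(""),
            new DecoderReplacementFallback(""));
        return enc.GetString(data);
    }

    public static void Main()
    {
        // "Ad", then the invalid byte 0xFF, then "min"
        byte[] data = { 0x41, 0x64, 0xFF, 0x6D, 0x69, 0x6E };

        Console.WriteLine(DecodeDropping(data).Length);  // 5: "Admin"
        Console.WriteLine(DecodeReplacing(data).Length); // 6: "Ad", U+FFFD, "min"
    }
}
```

The one extra character in the second result is exactly the kind of shift that broke our code, which assumed the invalid byte would vanish.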
So if you have the same "bad" code, or are using Unicode encodings with invalid bytes somewhere, you might experience the same problem, and hopefully this will be of help.