How to read an Oracle BLOB field containing text in VB.NET with the right encoding

Manuel Conde
3 min readNov 8, 2023

--

It may seem easy, but it took me two hours to find exactly the right solution to this problem. Let’s put ourselves in situation.

I “inherit” and old Oracle 10g DB with a table containing one field as a BLOB (Binary Large OBject) but, instead of being used to store an image or any other binary large object, it contains large texts (or not, just the possibility of containing text larger than 4000 chars).

The right choice for this case should be CLOB, not BLOB, but the guy who designed this DB chose BLOB and I should manage it.

Well, the theory was easy (once you know it, of course, not all people know what a binary field means): the data to retrieve is in a binary format and must be “transformed” to the text we want to obtain. How? Getting each byte of the file into an array and using functions that read the array and returns a text (GetString(bytearray)).

I’ve never done this before, but internet is full of examples. I found several, mostly for C#, but the idea is the same for VB. So by mixing the main ideas, I easily get the text inside the BLOB. Let’s check the string… uhmmm… something is not right, I see weird chars in the accented chararters (spanish texts).

Since I worked a lot with different databases and operating systems over all this years, I quickly indentified the problem: encoding.

This is my first VB.NET project, and the options I have to “extracting” the text once I read the binary data are limited to a set of common encondings: ASCII, UTF8, Unicode and a few others. Any of them worked with text.

So I went straight to the problem: what encoding was the DB using?

SELECT VALUE FROM V$NLS_PARAMETERS WHERE PARAMETER = ‘NLS_CHARACTERSET’

The response was: WE8MSWIN1252. A “subclass” of the classic Win1252.

Obviously, ASCII, UTF8, Unicode… will not print the right characters when using, e.g., Encoding.UTF8.GetString(bytearray).

Well, if there was not a GetString() function for Win1252 to retrieve the right chars, the logical solution was a transformation from Win1252 to UTF8, and then use Encoding.UTF8.GetString(bytearray)to retrieve something readable. I tried it and… it worked like a charm.

Here, the final code I wrote for testing purposes, in a table TEST with ID as number, and DATA as blob.

Imports System.IO
Imports System.Text
Imports Oracle.ManagedDataAccess.Client
...

Dim conn = New OracleConnection(mConnString)
Dim blob As OracleBlob
'Check values in https://learn.microsoft.com/en-us/dotnet/api/system.text.encoding?view=netframework-4.7.2
Dim win1252 = Encoding.GetEncoding(1252)

conn.Open()

Dim Sql = "select * from test"
Dim sqlCommand = New OracleCommand(Sql, conn)
Dim data As OracleDataReader = sqlCommand.ExecuteReader()
While data.Read()
'Get the blob field (the blob is binary data encoded with DB NLS_CHARACTERSET value)
'We can get it as an array of consecutive encoded bytes. SELECT VALUE FROM V$NLS_PARAMETERS WHERE PARAMETER = 'NLS_CHARACTERSET' will give the current DB encoding
'We are, currently, using WE8MSWIN1252 in Oracle, so win1252 (not the same, but very similar to ANSI) is the current encoding
blob = data.GetOracleBlob(1)

'Setup a byte array with blob data length
Dim bytANSI(blob.Length) As Byte
'Now, read data from blob var into our ANSI byte array
blob.Read(bytANSI, 0, blob.Length)

'Value is saved in Oracle table as Win1252 encoding.
'To retrieve it as string, we need to transform it to one of the GetString() available methods
'UTF8 is one of the more used encodings, so we transform it to UTF8
Dim bytUTF8 = Encoding.Convert(win1252, Encoding.UTF8, bytANSI)

'Finally, we have 2 byte arrays: on ANSI and on UTF8 encodings. To get the string, use GetString() from UTF8 byte array encoding
Dim strTemp = Encoding.UTF8.GetString(bytUTF8)

System.Console.WriteLine(strTemp)

'Save a file text with UTF8 and another with ANSI encoding to check that all is working fine
File.AppendAllText("C:\Temp\textUTF8.txt", strTemp)
File.AppendAllText("C:\Temp\textANSI.txt", strTemp, win1252)
End While
data.Close()
conn.Close()

So, problem solved. I simply want to leave this little pill for myself (maybe I will need it in the future) and for other people with the same problem.

--

--

No responses yet