Storing Unicode in an old Oracle database
I have a Java application that stores data in an old Oracle database. The database's character set was originally set to Windows-1252 (WE8MSWIN1252 in Oracle's naming) and I don't have the option of changing it.
It turns out Oracle has an alternative set of column types (NCHAR, NVARCHAR2, NCLOB) that store text in the national character set, so they can hold Unicode even when the database character set can't represent it.
Converting columns
To convert a column, first add a temporarily named column with the new type (e.g. NVARCHAR2, NCLOB), copy the existing data into it, drop the old column, then rename the temporary column to the original name.
ALTER TABLE items ADD (nametmp NVARCHAR2(256));
UPDATE items SET nametmp = TO_NCHAR(name);
ALTER TABLE items DROP COLUMN name;
ALTER TABLE items RENAME COLUMN nametmp TO name;
JDBC issues
To make the thin JDBC driver send strings as NCHAR by default, set the connection property defaultNChar=true or the system property oracle.jdbc.defaultNChar=true.
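A minimal sketch of setting the connection property from Java. The URL, username and password here are placeholders, not values from the original setup:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class NCharConnect {
    // Builds the connection properties; user/password are placeholders.
    static Properties nCharProperties() {
        Properties props = new Properties();
        props.setProperty("user", "scott");
        props.setProperty("password", "tiger");
        // Make the thin driver bind java.lang.String parameters as NCHAR.
        props.setProperty("defaultNChar", "true");
        return props;
    }

    static Connection connect(String url) throws SQLException {
        return DriverManager.getConnection(url, nCharProperties());
    }

    public static void main(String[] args) {
        // e.g. connect("jdbc:oracle:thin:@//dbhost:1521/ORCL")
        System.out.println(nCharProperties().getProperty("defaultNChar"));
    }
}
```

Alternatively, pass -Doracle.jdbc.defaultNChar=true on the JVM command line to set the system property instead.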
Even with this set, I found Unicode characters could only be passed via bound parameters. String literals in queries, even with the N string flag as in SET name = N'example', silently had their characters replaced.
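Binding the value instead of inlining it looks like this. The id column and DAO shape are assumptions for illustration; setNString is the standard JDBC 4.0 way to force an NCHAR bind for a single parameter:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ItemDao {
    // Parameterised update; the id column is assumed for illustration.
    static final String UPDATE_NAME_SQL = "UPDATE items SET name = ? WHERE id = ?";

    static void updateName(Connection conn, long id, String name) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_NAME_SQL)) {
            // With defaultNChar=true a plain setString also binds as NCHAR;
            // setNString forces an NCHAR bind regardless of the driver setting.
            ps.setNString(1, name);
            ps.setLong(2, id);
            ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        System.out.println(UPDATE_NAME_SQL);
    }
}
```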