
 replace strange symbols


diegolaz
Starting Member

6 Posts

Posted - 2009-09-28 : 22:44:59
After some settings were changed, some of our data got saved with strange symbols.
How can I find out what these symbols mean, so I can run a mass REPLACE to fix them?
Originally they were accented characters (in Spanish). Is there an equivalence table?

For example in this sentence:
‘quien contamina paga’

Thanks!

diegolaz
Starting Member

6 Posts

Posted - 2009-09-28 : 23:31:41
The one I mostly have is – ... how can I find out what it means?

diegolaz
Starting Member

6 Posts

Posted - 2009-09-30 : 23:58:01
OK, I managed to fix most of the bad characters with:
UPDATE defs SET def = REPLACE(def, '–', '-')
UPDATE defs SET def = REPLACE(def, '“', '"')
UPDATE defs SET def = REPLACE(def, '� ', '"')
UPDATE defs SET def = REPLACE(def, '�.', '".')
UPDATE defs SET def = REPLACE(def, '�,', '",')
UPDATE defs SET def = REPLACE(def, '�)', '")')
UPDATE defs SET def = REPLACE(def, '�<', '"<')
UPDATE defs SET def = REPLACE(def, '�;', '";')

The main one I'm still missing, and I'm not sure what it is, is •

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2009-10-02 : 09:16:48
quote:
Originally posted by diegolaz

The main one I'm still missing, and I'm not sure what it is, is •



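It's BULLET (U+2022), i.e. NCHAR(0x2022),
which is encoded in UTF-8 as the 3 bytes 0xE2 0x80 0xA2. Read as code page 1252, those bytes are â (0xE2), € (0x80) and ¢ (0xA2), which is exactly what you are seeing.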
(see, for example, http://www.fileformat.info/info/unicode/char/2022/index.htm )

The problem you have is that the text that was inserted into the table was UTF-8 encoded Unicode, but it was inserted as if it were Windows code page 1252 encoded. So the multiple bytes that make up non-ASCII characters in UTF-8 have been misinterpreted as multiple characters.
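
To make the mechanism concrete, here is a minimal Python sketch (Python rather than T-SQL, purely as an illustration outside the database). The `mojibake` helper and the list of sample characters are assumptions chosen for this example, not something from the original posts. It encodes each character as UTF-8 and then decodes the bytes as Windows-1252, which produces the kind of equivalence table asked about at the top of the thread.

```python
def mojibake(ch: str) -> str:
    """Return how `ch` appears when its UTF-8 bytes are read as Windows-1252."""
    # errors="replace" yields U+FFFD for bytes that Python's cp1252 codec
    # leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    return ch.encode("utf-8").decode("cp1252", errors="replace")


# Hypothetical sample set: Spanish accented letters plus common "smart" punctuation.
samples = "áéíóúñ¿¡\u2018\u2019\u201c\u201d\u2013\u2022"

# Map each intended character to its misread form.
table = {ch: mojibake(ch) for ch in samples}

for ch, bad in table.items():
    print(f"U+{ord(ch):04X} {ch!r} -> {bad!r}")
```

One thing the table makes visible: the right double quotation mark (U+201D) ends in byte 0x9D, which has no printable character in code page 1252. That is possibly why the closing-quote patterns in the REPLACE list above show up as an unprintable symbol instead of a readable three-character sequence.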

I wrote a function a few years ago to convert varchar strings that were misencoded in this way back to nvarchar strings:

http://www.sqlteam.com/Forums/topic.asp?TOPIC_ID=62406
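
The linked T-SQL function is not reproduced here. As a rough illustration of the same idea, the following Python sketch reverses the damage by re-encoding the text as Windows-1252 and decoding the resulting bytes as UTF-8. The `repair` name and the fall-back behaviour are assumptions made for this example; it would apply to data exported from the table, not run inside SQL Server.

```python
def repair(text: str) -> str:
    """Undo 'UTF-8 read as cp1252' damage; return the input unchanged if that fails."""
    try:
        # Recover the original bytes, then interpret them as the UTF-8 they were.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Text that was already correct, or only partly damaged, does not
        # round-trip, so leave it alone rather than guess.
        return text


# The sample sentence from the first post, written with escapes for clarity.
damaged = "\u00e2\u20ac\u02dcquien contamina paga\u00e2\u20ac\u2122"
print(repair(damaged))
```

Falling back to the unchanged input is a deliberately conservative choice: rows that mix damaged and undamaged text, or that lost a byte such as 0x9D along the way, are left as they are and would still need targeted REPLACE statements like the ones earlier in the thread.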