Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Mladen
Starting Member
6 Posts |
Posted - 2008-07-28 : 04:51:22
|
Just realized that you cannot pick collation Indic_General_90_CI_AS from the list of available collations when installing the sql server. http://msdn.microsoft.com/en-us/library/ms143508.aspx. I assume this is because it is Unicode only. Is there a place where this is explicitly stated? The help under 2005 gives that for Hindi "Indic_General_90_CI_AS (Unicode only)" and under 2008 "Not available at server level." So this brings up the question is what is the point in specifying the collation of the server if all the fields of the databases under the server are unicode ie. ntext , nvarchar , nchar , nvarchar(max). Presumably it is only needed if the principal language used has a different sort order than the same unicode ranges used by other languages using the same script? |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2008-07-28 : 05:50:02
|
N prefix specifies that the data will use 2 bytes for storing a single char while non N prefixed used 1 byte to store a single char.Collation only specifies how the data is sorted when order by is applied.so (Unicode only) means that you can't use this collation on non N prefixed datatypes._______________________________________________Causing trouble since 1980Blog: http://weblogs.sqlteam.com/mladenpSpeed up SSMS development: www.ssmstoolspack.com <- version 1.0 out! |
 |
|
Mladen
Starting Member
6 Posts |
Posted - 2008-07-28 : 07:37:43
|
Thanks for the answer, I needed a confirmation. So would not the order by of Hindi characters in a table defined as nvarchar and having the collation Indic_General_90_CI_AS be the same as order by of Hindi characters in a table having the collation Latin1_General_CI_AS? I mean, is not the Hindi unicode range 0900 097F sorted the same reagardless of the collation, and if so what is the point of specifying Indic_General_90_CI_AS when this can only be only be specified in a Unicode context anyway, and then the 0900 097F range gets sorted the same regardess of collation. |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2008-07-28 : 07:39:44
|
as i have no ideas on differences between Indic_General_90_CI_AS and Latin1_General_CI_AS i can't answer this with absolute certainty. But logic dictates that if both collations were the same there wouldn't be 2 of them._______________________________________________Causing trouble since 1980Blog: http://weblogs.sqlteam.com/mladenpSpeed up SSMS development: www.ssmstoolspack.com <- version 1.0 out! |
 |
|
Mladen
Starting Member
6 Posts |
Posted - 2008-07-28 : 09:10:53
|
" If both collations were the same there wouldn't be 2 of them" is quite reasonable but it still does not answer my question. Here we are dealing with a collation which is Unicode only, and presumably for Unicode 2 byte fields for storing a single character the role of the collation is reduced to specifying the sort order if there are mutiple languages with different sort orders using the same script. So the question perhaps is whether there are multiple languages using the same script range 0900 097F and whether Indic_General_90_CI_AS within 0900 097F sorts the same way as the Latin1_General_CI_AS within 0900 097F. I did a quick test and it seems so within the limited sample. So the question of the role of the Indic_General_90_CI_AS collation is still unclear to me since it is Unicode only and its role of determining the sort order would seem redundant since Latin1_General_CI_AS in the range 0900 097F sorts the same. What am I missing? http://www.unicode.org/charts/create table IndicCollationTable (name nvarchar(10) collate Indic_General_90_CI_AS) create table LatinCollationTable (name nvarchar(10) collate Latin1_General_CI_AS) insert into IndicCollationTablevalues (nchar(2309))insert into IndicCollationTablevalues (nchar(2310))insert into IndicCollationTablevalues (nchar(2311))insert into IndicCollationTablevalues (nchar(2312))insert into IndicCollationTablevalues (nchar(2313))insert into LatinCollationTablevalues (nchar(2309))insert into LatinCollationTablevalues (nchar(2310))insert into LatinCollationTablevalues (nchar(2311))insert into LatinCollationTablevalues (nchar(2312))insert into LatinCollationTablevalues (nchar(2313))select * from IndicCollationTable order by nameselect * from LatinCollationTable order by name |
 |
|
spirit1
Cybernetic Yak Master
11752 Posts |
Posted - 2008-07-28 : 09:15:10
|
you got me there... I have no idea on true differences between the two...Maybe our local resident collation expert ArnoldFribble might know.or a mail to MS would probably give you an answer._______________________________________________Causing trouble since 1980Blog: http://weblogs.sqlteam.com/mladenpSpeed up SSMS development: www.ssmstoolspack.com <- version 1.0 out! |
 |
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2008-07-30 : 04:34:17
|
SELECT n, --NCHAR(n) AS c, ROW_NUMBER() OVER (ORDER BY NCHAR(n) COLLATE Latin1_General_CI_AS) AS r_Latin1, ROW_NUMBER() OVER (ORDER BY NCHAR(n) COLLATE Indic_General_90_CI_AS) AS r_IndicFROM dbo.Numbers AS NWHERE N.n BETWEEN 32 AND 65535ORDER BY nFor single-character strings, you can clearly see the differences by plotting r_Latin1 against r_Indic. |
 |
|
|
|
|
|
|