Indic_General_90_CI_AS collation and server setu

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

SQL Server 2005 Forums

Other SQL Server Topics (2005)

Indic_General_90_CI_AS collation and server setu

Author

Topic

Mladen
Starting Member

6 Posts

Posted - 2008-07-28 : 04:51:22

Just realized that you cannot pick collation Indic_General_90_CI_AS from the list of available collations when installing the sql server.
http://msdn.microsoft.com/en-us/library/ms143508.aspx. I assume this is because it is Unicode only. Is there a place where this is explicitly stated?

The help under 2005 gives that for Hindi "Indic_General_90_CI_AS (Unicode only)" and under 2008 "Not available at server level."
So this brings up the question is what is the point in specifying the collation of the server if all the fields of the databases under the server are unicode ie. ntext , nvarchar , nchar , nvarchar(max). Presumably it is only needed if the principal language used has a different sort order than the same unicode ranges used by other languages using the same script?

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2008-07-28 : 05:50:02

N prefix specifies that the data will use 2 bytes for storing a single char while non N prefixed used 1 byte to store a single char.
Collation only specifies how the data is sorted when order by is applied.
so (Unicode only) means that you can't use this collation on non N prefixed datatypes.

_______________________________________________
Causing trouble since 1980
Blog: http://weblogs.sqlteam.com/mladenp
Speed up SSMS development: www.ssmstoolspack.com <- version 1.0 out!

Mladen
Starting Member

6 Posts

Posted - 2008-07-28 : 07:37:43

Thanks for the answer, I needed a confirmation. So would not the order by of Hindi characters in a table defined as nvarchar and having the collation Indic_General_90_CI_AS be the same as order by of Hindi characters in a table having the collation Latin1_General_CI_AS? I mean, is not the Hindi unicode range 0900 097F sorted the same reagardless of the collation, and if so what is the point of specifying Indic_General_90_CI_AS when this can only be only be specified in a Unicode context anyway, and then the 0900 097F range gets sorted the same regardess of collation.

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2008-07-28 : 07:39:44

as i have no ideas on differences between Indic_General_90_CI_AS and Latin1_General_CI_AS i can't answer this with absolute certainty. But logic dictates that if both collations were the same there wouldn't be 2 of them.

_______________________________________________
Causing trouble since 1980
Blog: http://weblogs.sqlteam.com/mladenp
Speed up SSMS development: www.ssmstoolspack.com <- version 1.0 out!

Mladen
Starting Member

6 Posts

Posted - 2008-07-28 : 09:10:53

" If both collations were the same there wouldn't be 2 of them" is quite reasonable but it still does not answer my question. Here we are dealing with a collation which is Unicode only, and presumably for Unicode 2 byte fields for storing a single character the role of the collation is reduced to specifying the sort order if there are mutiple languages with different sort orders using the same script. So the question perhaps is whether there are multiple languages using the same script range 0900 097F and whether Indic_General_90_CI_AS within 0900 097F sorts the same way as the Latin1_General_CI_AS within 0900 097F. I did a quick test and it seems so within the limited sample. So the question of the role of the Indic_General_90_CI_AS collation is still unclear to me since it is Unicode only and its role of determining the sort order would seem redundant since Latin1_General_CI_AS in the range 0900 097F sorts the same. What am I missing?

http://www.unicode.org/charts/

create table IndicCollationTable (name nvarchar(10) collate Indic_General_90_CI_AS)
create table LatinCollationTable (name nvarchar(10) collate Latin1_General_CI_AS)

insert into IndicCollationTable
values (nchar(2309))
insert into IndicCollationTable
values (nchar(2310))
insert into IndicCollationTable
values (nchar(2311))
insert into IndicCollationTable
values (nchar(2312))
insert into IndicCollationTable
values (nchar(2313))

insert into LatinCollationTable
values (nchar(2309))
insert into LatinCollationTable
values (nchar(2310))
insert into LatinCollationTable
values (nchar(2311))
insert into LatinCollationTable
values (nchar(2312))
insert into LatinCollationTable
values (nchar(2313))

select * from IndicCollationTable order by name
select * from LatinCollationTable order by name

spirit1
Cybernetic Yak Master

11752 Posts

Posted - 2008-07-28 : 09:15:10

you got me there... I have no idea on true differences between the two...
Maybe our local resident collation expert ArnoldFribble might know.
or a mail to MS would probably give you an answer.

_______________________________________________
Causing trouble since 1980
Blog: http://weblogs.sqlteam.com/mladenp
Speed up SSMS development: www.ssmstoolspack.com <- version 1.0 out!

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2008-07-30 : 04:34:17

SELECT n,
--NCHAR(n) AS c,
ROW_NUMBER() OVER (ORDER BY NCHAR(n) COLLATE Latin1_General_CI_AS) AS r_Latin1,
ROW_NUMBER() OVER (ORDER BY NCHAR(n) COLLATE Indic_General_90_CI_AS) AS r_Indic
FROM dbo.Numbers AS N
WHERE N.n BETWEEN 32 AND 65535
ORDER BY n

For single-character strings, you can clearly see the differences by plotting r_Latin1 against r_Indic.

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources