Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Fun Puzzler

Author  Topic 

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-29 : 14:43:45
Given a recordset of names

Select Names FROM MyTable

The values in column Names in the recordset have a variable number of beginning characters that match.

What's the easiest way to find the number of matching characters that are common among all rows?

Sam

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-29 : 15:11:16
Got any sample data?

Do you mean like Brett and Brian share the same 2?

And Billy And BillyJo share 4?



Brett

8-)
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-29 : 15:28:23
Yep. Here's some sample data for you:

abcdthis
abcdthat
abcdtheother

Those 3 rows have the first 6 characters matching.

Sam
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2003-10-29 : 15:38:17
Tally ho!


SELECT TOP 1 n-1
FROM Numbers
WHERE n > 0
AND SUBSTRING((SELECT MIN(Names) FROM MyTable), n, 1) <>
SUBSTRING((SELECT MAX(Names) FROM MyTable), n, 1)
ORDER BY n


Though you may want to force a binary collation on Names depending on what characters it contains.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-10-29 : 15:41:25
quote:
Originally posted by Arnold Fribble

Tally ho!


SELECT TOP 1 n
FROM Numbers
WHERE n > 0
AND SUBSTRING((SELECT MIN(Names) FROM MyTable), n, 1) <>
SUBSTRING((SELECT MAX(Names) FROM MyTable), n, 1)


Though you may want to force a binary collation on Names depending on what characters it contains.




Sooooo coooool

Tally Ho indeed!



Brett

8-)
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2003-10-29 : 15:42:42
Shame I forgot to subtract one until the second edit.
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-29 : 15:46:24
Hi Arnold !

Pretty neat.

But I'm not sure your solution looks at every Name in MyTable anywhere?

select min(names) from mytable

will return the same name no matter what N is in your query.

Sam

I'll crunch this in QA to check.
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2003-10-29 : 15:54:44
Why does it need to?
The initial characters shared by the smallest and largest value in Names must be shared by all of them. I think...
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-29 : 16:00:01
That is the puzzle piece I'm missing.

If you're right, then the calculation is done only on the MIN and MAX, not the entire table repeatedly.

Thanks Arnold,

Sam
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-10-30 : 09:13:22
Can someone please explain the n notation in the preceeding SQL?
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-30 : 09:18:17
N is a column value from the table Tally. Tally tables can be very useful for some queries (like this).

There are a few articles on SQLTEAM about using Tally tables.

Sam
Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-10-30 : 09:56:29
Thank you very much Sam, I will study those articles.

Does the following apply or am I missing the mark...

declare @strsql varchar(1000),
@maxlen int,
@i int

select @maxlen = max(len(Names)) from MyTable
select @strsql = 'create table ##names (nid int identity(0,1),val varchar(' + cast(@maxlen as varchar(10)) + '),numexists int)' from MyTable
exec (@strsql)

set @i=1
while @i<=@maxlen
begin
select @strsql = 'insert into ##names select distinct rtrim(ltrim(left(Names,' + cast(@i as varchar(10)) + '))),count(*) from MyTable where rtrim(ltrim(left(Names,' + cast(@i as varchar(10)) + ')))not in (select val from ##names) group by rtrim(ltrim(left(Names,' + cast(@i as varchar(10)) + ')))'
exec (@strsql)
set @i = @i+1
end

select val,numexists from ##names order by numexists desc
drop table ##names
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-30 : 10:30:37
Hi ehorn,

It looks like your select may work, but the solution proposed by Arnold is hard to beat. There's no iteration, it works on an existing table, doesn't require dynamic SQL.

And it's even better once I've improved it

DECLARE @Maxname AS VARCHAR (100)
DECLARE @Minname AS VARCHAR (100)

SELECT @Maxname = MAX(Names), @MinName = MIN(Names) FROM MyTable

SELECT MAX(id) AS MatchLength
FROM Tally
WHERE id BETWEEN 1 AND LEN(@Minname) -- No point in evaluating past the minimum
AND SUBSTRING(@Minname, 1, ID) = SUBSTRING(@Maxname, 1, ID)


Couple of bugs found - Substring args were reversed, and the compare should be = not <> and I sort of prefer MAX rather than TOP 1, ORDER BY, and I like the use of MAX / MIN temporary variables marginally better than coding the queries together. I doubt there's any performance difference, but it clearly shows that the data table MyTable(Names) isn't repeatedly scanned to find the solution.

This doesn't take away from the beauty of Arnold's solution - the use of MAX / MIN Name to represent the search set - rather than the entire table.

The approach I would have taken was like yours ehorn. I'd have done repetitive scans of the data to find the MAX match of characters.

Sam

Go to Top of Page

ehorn
Master Smack Fu Yak Hacker

1632 Posts

Posted - 2003-10-30 : 10:54:16
I have much to learn - what a great site!!
Go to Top of Page

homam
Starting Member

31 Posts

Posted - 2003-10-30 : 14:12:11
Arnold's solution is pretty smart! It finds the solution by attacking the problem from a completely different angle: instead of directly finding the matches, let's find the first unmatch position and our answer is that number minus one. I like that!


Edit: BTW, it doesn't matter if you use min() or max(); it'll work with any pair.
Go to Top of Page

robvolk
Most Valuable Yak

15732 Posts

Posted - 2003-10-30 : 15:27:32
Arnold is god at non-linear thinking. But before you read his solution, read this:

http://www.sqlteam.com/item.asp?ItemID=8425

Give it a try before you look at how he solved it.

http://www.sqlteam.com/item.asp?ItemID=10084
Go to Top of Page

SamC
White Water Yakist

3467 Posts

Posted - 2003-10-30 : 15:32:19
Hi homam,

I think you're mistaken about it working with any pair. It MUST be the MIN and MAX.

Rather than explain, here's an example:

abcdefg123
abcdefg456
abc789

Using the first two rows as "any" arbitrary rows would give an incorrect solution of 7 characters that match. The trick in Arnold's solution is understanding how MIN and MAX contribute to finding the 1st and 3rd row in the sample data I just listed.

Sam
Go to Top of Page

homam
Starting Member

31 Posts

Posted - 2003-10-30 : 16:53:12
You're right :) I was thinking it should be min(len(name)) but then it doesn't really matter because adding characters will essentially push it further alphabetically.




quote:
Originally posted by SamC

Hi homam,

I think you're mistaken about it working with any pair. It MUST be the MIN and MAX.

Rather than explain, here's an example:

abcdefg123
abcdefg456
abc789

Using the first two rows as "any" arbitrary rows would give an incorrect solution of 7 characters that match. The trick in Arnold's solution is understanding how MIN and MAX contribute to finding the 1st and 3rd row in the sample data I just listed.

Sam



Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-16 : 16:23:14
Who can make use of the following:

declare @max varchar(6), @min varchar(6)
set @max='aszeee'
set @min='asheee'

select
1.000000000000000 *
cast(cast(@max as varbinary) as int) /
cast(cast(@min as varbinary) as int)


-------------------------------------
1.17242026975087269231014452
Go to Top of Page

Arnold Fribble
Yak-finder General

1961 Posts

Posted - 2003-11-17 : 13:19:17
People who only care about the last 4 characters of @min and @max?
Go to Top of Page

Stoad
Freaky Yak Linguist

1983 Posts

Posted - 2003-11-17 : 16:35:58
I think I meant here subtraction instead of division. Find subtraction,
then convert result into binary and at last compare its length with
length of @max (or @min). And by this get number of matching
first characters.
Go to Top of Page
    Next Page

- Advertisement -