| Author |
Topic |
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-29 : 14:43:45
|
| Given a recordset of namesSelect Names FROM MyTableThe values in column Names in the recordset have a variable number of beginning characters that match.What's the easiest way to find the number of matching characters that are common among all rows?Sam |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2003-10-29 : 15:11:16
|
| Got any sample data?Do you mean like Brett and Brian share the same 2?And Billy And BillyJo share 4?Brett8-) |
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-29 : 15:28:23
|
| Yep. Here's some sample data for you:abcdthisabcdthatabcdtheotherThose 3 rows have the first 6 characters matching.Sam |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2003-10-29 : 15:38:17
|
Tally ho!SELECT TOP 1 n-1FROM NumbersWHERE n > 0 AND SUBSTRING((SELECT MIN(Names) FROM MyTable), n, 1) <> SUBSTRING((SELECT MAX(Names) FROM MyTable), n, 1)ORDER BY n Though you may want to force a binary collation on Names depending on what characters it contains. |
 |
|
|
X002548
Not Just a Number
15586 Posts |
Posted - 2003-10-29 : 15:41:25
|
quote: Originally posted by Arnold Fribble Tally ho!SELECT TOP 1 nFROM NumbersWHERE n > 0 AND SUBSTRING((SELECT MIN(Names) FROM MyTable), n, 1) <> SUBSTRING((SELECT MAX(Names) FROM MyTable), n, 1) Though you may want to force a binary collation on Names depending on what characters it contains.
Sooooo cooooolTally Ho indeed!Brett8-) |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2003-10-29 : 15:42:42
|
| Shame I forgot to subtract one until the second edit. |
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-29 : 15:46:24
|
| Hi Arnold !Pretty neat.But I'm not sure your solution looks at every Name in MyTable anywhere? select min(names) from mytable will return the same name no matter what N is in your query.SamI'll crunch this in QA to check. |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2003-10-29 : 15:54:44
|
| Why does it need to?The initial characters shared by the smallest and largest value in Names must be shared by all of them. I think... |
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-29 : 16:00:01
|
| That is the puzzle piece I'm missing. If you're right, then the calculation is done only on the MIN and MAX, not the entire table repeatedly.Thanks Arnold,Sam |
 |
|
|
ehorn
Master Smack Fu Yak Hacker
1632 Posts |
Posted - 2003-10-30 : 09:13:22
|
| Can someone please explain the n notation in the preceeding SQL? |
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-30 : 09:18:17
|
| N is a column value from the table Tally. Tally tables can be very useful for some queries (like this).There are a few articles on SQLTEAM about using Tally tables.Sam |
 |
|
|
ehorn
Master Smack Fu Yak Hacker
1632 Posts |
Posted - 2003-10-30 : 09:56:29
|
| Thank you very much Sam, I will study those articles.Does the following apply or am I missing the mark...declare @strsql varchar(1000), @maxlen int, @i intselect @maxlen = max(len(Names)) from MyTableselect @strsql = 'create table ##names (nid int identity(0,1),val varchar(' + cast(@maxlen as varchar(10)) + '),numexists int)' from MyTableexec (@strsql)set @i=1while @i<=@maxlen begin select @strsql = 'insert into ##names select distinct rtrim(ltrim(left(Names,' + cast(@i as varchar(10)) + '))),count(*) from MyTable where rtrim(ltrim(left(Names,' + cast(@i as varchar(10)) + ')))not in (select val from ##names) group by rtrim(ltrim(left(Names,' + cast(@i as varchar(10)) + ')))' exec (@strsql) set @i = @i+1 endselect val,numexists from ##names order by numexists descdrop table ##names |
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-30 : 10:30:37
|
Hi ehorn,It looks like your select may work, but the solution proposed by Arnold is hard to beat. There's no iteration, it works on an existing table, doesn't require dynamic SQL.And it's even better once I've improved it DECLARE @Maxname AS VARCHAR (100)DECLARE @Minname AS VARCHAR (100)SELECT @Maxname = MAX(Names), @MinName = MIN(Names) FROM MyTableSELECT MAX(id) AS MatchLengthFROM TallyWHERE id BETWEEN 1 AND LEN(@Minname) -- No point in evaluating past the minimum AND SUBSTRING(@Minname, 1, ID) = SUBSTRING(@Maxname, 1, ID)Couple of bugs found - Substring args were reversed, and the compare should be = not <> and I sort of prefer MAX rather than TOP 1, ORDER BY, and I like the use of MAX / MIN temporary variables marginally better than coding the queries together. I doubt there's any performance difference, but it clearly shows that the data table MyTable(Names) isn't repeatedly scanned to find the solution.This doesn't take away from the beauty of Arnold's solution - the use of MAX / MIN Name to represent the search set - rather than the entire table.The approach I would have taken was like yours ehorn. I'd have done repetitive scans of the data to find the MAX match of characters.Sam |
 |
|
|
ehorn
Master Smack Fu Yak Hacker
1632 Posts |
Posted - 2003-10-30 : 10:54:16
|
| I have much to learn - what a great site!! |
 |
|
|
homam
Starting Member
31 Posts |
Posted - 2003-10-30 : 14:12:11
|
| Arnold's solution is pretty smart! It finds the solution by attacking the problem from a completely different angle: instead of directly finding the matches, let's find the first unmatch position and our answer is that number minus one. I like that!Edit: BTW, it doesn't matter if you use min() or max(); it'll work with any pair. |
 |
|
|
robvolk
Most Valuable Yak
15732 Posts |
Posted - 2003-10-30 : 15:27:32
|
Arnold is god at non-linear thinking. But before you read his solution, read this:http://www.sqlteam.com/item.asp?ItemID=8425Give it a try before you look at how he solved it. http://www.sqlteam.com/item.asp?ItemID=10084 |
 |
|
|
SamC
White Water Yakist
3467 Posts |
Posted - 2003-10-30 : 15:32:19
|
| Hi homam,I think you're mistaken about it working with any pair. It MUST be the MIN and MAX.Rather than explain, here's an example:abcdefg123abcdefg456abc789Using the first two rows as "any" arbitrary rows would give an incorrect solution of 7 characters that match. The trick in Arnold's solution is understanding how MIN and MAX contribute to finding the 1st and 3rd row in the sample data I just listed.Sam |
 |
|
|
homam
Starting Member
31 Posts |
Posted - 2003-10-30 : 16:53:12
|
You're right :) I was thinking it should be min(len(name)) but then it doesn't really matter because adding characters will essentially push it further alphabetically.quote: Originally posted by SamC Hi homam,I think you're mistaken about it working with any pair. It MUST be the MIN and MAX.Rather than explain, here's an example:abcdefg123abcdefg456abc789Using the first two rows as "any" arbitrary rows would give an incorrect solution of 7 characters that match. The trick in Arnold's solution is understanding how MIN and MAX contribute to finding the 1st and 3rd row in the sample data I just listed.Sam
|
 |
|
|
Stoad
Freaky Yak Linguist
1983 Posts |
Posted - 2003-11-16 : 16:23:14
|
| Who can make use of the following:declare @max varchar(6), @min varchar(6)set @max='aszeee'set @min='asheee'select1.000000000000000 *cast(cast(@max as varbinary) as int) /cast(cast(@min as varbinary) as int)------------------------------------- 1.17242026975087269231014452 |
 |
|
|
Arnold Fribble
Yak-finder General
1961 Posts |
Posted - 2003-11-17 : 13:19:17
|
| People who only care about the last 4 characters of @min and @max? |
 |
|
|
Stoad
Freaky Yak Linguist
1983 Posts |
Posted - 2003-11-17 : 16:35:58
|
I think I meant here subtraction instead of division. Find subtraction,then convert result into binary and at last compare its length withlength of @max (or @min). And by this get number of matchingfirst characters. |
 |
|
|
Next Page
|