Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
rlull
Starting Member
39 Posts |
Posted - 2014-04-30 : 18:59:08
|
I can accomplish what I want using the following syntax but I'm sure there's a more "pure" way to do this. Suggestions?SELECT DISTINCT(nv.NodeID), -- needed to filter out articles with multiple categories assigned MIN(nv.NewsTitle) AS NewsTitle, MIN(nv.NewsReleaseDate) AS NewsReleaseDate, MIN(nv.NewsSummary) AS NewsSummary, MIN(nv.NewsTeaser) AS NewsTeaser , MIN(dc.CategoryID) AS CategoryID, MIN(nv.NodeAliasPath) AS NodeAliasPathFROM View_CONTENT_News_Joined nv JOIN CMS_DocumentCategory dc ON nv.DocumentID = dc.DocumentID JOIN CMS_Category ct ON dc.CategoryID = ct.CategoryIDWHERE nv.NodeSiteID = 1 AND nv.Published = 1 AND nv.DocumentCulture = N'en-US' AND nv.NodeAliasPath LIKE N'/Blog/Article/%' AND (ct.CategoryParentID = 68 OR dc.CategoryID = 68) AND nv.NewsReleaseDate <= GETDATE()GROUP BY NodeID, NewsReleaseDateORDER BY NewsReleaseDate DESC |
|
stepson
Aged Yak Warrior
545 Posts |
|
gbritton
Master Smack Fu Yak Hacker
2780 Posts |
Posted - 2014-05-01 : 07:50:39
|
quote: Originally posted by rlull I can accomplish what I want using the following syntax but I'm sure there's a more "pure" way to do this. Suggestions?
If you're grouping by NodeId, NewsReleaseDate, you shouldn't need DISTINCT |
|
|
rlull
Starting Member
39 Posts |
Posted - 2014-05-01 : 11:50:17
|
Ok, I've tried to put this in the recommended format:How do I select the unique rows from the following 4 tables? (Because of the joins on the category tables, and because a blog article may have multiple categories assigned to it, I get multiple rows with the same article.) I am able to accomplish what I want with the query below but it feels "hacky" putting "MIN" on each field.Tables: CREATE TABLE [dbo].[CONTENT_News]( [NodeID] [int] IDENTITY(1,1) NOT NULL, [NewsTitle] [nvarchar](450) NOT NULL, [NewsReleaseDate] [datetime] NOT NULL, [NewsSummary] [nvarchar](max) NOT NULL, [NewsTeaser] [nvarchar](500) NULL, [AuthorID] [int] NULL CONSTRAINT [PK_CONTENT_News] PRIMARY KEY CLUSTERED ( [NewsID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]------------------------CREATE TABLE [dbo].[custom_Author]( [AuthorID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [nvarchar](50) NULL, [LastName] [nvarchar](50) NOT NULL CONSTRAINT [PK_custom_Author] PRIMARY KEY CLUSTERED ( [AuthorID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]------------------------CREATE TABLE [dbo].[CMS_DocumentCategory]( [DocumentID] [int] NOT NULL, [CategoryID] [int] NOT NULL, CONSTRAINT [PK_CMS_DocumentCategory] PRIMARY KEY CLUSTERED ( [DocumentID] ASC, [CategoryID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]GOALTER TABLE [dbo].[CMS_DocumentCategory] WITH CHECK ADD CONSTRAINT [FK_CMS_DocumentCategory_CategoryID_CMS_Category] FOREIGN KEY([CategoryID])REFERENCES [dbo].[CMS_Category] ([CategoryID])GOALTER TABLE [dbo].[CMS_DocumentCategory] CHECK CONSTRAINT [FK_CMS_DocumentCategory_CategoryID_CMS_Category]GO------------------------CREATE TABLE [dbo].[CMS_Category]( [CategoryID] [int] IDENTITY(1,1) NOT NULL, [CategoryName] [nvarchar](250) NULL, CONSTRAINT [PK_CMS_Category] PRIMARY KEY NONCLUSTERED ( [CategoryID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY] ------------------------ My query that works but feels hacky SELECT nv.NodeID, MIN(nv.NewsTitle) AS NewsTitle, MIN(a.FirstName + ' ' + a.LastName) AS Author, MIN(nv.NewsReleaseDate) AS NewsReleaseDate, MIN(nv.NewsSummary) AS NewsSummary, MIN(nv.NewsTeaser) AS NewsTeaser , MIN(dc.CategoryID) AS CategoryID, MIN(nv.NodeAliasPath) AS NodeAliasPathFROM CONTENT_News nv JOIN CMS_DocumentCategory dc ON nv.DocumentID = dc.DocumentID JOIN CMS_Category ct ON dc.CategoryID = ct.CategoryID JOIN custom_Author a ON nv.AuthorID = a.AuthorIDWHERE (ct.CategoryParentID = 68 OR dc.CategoryID = 68) AND nv.NewsReleaseDate <= GETDATE()GROUP BY NodeID, NewsReleaseDateORDER BY NewsReleaseDate DESC Results in 1 row per article |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2014-05-01 : 12:04:10
|
quote: How do I select the unique rows from the following 4 tables
Your question doesn't exactly match your query. While you are getting sub set of "unique" rows per NodeID and NewsReleaseDate. That really isn't all the unique rows (assuming the aggregate is removing some). If that is what you want then your query works just fine. But, I think there is a detail or two that is still missing here.Also, your sample tables do not match your query (CONTENT_News doesn't have a DocumentID column). It's hard to tell from what you posted but, are we to assume that you have a 1-to-many relationship someplace that is causing "duplicates" (multiple rows per NodeID & NewsReleaseDate)? Or is there something going on with the NodeID & NewsReleaseDate date that isn't clear? I'd assume that NodeID is unique for CONTENT_News, but you have another nonexistent column as the primary key (NewsID).1. Can you clean up your schema so that it is correct?2. Can you post sample data that show where you are getting duplicates (so we can run queries against it)?3. Can you define in words what Unique means across those four tables?My guess is that this query hinges on one table and if we can figure out how to resolve that you won't need to do aggregates. But, hard to tell from the information provided thus far. |
|
|
rlull
Starting Member
39 Posts |
Posted - 2014-05-01 : 12:47:38
|
Ok, sorry for the confusion. I tried to simplify for brevity. Let's try this again. There is a News table with articles in it. Each article has only 1 author associated with it. Each article can have one or more categories assigned to it. So if I don't filter by NodeID, which is unique, the same article will appear multiple times if it has more than one category assigned to it. I only want each article to appear once, regardless of how many categories are assigned to it.Tables: CREATE TABLE [dbo].[CONTENT_News]( [NodeID] [int] IDENTITY(1,1) NOT NULL, [NewsTitle] [nvarchar](450) NOT NULL, [NewsReleaseDate] [datetime] NOT NULL, [NewsSummary] [nvarchar](max) NOT NULL, [NewsTeaser] [nvarchar](500) NULL, [AuthorID] [int] NULL CONSTRAINT [PK_CONTENT_News] PRIMARY KEY CLUSTERED ( [NodeID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]------------------------CREATE TABLE [dbo].[custom_Author]( [AuthorID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [nvarchar](50) NULL, [LastName] [nvarchar](50) NOT NULL CONSTRAINT [PK_custom_Author] PRIMARY KEY CLUSTERED ( [AuthorID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]------------------------CREATE TABLE [dbo].[CMS_DocumentCategory]( [NodeID] [int] NOT NULL, [CategoryID] [int] NOT NULL, CONSTRAINT [PK_CMS_DocumentCategory] PRIMARY KEY CLUSTERED ( [NodeID] ASC, [CategoryID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY]GOALTER TABLE [dbo].[CMS_DocumentCategory] WITH CHECK ADD CONSTRAINT [FK_CMS_DocumentCategory_CategoryID_CMS_Category] FOREIGN KEY([CategoryID])REFERENCES [dbo].[CMS_Category] ([CategoryID])GOALTER TABLE [dbo].[CMS_DocumentCategory] CHECK CONSTRAINT [FK_CMS_DocumentCategory_CategoryID_CMS_Category]GO------------------------CREATE TABLE [dbo].[CMS_Category]( [CategoryID] [int] IDENTITY(1,1) NOT NULL, [CategoryName] [nvarchar](250) NULL, CONSTRAINT [PK_CMS_Category] PRIMARY KEY NONCLUSTERED ( [CategoryID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]) ON [PRIMARY] ------------------------ This query results in multiple instances of the same article SELECT nv.NodeID, nv.NewsTitle AS Title, nv.NewsReleaseDate AS ReleaseDate, LTRIM(a.FirstName + ' ' + a.LastName) AS Author, nv.NewsSummary AS Summary, nv.NewsTeaser AS SmallImagePath , dc.CategoryID AS CategoryId, ct.CategoryDisplayName AS CategoryName, nv.NodeAliasPath AS ArticleLinkFROM View_CONTENT_News_Joined nv JOIN CMS_DocumentCategory dc ON nv.DocumentID = dc.DocumentID JOIN CMS_Category ct ON dc.CategoryID = ct.CategoryID JOIN custom_Author a ON nv.AuthorID = a.AuthorIDWHERE nv.NodeSiteID = 1 AND nv.Published = 1 AND nv.DocumentCulture = N'en-US' AND nv.NodeAliasPath LIKE N'/Blog/Article/%' AND (ct.CategoryParentID = 8 OR dc.CategoryID = 8) AND nv.NewsReleaseDate <= GETDATE()ORDER BY NodeID, NewsReleaseDate DESC |
|
|
stepson
Aged Yak Warrior
545 Posts |
Posted - 2014-05-02 : 01:41:24
|
quote: I only want each article to appear once, regardless of how many categories are assigned to it.
Base on that, you need or not to display Category or display just one Category (any one) ? dc.CategoryID AS CategoryId,ct.CategoryDisplayName AS CategoryName, Also, read the Lamprey's post.And about the sample, some sample (few rows with dates -real/dummy- ) and the expected output.sabinWeb MCP |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
Posted - 2014-05-02 : 11:58:27
|
Without sample data it's much harder to help. Plus your Schema doesn't math your query (CMS_DocumentCategory doesn't have a DocumentID column). I'm reading between the lines and I'm reading that you want to join all those tables together and only have one row per NodeID and that means selecting a random Category... Is that right?If it were me I'd check performance of doing:1. A Derived Table on CMS_DocumentCategory (sample below) 3. An OUTER APPLY 3. A Correlated Sub QuerySomething like this:SELECT nv.NodeID, nv.NewsTitle AS Title, nv.NewsReleaseDate AS ReleaseDate, LTRIM(a.FirstName + ' ' + a.LastName) AS Author, nv.NewsSummary AS Summary, nv.NewsTeaser AS SmallImagePath , dc.CategoryID AS CategoryId, ct.CategoryDisplayName AS CategoryName, nv.NodeAliasPath AS ArticleLinkFROM View_CONTENT_News_Joined AS nvJOIN ( SELECT NodeID, MIN(CategoryID) AS DocumentID --?? FROM CMS_DocumentCategory GROUP BY NodeID ) AS dc ON nv.DocumentID = dc.DocumentIDJOIN CMS_Category AS ct ON dc.CategoryID = ct.CategoryIDJOIN custom_Author AS a ON nv.AuthorID = a.AuthorID |
|
|
|
|
|
|
|