Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 Partitioned View Query Plan "issue"

Author  Topic 

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 10:49:01
This question is in relation to the article found at [url]http://www.sqlteam.com/Item.ASP?ItemID=684[/url].

I've got a table with a little over 80M rows. This isn't our largest table, but is a good candidation for a Partitioned View.

I decided to split the table into smaller tables based on a date field. There's a table for each month, and for testing, a total of 9 tables in all.

Here's the layout for the sample tables:

CREATE TABLE [tbMonthlyCDRData_01_2003] (
[srcCDRRecordID] [int] NOT NULL ,
[CallDate] [smalldatetime] NOT NULL ,
[CallTime] [char] (6) NOT NULL ,
[BillPhoneNum] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[BillableDurationTotal] [decimal](18, 4) NOT NULL ,
CONSTRAINT [PK_tbMonthlyCDRData_01_2003] PRIMARY KEY CLUSTERED
(
[srcCDRRecordID] ,
[BillPhoneNum] ,
[CallDate] ,
[CallTime] ,
[BillableDurationTotal]
) ON [PRIMARY] ,
CONSTRAINT [CK_CallDate_01_2003] CHECK (CallDate >= '1/1/2003' and CallDate < '2/1/2003')
) ON [PRIMARY]
GO


There's a table for Jan, Feb, etc. through Sept.

The view is as follows:

Create View vwtbCDRData_MonthlyMOU
As
Select * From tbMonthlyCDRData_09_2003
Union All
Select * From tbMonthlyCDRData_08_2003
Union All
Select * From tbMonthlyCDRData_07_2003
Union All
Select * From tbMonthlyCDRData_06_2003
Union All
Select * From tbMonthlyCDRData_05_2003
Union All
Select * From tbMonthlyCDRData_04_2003
Union All
Select * From tbMonthlyCDRData_03_2003
Union All
Select * From tbMonthlyCDRData_02_2003
Union All
Select * From tbMonthlyCDRData_01_2003


Here's the "issue"...

This query:
Select * from vwtbCDRData_MonthlyMOU Where CallDate = '9/2/2003'

Does clustered index scan on each real table.

This query:
Select top 2000 * from vwtbCDRData_MonthlyMOU Where CallDate = '9/2/2003'

Does a clustered index scan on the 09/2003 table.

The "issue" is, when running them both, and showing the query plan, the 2nd query only takes 9.91% of the total execution time.

Obviously the 2nd one is more efficient, my question is why? Why does the first one scan all the tables when the check constraint should indicate to the query optimizer that it only needs to check one of them?

Note that there are only 1473 rows returned by each query, so they are returning the same total number of records.

Thanks for any help you may be able to offer.

nr
SQLTeam MVY

12543 Posts

Posted - 2003-09-18 : 12:23:14
I can remember seeing a bug reported about partitioned views and think it may be like your problem.
Can't find the bug but I'll keep looking.

You can get round it by
Select top 100 percent * from vwtbCDRData_MonthlyMOU Where CallDate = '9/2/2003'

At least with this test you can

drop table t1
drop table t2
drop view t

create table t1 (id int, dte datetime not null ,
constraint c_t1 check (dte >= '20030101' and dte < '20030201')
)
go
create table t2 (id int, dte datetime not null,
constraint c_t2 check (dte >= '20030201' and dte < '20030301')
)
go
create view t
as
select * from t1
union all
select * from t2
go

insert t1 select 1, '20030105'
insert t2 select 1, '20030205'

select dte from t where dte = '20030201'
select top 100 percent dte from t where dte = '20030201'


==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 12:26:17
Thanks for replying, I'll test that out.

Let me know if you locate that article. I'll try to do some searching on it as well.

Thanks again.
Go to Top of Page

nr
SQLTeam MVY

12543 Posts

Posted - 2003-09-18 : 13:37:57
Think it was on the microsoft sql server newsgroup that I saw it.
http://www.msdn.microsoft.com/newsgroups/

==========================================
Cursors are useful if you don't know sql.
DTS can be used in a similar way.
Beer is not cold and it isn't fizzy.
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 14:17:57
Performance update

For those interested...

As a test, I used a subset of 5 key columns from a main table. I inserted 1M rows (of these 5 columns) into a table, and even placed an index on the Date field that I will query it by. Note that it's not a clustered index, as data that's added everyday to the real table has a lot of different dates in this particular field.

I inserted the same 1M rows using my Partitioned View into the 9 tables behind the scenes. Note that most of the data went into about 2 or 3 main tables.

I wrote a query that would return about 1500 records based on the Date field for both the 1M row table, and the Partitioned View.

I ran them together, showing the execution plan. The result? The query hitting the Partitioned View took under 10% of the total time, whereas the query hitting the table with all 1M rows took over 90% of the time.

The plan showed that the Partitioned View query only hit the table(s) it should to retrieve the necessary data, resulting in much better performance.

More tests will need to be done, but at first glance...it's worth the effort.

Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 14:34:45
Performance Update

I added an index to both the 1M row table, and each of the tables behind the Partitioned View on a different column (not the one the tables are partitioned/split up by).

When querying the tables for a particular row of data, the 1M row table took 25% of the time, and the Partitioned View took 75% of the time.

Result? Performance hindered by having to check all of the tables behind the Partitioned View, even though it was using indexes for each of them. I'm guessing Disk I/O of jumping around the drives caused this.

Note that this is running on a dual processor server with 2G of ram, and drives in a RAID 5 configuration.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-09-18 : 15:18:35
Are the tables in the partitions on separate drives, or the same drive?

Also, % is relative to the cost of the query..

What happens when you look at the server trace?

I'm very curious though why adding an index that has a non referenced column caused all of the tables to be searched...

bizarro





Brett

8-)

SELECT @@POST=NewId()

That's correct! It's an AlphaNumeric!
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 15:29:25
quote:
Are the tables in the partitions on separate drives, or the same drive?


They are on the same RAID Array. Think striping....

Haven't had time to look at the server trace yet.

What caused all the tables to be searched was...the tables are broken up by CallDate Month. I queried by BillPhoneNum. The same BillPhoneNum could (and would) be in all of the different months. So, each table has to be searched to ensure we've found all occurences of BillPhoneNum = x.

Make sense?
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 15:32:21
Question

I had copied/pasted code to create each of the tables for the Partitioned View. Each new "Create Table" statement was identical except for the month references.

To make the code listing shorter, I threw together a while loop and dynamically built the SQL required to build the table. Then executed the SQL to create the table. The loop went from 1 to 9 (Jan to Sept).

However...when attempting to insert data now, I'm getting the following error:

Server: Msg 550, Level 16, State 1, Line 1
The attempted insert or update failed because the target view either specifies WITH CHECK OPTION or spans a view that specifies WITH CHECK OPTION and one or more rows resulting from the operation did not qualify under the CHECK OPTION constraint.
The statement has been terminated.


Here's the code used to create the tables:

	Declare	@SQL	Varchar(2000),
@Month Tinyint

Select @Month = 1

While @Month < 10
Begin
Select @SQL = 'CREATE TABLE [tbMonthlyCDRData_' + Right('00' + Convert(Varchar(2), @Month), 2) + '_2003] ('
Select @SQL = @SQL + ' [srcCDRRecordID] [int] NOT NULL ,'
Select @SQL = @SQL + ' [CallDate] [smalldatetime] NOT NULL ,'
Select @SQL = @SQL + ' [CallTime] [char] (6) NOT NULL ,'
Select @SQL = @SQL + ' [BillPhoneNum] [char] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,'
Select @SQL = @SQL + ' [BillableDurationTotal] [decimal](18, 4) NOT NULL ,'
Select @SQL = @SQL + ' CONSTRAINT [PK_tbMonthlyCDRData_' + Right('00' + Convert(Varchar(2), @Month), 2) + '_2003] PRIMARY KEY '
Select @SQL = @SQL + ' ('
Select @SQL = @SQL + ' [srcCDRRecordID] ,'
Select @SQL = @SQL + ' [BillPhoneNum] ,'
Select @SQL = @SQL + ' [CallDate] ,'
Select @SQL = @SQL + ' [CallTime] ,'
Select @SQL = @SQL + ' [BillableDurationTotal]'
Select @SQL = @SQL + ' ) ON [PRIMARY] ,'
Select @SQL = @SQL + ' CONSTRAINT [CK_CallDate_' + Right('00' + Convert(Varchar(2), @Month), 2) + '_2003] CHECK (CallDate >= ' + Right('00' + Convert(Varchar(2), @Month), 2) + '/1/2003 and CallDate < ' + Right('00' + Convert(Varchar(2), @Month + 1), 2) + '/1/2003)'
Select @SQL = @SQL + ') ON [PRIMARY]'

Exec (@SQL)

Select @Month = @Month + 1
End


Any ideas on that one??
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 16:14:04
Problem Solved

Figured out the WITH CHECK OPTION issue...

This line:
Select @SQL = @SQL + '	CONSTRAINT [CK_CallDate_' + Right('00' + Convert(Varchar(2), @Month), 2) + '_2003] CHECK (CallDate >= ' + Right('00' + Convert(Varchar(2), @Month), 2) + '/1/2003 and CallDate < ' + Right('00' + Convert(Varchar(2), @Month + 1), 2) + '/1/2003)'


Need to be changed to:
Select @SQL = @SQL + '	CONSTRAINT [CK_CallDate_' + Right('00' + Convert(Varchar(2), @Month), 2) + '_2003] CHECK (CallDate >= ''' + Right('00' + Convert(Varchar(2), @Month), 2) + '/1/2003'' and CallDate < ''' + Right('00' + Convert(Varchar(2), @Month + 1), 2) + '/1/2003'')'


(Note the quotes around the dates)
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-09-18 : 16:24:25
Nice catch...was just executing you're sql...

Sometimes it's easier to do a select and check before executing..

hoe's the performance coming?



Brett

8-)

SELECT @@POST=NewId()

That's correct! It's an AlphaNumeric!
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 16:32:30
I can't check the performance fully, as there isn't enough space on the production machine's drives. Total drive space required is around 140G. I've got quite a bit of space on the array where the log files are, but not much where the data files are.

We are hiring a DBA (I'm the IT Manager) sometime very soon (we've extended an offer and are waiting to hear back). I'm hoping they can take this idea, determine the best implementation plan and run with it.

My first impressions are, if you plan on querying a lot and your filtering criteria (where clause basically) is based on how the data is partitioned, you will see major speed improvements.

However, if your queries are looking for something that could possibly be in half or more of the tables (ie; not how they are partitioned) performance could suffer somewhat.

As always, YMMV (Your Mileage My Vary) depending on your situation. I'm still hoping to get some good use out of Partitioned Views though, and am thankful the original article was there to help me out. :)
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-09-18 : 16:37:50
SQL Server is limited in the partitioning department...Oracle/DB2 allow composit patitions on Indexes, tablespaces, ect.

I think MS wants to drive this type of OLAP in to their Data Warehouse and Analysis Tools, which I think is where it's better suited.

But (and this is a big but), if the data has to be OLTP....then...

I don't know...you must be able to separate out real time reporting from day to day or forecasting? no?



Brett

8-)

SELECT @@POST=NewId()

That's correct! It's an AlphaNumeric!
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-18 : 16:50:59
Man, I think they need to beef up the server(s) behind these forums. These things are getting slow.

This is a data warehouse database. Eash night there's a big insert, a big delete, some big updates, and then it's queried all day long.

The queries come from multiple "angles" so it might be hard to determine the best way to partition it, but I'm sure it can be done.
Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-09-18 : 16:58:39
Yeah but if it's warehouse doesn't every "dimension" have it's own set of tables with it's own key...(potentitally multiple tables per partitioned view?)...

Goota read up more on the stuff...

Gotta go...hurricane a comin..



Brett

8-)

SELECT @@POST=NewId()

That's correct! It's an AlphaNumeric!
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-19 : 09:44:15
quote:
Gotta go...hurricane a comin..


Hey be careful man.
Go to Top of Page

mr_mist
Grunnio

1870 Posts

Posted - 2003-09-23 : 03:01:13
The partitioned views seems to behave somewhat erratically. We have the same tables split over a view on two different machines, with identical data, but different CPU setups (one dual one single). The dual processor machine handles aggregations over the views badly, whilst the single processor machine performs them quite happily.

Strangely, even if you raise the parallel operation threshold, the query plans still do not match.

-------
Moo. :)
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-24 : 14:05:24
If you force the dual cpu machine to only use one processor for that view do the query plans match?
Go to Top of Page

airjrdn
Starting Member

35 Posts

Posted - 2003-09-24 : 14:08:04
Additionally, we have a test server setup to mirror our production server. The thing is, the test server is basically just a workstation with a couple of physical 7200rpm IDE drives in it.

I've DTS'd the table w/87M rows over to the test machine, and created the monthly tables along with the corresponding view. I'm inserting from the 87M row table into the view, but won't be able to do any testing until the insert is finish. At the rate it's going it should be done sometime tomorrow evening. (Note that I started it yesterday morning). :)

Go to Top of Page

X002548
Not Just a Number

15586 Posts

Posted - 2003-09-24 : 15:02:00
Wouldn't bcp have been faster?



Brett

8-)

SELECT @@POST FROM Brain ORDER BY NewId()

That's correct! It's an AlphaNumeric!
Go to Top of Page

MichaelP
Jedi Yak

2489 Posts

Posted - 2003-09-24 : 15:31:12
A backup and restore or a detach and reattach might have been faster as well.

Michael

<Yoda>Use the Search page you must. Find the answer you will.</Yoda>
Go to Top of Page
    Next Page

- Advertisement -