Sunday, February 19, 2012

Adding DEFAULT columns

Hi

I have a table that currently has 466 columns and about 700,000
records. Adding a new DEFAULT column to this table takes a long time.

It it a lot faster to recreate the table with the new columns and then
copy all of the data across.

As far as I am aware when you add a DEFAULT column the following
happens:

a) The column is added with a NULL property
b) Each row is updated to be set to the DEFAULT value
c) The column is changed to NOT NULL.

However, adding the column as NOT NULL with the DEFAULT seems to take a
lot longer than if I do steps a) - c) separately.

When I say a long time, adding just a single DEFAULT column takes
around 6 hours. Surely it should not take this long?

There is a trigger on this table but disabling this does not seem to
make much difference.

Can anybody give me any advice on the use of DEFAULT columns please?
When should they be used, benefits, disadvantages, alternatives etc.
Also should it really take as long as it is taking or is there a
problem with my setup?

If I am honest I can't see why DEFAULT columns should be used as the
values could always be inserted explicitly via the application
Thanks in Advance.

PaulI think you may be able to speed it up by using the NoCheck option, so it
doesn't look at the existing data.

I didn't know that specifying a default updates the null values in the
table - so you are either incorrect, or I have an incomplete understanding.

But I am pretty sure that having so many columns isn't helping you in the
least bit, and may be the real reason.

Splitting the tables apart and using a view to be backwards compatible (with
an 'instead of ' trigger for updates) - aught to make things go faster.
Especially if all or the most frequent searchable columns stay in one of the
new tables (a hub table as it were).

If the data values are ALWAYS explicitly added ALL the time, there is no
reason for a default.
You may need to beat up on some wayward programmer to guarantee that they
get populated each and every time with the correct values - But since I
don't like violence (nor the testing to find the problem), I add defaults
to guarantee that happens regardless. Also it is possible to do an insert
without a column list and specify defaults (so that all the columns get the
default values), which could be useful in some instances.

"Paul" <paulwragg2323@.hotmail.com> wrote in message
news:1103024817.227784.219090@.z14g2000cwz.googlegr oups.com...
> Hi
> I have a table that currently has 466 columns and about 700,000
> records. Adding a new DEFAULT column to this table takes a long time.
> It it a lot faster to recreate the table with the new columns and then
> copy all of the data across.
> As far as I am aware when you add a DEFAULT column the following
> happens:
> a) The column is added with a NULL property
> b) Each row is updated to be set to the DEFAULT value
> c) The column is changed to NOT NULL.
> However, adding the column as NOT NULL with the DEFAULT seems to take a
> lot longer than if I do steps a) - c) separately.
> When I say a long time, adding just a single DEFAULT column takes
> around 6 hours. Surely it should not take this long?
> There is a trigger on this table but disabling this does not seem to
> make much difference.
> Can anybody give me any advice on the use of DEFAULT columns please?
> When should they be used, benefits, disadvantages, alternatives etc.
> Also should it really take as long as it is taking or is there a
> problem with my setup?
> If I am honest I can't see why DEFAULT columns should be used as the
> values could always be inserted explicitly via the application
> Thanks in Advance.
> Paul|||Thanks for the response David.

I have suggested that we split this table up and I think this will be
eventually done (it's a case of having the time up front to do this).

I wasn't saying that adding a DEFAULT value to the column updates
existing data - rather that when a new DEFAULT column is added it
follows the steps a) - c) in order to add the new column.

As the table is so large I think I need to investigate the way the data
for this table is actually stored. At present we have no clustered
index on this table which is probably also contributing to the problem.
I don't know too much about the way the data is stored to be honest!|||OK, I get it now, you are adding a column and not just binding a new default
to an existing column.

Copy into the new table, drop the original, and do a sp_rename.
While you are at it break the table apart - If you can't find the time to do
it right, when will you find the time to do over and over incorrectly.

Get one of those MCSD prep books for the SQL Server Design Exam to find out
how stuff gets stored in a database. Their 1st or 2nd chapter normally goes
over devices, extents, pages and all that stuff.

I am sure there are a boat load of free sources on the Web on that as well.

"Paul" <paulwragg2323@.hotmail.com> wrote in message
news:1103033046.110249.69780@.f14g2000cwb.googlegro ups.com...
> Thanks for the response David.
> I have suggested that we split this table up and I think this will be
> eventually done (it's a case of having the time up front to do this).
> I wasn't saying that adding a DEFAULT value to the column updates
> existing data - rather that when a new DEFAULT column is added it
> follows the steps a) - c) in order to add the new column.
> As the table is so large I think I need to investigate the way the data
> for this table is actually stored. At present we have no clustered
> index on this table which is probably also contributing to the problem.
> I don't know too much about the way the data is stored to be honest!|||Paul (paulwragg2323@.hotmail.com) writes:
> It it a lot faster to recreate the table with the new columns and then
> copy all of the data across.

Is that a question or a statement? Which "It" is a typo for "is"?

> When I say a long time, adding just a single DEFAULT column takes
> around 6 hours. Surely it should not take this long?

Just because "ALTER TABLE tbl ADD col DEFAULT 0" is easy to type, that
does not mean that it executes equally fast. There is a lot of work to
be done - since all rows expand, basically all pages have to be written.

In our shop we do all table changes the long way - rename, create new,
insert over, move foreign keys, drop old. We have a build that generates
a skeleton for this manoeuvre. One reason we do this is that ALTER TABLE
only can handle some changes, and you can not insert columns in the
middle with. (And our scheme was established in 6.5 when you could do
even less with ALTER TABLE.)

Generally, I would not expect reload of a 700000 rows table, not even
that wide to take six hours. Also, when moving over, you can do that
in chunks.

> There is a trigger on this table but disabling this does not seem to
> make much difference.

The trigger is not fired when you to ALTER TABLE. Note that if you do
the long way, you will need to recreate the trigger. Whether you do
that before or after you reload the data depends on whether you want
the checks in the trigger to be performed (I usually want to). But for
performance, it's best to recreate the trigger after the data move.

> Can anybody give me any advice on the use of DEFAULT columns please?
> When should they be used, benefits, disadvantages, alternatives etc.
> Also should it really take as long as it is taking or is there a
> problem with my setup?

If you need to add to existing column to a database, and you don't want
NULL values in the column, the a default value is a good way to go, to
avoid problems with existing software that writes to this table. And,
even if existing software is rewritten - it may after all be a single
GUI form - existing data needs to be handled.

Sometimes NULL values can be feasible, but for instance a bit column
is typically NOT NULL. I think the choice should be made from the
anticpated use in the future, and not what is the most convenient
right now.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||A technique that I sometimes use with large tables is SELECT ... INTO
followed by a drop and a rename. This is minimally logged in the SIMPLE or
BULK_LOGGED recovery model.

Whether or not this is faster depends on the particulars of the changes made
and the indexes that need to be rebuilt.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"Paul" <paulwragg2323@.hotmail.com> wrote in message
news:1103024817.227784.219090@.z14g2000cwz.googlegr oups.com...
> Hi
> I have a table that currently has 466 columns and about 700,000
> records. Adding a new DEFAULT column to this table takes a long time.
> It it a lot faster to recreate the table with the new columns and then
> copy all of the data across.
> As far as I am aware when you add a DEFAULT column the following
> happens:
> a) The column is added with a NULL property
> b) Each row is updated to be set to the DEFAULT value
> c) The column is changed to NOT NULL.
> However, adding the column as NOT NULL with the DEFAULT seems to take a
> lot longer than if I do steps a) - c) separately.
> When I say a long time, adding just a single DEFAULT column takes
> around 6 hours. Surely it should not take this long?
> There is a trigger on this table but disabling this does not seem to
> make much difference.
> Can anybody give me any advice on the use of DEFAULT columns please?
> When should they be used, benefits, disadvantages, alternatives etc.
> Also should it really take as long as it is taking or is there a
> problem with my setup?
> If I am honest I can't see why DEFAULT columns should be used as the
> values could always be inserted explicitly via the application
> Thanks in Advance.
> Paul

No comments:

Post a Comment