5 min læsning

Are you wasting storage space?

Featured Image

Do you see tables taking up much more space than expected? And when looking at the table design and doing a little math numbers just don’t add up! I have seen issues so many times, where you have a simple little table with only a few thousand rows taking up GB’s of storage space. The case is more or less always the same, and that is what I’ll demonstrate in this blog post.

WasteHead

 

 




Imagine that when doing your table design you are not 100% sure what data a column will contain, you know that it will be some kind of status information but wheatear it will be char(3) or char(30) is unknown at this stage. Another scenario could be that the team that does the table design has made a mistake, and put in char(3000) instead of char(3).

Yes, I’m aware that using varchar(x) here could be a good alternative, but I don’t want to go down that road in this example. So let’s go on with a table design looking something like this:

<span style="color:blue;">CREATE TABLE </span><span style="color:teal;">myLittleTest
</span><span style="color:gray;">(
    </span><span style="color:teal;">colId </span><span style="color:blue;">INT IDENTITY</span><span style="color:gray;">,
    </span><span style="color:teal;">colStatus </span><span style="color:blue;">CHAR</span><span style="color:gray;">(</span>4000<span style="color:gray;">)
)
</span><span style="color:blue;">GO
</span>

Being the cleaver guy You are, You discover the mistake and alter the colStatus column to be char(4) instead of char(4000) – you do that using this statement:

<span style="color:blue;">ALTER TABLE </span><span style="color:teal;">myLittleTest </span><span style="color:blue;">ALTER COLUMN </span><span style="color:teal;">colStatus </span><span style="color:blue;">CHAR</span><span style="color:gray;">(</span>4<span style="color:gray;">)
</span><span style="color:blue;">GO
</span>

Now let’s start loading data into the table. Here is how you can load 10.000 rows of useful information into a table.

<span style="color:blue;">INSERT INTO </span><span style="color:teal;">myLittleTest </span><span style="color:gray;">(</span><span style="color:teal;">colStatus</span><span style="color:gray;">)
</span><span style="color:blue;">SELECT TOP </span>10000 <span style="color:red;">'Beer'
</span><span style="color:blue;">FROM </span><span style="color:green;">sys</span><span style="color:gray;">.</span><span style="color:green;">columns
</span><span style="color:gray;">CROSS JOIN </span><span style="color:green;">sys</span><span style="color:gray;">.</span><span style="color:green;">messages
</span>

Before looking at the table stats, let’s do some math so that we have an idea about how much space this data would take up on disk. Two columns of each 4 bytes (int + char(4)) + 7 internal bytes * 10.000 rows. So all together 15 bytes pr. Row giving a total of approx. 150 KB.

The size can vary from what you see in the table statistics because the data is stored in pages of 8 Kb.

If you want to know even more about the internals of a SQL Server page or row, here is a link to a blog post written by Danish MVP Mark S. Rasmussen that explains things in detail. (http://improve.dk/deciphering-a-sql-server-data-page/)

Now let’s have a look at the table stats:

<span style="color:blue;">EXEC </span><span style="color:maroon;">SP_SPACEUSED </span><span style="color:teal;">myLittleTest
</span>

 

pic1

WOW what just happened, 40 MB to store 10.000 rows of 15 bytes. How is that possible? Is that a bug or what? Could it be fragmentation, let’s try to look at the fragmentation stats for the table:

<span style="color:blue;">SELECT
    </span><span style="color:teal;">database_id</span><span style="color:gray;">,
    </span><span style="color:magenta;">object_id</span><span style="color:gray;">,
    </span><span style="color:teal;">index_type_desc</span><span style="color:gray;">,
    </span><span style="color:teal;">avg_fragmentation_in_percent
</span><span style="color:blue;">FROM </span><span style="color:green;">sys</span><span style="color:gray;">.</span><span style="color:green;">dm_db_index_physical_stats</span><span style="color:gray;">(</span><span style="color:magenta;">DB_ID</span><span style="color:gray;">(), </span><span style="color:magenta;">OBJECT_ID</span><span style="color:gray;">(</span><span style="color:red;">'myLittleTest'</span><span style="color:gray;">), NULL, NULL , NULL)
</span>

 

pic2

As the output shows, the answer is NOT fragmentation.  A fragmentation level of below 1% is not something that your regular index maintenance scripts would catch and rebuild. The 40 MB of wasted space is not hidden here.

To be honest the first time I saw this in a production environment I did not believe it, but after some research I found out that it is not a bug but standard SQL Server behavior.

When you alter the size a char column SQL Server keeps a copy of the original column until you rebuild the table. This is also the fact if you increase the size of the column. Let’s try to solve the problem, and free up some space.

<span style="color:blue;">ALTER TABLE </span><span style="color:teal;">myLittleTest </span><span style="color:blue;">REBUILD
</span>

 

Let’s have a look at the new table stats after the rebuild command:

pic3

Sweet I got my space back - maybe you should have a look at the space consumption of your production tables. I am sure this can save You a ton of GB’s.

Any questions?
Please reach out to info@inspari.dk or +45 70 24 56 55 if you have any questions. We are looking forward to hearing from you.