FILESTREAM and MAXSIZE

When you create a FILESTREAM-enabled database and define the file that will hold the FILESTREAM data, you can set an optional maximum size. In script, this is how to create a database with a single FILESTREAM file whose maximum size is 2 MB:

CREATE DATABASE [Test_FILESTREAM]
ON PRIMARY
    (NAME = 'Test_FILESTREAM', FILENAME = 'C:\SQLData\Data\Test_FILESTREAM.mdf', SIZE = 4096KB, FILEGROWTH = 1024KB ),
FILEGROUP [FS] CONTAINS FILESTREAM
    (NAME = 'FS', FILENAME = 'C:\SQLData\Data\FS', MAXSIZE = 2048KB )
LOG ON
    (NAME = N'Test_FILESTREAM_log', FILENAME = 'C:\SQLData\Data\Test_FILESTREAM_log.ldf', SIZE = 1024KB, FILEGROWTH = 10%)
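
For the script above to work, FILESTREAM must already be enabled at the instance level. If it isn't, the sketch below shows the T-SQL half of the job; it assumes you have already enabled FILESTREAM for the service in SQL Server Configuration Manager.

-- Assumes FILESTREAM is already enabled for the service in
-- SQL Server Configuration Manager (service properties, FILESTREAM tab).
EXEC sp_configure 'filestream access level', 2;  -- 2 = T-SQL and Win32 streaming access
RECONFIGURE;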

We can then create a new table in this database and have a column with data type VARBINARY(MAX) FILESTREAM:

CREATE TABLE [dbo].[Test](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [value] [varchar](50) NOT NULL,
    [blob] [varbinary](max) FILESTREAM NULL,
    [rowguid] [uniqueidentifier] ROWGUIDCOL NOT NULL UNIQUE,
    CONSTRAINT [PK_Test] PRIMARY KEY CLUSTERED
    (
        [ID] ASC
    )
)
ON [PRIMARY] FILESTREAM_ON [FS]

With the MAXSIZE property of the FILESTREAM file set to 2 MB, what happens when we attempt to insert a single file larger than 2 MB, and what happens when we insert a series of files that together exceed 2 MB? Perhaps even more important for storage administrators: what about the garbage files that remain after updates and deletes?

Single File Larger than MAXSIZE

To answer this question, I took a digital photo and attempted to insert a new row containing that photo into the otherwise empty test table. The T-SQL code is below:

INSERT INTO [dbo].test
    (rowguid, value, blob)
    SELECT newid(), 'Photo',
        bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo.jpg', SINGLE_BLOB) AS x

This INSERT statement fails with the following error message:

Msg 1105, Level 17, State 40, Line 1
Could not allocate space for object 'dbo.Test'.'PK_Test' in database 'Test_FILESTREAM' because the 'FS' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

This message isn't technically accurate, because

  1. the filegroup isn't full, it's empty (although it would have exceeded its capacity if the statement had succeeded), and
  2. you cannot actually set autogrowth on a FILESTREAM data container.

Nevertheless, the point is made that it’s not possible to insert a BLOB whose size exceeds the maximum size set for the FILESTREAM data container.
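
If you want to double-check the limit you configured, sys.database_files is a quick way to do it. Treat the query below as a sanity check: for regular files, size and max_size are documented in 8-KB pages, while for FILESTREAM containers they reflect the container's used and maximum size.

USE [Test_FILESTREAM];
-- Show the FILESTREAM data container and its configured maximum size.
SELECT name, type_desc, size, max_size
FROM sys.database_files
WHERE type_desc = 'FILESTREAM';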

Multiple Files Combined Larger than MAXSIZE

For this second test, I collected three picture files, each smaller than 800 KB but together larger than 2 MB. Here are the T-SQL INSERT statements:

INSERT INTO [dbo].test (rowguid, value, blob)
SELECT newid(), 'Photo1',
    bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo1.jpg', SINGLE_BLOB) AS x
INSERT INTO [dbo].test (rowguid, value, blob)
SELECT newid(), 'Photo2',
    bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo2.jpg', SINGLE_BLOB) AS x
INSERT INTO [dbo].test (rowguid, value, blob)
SELECT newid(), 'Photo3',
    bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo3.jpg', SINGLE_BLOB) AS x

The output of these three statements is as follows:

(1 row(s) affected)

(1 row(s) affected)

Msg 1105, Level 17, State 40, Line 9
Could not allocate space for object 'dbo.Test'.'PK_Test' in database 'Test_FILESTREAM' because the 'FS' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

After these statements, as you might expect, there are two rows in the table and two files stored in the FILESTREAM data container's folder.

What if we included these three INSERT statements inside a transaction? The T-SQL code is below.

Note that if you want to keep following along, you'll need to either drop and re-create the table (remember, we already have two rows in it), or delete the two rows, perform a full database backup, then a transaction log backup, and then issue a CHECKPOINT. (What? You said you'll just drop and re-create the sample table? Smart choice!)
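
If you decide to take the long road anyway, a rough sketch of that second option is below; the backup paths are placeholders, so point them somewhere that exists on your server.

-- Empty the table and let garbage collection reclaim the two files.
DELETE FROM [dbo].[Test];
-- Backup paths are placeholders; adjust for your environment.
BACKUP DATABASE [Test_FILESTREAM] TO DISK = 'C:\SQLBackup\Test_FILESTREAM.bak';
BACKUP LOG [Test_FILESTREAM] TO DISK = 'C:\SQLBackup\Test_FILESTREAM.trn';
CHECKPOINT;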

BEGIN TRAN
BEGIN TRY
    INSERT INTO [dbo].test (rowguid, value, blob)
    SELECT newid(), 'Photo1',
        bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo1.jpg', SINGLE_BLOB) AS x
    INSERT INTO [dbo].test (rowguid, value, blob)
    SELECT newid(), 'Photo2',
        bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo2.jpg', SINGLE_BLOB) AS x
    INSERT INTO [dbo].test (rowguid, value, blob)
    SELECT newid(), 'Photo3' ,
        bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo3.jpg', SINGLE_BLOB) AS x
    COMMIT
END TRY
BEGIN CATCH
    ROLLBACK
END CATCH

Notice that the output of these statements is as follows:

(1 row(s) affected)

(1 row(s) affected)

(0 row(s) affected)

The first two rows were inserted successfully; the last one wasn't, because it would cause the FILESTREAM data container to grow beyond 2 MB. That raised an error, and because the statements are wrapped in a TRY...CATCH block, the CATCH block was executed and rolled back the transaction. Now how many rows are there in the table? And how many files are there in the FILESTREAM data container's folder on disk?

The answer is that there are no rows in the table. I hope that was obvious; the entire transaction was rolled back, after all. There are, however, two files in the FILESTREAM data container. They are garbage files, which will not be removed by the garbage collector until it is certain they are no longer needed for recovery. (Again, in most cases, replication scenarios excluded, if your database is in the full recovery model, the following steps will trigger garbage collection: full backup, transaction log backup, CHECKPOINT.) Note that garbage collection does not run immediately, but when the engine allows. Also, garbage collection removes garbage files in batches: if you have several dozen garbage files, they will not all be removed at the same time, to reduce I/O impact.

In my case, I now have about 1.5 MB worth of garbage files after these statements have executed.
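
If you're on SQL Server 2012, you don't have to wait for the engine to pick its own moment: once the backups and checkpoint described above have been taken, you can request a garbage collection run explicitly. A minimal sketch:

-- SQL Server 2012 and later: request a FILESTREAM garbage collection run.
-- Files only become eligible once they are no longer needed for recovery,
-- so the backup/checkpoint prerequisites still apply.
EXEC sp_filestream_force_garbage_collection @dbname = N'Test_FILESTREAM';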

Garbage Files

What effect do the garbage files have on our ability to insert additional rows? To test this, we'll simply attempt to insert a single BLOB larger than 700 KB. If the size of the garbage files counts against the available space in the FILESTREAM file, this operation will fail; if it doesn't, the operation will succeed.

You can use the next INSERT statement to test this out. You’ll find that the row gets inserted successfully. There will then be a single row in your test table, but there will be three files in the FILESTREAM data container and their combined size will exceed 2 MB.

INSERT INTO [dbo].test (rowguid, value, blob)
SELECT newid(), 'Photo1',
    bulkcolumn FROM OPENROWSET(BULK 'C:\temp\photo1.jpg', SINGLE_BLOB) AS x
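
To see the mismatch for yourself, count the rows and then look at the data container folder on disk. The xp_cmdshell call below is just one way to do that; it assumes xp_cmdshell is enabled (it is off by default), and browsing the folder in Windows Explorer works just as well.

-- One row in the table...
SELECT COUNT(*) AS rows_in_table FROM [dbo].[Test];
-- ...but three files in the data container on disk.
-- Requires xp_cmdshell; otherwise just browse the folder.
EXEC xp_cmdshell 'dir "C:\SQLData\Data\FS" /s';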

Implications

This has some important implications for storage. If your environment isn't set up to trigger garbage collection regularly, you may end up storing (and backing up) far more files than you anticipated, and you might run out of disk space or run into performance problems as a result.

The bottom line is to plan appropriately for the amount of space your FILESTREAM implementation will require. You will need to take into account the usual suspects, such as current data size and data growth, but you should also measure how often garbage collection runs in your environment and plan for enough space to hold the garbage files in the meantime.

Note that everything here also applies to FileTable, as FileTable storage is nothing more than FILESTREAM. In a future blog post, I will discuss implementing multiple FILESTREAM data containers in a single filegroup (new in SQL Server 2012), and we'll see whether this can help avoid out-of-space conditions.
