SQL Server Integration Services (SSIS) is a great tool for
building ETL processes. On SQL Server,
we can configure an integration services catalog to deploy and configure all
the SSIS packages. When using this service, there is also a catalog database named
SSISDB that needs to be maintained before we quickly run of disk space. In this
article, we look at the catalog retention policy to better manage the disk
space requirements on the SSISDB database.
SSISDB Database
The SSISDB is a database that holds all the integration
service catalog metadata as well as version and execution history. Depending on
the frequency of execution of our packages, the database size can grow very
quickly.
Fortunately for us, there is a SSIS Server Maintenance Job (SQL Server Agent jobs) that runs every
day to clean up and maintain the database.
The problem with that job is that it depends on some configuration to
enable the cleanup and the retention period which can be 365 days. Depending on
our package activity, that retention window can lead our database space to grow
in the hundreds of gigabytes.
Catalog Configuration
The SSISDB has a catalog schema which contains the objects
that can enable us to look at the catalog configuration. We need to look at the
following objects to view and update the configuration:
Object
|
Description
|
catalog.catalog_properties (view)
|
This is a
view for the catalog configuration.
|
catalog.configure_catalog (procedure)
|
This stored procedure is used
to update a configuration setting.
|
When selecting the information from the view, we may get
results to similar to the ones on this image:
When we query the view, we need to look at these two
settings:
Setting
|
Description
|
OPERATION_CLEANUP_ENABLED
|
This should be set to TRUE to enable the cleanup of
historical data.
|
RETENTION_WINDOW
|
This is the amount of dates that are allowed for data retention. If this data is not critical, set it to a
low number like 30 days or less.
|
Change the Settings
To enable this setting and set a low retention window, we
can use a stored procedure within the catalog schema. Let’s take a look at how
that can be done with TSQL:
--SET CLEANUP ENABLE TO TRUE
exec [catalog].configure_catalog OPERATION_CLEANUP_ENABLED, TRUE
--SET THE RETENTION WINDOW TO 30
exec [catalog].configure_catalog RETENTION_WINDOW, 30
--OPTIONAL RUN THE CLEANUP ON DEMAND OR WAIT FOR SCHEDULE TASK
TO RUN
EXEC [SSISDB].[internal].[cleanup_server_retention_window]
|
By setting those fields, we can run the stored procedure to
clean up the data on demand, or we could also wait for the SQL job to run at
its scheduled time and clean up the data for us.
Database Size is Hundreds of Gigs
In the event that the database is big, changing the
retention window to a very low number (i.e.
365 days to 30 days) in one step may cause the job to eventually fail. For these cases, we need to decrease the
retention window in smaller steps. For
example, we could write a script which decrements the retention window by one
and runs the cleanup procedure as shown here:
--we reduce the retention window by one until we reach the
target window of 30
declare @index int = 364, @max int = 30
while @index > @max
begin
exec [catalog].configure_catalog RETENTION_WINDOW, @index
EXEC [SSISDB].[internal].[cleanup_server_retention_window]
--shrink the log file as well
DBCC SHRINKFILE('log',10)
set @index = @index -1
end
|
If the amount of data is very large, this script may take
some time to run. Just let it run and monitor how the retention window
decreases with every cycle.
Conclusion
SSISDB like any other database needs to be maintained.
Depending on the activity of our SSIS packages, we need to be mindful of the
maintenance plan for this database. We
need to look at the catalog retention policy to make it is compliant with our
disk space capacity.
Thanks for reading.
0 comments :
Post a Comment
What do you think?