Thursday, 26 September 2013

WHY WE NEED REMOTE BLOB STORAGE (RBS)?

ASSUMPTION:

· You have knowledge of SharePoint Administration
· Know SQL Server Administration

INTRODUCTION

In SharePoint, a binary large object (BLOB) is a huge block of data stored in a database that is known by its size and location instead of by its structure like Microsoft Office 2010 document or a video file. The fact is that these BLOBs are unstructured data, are stored directly in the SharePoint content database along with the associated metadata. As said these BLOBs can be very large can increase database size within no time so, it might be better to store BLOBs outside the content database. RBS is a Microsoft SQL Server library API set that is incorporated as an add-on feature pack for Microsoft SQL Server 2008 R2, SQL Server 2008 or Microsoft SQL Server 2008 R2 Express. The RBS feature enables applications, such as SharePoint Foundation 2010, to store BLOBs outside the content databases. So, storing the BLOBs externally can reduce how much SQL Server database storage space is required. The metadata (information associated with blob) for each BLOB is stored in the SQL Server database and the BLOB is stored in the RBS store.

UNDERSTANDING BLOBs

Almost all values stored in SQL Server consist of ASCII (American Standard Code for Information Interchange) characters. An ASCII characters are letters, numbers, and symbols found on the keyboard. A text editor such as Notepad can alter a file holding only ASCII characters without any consequences but what if we have to alter or store a large amount of binary data in a SQL Server table along with other ASCII data; Word documents, XML documents, and images. Binary files contain ASCII characters, special control characters, and byte combinations not found on the keyboard. SQL Server provides special data types for dealing with such large volumes of binary data. These various data types have changed over time. In SQL 2000, there were two different families of data type options for this type of data: binary and image. The binary family included two different data types: the binary data type and the VARBINARY data type. The VAR in VARBINARY means that the size is variable rather than fixed, as in the case of the standard binary data type. However, it still has a maximum length of 8,000 bytes. The image data type family was used to store binary large objects that are greater than 8,000 bytes. Anyways, this will keep changing.

SHOULD I CONSIDER USING RBS ALWAYS?

There is no need of using RBS all the time. You can use RBS only when:
  • You have huge content databases for document archiving so you want to reduce storage cost (terabytes of data)
  • You have large media files to stream to the audience
  • You need to use RBS to integrate 3rd party storage/archiving solutions to SharePoint. (For example Interwoven DAM or EMC Documented)
Note: if you only have 100GB of data, separated in several content databases, and most of the content is documents, There is no benefit to my server farm by using RBS.

RBS COMPONENTS

RBS has fallowing components:
  • RBS client library : An RBS client library consists of a managed library that coordinates the BLOB storage with Microsoft SharePoint Foundation, SQL Server 2008 Express, and RBS provider components.
  • Remote BLOB Storage provider: An RBS provider consists of a managed library and, optionally, a set of native libraries that communicate with the BLOB store.
  • BLOB store: A BLOB store is an entity that is used to store BLOB data. This can be a content-addressable storage (CAS) solution, a file server that supports Server Message Block (SMB), or a SQL Server database.

RBS INTEGRATION WITH SHAREPOINT 2010

This document explains RBS architecture and gives step by step implementation of RBS for content databases on SharePoint farm

WHY WE HAVE CHOOSE RBS

They presented a table outlining the key advantages of RBS over EBS.
  • RBS has Managed interface.
  • The RBS BLOB store scope is per Farm.
  • SharePoint 2010 adds the Configurable Maintainer to sync operations from SharePoint down to RBS.
  • RBS also has a PowerShell-based UI within SharePoint, this was pretty rudimentary but again it shows a commitment to the integration.
  • RBS supports multiple providers.

ADVANTAGES OF USING RBS WITH SHAREPOINT

  • Reduces you database size: If you’re having trouble with disk space with your SQL Server and don’t have facility to add additional disks? RBS can definitely help you, as having compact database is always beneficial.
  • Extend your SQL Express: if you are using SharePoint with SQL Express, then you might be facing database size limitations (4GB for 2008 & 10GB for R2) along with SQL Server free edition. RBS implementation can offload your data, as data will not be directly stored into your database.
  • Performance advantage for large BLOB: you can stream data directly from a database, but configuring SQL Server to use remote file streaming is an added overheads. RBS allows you to configure specific file size threshold that can store small files directly in the database and large files can be stored using RBS, storing large files in a different location can provide optimal performance.
  • Reduced DB backup time: RBS can offload your database that means reduced database size and time associated with database back up. However you still need to take back up of the BLOB Store, but that can be performed as a parallel process and saves lots of time.
  • Cost effective file storage: Storing barely accessed files on database hard drives is an expensive approach and waste of resources, because DB drives are tend to be fast access and they are expensive. RBS provides another approach to move these BLOBs to BLOB store, allows moving these files to cheaper hard drives with performance boost.

DISADVANTAGES OF USING RBS WITH SHAREPOINT

  • Extra security measures: With the FILESTREAM RBS provider, securing BLOB store data is very important practice. You need to plan and implement security measures
  • No support for DAS and database mirroring: RBS with FILESTREAM doesn’t support Direct Attached Storage (DAS) and database mirroring.
  • Database level configuration: You need to configure individual per content database settings instead of per library or site.
  • Cost increase: RBS implementation increases overall IT operational costs, as your IT department needs put more efforts to running and monitoring RBS.
  • NO Encryption Support : No encryption supported, even transparent DB encryption
  • Better for Only Big Data : RBS is bad for situations where your writing lots of small BLOBs (< 256KB), it's good for situations where you have fewer larger BLOBs (>256KB)

SQL SERVER REMOTE BLOB STORE & FILESTREAM FEATURE COMPARISON

FILESTREAM and RBS are complementary approaches for storage and management of BLOBs and we see them both continuing to evolve.
  • FILESTREAM provides a storage option that allows storage, efficient streaming and integrated management of large BLOBs in a SQL database by utilizing the underlying NTFS file system for BLOB storage/streaming. It offers fully transactional access and compatible operations as varbinary(max).
  • RBS is a set of standardized APIs that allow storage/retrieval of BLOBs outside of your main SQL database where a dedicated BLOB store is desirable for various reasons. This uses a provider model for plugging in any dedicated BLOB store that implements these RBS APIs.
We have also built an out-of-the-box RBS FILESTREAM provider that allows a deployment to use a SQL Database (local or remote) as a dedicated BLOB store. This provider utilizes the FILESTREAM as the BLOB storage mechanism and ties the two technologies together.

RBS OVERVIEW AND ARCHITECTURE

Since RBS means “Remote BLOB Storage”, it is important to amend the correct definition of the expression Binary Large Object as data elements that have either of the following characteristics:
· Unstructured data that has no scheme
Based on this preliminary remark, we can start our Overview asserting that RBS is an API Library included as an add-on feature pack for Microsoft SQL Server. It can be executed locally on Microsoft SQL Server 2008 R2, SQL Server 2008 and SQL Server 2008 R2 Express instances and RBS can be executed on remote servers however the Enterprise edition is needed.RBS offers the great advantage of ousting the BLOBs from the Content Database to store them on the File System. Within the object metadata, a pointer will be generated to the corresponding streaming file (similar to the one produced by EBS).
As in EBS, this behavior reduces the need of storage space of the SQL databases, so that it is smoother the respect of the Best Practices recommendation about the size limit at 100 GB. The good news among the “old” EBS is a greater transparency of the processes among the Applications using the involved databases. One constraint consists in the fact that the storage hosting the BLOBs has to be accessible by SQL Server, this means at least that:
  1. a) Network storage can’t be used for RBS folder hierarchy (no NAS or DAS);
    b) In a Cluster environment, RBS has to be configured on a Share accessible by each node.
The best presentation of the RBS architecture overview is shown below

RBS API uses a provider to access the storage hosting the BLOBs. The provider can be supplied by a Vendor in the context of a custom Solution, but consider the fact that the RBS add-on is distributed as feature pack for SQL Server 2008 R2 and includes RBS FILESTREAM, that makes SQL Server itself an RBS Provider. SharePoint 2010 supports OOB this provider, together with a number of third party providers. Currently, however, the only provider fully compatible with the database backup and restore operations of SharePoint 2010 is precisely the RBS FILESTREAM Provider. The eventual Site Collection backup / restore and the sites or lists export / import can be executed besides the provider in use (Of course, because such operations are in fact selective queries against the Content Database)another interesting characteristic of RBS consists in the fact that we can configure it on a new Content Database AND / or on existing Content Databases, in this last case the previously existing BLOBs will “migrate” from the content database to the external storage.
Finally, once we have installed and configured RBS, it doesn’t allow direct access to the files stored in his configured storage and any access will be performed ONLY by SharePoint.

RBS CONFIGURATION FOR CONTENT DATABASE

The following steps to configure RBS using the FILESTREAM provider of SQL Server 2008 or 2008 R2 for a SharePoint 2010 web application and associated content database. I assume SQL Server 2008 R2 Enterprise Edition is used.
Considerations for using RBS with Filestream
There are many reasons why you should and should not use RBS. The decision to use RBS must not be made lightly since it affects your overall farm management model. This guide is a guide only and not an endorsement to use RBS in your deployment. We hope that you can use the guide to test and configure RBS and make an informed decision.
Scenarios covered in this article
1. Setup and configure FILESTREAM and RBS for a new content database.
2. Configure RBS for an existing content database and site collection.
3. Disable RBS from a content database.
General considerations for using the article
1. These instructions assume that you are using the FILESTREAM RBS provider.
2. A SharePoint Server Farm configured and tested to be operational (In the example, the steps are based on a server farm with three servers).
3. The correct version of the RBS provider for the SQL Server Edition (In this sample the download was obtained from RBS PROVIDER DOWNLOAD
4. RBS can only be used for content databases in SharePoint Server 2010.
Implementing RBS with the FILESTREAM provider on a remote SQL Server to be used by SharePoint requires SQL Server Enterprise Edition.
RBS is most suited for ‘infrequent access’ scenarios such as ‘archival’ of content.
1. The FILESTREAM provider is supported when it is used on local hard disk drives only (i.e.: No NAS is supported).
2. Files that are stored using RBS cannot be directly accessed.

SAMPLE SERVER DEPLOYMENT

SPAPP1 – SharePoint application server
SPWFE1 – SharePoint Front end web server
SPWFE2 – SharePoint Front end web server
SPSQL1 (The SQL server is configured to use a DNS name ‘SQLCORE’)
Note : You will need to carry out configurations on all servers in order for RBS to work.

STEPS TO ENABLE RBS

In this setup guide a web application named http://aes.com is used to host a number of site collections. The web application currently has two content databases. The root site is located in ‘SP_CONTENT_HOMEBASE_KC’ content database. An additional content database is used for another site collection. A third (*new) database will be created to enable RBS and store data on the file system.
The overall high level steps are:
1. Enable FILESTREAM provider on SQL Server.
2. Create a new content database.
3. Create a new site collection to be housed in the new content database.
4. Configure the content database to use RBS with FILESTREAM.
5. Install the RBS provider on the SQL server.
6. Install the RBS provider on all SharePoint servers.
7. Run the required PowerShell cmdlets to enable the content database to use RBS.
8. Test that content uploaded ends up in the specified file directory.
In below assumption the following names are used.
1. RBS Content DB = SP_CONTENT_ARCHIVES_KC
2. SharePoint Farm Account = Faithcue\svcsp14farmservice
3. SharePoint Setup Account = Faithcue\svcsp14setup
4. SQL Server = SQLCORE
5. Web Application - http://mybase.com

1 Enable FILSTREAM in SQL Server

In this part the SQL server used for RBS will be configured with FILESTREAM

1. Open SQL Server Configuration Manager

2. Right click ‘SQL Server’ (highlighted) and select properties
3. In the FILESTREAM tab select all of the options
i. Enable FILESTREAM for Transact-SQL access
ii. Enable FILESTREAM for file I/O streaming access
iii. Windows Share Name = ‘SQLCORE’ (Use a sensible name)
iv. Allow remote clients to have streaming access to FILESTREAM data
4. Click ‘Apply’ then ‘OK’ to configure

5. Open SQL Server Management Studio and execute the following command


EXEC sp_configure filestream_access_level, 2

RECONFIGURE


2. Create a content database

This is the content database used for the site collection. This content database will be set to a pre-grown size of 50MB with auto grow size of 50MB and 10MB log growth.
Note : If you want to enable RBS on an existing content DB skip this part and go to ‘4. Configure the content database to use RBS’ section below.
1. Open SQL Server Management Studio and on the Databases node right click and create a new database with the following settings
v. Database name - SP_CONTENT_ARCHIVES_KC Database owner –Faithcue\svcsp14farmservice (Farm service account) Initial Size of 50MB Enable Autogrowth with File Growth in Megabytes (50MB)
vi. Maximum File Size with Restricted File Growth of 500MB
vii. Collation Setting - Latin1_General_CI_AS_KS_WE
viii. Recovery model – Full


2. Add the SharePoint setup account (Faith\svcsp14setup) to have dbowner access to the database created above.
Note : that you will need to make sure you have correctly identified appropriate database settings

3. Create site collection

Now in order to test our scenario we need to create a new site collection in the new content database. Since you created this database outside of SharePoint you need to go to Central Administration and add this database to the web application first. Remember in our scenario we have a web application with three site collections including the root site. We are configuring RBS for the third site collection which will use the database we created (SP_CONTENT_ARCHIVES_KC) in the above step.
1. Go to Central Administration and select Manage Content Databases
2. Ensure you have selected the appropriate web application. In this case it’s http://homebase

3. Click “Add a content database” link.
4. Add the database name and SQL server information. In this scenario I am restricting this content database to only contain 1 site collection.

5. Once added you can now create a site collection (Note that since the other two databases have been locked to contain only one site collection each when you create the new site collection it will go into the SP_CONTENT_ARCHIVES_KC database)


6. Open a SharePoint PowerShell command window and execute the following PowerShell
Command to create the site collection.
   
New-SPSite http://aes.com/sites/archives -name "RBS
Demo Site" -description "RBS Demo Site" -owneralias
"KNOWLEDGECUE\svcSP14Setup" -template "STS#0" -ContentDatabase
"SP_CONTENT_ARCHIVES_KC"

7. Navigate to the site collection to verify access or check via Central Administration

Enabling RBS must be carried out per content database being used. Verify that the user account that you use to perform this step is a member of the Administrators group on the SharePoint web server, and is a member of the SQL Server dbcreator and securityadmin fixed server roles on the computer that is running Microsoft SQL Server 2008 R2, SQL Server 2008 with Service Pack 1 (SP1) and Cumulative Update 2. In our example this is the SharePoint setup account (Faith\svcsp14setup).
The following commands must be executed on SQL server in the following order.
1. Open SQL Server Management Studio and Open a Query Window
2. Execute the following command in the order of: First command
   
USE [SP_CONTENT_ARCHIVES_KC]
If not exists (select * from sys.symmetric_keys where name =
N'##MS_DatabaseMasterKey##')create master key encryption by password
= N'Admin Key Password !2#4'
3. Second command
   
USE [SP_CONTENT_ARCHIVES_KC]

If not exists (select groupname from sysfilegroups where

groupname=N'RBSFilestreamProvider')

alter database [SP_CONTENT_ARCHIVES_KC]

Add filegroup RBSFilestreamProvider contains filestream
/code>

4. Third command

USE [SP_CONTENT_ARCHIVES_KC]
Alter database [SP_CONTENT_ARCHIVES_KC] add file (name =
RBSFilestreamFile, filename = 'E:\RBSDataStore') to filegroup
RBSFilestreamProvider
5. Check the location specified if the E:\RBSDataStore directory has been created. Do not attempt to create this directory manually if this step does not work.
Note: that you can provision a RBS store for each database only one time. If you attempt to provision the same RBS data store multiple times, it will not work.
In a real world scenario you can create multiple locations for each content database that you want RBS enabled such as E:\RBSDataStore2 etc. See section 7 for enabling an existing database.

5. Install the RBS provider on SQL and SharePoint Servers


Make sure the correct RBS.msi has been downloaded from http://go.microsoft.com/fwlink/?LinkID=177388.
1. Run the installation on the SQL server first.
2. Copy the downloaded RBS.msi to ex: C:\SPSetupInfo\.
3. Open a command prompt as an administrator and execute the following command within C:\SPSetupInfo\.
    
msiexec /qn /lvx* rbs_install_log.txt /i RBS.msi
TRUSTSERVERCERTIFICATE=true FILEGROUP=PRIMARY
DBNAME="SP_CONTENT_ARCHIVES_KC" DBINSTANCE="SQLCORE"
FILESTREAMFILEGROUP=RBSFilestreamProvider
FILESTREAMSTORENAME=FilestreamProvider_1
4. Keep an eye on taskmgr for msiexec which will start the install. Until it’s finished don’t attempt to do anything as you need to wait for all the configurations to happen in the background.

Eventually this will show as ‘Terminated’ in process explorer.

5. To verify the installation check the ‘rbs_install_log.txt’ files from C:\SPSetupInfo. Towards the end of the file the following message should be present.
MSI (s) (60:E8) [00:33:58:652]: Product: SQL Server 2008 R2 Remote
Blob Store -- Installation completed successfully.
MSI (s) (60:E8) [00:33:58:653]: Windows Installer installed the
product. Product Name: SQL Server 2008 R2 Remote Blob Store. Product
Version: 10.50.1600.1 . Product Language: 1033. Manufacturer:
Microsoft Corporation. Installation success or error status : 0 .
6. Now you need to run the following command on every SharePoint server joined to the farm.
Meaning that in a multi-server deployment every SharePoint server needs to have the RBS configuration installed. Note that this command is slightly different from above.
    
msiexec /qn /lvx* rbs_install_log.txt /i RBS.msi


DBNAME="SP_CONTENT_ARCHIVES_KC" DBINSTANCE="SQLCORE"


ADDLOCAL="Client,Docs,Maintainer,ServerScript,FilestreamClient,FilestreamServer"
7. The install will take the same amount of time on each server so please be patient. Check that the ‘msiexec’ process has finished before proceeding.
8. To verify that the required configurations and table changes have been applied open SQL server Query window and run the following query.

    
USE [SP_CONTENT_ARCHIVES_KC]
select * from dbo.sysobjects where name like 'mssqlrbs%'
9. Navigate to the SharePoint server and open a PowerShell command and execute the following PowerShell command.

    
$cdb = Get-SPContentDatabase SP_CONTENT_ARCHIVES_KC
$rbss = $cdb.RemoteBlobStorageSettings
$rbss.Installed()
$rbss.Enable()
$rbss.SetActiveProviderName($rbss.GetProviderNames()[0])
$rbss

6. Verify RBS via Site Collection

1. Open the site collection http://homebase/sites/archives and navigate to the “Shared Documents” library.
2. Upload a number of documents to the document library and note down the file sizes (you can enable the file size column for the view)

3. Navigate to the E:\RBSDataStore location specified as the store. You will be able to see the files with matching file sizes located on the file store on local disk.


7. Moving content from an existing database to RBS

To move existing content from an existing DB you need to follow steps outlined in this section for that specific content database.
In the existing site at http://homebase has a number of documents but these are stored in SQL since this database has not being RBS enabled.
In the assumption we will enable RBS on ‘SP_CONTENT_HOMEBASE_KC’ which is one of the existing databases that houses the root site collection data. When performing this action ensure that this is done at a time where there is low user interaction or no users are using the files being moved.
Note that the commands used are slightly different in this case. Since, we already enabled the FILESTREAM provider filegroup name of ‘RBSFilestreamprovider ‘we need to change this to ‘RBSFilestreamprovider2’ and the provider store name to ‘FilestreamProvider_2’ for this to work.
1. Commands to be run on SQL server.
    
USE [SP_CONTENT_HOMEBASE_KC]
If not exists (select * from sys.symmetric_keys where name = N'##MS_DatabaseMasterKey##')create master key encryption by password = N'Admin Key Password
    !2#4'
USE [SP_CONTENT_HOMEBASE_KC]
If not exists (select groupname from sysfilegroups where groupname=N'RBSFilestreamProvider2')alter database [SP_CONTENT_HOMEBASE_KC]
add filegroup RBSFilestreamProvider2 contains filestream
USE [SP_CONTENT_HOMEBASE_KC]
Alter database [SP_CONTENT_HOMEBASE_KC] add file (name =
RBSFilestreamFile2, filename = 'E:\RBSDataStore2') to filegroup RBSFilestreamProvider2
2. Now Install the RBS provider on SQL and SharePoint Servers by substituting the values for content database name and the database instance. In the case of the guide these are:
    
‘SP_CONTENT_HOMEBASE_KC’ and ‘SQLCORE’ and Filestreamstorename to
‘FilestreamProvider_2’
#ON SQL Server
msiexec /qn /lvx* rbs_install_log2.txt /i RBS.msi
TRUSTSERVERCERTIFICATE=true FILEGROUP=PRIMARY
DBNAME="SP_CONTENT_HOMEBASE_KC" DBINSTANCE="SQLCORE"
FILESTREAMFILEGROUP=RBSFilestreamProvider2
FILESTREAMSTORENAME=FilestreamProvider_2
#On each SP Server
msiexec /qn /i rbs.msi REMOTEBLOBENABLE=1 FILESTREAMPROVIDERENABLE=1
DBNAME="SP_CONTENT_HOMEBASE_KC"
FILESTREAMSTORENAME=FilestreamProvider_2
ADDLOCAL=EnableRBS,FilestreamRunScript DBINSTANCE="SQLCORE"
Note: the above commands and the ‘msiexec’ process may take some time to finish. Refer to ‘5. Install the RBS provider on SQL and SharePoint Servers’. Once the commands have finished executing proceed to enable the site collection. *Keep an eye on taskmgr for msiexec which will start the install. Until it’s finished don’t attempt to do anything as you need to wait for all the configurations to happen in the background.
3. Verify that the content database has been configured by running the following SQL query against the
    
‘SP_CONTENT_HOMEBASE_KC’
USE [SP_CONTENT_HOMEBASE_KC]
Select * from dbo.sysobjects where name like 'mssqlrbs%'
4. On the application server open a SharePoint PowerShell window and execute the following command
    
$cdb = Get-SPContentDatabase SP_CONTENT_HOMEBASE_KC
$rbss = $cdb.RemoteBlobStorageSettings
$rbss.Installed()
$rbss.Enable()
$rbss.GetProviderNames()
$rbss.SetActiveProviderName('FilestreamProvider_2')
$rbss.Migrate()
$rbss

5. Navigate to 'E:\RBSDataStore2' and verify that the files have been moved.

6. The current logical configuration is now outlined below.

8. Disabling RBS being used for the content database

Once you have set up RBS for the database you may also need to disable RBS. This can be done using PowerShell. This can be easily disabled on a content database by setting the active provider name to the empty string. Each content database has a RemoteBlobStorageSettings property that can be used to invoke the SetActiveProviderName method. The caveats to using are that you should not remove RBS from a database already set for RBS. Microsoft does not recommend this.
1. Open a SharePoint PowerShell window on the SharePoint server
2. Execute the following command
    
#disable rbs on site collection
$site=Get-SPSite "http://aes.com"
$rbss=$site.ContentDatabase.RemoteBlobStorageSettings
$rbss.SetActiveProviderName("")
3. Verify that content is uploaded is not using RBS

GLOSSARY

BLOB Binary Large Object
RBS Remote Blob Store
IIS Internet Information Server
AD Active Directory
ASCIA American Standard Code for Information Interchange
SP Sharepoint
DB Database
EBS External BLOB Storage API

LICENSING REQUIREMENT FOR SQL SERVER WHEN USING RBS

The following table shows the license requirement for different scenarios.
Scenario Local SQL SKU Remote SQL SKU
FILESTREAM Provider,
BLOB stored on local SQL Server file system
Any SQL SKU N/A
FILESTREAM Provider,
BLOB stored on Remote SQL Server file system
SQL Enterprise SQL Standard or above
3rd Party RBS Provider SQL Enterprise N/A

SUMMARY:

I’m really glad that Microsoft finally bit the bullet and gave us something we can work with. For those of us who are willing to take the time and money to properly architect a large scale storage solution, this guidance will expand to heights but at the same time, I expect it to cause some issues as well. There are many others I may have heard of, and others I may not have considered. If you feel there’s a challenge you have which could be solved by RBS, PLEASE ASK