SharePoint log: When databases rebel

In the ninth part of Robert Schifreen's SharePoint 2010 epic, he learns how one user can generate 16GB of logs in just three months
Written by Robert Schifreen, Contributor

Creating a structure for our SharePoint 2010 installation was clearly central to providing an effective service.

One plan was to split the intranet into around seven chunks: one for each faculty, plus an extra one for our central non-academic departments. Each of these would be a separate site collection (and thus database), and within each collection would be sub-sites for schools and for individual departments.

However, much of SharePoint's facility for sharing content and allocating user permissions works only at the site-collection level, and won't span two separate collections. So we re-jigged the structure entirely.

Our latest plan is to have just two site collections for the intranet: one for private content, which will remain within each department, and one for shared content that can be accessed by departments other than the one that maintains it. Within each of those site collections will be a separate site for each school, faculty, department and so on.

Microsoft also advises that fast SAS drives, rather than SATA, should always be used for all SharePoint data.

Again, we took the decision that SATA, which is much cheaper and more plentiful, should suffice for everything apart from the C: drives on each server and tempdb. Should this prove not to be the case, we can purchase some additional SAS storage later.

Microsoft markets a separate SharePoint add-on product called FAST Search, and likes to imply that no successful SharePoint installation is complete without it.

In practice, from what I have read, it seems that FAST is unnecessary unless you have tens of millions of documents to index. Otherwise, SharePoint's out-of-the-box indexing system will crawl the full text of all your documents perfectly well (you'll need to download a free iFilter, as it's called, before it can crawl PDF files).

There are a handful of things missing from the standard search, such as the number of hits displayed in brackets on the results page and thumbnail previews of results, but nothing so essential as to warrant the added expense, or the complication of learning yet another Microsoft technology.

Transaction logs and broken databases

A significant area of concern for us is how quickly the SQL Server transaction logs are going to fill up our disks.

Under SQL Server you can configure each database to use either the 'simple' or the 'full' recovery model. In the simple model, the transaction log is truncated automatically, so your disks contain little more than the live, current copy of the database. You back up the database regularly, and if the database breaks, you lose everything that was written to it since the last backup.
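By way of illustration, the recovery model is inspected and set with a couple of lines of T-SQL; WSS_Content below is a stand-in name for one of SharePoint's content databases, not necessarily what yours will be called.

    -- List each database and the recovery model it currently uses
    SELECT name, recovery_model_desc
    FROM sys.databases;

    -- Switch a database (a stand-in name here) to the simple model
    ALTER DATABASE [WSS_Content] SET RECOVERY SIMPLE;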

If, on the other hand, you use the full recovery model, SQL Server keeps a record of every transaction applied to the database since the last backup. So if the database gets damaged, you can use those transaction logs to bring the most recent backup up to date, reincorporating all the recent amendments. One detail that's easy to miss: it's backing up the transaction log itself, rather than the database, that wipes the log, because only then are its contents no longer needed.
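As a sketch of that cycle, using the same stand-in database name and invented backup paths: a full backup establishes the baseline, log backups capture (and truncate) everything since, and a restore replays them in order.

    -- Periodic full backup: the baseline for any later restore
    BACKUP DATABASE [WSS_Content] TO DISK = N'E:\Backups\WSS_Content.bak';

    -- Frequent log backups: each one truncates the live transaction log
    BACKUP LOG [WSS_Content] TO DISK = N'E:\Backups\WSS_Content.trn';

    -- After a failure: restore the baseline, then replay the logged transactions
    RESTORE DATABASE [WSS_Content] FROM DISK = N'E:\Backups\WSS_Content.bak' WITH NORECOVERY;
    RESTORE LOG [WSS_Content] FROM DISK = N'E:\Backups\WSS_Content.trn' WITH RECOVERY;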

The full recovery model clearly makes sense. Problem is, those transaction logs can get very big. 

I've heard stories of SharePoint set-ups where the transaction logs grow to match or exceed the size of their corresponding databases within an hour, which means you need to back up those logs every hour just to stop them needing more storage than the core databases themselves. I guess we can look forward to finding out for ourselves whether this is indeed the case.

On my test farm, where I'd set all the databases to use the full recovery model but never bothered to take backups, I received a warning one day that the database server was almost full.

The transaction log for the sharepoint_config database had reached 16GB in just three months, despite the fact that I was the only user. Time to look under the bonnet...
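As a sketch of the first diagnostic steps, in T-SQL again: DBCC SQLPERF reports how full each log is, and on a test farm where point-in-time recovery doesn't matter, breaking the log chain lets the file be shrunk. The logical file name SharePoint_Config_log is my guess at the usual default, so check it with the catalogue query first.

    -- See how big each transaction log is and how much of it is in use
    DBCC SQLPERF(LOGSPACE);

    USE [sharepoint_config];

    -- Find the logical names and sizes of this database's files
    SELECT name, type_desc, size * 8 / 1024 AS size_mb
    FROM sys.database_files;

    -- Break the log chain, then reclaim the disk space (target size in MB)
    ALTER DATABASE [sharepoint_config] SET RECOVERY SIMPLE;
    DBCC SHRINKFILE (N'SharePoint_Config_log', 1024);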

Next: SharePoint 2010's dirty little secrets begin to come out in the wash

Robert Schifreen has reported on and implemented online technology since the early 1980s. His latest project has been a large SharePoint 2010 installation in tertiary education. We will be serialising his experiences, positive and negative, in getting it to the stage where it's ready for action; the entire series will also be available as a downloadable white paper.

