Network Storage Overview
The following is a brief summary of how multi-vendor network storage is currently used at the Starz Animation facility in Toronto. Our infrastructure has grown over the past couple of years from a small 30-60 person facility with 100 render nodes to a much larger facility that currently has approximately 300 people and just over 500 render nodes. In the past, much of the project storage was located on locally attached disk (which is probably still true in many smaller shops). As our facility grew and we had access to bigger budgets, we were able to retire all of our old server-attached storage and move to a better enterprise solution, as shown in the diagram below. This is by no means the best or only solution, and it may not work at all in your facility; however, in this document I would like to outline some of the benefits and reasons why we chose to set up our storage in this manner.
By moving away from server-attached storage, we were able to protect our data in the following ways:
SNAPSHOTS allow us to take a "picture" of the current filesystem and go back to retrieve files at a specific point in time. We currently have SNAPSHOT rules that take these "pictures" every 3 hours for a full day, every day for a week, and every week for 3 weeks. This allows us to retrieve files very quickly when users remove or overwrite files by mistake (which seems to happen much too often).
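To put that retention schedule in concrete terms, the short sketch below (plain illustrative Python, not the filer's actual configuration) works out roughly how many snapshots are kept on-line at any one time under those rules.

# Rough count of snapshots kept on-line by the schedule above.
# Illustrative only; this is not the BlueArc/Isilon snapshot configuration.
HOURS_PER_DAY = 24

schedule = [
    ("every 3 hours, kept for 1 day", HOURS_PER_DAY // 3),  # 8 hourly snapshots
    ("every day, kept for 1 week", 7),                       # 7 daily snapshots
    ("every week, kept for 3 weeks", 3),                     # 3 weekly snapshots
]

for rule, count in schedule:
    print(f"{rule}: {count} snapshots")
print(f"total retained at any time: ~{sum(c for _, c in schedule)}")  # ~18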
With multiple storage systems, we were able to logically separate the data as it flows through our pipeline and spread the load across different filesystems. We also have the extra benefit of the CLUSTER capabilities of both BlueArc and Isilon, so if an entire head fails, another head will take over and pick up the load automatically.
For disaster recovery purposes (backups and archives), we have added a large FC array of TIER-2 storage directly to our backup server. This gives us the ability to RSYNC all of the data we usually back up to tape and keep it on-line, so it can be retrieved quickly without having to go back to tape (which is obviously much slower). This server also has an ADIC i2000 tape library attached with 4 LTO-3 tape drives that run incrementals each night, with fulls once per month.
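As a rough illustration of the disk-to-disk step, a wrapper along the lines of the Python sketch below would mirror the backup set to the TIER-2 array ahead of the nightly tape run; the source and destination paths are placeholders, not our real exports.

# Minimal sketch: mirror the backup set to the TIER-2 FC array with rsync
# before the nightly tape run. All paths here are hypothetical placeholders.
import subprocess

BACKUP_SETS = ["/net/production", "/net/render"]   # example source trees
TIER2_ROOT = "/tier2/online-backup"                # example destination on the FC array

for src in BACKUP_SETS:
    dest = TIER2_ROOT + src
    # -a preserves permissions and times; --delete keeps the mirror exact
    subprocess.run(["rsync", "-a", "--delete", src + "/", dest + "/"], check=True)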
When projects are complete, we simply make two archives (one for on-site and one for off-site storage) of the full project. Then we run some custom scripts that replace the internal directory pointers on all the assets (mainly Maya files) and move them into our Library (located on the Isilon cluster) so they can be re-used for future projects. Once the archives are complete and have been tested, we delete the project completely from the BlueArc storage arrays.
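The path-rewrite step could look something like the sketch below for Maya ASCII (.ma) files, which are plain text; the project and Library paths are invented for the example, and the real scripts cover more asset types than this.

# Hypothetical sketch of repointing internal directory pointers in Maya
# ASCII (.ma) files after a finished project's assets have been copied
# into the Library on the Isilon cluster. Paths are made-up examples.
from pathlib import Path

OLD_ROOT = "/net/production/projectX"   # example: finished project location
NEW_ROOT = "/net/library/projectX"      # example: new home in the Library

for ma_file in Path(NEW_ROOT).rglob("*.ma"):
    text = ma_file.read_text(errors="ignore")
    ma_file.write_text(text.replace(OLD_ROOT, NEW_ROOT))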
Cost of Use Benefits
By moving to an enterprise storage solution, we were able to reduce our cost of ownership in the following ways:
Evenly distributing the load of file access across multiple filesystems is very tricky, especially with the highs and lows of your renderfarm. Many places will assign each project to a single filer so that everything for that specific project is located in one place. We have chosen instead to split projects across all filers so that no matter which project is being worked on, the filers are always used about the same amount. The number of people and machines accessing the data remains fairly constant no matter which projects we are working on, so we have built our filesystems to handle the worst-case scenario (for example, all artists and all render nodes hitting the filesystem simultaneously). Some filers are still worked harder than others, but even at the highest load we are working the filers well under their tested performance capabilities. In our facility (as seen in the diagram) we have split our data as follows (a sketch of this type-based split appears after the breakdown below):
ISILON CLUSTER (Library)
This storage library contains both NFS (Unix) and CIFS (Windows) exports for our Library (software, media files, non-production-related documents, and assets from non-current productions for reference and possible re-use), User Windows Roaming Profiles, User Home Directories (both Windows & Unix), and Apache Web Services (see Load Balancing Network Services). This filesystem is not accessed by the renderfarm; it is strictly for all non-production data.
BLUEARC TITAN 3-A (Production)
This storage library contains all production-related data - management, textures, Maya files, etc. (anything a production person or artist touches for a project). All machines read from this filesystem; however, only artists write to it.
BLUEARC TITAN 3-B (Render Cache/Render Layers)
This storage library contains all Mentalray pre-render cache as well as all rendered layers. Only the renderfarm touches this filesystem, as users rarely need to access these files. By separating these files from the other filesystems, we greatly decreased the load on the other production filesystems. When the renderfarm kicks in at full power in the middle of the day, users are no longer affected when loading and saving their files.
BLUEARC TITAN 3-B (Compositing)
This storage library contains all final composite frames, production-related media files, and the logs created by the renderfarm. The renderfarm is the only set of machines that writes to this filesystem. Final composite frames and render logs are only accessed by a small number of users. The media files created here are accessed via our web-based production system.
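To make the type-based split described above concrete, the small Python sketch below shows one way a pipeline could resolve where a given kind of data lives; the mount points and the helper function are invented for the example and do not match our real export names.

# Illustrative mapping from pipeline data type to filer, following the split
# described above. Mount points are invented for the example.
FILER_FOR = {
    "library":       "/net/isilon/library",     # software, media, reference assets
    "home":          "/net/isilon/home",        # user home directories and profiles
    "production":    "/net/titan_a/production", # Maya files, textures, management docs
    "render_cache":  "/net/titan_b/cache",      # Mentalray pre-render cache
    "render_layers": "/net/titan_b/layers",     # rendered layers
    "composites":    "/net/titan_comp/final",   # final frames, media files, render logs
}

def storage_path(data_type: str, project: str, relative: str) -> str:
    # Resolve where a file for a given project and data type should live.
    return f"{FILER_FOR[data_type]}/{project}/{relative}"

print(storage_path("production", "projectX", "scenes/shot010.ma"))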