Pipeline: Is complex folder structure necessary?

Does maintaining a folder structure make sense today?

One of the most important aspects of managing a pipeline is managing one or more file systems with an organized folder structure. Like most 'computer folk', I take the organization of files very seriously as part of maintaining a good pipeline workflow. A good file system structure enables intuitive access to data without the need for new tools to locate and retrieve it. But with the growing interest in object storage (which forces a flat structure and a REST framework), the globalization and growth of online work, and the complex relationships between data in film, does it make sense to continue building complex folder structures to represent this data and its relationships?

In some of my first jobs, the idea was instilled in me that a folder structure should follow three rules:

  1. File paths should be a human-readable description of the file and what it represents.
  2. Paths should be normalized to remove redundant info (e.g. projects/apx/sequences/bb/shot/bb0010 should be normalized to projects/apx/bb/bb0010).
  3. The file structure should help compartmentalize data and show its relationships.
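
Rule 2 can be sketched in a few lines of Python. This is only an illustration: the function name `normalize_path` and the set of "redundant" directory names are assumptions made up for this example, and it only handles relative, forward-slash paths.

```python
# Hypothetical set of directory names that only label the level below
# them and carry no information of their own.
REDUNDANT_SEGMENTS = {"sequences", "shot", "shots"}


def normalize_path(path, redundant=REDUNDANT_SEGMENTS):
    """Drop purely descriptive segments from a relative, '/'-separated path."""
    parts = [part for part in path.split("/") if part and part not in redundant]
    return "/".join(parts)


normalized = normalize_path("projects/apx/sequences/bb/shot/bb0010")
print(normalized)  # projects/apx/bb/bb0010
```

The project, sequence, and shot codes survive because they identify something; the category folders around them do not.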

I have used these rules, and they have given me some good guidelines, but lately I feel I might be of a different mindset. Even though good file organization is ideal, the overhead of maintenance and migrations might introduce more problems in the future, since working online and across the world means sharing data quickly. This is especially true when the file browser and spreadsheet-style visualization of files is a bit medieval.

Folder structures work really well for local storage, where pipeline tools require an artist to navigate a folder structure. But as asset management systems (AMS) have grown and developed, artists have needed to interact with the file system less.

One idea that I experimented with was dynamic path generation, where an asset's metadata (some piece of data) is stored in a database (PostgreSQL in my experiment). On top of that, a pathing layer is introduced, whereby the database record never contains a path (unless the path is the data). Instead, the metadata used to describe the asset is fed to a dynamic path library that generates paths based on a grammar template (this is similar to the Shotgun Toolkit style, but less verbose). I really enjoyed working with this style of pathing structure because it allowed me to abstract the folder structure from the logic. This meant that we could change the folder structure and move files and only needed to update the grammar. This also allowed us to really focus on the database and storage.
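
To make the idea concrete, here is a minimal sketch of grammar-based path generation. It is not the library from the experiment: the `GRAMMAR` dictionary, the template name `shot_publish`, the metadata field names, and the `build_path` function are all hypothetical, and the metadata is a plain dict standing in for a database record.

```python
import string

# Hypothetical grammar: template names mapped to format strings.
# Changing the folder layout means editing these strings and nothing else.
GRAMMAR = {
    "shot_publish": "projects/{project}/{sequence}/{shot}/{task}/{name}_v{version:03d}.{ext}",
}


def template_fields(template):
    """Return the set of field names a format-string template expects."""
    return {field for _, field, _, _ in string.Formatter().parse(template) if field}


def build_path(template_name, metadata, grammar=GRAMMAR):
    """Generate a path from asset metadata; the record itself stores no path."""
    template = grammar[template_name]
    missing = template_fields(template) - set(metadata)
    if missing:
        raise KeyError(f"metadata is missing fields: {sorted(missing)}")
    return template.format(**metadata)


# Stand-in for a row fetched from the asset database (assumed field names).
record = {
    "project": "apx", "sequence": "bb", "shot": "bb0010",
    "task": "anim", "name": "bb0010_anim", "version": 3, "ext": "ma",
}

nested_path = build_path("shot_publish", record)

# A flatter layout only needs a different grammar, not different records.
flat_grammar = {"shot_publish": "store/{project}_{shot}_{task}_v{version:03d}.{ext}"}
flat_path = build_path("shot_publish", record, grammar=flat_grammar)

print(nested_path)
print(flat_path)
```

The same record produces both paths, which is the property that makes a folder migration a grammar change rather than a database change.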

There were some downsides to the strategy. It meant that edge cases, where an asset just needs to be created, saved, and then tracked in the database, HAD TO be conformed to match the grammar, or a new grammar attribute had to be added to the pathing schema. The other problem was that this made TD code more object oriented, which sounds good, but for simple 'one-off' scripts it's a bit overkill. Being forced to work with the asset object representations (Python objects or C types) meant that performing quick moves and migrations of files was almost impossible.

Ultimately, I consider this experiment a successful failure, and I moved on to refine it. One of the major improvements I wanted to make was to flatten the structure further and make artist interaction with the file system completely unnecessary. At first I was nervous about this, as it breaks the rules for file systems. However, I felt that artists don't want to use the file system if there is an easier way to present, pull, and push data for them.

So next I tried to see if I could improve things by taking some ideas from object storage. The flat nature of object storage is something I really love, and most asset data in a film is generated or ingested in some application where a record can be created to represent the more complex relationships.

The final experiment had three parts: the database, a fake file system UI, and a simple storage schema. More or less, the database held the relationship data and some extra data, and the file system UI (inspired by Dropbox) allowed artists to view the flat storage as a hierarchy or as more complex relationships (like the way a model relates to its shaded looks).
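
A rough sketch of how such a setup could separate relationships from storage is below. It is an illustration under assumptions, not the experiment's code: it uses an in-memory SQLite database instead of PostgreSQL so it runs anywhere, and the table names, column names, the `has_look` relation, and the helper functions are all made up for this example.

```python
import sqlite3
import uuid


def make_catalog():
    """Create an in-memory catalog: assets plus typed relations between them."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE assets (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            kind TEXT NOT NULL,
            storage_key TEXT NOT NULL UNIQUE  -- flat key, no folders
        );
        CREATE TABLE relations (
            source_id INTEGER NOT NULL REFERENCES assets(id),
            target_id INTEGER NOT NULL REFERENCES assets(id),
            relation TEXT NOT NULL
        );
    """)
    return db


def add_asset(db, name, kind):
    """Register an asset under an opaque, flat storage key and return its id."""
    key = uuid.uuid4().hex
    cur = db.execute(
        "INSERT INTO assets (name, kind, storage_key) VALUES (?, ?, ?)",
        (name, kind, key),
    )
    return cur.lastrowid


def relate(db, source_id, target_id, relation):
    """Record a typed relationship between two assets."""
    db.execute("INSERT INTO relations VALUES (?, ?, ?)", (source_id, target_id, relation))


def related(db, source_id, relation):
    """Return (name, storage_key) rows for assets related to source_id."""
    rows = db.execute(
        "SELECT a.name, a.storage_key FROM relations r "
        "JOIN assets a ON a.id = r.target_id "
        "WHERE r.source_id = ? AND r.relation = ? ORDER BY a.name",
        (source_id, relation),
    )
    return rows.fetchall()


db = make_catalog()
model = add_asset(db, "robot_model", "model")
for look_name in ("robot_look_dirty", "robot_look_clean"):
    relate(db, model, add_asset(db, look_name, "look"), "has_look")

looks = related(db, model, "has_look")
print([name for name, _ in looks])
```

The storage keys say nothing about hierarchy; a UI can query `relations` to draw a folder-like tree, a model-to-looks graph, or any other view over the same flat storage.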

This approach was awesome, but it took some time to get used to. Though it made going to the file system to locate a file more difficult, it also made maintenance of the storage way easier (allowing me to mix object and block storage). The biggest gain was in the graphical representation of the data to the user.

The GUI, a web-based UI, allowed us to model relationships between assets in many visual ways very quickly. Ultimately this had nothing to do with the way the files were stored on disk, but it did show that the visualization of data relationships was the most important part.

Conclusion

Is a complex folder structure necessary in film? Simply put, yes, but not as a way to describe data or its relationships. The experiment showed that while a good folder structure is really wonderful (and beautiful at times), it is only one way that data can be visualized. Data relationships in film and games are now very complex, and trying to build a file system structure that embodies them is silly (it's why databases exist). My opinion is that a flat structure with a pathing system that abstracts the storage schema is the route to go, especially for mixing with object storage. This means the file system structure can be very simple and flat, and more time can be spent on coming up with intuitive visualizations of the data for the artists. This may mean that the file browser of the future is no longer a file browser but a relationship browser, which seems more in tune with how film and games work.