Friday, September 9, 2011

The BaseFilesystemItemCache abstract class

See My Coding Style for an explanation of, well, my coding style...

BaseFilesystemItemCache is a nominal abstract class, providing basic caching capabilities for the DBFSpy stack. The intent is to provide an in-memory cache of the filesystem, so that file-retrieval doesn't have to wait on the back-end database, while allowing the individual file-objects themselves (IFilesystemItem-derived instances) to handle keeping track of their cache-age.
Cache (Property):
The object's IFilesystemItem-object cache
Guids (Property):
A list of all the Guid properties of the items in the cache.
Paths (Property):
A list of all of the Path properties of the items in the cache (calculated, so less efficient than using the Guids)
Add (Method):
Adds an IFilesystemItem to the cache.
_BuildCache (Method):
Builds the object's Cache from a supplied list of items.
GetItem (Method):
Returns a specific item from the cache, specified by either Guid or Path
GuidOfPath (Method):
Returns the Guid corresponding to the specified path.
PathOfGuid (Method):
Returns the Path corresponding to the specified Guid
Remove (Method):
Removes an item from the cache

class BaseFilesystemItemCache( object ):
    """Nominal abstract class, provides functional requirements for objects that can cache virtual filesystem items."""

    ###########################
    # Class Attributes        #
    ###########################

    ###########################
    # Class Property Getters  #
    ###########################

    def _GetCache( self ):
        """Gets the IFilesystemItem cache."""
        return self._cache

    def _GetCacheLifetime( self ):
        """Gets or sets the duration (in seconds) that items in the cache will be cached for."""
        return self._cacheDuration

    def _GetGuids( self ):
        """Gets a list of viable cached item-GUIDs."""
        return self._cache.keys()

    def _GetPaths( self ):
        """Gets a list of viable cached item-paths."""
        result = []
        for guid in self._cache.keys():
            result.append( self._cache[ guid ].Path )
        result.sort()
        return result

    ###########################
    # Class Property Setters  #
    ###########################

    def _SetCacheDuration( self, value ):
        if type( value ) not in [ types.IntType, types.LongType, types.FloatType ]:
            raise TypeError( '%s.CacheDuration error: Expected a numeric (float, int or long) value. %s is not valid.' % ( self.__class__.__name__, value ) )
        self._cacheDuration = value

    ###########################
    # Class Property Deleters #
    ###########################

    ###########################
    # Class Properties        #
    ###########################

    Cache = property( _GetCache, None, None, _GetCache.__doc__ )
    CacheDuration = property( _GetCacheLifetime, _SetCacheDuration, None, _GetCacheLifetime.__doc__ )
    Guids = property( _GetGuids, None, None, _GetGuids.__doc__ )
    Paths = property( _GetPaths, None, None, _GetPaths.__doc__ )

    ###########################
    # Object Constructor      #
    ###########################

    def __init__( self ):
        """Object constructor."""
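        self._cache = {}
        # Assumed default (always refresh), so that GetItem has a value to
        # compare against; derived classes should set a real duration.
        self._cacheDuration = 0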

    ###########################
    # Object Destructor       #
    ###########################

    ###########################
    # Class Methods           #
    ###########################

    def Add( self, item ):
        """Adds an item to the cache."""
        if not isinstance( item, IFilesystemItem ):
            raise TypeError( '%s.Add error: Expected an instance implementing IFilesystemItem, %s does not' % ( self.__class__.__name__, item ) )
        self._cache[ item.Guid ] = item

    def _BuildCache( self, cacheItems ):
        """Builds the cache from a supplied list of IFilesystemItem instances."""
        if not type( cacheItems ) in [ types.ListType, types.TupleType ]:
            raise TypeError( '%s.BuildCache error: Expected a list or tuple of IFilesystemItem instances.' % ( self.__class__.__name__ ) )
        self._cache = {}
        try:
            for item in cacheItems:
                self.Add( item )
        except TypeError as error:
            raise TypeError( '%s.BuildCache error: Expected a list or tuple of IFilesystemItem instances. %s' % ( self.__class__.__name__, error ) )
        except:
            raise

    def GetItem( self, guidOrPath ):
        """Returns an item from the cache (attempting to get it into the cache if necessary first) using the GUID or Path of the item to identify it."""
        item = None
        # Check to see if the guidOrPath is a GUID formatted string:
        if len( guidOrPath ) == 36 and GUIDCHECKER.sub( '', guidOrPath ) == '':
            # It's a GUID
            item = self._cache.get( guidOrPath )
        else:
            # Assume it's a path
            guid = self.GuidOfPath( guidOrPath )
            if guid is not None:
                item = self._cache[ guid ]
        # Check to see if the cached item has aged out and needs to be refreshed:
        if item is not None and time.time() - item.Atime > self._cacheDuration:
            item.Refresh()
        return item

    def GuidOfPath( self, path ):
        """Returns the GUID of the cache-item at a specified path, or None if no GUID is available."""
        for guid in self._cache.keys():
            if self._cache[ guid ].Path == path:
                return guid
        return None

    def PathOfGuid( self, guid ):
        """Returns the Path of the cache-item identified by the supplied GUID, or None if no Path is available."""
        if guid in self.Guids:
            return self._cache[ guid ].Path
        return None

    def Remove( self, item ):
        """Removes an item from the cache."""
        if item.Guid in self.Guids:
            del self._cache[ item.Guid ]

    ###########################
    # Static Class Methods    #
    ###########################

    pass
__all__ += [ 'BaseFilesystemItemCache' ]
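
GetItem relies on a module-level GUIDCHECKER that isn't shown in this post. As a rough sketch only, here is one pattern that would be compatible with the way GetItem uses it (strip every legal GUID character and see whether anything is left). The pattern itself and both helper names (looks_like_guid, is_strict_guid) are assumptions for illustration, written for current Python rather than the Python 2 of the listing above.

```python
import re
import uuid

# Hypothetical stand-in for the module-level GUIDCHECKER (not shown in the
# post): it matches any single character that may appear in a GUID string.
GUIDCHECKER = re.compile(r'[0-9a-fA-F-]')

def looks_like_guid(value):
    """Mirrors the test in GetItem: 36 characters, all hex digits or hyphens."""
    return len(value) == 36 and GUIDCHECKER.sub('', value) == ''

def is_strict_guid(value):
    """A stricter alternative that also checks the 8-4-4-4-12 grouping."""
    try:
        # uuid.UUID ignores hyphen placement, so compare against the
        # canonical form it renders back out.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False
```

Under this assumed pattern, a 36-character path made up only of hex digits and hyphens would be mistaken for a GUID, which is the kind of case the stricter check is meant to rule out.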

Commentary

The BaseFilesystemItemCache abstract class exists because I want to be able to cache filesystem items. Whether that proves to be useful or not will have to wait until the final executable is complete and can be run, but my expectation (at this time, at any rate) is that it will be beneficial, particularly in high-load use-cases. At the same time, any time caching gets involved, there are risks that have to be managed: latency of the cache, making sure it's current, etc., ad nauseam. The intention in DBFSpy is to spread the load out, such that the presence or absence of a filesystem item is managed at the BaseFilesystemItemCache level, while the caching of the actual data for the item is managed by the items themselves (which will implement IFilesystemItem).
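
The split of responsibilities described above can be sketched roughly as follows. This is a minimal illustration in current Python, not the DBFSpy implementation: SketchItem, SketchCache, the loader callable and the now parameter are all hypothetical names, with only Guid, Path, Atime, Refresh, Add and GetItem borrowed from the listing.

```python
import time

class SketchItem:
    """Hypothetical stand-in for an IFilesystemItem implementation."""
    def __init__(self, guid, path, loader):
        self.Guid = guid
        self.Path = path
        self._loader = loader      # assumed callable that hits the back end
        self.Data = loader()
        self.Atime = time.time()   # when the data was last fetched

    def Refresh(self):
        """The item, not the cache, knows how to reload its own data."""
        self.Data = self._loader()
        self.Atime = time.time()

class SketchCache:
    """Tracks which items are present; leaves reloading to the items."""
    def __init__(self, cache_duration):
        self._cache = {}
        self._cache_duration = cache_duration   # seconds

    def Add(self, item):
        self._cache[item.Guid] = item

    def GetItem(self, guid, now=None):
        item = self._cache.get(guid)
        if item is None:
            return None
        # 'now' is overridable only so the behavior is easy to exercise.
        now = time.time() if now is None else now
        if now - item.Atime > self._cache_duration:
            item.Refresh()
        return item
```

The cache only decides whether an item is present and whether it is old enough to be asked to refresh; what refreshing actually means stays inside the item.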

Also, to keep the identifiers in the database to a reasonable size, item IDs will be GUIDs - but those GUIDs are not terribly useful outside of that database context, since the native filesystem relies on paths. The Paths property is intended to provide those paths on demand, but I'm not confident (yet) that the mechanism I've chosen will perform as well as I want it to, since both Paths and GuidOfPath walk every item in the cache. The alternative, however, would potentially require more convoluted (and potentially fragile) code to allow an item's path to also be a cache-key - complete with the ability for items to be altered or deleted from the cache by either the GUID or the path...
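
For comparison, one possible shape of that alternative is a secondary path-to-GUID index kept in step with the main cache. This is only a sketch under assumptions, in current Python: DualKeyCache and the namedtuple Item are hypothetical, and only the Guid/Path attributes and the Add, GuidOfPath and Remove names come from the class above.

```python
from collections import namedtuple

# Hypothetical minimal item; real items would implement IFilesystemItem.
Item = namedtuple('Item', 'Guid Path')

class DualKeyCache:
    """Cache keyed by GUID, with a secondary path -> GUID index."""
    def __init__(self):
        self._items = {}          # GUID -> item
        self._guid_by_path = {}   # path -> GUID
        self._path_by_guid = {}   # GUID -> path as it was indexed

    def Add(self, item):
        # Clear any earlier entry for this GUID so a moved item does not
        # leave its old path behind in the index.
        self.Remove(item.Guid)
        self._items[item.Guid] = item
        self._guid_by_path[item.Path] = item.Guid
        self._path_by_guid[item.Guid] = item.Path

    def GuidOfPath(self, path):
        # A dictionary lookup instead of a scan over every cached item.
        return self._guid_by_path.get(path)

    def Remove(self, guid_or_path):
        # Accept either key: translate a path to its GUID if we know it.
        guid = self._guid_by_path.get(guid_or_path, guid_or_path)
        item = self._items.pop(guid, None)
        path = self._path_by_guid.pop(guid, None)
        if path is not None:
            self._guid_by_path.pop(path, None)
        return item
```

Even this small version has to keep three dictionaries consistent on every change, and it does not handle two items claiming the same path, which gives some sense of the fragility mentioned above.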
