Media Storage API

There are many, many ways to host media on the Internet. MediaDrop supports a bunch of them right out-of-the-box, but we wanted to make it simple to add support for new methods.

By default, MediaDrop supports hosting media files:

  • on third-party services (like blip.tv, Dailymotion, Google Video, Vimeo, YouTube)
  • via any direct HTTP, HTTPS, or RTMP URL to a media file
  • locally, by serving uploaded files via the same webserver as MediaDrop

MediaDrop can also automatically transfer uploaded media files to a remote FTP server, provided that the FTP server makes the files available at an HTTP URL.

All of these options are implemented using MediaDrop’s Storage Engine API. This API is designed to make it simple for new storage methods to be added.

The Storage Engine system is located in mediadrop.lib.storage and the API is defined by the abstract class mediadrop.lib.storage.StorageEngine. Most of MediaDrop’s own StorageEngines inherit from the helpful subclasses mediadrop.lib.storage.FileStorageEngine and mediadrop.lib.storage.EmbedStorageEngine. Forms associated with StorageEngines all inherit from mediadrop.forms.admin.storage.StorageForm.

Summary

There are three components in MediaDrop’s file storage system: StorageEngines, MediaFiles, and StorageURIs. Each MediaFile object represents a unique file somewhere in cyberspace. Each URI represents a method and address via which that file might be accessed.

StorageEngine classes define the logic for storing and deleting media files and for listing all of the URIs via which those files might be accessed. For example, the mediadrop.lib.storage.ftp.FTPStorage class handles the logic for storing/deleting an uploaded file on a remote FTP server, and the logic for generating an HTTP URL at which the stored media can be accessed. It also defines a dictionary of settings and a form for editing those settings. On the other hand, a StorageEngine like mediadrop.lib.storage.youtube.YoutubeStorage contains the logic for parsing YouTube URLs, fetching thumbnails and descriptions from YouTube, and generating the information required to embed a YouTube video in a page. It defines no form, because it has no settings.

Because different StorageEngines will need to keep different data about the MediaFiles that they own, MediaFiles have a flexible unique_id attribute that the owning StorageEngine is responsible for populating and reading. Some StorageEngines use this field to store a serialized dict, some store a single ID number or even simply an HTTP URL.
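As a rough illustration of the serialized-dict case, an engine could pack its per-file data into the unique_id string and unpack it again when needed. This is only a sketch: the helper names pack_unique_id and unpack_unique_id and the bucket/key fields are hypothetical, and JSON is an assumed format rather than what any particular MediaDrop engine uses.

```python
import json


def pack_unique_id(data):
    """Serialize engine-specific data into a string suitable for unique_id."""
    # sort_keys keeps the output stable for the same input dict.
    return json.dumps(data, sort_keys=True)


def unpack_unique_id(unique_id):
    """Recover the engine-specific data from a unique_id string."""
    return json.loads(unique_id)


# Hypothetical per-file data for an imaginary remote storage engine.
uid = pack_unique_id({"bucket": "media", "key": "videos/42.mp4"})
info = unpack_unique_id(uid)
```

An engine that only needs a single ID number or URL can skip the serialization step and store that value directly.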

When a MediaFile is about to be used, it may be asked to return all of the URIs via which it is accessible. For example, a file stored on Amazon S3 may be accessible via HTTP or RTMP, while a file stored locally may be accessible via an HTTP URL or via a local file path. The StorageEngine that owns the MediaFile is responsible for generating this list. URIs are not stored in the database, but are generated at request time based on the properties of the involved MediaFile and StorageEngine.
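The idea of deriving URIs on demand can be sketched as follows. Everything here is a simplified stand-in: SketchURI, FakeMediaFile, LocalSketchEngine, and the base_url/base_path settings are hypothetical names for illustration and do not mirror MediaDrop's real StorageURI or MediaFile classes.

```python
from collections import namedtuple

# Hypothetical stand-ins for MediaDrop's StorageURI and MediaFile objects.
SketchURI = namedtuple("SketchURI", ["scheme", "file_uri"])
FakeMediaFile = namedtuple("FakeMediaFile", ["unique_id"])


class LocalSketchEngine(object):
    """Toy engine that serves locally stored files over HTTP and from disk."""

    def __init__(self, base_url, base_path):
        self.base_url = base_url.rstrip("/")
        self.base_path = base_path.rstrip("/")

    def get_uris(self, media_file):
        # Nothing is read from a database here: each URI is rebuilt from the
        # engine settings and the file's unique_id every time it is requested.
        name = media_file.unique_id
        return [
            SketchURI("http", "%s/%s" % (self.base_url, name)),
            SketchURI("file", "%s/%s" % (self.base_path, name)),
        ]


engine = LocalSketchEngine("http://example.com/media/", "/var/media")
uris = engine.get_uris(FakeMediaFile(unique_id="42-clip.mp4"))
```

Because the list is computed per request, changing an engine setting such as the base URL takes effect for every file that engine owns without touching stored records.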

Internal Process

When a new MediaFile is being added to MediaDrop (for example, via the admin Add New Media form, or via the front-end Upload form), it may come as the result of a populated file input (e.g. an uploaded MP4 file), or a populated text input (e.g. a YouTube URL).

MediaDrop will attempt to find the most appropriate StorageEngine to handle the given file/string combo. To do so, it will iterate over all of the available StorageEngines, calling engine.parse() with the file or string objects as parameters. If a given StorageEngine is capable of handling the provided data, it will return a metadata dict as described in the mediadrop.lib.storage.StorageEngine.parse() docstring. If a StorageEngine is incapable of handling the provided data, it will raise a mediadrop.lib.storage.UnsuitableEngineError.
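The selection loop described above can be sketched in a few lines. This is not the code of add_new_media_file(); the exception class is a local stand-in for mediadrop.lib.storage.UnsuitableEngineError, and YoutubeSketch, RemoteURLSketch, and find_engine are hypothetical names with deliberately naive URL checks.

```python
class UnsuitableEngineError(Exception):
    """Local stand-in for mediadrop.lib.storage.UnsuitableEngineError."""


class YoutubeSketch(object):
    def parse(self, file=None, url=None):
        # Naive check, for illustration only.
        if url is None or "youtube.com/watch" not in url:
            raise UnsuitableEngineError("not a YouTube URL")
        return {"type": "video", "unique_id": url}


class RemoteURLSketch(object):
    def parse(self, file=None, url=None):
        if url is None:
            raise UnsuitableEngineError("no URL given")
        return {"type": "video", "unique_id": url}


def find_engine(engines, file=None, url=None):
    """Return (engine, metadata) for the first engine that accepts the input."""
    for engine in engines:
        try:
            meta = engine.parse(file=file, url=url)
        except UnsuitableEngineError:
            continue  # this engine cannot handle the input; try the next one
        return engine, meta
    raise UnsuitableEngineError("no engine could handle the input")


# The list order matters: the more specific engine is tried first.
engines = [YoutubeSketch(), RemoteURLSketch()]
engine, meta = find_engine(engines, url="http://example.com/clip.mp4")
```

With the YouTube sketch listed first, a YouTube watch URL is claimed by it, while any other URL falls through to the generic remote URL sketch.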

In order to ensure that the optimal StorageEngine is chosen, each StorageEngine class is responsible for defining a list of which other StorageEngine classes should come before it and which ones should come after it. For example, the StorageEngine responsible for uploading to a remote FTP server should be preferred over the default local file StorageEngine, while both should be tested before any URL-based StorageEngines, to ensure that if the user has somehow uploaded a file and provided a URL string, the uploaded file takes precedence over the text input. Likewise, the YouTube StorageEngine should be tested before the default RemoteURL storage engine so that a YouTube URL is not misclassified as a playable file.

It is up to the programmers to ensure that there are no cycles in this precedence graph. MediaDrop finds a topological ordering according to these provided restrictions, and iterates in that order.
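To make the ordering step concrete, here is a minimal sketch of a topological sort driven by try_before and try_after lists. It is not MediaDrop's sort_engines() implementation: the function name sort_engine_classes, the four toy classes, and the choice of Kahn's algorithm are assumptions made for illustration. Unlike the behaviour described above, this sketch also raises an error when it detects a cycle.

```python
from collections import deque


class Local(object):
    try_before = []
    try_after = []


class FTP(object):
    try_before = [Local]       # FTP should be tried before Local
    try_after = []


class RemoteURL(object):
    try_before = []
    try_after = [Local, FTP]   # file-based engines come first


class Youtube(object):
    try_before = [RemoteURL]   # avoid treating YouTube URLs as plain files
    try_after = [Local, FTP]


def sort_engine_classes(classes):
    """Order classes so every try_before/try_after constraint is respected."""
    # successors[a] holds the classes that must be tried after a.
    successors = dict((cls, set()) for cls in classes)
    for cls in classes:
        for later in cls.try_before:
            if later in successors:
                successors[cls].add(later)
        for earlier in cls.try_after:
            if earlier in successors:
                successors[earlier].add(cls)

    indegree = dict((cls, 0) for cls in classes)
    for cls in classes:
        for later in successors[cls]:
            indegree[later] += 1

    # Kahn's algorithm: repeatedly emit a class with no unmet predecessors.
    queue = deque(cls for cls in classes if indegree[cls] == 0)
    ordered = []
    while queue:
        cls = queue.popleft()
        ordered.append(cls)
        for later in classes:  # input order keeps the result deterministic
            if later in successors[cls]:
                indegree[later] -= 1
                if indegree[later] == 0:
                    queue.append(later)

    if len(ordered) != len(classes):
        raise ValueError("cycle in try_before/try_after constraints")
    return ordered


ordered = sort_engine_classes([RemoteURL, Youtube, Local, FTP])
```

For these four toy classes the constraints leave only one valid order: FTP, Local, Youtube, RemoteURL.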

The main logic for handling the creation of new MediaFiles is in the function mediadrop.lib.storage.add_new_media_file(). It is worthwhile to become familiar with its workings before attempting to write a new StorageEngine.

Implementation

A new StorageEngine can be added to MediaDrop simply by subclassing mediadrop.lib.storage.StorageEngine and registering that subclass with the Abstract Base Class.

Refer to the mediadrop/lib/storage/__init__.py file to see which properties must be implemented (all properties initialized as abstractproperty and all methods decorated by abstractmethod must be implemented in subclasses).

from mediadrop.lib.storage import StorageEngine

class MyStorage(StorageEngine):
    """
    Implement all abstract properties and abstract methods here ...
    """

StorageEngine.register(MyStorage)

As mentioned above, StorageEngines can optionally define editable properties in their _default_data dict. To make those properties editable, the StorageEngine must also provide a subclass of mediadrop.forms.admin.storage.StorageForm whose display and save_engine_params methods map the form values to and from the data dict. StorageEngines that do this will have links, in MediaDrop’s admin backend, to a page where an admin can use the rendered form to edit the StorageEngine’s properties. An example of a StorageEngine that has this feature is mediadrop.lib.storage.localfiles.LocalFileStorage.

Abstract Base Class

class mediadrop.lib.storage.StorageEngine(display_name=None, data=None)

Base class for all Storage Engine implementations.

__init__(display_name=None, data=None)

Initialize with the given data, or the class defaults.

Parameters:
  • display_name (unicode) – Name, defaults to default_name.
  • data (dict) – The unique parameters of this engine instance.

_default_data = {}

The default data dictionary to create from the start.

If you plan to store something in _data, declare it in this dict for documentation purposes, if nothing else. Down the road, we may validate data against this dict to ensure that only known keys are used.
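The relationship between _default_data, _data, and engine_params() might look roughly like the sketch below. EngineSketch is a hypothetical class that does not inherit from the real StorageEngine, the path and rtmp_server_uri keys are invented for illustration, and merging the supplied data over a copy of the defaults is a choice made for this sketch rather than documented MediaDrop behaviour.

```python
import copy


class EngineSketch(object):
    default_name = u"Sketch Engine"

    # Declare every key the engine may store, even if it starts out empty.
    _default_data = {"path": None, "rtmp_server_uri": None}

    def __init__(self, display_name=None, data=None):
        self.display_name = display_name or self.default_name
        # Copy the defaults so instances never share one mutable dict.
        self._data = copy.deepcopy(self._default_data)
        if data:
            self._data.update(data)

    def engine_params(self):
        """Return the data needed to recreate an equivalent instance."""
        return self._data


default_engine = EngineSketch()
custom_engine = EngineSketch(u"Archive", {"path": "/srv/archive"})
```

Passing custom_engine.engine_params() back in as data yields an instance with the same settings, which is the round trip engine_params() is meant to support.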

default_name

A user-friendly display name that identifies this StorageEngine.

delete(unique_id)

Delete the stored file represented by the given unique ID.

Parameters: unique_id (unicode) – The identifying string for this file.
Return type: boolean
Returns: True if successful, False if an error occurred.

engine_params()

Return the unique parameters of this engine instance.

Return type: dict
Returns: All the data necessary to create a functionally equivalent instance of this engine.

engine_type

A unique identifying unicode string for the StorageEngine.

get_uris(media_file)

Return a list of URIs from which the stored file can be accessed.

Parameters: media_file (MediaFile) – The associated media file object.
Return type: list
Returns: All StorageURI tuples for this file.

is_singleton

A flag that indicates whether this engine should be added only once.

parse(file=None, url=None)

Return metadata for the given file or URL, or raise an error.

It is expected that different storage engines will be able to extract different metadata.

Required metadata keys:

  • type (generally ‘audio’ or ‘video’)

Optional metadata keys:

  • unique_id
  • container
  • display_name
  • title
  • size
  • width
  • height
  • bitrate
  • thumbnail_file
  • thumbnail_url

Parameters:
  • file (cgi.FieldStorage or None) – A freshly uploaded file object.
  • url (unicode or None) – A remote URL string.
Return type: dict
Returns: Any extracted metadata.
Raises UnsuitableEngineError: If file information cannot be parsed.

postprocess(media_file)

Perform additional post-processing after the save is complete.

This is called after parse() and store() have run, thumbnails have been saved, and the changes have been flushed to the database.

Parameters: media_file (MediaFile) – The associated media file object.
Returns: None

settings_form

Return an instance of settings_form_class if defined.

Return type: mediadrop.forms.Form or None
Returns: A memoized form instance, since instantiation is expensive.

settings_form_class = None

Your mediadrop.forms.Form class for changing _data.

store(media_file, file=None, url=None, meta=None)

Store the given file or URL and return a unique identifier for it.

This method is called with a newly persisted instance of MediaFile. The instance has been flushed and therefore has its primary key, but it has not yet been committed. An exception here will trigger a rollback.

This method need not necessarily return anything. If parse() returned a unique_id key, this can return None. It is only when this method generates the unique ID, or if it must override the unique ID from parse(), that it should be returned here.

This method SHOULD NOT modify the media_file. It is provided for informational purposes only, so that a unique ID may be generated with the primary key from the database.

Parameters:
  • media_file (MediaFile) – The associated media file object.
  • file (cgi.FieldStorage or None) – A freshly uploaded file object.
  • url (unicode or None) – A remote URL string.
  • meta (dict) – The metadata returned by parse().
Return type: unicode or None
Returns: The unique ID string. Return None if not generating it here.
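The return-value rule for store() can be sketched like this. StoreSketch, FakeMediaFile, and FakeUpload are hypothetical stand-ins, the sketch does not actually write the file anywhere, and the "<primary key>-<filename>" ID scheme is an assumption chosen only to show the primary key being used.

```python
from collections import namedtuple

# Hypothetical stand-ins for a flushed MediaFile and an uploaded file.
FakeMediaFile = namedtuple("FakeMediaFile", ["id"])
FakeUpload = namedtuple("FakeUpload", ["filename"])


class StoreSketch(object):
    def store(self, media_file, file=None, url=None, meta=None):
        meta = meta or {}
        # parse() already supplied a unique ID, so there is nothing to return.
        if meta.get("unique_id") is not None:
            return None
        if file is None:
            raise ValueError("nothing to derive a unique ID from")
        # media_file is only read, never modified: its primary key makes the
        # generated ID unique even when two uploads share a filename.
        return u"%d-%s" % (media_file.id, file.filename)


engine = StoreSketch()
generated = engine.store(
    FakeMediaFile(id=7), file=FakeUpload("clip.mp4"), meta={"type": "video"}
)
inherited = engine.store(
    FakeMediaFile(id=8),
    url="http://example.com/a.mp4",
    meta={"type": "video", "unique_id": u"http://example.com/a.mp4"},
)
```

Here generated holds the newly built ID, while inherited is None because the ID from parse() is kept as-is.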

transcode(media_file)

Transcode an existing MediaFile.

The MediaFile may be stored already by another storage engine. New MediaFiles will be created for each transcoding generated by this method.

Parameters: media_file (MediaFile) – The MediaFile object to transcode.
Raises CannotTranscode: If this storage engine can’t or won’t transcode the file.
Return type: NoneType
Returns: Nothing

try_after = []

Storage Engines whose parse() should be tried before this class’s.

This is a list of StorageEngine class objects which is used to perform a topological sort of engines. See sort_engines() and add_new_media_file().

try_before = []

Storage Engines whose parse() should be tried after this class’s.

This is a list of StorageEngine class objects which is used to perform a topological sort of engines. See sort_engines() and add_new_media_file().

You're reading the documentation for MediaDrop 0.11dev (current git master). For the latest stable release please consult the documentation for MediaCore CE 0.10.