I tend to think of the problem in terms of resources that have different quality levels. One implementation of such a model is described at http://www.diversityworkbench.net/Portal/wiki/ResourcesModel_v1.3. We distinguish between two kinds of steps: derivations, where a resource is deliberately derived from another resource to create a new one (e.g. by adding labels, or by cropping an image or sound so that only part of the previous content remains), and conversions that can in principle be automated. The latter we model as quality levels (compression levels of audio or images, different image resolutions, etc.).
We treat the analog original (a tape recording, photo print, etc.) as one of the quality levels.
Clearly there is a gray area between quality levels and explicitly derived works. However, I believe that expressing the intention behind a step (introducing semantics for it, as the model does) is very useful. This may be what Bob has in mind.
We handle identical copies of files through a similar mechanism for providers: identical copies can reside at different providers, and backup copies are handled through backup providers. Thus the model has AbstractResource (which may be derived from another AbstractResource) and ResourceInstances.
The maximum number of resource instances is the product AbstractResources x Providers x QualityLevels; there can be fewer, since not every combination has to exist.
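
To make the structure concrete, here is a minimal sketch in Python. It is only my illustration, not the schema from the wiki page: the field names (derived_from, provider, quality_level), the helper max_instances, and all example values are assumptions made up for this post. Only the names AbstractResource and ResourceInstance come from the model itself.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class AbstractResource:
    name: str
    # Set only for intentional derivations (cropping, labeling, ...).
    # Automatable conversions are NOT modeled here; they are quality levels.
    derived_from: Optional["AbstractResource"] = None


@dataclass(frozen=True)
class ResourceInstance:
    resource: AbstractResource
    provider: str        # hypothetical: providers as plain strings
    quality_level: str   # hypothetical: quality levels as plain strings


def max_instances(resources, providers, quality_levels):
    """Upper bound: AbstractResources x Providers x QualityLevels."""
    return len(resources) * len(providers) * len(quality_levels)


# Hypothetical example data.
original = AbstractResource("bird_song")
excerpt = AbstractResource("bird_song_excerpt", derived_from=original)

resources = [original, excerpt]
providers = ["primary_archive", "backup_provider"]
# The analog original is just one more quality level.
quality_levels = ["analog_tape", "wav_lossless", "mp3_compressed"]

# Every combination that could exist.
all_possible = [
    ResourceInstance(r, p, q)
    for r, p, q in product(resources, providers, quality_levels)
]

# In practice only some combinations exist, e.g. the tape is held
# at one provider only and the excerpt was never put on tape.
actual = [
    ResourceInstance(original, "primary_archive", "analog_tape"),
    ResourceInstance(original, "primary_archive", "wav_lossless"),
    ResourceInstance(original, "backup_provider", "wav_lossless"),
    ResourceInstance(excerpt, "primary_archive", "mp3_compressed"),
]

print(max_instances(resources, providers, quality_levels))
print(len(actual))
```

The point of the sketch is that a derivation creates a new AbstractResource linked to its source, whereas a change of provider or quality level only creates another ResourceInstance of the same AbstractResource.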
Gregor