These embeds were done using the WordPress Media Uploader, which works great with captions:
In the following example, the full-size image is scaled down for the page and simply linked to itself for display in the shadowbox. This keeps things simple by involving only one image, but it increases page download times, especially on pages containing a number of small images.
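For reference, here is a rough sketch of the kind of markup this approach produces: one image file used both as the scaled-down display image and as the link target. The helper names, the sample path, and the `rel="shadowbox"` attribute are assumptions for illustration; your shadowbox plugin may hook onto links differently.

```typescript
// Hypothetical helper: escape a value before placing it in an HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build markup for an image that links to itself.
// The same URL is used for the <img> and the <a>, so the browser downloads
// the full-size file even though it is displayed at a smaller width.
function selfLinkedImage(url: string, displayWidth: number, alt: string): string {
  const safeUrl = escapeAttr(url);
  return (
    `<a href="${safeUrl}" rel="shadowbox">` + // rel value is an assumption
    `<img src="${safeUrl}" width="${displayWidth}" alt="${escapeAttr(alt)}" />` +
    `</a>`
  );
}

// Example path is made up for illustration.
const imageMarkup = selfLinkedImage("/uploads/photo.jpg", 300, "Sample photo");
console.log(imageMarkup);
```

Because the link and the image share one URL, there is nothing extra to upload or keep in sync, which is the simplification described above. The cost is that every visitor downloads the full file just to see the small version.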
And now, just to see what the code looks like, here is a WordPress embed of an MP3 that I uploaded using the WordPress uploader:
So it just creates a simple link to the file, which at least provides the URL. The BIG question is how MediaElement will handle this once it finds its way into core. MediaElement may interpret the link correctly and display the player and title, or we may have to tweak the code slightly to achieve the right effect:
This is the caption for the audio file
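To give an idea of what such a tweak might look like, here is a minimal sketch that turns the plain file URL into an HTML5 `<audio>` element with a caption underneath. This is only a guess at the shape of the markup, not what WordPress core will actually output; the helper names, the sample path, and the use of the `wp-caption` classes here are assumptions.

```typescript
// Hypothetical helper: escape text for safe use in HTML content and attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap an audio file URL in a player element plus caption.
// A player script could then enhance the <audio> tag, and the plain link
// inside it remains as a fallback for browsers without audio support.
function audioWithCaption(url: string, caption: string): string {
  const safeUrl = escapeHtml(url);
  return (
    `<div class="wp-caption">` + // class names are an assumption
    `<audio src="${safeUrl}" controls>` +
    `<a href="${safeUrl}">${safeUrl}</a>` +
    `</audio>` +
    `<p class="wp-caption-text">${escapeHtml(caption)}</p>` +
    `</div>`
  );
}

// Example path is made up for illustration.
const audioMarkup = audioWithCaption(
  "/uploads/song.mp3",
  "This is the caption for the audio file"
);
console.log(audioMarkup);
```

Keeping the original link inside the `<audio>` element means nothing is lost if the player never loads: the reader still gets the simple link to the file that the uploader creates today.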