This document is for an older version of Kazoo (version 4.3) that is no longer supported. You should upgrade and read the current documentation.


About Media

The media API lets you upload media for custom music on hold, IVR prompts, or TTS (if a proper TTS engine is enabled).

Kazoo provides some default system media files for common things like voicemail prompts. These are accessible via the media Crossbar endpoint as well, if your user has super duper admin privileges. To manipulate those resources, simply omit the /accounts/{ACCOUNT_ID} from the URI.

For example, to get a listing of all system media files:

curl -v -X GET -H "X-Auth-Token: {AUTH_TOKEN}" http://{SERVER}:8000/v2/media

You can then get the id of the media file and manipulate it in a similar fashion as regular account media (including TTS if you have a TTS engine like iSpeech configured).
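The difference between account media and system media URLs can be sketched as a small shell helper. The function name media_url, the BASE_URL variable, and its default value are assumptions made for illustration; substitute the address of your own Crossbar server.

```shell
# Hypothetical helper: build a Crossbar media URL.
# BASE_URL is an assumed placeholder for your Crossbar API root.
BASE_URL="${BASE_URL:-http://localhost:8000/v2}"

# media_url [ACCOUNT_ID] [MEDIA_ID]
# With an empty ACCOUNT_ID the /accounts/{ACCOUNT_ID} segment is omitted,
# which addresses system media instead of account media.
media_url() {
    account_id="${1:-}"
    media_id="${2:-}"
    if [ -n "$account_id" ]; then
        url="$BASE_URL/accounts/$account_id/media"
    else
        url="$BASE_URL/media"
    fi
    if [ -n "$media_id" ]; then
        url="$url/$media_id"
    fi
    printf '%s\n' "$url"
}
```

For example, curl -v -X GET -H "X-Auth-Token: {AUTH_TOKEN}" "$(media_url)" would list system media, while passing an account ID as the first argument would list that account's media.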

Media Languages

Part of the schema of media files is a language attribute. It defaults to the value of the default_language key in the system_config/media document ("en-us" unless changed). Properly defined media files can be searched for based on language using the basic filters provided by Crossbar:

curl -v -X GET -H "X-Auth-Token: {AUTH_TOKEN}"
curl -v -X GET -H "X-Auth-Token: {AUTH_TOKEN}"
curl -v -X GET -H "X-Auth-Token: {AUTH_TOKEN}"

The comparison is case-insensitive, but en and en-US are treated separately. If a media metadata object is missing a language attribute (on an older installation when system media was imported with no language field, say), use key_missing=language in the request.
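As a rough sketch of how such filtered requests might be composed, the helper below prints a query string that selects media either by language or by a missing language attribute. The parameter name filter_language is an assumption based on Crossbar's generic filter_{KEY} convention, and the function name is hypothetical; only key_missing=language is taken from the text above.

```shell
# language_query [LANGUAGE]
# Prints a query string selecting media by language.
# An empty or omitted LANGUAGE selects media whose metadata has no
# language attribute.
# NOTE: "filter_language" is an assumed parameter name;
# "key_missing=language" comes from the documentation above.
language_query() {
    lang="${1:-}"
    if [ -z "$lang" ]; then
        printf '%s\n' "key_missing=language"
    else
        printf 'filter_language=%s\n' "$lang"
    fi
}
```

The result would be appended to a media listing URL after a "?", for example "{MEDIA_URL}?$(language_query fr-FR)", where {MEDIA_URL} stands for the media endpoint of your server.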

Once you’ve assigned languages, you can use the language callflow action to set the language for that call.

Normalize Media Files

Kazoo can be configured to normalize uploaded media files. This can fix things like:

  • Normalize volume
  • Fix clipping
  • Standardize formats

When enabled, normalization converts all media to MP3 by default, using the sox utility for the conversion. The original upload is retained as well.

Enable Normalization Via SUP

  • Enable normalization for this particular server: sup kapps_config set normalize_media true
  • Enable normalization for all servers: sup kapps_config set_default normalize_media true

Enable Normalization Via DB

  1. Open the system_config/ document and create or update the key normalize_media, setting it to true.
  2. Flush the kapps_config cache (sup kapps_config flush) on all servers running Crossbar.

Set Target Format Via SUP

  • For the server: sup kapps_config set normalization_format ogg
  • For all servers: sup kapps_config set_default normalization_format ogg

Set Target Format Via DB

In the system_config/ document, create or update the key normalization_format to your desired format (mp3, wav, etc). Flush the kapps_config cache on all servers running Crossbar. All new uploads will be normalized (if possible) to the new format.

Normalization parameters

The default sox command is sox -t <input_format> - -r 8000 -t <output_format> - but this is configurable via the system_config/media document (or similar SUP command).

You can fine-tune the source and destination arguments using the normalize_source_args and normalize_destination_args keys respectively. By default, the source args are "" and the destination args are "-r 8000" (as can be seen from the default sox command above).

The normalizer code sends the binary data to sox on stdin and reads the normalized binary data back from stdout (these are the two " - " arguments in the command above).
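To make the command layout concrete, here is a minimal sketch that assembles the sox command line from the configurable pieces. The function name is hypothetical, and placing the source args before "-t <input_format>" is an assumption; only the default command shape and the default values come from the text above.

```shell
# build_sox_command INPUT_FORMAT OUTPUT_FORMAT [SOURCE_ARGS] [DEST_ARGS] [EXECUTABLE]
# Prints the sox command line. Defaults mirror the documented ones:
# empty source args, "-r 8000" destination args, and "sox" on the PATH.
# ASSUMPTION: source args are placed before "-t INPUT_FORMAT".
build_sox_command() {
    in_fmt="$1"
    out_fmt="$2"
    src_args="${3:-}"
    dst_args="-r 8000"
    if [ "$#" -ge 4 ]; then
        dst_args="$4"
    fi
    exe="${5:-sox}"

    cmd="$exe"
    if [ -n "$src_args" ]; then
        cmd="$cmd $src_args"
    fi
    # The first "-" tells sox to read the input from stdin.
    cmd="$cmd -t $in_fmt -"
    if [ -n "$dst_args" ]; then
        cmd="$cmd $dst_args"
    fi
    # The second "-" tells sox to write the output to stdout.
    printf '%s\n' "$cmd -t $out_fmt -"
}
```

With the defaults, build_sox_command wav mp3 prints the documented default command with wav and mp3 substituted. Running that command by hand with a file redirected to stdin and stdout redirected to a new file is a quick way to check that the local sox build can write the target format.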

You can also set the specific path for sox in the normalize_executable key, in case you’ve installed it to a non-standard path.

Be sure to install sox with mp3 support! Conversion will not happen (assuming you’re targeting mp3) if sox can’t write the mp3. You can check the media meta document for the key normalization_error if sox failed for some reason.
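A quick way to spot a failed normalization is to look for that key in the fetched metadata. The function below is a hypothetical sketch: it does a plain text match on the key name rather than parsing the JSON, so treat it as a rough check only.

```shell
# has_normalization_error
# Reads a media metadata JSON document on stdin.
# Returns 0 if the key "normalization_error" appears in it, 1 otherwise.
# This is a text match, not a JSON parser.
has_normalization_error() {
    grep -q '"normalization_error"'
}
```

The metadata document returned by a GET on a media ID could be piped into this function, for example with curl -s in place of curl -v so that only the response body reaches the pipe.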


Schema for media

| Key | Description | Type | Default | Required | Support Level |
| --- | --- | --- | --- | --- | --- |
| content_length | Length, in bytes, of the file | integer() | | false | supported |
| content_type | Used to override the automatic upload type | string('audio/mp3', 'audio/mpeg', 'audio/mpeg3', 'audio/x-wav', ...) | | | |
| description | A brief description of the media update, usually the original file name | string(1..128) | | false | supported |
| language | The language of the media file or text | string() | en-us | false | supported |
| media_source | Defines the source of the media | string('recording', 'upload', 'tts') | upload | | |
| name | A friendly name for the media | string(1..128) | | true | supported |
| prompt_id | The prompt this media file represents | string() | | false | |
| source_id | If the media was generated from a callflow module, this is the ID of the properties | string(32) | | false | beta |
| source_type | If the media was generated from a callflow module, this is the module name | string() | | false | beta |
| streamable | Determines if the media can be streamed | boolean() | true | false | supported |
| tts.text | The text to be converted into audio | string(1..) | | false | supported |
| tts.voice | The voice to be used during the conversion | string('female/en-US', 'male/en-US', 'female/en-CA', 'female/en-AU', ...) | | | |
| tts | Text-to-speech options used to create audio files from text | object() | {} | false | supported |


GET /v2/accounts/{ACCOUNT_ID}/media

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media

{
    "auth_token": "{AUTH_TOKEN}",
    "data": [
        {
            "id": "{MEDIA_ID}",
            "is_prompt": false,
            "language": "en-us",
            "media_source": "tts",
            "name": "Main AA BG"
        }
    ],
    "page_size": 1,
    "request_id": "{REQUEST_ID}",
    "revision": "{REVISION}",
    "status": "success"
}

Create a new media object (required before uploading the actual media data)

PUT /v2/accounts/{ACCOUNT_ID}/media

  • For a file:
curl -v -X PUT \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    -d '{"data":{
        "name": "File",
        "description": "My Test Media File"
        }}' \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media
  • For a prompt:
curl -v -X PUT \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    -d '{"data":{
        "streamable": true,
        "name": "FR-vm-enter_pass",
        "description": "FR - Enter Password prompt",
        "prompt_id": "vm-enter_pass"
        }}' \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media
  • For a TTS document (requires iSpeech to be enabled):
curl -v -X PUT \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    -d '{"data":{
        "name": "TestTTS",
        "media_source": "tts",
        "tts": {"text": "Testing TTS", "voice": "female/en-US"}
        }}' \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media
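When the TTS text comes from a variable rather than a literal, the request body has to be assembled with some care about quoting. The following is a minimal sketch with hypothetical function names; it escapes only backslashes and double quotes, and the fallback voice of female/en-US is a choice made for this example, not a documented default.

```shell
# json_escape STRING
# Escapes backslashes and double quotes for use inside a JSON string.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# tts_payload NAME TEXT [VOICE]
# Prints a request body for creating a TTS media object.
# VOICE falls back to female/en-US (an assumption for this example).
tts_payload() {
    name=$(json_escape "$1")
    text=$(json_escape "$2")
    voice=$(json_escape "${3:-female/en-US}")
    printf '{"data":{"name":"%s","media_source":"tts","tts":{"text":"%s","voice":"%s"}}}\n' \
        "$name" "$text" "$voice"
}
```

The output can be passed to curl with -d "$(tts_payload TestTTS 'Testing TTS')" in place of the inline JSON in the TTS request.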

A response (for the prompt request):

{
    "data": {
        "streamable": true,
        "name": "vm-enter_pass",
        "description": "FR - Enter Password prompt",
        "prompt_id": "vm-enter_pass",
        "language": "fr-fr",
        "tts": {
            "voice": "female/en-US"
        },
        "media_source": "upload",
        "id": "fr-fr%2Fvm-enter_pass"
    },
    "revision": "{REVISION}",
    "request_id": "{REQUEST_ID}",
    "status": "success",
    "auth_token": "{AUTH_TOKEN}"
}

Remove metadata

Optional parameter: "hard_delete": true performs a hard delete of the document (the default is a soft delete).

DELETE /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}

curl -v -X DELETE \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}

Get metadata about a media file

GET /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}

{
    "auth_token": "{AUTH_TOKEN}",
    "data": {
        "description": "tts file",
        "id": "{MEDIA_ID}",
        "language": "en-us",
        "media_source": "tts",
        "name": "Main AA BG",
        "streamable": true,
        "tts": {
            "text": "Thank you for calling My Amazing Company where we do amazing things. You may dial any extension at any time. To schedule an appointment, press 1. For billing questions about your account, press 2. For all other inquiries, press 0.  To hear this menu again, please stay on the line.",
            "voice": "female/en-US"
        },
        "ui_metadata": {
            "origin": "callflows",
            "ui": "monster-ui",
            "version": "4.0-7"
        }
    },
    "request_id": "{REQUEST_ID}",
    "revision": "{REVISION}",
    "status": "success"
}

Update metadata

POST /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}

curl -v -X POST \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}

List all prompts and the number of translations existing

GET /v2/accounts/{ACCOUNT_ID}/media/prompts

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/prompts

{
    "auth_token": "{AUTH_TOKEN}",
    "data": [
        {
            "agent-already_logged_in": 1,
            "agent-enter_pin": 1,
            "agent-invalid_choice": 1,
            "agent-logged_in": 1,
            "agent-logged_out": 1,
            "agent-not_call_center_agent": 1,
            "agent-pause": 1,
            "agent-resume": 1,
            "agent_enter_pin": 1,
            "agent_logged_already_in": 1,
            "agent_logged_in": 1,
            "agent_logged_out": 1,
            "cf-disabled": 1,
            "cf-disabled_menu": 1,
            "cf-enabled_menu": 1,
            "cf-enter_number": 1,
            "cf-move-no_channel": 1,
            "cf-move-no_owner": 1,
            "cf-move-too_many_channels": 1,
            "cf-not_available": 1,
            "cf-now_forwarded_to": 1,
            "cf-unauthorized_call": 1,
            "conf-alone": 1,
            "conf-bad_conf": 1,
            "conf-bad_pin": 1
        }
    ],
    "next_start_key": "conf-deaf",
    "page_size": 25,
    "request_id": "{REQUEST_ID}",
    "revision": "{REVISION}",
    "status": "success"
}

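The prompts listing above is paginated: the response carries a next_start_key when more results exist. As a sketch of fetching the following page, the hypothetical helper below appends that value to the listing URL. The query parameter name start_key is an assumption based on Crossbar's usual pagination, and the key is not URL-encoded here.

```shell
# next_page_url URL NEXT_START_KEY
# Prints the URL of the next page, or returns 1 when there is no
# next_start_key (meaning the listing is complete).
# ASSUMPTION: the pagination query parameter is named "start_key".
next_page_url() {
    url="$1"
    key="${2:-}"
    if [ -z "$key" ]; then
        return 1
    fi
    # Use "&" if the URL already has a query string, "?" otherwise.
    case "$url" in
        *\?*) sep="&" ;;
        *)    sep="?" ;;
    esac
    printf '%s%sstart_key=%s\n' "$url" "$sep" "$key"
}
```

With the response shown above, the second argument would be conf-deaf, and the resulting URL would be requested with the same X-Auth-Token header.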
List languages available

GET /v2/accounts/{ACCOUNT_ID}/media/languages

This request returns the languages found, along with a count of how many media files have each language defined. The "missing" key indicates how many media files have no associated language.

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/languages

"data": [{"en": 3, "missing": 1}]

Get the raw media file

Streams back the uploaded media.

GET /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}/

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    -H 'Accept: audio/mp3' \


There is a deprecated but maintained URL, GET /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}/raw, as well.

Add the media binary file to the media meta data

POST /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}/

curl -v -X POST \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    -H 'Content-Type: audio/mp3' \
    --data-binary @/path/to/file.mp3 \

{
  "auth_token": "{AUTH_TOKEN}",
  "data": {
    "id": "{MEDIA_ID}",
    "language": "{LANG}",
    "media_source": "upload",
    "name": "{FRIENDLY_NAME}",
    "streamable": true,
    "tts": {
      "voice": "female/en-US"
    }
  },
  "node": "{NODENAME}",
  "request_id": "{REQUEST_ID}",
  "revision": "{REVISION}",
  "status": "success",
  "timestamp": "{TIMESTAMP}"
}

curl -v -X POST \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    -H 'Content-Type: audio/x-wav' \
    --data-binary @/path/to/file.wav \

Use only one of the above; any subsequent POST will overwrite the existing binary data.
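Since the Content-Type header has to match the file being uploaded, a script might pick it from the file extension. This is a minimal sketch with a hypothetical function name; it covers only the two types used in the upload examples above and matches lowercase extensions only.

```shell
# media_content_type FILE
# Prints the Content-Type to send for FILE based on its extension.
# Returns 1 for extensions this sketch does not know about.
media_content_type() {
    case "$1" in
        *.mp3) printf '%s\n' "audio/mp3" ;;
        *.wav) printf '%s\n' "audio/x-wav" ;;
        *)     return 1 ;;
    esac
}
```

The result can be used as -H "Content-Type: $(media_content_type /path/to/file.mp3)" together with --data-binary @/path/to/file.mp3.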


There is a deprecated but maintained URL, POST /v2/accounts/{ACCOUNT_ID}/media/{MEDIA_ID}/raw, as well.

List all translations of a given prompt

GET /v2/accounts/{ACCOUNT_ID}/media/prompts/{PROMPT_ID}

You can use that list to fetch the specific media files associated with that prompt.

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/prompts/{PROMPT_ID}

{
    "auth_token": "{AUTH_TOKEN}",
    "data": [...],
    "page_size": 2,
    "request_id": "{REQUEST_ID}",
    "revision": "{REVISION}",
    "start_key": "vm-enter_pass",
    "status": "success"
}

List media files with specific language

GET /v2/accounts/{ACCOUNT_ID}/media/languages/{LANGUAGE}

curl -v -X GET \
    -H "X-Auth-Token: {AUTH_TOKEN}" \
    http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/languages/{LANGUAGE}

"data":["media_id_1", "media_id_2",...]

To get the IDs of the media docs missing a language:

curl -v -X GET -H "X-Auth-Token: {AUTH_TOKEN}" http://{SERVER}:8000/v2/accounts/{ACCOUNT_ID}/media/languages/missing

"data":["media_id_1", "media_id_2",...]

