I’m impatient. I like new toys and new technologies and I generally don’t want to wait to play with them. We’ve been happy Sonos customers for years now, and Alexa customers since the UK launch. So when Sonos and Amazon announced a partnership to directly integrate the two in August last year, I was cautiously optimistic, albeit impatient to see it working. They promised a beta in 2016 and a launch in early 2017. I didn’t hear anything for the rest of that year, and wondered when we’d get an update on progress…
Fast forward to last month, when Sonos gave their first public demo of the integration in action. It looks like they’ve made good headway on it, but they have – as their Software VP admitted – more work to do to “get the experience right” and have declined to give a launch date for the functionality.
So what’s an early-adopting tech enthusiast with an instant gratification problem to do?
Roll my own, of course, with the help of a number of open source projects. The most important pieces are the node-sonos-http-api from Jishi, which I mentioned a while back, a bunch of code to run in the Amazon Web Services Lambda system, and the configuration to create an Alexa Skill to consume it. For these steps I have fellow CTO and tech enthusiast Ryan Graciano to thank. His work, and the help from his army of testers on Reddit, means the setup is mostly painless.
The basic flow looks like this:
Talking to any of the Echo Dots in the house triggers the Alexa Skill, which calls the Lambda function running in AWS, which in turn makes requests to the Sonos API, exposed via an SSL reverse proxy on the edge of my network. The API then talks to the Sonos network internally to control the behaviour of all the Sonos devices.
With the API set to run automatically on a Raspberry Pi on my network, and the Skill and AWS functions configured, I can issue commands like:
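To make the middle of that flow a little more concrete, here is a rough Python sketch of the translation step: take the intent the Skill hands over and turn it into a URL on the Sonos API. This is not the code from Ryan's project, just an illustration. The base URL, the intent names (PlaySongIntent and friends) and the slot names are all made up, and it assumes the API accepts paths of the form /{room}/{action}/{parameters}.

```python
from urllib.parse import quote

# Assumption: the reverse proxy exposes the Sonos HTTP API at this address.
BASE_URL = "https://sonos.example.com"


def build_sonos_url(room, action, *params):
    """Build a /{room}/{action}/{params...} URL, percent-encoding each segment."""
    segments = [room, action, *params]
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"{BASE_URL}/{path}"


def handle_intent(event):
    """Translate a simplified Alexa intent request into a Sonos API URL.

    The intent names below are hypothetical; a real Skill defines its own.
    """
    intent = event["request"]["intent"]
    slots = {name: slot.get("value") for name, slot in intent.get("slots", {}).items()}
    room = slots["Room"]

    if intent["name"] == "PlaySongIntent":
        return build_sonos_url(room, "musicsearch", "spotify", "song", slots["Name"])
    if intent["name"] == "PlayAlbumIntent":
        return build_sonos_url(room, "musicsearch", "spotify", "album", slots["Name"])
    if intent["name"] == "VolumeUpIntent":
        # Relative volume change; the step size of 5 is an arbitrary choice.
        return build_sonos_url(room, "volume", "+5")
    raise ValueError(f"Unhandled intent: {intent['name']}")
```

In a real Lambda function the returned URL would then be requested over HTTPS through the proxy; the sketch stops at building it so the mapping is easy to see.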
Alexa, tell Sonos to play The Prodigy in the Living Room.
Or;
Alexa, tell Sonos to play the song Firestarter in the Living Room.
And to avoid specifying the room every time, you can tell it which room you want to use for future commands:
Alexa, tell Sonos to change room to Bedroom.
After that you can just say:
Alexa, tell Sonos to play the album Hamilton.
and;
Alexa, tell Sonos to turn it up.
and so on. Lovely stuff.
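The "change room" behaviour boils down to remembering the last room you chose and falling back to it when a command doesn't name one. A minimal Python sketch of that idea, with a hypothetical RoomMemory class that is not taken from the actual Skill code:

```python
class RoomMemory:
    """Remember the room chosen via 'change room' and use it as a fallback."""

    def __init__(self, default_room=None):
        self.current_room = default_room

    def change_room(self, room):
        """Handle 'change room to <room>'."""
        self.current_room = room

    def resolve(self, spoken_room=None):
        """Prefer the room named in the command, else the remembered one."""
        if spoken_room:
            return spoken_room
        if self.current_room is None:
            raise LookupError("No room selected yet; say 'change room to ...' first")
        return self.current_room


memory = RoomMemory()
memory.change_room("Bedroom")
room_for_album = memory.resolve()                 # no room spoken, uses Bedroom
room_for_artist = memory.resolve("Living Room")   # spoken room wins
```

Note that in-memory state like this does not reliably survive between Lambda invocations, so a real function would need to store the current room somewhere persistent.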
As with many of these projects, there are a lot of moving parts. Ryan has done an excellent job on his GitHub explaining the steps needed to get it up and running. I hit two wrinkles which slowed things down:
Dialect Matters
The Alexa Skill and the AWS Lambda code are sensitive to where they are hosted, and to what language your Echo Dot is configured to use. Initially I built the Alexa Skill with the language set to English (US), the default. My Echo Dots are all on a UK account, registered in the UK. Because of that mismatch, the Skill worked in test mode in the development console, but not in the real world when actually talking to the Dots. Recreating the Skill with English (UK) solved this.
Transatlantic Feature Parity
If you’re hosting your Skill and Lambda function in the US you have access to “Amazon Literals” within the interaction model’s schema of user intents. The intents are defined in a JSON document that contains the types of phrases and words that can be used.
Using Amazon Literals means that you can define concepts like “Artist” or “Album Name” and tell the model to look them up in Amazon’s music library. With this active, the matching of your spoken requests is apparently far more accurate than without. These literals are not available when hosting in the UK or Germany, but there is a way for us to manually provide our own list of “hints”. In my testing the recognition was pretty good without adding a huge list of hints, so I’ll keep an eye on this and see if I need to populate the NAMES list with a bunch of artists and albums to help it along.
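For a feel of what the hints approach involves, here is a small Python sketch that builds an intent schema as a dictionary and prepares a list of hint values for the NAMES slot type. The intent names, the ROOMS slot type and the example values are my own placeholders rather than the ones in Ryan's repository, and the one-value-per-line output assumes the developer console takes custom slot values in that form.

```python
import json

# Hypothetical schema: two intents that both take a name and a room.
intent_schema = {
    "intents": [
        {
            "intent": "PlayArtistIntent",
            "slots": [
                {"name": "Name", "type": "NAMES"},
                {"name": "Room", "type": "ROOMS"},
            ],
        },
        {
            "intent": "PlayAlbumIntent",
            "slots": [
                {"name": "Name", "type": "NAMES"},
                {"name": "Room", "type": "ROOMS"},
            ],
        },
    ]
}


def slot_types_used(schema):
    """Return the sorted set of slot types referenced by the schema."""
    return sorted({slot["type"] for intent in schema["intents"] for slot in intent["slots"]})


def custom_slot_values(values):
    """Format hint values one per line, trimmed and de-duplicated (case-insensitive)."""
    seen = set()
    cleaned = []
    for value in values:
        key = value.strip().lower()
        if key and key not in seen:
            seen.add(key)
            cleaned.append(value.strip())
    return "\n".join(cleaned)


schema_json = json.dumps(intent_schema, indent=2)
names_hints = custom_slot_values(["The Prodigy", "Hamilton", "the prodigy"])
```

Here schema_json is the JSON document describing the intents, and names_hints is the text you would supply as the values for the NAMES type.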
It will be interesting to see what the teams at Sonos and Amazon come up with to make the integration better than is currently possible. One thing they’ll be able to do is remove the “tell Sonos to” part of the request, since they’ll likely integrate at a deeper level. Hopefully that’ll make it possible to tell Alexa you always want to control Sonos with a preferred service, like Spotify, allowing for commands like:
Alexa, play The Prodigy in the Bedroom.
Or, even better, let Alexa control the Sonos in the same room as the Dot you’re talking to by mapping them to each other so you can just say:
Alexa, play Hamilton.
I do look forward to seeing what they come up with, but for now I’m already finding it easier to ask for music via the Dot instead of using the Sonos or Spotify apps.