When reading about smart homes, you’ll probably have seen ‘Amazon Alexa’ and ‘Amazon Echo’ used almost interchangeably, but they’re actually quite different. It’s almost like the difference between a guitar player and the guitar… one (the player) uses a tool (the guitar) to do something (play music).
An Amazon Echo is a voice-activated, internet-connected smart speaker which uses Amazon Alexa to understand and carry out voice commands. There’s more to it than that, but basically Echo is a device which relies on a cloud-based technology (Alexa) to do things.
Back in 2002, a 38-year-old Jeff Bezos wrote to all Amazon software developers and told them that they had to completely change how all their software projects spoke to each other. This became known as the ‘API mandate’.
The mandate required a move to an API-based ‘service-oriented architecture’, which is just a fancy way of saying that each individual ‘bit’ of Amazon’s software should be simple for other software to speak to.
Whilst this might seem irrelevant, this decree from Bezos was actually quite prescient: 18 years later, pretty much all large software projects are built in the same API-first way.
This is important because ‘Amazon Alexa’ is not just a single bit of software. Instead, it is a series of little software programs which can speak to each other in a simple, common way.
In other words, when you say “Alexa, play Miley Cyrus’ latest hit song”, the following flow happens:
- Your Echo device detects that you said “Alexa”, and enters listening mode. In other words, it now actively starts listening to what you are saying.
- The audio “play Miley Cyrus’ latest hit song” is captured, and sent to an Amazon software service which understands what this request truly means.
- It’s understood that you want Miley Cyrus’ latest hit song, so a different Amazon software service is spoken to, in order to discover what song this is. This other service responds with “Slide Away (2019)”, after looking up Miley Cyrus songs and working out which is (most likely to be) the latest hit song.
- Yet another request is made, this time to a ‘play music’ software service. This service is asked to play Slide Away (2019) by Miley Cyrus, and it starts playing this through your Echo device.
So why am I telling you this? Simple, because the above flow – a series of requests being made to different software services – is Amazon Alexa.
In other words, Amazon Alexa is a cloud-based service which understands voice requests, and then calls the correct software service(s) to process your request.
This might be a ‘discover music’ service, or it might be a ‘smart doorbell video’ service, or a ‘control a smart plug device’ service. Whatever it is, Jeff Bezos’ ruling in 2002 that all Amazon software should follow an API mandate was really important – because this ruling is what (eventually) gave life to Amazon Alexa.
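To make the idea of ‘little services speaking to each other’ more concrete, here is a minimal Python sketch of the flow described above. It is purely illustrative and is not how Amazon actually implements Alexa: the function names, the `Intent` shape, the tiny `CATALOGUE` and its ‘is a hit’ flags are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical music catalogue: artist -> list of (title, year, is_hit).
CATALOGUE = {
    "Miley Cyrus": [("Wrecking Ball", 2013, True), ("Slide Away", 2019, True)],
}


@dataclass
class Intent:
    """What the request means, in a form other services can use."""
    action: str  # e.g. "play_music" or "switch_plug"
    target: str  # e.g. an artist name or a device name


def understand(transcript: str) -> Intent:
    """Stand-in for the 'understanding' service (a toy keyword matcher)."""
    text = transcript.lower()
    if text.startswith("play"):
        for artist in CATALOGUE:
            if artist.lower() in text:
                return Intent("play_music", artist)
    if text.startswith("turn on"):
        return Intent("switch_plug", text[len("turn on"):].strip())
    return Intent("unknown", "")


def discover_latest_hit(artist: str) -> str:
    """Stand-in for the 'discover music' service: newest song flagged as a hit."""
    hits = [(title, year) for title, year, is_hit in CATALOGUE[artist] if is_hit]
    title, year = max(hits, key=lambda song: song[1])
    return f"{title} ({year})"


def play_music(song: str, artist: str) -> str:
    """Stand-in for the 'play music' service."""
    return f"Now playing {song} by {artist}"


def switch_plug(device_name: str) -> str:
    """Stand-in for the 'control a smart plug device' service."""
    return f"Switched on {device_name}"


def handle_request(transcript: str) -> str:
    """Route one request through whichever services it needs."""
    intent = understand(transcript)
    if intent.action == "play_music":
        song = discover_latest_hit(intent.target)
        return play_music(song, intent.target)
    if intent.action == "switch_plug":
        return switch_plug(intent.target)
    return "Sorry, I didn't understand that"


print(handle_request("play Miley Cyrus’ latest hit song"))
```

Each function only knows about its own small job, and `handle_request` simply decides which ones to call. Swapping in a new service (a smart doorbell, say) would mean adding one more function and one more branch, without touching the others.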
Amazon have produced a range of ‘smart speakers’ – speakers with a built-in ‘virtual assistant’ – since 2014. You can speak to them and issue commands to play music, get an update on the news or weather, and a whole lot more.
Amazon’s range of smart speakers is called ‘Echo’, with a variety of options within this range:
- The ‘Echo Dot’, a smaller Echo device with decent sound quality.
- A full size ‘Echo’, which has good sound quality due to the increased space for speakers.
- ‘Echo Show’, which also contains a screen to give more visual news and weather updates, along with playing videos from YouTube and other sources.
Echo is therefore a physical device which you buy in shops (or online, more likely).
By itself, the Echo is simply an internet-connected speaker and microphone. Yes, it has some in-built intelligence, because it can detect you saying “Alexa” and then enter listening mode (denoted by the blue ring).
Anything you say after “Alexa” is not processed by the Echo device itself, however. As I explained in the earlier section on Alexa, once you issue a command to your Echo device, the command itself is sent up to Amazon’s servers – and processed by Alexa.
This is therefore the key difference between Alexa and Echo – an Echo device is a physical item with a speaker and microphone, whilst Alexa is the internet-based software which actually processes your request.
In the first section I gave an example of what happens when you say “Alexa, play Miley Cyrus’ latest hit song” to an Echo device. I wanted to cover this in more detail to better show the difference between Alexa and Echo. Check out the diagram below:
Okay, so when a request (i.e. voice command) is ‘issued’, the different interactions – or flows – that roughly happen are below:
- Echo – its onboard microphone hears some sound which is probably a voice.
- Echo – a small on-board chip is passed this sound, and understands that it might be a human saying “Alexa”. The device therefore goes into full listening mode.
- Echo – the rest of your request (voice command) is captured, and saved to an audio file. This is then transmitted to the Amazon Alexa cloud.
- Alexa – an initial Amazon software service – known as an authentication/authorization service – checks to see that the device you are using is allowed to use Alexa. In this case, it’s a registered Echo device, so it’s fine.
- Alexa – your request is passed on to a second Alexa software service, this time one which works out what your voice command actually means. In other words, this service converts a human voice into computer commands.
- Alexa – since it’s now understood that you want to play a song, but your request relates to an unspecific song from a particular artist, some intelligence is needed. As in, it’s easy if you say “play Hello by Adele”, but your request asks for a latest hit from an artist – there could be multiple song candidates here.
So a third Alexa software service is called, a music ‘discovery service’ which looks at Miley Cyrus’ latest songs, and works out which is most likely to be a hit song. It decides on “Slide Away (2019)”. Sometimes it’ll get it wrong, of course, but overall Alexa is quite good at figuring out things like this.
- Alexa – a request is made to a fourth Alexa software service, this time to stream the song Slide Away (2019) by Miley Cyrus to your Echo device.
- Echo – music/audio data (i.e. your requested song) is sent to your Echo, and played out via its speakers.
That all sounds a little complicated, but hopefully it illustrates the different parts of Echo devices and Alexa, and at what point a voice command will switch from Echo to Alexa.
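The hand-over from Echo to Alexa can also be sketched in a few lines of Python. This is a toy model under stated assumptions, not Amazon’s real code: the class names, the `REGISTERED_DEVICES` set and the device IDs are invented, and the ‘audio’ is just text encoded as bytes so that the example stays runnable.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of device IDs that are allowed to use the cloud service.
REGISTERED_DEVICES = {"echo-kitchen"}


@dataclass
class AudioRequest:
    """What the device sends to the cloud: who it is, plus the captured audio."""
    device_id: str
    audio: bytes


class AlexaCloud:
    """Stand-in for the cloud side (Alexa)."""

    def handle(self, request: AudioRequest) -> str:
        # Alexa: authentication/authorization check on the calling device.
        if request.device_id not in REGISTERED_DEVICES:
            raise PermissionError(f"{request.device_id} is not registered")
        # Alexa: 'understand' the audio (here it is simply decoded text).
        command = request.audio.decode("utf-8")
        # Alexa: discovery and streaming services would be called from here.
        return f"Handling: {command}"


class EchoDevice:
    """Stand-in for the physical device (Echo)."""

    WAKE_WORD = "alexa"

    def __init__(self, device_id: str, cloud: AlexaCloud) -> None:
        self.device_id = device_id
        self.cloud = cloud

    def hear(self, sound: str) -> Optional[str]:
        # Echo: on-device check for the wake word; nothing is sent without it.
        words = sound.strip()
        if not words.lower().startswith(self.WAKE_WORD):
            return None
        # Echo: capture the rest of the request and transmit it to the cloud.
        command = words[len(self.WAKE_WORD):].lstrip(" ,")
        request = AudioRequest(self.device_id, command.encode("utf-8"))
        return self.cloud.handle(request)


echo = EchoDevice("echo-kitchen", AlexaCloud())
print(echo.hear("Alexa, play Hello by Adele"))
print(echo.hear("play Hello by Adele"))
```

The only thing `EchoDevice` does by itself is spot the wake word and forward the rest; everything after `self.cloud.handle(...)` happens in `AlexaCloud`. That single method call is the point where a voice command switches from Echo to Alexa.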
If you use Alexa without an Echo device (e.g. on your phone), then the above flow is the same – except that “Echo” is now “Your Smartphone”! Read below for more information on this.
You can use Alexa (the cloud technology) without buying an Echo (the physical device). This is done by installing the Amazon Alexa phone app, and tapping the central ‘talk’ button at the bottom. On the first use, it will ask for permissions:
Once permission is granted and you start speaking, your phone will first go into ‘Listening’ mode. Once you have finished asking a question or issuing a command, the app will go into ‘Speaking’ mode and show your result. In the case below, I asked “Alexa, what’s the weather like in Cardiff, UK?”:
In addition, you can also click the ‘Communicate’ tab in the Alexa app:
This allows you to interact with other Alexa-enabled devices, whether they are Echos or smartphones with the app. The “drop in” feature allows you to hear the audio of the other Echo device (as long as this is specifically enabled, of course) – a handy feature for a young child’s bedroom to check that they’re sleeping okay, for example.
In essence, Alexa is like Siri or Google Assistant: they are all cloud-based, internet-connected virtual assistants that you speak to, and that understand your request. As a result, there’s no need for a physical Echo device to use this technology.