Our physical environments are becoming increasingly packed with new computerized devices that increase our comfort and productivity and augment our everyday experience. These devices introduce a wealth of new and existing types of sensors into our surroundings and offer new channels of communication between humans and machines (voice, gestures), between machines themselves (new wireless protocol standards), and between machines and their motherships in the cloud.
The coexistence of these new devices and interaction models with our "legacy" IT infrastructure has not escaped the eyes of the digital world's earliest adopters: the hackers. In their minds, we have just created many more gateways into our corporate networks, with new types of sensorial data to collect (AKA steal) and subvert, and new protocols and formats to abuse in the process of gaining access to corporate assets.
As we researched the potential effect of this trend on enterprise cybersecurity, we focused on one specific, much-hyped type of interaction: voice. In particular, we examined the voice interaction capabilities that are most prominent in an enterprise environment: those of Microsoft's voice-activated assistant, Cortana.
During our research, which will be detailed in this session, we were able to fully demonstrate the following scenarios:
Using voice as a gateway into the enterprise: We will expose a previously unknown vulnerability in Microsoft Cortana's voice interface (responsibly disclosed to Microsoft and now patched) that allows close-proximity attackers to take over an unattended, locked Windows 10 computer.
Using voice for lateral movement: We will show how this attack can be further amplified to allow remote attackers to move laterally within the victim's network.
Systematically subverting information produced and used by sensorial systems: We will analyze, in technical detail, the protocol Cortana uses to talk to its cloud and will expose the "Newspeak" tool, which utilizes this knowledge to fiddle with the protocol for fun (pranks!) and profit (additional custom functionality!), or just to monitor it for security purposes.
We will conclude our presentation with some practical suggestions for defending against this new breed of threats to enterprise networks and assets.
4. COMFORT COMPUTING
• Many new devices
• Comfort of access becomes #1 priority
• Dedicated devices vs. Layover
• New Input method vs. new API
• We hope to find vulnerabilities
• Introduction of new input method into existing model
• Current inspection mechanisms are oblivious
5. VOICE ACTIVATION
• Amazon Echo vs. Windows 10
• Cortana + Speech Recognition
• Locked computers respond to voice
• Current anti-malware technology does not inspect voice messages
7. THE VOE ATTACK
Default Windows 10 Environment
• Cortana is on
• Cortana triggers on “Hey Cortana” by anyone
• Cortana triggers on locked machine
• Cortana can access some data on locked machine
Effects
• Proximity attack to get initial foothold
• Lateral movement after some initial compromise
9. HIGH LEVEL CORTANA MECHANICS
• Most of the processing is done in the cloud
• Two phases
• Audio processing
• wss://websockets.platform.bing.com/ws/cu/v3
• Binary + JSON
• Semantic processing
• https://www.bing.com/speech_render
• GET request, HTML response
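The two phases listed above can be told apart by endpoint alone. The following is a minimal sketch, assuming a monitoring component that sees the full request URL; the function name and the phase labels ("audio", "semantic", "other") are our own and not part of Cortana or any Microsoft API.

```python
from urllib.parse import urlsplit

# Endpoints as listed on the slide: (scheme, host, path).
AUDIO_ENDPOINT = ("wss", "websockets.platform.bing.com", "/ws/cu/v3")
SEMANTIC_ENDPOINT = ("https", "www.bing.com", "/speech_render")


def classify_phase(url: str) -> str:
    """Label a URL as belonging to the audio phase, the semantic phase, or neither.

    The labels are hypothetical names chosen for this sketch.
    """
    parts = urlsplit(url)
    # Compare scheme and host case-insensitively; ignore query and fragment.
    key = (parts.scheme.lower(), (parts.hostname or "").lower(), parts.path)
    if key == AUDIO_ENDPOINT:
        return "audio"
    if key == SEMANTIC_ENDPOINT:
        return "semantic"
    return "other"
```

Matching on the exact path keeps the sketch conservative: any other Bing URL is left unclassified rather than guessed at.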
17. SEMANTIC PROCESSING PHASE
• Correlation to previous phase
• X-FD-ImpressionGUID -> X-Search-IG
• Rendered by Cortana client
• Javascript launches local programs / processes
• Ambiguity may require an extra iteration
• http://www.bing.com/DialogPolicy
• Response depends on whether machine is reported to be locked or unlocked
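The correlation between the two phases could be tracked along the following lines. This is a rough sketch under stated assumptions: that the X-FD-ImpressionGUID value is observed together with the audio-phase result, that the same value reappears as X-Search-IG on the semantic-phase request, and that the two match up to letter case. The class and method names are hypothetical.

```python
from typing import Dict, Mapping, Optional


def get_header(headers: Mapping[str, str], name: str) -> Optional[str]:
    """Case-insensitive header lookup over a plain mapping."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


class ImpressionCorrelator:
    """Links a semantic-phase request back to the audio-phase result.

    Assumption: the audio phase exposes X-FD-ImpressionGUID and the
    semantic request carries the same value as X-Search-IG.
    """

    def __init__(self) -> None:
        self._transcripts: Dict[str, str] = {}

    def record_audio_result(self, headers: Mapping[str, str], transcript: str) -> None:
        guid = get_header(headers, "X-FD-ImpressionGUID")
        if guid:
            # Normalize case so later lookups do not depend on it.
            self._transcripts[guid.upper()] = transcript

    def match_semantic_request(self, headers: Mapping[str, str]) -> Optional[str]:
        ig = get_header(headers, "X-Search-IG")
        if ig is None:
            return None
        return self._transcripts.get(ig.upper())
```

A monitoring tool could use such a table to show, for each rendered action, the recognized utterance that triggered it.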
19. INVOKING BROWSING ACTIVITY
• “GOTO someserver DOT COM”
• Two options
• “Normal” sites – launch browser process, send query to Bing with domain name
• “Privileged” sites – launch browser, navigate to selected site
• Activity is performed even when machine is locked
• For some “privileged” sites access is NOT SSL protected
• CNN.COM
20. VOE ATTACK – INITIAL COMPROMISE
• Evil Maid Attack
• Plug in a USB network device
• Network device can be selected on a locked machine
• “GOTO CNN DOT COM”
• Invoke insecure browsing
• Intercept request, respond with malicious code
• Exploit browser vulnerabilities
• Capture domain credentials
• Probably better to serve the actual code from an SSL protected service (e.g. Amazon S3)
21. THE VOE ATTACK: EVIL MAID (LOCAL)
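• Attacker: “I’m in! But the computer is locked!”
• Attacker, speaking to the locked machine: “Hi Cortana! Go to cnn.com”
• Locked machine: Browse http://www.cnn.com
• Attacker’s network device, answering for http://www.cnn.com: “I’m CNN and here’s my malicious payload!”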
23. VOE ATTACK – LATERAL MOVEMENT
• Use initial compromise to install agent on compromised machine
• Launch ARP spoofing tool
• Play sound file – “GOTO CNN DOT COM”
• Intercept traffic of affected machines
24. THE VOE ATTACK: REMOTE BUTLER (LOCAL)
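• Attacker: “I’m in! But I want to move around!”
• Sound file played on the compromised machine: “Hi Cortana! Go to cnn.com”
• Nearby machines: Browse http://www.cnn.com
• Attacker, answering for http://www.cnn.com: “I’m CNN and here’s my malicious payload!”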
25. AFTERMATH AND OBSERVATIONS
• Reported to Microsoft in July 2017
• Mitigated in August 2017
• Mitigation required no patching for Windows OS
• No direct browsing is now allowed when machine is locked
• Environment mismatch
• Voice input method is available and responding when machine is locked
• Voice control introduced into laptops / desktops as though they are “hands free” devices (e.g. mobile phones)
• Initial compromise requires almost no code
26. NEWSPEAK TOOL
• Intercepting proxy
• TLS/SSL certificate must be installed on monitored devices
• In many organizations one already exists for web gateway monitoring or DLP
• Can monitor all Cortana requests and responses
• Originating device
• Request audio and audio processing results
• Semantic processing results (“action to be performed”)
• Can block or modify all Cortana requests and responses
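The monitoring and blocking roles described above can be illustrated with a small policy sketch. It is not the Newspeak implementation: the record fields, the action name "browse", and the single rule (refuse browsing when the machine is reported as locked) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CortanaEvent:
    """One monitored exchange; all field names are hypothetical."""
    device: str           # originating device, e.g. a hostname
    transcript: str       # audio processing result (recognized text)
    action: str           # semantic result, i.e. the action to be performed
    target: str           # what the action applies to, e.g. a URL
    machine_locked: bool  # lock state as reported by the client


def decide(event: CortanaEvent) -> str:
    """Return "block" or "allow" for a monitored event."""
    # Example rule only: no browsing while the machine is locked.
    if event.action == "browse" and event.machine_locked:
        return "block"
    return "allow"


def audit_line(event: CortanaEvent) -> str:
    """Format a one-line audit record including the verdict."""
    verdict = decide(event).upper()
    return (f"{verdict} device={event.device} "
            f"action={event.action} target={event.target}")
```

For instance, a "browse" event from a locked laptop would be logged with a BLOCK verdict, while the same event from an unlocked machine would pass.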
27. NEWSPEAK PROXY: ANGEL OR DEVIL?
• User: “Hi Cortana! Go to cnn.com”
• Cortana response: Browse http://www.cnn.com
• Proxy (“I’m a bad proxy!”) changes the response to: Browse http://www.foxnews.com
29. NEWSPEAK PROXY: ANGEL OR DEVIL?
• User: “Hi Cortana! Go to cnn.com”
• Cortana response: Browse http://www.cnn.com
• Proxy (“I’m a good proxy!”) changes the response to: Browse https://www.cnn.com
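The "good proxy" rewrite on this slide amounts to upgrading a plaintext navigation target to HTTPS. A minimal sketch of that one step follows, assuming the proxy has already extracted the target URL from the response; the function name is hypothetical, and how the URL is located inside the response is out of scope here.

```python
from urllib.parse import urlsplit


def upgrade_to_https(url: str) -> str:
    """Return the URL with an http scheme replaced by https.

    URLs with any other scheme are returned unchanged. An explicit
    port, if present, is left untouched, so a real deployment would
    need to handle cases such as ":80" separately.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "http":
        return url
    return parts._replace(scheme="https").geturl()
```

The upgrade only helps when the destination actually serves HTTPS, so a production rule would likely be limited to sites known to support it.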
30. FURTHER RESEARCH
• Introducing new input methods / interaction mechanisms brings not only “new code” vulnerabilities but also new attack concepts
• Extend research to other environments (Siri)
• Find more “dangerous” Cortana commands
• Extend the concept of voice attacks
• Vocal Malware
• Cross site speaking
"I am tired of my voice, the voice of Esau. My kingdom for a drink. On." - James Joyce, Ulysses, episode 9
There are many new devices that we are trying to fit seamlessly into our lives.
We are trying to create a “universal access method” for all devices. A mouse is not universal, since it does not connect to mobile devices. Touch is not universal, as it is not comfortable to use on stationary devices.