Things I wished I knew before building my first WebRTC app - RTE2020

Alberto Gonzalez Trastoy was among the speakers at Agora’s Real-Time Engagement 2020 Conference. His presentation was about what makes building a live video application more complicated than a regular web app. Isn’t WebRTC supposed to handle everything for you? Alberto describes some of the unexpected nuances and challenges a web developer may encounter when building real-time engagement and communications applications, including networking, interoperability, scalability and security. He also discusses other complexities in building WebRTC applications and offers tools and alternatives to solve them.

  1. Things I Wished I Knew Before Building My First WebRTC App. Alberto Gonzalez Trastoy (@lbertogon), WebRTC.ventures
  2. 2015, 2018, 2020
  3. WebRTC Basics
     - Open source project
     - Real Time Communications framework
     - Secure
     - Updated frequently
     - Used in many major platforms and applications
     - Available on all modern browsers and native clients
  4. But mistakes can cause user responses like…
     - “This app doesn’t work on iPhone”
     - “Hi, hello! Can you hear me? I can hear you, but you can’t hear me”
     - “Yes I can see you, but video looks blurry”
     - “My microphone is not working, wait a second, I will restart my computer”
     - “I can’t connect, I think it is because I have very slow internet”
  5. WebRTC Challenges: Interoperability, Scalability, Networking, Security, Testing, Debugging/Troubleshooting
  6. Interoperability: WebRTC works everywhere, any browser and device*. Advanced features like screenshare are supported in some browsers and devices
  7. Interoperability: V.S
  8. Scalability: Mesh? Media Server. Scaling Media Servers
  9. Scalability: Some open source alternatives. Some commercial alternatives
  10. Networking issues: Restrictive
     - Proxy and firewall rules: Proxy authentication from clients required? Proxy blocking access to IP addresses. Firewall rules. NAT
     - Solution: Review your checklist, monitor and NAT Traversal
  11. Networking issues: Congested
     - Network congestion: Too many hosts in a local network. Low bandwidth. Interference from outside sources or faulty cabling
     - Solution: WebRTC has error resilience mechanisms but there is a limit. In that case, optimize, monitor and keep track of the logs
  12. WebRTC Video Bandwidth Requirements (Video Room Type: Minimal Available Bandwidth required at the client side)
     - 8 participants video room with Lo-res Video (240x180) + HD Audio: ~2 Mbps
     - 8 participants video room with SD Video (640x480) + HD Audio: ~8 Mbps
     - 8 participants video room with HD Video (1280x720) + HD Audio: ~22 Mbps
  13. How To Overcome Those Limits?
     - Minimize the number of videos the client subscribes to
     - Use VP8 Simulcast for large conferences or broadcasting*
     - Minimize video resolution and frame-rate
     - Optimize based on device type
     - Keep audio as first-class citizen
     *With codecs we always need to compromise. If most users use Safari and have good internet connection, H264 codec might be the way to go
  14. App Optimization Example: Layout example where the main speaker appears in the pink/red square and the other participants appear at the right
  15. WebRTC Security
  16. WebRTC E2EME Challenge
  17. WebRTC E2EME Solution: Using Insertable Streams API to use secure frames mechanism. Demo available here: https://webrtc.github.io/samples/src/content/peerconnection/endtoend-encryption/ (experimental)
  18. Web Testing vs WebRTC Testing
     - Both need functional testing
     - Compatibility testing ≠ Interoperability testing
     - N x Performance testing
  19. WebRTC Testing and Debugging Tools
     - Testing
     - Debugging: Chrome WebRTC Internals
     - And other testing and debugging proprietary applications…
  20. Back To The User Issues
     - “This app doesn’t work on iPhone”, said someone using chrome on iPhone
     - “Hi, hello! Can you hear me? I can hear you, but you can’t hear me”, said someone failing to accept or blocking access to the microphone
     - “Yes I can see you, but video looks blurry”, said someone using a video app that doesn’t use SVC or simulcast
     - “My microphone is not working, wait a second, I will restart my computer”, said someone with faulty headphones
     - “I can’t connect, I think it is because I have very slow internet”, said someone about an app that doesn’t prioritize audio and optimize available bandwidth
  21. Thank You. Alberto Gonzalez Trastoy, WebRTC.ventures, @lbertogon

Editor's Notes

  • I wanted to introduce this talk with an accurate picture of what working on RTC apps has been like:
    from a naïve version of me 5 years ago to a more experienced one.
    During that process I discovered that WebRTC is not just another browser API; it has its own community of experts.
    That brings us to today, after dozens of apps built for different use cases, and more being built now to help people interact during a global pandemic.
  • So why do we use WebRTC? Well, it is the go-to open source standard for low-latency streaming!
  • These probably sound familiar. I left out the “I can’t hear you. You are muted” because it is more of a UI/UX thing.
    But those are user problems that can happen due to mistakes in the implementation, and the implementation doesn’t lack challenges…
  • WebRTC is built to be easy to use, but it is also different from any other browser API.
    This is because it brings together hardware, telephony and software. So what makes building a live video application more complicated than a regular web app?
    The first challenge is interoperability.
  • Interoperability, which simply refers to “how well devices interact with each other”, is a common challenge.
    Basic one-to-one communication using WebRTC works in most desktop and mobile scenarios.
    More advanced features like screen sharing or managing multiple peer connections are supported by most browsers and devices, but not all.
    In some situations the hardware and operating system also play a role, limiting some functionality.
    Cameras and microphones are not equal across devices and are a common source of user problems.
    Handling those errors properly is key to a good user experience.
    (Debugging some of these types of issues might require tools like WebRTC internals or Wireshark.)
  • Browser or OS! The main functions work on all major browsers, with Safari and Edge being the latest to support WebRTC.
    But there are still some general interoperability issues:
    -Different codec preferences for each browser
    -Older browser versions with specific bugs, and major browser upgrades such as WebRTC in Edge (Edge being rebuilt on top of Chromium makes this easier now)
    Others, specific to Safari:
    -Screenshare
    -Safari WebRTC on Mac* using H264
    -For a 1-1 audio/video call, the integration with these major browsers is quite easy; the problems start to appear in more complex scenarios…
    The iOS implementation has some bugs/restrictions:
    -Forget about using browsers other than Safari
    -Some restrictions on autoplay rules (the guide to Safari WebRTC on WebRTC Hacks has some very useful info)
    -Safari on iOS is not ready for WebRTC screen sharing

    In a recent project for many-to-many video proctoring with additional one-to-one calls, we encountered all of those issues. For example, if you want to send more than one media stream, the previous video/audio is muted.

    Since Edge was rebuilt on top of Chromium, getting MS Edge to work consistently with WebRTC is not a struggle anymore (example using a multiparty WebRTC app).

    Firefox is also in sync with the WebRTC implementation, and there aren’t any major differences between Chromium and Firefox that I am aware of today.

  • Scalability doesn’t lack its challenges either. A mesh video call doesn’t work well beyond 4-5 participants (CPU and bandwidth limits). We need media servers for:
    1) Scalability: multiple participants in a video call (helps reduce the number of streams a client needs to send, usually to one)
    2) Integration with other communication technologies (PSTN via SIP trunking or streaming through RTMP to services)
    3) Processing of media streams (processing of video and audio streams at a very low level, like being able to run computer vision models)
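
To make the first point concrete, here is a minimal sketch of how many media streams one client handles in a mesh call compared with a call routed through an SFU. The function name and the `"mesh"`/`"sfu"` labels are illustrative assumptions, and the model ignores simulcast layers and separate audio/video tracks.

```python
from __future__ import annotations


def streams_per_client(participants: int, topology: str) -> tuple[int, int]:
    """Return (uplink_streams, downlink_streams) handled by a single client.

    Simplified model: one combined media stream per participant.
    """
    if participants < 1:
        raise ValueError("participants must be at least 1")
    others = participants - 1
    if topology == "mesh":
        # In a mesh, every client sends to and receives from every other client.
        return others, others
    if topology == "sfu":
        # With an SFU, a client uploads once and the server fans the stream out.
        return (1 if others > 0 else 0), others
    raise ValueError(f"unknown topology: {topology}")


for n in (2, 4, 8):
    print(n, "mesh:", streams_per_client(n, "mesh"), "sfu:", streams_per_client(n, "sfu"))
```

The uplink count is what grows with every new participant in a mesh, which is one reason CPU and bandwidth run out after a handful of participants.
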
    A single server can handle hundreds of media streams, but there is a limit to vertical scaling. Horizontal scaling with geolocation is a common approach for production RTC apps. To scale media servers horizontally, one common approach is to build a dispatcher that distributes requests from participants to different media servers. A slightly more advanced approach is geographical cascading, which can reduce latency between participants in different regions by letting each participant send and receive video from the closest media server.

    New codecs and standards like SVC (Scalable Video Coding) are helping to scale from the client side, sending better quality at lower bitrates and the right quality for each participant.
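
The dispatcher idea can be sketched as follows. This is only an illustration under stated assumptions: the `MediaServer` fields, the region labels and the "least loaded server in the participant's region" rule are all hypothetical, and it does not implement cascading (relaying media between servers).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MediaServer:
    name: str            # hypothetical server identifier
    region: str          # hypothetical region label, e.g. "eu" or "us"
    active_streams: int
    max_streams: int     # assumed capacity limit of this server

    def has_capacity(self) -> bool:
        return self.active_streams < self.max_streams


def pick_server(servers: List[MediaServer], participant_region: str) -> MediaServer:
    """Pick the least loaded server, preferring the participant's own region."""
    candidates = [s for s in servers if s.has_capacity()]
    if not candidates:
        raise RuntimeError("no media server has spare capacity")
    local = [s for s in candidates if s.region == participant_region]
    pool = local or candidates  # fall back to any region if none is local
    return min(pool, key=lambda s: s.active_streams / s.max_streams)


servers = [
    MediaServer("eu-1", "eu", active_streams=90, max_streams=100),
    MediaServer("eu-2", "eu", active_streams=40, max_streams=100),
    MediaServer("us-1", "us", active_streams=10, max_streams=100),
]
print(pick_server(servers, "eu").name)
```

A production dispatcher would also need to keep participants of the same room on the same server (or cascade between servers), which this sketch leaves out.
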

    But of course there are some limitations if we compare with VOD…
  • OSS:
    Jitsi: an SFU that implements its own signaling using Jingle (XMPP)
    Janus: a general purpose WebRTC server that can be set up as an SFU. Plugin architecture: SIP Gateway, VP9-SVC Video Room, live streaming…
    Kurento: can also be configured to function as an SFU or MCU, or both, in a single instance. OpenVidu is a new platform that facilitates the use of Kurento functionality from a higher-level client in your web or mobile applications
    We have worked with all of them for production projects or, at least, demos.

    Also, there are other popular platforms that weren’t originally developed to be WebRTC media servers but have WebRTC media server capabilities:
    Asterisk, FreeSWITCH: mostly used in telephony applications, they also support WebRTC and are frequently used in conjunction with JsSIP or SIP.js

    Pion: a new stack for Web Real-Time Communications. Pion is built in Go and lets developers use the WebRTC stack like small pieces of Lego. It can be used to build an SFU

    CPaaS:
    Will probably scale to millions of connections without you having to handle the distribution between servers, maintenance or geolocation. You just need to use their SDK and you are free to focus on the client solution
  • Checklist of proxy and firewall rules:
    -TCP ports like 443 should be allowed
    -UDP ports 1025-65535, used for RTP connections, should be open too. If not, at least UDP 3478 for TURN
    -Persistent WSS should be allowed for the signaling
    -NAT essentially hides a home or office's internal network from the public internet

    (Tech note) NAT Traversal:
    NAT traversal (reaching a client IP address hidden by NAT) is achieved using WebRTC’s built-in ICE gathering. The protocols are STUN and, as a last resort, TURN (less than 1/3 of calls need TURN, but chances are that you will need it if you are in a restrictive network).
    You will still need a TURN server to get around network limitations. You can deploy it yourself using coturn or use a 3rd party provider (a CPaaS will handle this for you).

    Monitoring is usually built into CPaaS platforms, but there are also some 3rd party platforms like callstats that handle it.
    (Or you can build it yourself, storing WebRTC errors in a logging database.)
    ----
    More on NAT traversal:
    Clients are typically situated on networks designed to protect them from public requests and may not have a public IP address => this often introduces complicated hurdles.
    Connecting to a simple web server is as easy as making an HTTP request, whereas WebRTC needs to use ICE, which provides a multitude of connection types, each of which may be tried in order to establish a successful connection.
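
As a rough illustration of why TURN ends up as the last resort, the sketch below computes ICE candidate priorities with the formula and recommended type preferences from RFC 8445. The default local preference of 65535 is an assumption for the example; real ICE agents may pick other values, and they rank candidate pairs rather than single candidates.

```python
# Recommended type preferences from RFC 8445 (higher is preferred).
TYPE_PREFERENCE = {
    "host": 126,   # address of a local interface
    "prflx": 110,  # peer reflexive, learned during connectivity checks
    "srflx": 100,  # server reflexive, learned through STUN
    "relay": 0,    # relayed through a TURN server
}


def candidate_priority(candidate_type: str,
                       local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)."""
    type_preference = TYPE_PREFERENCE[candidate_type]
    return (type_preference << 24) + (local_preference << 8) + (256 - component_id)


# Sort candidate types from most to least preferred.
for kind in sorted(TYPE_PREFERENCE, key=candidate_priority, reverse=True):
    print(f"{kind:6s} {candidate_priority(kind)}")
```

Relayed candidates get the lowest priority, so the TURN path is only selected when the direct and STUN-derived routes fail their connectivity checks.
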
  • Congestion in your network can be caused by faulty cabling, interference from outside sources or collisions.
    Also, too many hosts in a domain or not enough bandwidth (internet pipe size) can generate congestion and overload the network.
    Network congestion => high error rate/packet loss, which might cause low quality media

    How to optimize?
  • First, you need to know your use case and architecture. Is it a webinar, a video chat, a panel? From there, measure the minimal available bandwidth required.
    As an example, here I calculated an 8-party video room using video bitrate estimations based on resolution (at 24 fps), assuming an SFU media server (one video uplink and 7 downlinks).
    It will change depending on the codec used (VP9 is better, AV1 even better), but that’s the idea…
    Since not everyone has 22 Mbps available, how can we handle HD quality? How do we overcome these limits?
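
The arithmetic behind the slide can be sketched like this. The per-stream rates are not given in the talk: they are back-calculated here by dividing the slide's totals by the 8 streams (1 uplink + 7 downlinks), and the sketch assumes every stream uses the same bitrate with audio folded in.

```python
from typing import Optional

# Totals from the bandwidth slide for an 8-participant room, in Mbps.
SLIDE_TOTALS_MBPS = {"240x180": 2.0, "640x480": 8.0, "1280x720": 22.0}


def implied_per_stream_mbps(total_mbps: float, participants: int) -> float:
    """Per-stream rate implied by a total, assuming 1 uplink + (N - 1) equal downlinks."""
    return total_mbps / participants


def sfu_client_bandwidth_mbps(per_stream_mbps: float,
                              participants: int,
                              subscribed: Optional[int] = None) -> float:
    """Client bandwidth behind an SFU: one uplink plus the subscribed downlinks."""
    downlinks = participants - 1
    if subscribed is not None:
        downlinks = min(subscribed, downlinks)
    return per_stream_mbps * (1 + downlinks)


hd_stream = implied_per_stream_mbps(SLIDE_TOTALS_MBPS["1280x720"], 8)
print("implied HD stream rate:", hd_stream, "Mbps")
print("all 7 videos:", sfu_client_bandwidth_mbps(hd_stream, 8), "Mbps")
print("only 3 videos:", sfu_client_bandwidth_mbps(hd_stream, 8, subscribed=3), "Mbps")
```

Under these assumptions, subscribing to 3 videos instead of 7 halves the requirement, which is the first item on the optimization list.
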
  • Collaboration/presentation use cases might not need to display all the participants in a grid. We can show the dominant speaker and the rest as thumbnails

    Also, mobile phones have less CPU, so if you want to keep the best experience on mobile, keep the number of displayed videos small.
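
One way to picture this layout logic is a small subscription planner: the dominant speaker gets high quality, a limited number of other participants get thumbnails, and everyone else stays audio-only. The function name, the quality labels and the `max_videos` budget are hypothetical, and how a plan maps to simulcast layers depends on the media server or SDK in use.

```python
from typing import Dict, List


def plan_subscriptions(participant_ids: List[str],
                       dominant_speaker: str,
                       max_videos: int) -> Dict[str, str]:
    """Map each remote participant to 'high', 'low' or 'audio-only'."""
    if dominant_speaker not in participant_ids:
        raise ValueError("dominant speaker must be one of the participants")
    # Serve the dominant speaker first, then the rest in their original order.
    ordered = [dominant_speaker] + [p for p in participant_ids if p != dominant_speaker]
    plan: Dict[str, str] = {}
    videos_left = max_videos
    for pid in ordered:
        if videos_left <= 0:
            plan[pid] = "audio-only"  # audio is always kept
        else:
            plan[pid] = "high" if pid == dominant_speaker else "low"
            videos_left -= 1
    return plan


# A phone might use a smaller video budget than a desktop.
print(plan_subscriptions(["ana", "ben", "cho", "dee"], "cho", max_videos=3))
```

Re-running the planner whenever the dominant speaker changes keeps audio for everyone while capping the number of decoded videos per device.
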
  • Encrypted end-to-end.
    Core protocols defined by the IETF for providing WebRTC security: SRTP for media traffic and DTLS-SRTP for key negotiation
    This is an ideal scenario that gets more complicated if we need to support multiparty with media servers in between
  • With a media server we have an intermediate participant, which decrypts and re-encrypts the media. Obviously, that’s not great if you don’t trust the media server.
    Media streams are temporarily decrypted within the cloud servers and then immediately re-encrypted before being sent through the internet to the subscribing client. This decryption is necessary for managing group calls, other types of media exchange, intelligent quality control, and session recording.
  • An E2EE demo with Insertable Streams from webrtc-samples, where the Middlebox represents what the media server would see. Insertable Streams is not supported by default in Chrome yet, so you might need to enable it in chrome://flags in Canary.

    Kudos to CoSMo Software, Google and the rest of the open source community building this encryption mechanism, called Secure Frames, using the Insertable Streams API.
  • Compatibility testing on a basic web app is mainly focused on display across different devices and resolutions. Different operating systems display certain app elements differently.
    VS
    Interoperability testing needs, IN ADDITION, to test compatibility between different browsers and operating systems: how the RTC communication behaves, which codecs are used, etc.
    N times what a basic web app would need…
    For performance, basic web apps focus on single page load, CPU usage of the server and so on. For RTC, and WebRTC specifically, you will need to test the bandwidth limitations when sending and receiving media with different numbers of participants.
    Also, stress testing is easy to do for basic web apps: just open new tabs. But with WebRTC you can reach the client bandwidth and CPU limits quickly, so you will need multiple devices or VMs to properly stress test; a single device isn’t enough.
  • For testing and debugging the network or interoperability challenges mentioned before, KITE, a WebRTC-specific Selenium-based framework, will help identify problems in your app.
    Some proprietary apps that we have used are BrowserStack or testRTC for testing, and callstats for monitoring/debugging.

    For debugging, WebRTC internals is a quick way to identify WebRTC problems: it can be used to debug the flow of WebRTC sessions to determine issues during development.
    Wireshark is a more advanced alternative that gives more granularity, down to seeing the packets one by one.
  • KITE for interoperability testing: it uses Selenium to launch browsers and check whether video is sent or received, and also goes into other details such as whether the ICE gathering was successful.
    In this image we are testing with 4 browsers; for testing with Safari you need a Safari device or VM.
  • WebRTC internals: it can be used to debug the flow of WebRTC sessions to determine issues during development.
    For example, we can see here the outbound video and audio streams, and that video stopped being sent after a few seconds (could it be the user, or PLI packets stopped due to hardware?)
  • Back to the user issues: now, based on what I explained, we can guess what the problem could have been for each user…
    Applications today are held to a high standard and things are supposed to work, always. I hope you learned something and won’t make the same mistakes I made in the past.
