telepresence - the open source SIP TelePresence System - Technical Guide for Beta Testers
by
Mamadou DIOP
diopmamadou {AT} doubango[DOT]org
V2.1.0 (2013-08)
License
telepresence - the open source TelePresence System, version 2.1.0. Copyright 2013 Mamadou DIOP. Copyright 2013 Doubango Telecom <http://www.doubango.org>, <http://conf-call.org>, <https://code.google.com/p/telepresence/>, <https://groups.google.com/group/opentelepresence>.
GPLv3
telepresence is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. telepresence is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with telepresence. If not, see <http://www.gnu.org/licenses/>.
Commercial
The commercial version is an alternative to GPLv3 if opening your code is a problem. The commercial version also comes with high priority support. For more information, please contact us.
Versioning
Version 2.0.0 (June 11, 2013, alpha, closed source) - Mamadou DIOP - Initial version.
Version 2.0.1 (June 13, 2013, alpha, closed source) - Mamadou DIOP - Update OpenAL build script to use a release version instead of git.
Version 2.1.0 (August 19, 2013, beta, open source) - Mamadou DIOP - Fix all issues reported on the public group. Add new configuration parameters: video-listener-par, video-speaker-par, presentation-sharing-enabled, presentation-sharing-process-local-port, presentation-sharing-base-folder, presentation-sharing-app. Add http and https transports. Add support for presentation sharing (technical description, configuration entries and instructions to install OpenOffice/LibreOffice). Add support for mute/unmute (part of call session management, to be completed with more features). Requires Doubango r989 or later.
Table of Contents
1 Foreword
2 Scope
3 Main features
4 Building and installing the product
  4.1 Preparing the system
  4.2 Building third-party libraries
    4.2.1 Building libsrtp
    4.2.2 Building OpenSSL
    4.2.3 Building libogg, libvorbis and libtheora
    4.2.4 Building libspeex and libspeexdsp
    4.2.5 Building YASM
    4.2.6 Building libvpx
    4.2.7 Building opencore-amr
    4.2.8 Building libopus
    4.2.9 Building libgsm
    4.2.10 Building g729
    4.2.11 Building iLBC
    4.2.12 Building x264
    4.2.13 Building libfreetype
    4.2.14 Building libfaac
    4.2.15 Building FFmpeg
    4.2.16 Building OpenAL Soft
    4.2.17 Building OpenOffice/LibreOffice
    4.2.18 Building Doubango
  4.3 Building the Telepresence system
    4.3.1 Building the source code
    4.3.2 Installing the configuration and fonts files
5 Technical details
  5.1 SA versus AS modes
  5.2 Client Requirements
  5.3 Bandwidth management and congestion control
  5.4 Stereoscopic (spatial) 3D audio
  5.5 Selecting the speaker and listeners
  5.6 Audio mixer design
  5.7 Video mixer design
    5.7.1 Encoders and decoders
    5.7.2 Overlays
  5.8 Audio quality
  5.9 Video quality
    5.9.1 Packet loss recovery
    5.9.2 Jitter buffer
    5.9.3 Zero-artifacts
    5.9.4 AVPF tail length
    5.9.5 FEC (Forward Error Correction)
    5.9.6 RED (Redundant video data)
  5.10 NAT and firewall traversal
    5.10.1 Symmetric RTP
    5.10.2 ICE
    5.10.3 RTCP-MUX
  5.11 Security
    5.11.1 Signaling
    5.11.2 Media
  5.12 Protecting a bridge with password
  5.13 Recording conference to a file
  5.14 Presentation sharing
    5.14.1 Publishing the document
    5.14.2 Receiving feedbacks
    5.14.3 Fetching the presentation
    5.14.4 Closing the presentation
  5.15 Desktop or screen sharing
  5.16 Call session management
    5.16.1 Muting/unmuting
6 Configuration
  6.1 Debugging the system
  6.2 Network transports
  6.3 Security
  6.4 SIP registration
  6.5 NAT / Firewall traversal
  6.6 RTP buffer size
  6.7 AVPF tail length
  6.8 Audio/Video codecs
  6.9 Recording conference to a file
  6.10 Presentation sharing
  6.11 Audio
    6.11.1 Pivot settings
  6.12 Video
    6.12.1 Bandwidth and congestion control
    6.12.2 Output size, pixel aspect ratio and letterboxing
    6.12.3 Jitter Buffer
    6.12.4 Zero-artifacts
    6.12.5 Mixing type
    6.12.6 Overlays
    6.12.7 Patterns
  6.13 Bridge configuration
7 Testing the system
8 Known issues
9 Tips
  9.1 Lowering CPU
  9.2 Lowering bandwidth
  9.3 Improving audio quality
  9.4 Improving video quality
  9.5 Lowering recorded video file size
Table of Figures
Figure 1: Audio resampling
Table of Configurations
Configuration 2: Network transports
Configuration 3: Setting SSL certificates
Configuration 4: SRTP settings
Configuration 5: Enabling/disabling SIP registration
Configuration 6: Enabling/disabling NAT traversal features
Configuration 7: Setting RTP buffer size
Configuration 8: Setting AVPF tail length
Configuration 9: Setting audio/video codecs
Configuration 10: Recording conference to a file
Configuration 11: Presentation sharing
Configuration 12: Audio settings
Configuration 13: Setting video bandwidth and congestion control
Configuration 14: Setting output mixed video size
Configuration 15: Setting Pixel Aspect Ratio
Configuration 16: Enabling or disabling video jitter buffer
Configuration 17: Enabling or disabling zero-artifact
Configuration 18: Video overlays
Configuration 19: Bridge settings
1 Foreword
telepresence refers to a set of technologies which allow a person to feel as if they were present, to give the appearance of being present, or to have an effect, via telerobotics, at a place other than their true location (source: Wikipedia). SIP stands for Session Initiation Protocol and is a signaling protocol defined by the IETF in RFC 3261. SIP is widely used today to manage VoIP (Voice over IP) communication sessions and has been chosen as the signaling protocol for Next Generation Networks such as IMS (IP Multimedia Subsystem) or LTE (Long Term Evolution). The protocol has quickly become the de facto standard used to interconnect the IP world (Internet) with the PSTN (circuit-switched telephone networks). Our SIP TelePresence product is a free, open source, smart and powerful system that can be used on any SIP network without any change for complete integration. It is a good alternative to the well-known commercial products that cost several thousand dollars. The system can be used to connect any SIP endpoint. This is not yet another video conference system, for the simple reason that we target Full and Ultra HD real-time video @120fps and provide special rooms (XXL screens/monitors, devices, cameras, chairs) built for an amazing experience.
2 Scope
This technical guide is a reference document for beta testers; it explains why you need our telepresence system and how to leverage its power. This guide alone will not answer every question, which is why you can ask on our developer group about any issue. Please note that during the beta phase the group will be locked and you'll need to be invited to be able to post messages. Another way to report issues is to open a ticket on our tracker or ask on the Doubango public group.
3 Main features
This is a short but not exhaustive list of features supported in this beta version:
- Powerful MCU (Multipoint Control Unit) for audio and video mixing
- Stereoscopic (spatial) 3D and stereophonic audio
- Full (1080p) and Ultra (2160p) HD video up to 120fps
- Conference recording to a file (containers: *.mp4, *.avi, *.mkv or *.webm)
- Revolutionary way to share presentations: documents are streamed in the video channel to allow any SIP client running on any device to participate
- Smart adaptive audio and video bandwidth management
- Congestion control mechanism
- SIP registrar
- 4 SIP transports (WebSocket, TCP, TLS and UDP)
- SA (direct connection to SIP clients) and AS (behind a server, such as Asterisk, reSIProcate, openSIPS, Kamailio) modes
- Support for any WebRTC-capable browser (WebRTC demo client at http://conf-call.org/)
- Mixing different audio and video codecs on a single bridge (h264, vp8, h263, mp4v-es, theora, opus, g711, speex, g722, gsm, g729, amr, ilbc)
- Protecting a bridge with a PIN code
- Unlimited number of bridges and participants
- Connecting any SIP endpoint
- Easy interconnection with the PSTN
- NAT traversal (Symmetric RTP, RTCP-MUX, ICE, STUN and TURN)
- RTCP feedback (NACK, PLI, FIR, TMMBN, REMB) for a better video experience
- Secure signaling (WSS, TLS) and media (SDES-SRTP and DTLS-SRTP)
- Continuous presence
- Smart algorithm to detect speakers and listeners
- Different video patterns/layouts
- Multiple operating systems (Linux, OS X, Windows)
- 100% open source and free (no locked features)
- Full documentation
and many others. This short list is a good starting point to help you understand what you can expect from our Telepresence system.
4 Building and installing the product

4.1 Preparing the system
sudo yum update sudo yum install make libtool autoconf subversion git wget cmake gcc gcc-c++ pkgconfig
4.2 Building third-party libraries
4.2.1 Building libsrtp
libsrtp is optional unless you want to use WebRTC SIP clients. It's highly recommended. The WebRTC Telepresence demo client requires a system with SRTP enabled.
git clone https://github.com/cisco/libsrtp/ cd libsrtp CFLAGS="-fPIC" ./configure --enable-pic && make && make install
You should not use a packaged libsrtp because the latest development version is required; building the source yourself is highly recommended.

4.2.2 Building OpenSSL
OpenSSL is required if you want to use TLS, WSS (Secure WebSocket) or DTLS-SRTP (which also requires libsrtp). OpenSSL version 1.0.1 is required if you want support for DTLS-SRTP, which is mandatory for the WebRTC implementation from Mozilla (Firefox Nightly or Aurora). This section is only required if you don't have OpenSSL installed on your system, or if you are using a version prior to 1.0.1 and want to enable DTLS-SRTP. A quick way to get OpenSSL is to install the openssl-devel package, but this version will most likely be outdated (prior to 1.0.1). You can check the version like this: openssl version.
wget http://www.openssl.org/source/openssl-1.0.1c.tar.gz tar -xvzf openssl-1.0.1c.tar.gz cd openssl-1.0.1c ./config shared --prefix=/usr/local --openssldir=/usr/local/openssl && make && make install
If you get an error about "_SSL_CTX_set_tlsext_use_srtp" when you try to run the telepresence system, it means you have more than one OpenSSL version installed. This happens because the configure script detected support for DTLS-SRTP (thanks to the new OpenSSL) but at runtime your linker tries to load the old OpenSSL libraries. The easiest way to fix the issue is to remove the old OpenSSL version (look for libssl).

4.2.3 Building libogg, libvorbis and libtheora
These libraries are optional unless you want to use the *.webm or *.mkv containers. You can install the devel packages (recommended):
sudo yum install libogg-devel libvorbis-devel libtheora-devel
4.2.4 Building libspeex and libspeexdsp libspeex (audio codec) is optional but libspeexdsp (audio resampler, jitter buffer) is required. You can install the devel packages:
sudo yum install speex-devel
4.2.5 Building YASM
YASM is only required if you want to enable and build VPX (VP8 video codec) or x264 (H.264 codec). It's highly recommended.
wget http://www.tortall.net/projects/yasm/releases/yasm-1.2.0.tar.gz tar -xvzf yasm-1.2.0.tar.gz cd yasm-1.2.0 ./configure && make && make install
4.2.6 Building libvpx libvpx adds support for VP8 and is optional but highly recommended if you want support for video when using Google Chrome or Mozilla Firefox. libvpx is required if you want to use *.webm container or our WebRTC SIP Telepresence client. You can install the devel packages:
sudo yum install libvpx-devel
4.2.7 Building opencore-amr opencore-amr is optional. Adds support for AMR audio codec.
git clone git://opencore-amr.git.sourceforge.net/gitroot/opencore-amr/opencore-amr
cd opencore-amr
autoreconf --install && ./configure && make && make install
4.2.8 Building libopus
libopus is optional but highly recommended as it's an MTI (mandatory to implement) codec for WebRTC. Adds support for the Opus audio codec.
wget http://downloads.xiph.org/releases/opus/opus-1.0.2.tar.gz tar -xvzf opus-1.0.2.tar.gz cd opus-1.0.2 ./configure --with-pic --enable-float-approx && make && make install
4.2.9 Building libgsm
libgsm is optional. Adds support for the GSM audio codec. You can install the devel packages (recommended):
sudo yum install gsm-devel
4.2.10 Building g729 G729 is optional. Adds support for G.729 audio codec.
svn co http://g729.googlecode.com/svn/trunk/ g729b cd g729b ./autogen.sh && ./configure --enable-static --disable-shared && make && make install
4.2.11 Building iLBC iLBC is optional. Adds support for iLBC audio codec.
svn co http://doubango.googlecode.com/svn/branches/2.0/doubango/thirdparties/scripts/ilbc
cd ilbc
wget http://www.ietf.org/rfc/rfc3951.txt
awk -f extract.awk rfc3951.txt
./autogen.sh && ./configure && make && make install
4.2.12 Building x264 x264 is optional but highly recommended and adds support for H.264 video codec (requires FFmpeg). x264 is required if you want to use *.mp4 container.
wget ftp://ftp.videolan.org/pub/x264/snapshots/last_x264.tar.bz2 tar -xvjf last_x264.tar.bz2
# the output directory may be different depending on the version and date
cd x264-snapshot-20121201-2245
./configure --enable-shared --enable-pic && make && make install
4.2.13 Building libfreetype
libfreetype is required and is used for video overlays. You can install the devel packages (recommended):
sudo yum install freetype-devel
4.2.14 Building libfaac libfaac is optional unless you want support for AAC audio codec or *.mp4 container for recording.
wget http://downloads.sourceforge.net/faac/faac-1.28.tar.bz2 tar -xvjf faac-1.28.tar.bz2 cd faac-1.28 && ./configure && make && make install
Note: building the tests could fail but you can safely ignore it.

4.2.15 Building FFmpeg
FFmpeg is required even if you don't want support for video.
# [1] checkout source code
git clone git://source.ffmpeg.org/ffmpeg.git ffmpeg
cd ffmpeg
# [2] grab a release branch
git checkout n1.2
# [3] configure the source
./configure \
  --extra-cflags="-fPIC" \
  --extra-ldflags="-lpthread" \
  --enable-pic --enable-memalign-hack --enable-pthreads \
  --enable-shared --disable-static \
  --disable-network \
  --disable-ffmpeg --disable-ffplay --disable-ffserver --disable-ffprobe \
  --enable-gpl \
  --enable-libfaac
4.2.16 Building OpenAL Soft OpenAL Soft is optional. Adds support for Stereoscopic (spatial) 3D audio.
wget http://kcat.strangesoft.net/openal-releases/openal-soft-1.15.1.tar.bz2 tar -xvjf openal-soft-1.15.1.tar.bz2 cd openal-soft-1.15.1/build cmake .. make && make install
4.2.17 Building OpenOffice/LibreOffice
OpenOffice (or LibreOffice) is optional and adds support for presentation sharing. For information about this feature, check section 5.14. Version 4.0 or later is required. Both the application and the SDK are required. This section explains how to install (building would take hours) OpenOffice. LibreOffice could also be used but is not recommended (not fully tested). IMPORTANT: These instructions are for Linux x86-64 and you must change the paths if you're using a 32-bit system. Run uname -m to get your CPU type. All RPMs can be found at http://www.openoffice.org/download/other.html. Install the OpenOffice application and SDK:
## Application (x64) ##
wget http://sourceforge.net/projects/openofficeorg.mirror/files/4.0.0/binaries/en-US/Apache_OpenOffice_4.0.0_Linux_x86-64_install-rpm_en-US.tar.gz
mkdir -p OpenOfficeApplication && tar -zxvf Apache_OpenOffice_4.0.0_Linux_x86-64_install-rpm_en-US.tar.gz -C OpenOfficeApplication
rpm -Uvih OpenOfficeApplication/en-US/RPMS/*rpm

## SDK (x64) ##
wget http://sourceforge.net/projects/openofficeorg.mirror/files/4.0.0/binaries/SDK/Apache_OpenOffice-SDK_4.0.0_Linux_x86-64_install-rpm_en-US.tar.gz
mkdir -p OpenOfficeSDK && tar -zxvf Apache_OpenOffice-SDK_4.0.0_Linux_x86-64_install-rpm_en-US.tar.gz -C OpenOfficeSDK
rpm -Uvih OpenOfficeSDK/en-US/RPMS/*rpm
Both the OpenOffice application and SDK should be installed into /opt/openoffice4; otherwise you need to edit the script used to prepare the SDK headers. Prepare the SDK headers; please note that the destination folder must be named includecpp.

Install the Java runtime (required):

yum install java-1.7.0-openjdk
Note: Installing the OpenOffice application will not add the binary (soffice) to your $PATH environment variable. The TelePresence system will try to start the program in the background using a relative path unless you have changed the presentation-sharing-app configuration entry. You can change your $PATH environment variable to avoid editing presentation-sharing-app, but this is not recommended if you're testing different OpenOffice versions. It's also highly recommended to append the folder containing the binary AFTER $PATH. Prepending the folder before $PATH will force using the shared libraries (e.g. libssl, libcurl) installed with OpenOffice instead of yours.

Correct: export PATH=$PATH:/opt/openoffice4/program
Incorrect: export PATH=/opt/openoffice4/program:$PATH

A linker message like the following indicates a CPU type mismatch (e.g. 64-bit libraries installed on a 32-bit OS):

/usr/bin/ld: skipping incompatible /opt/openoffice4/sdk/lib/libuno_sal.so when searching for -luno_sal
4.2.18 Building Doubango Doubango VoIP framework 2.0 SVN r989 or later is required.
svn checkout http://doubango.googlecode.com/svn/branches/2.0/doubango doubango
cd doubango && ./autogen.sh && ./configure --with-speexdsp --with-ffmpeg && make && make install
Only a few options are used to configure the source code and force enabling mandatory libraries. Any optional library is automatically detected. For example, use --with-opus to force using the Opus audio codec or --without-opus to avoid automatic detection. You can also specify a path where to search for a library (e.g. --with-opus=/usr/local). Use ./configure --help for the full list of options.

4.3 Building the Telepresence system
4.3.1 Building the source code
If no prefix is defined, the binaries will be installed into /usr/local/sbin.

4.3.2 Installing the configuration and fonts files
This is only required for first-time installations because it will override any existing configuration file.
make samples
We highly recommend using our WebRTC SIP telepresence client to test the system.
5 Technical details
5.1 SA versus AS modes
The server supports two modes: SA (stand-alone) and AS (application server). These modes are not exclusive and no special configuration is needed. The SA mode allows any SIP client to connect directly to the system without any intermediate node. If the client requires to be registered, you can enable this option as explained in section 6.4, because the SIP registrar mode is OFF by default. The AS mode is useful if you already have your own SIP network/server and want to integrate the Telepresence system as an application server. This mode has been tested against Asterisk, reSIProcate, openSIPS and Kamailio. Integration is as easy as forwarding any INVITE (based on some criteria, e.g. [domain name equal to @conf-call.org]) received by the SIP registrar to the Telepresence system.

5.2 Client Requirements
5.3 Bandwidth management and congestion control
In this beta version we focus only on video bandwidth. The upload and download video bandwidth settings have to be defined using the configuration file as explained in section 6.12.1. The maximum download bandwidth is signaled to the remote endpoints using the SDP (b=AS:X attribute as per RFC 3556) and RTCP-REMB packets (as per draft-alvestrand-rmcat-remb-02). In this beta version, RTCP-TMMBN (RFC 5104) packets are deserialized but not processed by the system. The congestion control manager can be enabled using the configuration file as per section 6.12.1. In this beta version, draft-alvestrand-rtcweb-congestion-03 is not fully implemented yet and we're using our own algorithms to compute the bandwidth usage. When congestion control is enabled, the periodically computed maximum bandwidth will never be higher than the maximum allowed values defined in your configuration file (a kind of safeguard).
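For example, a maximum download bandwidth of 512 kbit/s (an illustrative value, not a default of the system) would be advertised in the SDP with the following bandwidth line:

b=AS:512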
5.4 Stereoscopic (spatial) 3D audio
Our 3D audio mixer is based on OpenAL Soft and supports up to 256 sources. Your OpenAL version must be at least 1.15.1 and must implement the ALC_SOFT_loopback extension. To have 3D audio, each SIP client should signal its position in the virtual room. The signaling is done using two SIP headers: TP-AudioPosition and TP-AudioVelocity. These two SIP headers must contain an array of three floating-point numbers.
TP-AudioPosition: [0.0f, 0.0f, 0.0f] TP-AudioVelocity: [0.0f, 0.0f, 0.0f]
Using our WebRTC TelePresence client, the 3D settings can be defined at http://conf-call.org/settings.htm. From the OpenAL documentation, AL_POSITION specifies the current location of the object in the world coordinate system. Any 3-tuple of valid float values is allowed. Implementation behavior on encountering NaN and infinity is not defined. The object position is always defined in the world coordinate system. From the OpenAL documentation, AL_VELOCITY specifies the current velocity (speed and direction) of the object, in the world coordinate system. Any 3-tuple of valid float/double values is allowed. The object AL_VELOCITY does not affect the source's position. OpenAL does not calculate the velocity from subsequent position updates, nor does it adjust the position over time based on the specified velocity. Any such calculation is left to the application. For the purposes of sound processing, position and velocity are independent parameters affecting different aspects of the sounds.

5.5 Selecting the speaker and listeners
5.6 Audio mixer design
The audio mixer is part of the MCU engine. The audio mixer supports mixing several streams with different settings (rate, channels, bits per sample or ptime). For example, a bridge can host a conference with two endpoints, one using g711 (8khz, mono, 20ms) and the other using opus (48khz, stereo, 30ms). As you may expect, it's not technically possible to mix two streams with different settings without resampling. In the audio mixer there is a notion of pivot settings: the pivot settings are the audio parameters to which every stream is resampled before mixing. The pivot settings are defined using the configuration file as per section 6.11. The Doubango framework uses libspeexdsp for the resampling while the MCU uses libswresample (from FFmpeg). Both libraries are required. It's very important to understand the notion of pivot settings because using wrong values could lead to poor audio quality and high CPU usage.
Figure 1: Audio resampling. An audio stream from one endpoint, with settings negotiated using SDP, goes through three stages: (1) the Doubango framework (RTP receiver) resamples the audio to match the pivot settings (only if different) and forwards the audio samples to the MCU; (2) the MCU mixes the audio samples with the other streams using the pivot settings and forwards the mixed audio samples back to Doubango; (3) the Doubango framework (RTP sender) resamples the mixed audio to match the negotiated settings (only if different) and sends the audio samples.
From the above figure, you can easily see that the incoming audio samples from an endpoint to the MCU could be resampled up to two times if your pivot and negotiated codec settings mismatch. To minimize the number of audio resampling operations, your codec settings have to be as close as possible to those used as pivot. If the settings (pivot, codecs) match, no resampling will be done. In this beta version, we support 2D and 3D mixing types. The type of mixing is defined using the configuration file as explained in section 6.11. The 2D mixing is linear (monophonic or stereophonic) and very basic; no additional third-party library is required for it. The 3D mixing is stereoscopic (spatial) and requires OpenAL Soft.

5.7 Video mixer design
The video mixing and scaling is completely managed using FFmpeg. In this beta version, only 2D mixing is supported. We'll consider support for libyuv in the next versions.

5.7.1 Encoders and decoders
To minimize CPU utilization, all endpoints with the same video codec use the same encoder but different decoders. For example, if you have 7 endpoints using the VP8 codec, the MCU will have 1 encoder and 7 decoders. Using a single encoder gives better CPU performance but uses more bandwidth than needed, because if one endpoint requests an IDR then the prediction chain is restarted for all other peers.

5.7.2 Overlays
The overlays are configured as per section 6.12.6. The mixed video contains many overlays displayed using FFmpeg filters. There are two kinds of overlays: texts and images. The text overlays use the drawtext filter, which requires libfreetype to be enabled when building FFmpeg. The image overlays (watermarks) use the movie filter and accept any PNG or JPEG file as input. JPEG images are natively supported by FFmpeg (thanks to MPEG4) while PNG requires zlib to be enabled. The configuration file could be used to define custom font types for the text overlays. The font types must be TrueType-compatible. The default fonts come from ftp://ftp.gnu.org/pub/gnu/freefont.
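As an illustration of what these FFmpeg filters look like (the file names, text and coordinates below are hypothetical and are not the system's actual configuration):

# text overlay with the drawtext filter (requires libfreetype)
drawtext=fontfile=/usr/share/fonts/truetype/freefont/FreeSerif.ttf:text='speaker name':x=10:y=10:fontsize=24:fontcolor=white
# image watermark with the movie filter (PNG input requires zlib)
movie=watermark.png [logo]; [in][logo] overlay=main_w-overlay_w-10:10 [out]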
5.8 Audio quality

5.9 Video quality
This section explains how to make sure you have good video quality when using the system.

5.9.1 Packet loss recovery
This feature requires the jitter buffer to be enabled. When an RTP packet is lost, we request the remote party to send it again using RTCP-NACK as per RFC 5104. Support for this feature is indicated using the SDP (attribute a=rtcp-fb:* nack). If the remote peer cannot honor the request within a reasonable delay, the packet is considered definitely lost. If we are lucky, losing a packet will just introduce artifacts. Otherwise (not lucky), it will break the prediction chain and any attempt to decode the video will fail until the next IDR frame. Enabling the zero-artifacts feature fixes both the artifact and prediction issues. Increasing the internal RTP buffer size as per section 6.6 can also help to lower packet loss.
5.9.2 Jitter buffer
The video jitter buffer can be enabled as per section 6.12.3. Enabling the video jitter buffer introduces a small delay (~100ms) but is worth it. Buffering the video packets allows requesting missing packets using RTCP-NACK (RFC 5104) and reordering them based on the RTP sequence numbers. It's also up to the jitter buffer to absorb any delay or burst in order to have smooth video according to the frame rate. The video frame rate is negotiated using the SDP but this value will be updated based on the RTP timestamps.

5.9.3 Zero-artifacts
This feature is enabled or disabled as per section 6.12.4. A video stream contains artifacts when some RTP packets are lost. The MCU tries its best to avoid packet loss (see section 5.9.1) but sometimes it fails and this leads to visual artifacts. As the video streams are mixed, the artifacts would be propagated to all endpoints. When zero-artifacts is enabled, the MCU pauses the rendering of the stream and sends RTCP-FIR (RFC 5104) to request a new IDR frame to repair the prediction chain. Only the stream with the missing RTP packets is paused until the next IDR frame is received. Support for RTCP-FIR is signaled to the remote endpoints using the SDP (attribute a=rtcp-fb:* fir). The MCU sends IDR frames when it receives RTCP-FIR or RTCP-PLI from one of the endpoints.

5.9.4 AVPF tail length
As already explained, RTCP-NACKs are used to ask a peer to send a packet again. In order to be able to honor these requests we need to save the outgoing RTP packets in a queue. The AVPF tail length option defines the minimum and maximum lengths for this queue. The higher these values are, the better the video quality will be. The default queue length will be equal to the minimum value and it's up to the MCU to increase this value depending on the number of unrecoverable packet losses. The final value will be at most equal to the maximum defined in the configuration file. An unrecoverable packet loss occurs when the MCU receives an RTCP-NACK for an already removed sequence number (very common when the network RTT is very high or the bandwidth very low). Setting the AVPF tail length (min, max) is done as per section 6.7.

5.9.5 FEC (Forward Error Correction)
--This section intentionally left blank--
5.10 NAT and firewall traversal
This section explains how to tackle NAT and firewall traversal issues.

5.10.1 Symmetric RTP
This feature is enabled or disabled as per section 6.5. The Telepresence system fully supports RFC 4961. An RTP/RTCP stream is symmetric if the same port is used to send and receive packets. This helps with NAT and firewall traversal as the outgoing packets open a pinhole for the incoming ones. The local/outgoing stream (MCU endpoints) is always symmetric. If both parties (remote and local) have successfully negotiated ICE candidates then neither will be forced to use symmetric RTP/RTCP. Let's imagine your Telepresence instance is on a public network and the SIP client/endpoint on a private network:
1. Telepresence: public IP address is 1.1.1.1
2. Client: private IP address is 2.2.2.2 and public IP address is 1.1.1.2
3. The SDP from the client to the Telepresence system will contain the client's private IP address (2.2.2.2), which is not reachable
4. The RTP/RTCP packets from the client to the server will be received with a source IP address equal to the client's public IP address (1.1.1.2)
5. If the rtp-symmetric-enabled option is used, the Telepresence system will send RTP/RTCP packets to 1.1.1.2 (learnt from the received packets) instead of 2.2.2.2, which is private and unreachable.

5.10.2 ICE
This feature is enabled or disabled as per section 6.5. The Telepresence system fully supports RFC 5245. ICE is negotiated only if this feature is enabled and the incoming SDP (SIP endpoint to MCU) contains candidates. ICE is mandatory for WebRTC endpoints.

5.10.3 RTCP-MUX
This feature is enabled or disabled as per section 6.5. The Telepresence system fully supports RFC 5761. RTCP-MUX is used to minimize the number of ports and helps with NAT traversal and administration.

5.11 Security
This section explains how to secure both the signaling and media planes.

5.11.1 Signaling
Two secure signaling protocols are supported: TLS and WSS. WSS is WebSocket secured using TLS. For more information on how to enable these transport protocols using the configuration file, please refer to section 6.2. Both transports require OpenSSL, which only has to be enabled when building the Doubango framework. More information on how to configure the SSL certificates can be found in section 6.3.

5.11.2 Media
Both SRTP-SDES (RFC 4568) and SRTP-DTLS (RFC 5763, RFC 5764) are supported. Check section 6.3 for more information on how these features have to be configured.
5.12 Protecting a bridge with password
This feature is configured as per section 6.13. There are two ways for a SIP client to authenticate to a protected bridge: DTMF or the TP-BridgePin SIP header. If authentication fails, a SIP 403 response will be returned with a short description. The DTMF method doesn't require changing your SIP client but is not supported yet in this beta version (it is on the roadmap for the release version). The second method (using the SIP header) requires some modifications on your SIP client to include this new header. If you are using our WebRTC SIP telepresence demo client, no modification is needed.
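For illustration, a client joining a protected bridge would simply add the header to its INVITE request (the PIN value below is hypothetical):

TP-BridgePin: 1234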
5.13 Recording conference to a file
This feature is configured as per section 6.9. We support almost any container, depending on how you built FFmpeg and which codecs are enabled. Right now we recommend only *.avi and *.mp4 as they are fully tested; *.mkv and *.webm will also work but are not fully tested yet. The audio and video mixing is done using libavformat from FFmpeg.
*.avi (recommended): requires FFmpeg with the MPEG-4 video codec and the AC3 audio codec.

*.mp4: requires FFmpeg with the H.264 video codec (libx264 third-party library) and an AAC audio codec. There is a built-in experimental AAC codec in FFmpeg but the code is intentionally designed not to accept it because of random crashes. For the AAC audio codec, you'll need to build FFmpeg with libfaac or any other third-party AAC library. Please note that not all AAC libraries are free.

*.avi is recommended instead of *.mp4 for the simple reason that the former consumes less CPU.
The output file will have the bridge identifier as its name and the container type as its extension, e.g. +336000000.avi. The file is locked and invalid until the last user quits the bridge. We highly recommend using VLC to play the output files.
5.14 Presentation sharing
Presentation sharing allows any SIP client to share PowerPoint documents with any client connected to the same bridge. The only technical requirements are support for HTTP(S) (sharer only) and SIP INFO. This is a revolutionary way to share presentations as the documents are streamed in the video channel, which means any client supporting video can see them. The slides are extracted from the document using OpenOffice (or LibreOffice) as JPEG pictures, re-encoded (H.264, VP8 or whatever) and mixed into the current video stream. Technically, OpenOffice is not integrated in the system but forked as a new process, and this is why both the SDK and the application are required. The OpenOffice process is started by the TelePresence system at boot time and added to the same job group as the current process to make sure the child exits if the parent unexpectedly dies. The default port used for the inter-process communication is 2083 and can be changed using the configuration as per section 6.10. The TelePresence system supports CORS, which means the request could be sent from any domain. This feature can be tested using our online WebRTC SIP client. Steps:
1. Publish the PowerPoint document to the bridge using HTTP(S) POST requests.
2. Receive feedback from the MCU (SIP INFO messages).
3. Move from slide to slide using SIP INFO messages.
4. Close the presentation session using a SIP INFO message.
5.14.1 Publishing the document
To start sharing a PowerPoint document you must have an active video session. You can only share ONE presentation at a time. The document is sent to the MCU in TWO HTTP(S) requests using the same connection. The first request sends information about the document and the second the content. You must not mix the document information and content. The TelePresence system must be configured with an http or https transport (or both) as explained in section 6.2.

The first request structure:
Element      | Value            | Availability
Request type | HTTP(S) POST     | Mandatory
Request URL  | /presentation    | Optional
Content-Type | application/json | Mandatory
First request content (JSON):
Field name | Field value             | Type   | Availability
action     | req_presentation_upload | String | Mandatory
name       | <user defined>          | String | Mandatory

The remaining fields (type, size, bridge_id, bridge_pin and user_id) are also user defined, as shown in the example below.
POST /presentation HTTP/1.1
Host: 192.168.0.37:20065
Connection: keep-alive
Content-Length: 174
Origin: http://conf-call.org
User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.95 Safari/537.36
Content-type: application/json
Accept: */*
Referer: http://conf-call.org
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8

{"action":"req_presentation_upload","name":"db_pres_01.ppt","type":"application/vnd.ms-powerpoint","size":752128,"bridge_id":"100600","bridge_pin":"1234","user_id":"johndoe"}
The second request structure:
Element      | Value            | Availability
Request type | HTTP(S) POST     | Mandatory
Request URL  | /presentation    | Optional
Content-Type | application/file | Mandatory
Content      | <Binary>         | Mandatory
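A minimal sketch of the second request (headers abbreviated; the values reuse the hypothetical document from the first-request example above):

POST /presentation HTTP/1.1
Host: 192.168.0.37:20065
Content-Length: 752128
Content-type: application/file

<binary content of db_pres_01.ppt>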
5.14.2 Receiving feedbacks
Once the presentation is published, the TelePresence system will send SIP INFO messages to give feedback about the session state. The SIP INFO messages always contain JSON content.

JSON content:
Field name | Field value                | Type   | Availability
action     | req_presentation_upload    | String | Mandatory
name       | <server defined>           | String | Mandatory
state      | opened, exported or closed | String | Mandatory
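An illustrative feedback payload (the field values are hypothetical):

{"action":"req_presentation_upload","name":"db_pres_01.ppt","state":"opened"}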
5.14.3 Fetching the presentation
Once the presentation is opened (see previous section), you can navigate through it using SIP INFO messages. For security reasons the presentation is tied to your SIP connection, to be sure no one else can control it.

JSON content:
Field name | Field value           | Type    | Availability
action     | req_presentation_goto | String  | Mandatory
page_index | <user defined>        | Integer | Mandatory
id         | <user defined>        | Integer | Optional
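An illustrative SIP INFO payload to jump to a given slide (the values are hypothetical):

{"action":"req_presentation_goto","page_index":2,"id":1}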
5.14.4 Closing the presentation
At any time you can end the presentation session using a SIP INFO request.

JSON content:
Field name | Field value            | Type    | Availability
action     | req_presentation_close | String  | Mandatory
id         | <user defined>         | Integer | Optional
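An illustrative SIP INFO payload (the id value is hypothetical):

{"action":"req_presentation_close","id":1}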
5.16 Call session management
A call session is managed using SIP INFO messages with JSON content. For now only muting/unmuting the session is supported. Next versions will add support for ejecting a participant, getting the list of participants, getting the call state (packet loss, RTT, audio/video quality) and more.

5.16.1 Muting/unmuting
The MCU can detect that a session is muted based on the RTP packets, but it's highly recommended to also send a SIP INFO message for confirmation. For audio-only sessions, muting a session without sending a SIP INFO could be interpreted as a crash or network issue, which automatically disconnects the call. When the hangout video pattern is selected, the MCU renders the speaker's video with the highest quality and size. Detecting a speaker can be problematic when the participants are in a noisy environment. Manually muting/unmuting your session is a way to avoid such issues.

JSON content:
Field name | Field value    | Type    | Availability
action     | req_call_mute  | String  | Mandatory
enabled    | <user defined> | Boolean | Mandatory
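An illustrative SIP INFO payload (presumably enabled is set to true to mute and false to unmute):

{"action":"req_call_mute","enabled":true}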
6 Configuration
The Telepresence system is configured using cfg files. The main cfg file is named telepresence.cfg and should be in the same directory where the binary is installed, unless you run the app with the --config=PATH argument. The source code contains a sample configuration file to get you started. The sample configuration is installed after successfully building the system and running make samples. The configuration files are parsed using code generated with the Ragel tool. A configuration file contains comments, sections and entries:

Comments
A comment starts with #
# I'm a comment
Age = 25 # I'm another comment
Sections A section name must be enclosed by square brackets. The section name is case insensitive.
# this is a bridge section [bridge]
Entries
An entry is a key-value pair and must be tied to a section. Both the key and the value are case insensitive. The key must not start with a SPACE.
[product] # I'm the section
version = 1.2 # I'm an entry with a floating-point value
name = telepresence # I'm an entry with a string value
6.1 Debugging the system
As a developer, the first action is to edit your configuration file to change the debug-level from ERROR to INFO.
When you connect to the MCU you always get the video stream back (you see yourself), but your audio stream is never sent back. For debugging purposes, it can be useful to ask for your audio stream back: hearing your own sound helps verify that everything works. Without loopback audio, you must connect at least two endpoints to test audio (encoding, decoding, streaming, resampling).
debug-level = INFO debug-audio-loopback = yes
debug-audio-loopback - whether to enable audio loopback for testing (see above for more information).

We require having your debug-level equal to INFO when reporting/sharing issues.
6.2 Network transports
The system supports many network transports for SIP signaling, presentation uploading, call control and so on. The network connections are declared using transport configuration entries.
transport = udp;*;20060;* transport = ws;*;20060;4 transport = wss;*;20062;* transport = tcp;*;20063 transport = tls;*;20064;* transport = https;*;20065 transport = https;*;20066
Format: protocol-value;ip-address-value;port-value;ip-version-value
- protocol-value must be udp, tcp, tls, http, https, ws or wss. The ws protocol defines WebSocket and wss the secure version (requires OpenSSL). At least one WebSocket transport must be added to allow a web browser (WebRTC SIP client) to connect to the system. The other protocols (tcp, tls and udp) are used for SIP-legacy devices or the PSTN.
- ip-address-value is any valid IPv4/IPv6 address or FQDN. Use a star (*) to let the system choose the best local IP address to bind to. Examples: udp;*;5060 or ws;*;5061 or wss;192.168.0.10;5062
- port-value is any unused local port to bind to. Use a star (*) to let the system choose the best unused port to bind to. Examples: udp;*;* or ws;*;* or wss;*;5062
- ip-version-value defines the IP version to use. Must be 4, 6 or *. A star (*) is used to let the system choose the best one. Using a star (*) only makes sense if ip-address-value is an FQDN instead of an IP address.
A transport configuration entry must have at least a protocol, an IP address (or star) and a port (or star). The IP version is optional. The udp, tcp, tls, ws and wss transports are used to transport SIP messages while http and https are used to upload presentations.

6.3 Security
The configuration file allows setting the SSL certificate files to be used for TLS and WSS signaling protocols. The certificates are also used for DTLS-SRTP. The Doubango framework must be built with OpenSSL enabled as explained in section 5.11.
ssl-private-key = /tmp/ssl.pem ssl-public-key = /tmp/ssl.pem ssl-ca = /tmp/ssl.pem ssl-mutual-auth = no
ssl-private-key - the full path to the PEM file.
ssl-public-key - the full path to the PEM file.
ssl-ca - the full path to the PEM file.
ssl-mutual-auth - whether incoming connection requests must fail if the remote peer certificates are missing or do not match the local ones. This only applies to TLS or WSS and is useless for DTLS-SRTP as certificates are always required.
srtp-mode - defines how SRTP is negotiated: none, optional or mandatory (SRTP is required). Based on the mode, the SDP in the outgoing INVITEs will be formed like this:
- none:
  - the profile will be equal to RTP/AVP
  - no crypto lines or certificate fingerprints will be added
- optional:
  - the profile will be equal to RTP/AVP
  - two crypto lines will be added if srtp-type includes sdes, plus certificate fingerprints if srtp-type also includes dtls
- mandatory:
  - the profile will be equal to RTP/SAVP if srtp-type is equal to sdes, or UDP/TLS/RTP/SAVP if srtp-type is equal to dtls
  - two crypto lines will be added if srtp-type is equal to sdes, or certificate fingerprints if srtp-type is equal to dtls

srtp-type - defines the list of all supported SRTP types. Defining multiple values only makes sense if the srtp-mode value is equal to optional, which means we want to negotiate the best one. Supported values are sdes and dtls. DTLS-SRTP requires valid SSL certificates and the Doubango source code must be compiled with OpenSSL version 1.0.1 or later.
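A minimal sketch of the corresponding configuration entries (illustrative values; the exact syntax of this listing is assumed from the descriptions above):

srtp-mode = optional
srtp-type = sdes;dtls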
6.4 SIP registration
Many SIP clients require to be registered (logged in) before being able to make calls. By default, any REGISTER request sent to the gateway will be rejected. The accept-sip-reg entry defines whether to accept incoming SIP REGISTER requests or not (acting as a SIP registrar).

accept-sip-reg = yes # no to disable
When the Telepresence system is behind a SIP registrar (e.g. Asterisk), this configuration entry is useless as the REGISTER requests will not be forwarded to the MCU.

6.5 NAT / Firewall traversal
This section shows how to enable or disable symmetric RTP (section 5.10.1), ICE (section 5.10.2) and RTCP-MUX (section 5.10.3).
rtp-symmetric-enabled = yes # no to disable
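The remaining NAT-traversal entries described below follow the same pattern; a minimal sketch with illustrative values (the stun-server line reuses the default server mentioned below, with the TURN credentials ignored):

ice-enabled = yes # no to disable
icestun-enabled = yes # no to disable
stun-server = numb.viagenie.ca;3478;*;*
rtcp-mux-enabled = yes # no to disable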
rtp-symmetric-enabled - whether to enable symmetric RTP (RFC 4961) for NAT and firewall traversal.
ice-enabled - whether to enable ICE (RFC 5245) for NAT and firewall traversal.
icestun-enabled - whether to use STUN to gather reflexive addresses or not. This option is useful when the server is on a public network or all peers are on the same local network. In these cases, disabling STUN for ICE will speed up the call setup. Disabling icestun is also useful when the system is installed on a PC without access to the internet.
stun-server - defines the STUN/TURN server to use to gather reflexive addresses for the ICE candidates. If no server is defined then a default one will be used. The default STUN/TURN server is numb.viagenie.ca:3478.
Format: server-fqdn-value;server-port-value;user-name-value;user-password-value
server-fqdn-value: a valid IPv4/IPv6 address or host name.
server-port-value: a valid port number.
user-name-value: the login to use for TURN authentication. Use star (*) to ignore.
user-password-value: the password to use for TURN authentication. Use star (*) to ignore.
rtcp-mux-enabled - whether to enable RTCP multiplexing (RFC 5761). See section 5.10.3.
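For illustration, the remaining entries of this section could be set as follows (values are examples only; rtp-symmetric-enabled is already shown above and the stun-server value follows the format just described):
ice-enabled = yes
icestun-enabled = yes # consider no when all peers are on the same LAN
stun-server = numb.viagenie.ca;3478;*;*
rtcp-mux-enabled = yes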
6.6 RTP buffer size
The rtp-buffersize configuration entry is used to define the internal buffer size to use for the RTP sockets. The higher this value is, the lower the RTP packet loss will be. Please note that the maximum value depends on your system (e.g. 65535 on Windows). A very high value could introduce delay on the video stream and it is highly recommended to also enable the video jitter buffer option.
Code usage:
setsockopt(rtp_fd, SOL_SOCKET, SO_RCVBUF, &rtp_buffersize, sizeof(rtp_buffersize)); /* rtp_fd: the RTP socket */
setsockopt(rtp_fd, SOL_SOCKET, SO_SNDBUF, &rtp_buffersize, sizeof(rtp_buffersize)); /* rtp_buffersize: the configured value */
Configuration:
rtp-buffersize = 65535
6.7 AVPF tail length
The avpf-tail-length configuration entry defines the minimum and maximum queue length used to store the outgoing RTP packets. The stored packets are used to honor incoming RTCP-NACK requests. See section 5.9.4 for more information.
avpf-tail-length = 200;500 # min;max
6.8 Audio/Video codecs
The codecs configuration entry defines the list of all supported codecs. Only G.711 and G.722 are natively supported; all other codecs have to be enabled when building the Doubango VoIP framework source code. Each codec's priority is equal to its position in the list: the first codecs have the highest priority. Supported values are: opus, pcma, pcmu, amr-nb-be, amr-nb-oa, speex-nb, speex-wb, speex-uwb, g729, gsm, g722, ilbc, h264-bp, h264-mp, vp8, h263, h263+, theora and mp4v-es.
codecs = pcma;pcmu;vp8;h264-bp;h264-mp
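For example (an illustration, not a recommendation shipped with the product), to favor wideband audio while keeping G.711 as a fallback, the ordering alone sets the priority; opus must have been enabled when building Doubango:
codecs = opus;g722;pcma;pcmu;vp8;h264-bp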
6.9 Recording
The configuration file is used to specify whether to record the sessions, which container to use and where to store the output file.
record = yes
record-file-ext = mp4
record - whether to record the sessions.
record-file-ext - the container to use. Almost any container (avi, mp4, webm, mkv) could be used but this depends on how you built FFmpeg. For more information, please check section 5.12. We highly recommend using VLC to play the output file.
6.10 Presentation sharing
A presentation is any PowerPoint document and it could be shared from any SIP client running on any device. The presentation is uploaded to the TelePresence system using an HTTP(S) POST request, which means an http (or https) transport must be configured as explained in section 6.2. More technical details can be found in section 5.14.
presentation-sharing-enabled = yes
presentation-sharing-process-local-port = 2083
presentation-sharing-base-folder = ./presentations
presentation-sharing-enabled - whether to enable presentation sharing. Default is yes. The application must be built with the OpenOffice (recommended) or LibreOffice SDK to support this feature. This feature will be silently disabled if both SDKs are missing.
presentation-sharing-process-local-port - some implementations require a third-party application (e.g. OpenOffice or LibreOffice) to export the presentation. The third-party application will be forked to run in the background and the local port ([1024-65535]) is used to communicate with the TelePresence system. For example, if the third-party application is OpenOffice and the local port is equal to 2083, then the command string would be:
soffice -norestore -headless -nofirststartwizard -invisible "-accept=socket,host=localhost,port=2083;urp;StarOffice.ServiceManager"
and the connection string would be:
uno:socket,host=localhost,port=2083;urp;StarOffice.ServiceManager
Here soffice is the OpenOffice application binary. Your $PATH environment variable must reference the folder containing the binary, or presentation-sharing-app must contain a full path (e.g. /opt/openoffice4/program/soffice).
presentation-sharing-base-folder - the base folder where to store the uploaded presentations and the temporary exported JPEG images. For example, a document named mypres.ppt uploaded by bob, who is connected to a bridge with a number equal to 100600, would have a path equal to <the base folder>/100600/bob/mypres.ppt.
presentation-sharing-app - the third-party application name. Could be a full (e.g. "/opt/openoffice4/program/soffice") or relative ("soffice") path. A relative path requires having the folder containing the application in your $PATH environment variable.
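For example (illustrative, reusing the paths mentioned above):
presentation-sharing-app = soffice # or a full path such as /opt/openoffice4/program/soffice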
6.11 Audio
This section explains how to use the settings related to audio.
6.11.1 Pivot settings
The notion of pivot settings is explained in section 5.6.
audio-channels = 1
audio-bits-per-sample = 16
audio-sample-rate = 8000
audio-ptime = 20
audio-volume = 1.0f
audio-dim = 2d
audio-max-latency = 200
audio-channels - number of audio channels to use. Supported values are 1 and 2.
audio-bits-per-sample - number of bits for each audio sample. Supported values are 8, 16 and 32.
audio-sample-rate - audio sampling rate (Hz). Almost any value is supported.
audio-ptime - number of milliseconds for each audio frame. The value should be a multiple of 10. Supported values: [1 - 255].
audio-volume - attenuation (or gain) to apply to the mixed audio. Supported values: [0.0f - 1.0f].
audio-dim - mixer dimensions. The value must be 2d (linear) or 3d (spatial). 3d requires building the system with OpenAL Soft.
audio-max-latency - maximum audio delay (because of clock drift) before resetting the jitter buffer. The value can be any positive value. Unit: milliseconds.
6.12 Video
This section explains how to use the settings related to video.
6.12.1 Bandwidth and congestion control
There are two kinds of video bandwidth: upload and download.
Upload: bandwidth (kbps) used by the video stream (RTP + RTCP) from the MCU to a single endpoint.
Download: bandwidth (kbps) used by the video stream (RTP + RTCP) from one endpoint to the MCU. This user-defined value will be forwarded to the remote endpoint using the SDP and RTCP-REMB, and it is up to that endpoint to respect it or not. For more information, check section 5.3.
The configuration file allows setting the maximum upload and download bandwidths to use. If these values are undefined, the upload bandwidth is computed following this formula:
video-max-upload-bandwidth (kbps) = (video-width * video-height * video-fps * motion-rank * 0.07) / 1024
For example, a 720p video stream at 15 frames per second with a medium (2) motion rank will consume 1280 * 720 * 15 * 2 * 0.07 = 1935360 bps = ~1890 kbps unless the video-max-upload-bandwidth entry is defined.
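For reference, the same estimation can be written as a small C helper (a sketch of ours, not part of the product; the function and variable names are invented for illustration):
#include <stdio.h>

/* Estimate the maximum upload bandwidth (kbps) using the formula above.
   motion_rank: 1 (low), 2 (medium) or 4 (high). */
static double estimate_upload_kbps(int width, int height, int fps, int motion_rank)
{
    return (width * height * (double)fps * motion_rank * 0.07) / 1024.0;
}

int main(void)
{
    /* 720p at 15 fps with a medium motion rank: prints 1890 kbps */
    printf("%.0f kbps\n", estimate_upload_kbps(1280, 720, 15, 2));
    return 0;
}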
congestion-ctrl-enabled = yes
video-max-upload-bandwidth = -1 # in kbps, <=0 means undefined
video-max-download-bandwidth = -1 # in kbps, <=0 means undefined
video-motion-rank = 2 # 1(low), 2(medium) or 4(high)
video-fps = 15 # [1 - 120]
congestion-ctrl-enabled - whether to enable draft-alvestrand-rtcweb-congestion-03 and draft-alvestrand-rmcat-remb-01. Check section 5.3 for more information.
video-max-upload-bandwidth - defines the maximum bandwidth (kbps) to use for the outgoing video stream (per endpoint). If congestion control is enabled, the bandwidth will be updated based on the network conditions, but these new values will never be higher than what you defined in your configuration file.
video-max-download-bandwidth - defines the maximum bandwidth (kbps) to use for the incoming video stream (per endpoint). If congestion control is enabled, the bandwidth will be updated based on the network conditions, but these new values will never be higher than what you defined in your configuration file.
video-motion-rank - defines the video type. Supported values: 1 (low, e.g. home video security systems), 2 (medium, e.g. conference call) or 4 (high, e.g. basketball game).
video-fps - defines the video framerate for the mixed stream regardless of the input fps. Supported values: [1 - 120].
To check the available bandwidth: http://www.speedtest.net/. To check bandwidth usage: iftop.
6.12.2 Output size, pixel aspect ratio and letterboxing
The output (MCU to endpoints) mixed video size is independent of the input sizes (from the endpoints). The video-mixed-size configuration entry is used to set the preferred value. Accepted values are: sqcif (128x96), qcif (176x144), qvga (320x240), cif (352x288), hvga (480x320), vga (640x480), 4cif (704x576), svga (800x600), 480p (852x480), 720p (1280x720), 16cif (1408x1152), 1080p (1920x1080) and 2160p (3840x2160). If no value is defined, the mixed video size is assumed to be equal to vga (640x480). 720p, 1080p and 2160p are commonly named HD, Full HD and Ultra HD.
video-mixed-size = vga
In this beta version, it is not allowed to set arbitrary values because of backward compatibility. The final version will probably allow this. To draw the speaker and listener videos on the output mixed video, we need to resize these frames to fit the destination. The video frames are linearly resized following a specific Pixel Aspect Ratio (PAR) before being letterboxed.
A PAR equal to 1:1 means skip the linear resizing; a value of 0:0 means skip both the resizing and the letterboxing. Common PAR values: 16:9 or 4:3.
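As an illustration (the entry names and the value syntax shown here are assumptions based on this section; check the sample configuration file shipped with the sources):
video-speaker-par = 16:9 # linearly resize the speaker frame with a 16:9 PAR, then letterbox
video-listener-par = 0:0 # skip both resizing and letterboxing for the listener frames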
6.12.3 Jitter Buffer
The video-jb-enabled configuration entry is used to enable or disable the video jitter buffer. It is highly recommended to enable the video jitter buffer because it is required to have RTCP-FB (NACK, FIR, PLI... as per RFC 5104) fully functional. Enabling the video jitter buffer gives better quality and improves smoothness. For example, no RTCP-NACK messages will be sent to request dropped RTP packets if this option is disabled. It is also up to the jitter buffer to reorder RTP packets. For more information, check section 5.9.2.
video-jb-enabled = yes # no to disable
6.12.4 Zero-artifacts
It is up to the MCU to decode all video streams from the endpoints and mix them before sending the result. If RTP packets are lost on one stream, artifacts will be introduced in the mixed frame (the result). Enabling the zero-artifacts feature fixes this issue. There are some requirements on the endpoints to have this feature fully functional. For more information, check section 5.9.3.
video-zeroartifacts-enabled = yes # no to disable
6.12.6 Overlays
The configuration file allows managing the overlays (font size and type, position, watermark).
overlay-fonts-folder-path = ./fonts/truetype/freefont
overlay-copyright-text = Doubango Telecom
overlay-copyright-fontsize = 12
# full path: ./fonts/truetype/freefont/FreeSerif.ttf
overlay-copyright-fontfile = FreeSerif.ttf
overlay-speaker-name-enabled = yes
overlay-speaker-name-fontsize = 16
# full path: ./fonts/truetype/freefont/FreeMonoBold.ttf
overlay-speaker-name-fontfile = FreeMonoBold.ttf
overlay-speaker-jobtitle-enabled = yes
overlay-watermark-image-path = ./images/logo35x34.jpg
overlay-fonts-folder-path - defines the base folder path where to look for the font types. The default fonts come from ftp://ftp.gnu.org/pub/gnu/freefont. For more fonts (not free), we recommend http://www.dafont.com/.
overlay-copyright-text - defines the copyright text to display on the mixed video. Comment the line out to disable this feature.
overlay-copyright-fontsize - defines the font size to use to draw the copyright text.
overlay-copyright-fontfile - defines the font file to use to draw the copyright text on the mixed video. The full path to the TrueType file will be overlay-fonts-folder-path + "/" + overlay-copyright-fontfile.
overlay-speaker-name-enabled - whether to draw the speaker's name on the mixed video.
overlay-speaker-name-fontsize - defines the font size to use to draw the speaker's name (and job title) on the mixed video.
overlay-speaker-name-fontfile - defines the font file to use to draw the speaker's name on the mixed video. The full path to the TrueType file will be overlay-fonts-folder-path + "/" + overlay-speaker-name-fontfile.
overlay-speaker-jobtitle-enabled - whether to draw the speaker's job title on the mixed video.
overlay-watermark-image-path - defines the full path to the image to use to watermark the mixed video. Comment the line out to disable this feature.
For more information, check section 5.7.2. To test the overlays, we highly recommend using the WebRTC Telepresence client.
6.12.7 Patterns
--This section intentionally left blank
6.13 Bridge configuration
In this beta version, global configuration entries are ignored when redefined inside a [bridge] section. For example, setting the audio sample rate at the global scope applies to all bridges, but redefining it inside a [bridge] section will be ignored. In the release version, it will be possible to override almost any configuration entry. It is not required to add a bridge in order to be able to make conference calls. You can add as many [bridge] sections as you want. Supported entries are: id and pin-code.
[bridge]
id=10060
pin-code=1234

[bridge]
id=10061
pin-code=0000
id - defines the bridge identifier (a SIP client would call sip:<the id>@domain to connect to this bridge).
pin-code - a 4-digit PIN code required to join the bridge.
8 Known issues
This is a short list of all known issues (to be fixed before the end of the beta phase).
1. The audio quality in the recorded files is not as good as we expect. This looks like an issue with the PTS and DTS.
2. The mixed video looks stretched when the SIP client is a mobile device in portrait mode (e.g. iDoubs and IMSDroid). There is no issue when the device is in landscape mode or when using our WebRTC demo client.
3. The algorithm to detect the current speaker and listeners is buggy.
4. 3D (spatial) mixed audio quality is not as good as we expect.
9 Tips
9.1 Lowering CPU
1. Avoid audio resampling. To avoid audio resampling, the SIP clients connecting to a bridge have to use a codec with the same sample rate, channels and bits per sample as the pivot settings. For more information, check section 5.6. In your configuration file, enable codecs with the same settings as the pivot.
2. Only record sessions if needed. Do not enable recording if it is not important to you, or use the *.avi container, which consumes less CPU than *.mp4 (because of the AAC encoder from libfaac).
3. Use a common video codec. All SIP clients with the same video codec will share a single encoder, so try to use a common video codec for all clients. For example, if you have two clients, A and B, with A supporting both H.264 and VP8 and B only H.264, then you should make sure that A offers H.264 with the highest priority. For more information, check section 5.7.1. In your configuration file, enable a single video codec if you cannot control the SIP clients.
4. Use 2D audio mixing. Enable 2D audio mixing instead of 3D.
5. Lower the mixed video size and fps. If you have a weak CPU, consider using a reasonable video size (e.g. VGA) and fps ([15 - 30]).
6. Multi-threading and ASM. Make sure to enable YASM and pthread when building FFmpeg, x264 and VP8.
9.2 Lowering bandwidth
1. Use a low motion rank (see section 6.12.1).
2. Use a small mixed video size (see section 6.12.2).
3. Set the maximum upload and download bandwidth (see section 6.12.1).
4. Use a small video frame rate (see section 6.12.1).
To test your available bandwidth, we recommend http://www.speedtest.net/. To check bandwidth usage, we recommend iftop.
9.3 Improving audio quality
1. Use the Opus (or G.722) audio codec if supported by the SIP clients (see section 6.8).
2. Avoid audio upsampling and downsampling (see section 5.6).
3. If the pivot settings use a sample rate (SR) equal to S, try to use codecs with a SR equal to S << n or S >> n.
9.4 Improving video quality
1. Use Google Chrome as the SIP client (check our WebRTC demo client at http://conf-call.org/).
2. Enable the Zero-Artifacts feature (see sections 5.9.3 and 6.12.4).
3. Use a client supporting a video size close to 16:9 to avoid stretching issues.
4. Avoid video upsampling and downsampling.
9.5