H323 analysis

From TD-er's Wiki

H.323 is a standard that specifies the components, protocols and procedures that provide multimedia communication services: real-time audio, video, and data communications or any combination of these elements over packet networks, including Internet protocol (IP) based networks.

H.323 is an International Telecommunication Union (ITU) standard that provides specifications for computers, equipment, and services for multimedia communication over networks that do not provide a guaranteed quality of service. H.323 is part of a family of ITU-T recommendations called H.32x that provides multimedia communication services over a variety of networks.

H.323 was originally created to provide a mechanism for transporting multimedia applications over LANs, but it has rapidly evolved to address the growing needs of VoIP networks. One strength of H.323 was the relatively early availability of a set of standards that defined not only the basic call model but also the supplementary services needed to meet business communication expectations. H.323 was the first VoIP standard to adopt the Internet Engineering Task Force (IETF) standard Real-Time Transport Protocol (RTP) to transport audio and video over IP networks, with additional protocols for call signaling, and data and audiovisual communications.


The H.323 standard specifies four kinds of components (terminals, gateways, gatekeepers and multipoint control units) which, when networked together, provide point-to-point and point-to-multipoint multimedia communication services:


An H.323 terminal is an endpoint in the LAN that participates in real-time, two-way communications with another H.323 terminal, gateway, or multipoint control unit (MCU). A terminal must support audio communication and can also support audio with video, audio with data, or a combination of all three.

H.323 terminal options include support for user equipment interfaces, audio and video codecs, the T.120 data protocols, MCU capabilities and an H.225 component called RAS (Registration/Admission/Status), which is the protocol used to communicate with a gatekeeper.

  1. User equipment interfaces: The set of cameras, screens, microphones, speakers and data applications with their respective interfaces.
  2. Audio codec: Every terminal has an audio codec to encode and decode speech signals (G.711) and must be able to transmit and receive both A-law and μ-law. Optionally, a terminal can also encode and decode speech with other audio codecs. An H.323 terminal can also, optionally, send more than one audio channel at the same time, for example to transmit two different languages.
  3. Video codec: Optional in H.323 terminals.
  4. Data Channel: One or more data channels are optional. They could be in one direction or in both.
  5. Receive path delay: This covers the delay added to incoming packets to maintain synchronization and to absorb jitter in packet arrival times. It is normally applied on the receiving side rather than the transmitting side, for example to achieve lip synchronization between video and audio in a videoconference.
  6. System Control Unit: Provides signaling for the terminal. It has three functions: the H.245 control function, the H.225 call signalling function and the RAS signaling function.
    • H.245 Control function: The H.245 logical channel carries the end-to-end control messages of the H.323 protocol. It negotiates capabilities, opens and closes the logical channels and sends flow control messages. Each call can have any number of logical channels of each type (audio, video or data) but only one control logical channel, channel 0.
    • H.225 Call signalling function: It uses one logical signaling channel to send and receive call establishment and session termination messages between two H.323 endpoints. The channel is independent of the H.245 control channel, and the H.245 logical channel opening and closing procedures are not used to establish it. It is opened before the H.245 control channel and any other logical channel. It can be established terminal to terminal or terminal to gatekeeper.
    • RAS Control function (Registration, Admission, Status): It uses a logical RAS signaling channel to carry out registration, admission, status and bandwidth change procedures between endpoints (terminal, gateway, etc.) and the gatekeeper. It is only used when a gatekeeper is present. The RAS signaling channel is independent of the call signaling channel and of the H.245 control channel, and the H.245 logical channel opening and closing procedures are not used to establish it. The RAS channel is opened before any other logical channel.
  7. H.225 Layer: Formats the transmitted video, audio and data packets. It also handles packet framing, sequence numbering and error detection.
  8. Packet network interface: This is implementation-specific. It must provide the services described in the H.225 recommendation, which means that a reliable end-to-end service (for example, TCP) is mandatory for the H.245 control channel, the data channels and the call signaling channels.

An unreliable end-to-end service (UDP, IPX) is mandatory for the audio channels, the video channels and the RAS channel. These services can be duplex or simplex and unicast or multicast, depending on the application, the terminals' capabilities and the network configuration.
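
The transport requirements above can be summarised as a small lookup table. This is only an illustrative sketch: the channel keys and the helper name `transport_for` are invented here and do not come from any H.323 library.

```python
# Illustrative mapping of H.323 channel types to the transport service
# each one requires, as described in the text above.
RELIABLE = "reliable (e.g. TCP)"
UNRELIABLE = "unreliable (e.g. UDP)"

CHANNEL_TRANSPORT = {
    "h245_control": RELIABLE,
    "call_signalling": RELIABLE,
    "data": RELIABLE,
    "audio": UNRELIABLE,
    "video": UNRELIABLE,
    "ras": UNRELIABLE,
}


def transport_for(channel: str) -> str:
    """Return the transport service required for a channel type."""
    return CHANNEL_TRANSPORT[channel]


if __name__ == "__main__":
    for name in CHANNEL_TRANSPORT:
        print(f"{name:16s} -> {transport_for(name)}")
```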


H.323 conference gateways make H.323 terminals on a LAN available to H.323 terminals on a wide area network (WAN) or another H.323 gateway.

These devices provide many services, including the needed translation function between H.323 conferencing endpoints on the LAN and other ITU-compliant terminals on other ITU-compliant circuit-switched and packet-switched networks.

These services include the translation mechanism for call signaling (for example H.245 to H.242), data transmission (for example H.225.0 to H.221), and audio and video transcoding.

The gateway also translates between audio and video codecs where needed, and performs call setup and clearing.

Gateways satisfy part of the interoperability vision of H.32x products due to the ability to connect to each other.

Gateways can serve the following purposes:

  • To bridge an H.323 call to another type of call, such as a telephone.
  • To bridge H.323 calls to H.320, which is audio and video transmission over Integrated Services Digital Network (ISDN).
  • To bridge H.323 calls to H.324, which is audio and video transmission over standard telephone lines.
  • To bridge different networks; an organization could put a bridge on a firewall to connect an internal corporate network with external networks to accept incoming calls.

In this case, gateway functions are similar to an MCU for connecting people over networks. Typically, though, the gateway is the translation mechanism in a point-to-point connection, where only one endpoint is an H.323 device. On the other hand, an MCU typically connects many H.323 devices in a multipoint conference.


Gatekeepers provide network services to H.323 terminals, MCUs, and gateways. H.323 devices register with gatekeepers to send and receive H.323 calls. Gatekeepers give permission to make or accept a call based on a variety of factors.

Gatekeepers can provide network services such as:

  • Controlling the number and type of connections allowed across the network.
  • Helping to route a call to the correct destination.
  • Determining and maintaining the network address for incoming calls.

Gatekeepers perform two important functions. The first is address translation services between aliases for terminals and gateways and IP or IPX addresses.

The second Gatekeeper function is bandwidth management. For instance, if a LAN manager-specified threshold for the number of simultaneous conferences on the LAN has been reached, the Gatekeeper can refuse to make any more connections.

The collection of all Terminals, Gateways, and Multipoint Control Units managed by a single gatekeeper is known as an H.323 Zone.

A gatekeeper, if present (it is not always required), takes care of the following services:

  • Admission control: The gatekeeper can deny calls from a terminal because it lacks permission, lacks permission at that time of day, or fails some other criterion. Admission control may also be a null function which admits all requests.
  • Bandwidth control: The gatekeeper controls the number of H.323 terminals that can be connected simultaneously and can deny calls if the available bandwidth is too low.
  • Zone Management: The Gatekeeper provides the above functions for terminals, MCUs, and Gateways which have registered within its Zone of control.
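
The gatekeeper services described above (alias translation, admission control and a simple call-count threshold) can be sketched as a toy class. This is a hedged illustration only: the class, its method names and the example addresses are hypothetical, and a real gatekeeper speaks RAS over the network rather than exposing Python methods.

```python
class Gatekeeper:
    """Toy gatekeeper: alias translation plus admission/bandwidth control."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls  # threshold set by the LAN manager
        self.registry = {}          # alias -> network address (the zone)
        self.active_calls = 0

    def register(self, alias: str, address: str) -> None:
        """Add an endpoint to the zone managed by this gatekeeper."""
        self.registry[alias] = address

    def resolve(self, alias: str):
        """Address translation: return the address for an alias, or None."""
        return self.registry.get(alias)

    def admit(self, caller: str, callee: str) -> bool:
        """Admit a call only for registered endpoints and below the threshold."""
        if caller not in self.registry or callee not in self.registry:
            return False
        if self.active_calls >= self.max_calls:
            return False
        self.active_calls += 1
        return True

    def disengage(self) -> None:
        """Free the slot held by a finished call."""
        self.active_calls = max(0, self.active_calls - 1)


if __name__ == "__main__":
    gk = Gatekeeper(max_calls=1)
    gk.register("alice", "192.0.2.10:1720")
    gk.register("bob", "192.0.2.20:1720")
    print(gk.resolve("bob"))
    print(gk.admit("alice", "bob"))  # first call fits under the threshold
    print(gk.admit("bob", "alice"))  # second call is refused
```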

Multipoint Control Unit

A Multipoint Control Unit (MCU), also called a conferencing server or conferencing bridge, is an endpoint in the H.323 zone that allows three or more H.323 terminals and gateways to connect and participate in a multipoint conference.

An MCU includes both multipoint controllers, which manage the H.323 terminal functions and capabilities in a multipoint conference, and multipoint processors, which process the audio, video, and data streams between H.323 terminals.

An MCU can also connect two terminals in a point-to-point conference that can later develop into a multipoint conference.

Multipoint Controller

A multipoint controller is a component of H.323 that negotiates capabilities with all the terminals in order to reach a common level of communication. It can also control conference resources such as video multicasting.

Multipoint Processor

A multipoint processor is a combined hardware and specialized software component of H.323. It mixes, switches and processes the audio, video and/or data streams for the participants of a multipoint conference, so that the terminals' own processors are not heavily loaded. The multipoint processor can process a single media stream or multiple media streams, depending on the type of conference supported.
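
As a rough idea of what "mixing" means for audio, the sketch below sums 16-bit PCM samples from several participants and clips the result. It is a simplified assumption of how a mixer could work, not a description of any particular MCU; the function name `mix_audio` is made up for this example.

```python
def mix_audio(streams, sample_min=-32768, sample_max=32767):
    """Mix equal-length lists of 16-bit PCM samples by summing and clipping."""
    if not streams:
        return []
    mixed = []
    # zip(*streams) yields one tuple per sample instant, one value per stream.
    for samples in zip(*streams):
        total = sum(samples)
        # Clip to the 16-bit range so loud passages do not wrap around.
        mixed.append(max(sample_min, min(sample_max, total)))
    return mixed


if __name__ == "__main__":
    terminal_a = [1000, -2000, 30000]
    terminal_b = [500, 500, 10000]
    print(mix_audio([terminal_a, terminal_b]))
```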

H.323 Proxy

An H.323 proxy server is a proxy specifically designed for the H.323 protocol that examines packets between two communicating applications. Proxies are able to determine the destination of a call and perform call-connection steps, if necessary.

H.323 proxies perform the following key functions:

  • Voice terminals that do not support Resource Reservation Protocol (RSVP) can connect through remote access or local area networks (LANs) with relatively reliable quality of service (QoS) to the proxy. Pairs of proxies can then be employed to develop tunnels across the IP network.
  • Proxies support routing H.323 traffic separately from ordinary data traffic using application-specific routing (ASR).
  • Proxies are compatible with network address translation functions in gateways or gatekeepers, enabling H.323 to be deployed in networks using private address space.

H.323 Protocol stack


The best-known protocols used in H.323 are:

  • RTP/RTCP (Real-Time Transport Protocol / Real-Time Transport Control Protocol): Internet-standard protocol for the transport of real-time data, including audio and video. RTP is used in virtually all voice-over-IP architectures, for videoconferencing, media-on-demand, and other applications. A thin protocol, it supports content identification, timing reconstruction, and detection of lost packets.
  • RAS (Registration, Admission and Status): A protocol for Registration, Admission and Status. In an H.323 audio or video system, the RAS is a control channel over which H.225.0 signaling messages are sent.
  • H.225.0: Protocol used to describe call signaling, the media (audio and video), the stream packetization, media stream synchronization and control message formats.
  • H.245: Control protocol for multimedia communication. It describes the messages and procedures used for opening and closing logical channels for audio, video and data, capability exchange, control and indications. It manages the following functions:
    1. Capability exchange: Each terminal declares the codecs it supports and sends them to the other endpoint.
    2. Opening and closing logical channels: H.323 audio and video channels are point-to-point and unidirectional, so a two-way call needs at least two of them. This is the responsibility of H.245.
    3. Flow control when there is a problem.
    4. Various other minor functions.
  • Q.931: A protocol for Call Signaling, consisting of Setup, Teardown and Disengage. Q.931 is included in the H.225.0 Recommendation
  • RSVP (Resource ReSerVation Protocol): Protocol for reserving network resources to provide guaranteed application QoS (Quality of Service)
  • T.120: Standard for data conferencing and conference control for interactive multimedia communication - multipoint & point-to-point.
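
To make the RTP entry above more concrete, the following sketch decodes the 12-byte fixed RTP header using Python's standard `struct` module. The function name `parse_rtp_header` and the sample packet are made up for illustration; CSRC lists and header extensions are ignored.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (no CSRC list or extensions)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 bytes")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source identifier.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,   # used to detect lost packets
        "timestamp": timestamp,   # used for timing reconstruction
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Hypothetical packet: version 2, payload type 0, sequence number 1.
    sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234) + b"\x00" * 160
    print(parse_rtp_header(sample))
```

The sequence number and timestamp fields are what allow a receiver to detect lost packets and rebuild timing, as described in the RTP entry.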

The following codecs are recommended by the H.323 standard (for more details see H323 Audio and Video Codecs):

  • G.711: ITU-TSS recommendation "Pulse code modulation (PCM) of voice frequencies". This audio standard is mandatory for all video conferencing systems. It requires a data rate of 56 or 64 kbit/s.
  • H.261 and H.263: Video codecs of the H.323 standard. However, other ones can be used.
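
The G.711 data rates quoted above follow from simple arithmetic: 8000 samples per second at 8 bits per sample gives 64 kbit/s. The sketch below works this out, assuming the 56 kbit/s figure corresponds to using only 7 of the 8 bits per sample; the helper names and the 20 ms packet interval are illustrative choices, not values taken from the text.

```python
SAMPLE_RATE_HZ = 8000  # G.711 samples speech at 8 kHz


def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Bit rate in bit/s of a PCM stream."""
    return bits_per_sample * sample_rate_hz


def payload_bytes(packet_ms: int, bits_per_sample: int = 8) -> int:
    """Audio bytes carried in one packet covering packet_ms milliseconds."""
    return pcm_bit_rate(bits_per_sample) * packet_ms // 1000 // 8


if __name__ == "__main__":
    print(pcm_bit_rate(8))    # 64000 bit/s
    print(pcm_bit_rate(7))    # 56000 bit/s (assumed 7-bit samples)
    print(payload_bytes(20))  # bytes of audio in a 20 ms packet
```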

H.323 clients

For some possible H323 (and SIP) VoIP clients you can check this list. The only two that support both H323 and Linux are:

Another option is the command-line client ohphone, derived directly from the OpenH323 project protocol stack by the same development team. It has no graphical interface but is recommended for testing purposes.


Ekiga can be downloaded from the Ekiga page. The OPAL and PWLIB libraries have to be installed first. This is a tedious and time-consuming process (lots of compiling needed). Our own Ekiga experiences can be found at Ekiga.


Kphone is another SIP client, used to briefly test the basic setup of Asterisk.


Gnomemeeting is the predecessor of Ekiga. Its configuration is described in the Gnomemeeting section.


SJphone is a very straightforward download from SJ Labs (VoIP software).


Note: At least the Windows version of SJphone still uses SIP, even when SIP is disabled in the configuration. When two instances of SJphone call each other, the connection is true H323.

H323 and TrixBox

We tried to look up some information on H323 and TrixBox. From the information on the web it seems that this is a tricky topic. Many questions, few answers. People seem to have trouble with it. I haven't been through all the correspondence on the TrixBox forum but it seems that most people have a hard time even setting up a connection (lots of configuring) and when they do succeed the connection dies after 15-30 seconds. We will look into this a bit more tomorrow.

Example H.323 call


The following example shows an H.323 call. The different colors show different protocols.


An H.323 call has four phases: call setup, control signalling, audio and call release.


  • Terminal T1 requests admission from the gatekeeper using the RAS protocol (Registration, Admission, Status), sending an ARQ (Admission Request) message and receiving an ACF (Admission Confirm) message.
  • Using the H.225 protocol (used for setup and release of the call), T1 sends a SETUP message to T2 requesting a connection. This message contains the IP address, port and alias of the calling user or the IP address and port of the called user.
  • T2 sends a CALL PROCEEDING message to acknowledge the attempt to establish a call.
  • T2 must now request admission from the gatekeeper, as T1 did previously.
  • The ALERTING message indicates the beginning of the tone generation phase.
  • Finally, the CONNECT message marks the beginning of the connection.
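
The message exchange listed above can be written down as an ordered trace, which is handy when comparing against a packet capture. This is a sketch only: the `Message` tuple, the function name and the endpoint labels T1, T2 and GK (gatekeeper) are chosen for illustration.

```python
from typing import NamedTuple


class Message(NamedTuple):
    protocol: str
    sender: str
    receiver: str
    name: str


def call_setup_trace():
    """Ordered messages of the setup phase, following the list above."""
    return [
        Message("RAS", "T1", "GK", "ARQ"),
        Message("RAS", "GK", "T1", "ACF"),
        Message("H.225", "T1", "T2", "SETUP"),
        Message("H.225", "T2", "T1", "CALL PROCEEDING"),
        Message("RAS", "T2", "GK", "ARQ"),
        Message("RAS", "GK", "T2", "ACF"),
        Message("H.225", "T2", "T1", "ALERTING"),
        Message("H.225", "T2", "T1", "CONNECT"),
    ]


if __name__ == "__main__":
    for msg in call_setup_trace():
        print(f"{msg.protocol:6s} {msg.sender} -> {msg.receiver}: {msg.name}")
```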

Control Signalling

In this phase a negotiation using the H.245 protocol (conference control) is opened. The exchange of request and response messages between the two terminals establishes which will be the master and which the slave, the capabilities of the participants and the audio and video codecs to be used. When the negotiation finishes, the communication channel is opened (IP addresses, port).

The main H.245 messages used in this step are:

  • TerminalCapabilitySet (TCS): Message listing the capabilities supported by the terminals taking part in a call.
  • OpenLogicalChannel (OLC): Message to open a logical channel. It carries the information needed to receive and decode the data, including the type of data that will be sent.
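
A heavily simplified view of the capability exchange is that each side picks a codec the other side has declared it can receive. The sketch below assumes capabilities are plain lists of codec names, which is far cruder than real H.245 capability sets; `select_codec` is a hypothetical helper, not an H.245 procedure.

```python
def select_codec(local_prefs, remote_caps):
    """Return the first locally preferred codec the remote side supports."""
    remote = set(remote_caps)
    for codec in local_prefs:
        if codec in remote:
            return codec
    return None  # no common codec: the channel cannot be opened


if __name__ == "__main__":
    t1_prefs = ["G.729", "G.723.1", "G.711"]  # in order of preference
    t2_caps = ["G.711", "G.723.1"]            # as declared in T2's TCS
    print(select_codec(t1_prefs, t2_caps))
```

Since G.711 support is mandatory for terminals, in this simplified model two compliant terminals should always find at least that codec in common.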


Terminals start the communication using the RTP/RTCP protocol.

Call Release

  • Either the calling or the called terminal can initiate the release, using the H.245 CloseLogicalChannel and EndSessionCommand messages to end the call.
  • Then the connection is closed over H.225 with the RELEASE COMPLETE message.
  • Finally, both terminals release the call at the gatekeeper using the RAS protocol.
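
The release order described above (H.245 first, then H.225, then RAS) can be checked against a captured trace with a few lines of code. This is a sketch under assumptions: the function name is invented, and the RAS message names DRQ and DCF (Disengage Request and Confirm) used in the example are not mentioned in the text above.

```python
RELEASE_ORDER = ["H.245", "H.225", "RAS"]


def is_valid_release(trace):
    """Check that a list of (protocol, message) tuples follows the release order."""
    try:
        ranks = [RELEASE_ORDER.index(protocol) for protocol, _ in trace]
    except ValueError:
        return False  # a protocol that does not belong to the release phase
    # All three protocols must appear, and never out of order.
    return ranks == sorted(ranks) and set(ranks) == {0, 1, 2}


if __name__ == "__main__":
    trace = [
        ("H.245", "CloseLogicalChannel"),
        ("H.245", "EndSessionCommand"),
        ("H.225", "RELEASE COMPLETE"),
        ("RAS", "DRQ"),  # assumed disengage request to the gatekeeper
        ("RAS", "DCF"),  # assumed disengage confirmation
    ]
    print(is_valid_release(trace))
```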

H323-modes explained

Whiteboard Fabio.jpg

H323-related links