Proposal for Pan Scan Region Definition Support in NDI

This page contains a technical proposal which is not currently a published or formal standard. If you have comments or feedback on the content of this proposal PLEASE contact us with your contribution.


The published NDI IP video protocol SDK does not currently offer any guidance on defining pan scan sub-regions in an NDI stream. However, NDI has several real-time metadata mechanisms which could carry this information. This document proposes a recommended-practice technical note that standardises support for sub-regions, so that vendors do not each arrive at a different implementation.


Basic Premise:

The proposal addresses the need to communicate multiple sub-regions defined within a single NDI video stream. The regions can move around dynamically if desired, and hence the metadata is sent with every video frame. An NDI receiver can accept the stream and its associated metadata, and distribute the sub-regions appropriately. For example, a video wall system which uses overlapping or non-contiguous zones within the master image can have the zones defined by the NDI source. Another example might be a sub-region generated by an AI system analysing a locked-off camera in a lecture hall and following the speaker. Other applications may be to communicate a safe region within a stream. Finally, the protocol can be used to define 'social media' aspect variations, allowing an upstream system to define how a primary 16x9 stream should be sub-cut to provide variations such as portrait or square versions of the stream, focussed on the important content.


The XML metadata documented here is passed within the *video frame based* metadata of any NDI video frame. It allows the definition and labelling of multiple sub-regions. These regions may be defined manually by the sender, or they may be generated dynamically by an AI image analyser, for example following the face of a person speaking in an auditorium, or following a football on a sports pitch.



Example:

ndi_video_frame.p_metadata =




<WIDTH>1280px | 66.66pc</WIDTH>

<HEIGHT>720px | 66.66pc</HEIGHT>

<OFFSET_X>320px | 16.66pc</OFFSET_X>

<OFFSET_Y>180px | 16.66pc</OFFSET_Y>

<CONTROLLER> | ndi://MYCAMERA (video out) | visca://</CONTROLLER>

Multiple regions can be defined within one NDI video frame
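
As a rough illustration of how a receiver might pick several regions out of one frame's metadata, the Python sketch below collects every element that carries all four geometry fields. The names of the enclosing elements are not visible in the example above, so the sketch deliberately does not depend on them; the function name `extract_regions` and this "any element with all four children" rule are assumptions of the sketch, not part of the proposal.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# The four geometry fields named in this proposal.
GEOMETRY_TAGS = ("WIDTH", "HEIGHT", "OFFSET_X", "OFFSET_Y")


def extract_regions(metadata_xml: str) -> List[Dict[str, str]]:
    """Return one dict per element that has all four geometry children.

    The enclosing element names are not assumed; any element whose direct
    children include WIDTH, HEIGHT, OFFSET_X and OFFSET_Y is treated as a
    region (an assumption of this sketch).
    """
    root = ET.fromstring(metadata_xml)
    regions = []
    for element in root.iter():  # document order, root included
        children = {child.tag: (child.text or "").strip() for child in element}
        if all(tag in children for tag in GEOMETRY_TAGS):
            regions.append({tag: children[tag] for tag in GEOMETRY_TAGS})
    return regions
```

The values are returned as raw strings, so the px or pc suffix is still attached and is left for the caller to interpret.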

ID is a unique identifier within the system; by default this can be an index.

LABEL is a human readable identifier for the sub-region

WIDTH, HEIGHT, OFFSET_X and OFFSET_Y can be given in pixels, in the form NNNNpx, or as a percentage, in the form NN.nnpc (devices should tolerate a variable number of digits before or after the decimal point).
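
As an illustration of the two value forms, the following Python sketch converts a WIDTH, HEIGHT, OFFSET_X or OFFSET_Y string into whole pixels. The function name `resolve_dimension`, the rounding to the nearest pixel and the rejection of upper-case units are assumptions of this sketch and are not specified by the proposal.

```python
import re

# Digits with an optional fractional part (any number of digits on either
# side of the decimal point), followed by the unit "px" or "pc".
_DIM_RE = re.compile(r"^\s*(\d+(?:\.\d*)?|\.\d+)\s*(px|pc)\s*$")


def resolve_dimension(value: str, full_size: int) -> int:
    """Convert an 'NNNNpx' or 'NN.nnpc' string to whole pixels.

    full_size is the frame width (for WIDTH and OFFSET_X) or the frame
    height (for HEIGHT and OFFSET_Y). Rounding to the nearest pixel is an
    assumption of this sketch.
    """
    match = _DIM_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised dimension: {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    if unit == "px":
        return round(number)
    # "pc" is a percentage of the full frame dimension.
    return round(full_size * number / 100.0)
```

With the values from the example, `resolve_dimension("1280px", 1920)` and `resolve_dimension("66.66pc", 1920)` both give 1280 on a frame 1920 pixels wide.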

CONTROLLER allows the definition of a communications point for the control of the pan scan.  Initially 3 options are defined:

 - an HTTP URL, which would be the address of a web page for pan scan control, possibly a GUI, and optionally with a REST API

 - ndi://HOSTNAME (DEVICENAME) is the NDI address which could be used via published NDI PTZ control to adjust the pan scan.

 - visca:// is the address for VISCA over IP control of the pan scan, with the /1 defining the device ID at that address.
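
The three options can be told apart by their scheme prefix. The Python sketch below shows one way a receiver might classify a CONTROLLER value and, for the NDI form, split HOSTNAME from DEVICENAME. The `Controller` type and `parse_controller` function are hypothetical names for this sketch; the part after `visca://` is returned untouched because its exact layout is not assumed here.

```python
import re
from typing import NamedTuple, Optional


class Controller(NamedTuple):
    scheme: str                       # "http", "ndi" or "visca"
    address: str                      # everything after "scheme://", untouched
    ndi_host: Optional[str] = None    # HOSTNAME, only for ndi://
    ndi_device: Optional[str] = None  # DEVICENAME, only for ndi://


# "HOSTNAME (DEVICENAME)" as described for the ndi:// option.
_NDI_RE = re.compile(r"^(?P<host>[^()]+?)\s*\((?P<device>[^()]*)\)$")


def parse_controller(value: str) -> Controller:
    """Classify a CONTROLLER value by its scheme prefix."""
    scheme, sep, address = value.strip().partition("://")
    scheme = scheme.lower()
    if not sep or scheme not in ("http", "ndi", "visca"):
        raise ValueError(f"unsupported controller: {value!r}")
    if scheme == "ndi":
        match = _NDI_RE.match(address)
        if match:
            return Controller(scheme, address,
                              match.group("host"), match.group("device"))
    return Controller(scheme, address)
```

For the value used in the example, `parse_controller("ndi://MYCAMERA (video out)")` yields the host `MYCAMERA` and the device name `video out`.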

AUX_ATTR is a field for application specific additional metadata.



If you have any questions, or you would like to engage Sienna for NDI Consultancy or Custom Development, please contact info @ sienna.tv


appsupport @ gallery.co.uk