Computation Graphs

Overview

Scanner represents applications as computation graphs. The nodes of a computation graph are Scanner operations (see the operations guide for more information), and the edges between operations represent the streams of data that operations consume and produce. Each node in a computation graph is one of four types:

  • Input nodes (sc.io.Input): read data from stored streams (see the Stored Streams guide), such as videos or previously generated metadata.

  • Operation nodes (sc.ops.XXX): represent functions that transform their inputs into new outputs, such as performing a resize of a frame.

  • Stream operation nodes (sc.streams.XXX): select a subset of the elements of their input stream, such as every third frame of a video.

  • Output nodes (sc.io.Output): write data to empty stored streams.

For example, let’s look at a computation graph with an operation that resizes frames:

import scannerpy as sp
import scannertools.imgproc

sc = sp.Client()
video_stream = sp.NamedVideoStream(sc, 'example', path='example.mp4')
input_frames = sc.io.Input([video_stream])
resized_frames = sc.ops.Resize(frame=input_frames, width=[640], height=[480])
output_stream = sp.NamedVideoStream(sc, 'example-output')
output = sc.io.Output(resized_frames, [output_stream])

Here, sc.io.Input, sc.ops.Resize, and sc.io.Output are the nodes of a three-node graph. sc.ops.Resize is connected to sc.io.Input by passing the input node's output, input_frames, to the resize operation as frame=input_frames. sc.io.Output is connected to sc.ops.Resize in the same way, but Output() operations also bind an edge of the computation graph to an empty stored stream (output_stream here), which is filled in with the sequence of elements produced on that edge. Importantly, note that we have not processed any data at this point: we have only defined a computation graph that we can tell Scanner to execute. Let’s do that now:

sc.run(output)

This call kicks off a Scanner job that reads all the elements in the input video_stream and writes the outputs to output_stream.
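
The separation between defining a graph and executing it can be illustrated with a small plain-Python model. This is only a sketch of the idea of deferred execution, not Scanner's implementation: the Node class, make_input, make_op, and run below are hypothetical names introduced for illustration, and a "frame" is modeled as a (width, height) tuple.

```python
# Toy model of a lazily evaluated computation graph.
# All names here are hypothetical; none of this is Scanner's API.

class Node:
    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn                # work to perform when the graph is run
        self.inputs = list(inputs)  # upstream nodes (the graph's edges)

calls = []  # records which node functions actually executed, in order

def make_input(elements):
    def read():
        calls.append('input')
        return list(elements)
    return Node('input', read)

def make_op(name, per_element_fn, upstream):
    def apply(stream):
        calls.append(name)
        # One output element per input element.
        return [per_element_fn(e) for e in stream]
    return Node(name, apply, [upstream])

def run(node):
    # Evaluate upstream nodes first, then this node.
    args = [run(parent) for parent in node.inputs]
    return node.fn(*args)

# Defining the graph performs no work...
frames = make_input([(1920, 1080)] * 3)
resized = make_op('resize', lambda frame: (640, 480), frames)
calls_before_run = list(calls)

# ...work only happens when the graph is run.
result = run(resized)
```

In this model, calls_before_run is empty because building the nodes only records what to do; the input and resize functions execute only inside run().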

The rest of this guide goes into further detail on the capabilities of computation graphs.

Multiple inputs and outputs

Computation graphs can have any number of inputs and outputs. Here’s an example graph with one input and two outputs:

video_stream = sp.NamedVideoStream(sc, 'example', path='example.mp4')
input_frames = sc.io.Input([video_stream])

large_frames = sc.ops.Resize(frame=input_frames, width=[1280], height=[720])
small_frames = sc.ops.Resize(frame=input_frames, width=[640], height=[480])

large_stream = sp.NamedVideoStream(sc, 'large-output')
large_output = sc.io.Output(large_frames, [large_stream])

small_stream = sp.NamedVideoStream(sc, 'small-output')
small_output = sc.io.Output(small_frames, [small_stream])

sc.run([large_output, small_output])

Notice how we pass both outputs to the run() method. Scanner only runs the portions of the graph needed to produce the streams for the outputs passed to run.
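
One way to picture "only the portions of the graph needed" is as an upstream traversal from the requested outputs. The sketch below is a plain-Python illustration under that assumption, not Scanner's scheduler: required_nodes and the parents dictionary are hypothetical, with node names borrowed from the example above.

```python
def required_nodes(parents, outputs):
    """Return the set of nodes reachable upstream from the requested outputs."""
    needed = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if node in needed:
            continue  # already visited via another path
        needed.add(node)
        stack.extend(parents.get(node, []))
    return needed

# Each node maps to the nodes it consumes data from.
parents = {
    'input_frames': [],
    'large_frames': ['input_frames'],
    'small_frames': ['input_frames'],
    'large_output': ['large_frames'],
    'small_output': ['small_frames'],
}

# Requesting only the small output leaves the large branch untouched.
only_small = required_nodes(parents, ['small_output'])
both = required_nodes(parents, ['large_output', 'small_output'])
```

Here only_small contains input_frames, small_frames, and small_output, while both covers every node in the graph.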

Batch processing of stored streams

Often, you have a large collection of videos and want to run the same computation graph over all of them. Scanner supports this via batch processing of input and output streams. To process a batch of streams, create a list of stored streams representing the input videos and then pass that list to the input operation:

input_streams = [
    sp.NamedVideoStream(sc, 'example1', path='example1.mp4'),
    sp.NamedVideoStream(sc, 'example2', path='example2.mp4'),
    ...
    sp.NamedVideoStream(sc, 'example100', path='example100.mp4')]
input_frames = sc.io.Input(input_streams)
resized_frames = sc.ops.Resize(frame=input_frames,
                               width=[640, 1280, ..., 480],
                               height=[480, 720, ..., 360])

Note that this is different from having multiple inputs or outputs to a computation graph. This graph still has only one input because each video in the batch is processed independently. Conceptually, you can think of batch processing as executing a separate instance of the graph for each input stream in a batch. Notice the other change that we made to this graph: the width and height arguments to Resize are now lists of the same length as input_streams. This is because height and width are stream config parameters to Resize: each input stream gets its own set of parameters. Check out the Stream Config Parameters section to learn more about how stream config parameters work.

We also need a corresponding output stream for each of our input streams:

output_streams = [
    sp.NamedVideoStream(sc, 'example1-resized'),
    sp.NamedVideoStream(sc, 'example2-resized'),
    ...
    sp.NamedVideoStream(sc, 'example100-resized')]
output = sc.io.Output(resized_frames, output_streams)

When executing this graph, Scanner will read and process each input stream independently to produce the output streams. If Scanner is running on a multi-core machine, multi-GPU machine, or on a cluster of machines, the videos will be processed in parallel across any of those configurations.
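
The "separate instance of the graph per input stream" view, together with per-stream parameters, can be modeled in plain Python. This is a sketch under simplifying assumptions, not how Scanner executes a batch: run_batch and resize are hypothetical helpers, a "frame" is modeled as a (width, height) tuple, and streams are processed sequentially rather than in parallel.

```python
def resize(frame, width, height):
    # Stand-in for a real resize: a "frame" here is just its (width, height).
    return (width, height)

def run_batch(input_streams, widths, heights):
    # Stream config parameters: one value per input stream.
    if not (len(input_streams) == len(widths) == len(heights)):
        raise ValueError('need one width and one height per input stream')
    output_streams = []
    # Each stream is processed independently with its own parameters.
    for stream, w, h in zip(input_streams, widths, heights):
        output_streams.append([resize(frame, w, h) for frame in stream])
    return output_streams

input_streams = [
    [(1920, 1080)] * 2,  # first video: two frames
    [(1280, 720)] * 3,   # second video: three frames
]
outputs = run_batch(input_streams, widths=[640, 320], heights=[480, 240])
```

Each output stream has the same length as its input stream, and the i-th width and height apply only to the i-th stream.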

Stream Operations

Most operations are restricted to producing a single output element for each input element they receive. However, sometimes an application only needs to process a subset of the input elements from a stored stream. Scanner supports this using stream operations. For example, if an application only requires every third frame from a video, we can use a Stride() operation:

input_frames = sc.io.Input([video_stream])
resized_frames = sc.ops.Resize(frame=input_frames, width=[640], height=[480])
sampled_frames = sc.streams.Stride(resized_frames, [3])

If video_stream is of length 30, then sampled_frames will be a sequence of length 10 with the frames at indices [0, 3, 6, 9, …, 27]. Scanner also supports other types of stream operations, such as Gather(), which selects frames given a list of indices:

sampled_frames = sc.streams.Gather(resized_frames, [[0, 5, 7, 29]])
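
The selection semantics described for Stride() and Gather() can be checked with ordinary Python lists. The stride and gather functions below are hypothetical stand-ins that only mimic which elements are kept; they are not the Scanner operations themselves.

```python
def stride(stream, step):
    # Keep every step-th element, starting from index 0.
    return stream[::step]

def gather(stream, indices):
    # Keep exactly the elements at the given indices, in the given order.
    return [stream[i] for i in indices]

# Model a 30-frame video by its frame indices.
frame_indices = list(range(30))

strided = stride(frame_indices, 3)
gathered = gather(frame_indices, [0, 5, 7, 29])
```

With a 30-element stream and a stride of 3, strided has 10 elements and ends at index 27, matching the description above.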

To see the full list of stream operations, check out the methods of StreamsGenerator.