7.1 Multithreading and Multiprocessing
3Delight can render an image using both multithreading and multiprocessing. Additionally, 3Delight is able to render an image using any number of machines that are reachable on the network (and potentially using many threads or processes on each machine). This very complete set of functionalities makes 3Delight usable on any multiprocessor hardware configuration(42).
This section explains threading and multiprocessing in more detail and gives hints on when to use one or the other.
- 7.1.1 Multithreading
- 7.1.2 Multiprocessing
- 7.1.3 Network Rendering
- 7.1.4 Performance and Implementation Notes
- 7.1.5 Licensing Behavior
7.1.1 Multithreading
By default, 3Delight starts the render using as many threads as there are processors. A thread is different from a process in that it runs in the same memory space as the parent process, meaning that using many threads on a single image won't affect memory use significantly (unlike multiprocessing as explained in Multiprocessing).
One can override the number of threads used by passing the `-p' option to renderdl. For example,
% renderdl -p 3 frame1.rib
will use three threads to render an image. More about the `-p' option is explained in Using the RIB Renderer - renderdl.
The number of threads can also be specified inside the RIB using a dedicated RiOption. For example,
Option "render" "nthreads" 2
will use two threads to render the image.
3Delight will assign a small region of the image to each thread started. Typically, each region will have a size of 2x2 buckets.
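As a rough illustration of this kind of work distribution, the sketch below cuts an image into regions of 2x2 buckets that idle threads could pick up one after another. The 16-pixel bucket size, the function name and the scan order are assumptions made for the example; this is not a description of 3Delight's internal scheduler.

```python
def bucket_regions(width, height, bucket=16, region=2):
    """Yield (x0, y0, x1, y1) pixel rectangles, each covering up to
    region x region buckets, left to right and top to bottom.
    bucket=16 is an assumed bucket size, not a documented value."""
    step = bucket * region  # edge length of one region, in pixels
    for y0 in range(0, height, step):
        for x0 in range(0, width, step):
            # Clamp the last row and column of regions to the image border.
            yield (x0, y0, min(x0 + step, width), min(y0 + step, height))

regions = list(bucket_regions(640, 480))
print(len(regions), regions[0], regions[-1])
```

With these assumed values, a 640x480 image yields 20 x 15 = 300 regions, so even a render with many threads has plenty of small work items to hand out.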
7.1.2 Multiprocessing
A process is different from a thread in that it runs in its own memory space; this means that using multiprocessing will generally use more memory resources than a multithreaded render. The advantage of using processes instead of threads is that there is very little synchronization overhead between processes and this might lead to faster renders compared to multithreaded renders.
Multiple processes can be launched using the `-P' command line option. For example,
% renderdl -P 2 image.rib
will start a render using two processes.
The way renderdl splits the image in a multiprocess render is different from the multithreading case: the image is split into large areas (or tiles), and each tile is assigned to one process. 3Delight doesn't know in advance how to cut the image into such tiles efficiently. For example, a process might be assigned an empty region, leaving the other processes to do more work and wasting time. This is why 3Delight uses a load balancing strategy that records rendering times for each process and tries to tile the image better for the next render. This means that two consecutive renders of a frame will use different tilings, and the second render should take less time than the first one (refer to Performance and Implementation Notes for more details).
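The idea of reusing timings from a previous render can be sketched as follows. This is only one possible balancing scheme, written for vertical strips, and is not taken from the 3Delight implementation: it assumes that the time measured for a strip is spread evenly across its width, and it moves the strip boundaries so that each process would receive an equal share of the total measured time. All names are hypothetical.

```python
def rebalance(bounds, times):
    """bounds: strip edges in pixels, e.g. [0, 320, 640] for two strips.
    times: seconds each strip took in the previous render.
    Returns new edges giving each strip an equal share of the total time,
    assuming cost is uniform inside each old strip (a simplification)."""
    n = len(times)
    total = sum(times)
    if total <= 0:
        return list(bounds)  # nothing measured, keep the current tiling
    share = total / n
    new_bounds = [bounds[0]]
    strip, done = 0, 0.0  # current old strip, time accumulated before it
    for k in range(1, n):
        target = k * share
        # Advance to the old strip that contains the target cumulative time.
        while strip < n - 1 and done + times[strip] < target:
            done += times[strip]
            strip += 1
        width = bounds[strip + 1] - bounds[strip]
        frac = (target - done) / times[strip] if times[strip] > 0 else 0.0
        new_bounds.append(round(bounds[strip] + frac * width))
    new_bounds.append(bounds[-1])
    return new_bounds

print(rebalance([0, 320, 640], [1.0, 3.0]))
```

In this example the right-hand strip took three times longer than the left one, so the boundary moves to the right (from 320 to 427) and the process that finished early receives a wider strip on the next render.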
The tiling strategy can be manually forced using the `-tiling' command line option (Using the RIB Renderer - renderdl). Three different tiling strategies are supported: vertical, horizontal and mixed. For example, to use the mixed tiling mode, one would write:
% renderdl -P 4 -tiling m frame1.rib
The best splitting strategy depends on the image being rendered: choose the tiling so that the complexity is well distributed among the processes. Most outdoor scenes should be rendered using vertical tiling so that every process gets its share of the "sky" region where complexity is usually very low. If the complexity is uniformly distributed inside the scene, tiling orientation won't make much of a difference.
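To make the three modes concrete, the sketch below cuts an image into one rectangle per process. It assumes that "vertical" means side-by-side strips spanning the full image height (consistent with every process receiving part of the sky), that "horizontal" means stacked bands, and that "mixed" means a grid. The exact layout chosen by renderdl is not specified in this section, so the function is purely illustrative and its names are hypothetical.

```python
from math import isqrt

def split(length, parts):
    """Cut [0, length) into `parts` contiguous, near-equal integer ranges."""
    edges = [length * i // parts for i in range(parts + 1)]
    return list(zip(edges[:-1], edges[1:]))

def tiles(width, height, nproc, mode):
    """Return one (x0, y0, x1, y1) rectangle per process."""
    if mode == "vertical":
        cols, rows = nproc, 1
    elif mode == "horizontal":
        cols, rows = 1, nproc
    elif mode == "mixed":
        # Assumed layout: the most square grid with exactly nproc cells.
        rows = max(r for r in range(1, isqrt(nproc) + 1) if nproc % r == 0)
        cols = nproc // rows
    else:
        raise ValueError("unknown tiling mode: %r" % mode)
    return [(x0, y0, x1, y1)
            for (y0, y1) in split(height, rows)
            for (x0, x1) in split(width, cols)]

for mode in ("vertical", "horizontal", "mixed"):
    print(mode, tiles(640, 480, 4, mode))
```

For an outdoor frame whose top third is empty sky, the vertical tiles each contain the same proportion of sky, whereas the topmost horizontal tile would contain almost nothing to render.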
7.1.3 Network Rendering
The `-P' and `-p' options start processes on the local machine only; if rendering on many hosts is desired then renderdl should be provided with a list of usable machines through the `-hosts' option(43). But before using the remote rendering feature, it is essential that the following points are checked:
- Properly configured rsh or ssh
  By default, 3Delight uses rsh to start a rendering process on remote machines; this means that the machine that starts the render should have the required permissions on all used remote hosts(44). If rsh is judged insecure, one could use SSH by providing renderdl with the `-ssh' option, but this requires passwordless SSH authentication and more setup than the rsh method.
- 3Delight must be executable from each rendering machine
  This should be the case if the package is installed on an NFS drive. Note that it is important that the `.cshrc' or `.tcshrc' file contain all the environment variables necessary to run 3Delight. An easy way to verify that everything is set up properly on a remote machine is to invoke rsh (or ssh) manually:
  % rsh hostname renderdl somefile.rib
- The RIB file should be accessible from each remote machine
  This means that the RIB should reside on a shared drive, such as an NFS drive.
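The manual check described above can be repeated for a whole list of hosts with a small script. The sketch below assumes only that rsh or ssh and renderdl are installed as described; the helper names are hypothetical, and the command it builds is the same one used for the manual test.

```python
import subprocess

def remote_check_command(host, rib, use_ssh=False):
    """Build the argv for the manual test: rsh (or ssh) host renderdl rib."""
    shell = "ssh" if use_ssh else "rsh"
    return [shell, host, "renderdl", rib]

def check_hosts(hosts, rib, use_ssh=False):
    """Run the test on each host; return the hosts where it failed."""
    failed = []
    for host in hosts:
        try:
            result = subprocess.run(remote_check_command(host, rib, use_ssh))
            ok = result.returncode == 0
        except OSError:
            ok = False  # rsh or ssh itself could not be started
        if not ok:
            failed.append(host)
    return failed

print(remote_check_command("leto", "somefile.rib"))
```

Since check_hosts actually renders the file on every host, a small test RIB that resides on the shared drive is a sensible choice.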
Once everything is set up properly, one can perform network rendering using the `-hosts' option. For example, to compute an image using four machines (the current machine, origin1, leto and harkonnen), one would issue the following command (this will also open a framebuffer window):
% renderdl -hosts origin1,leto,harkonnen,localhost -id frame1.rib
Note that by default, every machine will use as many threads as possible; so if `origin1' has two CPUs, both will be used to render the part of the image that has been assigned to the machine. Additionally, a machine can be specified more than once; this will launch that many processes on that machine. For convenience, renderdl also accepts a text file describing the list of hosts:
% renderdl -hosts workers.txt,leto -id frame1.rib
This will read all the machines listed in the file `workers.txt' (each host name on a single line) and use those for rendering as well as the `leto' machine.
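The expansion of such a mixed `-hosts' argument can be pictured with the sketch below. How renderdl decides that an entry is a file rather than a host name is not stated here; the example simply assumes that an entry naming an existing file is read as a host list, one name per line. The function name is hypothetical.

```python
import os

def expand_hosts(arg):
    """Expand an argument such as 'workers.txt,leto' into a flat host list."""
    hosts = []
    for entry in arg.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if os.path.isfile(entry):
            # Assumed rule: an existing file holds one host name per line.
            with open(entry) as f:
                hosts.extend(line.strip() for line in f if line.strip())
        else:
            hosts.append(entry)
    return hosts

print(expand_hosts("origin1,leto,harkonnen,localhost"))
```

Duplicates are deliberately kept in the result, since listing a machine more than once is how several processes are requested on that machine.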
It is not required that the current machine is part of the hosts set: one could launch a single processor render on a different machine (e.g. if the current machine is too loaded):
% renderdl -hosts worker -id frame1.rib
Note that different machine architectures can be used to compute a single image: for example, MacOS X and Linux machines can cooperate on the same render.
7.1.4 Performance and Implementation Notes
It is recommended to use multithreading when rendering medium or complex scenes on a local machine. The number of threads should be set to the number of physical processors available. Some "hyperthreaded" machines show two processors even if there is only one physical CPU installed on the motherboard; using two processes on those architectures won't necessarily improve performance (and certainly won't double it). It is sometimes desirable to set one more thread than there are processors when the rendering generates a lot of IO (such as texture or network access): having one more thread will give the operating system the opportunity to switch to an alternate rendering thread while another one is waiting for an IO operation to complete.
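These recommendations can be condensed into a small helper. The sketch takes the number of physical processors as an input, because on hyperthreaded machines the operating system typically reports logical processors instead; the function names are hypothetical, and only the `-p' option described earlier is used.

```python
def recommended_threads(physical_cpus, io_heavy=False):
    """One thread per physical processor, plus one when the render
    does a lot of IO (texture or network access)."""
    if physical_cpus < 1:
        raise ValueError("need at least one processor")
    return physical_cpus + (1 if io_heavy else 0)

def renderdl_command(rib, physical_cpus, io_heavy=False):
    """Build the renderdl argv with an explicit thread count."""
    n = recommended_threads(physical_cpus, io_heavy)
    return ["renderdl", "-p", str(n), rib]

print(" ".join(renderdl_command("frame1.rib", 2, io_heavy=True)))
```

For a machine with two physical processors and a texture-heavy scene, this produces the same command line as the `-p 3' example shown earlier.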
Using multiprocessing is recommended for scenes where shading time is not significant and most processing is spent in visibility computations. This usually happens in scenes with very simple shaders. Normal production scenes force the renderer to spend much more time in the shading stage (especially when using ray tracing) and multithreading is the right choice for these.
When rendering over a network, 3Delight uses the TCP/IP protocol to gather data from the remote hosts. This could put a heavy load on the network and on the machine that starts the render. When using a large pool of machines, one should make sure that the master machine is fast enough to handle all connections. This is particularly important when rendering deep shadow maps over the network.
There is a fundamental design difference between multiprocessing and multithreading: multiprocessing is implemented in the renderdl executable while multithreading is implemented in the 3Delight library. While this doesn't make any difference for users that render RIBs using the renderdl command, users that link with the 3Delight library can only use multithreading.
7.1.5 Licensing Behavior
In the multithreading case, the behavior is as follows:
- If enough licenses are available, the render will proceed normally.
- If there are fewer licenses available than requested threads, 3Delight will start the render with as many threads as possible and will dynamically add threads as more licenses are made available.
- If there are no licenses available, 3Delight will wait for at least one license and will automatically start the render. Threads will be added dynamically as more licenses are available.
In the multiprocessing case, each launched process waits for a license to be available.
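The multithreading rules can be summarized in a few lines. The sketch below is a model of the behavior described above, not 3Delight code: given the number of requested threads and the number of licenses the render could hold at successive moments, it reports how many threads would be running at each step. The names are hypothetical, and the model assumes that threads, once started, keep running.

```python
def running_threads(requested, licenses_over_time):
    """Return the number of active threads after each observation.
    A value of 0 means the render is still waiting for a first license."""
    active = 0
    history = []
    for available in licenses_over_time:
        # Threads are only ever added, up to the requested count
        # (assumption: a started thread is never taken away).
        active = max(active, min(requested, available))
        history.append(active)
    return history

print(running_threads(4, [0, 1, 3, 8]))
```

In this run the render waits at first, starts with one thread, grows to three, and finally reaches the four requested threads once enough licenses are free.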
3Delight 8.5. Copyright 2000-2009 The 3Delight Team. All Rights Reserved.