25.3 Synthetic data

Capture synthetic data with virtual cameras

Synthetic data can be used to support path planning, object tracking, and visualization in robotic applications.

To capture synthetic data, e.g. localization and geometric information of workpieces, the Wandelbots NOVA extension API enables you to configure camera objects in NVIDIA Isaac Sim™ and capture images.

Data types
Color image
Image normals
Depth
Point cloud
Bounding boxes
Segmentation data

Prerequisites

  1. Make sure that the NVIDIA Isaac Sim™ extension is installed and connected to the RobotPad.
    If the scene is not automatically displayed, click on the 3D view icon and select the NVIDIA Omniverse™ icon.
  2. Open the Wandelbots NOVA extension API (Omniservice API) by clicking on the Wandelbots NOVA tab in the NVIDIA Omniverse™ menu bar.
    You'll need to specify object prim paths for almost every API call. You can find an object's prim path in the NVIDIA Isaac Sim™ scene properties.

Configure camera

  1. Make sure that the scene you're using contains a camera object:
    Use camera > GET /omniservice/camera/ to get an overview of all configured cameras in the scene.
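
If you prefer to script this check, a minimal Wandelscript sketch of the call could look as follows (the structure of the response isn't detailed here, so it is simply printed):

def list_cameras():
    # list all cameras configured in the scene
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/"
    response = fetch(url, {method: "GET"})
    print(string(response.data))

list_cameras()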

You can configure the camera to replicate the real camera used in the physical robot cell based on its intrinsic parameters. Doing so overrides the previously set properties, so if you want to reuse the previous configuration later, retrieve it first with camera > GET /omniservice/camera/{camera_name}.

Some camera manufacturers, e.g. Zivid, provide an SDK that can be used to retrieve the required camera matrix. Alternatively, use the camera's specification sheet or calibration methods to adjust the camera parameters to the desired values.

via API

  1. To change the camera's parameters, set the desired camera as the active camera with camera > POST /omniservice/camera/active (see the sketch after this list).
    The scene view will adjust to the active camera.

  2. Create a .json file containing the desired configuration. Refer to the configuration example below:

    Pinhole camera with a resolution of 1280 x 720:

    {
        "identifier": "<CAMERA_NAME>", // the camera object name, e.g. "pinhole_camera"
        "prim_path": "<CAMERA_PATH>", // the camera's prim path in NVIDIA Isaac Sim, e.g. "/World/Camera"
        "cam_params": {
            "camera_model": "pinhole",
            "resolution": [1280, 720],
        }
    }

    Distortion cannot be set for pinhole cameras.

  3. Use camera > POST /omniservice/camera/ to change the active camera's configuration.
    To verify that the camera was configured correctly, use camera > GET /omniservice/camera/{camera_name}.
    The scene view will adjust to the new camera configuration.
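
The first and last steps can also be scripted. The sketch below sets the active camera and reads back its configuration to verify it; the request body for the active-camera call is an assumption (the camera is assumed to be addressed by its identifier), so check the exact schema in the Omniservice API view:

def set_active_camera(camera_name):
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/active"
    # assumption: the camera is referenced by its identifier in the request body
    fetch(url, {method: "POST", body: {identifier: camera_name}})

def get_camera_config(camera_name):
    # read back the current configuration of the camera
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/" + camera_name
    response = fetch(url, {method: "GET"})
    print(string(response.data))

set_active_camera("camera")
get_camera_config("camera")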

via Wandelscript

Use Wandelscript in RobotPad's Wandelscript editor to configure your camera, e.g. a pinhole camera:

def configure_camera(camera_name):
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/"
    options = {
        method: "POST",
        body: {identifier: camera_name, prim_path: "/World/Camera", cam_params: {camera_model: "pinhole", resolution: [1280, 720]}}
    }
    response = fetch(url, options)
    if response.status_code == 404:
        print(response.data.detail)
 
configure_camera("camera")

Define semantic labels

Once the camera is configured, you have to tell it which objects in the scene are of interest by assigning semantic labels. When capturing point clouds, bounding boxes, and segmentation data, only objects that have been assigned a label are captured.

  1. Use object > GET /omniservice/object/get_all_semantic_labels to get an overview of all existing labels in the scene.
  2. To add a new label, use object > PUT /omniservice/object/set_semantic_label.

Make sure that the labels are correctly assigned to the objects in the scene; otherwise, the captured data might be empty. Read more on semantic labelling.
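
Both calls can be scripted in the same way as the camera configuration. The sketch below assumes the object endpoints live under the same /omniservice prefix, which matches how the full example at the end of this page builds its URLs:

def print_semantic_labels():
    # list all labels that are currently assigned in the scene
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/object/get_all_semantic_labels"
    response = fetch(url, {method: "GET"})
    print(string(response.data))

def set_semantic_label(prim_path, label):
    # prim paths contain "/" characters, which must be URL-encoded as "%2F"
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/object/set_semantic_label?prim_path=" + replace(prim_path, "/", "%2F") + "&label=" + label
    fetch(url, {method: "PUT"})

set_semantic_label("/World/cell/workspace_yaskawa_GP8/workpiece_01", "workpiece")
print_semantic_labels()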

Capture data

Once the camera knows which objects to capture, you can start capturing data. Make sure that the camera has all desired objects in its field of view.
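
All capture endpoints below follow the same pattern: a GET request against the camera's name plus a data type suffix. A minimal sketch of such a call (the structure of the returned data depends on the data type and isn't detailed here):

def capture(camera_name, data_type):
    # data_type is the endpoint suffix of the desired data type,
    # e.g. "color", "normals", "depth" or "pointcloud"
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/" + camera_name + "/" + data_type
    response = fetch(url, {method: "GET"})
    if response.status_code == 404:
        print(response.data.detail)
    return response.data

color_data = capture("camera", "color")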

Color image

Captures an RGB image. The output can be a color image for visualization or raw numerical pixel values for further processing.

camera > GET /omniservice/camera/{camera_name}/color

Image normals

Captures surface normal vectors of objects in the scene, providing directional information for each pixel. The output can be a colorized normal map for visualization or raw numerical normal vectors, useful for shading and physics calculations.

camera > GET /omniservice/camera/{camera_name}/normals

Depth

Captures absolute depth information for each pixel in the camera’s view, representing the distance from the camera to scene objects. The output can be a grayscale depth map for easy visualization or raw depth values for precise scene understanding. The depth values are expressed in scene units (mm, cm, m) as defined in the NVIDIA Isaac Sim™ stage settings.

camera > GET /omniservice/camera/{camera_name}/depth

Point cloud

Captures a 3D spatial representation of the scene by extracting depth, color, and normal data from the camera view. The output consists of point cloud data, including 3D coordinates, surface normals, and color information, enabling applications in 3D space.

To reduce data size, a downscale factor can be applied, e.g. 0.5 to reduce the number of points by 50%.

camera > GET /omniservice/camera/{camera_name}/pointcloud
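
The parameter name for the downscale factor isn't spelled out on this page; the sketch below assumes it is passed as a query parameter called downscale_factor, so treat that name as a placeholder and verify it against the Omniservice API schema:

def capture_pointcloud(camera_name, downscale):
    # NOTE: "downscale_factor" is an assumed parameter name, not confirmed by this page
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/" + camera_name + "/pointcloud?downscale_factor=" + string(downscale)
    response = fetch(url, {method: "GET"})
    return response.data

# keep roughly half of the points
points = capture_pointcloud("camera", 0.5)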

Bounding boxes

To capture bounding box data, all labels configured in the scene are used by default. To capture data for a specific label only, specify the object_class (see the sketch after the endpoints below).

2D

Captures 2D bounding boxes around detected objects, which helps determine an object's location within an image. The output can be a visualized bounding box overlay or numerical coordinates (x_min, y_min, x_max, y_max) for each detected object.

camera > GET /omniservice/camera/{camera_name}/bounding-box-2d

3D

Captures 3D bounding boxes that define object positions, orientations, and dimensions in 3D space. The output can be a colorized bounding box visualization or structured bounding box data (x_min, y_min, x_max, y_max, z_min, z_max).

camera > GET /omniservice/camera/{camera_name}/bounding-box-3d
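
A sketch of a filtered request, assuming object_class is passed as a query parameter (the full example at the end of this page calls the same endpoint without a filter):

def capture_bounding_boxes_2d(camera_name, label):
    # assumption: object_class is passed as a query parameter
    url = "https://<INSTANCE_ADDRESS>:<OMNIVERSE_PORT>/omniservice/camera/" + camera_name + "/bounding-box-2d?object_class=" + label
    response = fetch(url, {method: "GET"})
    return response.data

boxes = capture_bounding_boxes_2d("camera", "workpiece")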

Segmentation data

To capture segmentation data, all labels configured in the scene are used by default. To capture data for a specific label only, specify the object_class.

Instance

Differentiates individual objects in the scene by assigning a unique label to each detected entity.

The output can be a color-coded instance map where each object appears in a unique color or raw numerical instance IDs for precise object tracking.

camera > GET /omniservice/camera/{camera_name}/instance-segmentation

Semantic

Performs semantic segmentation, classifying pixels based on the defined object labels. The output can be a colorized segmentation map where each object class is visually distinguishable, or raw segmentation labels where each pixel corresponds to a numerical class ID.

camera > GET /omniservice/camera/{camera_name}/semantic-segmentation

Use data

Once you've collected the data, you can use it for various applications, such as the pick-and-place scenario below:

# ---------------- Configuration ----------------
BASE_URL = "" # e.g. "http://localhost:8011/omniservice/"
GP8_CONTROLLER = get_controller("gp8")
GP8_ROBOT = GP8_CONTROLLER[0]
GRIPPER = "gripper"
 
# ---------------- Reference positions ----------------
HOME = (380, 0, 300, -2.22, -2.22, 0)
DESTINATION_POSE = (485, 0, -100, 0, 3.14, 0)
POSE_OFFSET = (0, 0, 50, 0, 0, 0)
 
# ---------------- Prims ----------------
WORKPIECE_PATHS = ["/World/cell/workspace_yaskawa_GP8/workpiece_01",
                "/World/cell/workspace_yaskawa_GP8/workpiece_02"]
WORKPIECE_LABEL = "workpiece"
CAMERA_NAME = "camera"
CAMERA_PATH = "/World/Camera"
 
# ---------------- Common Functions ----------------
def encode_path(url):
    # prim paths contain "/" characters, which must be URL-encoded as "%2F" in query parameters
    return replace(url, "/", "%2F")
 
 
# ---------------- Synthetic Data Functions ----------------
def configure_camera(camera_name):
    url = BASE_URL + "camera/"
    options = {
        method: "POST",
        body: {identifier: camera_name, prim_path: "/World/Camera", cam_params: {camera_model: "pinhole", resolution: [1280, 720]}}
    }
    response = fetch(url, options)
    if response.status_code == 404:
        print(response.data.detail)
 
def set_semantic_label(path, label):
    url = BASE_URL+ "object/set_semantic_label?prim_path="+encode_path(path)+"&label="+label
    fetch(url, {method: "PUT"})
 
def get_object_prims(camera_name, label):
    # returns the prim paths of all labelled objects in the camera's field of view;
    # the label argument is not applied here because the call captures boxes for all configured labels
    url = BASE_URL+"camera/" + camera_name + "/bounding-box-2d"
    options = {
        method: "GET",
    }
    boxes_data = fetch(url, options)
    prim_paths = []
    for idx in 0..<len(boxes_data.data):
        prim_paths = prim_paths+[boxes_data.data[idx].prim_path]
 
    return prim_paths
 
def get_pose(path):
    url = BASE_URL+"object/get_pose?prim_path="+encode_path(path)
    pose = fetch(url, {method: "GET"})
    object_pose = pose.data.pose
    return object_pose
 
# ---------------- Tool Functions ----------------
def close_gripper():
    sync
    write(GP8_CONTROLLER, "10010#0001", False)
    write(GP8_CONTROLLER, "10011#0002", True)
    sync
    wait(2000)
 
def open_gripper():
    write(GP8_CONTROLLER, "10010#0001", True)
    write(GP8_CONTROLLER, "10011#0002", False)
    sync
    wait(2000)
 
 
# ---------------- Robot Functions ----------------
def move_robot_to_home():
    tcp= frame(GRIPPER)
    do with GP8_ROBOT:
        velocity(100)
        move tcp via p2p() to HOME
 
 
def pickup_object(pose):
    tcp= frame(GRIPPER)
    do with GP8_ROBOT:
        velocity(100)
        move tcp via p2p() to POSE_OFFSET::pose
        velocity(20)
        move tcp via line() to pose
 
    close_gripper()
    do with GP8_ROBOT:
        velocity(100)
        move tcp via line() to POSE_OFFSET::pose
 
 
def drop_object():
    tcp= frame(GRIPPER)
    do with GP8_ROBOT:
        velocity(100)
        move tcp via p2p() to POSE_OFFSET::DESTINATION_POSE
        move tcp via line() to DESTINATION_POSE
 
    open_gripper()
    do with GP8_ROBOT:
        move tcp via line() to POSE_OFFSET::DESTINATION_POSE
 
 
# ---------------- Main code ----------------
# configure a camera in omniverse
configure_camera(CAMERA_NAME)
move_robot_to_home()
 
# set semantic labels (required only once to assign labels)
for i=0 ..<len(WORKPIECE_PATHS):
    set_semantic_label(WORKPIECE_PATHS[i], WORKPIECE_LABEL)
 
 
# get prims of workpieces in camera field of view
prim_paths = get_object_prims(CAMERA_NAME, WORKPIECE_LABEL)
 
open_gripper()
 
# fetch poses for each prim
for i=0 ..<len(prim_paths):
    pose = get_pose(prim_paths[i])
 
    # move robot to this pose, pick the object, move it to a destination pose and return back home
    obj_pose = (pose[0], pose[1], pose[2], pose[3], pose[4], pose[5])
    print("Picking object " + prim_paths[i] + " at " + string(pose))
 
    pickup_object(obj_pose)
    move_robot_to_home()
    drop_object()
    move_robot_to_home()