Background

There is a need to detect defects in parcels at the e-commerce picking stations for the following reasons:

  • Proof of quality in customer complaint processes
  • Stopping damaged packages from being shipped out

A customer has asked about using the Aivero software service to capture, transport and store 2D and 3D images and video, with support for the most popular AI annotation tools to enable live inference with trained models.

Aivero has helped set up a demo station running the Aivero software (client + server) with an edge device and a RealSense D455 camera. From a software point of view, the demo station is working well now that the following feature has been added (as of March 23rd, 2023):

  • Immediate triggering of recordings.

Scope of this report

The next stage of the project is to find a suitable camera that delivers sufficient quality to detect damage and defects on the handled goods.

Camera candidates

We have reviewed the following six popular AI/3D cameras:

  • Mechmind Pro (5,000 USD)
  • Zivid Camera (7,000 USD)
  • Intel RealSense D405 (300 USD)
  • Intel RealSense D455 (400 USD)
  • Zed 2 (450 USD)
  • Microsoft Azure Kinect (700 USD)

Only the first three cameras above are presented in this report, because they produced the best results:

  • Mechmind Pro
  • Zivid camera (Zivid 1 and 2 have almost identical quality)
  • RealSense D405

Test sample

One test sample was selected: a TP-Link router box.

Front

Back

Defect/damage categories

We tested two categories of damage applied to the parcel:

Category 1: three dent classes in the package, with increasing depth and damage impact:

Category 2: slight damage to the plastic foil covering the parcel:

Camera test

For every test, the lighting and camera settings were carefully tuned to get the best possible quality from each camera.

Mechmind Pro Enhanced Camera
Defect category 1

Front view:

Side view:

We could easily detect all three dent classes:

Defect category 2

Front view:

Side view:

The plastic foil damage is only slightly visible in the point cloud and would likely be hard to detect:

Zivid Camera
Defect category 1

Front view:

Side view:

All dent classes are visible.

*The normal depth-only view in the Zivid app proves helpful:

The three defects are very easy to detect with this filter. In addition, minor defects are visible at the top left.

Defect category 2

Front view:

Side view:

The plastic foil has some visible signs of damage, which could be challenging to detect.

*Standard depth view (depth only) shows the damage better than the point cloud:

Intel RealSense D405
Defect category 1

Front view:

Side view:

The D405 has an impressive point cloud quality for its cost. Given the right lighting conditions and a position around 8-10 cm above the object (which could fit well alongside a gripper), it is only a little behind the Zivid. The high FPS makes it possible to select the best frames for analysis, so the robot arm does not need to stop to take pictures.

Defect category 2

Front view:

This camera generates a much more precise point cloud than other RealSense cameras when moved closer to objects. The images below were taken from a distance of approximately 9 cm.

Side view:

The D405 shows the category 2 defects better than the Mechmind and the Zivid cameras in the point cloud view.

Training neural networks

The intention is to train neural networks to detect damage from 3D images, 2D images, or both. One could experiment with a combination of neural networks trained on 2D and 3D images to find the most reliable way to detect defects.

Further work is needed to conclude on the best approach for detection. The training is likely to be less complex with 3D cameras, since for boxes a flat surface serves as the reference and only deviations from it need to be detected.
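The flat-surface idea can be illustrated without any neural network. The following is a minimal sketch, not the method used in these tests: it fits a plane to the measured points by least squares and reports the points that deviate from it by more than a tolerance. The function names, the millimetre units and the tolerance value are assumptions made for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points."""
    n = len(points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations of the least-squares problem, solved with Cramer's rule.
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(m)
    if d == 0:
        raise ValueError("points are collinear; the plane is not defined")
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)


def find_deviations(points, tol_mm):
    """Return (x, y, residual) for points further than tol_mm from the fitted plane."""
    a, b, c = fit_plane(points)
    flagged = []
    for x, y, z in points:
        residual = z - (a * x + b * y + c)
        if abs(residual) > tol_mm:
            flagged.append((x, y, residual))
    return flagged


# Synthetic example (assumed values): a slightly tilted box surface about 90 mm
# from the camera, sampled every 10 mm, with one point pushed 3 mm deeper.
surface = [(10.0 * i, 10.0 * j, 90.0 + 0.5 * i) for i in range(10) for j in range(10)]
surface = [(x, y, z + 3.0) if (x, y) == (40.0, 50.0) else (x, y, z)
           for x, y, z in surface]
dents = find_deviations(surface, tol_mm=1.0)
```

A plain least-squares fit is pulled slightly towards large dents, so a real pipeline would probably need a robust fit (for example RANSAC) and a tolerance matched to the depth noise of the camera.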

Conclusion

There is still a compromise to make when selecting a 3D camera. A camera with a high-quality point cloud comes with a low framerate and a high purchase cost. A high-framerate 3D camera, on the other hand, usually comes at a lower price but with a significantly lower-quality point cloud.

The high-quality point cloud cameras (notably the Zivid camera) showed both defect category 1 and category 2 clearly, so an AI would have a high probability of detecting and classifying them. The downside of these cameras is a slow framerate (which will impact cycle time) and a high purchase price.

We tested the complete RealSense range of cameras and found that the D405 gains point cloud precision when operated close to the object. At 8-10 cm from the observed object, the camera could show dent classes 2 and 3 of category 1 as well as the category 2 defects. This makes it suitable for on-arm operation alongside a gripper.

Another benefit of the D405 is its framerate of 30 FPS, which makes it easy to capture video and pick frames. The robot does not need to stop to take pictures, so this camera will likely not impact cycle time.
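Picking frames from a video stream could work roughly as follows. This is a minimal sketch under stated assumptions, not part of the Aivero software: each depth frame is assumed to be a grid of integer depth values in which 0 marks a pixel without valid depth, and the best frame is assumed to be the one with the most valid pixels. The function names are hypothetical.

```python
def valid_fraction(depth_frame):
    """Share of pixels in a depth frame that carry a valid (non-zero) depth."""
    total = sum(len(row) for row in depth_frame)
    if total == 0:
        return 0.0
    valid = sum(1 for row in depth_frame for depth in row if depth > 0)
    return valid / total


def pick_best_frame(frames):
    """Return the index of the frame with the highest share of valid depth pixels."""
    if not frames:
        raise ValueError("no frames to choose from")
    return max(range(len(frames)), key=lambda i: valid_fraction(frames[i]))


# Three tiny 2x2 frames (assumed values); the second one has no missing depth.
frames = [
    [[0, 0], [90, 91]],
    [[90, 91], [92, 90]],
    [[0, 90], [91, 92]],
]
best = pick_best_frame(frames)
```

Other quality scores, such as image sharpness or how well the parcel is centred in the frame, could be substituted for the valid-pixel share without changing the selection logic.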

A benefit of a depth camera is that it will likely not require a lot of training, since it only needs to look for deviations from a flat surface. This applies only to packages with flat surfaces.

The Aivero software supports capture, streaming and recording of both 2D and 3D frames and video, and it has interfaces to widely used annotation and inference tools, so it is flexible enough to accommodate this application. Because of its low cost, low weight and small size, we recommend further testing with the D405 to determine whether it suits the customer’s picking stations. Combining it with a 2D camera could also be an option to increase flexibility for training, but would add complexity in terms of hardware and network training.