iCub-main
The YARP Gaze Interface provides an abstract layer to control the iCub gaze in a bio-plausible way, moving the neck and the eyes independently and performing saccades, pursuit, vergence, OCR (oculo-collic reflex), VOR (vestibulo-ocular reflex) and gaze stabilization relying on inertial data.
Essentially, the interface exports all the functionality provided by the iKinGazeCtrl module directly from within the code, without the user having to deal with the communication protocol over YARP ports. The user is therefore warmly invited to visit the iKinGazeCtrl page, where more detailed descriptions can be found.
In order to use the Gaze Interface, make sure that the following steps are done:
Launch the iKinGazeCtrl module (the gaze controller server).
The server will load its parameters from the default configuration file (which can be overridden through the context and from command-line options); this file contains the camera intrinsic parameters, which are required by some of the methods provided by the Gaze Interface, specifically lookAtMonoPixel() and lookAtStereoPixels().
Furthermore, by using the installed copy of the $ICUB_ROOT/main/app/iCubStartup/scripts/iCubStartup.xml application, the user can also launch the gaze server together with the iCubInterface module.
The Gaze Interface can be opened as a normal YARP interface resorting to the PolyDriver class:
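A minimal sketch of the opening phase; the remote/local port names are illustrative, and the remaining snippets on this page assume the igaze pointer obtained here:

```cpp
#include <yarp/os/Network.h>
#include <yarp/os/Property.h>
#include <yarp/dev/PolyDriver.h>
#include <yarp/dev/GazeControl.h>

using namespace yarp::os;
using namespace yarp::dev;

int main()
{
    Network yarp;                               // initialize the YARP network

    Property option;
    option.put("device","gazecontrollerclient");
    option.put("remote","/iKinGazeCtrl");       // prefix of the server ports
    option.put("local","/client/gaze");         // prefix of our local client ports

    PolyDriver clientGazeCtrl(option);

    IGazeControl *igaze=nullptr;
    if (clientGazeCtrl.isValid())
        clientGazeCtrl.view(igaze);             // retrieve the IGazeControl view

    // ... use igaze here ...

    return 0;
}
```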
When you are done with controlling the robot you can explicitly close the device (or just let the destructor do it for you):
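For instance, with the clientGazeCtrl driver opened above:

```cpp
clientGazeCtrl.close();     // shut down the client; the destructor would do it anyway
```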
The YARP documentation contains a full description of the Gaze Interface. It makes use of three different coordinate systems that enable the user to control the robot's gaze, as detailed hereafter.
This coordinate system is the same as the one the Cartesian Interface relies on and complies with its documentation; it lets the user specify the target location to gaze at with respect to the root reference frame attached to the robot's waist.
The following snippet of code shows how to command the gaze in the Cartesian space and to retrieve the current configuration.
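A possible sketch, assuming the igaze pointer obtained above (coordinates in meters, expressed in the root frame; the values are illustrative):

```cpp
#include <yarp/sig/Vector.h>
using yarp::sig::Vector;

Vector fp(3);
fp[0]=-0.50;                        // x: half a meter in front of the robot
fp[1]=+0.00;                        // y
fp[2]=+0.35;                        // z: roughly at eye height
igaze->lookAtFixationPoint(fp);     // ask the gaze to reach the 3D point
igaze->waitMotionDone();            // wait (sync) until the movement is over

Vector x;
igaze->getFixationPoint(x);         // retrieve the current fixation point
```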
Sync and non-sync methods for yielding gaze movements are available here as well; we therefore refer the reader to the Cartesian Interface for insights into their peculiarities.
This coordinate system is an absolute head-centered angular reference frame whose coordinates are the azimuth, elevation and vergence angles, expressed in degrees.
For instance, to specify a desired location where to look at in this coordinate system, one can write:
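For example, a sketch with illustrative angles, ordered as azimuth, elevation, vergence:

```cpp
Vector ang(3);
ang[0]=+10.0;                   // azimuth   [deg]
ang[1]=-5.0;                    // elevation [deg]
ang[2]=+20.0;                   // vergence  [deg]
igaze->lookAtAbsAngles(ang);    // move the gaze to the absolute angular position
```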
A relative angular coordinate system is also available: it is basically the same as the absolute one, but refers to the current configuration of the head-centered frame. Hence we can use:
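For instance, reusing the ang vector above:

```cpp
igaze->lookAtRelAngles(ang);    // displace the gaze by the given angles
                                // with respect to the current configuration
```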
The user can also command the gaze by specifying a location within the image plane of one camera as follows:
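A sketch of the call (the pixel coordinates and the distance z are illustrative):

```cpp
int camSel=0;                           // 0: left camera, 1: right camera
Vector px(2);
px[0]=160.0;                            // pixel u-coordinate
px[1]=120.0;                            // pixel v-coordinate
double z=1.0;                           // distance from the image plane [m]
igaze->lookAtMonoPixel(camSel,px,z);    // gaze at the pixel at the given depth
```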
Internally, the interface retrieves the corresponding 3D point through a call to get3DPoint() and then uses this result to command an equivalent Cartesian movement, as in the following:
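Roughly, under the assumptions of the previous snippet:

```cpp
Vector x;
igaze->get3DPoint(camSel,px,z,x);   // pixel + distance -> 3D point in the root frame
igaze->lookAtFixationPoint(x);      // then gaze at the resulting 3D point
```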
The benefit of the lookAtMonoPixel() method is clearly that the time for the RPC communication is saved, hence it is particularly suited for control in streaming mode; on the other hand, the knowledge of the Cartesian point remains hidden from the user, unless it is retrieved at the end of the motion through a call to getFixationPoint().
Alternatively, one can rely on the vergence angle, given in degrees, in place of the component z, which accounts for a different way of expressing distances from the eyes. To this end, the user can call the dedicated method lookAtMonoPixelWithVergence(camSel,px,ver).
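For instance (the vergence value is illustrative):

```cpp
double ver=30.0;                                    // desired vergence angle [deg]
igaze->lookAtMonoPixelWithVergence(camSel,px,ver);
```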
The same thing can be achieved also by exploiting the stereo vision:
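A sketch; the pixel pair is illustrative and must come from the user's own matching:

```cpp
Vector pxl(2), pxr(2);
pxl[0]=170.0; pxl[1]=120.0;         // pixel in the left image plane
pxr[0]=150.0; pxr[1]=120.0;         // matching pixel in the right image plane
igaze->lookAtStereoPixels(pxl,pxr);
```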
Of course, the matching problem between pixels of the two image planes is left to the user, who also has to provide continuous visual feedback while converging to the target.
It might sometimes be useful to perform a homography in order to retrieve the projection of a pixel onto a plane specified in the 3D space: consider, for instance, the case of the robot presented with a number of objects that all lie on a table. The table introduces a constraint that allows determining the component z of the point when the monocular approach is used.
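A sketch relying on the get3DPointOnPlane() method, where the plane is given in the root frame as the 4-vector (a,b,c,d) of ax+by+cz+d=0 (the table values below are illustrative):

```cpp
Vector plane(4);
plane[0]=0.0;                                   // a
plane[1]=0.0;                                   // b
plane[2]=1.0;                                   // c: plane normal along the root z-axis
plane[3]=0.12;                                  // d: so that z = -0.12 m (table height, illustrative)
Vector x;
igaze->get3DPointOnPlane(camSel,px,plane,x);    // intersect the pixel ray with the plane
```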
It is also possible to perform a triangulation to find, from the pixels in the two images, the corresponding 3D point in space; this problem is solved through a least-squares minimization.
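A sketch relying on the triangulate3DPoint() method, reusing the pxl/pxr pixels above:

```cpp
Vector x;
igaze->triangulate3DPoint(pxl,pxr,x);   // least-squares triangulation of the 3D point
```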
Importantly, the triangulation is strongly affected by uncertainties in the alignment of the cameras; therefore, unless these unknowns are perfectly compensated for, it is advisable to rely on the method lookAtStereoPixels() to fixate in stereo mode.
The user has the possibility to enable the fast saccadic mode, which employs the low-level position control in order to generate very fast saccadic movements of the eyes. To achieve that, one can simply do the following:
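A minimal sketch; the switch is assumed here to be the setSaccadesMode() method of IGazeControl:

```cpp
igaze->setSaccadesMode(true);   // enable the fast saccadic mode (assumed method name)
```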
After a saccade is executed, the next saccadic movement can be performed only after a given inhibition period of time, which in turn can be retrieved and/or tuned with dedicated methods (i.e. get/setSaccadesInhibitionPeriod()).
The controller chooses to perform a saccade only if the angular distance of the target from the straight-ahead line exceeds a given threshold, which the user may profitably tune with dedicated methods (i.e. get/setSaccadesActivationAngle()).
Caveat: vision processing algorithms that assume continuity of the image flow might be heavily affected by saccades.
The controller can be instructed to do its best in order to keep the fixation point unvaried under the effect of external disturbances/movements. The gaze stabilization makes use of inertial data to accomplish the job and entails that corrections will be sent to the eyes too (therefore, it is not a mere head stabilization). To enable the gaze stabilization, call the following method:
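A minimal sketch, assuming the setStabilizationMode() switch of IGazeControl:

```cpp
igaze->setStabilizationMode(true);  // enable gaze stabilization (assumed method name)
```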
The stabilization is always active, also during point-to-point motions or while tracking a moving reference, regardless of the current setting of the above mode and unless the stabilization has been purposely disabled (via a command-line option); thus, any disturbance occurring during the motion will be compensated for. In this respect, the difference between the stabilization mode and the tracking mode is that in the former the fixation point is stabilized in the "world" coordinate system, making this mode particularly suitable for robot balancing and walking, while in the latter the fixation point is always tracked in the robot's root reference frame, which might itself be moving with respect to the world.
When the stabilization is active and the robot is purely compensating for external disturbances, the neck and eye limits customized by the user are not taken into account, so the joints may span their whole admissible range.
We define here the context as the configuration in which the controller operates: therefore the context includes the current tracking mode, the eyes and neck trajectory execution time and so on. Obviously the controller performs the same action differently depending on the current context. A way to easily switch among contexts is to rely on storeContext() and restoreContext() methods as follows:
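A sketch of the store/restore pattern, with illustrative configuration changes in between:

```cpp
int context;
igaze->storeContext(&context);      // snapshot of the current controller configuration

// change the configuration at will, e.g.:
igaze->setTrackingMode(true);
igaze->setNeckTrajTime(0.8);        // neck trajectory execution time [s]
igaze->setEyesTrajTime(0.4);        // eyes trajectory execution time [s]

// ... operate in the new context ...

igaze->restoreContext(context);     // go back to the stored configuration
```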
Unless the user needs the interface for logging purposes only, it is good practice to store the context at module initialization and then restore it at release time, in order to preserve the controller configuration.
Note that the special context tagged with the id 0 is reserved by the system to let the user restore the start-up configuration of the controller at any time.
The Gaze Interface also provides an easy way to register event callbacks. For example, an event might be the onset of the movement, when the controller gets activated by receiving a new target, or the end of the movement itself. The user can attach a callback to any event generated by the interface.
The user is required to inherit from the specific class GazeEvent, which handles events, and to override the gazeEventCallback() method.
To know which events are available for notification, the user can do:
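A possible sketch, assuming (by analogy with the Cartesian Interface) that getInfo() reports the available event types under the "events" key:

```cpp
#include <yarp/os/Bottle.h>
#include <cstdio>

Bottle info;
igaze->getInfo(info);
printf("available events: %s\n",
       info.find("events").asList()->toString().c_str());
```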
The class GazeEvent contains two structures: the gazeEventParameters and the gazeEventVariables. The former has to be filled by the user to set up the event details, whereas the latter is filled directly by the event handler in order to provide information to the callback.
For instance, to raise a callback at the middle point of the path, one can do:
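A sketch, assuming the "motion-ongoing" event type together with the motionOngoingCheckPoint field of gazeEventParameters:

```cpp
class MotionMidpointEvent : public GazeEvent
{
public:
    MotionMidpointEvent()
    {
        // ask to be notified when half of the path has been covered
        gazeEventParameters.type="motion-ongoing";
        gazeEventParameters.motionOngoingCheckPoint=0.5;
    }

    void gazeEventCallback() override
    {
        printf("half of the path has been covered\n");
    }
};

MotionMidpointEvent midpointEvent;
igaze->registerEvent(midpointEvent);    // attach the callback to the controller
```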
The special wildcard "*" can be used to assign a callback to any event, regardless of its type, as done below.
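A sketch, assuming gazeEventVariables exposes the type and time of the received event, as described above:

```cpp
class AnyEvent : public GazeEvent
{
public:
    AnyEvent()
    {
        gazeEventParameters.type="*";       // subscribe to every event type
    }

    void gazeEventCallback() override
    {
        // the actual event details are made available in gazeEventVariables
        printf("event \"%s\" received at time %g\n",
               gazeEventVariables.type.c_str(),
               gazeEventVariables.time);
    }
};

AnyEvent anyEvent;
igaze->registerEvent(anyEvent);
```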