The Cartesian Interface
Ugo Pattacini


The YARP Cartesian Interface has been conceived in order to enable the control of the arms and the legs of the robot directly in the operational space, that is instead of commanding new configuration for the joints, user can require for instance the arm to reach a specified pose in the Cartesian space, expressed as a combination of a 3d point to be attained by the end-effector (the center of the palm in this case) along with the desired orientation given in axis-angle representation.

Before proceeding, the reader might find helpful to recap some motor control fundamentals (see here) as well as to be introduced to the underneath topic of forward and inverse kinematics by going through the iKin documentation and tutorials.

The architecture of the Cartesian Interface is based on three components: 1) a complex nonlinear optimizer that handles the inverse kinematics problem; 2) a biologically inspired controller that represents the server part of the interface and is responsible for generating the human-like velocity profiles to move the system from its initial state to the joints configuration found by the solver; 3) a client part that exposes a collection of C++ methods to abstract the interface from its specific implementation.
Alternatively, the client can act as a pure wrapper over the solver coping with the inverse kinematics, whereas the user might consider developing his own controller meeting some specific requirements. However, also in this configuration, the server part needs to be launched (even if not employed) as a gateway to let the client communicate with the solver.

If you're going to use this controller for your work, please quote it within any resulting publication: U. Pattacini, F. Nori, L. Natale, G. Metta, G. Sandini, "An Experimental Evaluation of a Novel Minimum-Jerk Cartesian Controller for Humanoid Robots", IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, 2010.


In order to use the Cartesian Interface, make sure that the following steps are done:
[Note that the term cluster refers to the set of computers directly connected to the robot network, whereas icub-head indicates the hub board mounted on the robot]

  1. Install the required software dependencies.
  2. On the cluster: compile the repository with the switch ENABLE_icubmod_cartesiancontrollerclient enabled. This will make the client part of the interface available.
  3. On icub-head: compile the repository with the switch ENABLE_icubmod_cartesiancontrollerserver enabled. This will make the server part of the interface available on the hub.

iCubStartup Application Launcher
Henceforth rely on the installed copy of $ICUB_ROOT/main/app/iCubStartup/scripts/iCubStartup.xml.template application to launch both the Cartesian solvers (and other useful tools as well).
The Cartesian Solvers are required to invert the kinematics using Ipopt and they are meant to run on the cluster to avoid overloading the icub-head core. Make sure that the machines where the solvers will be running are configured properly to work with the specific robot (e.g. the YARP_ROBOT_NAME correctly points to the name of the robot) in order to load at start-up the robot dependent kinematics.
The solvers need the robotInterface to connect to the robot to get useful data to operate (e.g. the joints bounds) and also the robotInterface needs them to open the server side of the Cartesian Interface. Therefore the chicken-egg problem is solved by applying proper connection timeout to the solvers and to keep them as close as possible to the robotInterface by using the application. Moreover, once everything is launched from within the application, if the user accidentally stops one of these modules, it turns to be mandatory to restart all of them from the beginning; this comes from the requirement to keep the communication protocol as light as possible, avoiding requests for integrity check and reinitialization.

Opening and Closing the Cartesian Interface

The Cartesian Interface can be opened as a normal YARP interface resorting to the PolyDriver class:

Property option;
PolyDriver clientCartCtrl(option);
ICartesianControl *icart=NULL;
if (clientCartCtrl.isValid()) {

As you might have noticed, the stem-name of the Cartesian server controller (taken as standard) is "/<robot_name>/cartesianController/<part_name>". Thus, similarly to a usual motor controller, a state port streaming out the Cartesian pose of the limb's end-effector is also available whose name is "/<robot_name>/cartesianController/<part_name>/state:o". As result you can do:

yarp read /read /icub/cartesianController/right_arm/state:o
bool read(yarp::os::Searchable &cfgtotal, pc104Data &pc104data)
Definition: ethParser.cpp:92
Copyright (C) 2008 RobotCub Consortium.

getting the current pose of the right hand end-effector expressed with seven double \( \left[x,y,z,a_x,a_y,a_z,\theta\right] \), where the first three components are the 3d location of the center of the palm, whilst the final four describe the rotation of the frame attached to the end-effector with respect to the root reference frame in axis/angle notation (see Coordinate System).

When you are done with controlling the robot you can explicitly close the device (or just let the destructor do it for you):


Coordinate System

Central for a correct usage of the Cartesian controller is the knowledge of the coordinate system that shall be adopted to describe any desired limb pose.

Positions (in meters) and orientation refer to the root frame attached to the waist as in the documentation.
The root x-axis points backward, the y-axis points rightward, while the z-axis points upward.

To specify a target orientation of the end-effector, you have to provide three components for the rotation axis (whose norm is 1) and a fourth component for the rotation angle expressed in radians as required by the axis/angle notation (see here).
The axis/angle format compacts the notation and protects from some singularities that might appear when the Euler angles are adopted, but obviously one can still use his most preferable representation since there exist transformation formulas from one domain to another (e.g. look at yarp::math::dcm2axis or yarp::math::axis2dcm documentation).

We want to represent the orientation of the right hand when the arm is in the rest position (the torso and arm joints are zeroed), for which the x-axis attached to the center of the palm points downward, the y-axis points backward, while the z-axis points leftward. Hence, to transform the root reference frame in the end-effector frame, one solution is to rotate of pi/2 around the y-axis and then of -pi/2 around the z-axis.

In formulas:

Vector oy(4), oz(4);
oy[0]=0.0; oy[1]=1.0; oy[2]=0.0; oy[3]=+M_PI/2.0;
oz[0]=0.0; oz[1]=0.0; oz[2]=1.0; oz[3]=-M_PI/2.0;
Matrix Ry=yarp::math::axis2dcm(oy); // from axis/angle to rotation matrix notation
Matrix Rz=yarp::math::axis2dcm(oz);
Matrix R=Rz*Ry; // compose the two rotations keeping the order
Vector o=yarp::math::dcm2axis(R); // from rotation matrix back to the axis/angle notation
#define M_PI
Definition: XSensMTx.cpp:24

The outcome is: o = [0.57735 0.57735 -0.57735 2.094395].
Alternatively, one can specify directly the matrix R as a collection of column vectors representing the axes of the goal pose expressed in the root reference frame:

Matrix R(3,3);
// pose x-axis y-axis z-axis
R(0,0)= 0.0; R(0,1)= 1.0; R(0,2)= 0.0; // x-coordinate in root frame
R(1,0)= 0.0; R(1,1)= 0.0; R(1,2)=-1.0; // y-coordinate "
R(2,0)=-1.0; R(2,1)= 0.0; R(2,2)= 0.0; // z-coordinate "
Vector o=yarp::math::dcm2axis(R);

Using the Cartesian Interface

The YARP documentation contains a full description of Cartesian Interface. Moreover, some examples are also reported hereafter to gain a deeper insight into the Cartesian device.

Imagine that we want to get the actual arm pose from within our code. What we have to write is just:

Vector x0,o0;

The current translational position is returned in x0 while o0 contains now the current orientation.

Then we want to move the hand a bit farther from the body, let's say just 10 cm away along -x, keeping the orientation constant:

Vector xd=x0; xd[0]+=-0.1;
Vector od=o0;
icart->goToPose(xd,od); // send request and forget

The goToPose() method does not wait for any acknowledgement that the server has received the request, but it just lets the client module go straight to the next instruction in the code flow. Therefore, the intended use of this method is for streaming input (e.g. while tracking).

If we would like to test whether the new pose has been achieved, it's as follows:

icart->goToPoseSync(xd,od); // send request and wait for reply
bool done=false;
while (!done) {
Time::delay(0.04); // or any suitable delay
bool done
Definition: main.cpp:42

The same can be achieved also by the following streamlined snippet of code:

icart->goToPoseSync(xd,od); // send request and wait for reply
icart->waitMotionDone(0.04); // wait until the motion is done and ping at each 0.04 seconds

Here the method goToPoseSync() guarantees that the server has received the request and initiated the movement before allowing the code flow to proceed. If we had used the analogous non-sync method, then the result of first check would have been unpredictable.

To tune the trajectory execution time for point-to-point movements, you can call the proper function:

icart->setTrajTime(1.5); // given in seconds

To let the controller determine when to declare the movement accomplished, there is a dedicated tolerance acting both at the level of position and orientation errors. To set this tolerance you can write the following:


This means that the movement will be intended finished whenever norm(xd-x)<0.001, where x comprises the position and the orientation of the end-effector.

While moving to a target pose, you can query the controller to find out which will be the final joints configuration as determined by the solver:

Vector xdhat, odhat, qdhat;

Of course due to the optimization process \( \hat{x}_d \) differs from the commanded \( x_d \), so as \( \hat{o}_d \) from \( o_d \); \( \hat{q}_d \) is such that \( \left[\hat{x}_d,\hat{o}_d\right]=K\left(\hat{q}_d\right) \), where \( K\left(\cdot\right) \) is the forward kinematic map.
By contrast, if you want to solve for a given pose without actually moving there, you can use the ask-based methods as follows:

Vector xd(3);
Vector xdhat, odhat, qdhat;

The ask-based methods such as askForPosition() and askForPose() allow the user to employ the Cartesian Interface just as a simple layer that wraps around the inverse kinematics solver.

One useful feature is the possibility to enable/disable some degrees of freedom on-line. For example, the arm comes with 10 possible DOF's (the torso is included) that the user may or may not want to use at the same time while reaching.
In the following snippet of code we ask to enable the torso pitch and yaw joints (which are off by default).

Vector curDof;
cout<<"["<<curDof.toString()<<"]"<<endl; // [0 0 0 1 1 1 1 1 1 1] will be printed out
Vector newDof(3);
newDof[0]=1; // torso pitch: 1 => enable
newDof[1]=2; // torso roll: 2 => skip
newDof[2]=1; // torso yaw: 1 => enable
cout<<"["<<curDof.toString()<<"]"<<endl; // [1 0 1 1 1 1 1 1 1 1] will be printed out

To sum up, a value of '1' makes the joint a real degree of freedom, a value of '0' switches it off, while the special value of '2' indicates that the joint status won't be modified (to skip it and proceed forward).

Two things deserve your attention:

  • The torso joints order is the one defined by the kinematic description available from the wiki: [pitch,roll,yaw]. Usually the motor control interfaces stream the torso joints in the reversed order (i.e. [yaw,roll,pitch]) since they assume a different kinematic origin (the base of the neck in this case).
  • The shoulder joints are considered all together to be a super-joint (in order to take into account the shoulder's cable length issue) so that you are not allowed to enable/disable one of them differently from the others.

Furthermore, as you can easily figure out, the torso joints are shared by both arms. This tells us to take care of a misuse of the Cartesian device that might arise when one arm is controlled simultaneously to the other one: if this happens, a supervisor that enables/disables the torso joints according to the needs should be put in place.

When the torso is being controlled by an external module, there exists also the possibility for the Cartesian device to counteract to the induced movements of the uncontrolled joints in order to maintain the final end-effector pose stable.
To enable this feature one should call the proper method:


Usually the Cartesian device brings up in non-tracking mode: this means that once the limb has reached the desired pose, the controller gets disconnected until the next command is yielded and the limb can be moved by other external agents.
Conversely, in tracking mode the controller stays connected and tries to compensate for externally induced movements on the uncontrolled joints (that are seen as disturbs on the actual end-effector position) such as the torso ones (in case they have been previously disabled).

Another advanced feature you can benefit from is the chance to change dynamically the rest position the solver uses to specify a secondary task while converging to the solution.
Briefly (a detailed description is given in the solver page), the Cartesian solver implements a nonlinear constrained optimization that can be described in this way:

\[ \begin{array}{c} \mathbf{q}=\arg\min\limits_{\mathbf{q} \in \mathbb{R}^{n} }\left(\left\|\mathbf{\alpha}_d-\mathit{K_{\alpha}}\left(\mathbf{q}\right)\right\|^2+\mathit{w}\cdot\left(\mathbf{q}_r-\mathbf{q}\right)^{\top}\mathit{W}_r\left(\mathbf{q}_r-\mathbf{q}\right)\right) \\ \text{s.t.} \begin{cases} \left\|\mathbf{x}_d-\mathit{K_x}\left(\mathbf{q}\right)\right\|^2=0 \\ \mathbf{q}_L<\mathbf{q}<\mathbf{q}_U \end{cases} \end{array}. \]

The term \( \left(\mathbf{q}_r-\mathbf{q}\right)^{\top}\mathit{W}_r\left(\mathbf{q}_r-\mathbf{q}\right) \) represents the aforementioned secondary task, where \( \mathbf{q}_r \) is used to keep the solution as close as possible to a given rest position in the joint space, weighting component by component with the diagonal matrix \( \mathit{W}_r \).
Thereby, to access the elements of \( \mathbf{q}_r \) and \( \mathit{W}_r \) one should call the following methods:

Vector curRestPos;
cout<<"["<<curRestPos.toString()<<"]"<<endl; // [0 0 0 0 0 0 0 0 0 0] will be printed out
Vector curRestWeights;
cout<<"["<<curRestWeights.toString()<<"]"<<endl; // [1 1 1 0 0 0 0 0 0 0] will be printed out

Above we've asked for the default configuration of the secondary task which tries to minimize against the rest position (0,0,0) that considers only the torso joints in the computation. This prevents the robot from leaning out too much if the target position can be reached easily with just the arm.

Nonlinear Problem Design

In this small section a couple of considerations that motivated the design of the inversion problem are briefly summarized as taken from the article on the Cartesian controller and perhaps more importantly as they came up from discussions with some meticulous users.

One of the main aspects of the nonlinear problem as handled by the Cartesian solver is that the reaching for position is a constraint, whereas the reaching in orientation appears directly in the cost function. As result, the former is treated with higher priority with respect to the latter to reflect the requirement of achieving the task in a very localized zone of the space. For example, this particular preference is of major interest for grasping, where having an hand perfectly oriented with respect to the object to be grasped but with a position of even a couple of centimeters away from the target will cause the task to fail much more frequently than having the hand perfectly located in position and with even ten degrees out of the alignment. One behavior emerging from this design that the user may experience effortlessly is the following: when a out-of-range position is commanded together with a given orientation, it turns that the robot will stretch its body up to its maximum extension (depending on the parts currently enabled) almost irrespective of the final orientation. As consequence, bear always in mind to enable the torso to enlarge as much as possible the dexterous space of the iCub.

Conversely, if for any reason the orientation is meant to be handled with higher priority with respect to position, user can then declare the reaching in orientation as a contraint and the reaching in position as an objective, respectively. This swap is performed by the following call:


To switch back to the normal configuration where position is again predominant:


Context Switch

We define here the context as the configuration in which the controller operates: therefore the context includes the actual number of DOF's, their angular bounds, the resting positions and weights, the trajectory execution time, the current tracking mode, the pose priority and so on. Obviously the controller performs the same action differently depending on the current context. A way to easily switch among contexts is to rely on storeContext() and restoreContext() methods as follows:

icart->setDOF(newDof1,curDof1); // here the user prepares the context
int context_0;
icart->storeContext(&context_0); // latch the context
icart->setDOF(newDof2,curDof2); // at certain point the user may want the controller to
icart->goToPose(x,o); // perform some actions with different configuration
icart->restoreContext(context_0); // ... and then retrieve the stored context_0
icart->goToPose(x,o); // now the controller performs the same action but within the context_0

Unless the user needs the interface just for logging purposes, it's a good rule to store the context at the initialization of his module in order to then restore it at releasing time to preserve the controller configuration.
Note that the special context tagged with the id 0 is reserved by the system to let the user restore the start-up configuration of the controller at any time.

Task-Space Reference Velocity Command

The user is further provided with the method setTaskVelocities() that allows commanding a reference velocity for the end-effector in the operational space as follows:

Vector xdot(3); // move the end-effector along y-axis at specified velocity
xdot[0]=0.0; // [m/s]
Vector odot(4); // no rotation is required
odot=0.0; // [rad/s]

The solver still plays a meaningful role in this command mode so that the whole set of constraints active in the inverse kinematics problem are taken into account. Moreover, the task-space reference velocity will be tracked according to the reactivity level of the internal controller which is basically determined by the current trajectory time; thus, for executing high task-space velocity the trajectory time should be tuned towards lower values, whereas it should be increased to track low task-space reference velocity.

Defining a Different Effector

The Cartesian controller can be also instructed to control a different effector that can be suitably expressed with respect to the standard end-effector reference frame. This is thus equivalent to rigidly attaching a tip to the standard end-effector. Therefore, to achieve such extension, it is required to call the attachTipFrame() method of the interface as done in the following example, where the aim is to control the tip of the index finger rather than the center of the palm:

IEncoders *iencs;
int nEncs;
Vector encs(nEncs);
Vector joints;
iCubFinger finger("right_index"); // relevant code to get the position of the finger tip
finger.getChainJoints(encs,joints); // wrt the end-effector frame
Matrix tipFrame=finger.getH((M_PI/180.0)*joints);
Vector tip_x=tipFrame.getCol(3);
Vector tip_o=yarp::math::dcm2axis(tipFrame);
icart->attachTipFrame(tip_x,tip_o); // establish the new controlled frame
icart->getPose(x,o); // now we get the pose of the finger tip frame
icart->goToPose(xd,od); // so as the target will be attained with the finger tip
icart->removeTipFrame(); // to restore the center of the palm as actual end-effector

Importantly, the tip frame is part of the current context, hence it might happen that several tip frames are attached or removed as result of the user context switching requests.

Events Callbacks

The Cartesian Interface provides also an easy way to register events callbacks. For example, an event might be represented by the onset of the movement when the controller gets activated upon the reception of a new target or the end of the movement itself. The user can then attach a callback to any event generated by the interface.
It is required to inherit from the specific class CartesianEvent that handles events and overrides the cartesianEventCallback() method.

class MotionDoneEvent : public CartesianEvent
cartesianEventParameters.type="motion-done"; // select here the event type we want to listen to
virtual void cartesianEventCallback()
cout<<"motion complete"<<endl;
MotionDoneEvent event;
icart->registerEvent(event); // the tag "motion-done" identifies the event to be notified
icart->goToPosition(x); // as soon as the motion is complete, the callback will print out the message

To know which events are available for notification, the user can do:

Bottle info;

The class CartesianEvent contains two structures: the cartesianEventParameters and the cartesianEventVariables. The former has to be filled by the user to set up the event details, whereas the latter is filled directly by the event handler in order to provide information to the callback.
For instance, to raise a callback at the middle point of the path, one can do:

class MotionMiddleEvent : public CartesianEvent
cartesianEventParameters.motionOngoingCheckPoint=0.5; // middle point is at 50% of the path
virtual void cartesianEventCallback()
cout<<"attained the "<<100.0*cartesianEventVariables.motionOngoingCheckPoint;
cout<<"% of the path"<<endl;

The special wildcard "*" can be used to assign a callback to any event, regardless of its type, as done below.

class GeneralEvent : public CartesianEvent
virtual void cartesianEventCallback()
cout<<"event of type: "<<cartesianEventVariables.type.c_str()<<endl;
cout<<"happened at: "<<cartesianEventVariables.time<< "[s] (source time)"<<endl;