Backends

Selecting the backend at assignment

Once an expression object is constructed, evaluation of the expression as a whole is triggered by an assignment. The expression is not necessarily evaluated by the glas expression-template mechanism, however: each expression can be dispatched to a specific backend. A backend is an expression-evaluation engine that may rely on third-party libraries and/or other backends to evaluate the expression. If the chosen backend is unable to evaluate the expression, the dispatch mechanism falls back (at compile time) to the generic glas backend.

For example, considering v and w to be dense vectors, the expression:

  plus_assign< blas > (v, scalar_vector_mult( 3.0,w ) ) ;

will be dispatched to a daxpy call in the BLAS backend. On the other hand the expression:

  assign< blas > (v, scalar_vector_mult( 3.0, w ) ) ;

has no direct BLAS equivalent. The expression comes down to a daxpy call, but first all elements of v need to be set to zero, for instance by means of a dscal or a memset. Thus the blas backend might be unable to evaluate the expression, and the expression will then be dispatched to the generic glas backend instead. A more intelligent blas backend, on the other hand, might accept the expression and evaluate it using any of the mechanisms described above. Depending on the context, calling BLAS twice may or may not be more appropriate (performance-wise, memory-wise, ...) than using the glas backend. That is exactly the reason for supporting multiple backends, each relying on different third-party libraries and implementing different policies.
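
The difference between the two cases can be sketched with a toy routine. The function toy_axpy below is a hypothetical stand-in with the same semantics as daxpy (y := a*x + y); it is not the real BLAS interface, and the two wrapper names are made up for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for BLAS daxpy: y := a*x + y.
void toy_axpy(double a, std::vector<double> const& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += a * x[i];
}

// v += a*w maps directly onto a single axpy call.
void plus_assign_via_axpy(std::vector<double>& v, double a,
                          std::vector<double> const& w) {
    toy_axpy(a, w, v);
}

// v = a*w has no single-call equivalent: clear v first, then accumulate.
void assign_via_axpy(std::vector<double>& v, double a,
                     std::vector<double> const& w) {
    std::fill(v.begin(), v.end(), 0.0);
    toy_axpy(a, w, v);
}
```

The second wrapper touches v twice, which is the extra cost a backend has to weigh against a single pass of the generic element loop.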

Difference between backends and explicit calls to 3rd party libraries

A lot of intelligence can be put into these backends. For example, a backend can be written to deliver optimal performance for a given set of third-party libraries and type of hardware. Additionally, a default backend can be configured, which is used if no backend is specified in the assignment, for maximal transparency.

During the development of an application, the glas interface can be used without taking into account the third-party libraries that might be available. Once the application needs to be compiled for a given target platform, a backend can be provided that intelligently dispatches the most frequently used expressions to the available third-party libraries. On many platforms a vendor-tuned BLAS library is available, but many vendor-tuned libraries also provide additional functionality: ACML provides fast-math routines, MKL provides sparse solvers and VML, etc.

The backend evaluation mechanism

The free function assign is defined as follows:

  template <class BackEnd, class X, class E>
  void assign( X& x, E const& e ) 
  { assign_backend_selector<BackEnd, X, E>() ( x, e ) ; }

The assign function only serves to conveniently create and call the assign_backend_selector functor. The reason for going through a functor is that a class template, unlike a function template, can be partially specialised, so each backend can specialise the functor on the expressions passed to the assign function.

GLAS backend

The glas backend is a bit special: it does not need to match the expression to any predefined function. Instead, it just needs to trigger the expression-template mechanism to execute the assignment. If, for instance, the lhs is a vector and the rhs is a vector expression, the glas backend simply iterates over all elements of the rhs, evaluates them and assigns them to the corresponding elements of the lhs.
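
The element-wise evaluation described above can be sketched as follows. This is a minimal illustration, not the GLAS implementation: scalar_vector_mult_expr, toy_scalar_vector_mult and toy_generic_assign are hypothetical names, and std::vector stands in for a dense vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy expression object: stores its operands, computes nothing yet.
template <class V>
struct scalar_vector_mult_expr {
    double a;
    V const& w;
    std::size_t size() const { return w.size(); }
    double operator()(std::size_t i) const { return a * w[i]; }
};

template <class V>
scalar_vector_mult_expr<V> toy_scalar_vector_mult(double a, V const& w) {
    return scalar_vector_mult_expr<V>{a, w};
}

// Generic evaluation: one pass over the rhs, no temporary vector.
template <class X, class E>
void toy_generic_assign(X& x, E const& e) {
    assert(x.size() == e.size());
    for (std::size_t i = 0; i < e.size(); ++i) x[i] = e(i);
}
```

A call such as toy_generic_assign(y, toy_scalar_vector_mult(2.0, w)) builds the expression object first and performs all arithmetic inside the loop of the assignment.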

BLAS backend

The blas backend, on the contrary, needs to match the expression to well-defined functions. For this purpose the assign_backend_selector can be fully or partially specialised on specific expressions. Each of these specialisations can then recover the data from the expression object and call the corresponding BLAS function.
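
A minimal sketch of this matching, assuming a selector for plus_assign that is shaped like assign_backend_selector. All names below (the backend tags, scaled_expr, sum_expr, toy_axpy, plus_assign_selector, toy_plus_assign) are hypothetical, and toy_axpy only mimics the semantics of daxpy. The primary template plays the role of the generic fallback; the partial specialisation recovers the scalar and the vector from the expression object and hands them to the library routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> vec;

struct glas_tag {};   // hypothetical backend tags
struct blas_tag {};

// Toy expression objects (a*w and u+w), evaluated lazily per element.
struct scaled_expr {
    double a; vec const& w;
    std::size_t size() const { return w.size(); }
    double operator()(std::size_t i) const { return a * w[i]; }
};
struct sum_expr {
    vec const& u; vec const& w;
    std::size_t size() const { return u.size(); }
    double operator()(std::size_t i) const { return u[i] + w[i]; }
};

int axpy_calls = 0;   // counts how often the specialised path was taken

// Hypothetical stand-in for daxpy: y := a*x + y.
void toy_axpy(double a, vec const& x, vec& y) {
    assert(x.size() == y.size());
    ++axpy_calls;
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += a * x[i];
}

// Primary template: any backend, any expression -> generic element loop.
template <class BackEnd, class X, class E>
struct plus_assign_selector {
    void operator()(X& x, E const& e) const {
        for (std::size_t i = 0; i < e.size(); ++i) x[i] += e(i);
    }
};

// Partial specialisation: the blas backend recognises x += a*w.
template <class X>
struct plus_assign_selector<blas_tag, X, scaled_expr> {
    void operator()(X& x, scaled_expr const& e) const { toy_axpy(e.a, e.w, x); }
};

template <class BackEnd, class X, class E>
void toy_plus_assign(X& x, E const& e) {
    plus_assign_selector<BackEnd, X, E>()(x, e);
}
```

With these definitions, toy_plus_assign<blas_tag>(v, scaled_expr{3.0, w}) selects the specialisation, whereas toy_plus_assign<blas_tag>(v, sum_expr{u, w}) matches no specialisation and is resolved to the generic loop at compile time.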

Consequences of the split of creation and evaluation of expressions

The expression mechanism is the same for scalars, vectors and matrices, i.e. an expression is created and only evaluated at the assign operation. In addition, results of expressions are not stored in temporaries in the glas backend. This may have bizarre and often unwanted side effects. For example,

  assign(y, scalar_vector_mult(norm_2(v),w)) ;

is equivalent to the loop:

  norm_2_expression<V> norm_2_e( norm_2(v) ) ;
  for ( size_type i=0; i<w.size(); ++i ) {
    y(i) = norm_2_e() * w(i) ;
  }

This implies that norm_2(v) is not stored in a temporary but is re-evaluated for each i, which is usually not what we want. There are two workarounds: either explicitly store norm_2(v) in a temporary:

  s = norm_2(v) ;
  assign(y, scalar_vector_mult(s,w) ) ;

or use a backend that does this for you.
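
The cost of the repeated evaluation can be made visible with a small instrumented sketch. The types and functions below (norm_2_expr, scale_lazy, scale_hoisted) are hypothetical stand-ins and the counter exists only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> vec;

int norm_evaluations = 0;   // instrumentation for this sketch only

// Toy lazy scalar expression: the norm is computed on every call.
struct norm_2_expr {
    vec const& v;
    double operator()() const {
        ++norm_evaluations;
        double sum = 0.0;
        for (std::size_t i = 0; i < v.size(); ++i) sum += v[i] * v[i];
        return std::sqrt(sum);
    }
};

// Mirrors the loop in the text: the scalar expression is re-evaluated
// for every element of w.
void scale_lazy(vec& y, norm_2_expr const& n, vec const& w) {
    assert(y.size() == w.size());
    for (std::size_t i = 0; i < w.size(); ++i) y[i] = n() * w[i];
}

// Workaround: evaluate once into a temporary, then scale.
void scale_hoisted(vec& y, norm_2_expr const& n, vec const& w) {
    assert(y.size() == w.size());
    double const s = n();
    for (std::size_t i = 0; i < w.size(); ++i) y[i] = s * w[i];
}
```

For vectors of length n, the lazy variant performs n norm computations of O(n) work each, while the hoisted variant performs exactly one.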

Intermezzo: Conversions during assignment (of scalar-expressions)

Scalar expressions provide a conversion operator so that a scalar expression can be assigned to a scalar. However, relying on this conversion operator has a few side effects. The assignment of an expression to a scalar or vector can happen in different ways. Suppose we want to assign norm_1(v) to s. The most generic way is the assign function:

  assign( s, glas::norm_1(v) ) ;
  assign<backend::blas>( s, glas::norm_1(v) ) ;

assign assumes that the result_type of glas::norm_1(v) can be implicitly converted to the type of s. For example,

  dense_vector<int> v(5) ;
  std::complex<double> s ;
  assign( s, glas::norm_1(v) ) ;
  assign<backend::blas>( s, glas::norm_1(v) ) ;

works, since int can be converted to complex<double>.

A more intuitive way is to use conversion to a scalar type:

  s = glas::evaluate( glas::norm_1(v) ) ;
  s = glas::evaluate<backend::blas>( glas::norm_1(v) ) ;

evaluate likewise assumes that the result_type of glas::norm_1(v) can be implicitly converted to the type of s.

The shortest syntax is:

  s = glas::norm_1(v) ;

where it is assumed that norm_1_expression<V> itself can be implicitly converted to the type of s. This condition is stricter than the one for assign or evaluate. If this implicit conversion is not possible, assign or evaluate should be used. For example,

  dense_vector<int> v(5) ;
  std::complex<double> s ;
  s = glas::norm_1(v) ;

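does not work: the conversion from norm_1_expression< dense_vector<int> > to int is possible, and so is the conversion to double, since each needs at most the conversion operator of the expression followed by a standard conversion. The conversion to std::complex<double>, however, would need a further user-defined conversion on top of that, and C++ does not allow more than one user-defined conversion in an implicit conversion sequence.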
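
The conversion rule can be checked at compile time with a toy type. The struct toy_norm_expr below is a hypothetical stand-in for a scalar expression with an int result; it is not the GLAS class, and to_complex is a helper invented for this sketch.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Toy scalar expression with a conversion operator to its result type.
struct toy_norm_expr {
    int value;
    operator int() const { return value; }
};

// One user-defined conversion (operator int), optionally followed by a
// standard conversion (int -> double): allowed.
static_assert(std::is_convertible<toy_norm_expr, int>::value, "expr -> int");
static_assert(std::is_convertible<toy_norm_expr, double>::value, "expr -> double");

// Would need operator int AND the complex<double> constructor, i.e. two
// user-defined conversions in one implicit sequence: not allowed.
static_assert(!std::is_convertible<toy_norm_expr, std::complex<double> >::value,
              "expr -> complex<double> is not implicit");

// Going through an explicitly typed intermediate value makes it work.
std::complex<double> to_complex(toy_norm_expr const& e) {
    int const r = e;            // the single user-defined conversion
    assert(r >= 0);             // a norm is never negative
    return std::complex<double>(static_cast<double>(r), 0.0);
}
```

The explicit intermediate in to_complex corresponds to what assign and evaluate achieve: the expression is first reduced to its result_type, and only that value is converted to the type of the left-hand side.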

The nice thing about conversion is that GLAS objects can be assigned to non-GLAS objects. To do so, one should specialize glas::backend::glas::assign_functor for the type of the left-hand side object. For example, we could assign a glas::dense_vector<T> to a std::vector<T>.
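
Such a specialisation could look roughly like the sketch below. Everything in it is an assumption made for illustration: toy_dense_vector and toy_assign_functor are simplified stand-ins, not the actual glas::dense_vector or glas::backend::glas::assign_functor, whose real template parameters are not shown in this text. Resizing the std::vector to fit is a choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a GLAS dense vector (not the real class).
template <class T>
class toy_dense_vector {
public:
    explicit toy_dense_vector(std::size_t n, T init = T()) : data_(n, init) {}
    std::size_t size() const { return data_.size(); }
    T& operator()(std::size_t i) { return data_[i]; }
    T const& operator()(std::size_t i) const { return data_[i]; }
private:
    std::vector<T> data_;
};

// Primary template: both sides use the operator() element interface.
template <class Lhs, class Rhs>
struct toy_assign_functor {
    void operator()(Lhs& x, Rhs const& e) const {
        assert(x.size() == e.size());
        for (std::size_t i = 0; i < e.size(); ++i) x(i) = e(i);
    }
};

// Partial specialisation for a non-GLAS left-hand side: std::vector<T>.
template <class T, class Rhs>
struct toy_assign_functor<std::vector<T>, Rhs> {
    void operator()(std::vector<T>& x, Rhs const& e) const {
        x.resize(e.size());
        for (std::size_t i = 0; i < e.size(); ++i) x[i] = e(i);
    }
};

template <class Lhs, class Rhs>
void toy_assign(Lhs& x, Rhs const& e) {
    toy_assign_functor<Lhs, Rhs>()(x, e);
}
```

Because the specialisation is selected on the type of the left-hand side alone, any right-hand side that offers size() and operator() can be assigned to a std::vector<T> this way.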