AVIR
High-quality pro image resizing library
 
avir::float4 Class Reference

SIMD packed 4-float type.

#include <avir_float4_sse.h>

Public Member Functions

 float4 (const __m128 s)
 
 float4 (const float s)
 
 float4 (const float4 &s)
 
float hadd () const
 
 operator float () const
 
float4 operator* (const float4 &s) const
 
float4 & operator*= (const float4 &s)
 
float4 operator+ (const float4 &s) const
 
float4 & operator+= (const float4 &s)
 
float4 operator- (const float4 &s) const
 
float4 & operator-= (const float4 &s)
 
float4 operator/ (const float4 &s) const
 
float4 & operator/= (const float4 &s)
 
float4 & operator= (const __m128 s)
 
float4 & operator= (const float s)
 
float4 & operator= (const float4 &s)
 
void store (float *const p) const
 
void storeu (float *const p) const
 
void storeu (float *const p, int lim) const
 

Static Public Member Functions

static void addu (float *const p, const float4 &v)
 
static void addu (float *const p, const float4 &v, const int lim)
 
static float4 load (const float *const p)
 
static float4 loadu (const float *const p)
 
static float4 loadu (const float *const p, int lim)
 

Public Attributes

__m128 value
 Packed value of 4 floats.
 

Detailed Description

SIMD packed 4-float type.

This class implements a packed 4-float type that can be used to perform parallel computation using SIMD instructions on SSE-enabled processors. This class can be used as the "fptype" argument of the avir::fpclass_def class.

Member Function Documentation

◆ addu() [1/2]

static void avir::float4::addu ( float *const p,
const float4 & v )
static

Function performs in-place addition of a value located in memory and the specified value.

Parameters
p: Pointer to the value where the addition happens. May be unaligned.
v: Value to add.

◆ addu() [2/2]

static void avir::float4::addu ( float *const p,
const float4 & v,
const int lim )
static

Function performs in-place addition of a value located in memory and the specified value. The operation is limited to the specified number of elements.

Parameters
p: Pointer to the value where the addition happens. May be unaligned.
v: Value to add.
lim: The maximum number of elements to process, >0.
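
The limited form of addu() can be pictured with a scalar model. The sketch below is not the library's SSE code: Float4Model and addu_model are hypothetical names introduced here, and the sketch assumes that "limited" means only the lower lim elements are touched, matching the wording used for storeu().

```cpp
// Hypothetical scalar stand-in for a packed 4-float value; the real class
// wraps an __m128 register instead.
struct Float4Model
{
    float e[4];
};

// Models addu(p, v, lim): adds the lower `lim` elements of v to the floats
// at p, in place. Floats at p[lim] and beyond are left untouched.
// Assumes 0 < lim <= 4 (the documentation only states lim > 0).
inline void addu_model(float* const p, const Float4Model& v, const int lim)
{
    for (int i = 0; i < lim; ++i)
    {
        p[i] += v.e[i];
    }
}
```

Under this reading, calling the function with lim = 2 on a 4-float buffer changes only the first two floats, which is what makes the form usable at the tail of a buffer whose length is not a multiple of 4.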

◆ hadd()

float avir::float4::hadd ( ) const
Returns
Horizontal sum of elements.
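
A typical use of a horizontal sum is finishing a dot product: multiply two packed values element-wise, then add the four products together. The following is a scalar sketch of that idea only; Float4Model, mul_model, hadd_model and dot4_model are hypothetical names, and the summation order shown is an assumption, so rounding may differ slightly from the SSE implementation.

```cpp
// Hypothetical scalar stand-in for a packed 4-float value.
struct Float4Model
{
    float e[4];
};

// Models operator*: element-wise product of two packed values.
inline Float4Model mul_model(const Float4Model& a, const Float4Model& b)
{
    Float4Model r = {};
    for (int i = 0; i < 4; ++i)
    {
        r.e[i] = a.e[i] * b.e[i];
    }
    return r;
}

// Models hadd(): the sum of all four packed elements.
// The pairwise order used here is an assumption, not taken from the library.
inline float hadd_model(const Float4Model& v)
{
    return (v.e[0] + v.e[1]) + (v.e[2] + v.e[3]);
}

// 4-element dot product built from the two operations above.
inline float dot4_model(const Float4Model& a, const Float4Model& b)
{
    return hadd_model(mul_model(a, b));
}
```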

◆ load()

static float4 avir::float4::load ( const float *const p)
static
Parameters
p: Pointer to the memory from which the value is loaded; should be 16-byte aligned.
Returns
float4 value loaded from the specified memory location.
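
Since load() expects a 16-byte aligned pointer while loadu() accepts any alignment, it can help to check or guarantee alignment on the caller's side. The sketch below uses only standard C++; is_aligned16 and AlignedBuf are hypothetical helpers introduced for illustration and are not part of AVIR.

```cpp
#include <cstdint>

// Returns true if p lies on a 16-byte boundary, which is the alignment
// the aligned load and store functions expect.
inline bool is_aligned16(const float* const p)
{
    return (reinterpret_cast<std::uintptr_t>(p) % 16) == 0;
}

// A buffer declared with alignas(16) starts on a 16-byte boundary, so every
// fourth float in it (data, data + 4, ...) is also 16-byte aligned.
struct AlignedBuf
{
    alignas(16) float data[8];
};
```

A caller could use such a check to choose between the aligned and unaligned functions, or simply use the unaligned ones whenever the origin of a pointer is unknown.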

◆ loadu() [1/2]

static float4 avir::float4::loadu ( const float *const p)
static
Parameters
p: Pointer to the memory from which the value is loaded; may have any alignment.
Returns
float4 value loaded from the specified memory location.

◆ loadu() [2/2]

static float4 avir::float4::loadu ( const float *const p,
int lim )
static
Parameters
p: Pointer to the memory from which the value is loaded; may have any alignment.
lim: The maximum number of elements to load, >0.
Returns
float4 value loaded from the specified memory location, with elements beyond "lim" set to 0.
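
The zero-filling behavior described above can be modeled in scalar code. This is a sketch of the documented semantics rather than the library's implementation; Float4Model and loadu_model are hypothetical names, and clamping lim to 4 is an assumption drawn from the phrase "maximum number of elements".

```cpp
// Hypothetical scalar stand-in for a packed 4-float value.
struct Float4Model
{
    float e[4];
};

// Models loadu(p, lim): reads at most `lim` floats from p and sets the
// remaining elements to 0. Memory at p[lim] and beyond is never read.
inline Float4Model loadu_model(const float* const p, const int lim)
{
    Float4Model r = {}; // all four elements start at 0
    for (int i = 0; i < lim && i < 4; ++i)
    {
        r.e[i] = p[i];
    }
    return r;
}
```

The zero padding means a partially loaded value can still take part in additions and horizontal sums without the missing elements affecting the result.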

◆ store()

void avir::float4::store ( float *const p) const

Function stores *this value to the specified memory location.

Parameters
[out] p: Output memory location, should be 16-byte aligned.

◆ storeu() [1/2]

void avir::float4::storeu ( float *const p) const

Function stores *this value to the specified memory location.

Parameters
[out] p: Output memory location, may have any alignment.

◆ storeu() [2/2]

void avir::float4::storeu ( float *const p,
int lim ) const

Function stores "lim" lower elements of *this value to the specified memory location.

Parameters
[out] p: Output memory location, may have any alignment.
lim: The number of lower elements to store, >0.
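
The limited store is mainly useful for the last, partial group of a buffer whose length is not a multiple of 4. The scalar sketch below shows that loop pattern under stated assumptions: Float4Model, storeu_model and scale_model are hypothetical names, and the real class would perform the per-group work with SSE instructions.

```cpp
// Hypothetical scalar stand-in for a packed 4-float value.
struct Float4Model
{
    float e[4];
};

// Models storeu(p, lim): writes the lower `lim` elements of v to p.
// Memory at p[lim] and beyond is not written.
inline void storeu_model(const Float4Model& v, float* const p, const int lim)
{
    for (int i = 0; i < lim && i < 4; ++i)
    {
        p[i] = v.e[i];
    }
}

// Scales the first n floats of buf by gain, four at a time. The final group
// may hold fewer than 4 floats, so its size is passed as `lim`.
inline void scale_model(float* const buf, const int n, const float gain)
{
    for (int pos = 0; pos < n; pos += 4)
    {
        const int lim = (n - pos < 4 ? n - pos : 4);
        Float4Model v = {};
        for (int i = 0; i < lim; ++i)
        {
            v.e[i] = buf[pos + i] * gain; // limited load plus multiply
        }
        storeu_model(v, buf + pos, lim);
    }
}
```

With n = 6 the loop handles one full group of 4 and one partial group of 2, and any float stored after the sixth is left unchanged.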