- From: David Schleef <ds_at_schleef.org>
- Date: Wed, 21 Aug 2002 12:57:24 -0700
On Wed, Aug 21, 2002 at 03:26:57PM -0400, Calin A. Culianu wrote:
> On Thu, 15 Aug 2002, David Schleef wrote:
> > Although it's not currently implemented this way, it is possible
> > to make comedi_sampl_to_phys() about 5 times faster than looping
> > on a comedi_to_phys() call on i386.
>
> How is that? Using mmx/3dnow/sse2/funkyjunky extensions?

Well, that will be an option eventually. (That kind of stuff is
another one of my projects.)

Standard C requires that casting from double to int truncates, i.e.,
rounds toward zero, whereas the i386 FPU normally runs in "round to
nearest" mode. The i386 ABI also requires that the floating point
context is saved and restored by called functions (which was completely
reasonable when the ABI was written.) So, every time you want to cast
from double to int, the compiler outputs asm instructions to save the
floating point context, set it to "round toward zero", do the cast, and
then restore the floating point context.

Naturally, if you are doing this in a loop, the obvious optimization
is to save the context once, do all the conversions, and then restore
the context. This is especially useful on newer processors, since
the context save/restore causes an instruction pipeline flush.

Other architectures have significantly fewer issues in this area,
mainly because they typically have bits in the instruction that
specify the rounding mode.

In addition, there are a number of assembly tricks you can do if you
know the input range is limited to 12 or 16 bits, or if you want to
clip the output range or set out-of-range inputs to NaN, like
Comedilib does. On powerpc, for example, it's fastest to build a
floating point number by shuffling bits around.

dave...
Received on 2002-08-21Z18:57:24