NTL Changelog

What's new in NTL 9.2.0

May 25, 2015
  • Completed the transition away from floating-point arithmetic for the implementation of single-precision modular arithmetic. The current implementation should allow 60-bit moduli on 64-bit platforms that support a 128-bit extended integer type (this is the case for current gcc, clang, and icc compilers).
  • One can still revert to the "classical" (pre-9.0) implementation using double-precision arithmetic (which imposes a 50-bit limit), or to the "long double" implementation introduced in v9.0 (60-bit limit).
  • Note that one must compile NTL with GMP to get any of these improvements. It would have perhaps been better to use GMP's longlong.h facility instead of relying on compiler support for extended integer types. However, at the moment, it is a bit inconvenient to use longlong.h as a freestanding header file. This might change in the future.
  • Programming notes: MulMod(a, b, n) is equivalent to mulmod_t ninv = PrepMulMod(n); MulMod(a, b, n, ninv). Compared to the older, floating-point implementation, the relative cost of computing ninv is higher in the new regime. In a loop where n is invariant, the compiler should "hoist" the computation of ninv, so it is only done once. However, it is usually better to precompute and store ninv, and use the second form of MulMod, with ninv passed as a parameter (NTL does this internally quite consistently). The performance of MulMod(a, b, n, ninv) is somewhat faster in the new implementation. Where possible, one should use MulModPrecon, which is faster still (useful in situations where both n and b are invariant).
  • A number of general performance improvements.

New in NTL 9.1.1 (May 18, 2015)

  • Fixed a bug introduced in 9.1.0 that prevented conversions between Vec and Vec.

New in NTL 9.1.0 (May 4, 2015)

  • Added a new configuration switch to enable the PCLMUL instruction on x86 machines. This can speed up GF2X arithmetic significantly (by a factor of 4). This is enabled by configuring with NTL_PCLMUL=on (and the configuration script automatically checks if it actually works on your platform).
  • Note that this is an alternative to building NTL against the gf2x library (the latter is currently not thread or exception safe).
  • Performance improvements to zz_pX and Vec
  • Performance improvements to ZZX: implemented asymptotically fast CRT code for HomMul and more cache-friendly logic. This routine is used for polynomials whose degree is significantly larger than the bit-length of its coefficients. This should make NTL's ZZX multiplication faster across a broader range of parameters, and at least be within a (hopefully not-too-large) constant factor of optimal.
  • Some internal cleaning on the small-prime FFT code. I've made David Harvey's lazy butterfly routine without precomputed tables more competitive with the large-table variant, so now that large tables are used for a smaller range of parameters (this should reduce the overall memory footprint).
  • Laid the groundwork for some future changes; namely, to allow 60-bit modular arithmetic without relying on the esoteric x87 fmul instruction. This should be coming soon (probably v9.2).

New in NTL 9.0.2 (Mar 30, 2015)

  • Made a small change to single-precison MulMod that enables slightly better compiler optimizations (compiler can "hoist" the computation of 1/n out of a loop, so the variant with extra mulmod_t arg becomes somewhat less essential).

New in NTL 9.0.1 (Mar 28, 2015)

  • Fixed a small bug that prevented compilation a certain platforms.

New in NTL 9.0.0 (Mar 27, 2015)

  • With much trepidation, I have introduced a (hopefully minor) backward incompatibility into NTL. The interface to the single-precision modular arithmetic routines has been modified slightly. This interface change allows for more flexible and more efficient implementation of these routines, which play a crucial role at many levels in NTL.
  • Basically, these changes to the interface abstract away some implementation details that arguably should never been there in the first place. By coding to the new interface, NTL clients will be able to benefit from the current and future improvements.
  • In particular, on 64-bit x86/GCC platforms, single precision moduli can now be up to 60 bits, rather than 50 bits. While some operations may in fact be a little slower, the most important ones (like MulModPrecon) should not be. Using larger moduli speeds up a number of things, like ZZ_pX arithmetic, as fewer primes need to be used in Chinese Remaindering steps. Other applications benefit from larger moduli as well.
  • It is expected that most NTL clients will not be affected at all. Moreover, any code that needs to be updated will be detected by the compiler, and the updates should be simple and mechanical. There is also a configuration flag that will enable the legacy interfae (although this is not recommended practice).
  • Other changes:
  • Previous versions of NTL relied (at least by default) on some undefined behavior regarding integer arithemtic (namely, that in a few key code sequences, signed integer overflow just "wraps around"). All of this undefined behavior has been replaced by (much more desirable) implementation-defined behavior (namely, that conversion from unsigned to signed works as expected). As in the past, the NTL_CLEAN_INT can be used to avoid all of these issues (but with the new code, this should truly be academic). For details, look here.
  • By request, added a function xdouble exp(const xdouble& x), which is equivalent to xexp(to_double(x)). For details, look here.

New in NTL 8.1.2 (Feb 2, 2015)

  • Corrected a bug that could affect the log function in a multi-threaded execution.

New in NTL 8.1.1 (Feb 2, 2015)

  • Corrected a syntax error in SmartPtr.h, which most compilers don't seem to complain about, but some do.
  • Added --tag=CXX to the some lines in the makefile to keep libtool happy.

New in NTL 8.1.0 (Jan 10, 2015)

  • Corrected an oversight in the matrix template class. With this new version, one may safely copy and assign objects of type Mat and Mat out of context (i.e., under a different or undefined modulus). More generally, the copy constructor for Mat now relies only on the copy constructor for Vec and the assignment operator for Mat relies only on the assignment operator and copy constructor for Vec.
  • The goal since v6.2 has been to allow all modular types (ZZ_p, etc.) and all types derived from them (vectors, polynomials, matrices, etc.) to be safely copy constructed and assigned out of context. Hopefully, this goal has now been reached.

New in NTL 8.0.0 (Dec 27, 2014)

  • This is another major milestone for NTL, and hence the big version number bump (this will be the last of these big bumps for a while).
  • Prior to this version, error handling consisted of "abort with an error message". To enable exceptions in NTL, configure with NTL_EXCEPTIONS=on. You will also need a C++11 compiler for this to work properly (and if you don't enable exceptions, any old C++98 compiler will work, as always).
  • With exceptions enabled, errors are reported by throwing an appropriate exception. Of course, this was the easy part. The hard part was making NTL's code exception safe, which (among other things) means that no resources (i.e., memory) are leaked when an exception is thrown. This required a very painful top-to-bottom scrub of the whole library.
  • Despite major changes to the code base and many internal interfaces, the external (i.e., documented) interfaces remain completely unchanged.
  • Improved performance of ZZ_pX arithmetic for both classic and GMP-based long integer packages.
  • Made copy constructor and assignment operators for fftRep and FFTRep safe "out of context", which extends to the classes zz_pXModulus and ZZ_pXModulus.
  • Made mechanism for establishing "unique ID's" (used for temporary file name generation and default pseudo-random number seeds) more robust.

New in NTL 7.0.2 (Dec 16, 2014)

  • Fixed bug introduced in v7.0 affecting RR and quad_float input routines, which would leave the RR precision variable in an incorrect state.
  • Fixed a bug introduced in the v6.2 that affected the append routines for ZZ_p and GF2E, which would lead to incorrect memory allocation (which, if triggered, should just have led to an error message and abort, rather than incorrect results). This bug also affected the new Vec constructor introduced in v7.0 (and again, only for ZZ_p and GF2E).

New in NTL 7.0.1 (Nov 15, 2014)

  • Fixed critical bug in new bit-reverse-copy routine. Large degree polynomial multiplication code was buggy in v7.0. Now it's fixed and properly tested.

New in NTL 7.0.0 (Nov 10, 2014)

  • Changes:
  • I changed the stream input behavior to conform to wider C++ practice (and with an eye towards am exception safe future). Previously, if an attempt to read an NTL object failed, the good old Error function was called, printing an error message, and aborting your program. Now, NTL just quietly sets the ``fail bit'' of the stream. The second example here illustrates this. Hopefully, this change will not cause too many problems, but if it does, configure NTL with the NTL_LEGACY_INPUT_ERROR flag on to get the old behavior back.
  • To further simplify future development, I've dropped support for legacy C++ standard header files. That is, NTL always uses rather than . This shouldn't be a problem for anyone by now, as these lagacy header files have been gone from standard C++ since 1998. Also, by default, NTL components are still wrapped in the NTL namespace, but for backward compatibility, you can still put them in the global namespace by configuring NTL with the NTL_LEGACY_NO_NAMESPACE flag.
  • Implemented a cache-friendy "bit reverse copy" routine for doing FFT's. This is the COBRA algorithm from Cater and Gatlin, "Towards an optimal bit-reversal permutation algorithm", FOCS 1998. This does seem to help a bit. Getting rid of "bit reverse copy" would be even better, but this would take more work and break a number of interfaces.
  • Made some minor improvements to ZZX multiplication routines to get better locality of reference. Improvement is nominal.
  • Fixed a small issue in the left-shift ZZ routine: it was allocating one word more than necessary in some cases.
  • Added new Vec constructor

New in NTL 6.2.1 (Aug 27, 2014)

  • Fixed syntax problem in NTL/vector.h

New in NTL 6.2.0 (Aug 23, 2014)

  • Added explicit constructors corresponding to promotions.
  • Objects from classes that represent residue class rings with a dynamically installed modulus may now be used a bit more flexibly.
  • For the classes ZZ_p, ZZ_pE, zz_pE, and GF2E, added explicit "allocation" and "no allocation" contructors (invoked with INIT_ALLOC and INIT_NO_ALLOC) and special member function allocate().
  • Added new classes ZZ_pPush, ZZ_pEPush, zz_pPush, zz_pEPush, GF2EPush.
  • Made the one-arg constructors for all the various "context" classes (e.g., ZZ_pContext) explicit.
  • Added a bunch of typedef's using consistent naming conventions to all of the main arithmetic classes.
  • Removed a few esoteric compilation modes:
  • All files are now C++ files, and should be compiled using a C++ compiler. In older versions, some files could be compiled either as C or C++.
  • The flag NTL_GMP_HACK is no longer supported. GMP may still be used using the NTL_GMP_LIP flag, which is still highly recommended for high-performance applcations.
  • The flags NTL_SINGLE_MUL and NTL_FAST_INT_MUL are no longer recognized. These were really outdated and esoteric.

New in NTL 6.1.0 (Mar 14, 2014)

  • Added support for "user defined" FFT primes for zz_p.

New in NTL 6.0.0 (Jun 6, 2013)

  • Replaced the old template-like macros for vectors, matrices, and pairs with true template classes: Vec, Mat, and Pair.
  • For backwards compatibilty, all the names that were used in previous versions (e.g., vec_ZZ_p, mat_ZZ_p) have been replaced with appropriate typedefs.
  • For many years, I resisted the temptation of using templates, because compiler support was very inconsistent. But that no longer seems to be the case.
  • This change, while rather sweeping, should create very few, if any, incompatibilities with existing software. The biggest issue would be for software that uses the old template-like macros: such macro invocations can simply be replaced with appropriate typedefs.
  • Made the conversion interface more complete and uniform. Also, using template notation, one can and should now write conv(a) instead of to_ZZ(a) (for backward compatibility, all the old names to_XXX are still there, but many new conversions are not available under these old names).
  • There are many new conversions provided. Moreover, whenever there is a conversion from a ring R to a ring S, there is a corresponding, coefficiet-wise conversion from the polynomial ring R[X] to the polynomial ring R[X].
  • In addition, using the template mechanism, there are generic conversions for vectors and matrices. For example, if there is a conversion from S to T, then there is automatically a corresponding component-wise conversion from Vec to Vec.
  • Introduced a more general mechanism for accessing GF2's in packed structures via indexing (see the class ref_GF2 in the GF2 module).
  • Employed ideas from David Harvey to make the single-precision FFT faster (about twice as fast in many cases). This speeds up many higher-level operations.
  • Fixed all known bugs.

New in NTL 5.5.2 (Mar 17, 2010)

  • New routines MulAddTo and MulSubFrom for computing x += a*b and x -= a*b, where x and a are ZZ's and b is a ZZ or a long. In the case where b is a long, this may be much faster than writing mul(t, a, b); add(x, x, t). See ZZ.txt for details. These new routines are used in a number of places in NTL to get faster algorithms (for example, the LLL routine).
  • Fixed a relatively benign indexing bug in GF2EX discovered by Berend-Benjamin Tams using the valgrind tool.