* Support half precision sigmoid activation
* introduce a vectorized variant using fast_tanh
* move the math to fast_math.h
* fixed compile
* .raw() -> .to_half()

Co-authored-by: Haicheng Wu <haichengw@nvidia.com>