| [AMD] Add fast_tanhf to libdevice (#7780)
This PR adds a fast_tanhf operator to libdevice for AMD hardware.
It reuses the same pass as fast_expf to implement fast_tanhf. The
optimized tanh uses the following formulation:
tanh(X) = (fast_expf(2X) - 1) / (fast_expf(2X) + 1) | 9 months ago |