🚀 The feature, motivation and pitch
Dear PyTorch developers and community,
We have a nice tutorial, cpp_extension, on custom CUDA extensions written by Peter Goldsborough. I'm wondering whether the same can be done on AMD GPUs with kernels written in ROCm HIP. I mean the following: call a custom forward+backward HIP kernel from PyTorch and include it in a deep learning pipeline, something like the sketch below. Is this currently supported, and are there any limitations?
Does anybody have experience writing custom HIP/C++ kernels and using them in PyTorch?
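A minimal sketch of what I have in mind, assuming a ROCm build of PyTorch where torch.utils.cpp_extension handles the .cu source; the my_op module, file names, and its forward/backward bindings are hypothetical placeholders, not an existing extension:

```python
import torch
from torch.utils.cpp_extension import load

# Hypothetical file names: a C++ binding file plus a kernel source file.
# The hope is that on a ROCm build of PyTorch the CUDA-tutorial workflow
# carries over, with the kernel compiled for HIP instead of nvcc/CUDA.
my_op = load(name="my_op", sources=["my_op.cpp", "my_op_kernel.cu"])

class MyOpFunction(torch.autograd.Function):
    """Wraps the compiled forward/backward kernels for autograd."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return my_op.forward(x)  # hypothetical binding exposed by my_op.cpp

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return my_op.backward(grad_out, x)  # hypothetical binding

# Usage inside a model / training pipeline:
# y = MyOpFunction.apply(x)
```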
cc @sekyondaMeta @svekars @carljparker @NicolasHug @kit1980 @subramen @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport