Since the beginning of Octane, the integration kernels had one CUDA thread calculate one complete sample. We changed this for various reasons, the main one being that the integration kernels had become really huge and impossible to optimize. OSL and OpenCL are also pretty much impossible to implement this way. To solve the problem, we split the big task of calculating a sample into smaller steps, which are then processed one by one by the CUDA threads. This new approach has two major consequences. First, Octane needs to keep information between kernel calls for every sample that is calculated in parallel, which requires additional GPU memory. Second, a lot more kernel calls happen than in the past, so the CPU is stressed a bit more, since it has to do more work to launch them.
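To make the trade-off concrete, here is a minimal CPU-side sketch in Python. It is not Octane's code: the names `SampleState`, `step_kernel`, `megakernel` and `split_kernels`, and the toy per-bounce contribution, are all hypothetical and only illustrate the idea. The plain loops stand in for GPU threads, and each call to `step_kernel` stands in for one kernel launch.

```python
from dataclasses import dataclass


@dataclass
class SampleState:
    """Hypothetical per-sample data that must survive between kernel calls."""
    value: float = 0.0
    bounce: int = 0


def megakernel(num_samples, max_bounces):
    """Old style: a single launch in which each 'thread' computes a complete sample."""
    results = []
    for _ in range(num_samples):          # one iteration stands in for one thread
        value = 0.0
        for bounce in range(max_bounces):
            value += 0.5 ** bounce        # toy contribution, an assumption for illustration
        results.append(value)
    return results, 1                     # everything happened in one launch


def step_kernel(states):
    """Stand-in for one small kernel: advance every sample by exactly one step."""
    for s in states:                      # one iteration stands in for one thread
        s.value += 0.5 ** s.bounce
        s.bounce += 1


def split_kernels(num_samples, max_bounces):
    """New style: many small launches, with per-sample state kept in between."""
    # This list is the extra memory cost: one state record per in-flight sample.
    states = [SampleState() for _ in range(num_samples)]
    launches = 0
    for _ in range(max_bounces):          # the host (CPU) drives every launch
        step_kernel(states)
        launches += 1
    return [s.value for s in states], launches
```

Both functions produce the same sample values, but the split version allocates one state record per sample and needs one launch per step instead of a single launch overall, which mirrors the two consequences described above.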