Fixing an error where JCuda and ND4J/ND4S interact

If you use JCuda and ND4S in the same process, you can get the exception below. (In my case, I was using JCuda to get memory information while ND4S did the actual work.)

[error] Exception in thread "main" org.nd4j.linalg.exception.ND4JException: CUDA exception happened. Terminating. Last op: [null]
[error]         at org.nd4j.jita.allocator.pointers.cuda.cudaEvent_t.register(cudaEvent_t.java:63)
[error]         at org.nd4j.jita.flow.impl.SynchronousFlowController.registerAction(SynchronousFlowController.java:215)
[error]         at org.nd4j.jita.handler.impl.CudaZeroHandler.memcpyAsync(CudaZeroHandler.java:585)
[error]         at org.nd4j.jita.allocator.impl.AtomicAllocator.memcpyAsync(AtomicAllocator.java:920)
[error]         at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.set(BaseCudaDataBuffer.java:457)
[error]         at org.nd4j.linalg.jcublas.buffer.BaseCudaDataBuffer.setData(BaseCudaDataBuffer.java:581)
[error]         at org.nd4j.linalg.jcublas.buffer.CudaIntDataBuffer.<init>(CudaIntDataBuffer.java:82)
[error]         at org.nd4j.linalg.jcublas.buffer.factory.CudaDataBufferFactory.createInt(CudaDataBufferFactory.java:356)
[error]         at org.nd4j.linalg.factory.Nd4j.createBufferDetached(Nd4j.java:1430)
[error]         at org.nd4j.linalg.api.shape.Shape.createShapeInformation(Shape.java:2045)
[error]         at org.nd4j.linalg.api.ndarray.BaseShapeInfoProvider.createShapeInformation(BaseShapeInfoProvider.java:47)
[error]         at org.nd4j.jita.constant.ProtectedCudaShapeInfoProvider.createShapeInformation(ProtectedCudaShapeInfoProvider.java:64)
[error]         at org.nd4j.linalg.jcublas.CachedShapeInfoProvider.createShapeInformation(CachedShapeInfoProvider.java:26)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.<init>(BaseNDArray.java:163)
[error]         at org.nd4j.linalg.jcublas.JCublasNDArray.<init>(JCublasNDArray.java:335)
[error]         at org.nd4j.linalg.jcublas.JCublasNDArrayFactory.create(JCublasNDArrayFactory.java:257)
[error]         at org.nd4j.linalg.factory.Nd4j.create(Nd4j.java:4231)
[error]         at org.nd4j.linalg.api.shape.Shape.newShapeNoCopy(Shape.java:1230)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3741)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:3796)
[error]         at org.nd4j.linalg.api.ndarray.BaseNDArray.reshape(BaseNDArray.java:4100)
[error]         at indexer.GpuConcepts$$anonfun$score$1.apply(GpuConcepts.scala:260)
[error]         at indexer.GpuConcepts$$anonfun$score$1.apply(GpuConcepts.scala:259)
[error]         at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]         at indexer.GpuConcepts$.score(GpuConcepts.scala:258)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20$$anonfun$apply$21.apply(GpuConcepts.scala:222)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20$$anonfun$apply$21.apply(GpuConcepts.scala:222)
[error]         at scala.collection.immutable.List.map(List.scala:284)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20.apply(GpuConcepts.scala:221)
[error]         at indexer.GpuConcepts$$anonfun$exec$1$$anonfun$apply$20.apply(GpuConcepts.scala:221)
[error]         at scala.util.Try$.apply(Try.scala:192)
[error]         at indexer.GpuConcepts$$anonfun$exec$1.apply(GpuConcepts.scala:220)
[error]         at indexer.GpuConcepts$$anonfun$exec$1.apply(GpuConcepts.scala:224)
[error]         at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]         at indexer.GpuConcepts$.exec(GpuConcepts.scala:218)
[error]         at indexer.GpuConcepts$$anonfun$main$1.apply$mcV$sp(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts$$anonfun$main$1.apply(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts$$anonfun$main$1.apply(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts$.time(GpuConcepts.scala:42)
[error]         at indexer.GpuConcepts$.main(GpuConcepts.scala:103)
[error]         at indexer.GpuConcepts.main(GpuConcepts.scala)
[error] CUDA error at D:/jenkins/workspace/dl4j/all-multiplatform_windows-x86_64/libnd4j/stream3/libnd4j/blas/cuda/NativeOps.cu:4866 code=33(cudaErrorInvalidResourceHandle) "result"
[error] CUDA error at D:/jenkins/workspace/dl4j/all-multiplatform_windows-x86_64/libnd4j/stream3/libnd4j/blas/cuda/NativeOps.cu:4749 code=33(cudaErrorInvalidResourceHandle) "result"
java.lang.RuntimeException: Nonzero exit code returned from runner: 1
        at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code returned from runner: 1
[error] Total time: 51 s, completed Jan 3, 2018 9:02:54 PM 

The fix is to destroy the CUDA context you created through JCuda once you are done with it:

cuCtxDestroy(context)
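
To make sure the context is always destroyed, even if the memory query throws, the call can go in a finally block. The following is a minimal sketch, assuming the context was created with the JCuda driver API (cuCtxCreate) and that the memory information comes from cuMemGetInfo. The names GpuMemoryInfo and query are hypothetical and only used for illustration; this is not necessarily how the original code was structured.

```scala
import jcuda.driver.{CUcontext, CUdevice, JCudaDriver}
import jcuda.driver.JCudaDriver._

// Hypothetical helper: the object and method names are illustrative only.
object GpuMemoryInfo {

  /** Returns (free, total) device memory in bytes for the given device ordinal. */
  def query(deviceOrdinal: Int = 0): (Long, Long) = {
    // Make JCuda throw exceptions instead of returning error codes.
    JCudaDriver.setExceptionsEnabled(true)
    cuInit(0)

    val device = new CUdevice()
    cuDeviceGet(device, deviceOrdinal)

    // Create a context just for this query.
    val context = new CUcontext()
    cuCtxCreate(context, 0, device)
    try {
      // cuMemGetInfo writes its results into one-element arrays.
      val free = new Array[Long](1)
      val total = new Array[Long](1)
      cuMemGetInfo(free, total)
      (free(0), total(0))
    } finally {
      // Always destroy the context so it does not linger on this thread
      // while ND4J/ND4S performs its own CUDA work.
      cuCtxDestroy(context)
    }
  }
}
```

With a helper like this, calling GpuMemoryInfo.query() between ND4S operations creates and destroys its own context each time, so no JCuda context is left open when ND4S runs.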


1 reply
  1. Adam Gibson says:

    Hi, What version of the software is this? Could you file an issue so we can help with this? You shouldn’t need to do anything with the internals. Thanks (Adam from deeplearning4j)

    Please file an issue here: https://github.com/deeplearning4j/nd4j/issues – I would seriously consider not trying these kinds of workarounds if you can help it.
