j-solorzano t1_jdk2kod wrote

If it works on the CPU but not on the GPU, even though the GPU should have more memory, the only difference I can think of is garbage collection timing. Try calling the garbage collector in every epoch. Also, note that you have a GRU, which holds on to tensors (its hidden state) between steps.
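
A minimal sketch of the "call the garbage collector in every epoch" suggestion. The thread does not say which framework is in use, so the PyTorch part is an assumption and is skipped if PyTorch is not installed; `free_memory`, `train`, and `run_epoch` are hypothetical names used only for illustration.

```python
import gc

try:
    import torch  # assumption: the framework is not named in the thread
except ImportError:
    torch = None


def free_memory():
    """Run a full garbage collection pass and, if PyTorch sees a CUDA
    device, release cached GPU blocks that are no longer in use."""
    unreachable = gc.collect()  # number of unreachable objects found
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()
    return unreachable


def train(num_epochs, run_epoch):
    """Hypothetical training loop: run_epoch(epoch) does the real work."""
    for epoch in range(num_epochs):
        run_epoch(epoch)
        free_memory()  # collect once per epoch, as suggested above
```

If this is PyTorch and the GRU's hidden state is carried from one batch to the next, calling `.detach()` on it before reuse is the usual way to keep the old computation graph from staying alive.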