ResourceExhaustedError - Even with batch_size=2 #11
Comments
I can't say for sure if the GPU ran out of memory. I've only run this code
on GPUs with 10+ GB, so it's possible. Can you try running it on CPU alone?
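A minimal sketch of the CPU-only run suggested above, plus a quick size check on the tensor named in the quoted log below. It assumes the script uses a CUDA build of TensorFlow, so hiding the devices through the `CUDA_VISIBLE_DEVICES` environment variable before TensorFlow is imported should be enough. The function names are hypothetical, the 4 bytes per element corresponds to the `float` type in the log, and the 256-byte rounding is an assumption about the allocator's granularity.

```python
import os


def force_cpu_only():
    """Hide all CUDA devices from this process.

    Call this before `import tensorflow`; once the GPU has been
    initialised, changing the variable has no effect.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return os.environ["CUDA_VISIBLE_DEVICES"]


def tensor_bytes(shape, bytes_per_element=4):
    """Raw size in bytes of a dense tensor with the given shape."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element


def round_up(n_bytes, granularity=256):
    """Round a byte count up to a multiple of `granularity` (assumed 256)."""
    return -(-n_bytes // granularity) * granularity


if __name__ == "__main__":
    force_cpu_only()
    # Shape taken from the OOM message in the log: shape[9842,60,100], float.
    raw = tensor_bytes((9842, 60, 100))
    print("raw bytes:     ", raw)
    print("rounded bytes: ", round_up(raw))
    print("MiB:           ", raw / 2**20)
```

Under these assumptions the rounded size equals the 236208128-byte chunks listed in the log, and the largest free chunk listed there (178606592 bytes) is smaller than that. The leading dimension of 9842 is also not the batch size of 2, so lowering `batch_size` alone may not shrink this tensor.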
On Thu, Jun 28, 2018 at 10:26, Bshowg <[email protected]> wrote:
… I'm trying to run the net but I have an OOM issue. I have an NVIDIA GTX 960
with 2 GB of RAM. Is it possible that I just don't have enough memory?
The output of the run is below:
> Reading data from ./snli/snli_1.0_train.jsonl
> Reading data from ./snli/snli_1.0_dev.jsonl
> Converting words to indices
> 2018-06-28 10:21:53.901180: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
> 2018-06-28 10:21:53.902102: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
> name: GeForce GTX 960 major: 5 minor: 2 memoryClockRate(GHz): 1.253
> pciBusID: 0000:02:00.0
> totalMemory: 1.95GiB freeMemory: 1.68GiB
> 2018-06-28 10:21:53.902121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
> 2018-06-28 10:21:54.313173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
> 2018-06-28 10:21:54.313204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
> 2018-06-28 10:21:54.313212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
> 2018-06-28 10:21:54.313399: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1446 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960, pci bus id: 0000:02:00.0, compute capability: 5.2)
> Creating model
>
> Starting training
> 2018-06-28 10:15:11.058892: W tensorflow/core/common_runtime/bfc_allocator.cc:275] Allocator (GPU_0_bfc) ran out of memory trying to allocate 225.26MiB. Current allocation summary follows.
> 2018-06-28 10:15:11.058956: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (256): Total Chunks: 29, Chunks in use: 27. 7.2KiB allocated for chunks. 6.8KiB in use in bin. 200B client-requested in use in bin.
> 2018-06-28 10:15:11.058981: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (512): Total Chunks: 24, Chunks in use: 24. 12.0KiB allocated for chunks. 12.0KiB in use in bin. 9.4KiB client-requested in use in bin.
> 2018-06-28 10:15:11.059001: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (1024): Total Chunks: 7, Chunks in use: 7. 8.0KiB allocated for chunks. 8.0KiB in use in bin. 6.9KiB client-requested in use in bin.
> 2018-06-28 10:15:11.059017: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (2048): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059033: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059057: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059073: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (16384): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059092: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (32768): Total Chunks: 14, Chunks in use: 13. 546.2KiB allocated for chunks. 511.2KiB in use in bin. 507.2KiB client-requested in use in bin.
> 2018-06-28 10:15:11.059108: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (65536): Total Chunks: 1, Chunks in use: 0. 118.2KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059127: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (131072): Total Chunks: 6, Chunks in use: 6. 937.5KiB allocated for chunks. 937.5KiB in use in bin. 937.5KiB client-requested in use in bin.
> 2018-06-28 10:15:11.059152: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059167: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059182: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059199: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (2097152): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059218: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (4194304): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059247: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (8388608): Total Chunks: 1, Chunks in use: 0. 8.90MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059264: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (16777216): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059281: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059297: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (67108864): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059316: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (134217728): Total Chunks: 7, Chunks in use: 5. 1.40GiB allocated for chunks. 1.08GiB in use in bin. 1012.19MiB client-requested in use in bin.
> 2018-06-28 10:15:11.059333: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.059351: I tensorflow/core/common_runtime/bfc_allocator.cc:646] Bin for 225.27MiB was 128.00MiB, Chunk State:
> 2018-06-28 10:15:11.059379: I tensorflow/core/common_runtime/bfc_allocator.cc:652] Size: 153.18MiB | Requested Size: 126.15MiB | in_use: 0, prev: Size: 225.27MiB | Requested Size: 225.26MiB | in_use: 1, next: Size: 225.27MiB | Requested Size: 225.26MiB | in_use: 1
> 2018-06-28 10:15:11.059402: I tensorflow/core/common_runtime/bfc_allocator.cc:652] Size: 170.33MiB | Requested Size: 126.15MiB | in_use: 0, prev: Size: 225.27MiB | Requested Size: 225.26MiB | in_use: 1
> 2018-06-28 10:15:11.059419: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0000 of size 1280
> 2018-06-28 10:15:11.059433: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0500 of size 256
> 2018-06-28 10:15:11.059445: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0600 of size 256
> 2018-06-28 10:15:11.059458: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0700 of size 256
> 2018-06-28 10:15:11.059471: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0800 of size 512
> 2018-06-28 10:15:11.059484: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0a00 of size 256
> 2018-06-28 10:15:11.059497: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0b00 of size 512
> 2018-06-28 10:15:11.059510: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0d00 of size 256
> 2018-06-28 10:15:11.059521: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0e00 of size 512
> 2018-06-28 10:15:11.059532: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1000 of size 256
> 2018-06-28 10:15:11.059542: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1100 of size 512
> 2018-06-28 10:15:11.059556: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1300 of size 256
> 2018-06-28 10:15:11.059568: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1400 of size 256
> 2018-06-28 10:15:11.059581: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1500 of size 256
> 2018-06-28 10:15:11.059593: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1600 of size 256
> 2018-06-28 10:15:11.059604: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1700 of size 512
> 2018-06-28 10:15:11.059615: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1900 of size 256
> 2018-06-28 10:15:11.059626: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1a00 of size 512
> 2018-06-28 10:15:11.059639: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1c00 of size 1024
> 2018-06-28 10:15:11.059652: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c2000 of size 40192
> 2018-06-28 10:15:11.059665: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029cbd00 of size 512
> 2018-06-28 10:15:11.059678: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029cbf00 of size 40192
> 2018-06-28 10:15:11.059690: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029d5c00 of size 512
> 2018-06-28 10:15:11.059704: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029d5e00 of size 160000
> 2018-06-28 10:15:11.059716: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029fcf00 of size 512
> 2018-06-28 10:15:11.059729: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029fd100 of size 40192
> 2018-06-28 10:15:11.059742: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a06e00 of size 512
> 2018-06-28 10:15:11.059754: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a07000 of size 1280
> 2018-06-28 10:15:11.059767: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a07500 of size 256
> 2018-06-28 10:15:11.059780: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a07600 of size 160000
> 2018-06-28 10:15:11.059793: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a2e700 of size 512
> 2018-06-28 10:15:11.059821: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a2e900 of size 40192
> 2018-06-28 10:15:11.059835: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38600 of size 512
> 2018-06-28 10:15:11.059848: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38800 of size 256
> 2018-06-28 10:15:11.059861: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38900 of size 256
> 2018-06-28 10:15:11.059873: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38a00 of size 256
> 2018-06-28 10:15:11.059886: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38b00 of size 256
> 2018-06-28 10:15:11.059899: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38c00 of size 41216
> 2018-06-28 10:15:11.059912: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a42d00 of size 512
> 2018-06-28 10:15:11.059925: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a42f00 of size 40192
> 2018-06-28 10:15:11.059938: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a4cc00 of size 512
> 2018-06-28 10:15:11.059951: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a4ce00 of size 160000
> 2018-06-28 10:15:11.059963: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a73f00 of size 512
> 2018-06-28 10:15:11.059976: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74100 of size 1280
> 2018-06-28 10:15:11.059988: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74600 of size 256
> 2018-06-28 10:15:11.060001: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74700 of size 256
> 2018-06-28 10:15:11.060014: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74800 of size 256
> 2018-06-28 10:15:11.060026: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74900 of size 256
> 2018-06-28 10:15:11.060039: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74a00 of size 256
> 2018-06-28 10:15:11.060052: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74b00 of size 256
> 2018-06-28 10:15:11.060064: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74c00 of size 256
> 2018-06-28 10:15:11.060077: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74d00 of size 256
> 2018-06-28 10:15:11.060089: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a74e00 of size 256
> 2018-06-28 10:15:11.060102: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74f00 of size 256
> 2018-06-28 10:15:11.060115: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a75000 of size 256
> 2018-06-28 10:15:11.060127: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a75100 of size 256
> 2018-06-28 10:15:11.060140: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a75200 of size 35840
> 2018-06-28 10:15:11.060153: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a7de00 of size 512
> 2018-06-28 10:15:11.060165: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a7e000 of size 40192
> 2018-06-28 10:15:11.060178: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a87d00 of size 121088
> 2018-06-28 10:15:11.060191: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aa5600 of size 512
> 2018-06-28 10:15:11.060204: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aa5800 of size 40192
> 2018-06-28 10:15:11.060216: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aaf500 of size 512
> 2018-06-28 10:15:11.060229: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aaf700 of size 1024
> 2018-06-28 10:15:11.060242: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aafb00 of size 40192
> 2018-06-28 10:15:11.060264: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ab9800 of size 512
> 2018-06-28 10:15:11.060277: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ab9a00 of size 40192
> 2018-06-28 10:15:11.060291: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ac3700 of size 512
> 2018-06-28 10:15:11.060304: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ac3900 of size 160000
> 2018-06-28 10:15:11.060318: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aeaa00 of size 512
> 2018-06-28 10:15:11.060331: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aeac00 of size 40192
> 2018-06-28 10:15:11.060345: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af4900 of size 512
> 2018-06-28 10:15:11.060358: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af4b00 of size 1280
> 2018-06-28 10:15:11.060372: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af5000 of size 256
> 2018-06-28 10:15:11.060385: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af5100 of size 160000
> 2018-06-28 10:15:11.060399: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b1c200 of size 512
> 2018-06-28 10:15:11.060412: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b1c400 of size 40192
> 2018-06-28 10:15:11.060425: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b26100 of size 512
> 2018-06-28 10:15:11.060439: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b26300 of size 1024
> 2018-06-28 10:15:11.060452: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b26700 of size 40192
> 2018-06-28 10:15:11.060466: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b30400 of size 160000
> 2018-06-28 10:15:11.060479: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702b57500 of size 9330432
> 2018-06-28 10:15:11.060493: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x70343d400 of size 236208128
> 2018-06-28 10:15:11.060507: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x711581400 of size 236208128
> 2018-06-28 10:15:11.060521: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x71f6c5400 of size 220460800
> 2018-06-28 10:15:11.060534: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x72c904b00 of size 236208128
> 2018-06-28 10:15:11.060548: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x73aa48b00 of size 160621312
> 2018-06-28 10:15:11.060561: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x744376e00 of size 236208128
> 2018-06-28 10:15:11.060574: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x7524bae00 of size 178606592
> 2018-06-28 10:15:11.060587: I tensorflow/core/common_runtime/bfc_allocator.cc:671] Summary of in-use Chunks by size:
> 2018-06-28 10:15:11.060603: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 27 Chunks of size 256 totalling 6.8KiB
> 2018-06-28 10:15:11.060620: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 24 Chunks of size 512 totalling 12.0KiB
> 2018-06-28 10:15:11.060636: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 3 Chunks of size 1024 totalling 3.0KiB
> 2018-06-28 10:15:11.060651: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 4 Chunks of size 1280 totalling 5.0KiB
> 2018-06-28 10:15:11.060667: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 12 Chunks of size 40192 totalling 471.0KiB
> 2018-06-28 10:15:11.060683: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 41216 totalling 40.2KiB
> 2018-06-28 10:15:11.060700: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 160000 totalling 937.5KiB
> 2018-06-28 10:15:11.060716: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 220460800 totalling 210.25MiB
> 2018-06-28 10:15:11.060732: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 4 Chunks of size 236208128 totalling 901.06MiB
> 2018-06-28 10:15:11.060747: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Sum Total of in-use chunks: 1.09GiB
> 2018-06-28 10:15:11.060765: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats:
> Limit: 1515520000
> InUse: 1166804224
> MaxInUse: 1506032128
> NumAllocs: 35113
> MaxAllocSize: 354975232
>
> 2018-06-28 10:15:11.060795: W tensorflow/core/common_runtime/bfc_allocator.cc:279] **************************xxxxx********************************_________*****************___________
> 2018-06-28 10:15:11.060842: W tensorflow/core/common_runtime/bfc_allocator.cc:275] Allocator (GPU_0_bfc) ran out of memory trying to allocate 210.25MiB. Current allocation summary follows.
> 2018-06-28 10:15:11.060864: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[9842,60,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
> 2018-06-28 10:15:11.060894: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (256): Total Chunks: 29, Chunks in use: 27. 7.2KiB allocated for chunks. 6.8KiB in use in bin. 200B client-requested in use in bin.
> 2018-06-28 10:15:11.060918: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (512): Total Chunks: 24, Chunks in use: 24. 12.0KiB allocated for chunks. 12.0KiB in use in bin. 9.4KiB client-requested in use in bin.
> 2018-06-28 10:15:11.060938: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (1024): Total Chunks: 7, Chunks in use: 7. 8.0KiB allocated for chunks. 8.0KiB in use in bin. 6.9KiB client-requested in use in bin.
> 2018-06-28 10:15:11.060956: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (2048): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.060973: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (4096): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.060992: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (8192): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061008: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (16384): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061031: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (32768): Total Chunks: 14, Chunks in use: 13. 546.2KiB allocated for chunks. 511.2KiB in use in bin. 507.2KiB client-requested in use in bin.
> 2018-06-28 10:15:11.061050: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (65536): Total Chunks: 1, Chunks in use: 0. 118.2KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061071: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (131072): Total Chunks: 6, Chunks in use: 6. 937.5KiB allocated for chunks. 937.5KiB in use in bin. 937.5KiB client-requested in use in bin.
> 2018-06-28 10:15:11.061089: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (262144): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061106: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (524288): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061122: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (1048576): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061139: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (2097152): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061156: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (4194304): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061175: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (8388608): Total Chunks: 1, Chunks in use: 0. 8.90MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061192: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (16777216): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061209: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (33554432): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061226: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (67108864): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061247: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (134217728): Total Chunks: 7, Chunks in use: 5. 1.40GiB allocated for chunks. 1.08GiB in use in bin. 1012.19MiB client-requested in use in bin.
> 2018-06-28 10:15:11.061264: I tensorflow/core/common_runtime/bfc_allocator.cc:630] Bin (268435456): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
> 2018-06-28 10:15:11.061283: I tensorflow/core/common_runtime/bfc_allocator.cc:646] Bin for 210.25MiB was 128.00MiB, Chunk State:
> 2018-06-28 10:15:11.061312: I tensorflow/core/common_runtime/bfc_allocator.cc:652] Size: 153.18MiB | Requested Size: 126.15MiB | in_use: 0, prev: Size: 225.27MiB | Requested Size: 225.26MiB | in_use: 1, next: Size: 225.27MiB | Requested Size: 225.26MiB | in_use: 1
> 2018-06-28 10:15:11.061338: I tensorflow/core/common_runtime/bfc_allocator.cc:652] Size: 170.33MiB | Requested Size: 126.15MiB | in_use: 0, prev: Size: 225.27MiB | Requested Size: 225.26MiB | in_use: 1
> 2018-06-28 10:15:11.061359: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0000 of size 1280
> 2018-06-28 10:15:11.061375: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0500 of size 256
> 2018-06-28 10:15:11.061389: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0600 of size 256
> 2018-06-28 10:15:11.061402: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0700 of size 256
> 2018-06-28 10:15:11.061416: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0800 of size 512
> 2018-06-28 10:15:11.061430: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0a00 of size 256
> 2018-06-28 10:15:11.061443: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0b00 of size 512
> 2018-06-28 10:15:11.061456: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0d00 of size 256
> 2018-06-28 10:15:11.061469: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c0e00 of size 512
> 2018-06-28 10:15:11.061482: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1000 of size 256
> 2018-06-28 10:15:11.061495: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1100 of size 512
> 2018-06-28 10:15:11.061508: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1300 of size 256
> 2018-06-28 10:15:11.061521: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1400 of size 256
> 2018-06-28 10:15:11.061535: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1500 of size 256
> 2018-06-28 10:15:11.061548: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1600 of size 256
> 2018-06-28 10:15:11.061561: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1700 of size 512
> 2018-06-28 10:15:11.061574: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1900 of size 256
> 2018-06-28 10:15:11.061587: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1a00 of size 512
> 2018-06-28 10:15:11.061601: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c1c00 of size 1024
> 2018-06-28 10:15:11.061615: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029c2000 of size 40192
> 2018-06-28 10:15:11.061628: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029cbd00 of size 512
> 2018-06-28 10:15:11.061641: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029cbf00 of size 40192
> 2018-06-28 10:15:11.061654: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029d5c00 of size 512
> 2018-06-28 10:15:11.061668: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029d5e00 of size 160000
> 2018-06-28 10:15:11.061681: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029fcf00 of size 512
> 2018-06-28 10:15:11.061695: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x7029fd100 of size 40192
> 2018-06-28 10:15:11.061708: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a06e00 of size 512
> 2018-06-28 10:15:11.061721: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a07000 of size 1280
> 2018-06-28 10:15:11.061734: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a07500 of size 256
> 2018-06-28 10:15:11.061748: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a07600 of size 160000
> 2018-06-28 10:15:11.061761: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a2e700 of size 512
> 2018-06-28 10:15:11.061774: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a2e900 of size 40192
> 2018-06-28 10:15:11.061787: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38600 of size 512
> 2018-06-28 10:15:11.061801: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38800 of size 256
> 2018-06-28 10:15:11.061814: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38900 of size 256
> 2018-06-28 10:15:11.061827: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38a00 of size 256
> 2018-06-28 10:15:11.061840: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38b00 of size 256
> 2018-06-28 10:15:11.061854: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a38c00 of size 41216
> 2018-06-28 10:15:11.061867: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a42d00 of size 512
> 2018-06-28 10:15:11.061881: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a42f00 of size 40192
> 2018-06-28 10:15:11.061894: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a4cc00 of size 512
> 2018-06-28 10:15:11.061907: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a4ce00 of size 160000
> 2018-06-28 10:15:11.061920: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a73f00 of size 512
> 2018-06-28 10:15:11.061933: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74100 of size 1280
> 2018-06-28 10:15:11.061946: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74600 of size 256
> 2018-06-28 10:15:11.061960: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74700 of size 256
> 2018-06-28 10:15:11.061972: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74800 of size 256
> 2018-06-28 10:15:11.061985: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74900 of size 256
> 2018-06-28 10:15:11.061998: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74a00 of size 256
> 2018-06-28 10:15:11.062011: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74b00 of size 256
> 2018-06-28 10:15:11.062024: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74c00 of size 256
> 2018-06-28 10:15:11.062037: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74d00 of size 256
> 2018-06-28 10:15:11.062049: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a74e00 of size 256
> 2018-06-28 10:15:11.062062: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a74f00 of size 256
> 2018-06-28 10:15:11.062075: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a75000 of size 256
> 2018-06-28 10:15:11.062088: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a75100 of size 256
> 2018-06-28 10:15:11.062101: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a75200 of size 35840
> 2018-06-28 10:15:11.062114: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a7de00 of size 512
> 2018-06-28 10:15:11.062127: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702a7e000 of size 40192
> 2018-06-28 10:15:11.062140: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702a87d00 of size 121088
> 2018-06-28 10:15:11.062153: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aa5600 of size 512
> 2018-06-28 10:15:11.062167: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aa5800 of size 40192
> 2018-06-28 10:15:11.062180: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aaf500 of size 512
> 2018-06-28 10:15:11.062193: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aaf700 of size 1024
> 2018-06-28 10:15:11.062206: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aafb00 of size 40192
> 2018-06-28 10:15:11.062219: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ab9800 of size 512
> 2018-06-28 10:15:11.062232: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ab9a00 of size 40192
> 2018-06-28 10:15:11.062245: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ac3700 of size 512
> 2018-06-28 10:15:11.062258: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702ac3900 of size 160000
> 2018-06-28 10:15:11.062271: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aeaa00 of size 512
> 2018-06-28 10:15:11.062284: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702aeac00 of size 40192
> 2018-06-28 10:15:11.062297: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af4900 of size 512
> 2018-06-28 10:15:11.062310: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af4b00 of size 1280
> 2018-06-28 10:15:11.062323: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af5000 of size 256
> 2018-06-28 10:15:11.062336: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702af5100 of size 160000
> 2018-06-28 10:15:11.062349: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b1c200 of size 512
> 2018-06-28 10:15:11.062362: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b1c400 of size 40192
> 2018-06-28 10:15:11.062375: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b26100 of size 512
> 2018-06-28 10:15:11.062388: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b26300 of size 1024
> 2018-06-28 10:15:11.062401: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b26700 of size 40192
> 2018-06-28 10:15:11.062415: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x702b30400 of size 160000
> 2018-06-28 10:15:11.069554: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x702b57500 of size 9330432
> 2018-06-28 10:15:11.069570: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x70343d400 of size 236208128
> 2018-06-28 10:15:11.069583: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x711581400 of size 236208128
> 2018-06-28 10:15:11.069597: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x71f6c5400 of size 220460800
> 2018-06-28 10:15:11.069610: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x72c904b00 of size 236208128
> 2018-06-28 10:15:11.069623: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x73aa48b00 of size 160621312
> 2018-06-28 10:15:11.069636: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Chunk at 0x744376e00 of size 236208128
> 2018-06-28 10:15:11.069649: I tensorflow/core/common_runtime/bfc_allocator.cc:665] Free at 0x7524bae00 of size 178606592
> 2018-06-28 10:15:11.069662: I tensorflow/core/common_runtime/bfc_allocator.cc:671] Summary of in-use Chunks by size:
> 2018-06-28 10:15:11.069678: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 27 Chunks of size 256 totalling 6.8KiB
> 2018-06-28 10:15:11.069694: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 24 Chunks of size 512 totalling 12.0KiB
> 2018-06-28 10:15:11.069710: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 3 Chunks of size 1024 totalling 3.0KiB
> 2018-06-28 10:15:11.069725: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 4 Chunks of size 1280 totalling 5.0KiB
> 2018-06-28 10:15:11.069740: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 12 Chunks of size 40192 totalling 471.0KiB
> 2018-06-28 10:15:11.069756: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 41216 totalling 40.2KiB
> 2018-06-28 10:15:11.069772: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 6 Chunks of size 160000 totalling 937.5KiB
> 2018-06-28 10:15:11.069787: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 1 Chunks of size 220460800 totalling 210.25MiB
> 2018-06-28 10:15:11.069803: I tensorflow/core/common_runtime/bfc_allocator.cc:674] 4 Chunks of size 236208128 totalling 901.06MiB
> 2018-06-28 10:15:11.069818: I tensorflow/core/common_runtime/bfc_allocator.cc:678] Sum Total of in-use chunks: 1.09GiB
> 2018-06-28 10:15:11.069834: I tensorflow/core/common_runtime/bfc_allocator.cc:680] Stats:
> Limit: 1515520000
> InUse: 1166804224
> MaxInUse: 1506032128
> NumAllocs: 35113
> MaxAllocSize: 354975232
>
> 2018-06-28 10:15:11.069867: W tensorflow/core/common_runtime/bfc_allocator.cc:279] **************************xxxxx********************************_________*****************___________
> 2018-06-28 10:15:11.069910: W tensorflow/core/framework/op_kernel.cc:1318] OP_REQUIRES failed at batch_matmul_op_impl.h:489 : Resource exhausted: OOM when allocating tensor with shape[9842,56,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
> Traceback (most recent call last):
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
> return fn(*args)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
> options, feed_dict, fetch_list, target_list, run_metadata)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
> run_metadata)
> tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[9842,60,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
> [[Node: comparison/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Reshape_1, inter-attention/beta)]]
> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
>
> [[Node: comparison_1/comparison/layer1/dense/Tensordot/Shape/_489 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_531_comparison_1/comparison/layer1/dense/Tensordot/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
>
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
> File "train.py", line 132, in <module>
> args.clip_norm, args.report)
> File "/home/bshow/tensorflow/multiffn-nli/src/decomposable.py", line 541, in train
> feeds)
> File "/home/bshow/tensorflow/multiffn-nli/src/decomposable.py", line 474, in _run_on_validation
> loss, acc = session.run([self.loss, self.accuracy], feeds)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 900, in run
> run_metadata_ptr)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1135, in _run
> feed_dict_tensor, options, run_metadata)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
> run_metadata)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
> raise type(e)(node_def, op, message)
> tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[9842,60,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
> [[Node: comparison/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Reshape_1, inter-attention/beta)]]
> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
>
> [[Node: comparison_1/comparison/layer1/dense/Tensordot/Shape/_489 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_531_comparison_1/comparison/layer1/dense/Tensordot/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
>
>
> Caused by op 'comparison/mul', defined at:
> File "train.py", line 114, in <module>
> optimizer=args.optim)
> File "/home/bshow/tensorflow/multiffn-nli/src/multimlp.py", line 41, in __init__
> training, project_input, optimizer)
> File "/home/bshow/tensorflow/multiffn-nli/src/decomposable.py", line 139, in __init__
> self.v1 = self.compare(repr1, self.beta, self.sentence1_size)
> File "/home/bshow/tensorflow/multiffn-nli/src/decomposable.py", line 336, in compare
> sentence * soft_alignment]
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 979, in binary_op_wrapper
> return func(x, y, name=name)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 1211, in _mul_dispatch
> return gen_math_ops.mul(x, y, name=name)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4759, in mul
> "Mul", x=x, y=y, name=name)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
> op_def=op_def)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
> op_def=op_def)
> File "/home/bshow/tensorflow/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
> self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
>
> ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[9842,60,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
> [[Node: comparison/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Reshape_1, inter-attention/beta)]]
> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
>
> [[Node: comparison_1/comparison/layer1/dense/Tensordot/Shape/_489 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_531_comparison_1/comparison/layer1/dense/Tensordot/Shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
> Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
|
I got the same issue. Any solutions? |
If your GPU memory is small, the only solution I can think of is running a smaller batch size. |
My GPU has 12 GB of memory, and I'm trying to run the embedding_lookup on 2 different GPUs with a batch size of 2 (I even tried batch size 1, haha). Is there no other workaround? |
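A quick size check on the log posted above may explain why lowering the batch size does not help there. The failing tensor has shape [9842,60,100], and its first dimension does not depend on the training batch size; 9842 appears to be the number of validation examples fed in one call. The sketch below reproduces the chunk sizes from the allocator dump. The 256-byte rounding is an assumption about how the BFC allocator aligns requests, and the helper names are made up for illustration.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def round_up(n, multiple=256):
    """Round n up to the next multiple (assumed allocator alignment)."""
    return -(-n // multiple) * multiple


# Shapes taken from the two OOM messages in the log.
request_60 = round_up(tensor_bytes((9842, 60, 100)))
request_56 = round_up(tensor_bytes((9842, 56, 100)))

# Largest free blocks listed in the allocator dump.
free_blocks = [178606592, 160621312, 9330432]

print("request for [9842,60,100]:", request_60, "bytes")
print("request for [9842,56,100]:", request_56, "bytes")
print("largest free block:      ", max(free_blocks), "bytes")
print("fits in one free block:  ", request_60 <= max(free_blocks))
```

The two results match the 236208128-byte and 220460800-byte chunks in the dump, so several validation-sized intermediates are already resident when the next one is requested, and no single free block listed is large enough to hold it.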
That sounds very weird. I don't have any other suggestions, as I haven't touched this code for over a year now. Which version of TensorFlow are you using? This code was originally written for TF 1.2; I haven't worked with TF in a while and have no idea what might have changed. |
I'm using TF 1.12.2. I don't think it's a compatibility issue though :/ |
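One possible workaround, judging only from the traceback in the first post: the error is raised inside `_run_on_validation`, where a single `session.run` call evaluates a tensor whose first dimension is 9842, so the validation data seems to be fed all at once regardless of the training batch size. Evaluating it in chunks and averaging the results should keep the intermediate tensors small. The sketch below is not the repository's code; `evaluate_in_chunks` and `run_batch` are hypothetical names, and it assumes the loss is a plain per-example mean with no batch-independent terms.

```python
def evaluate_in_chunks(run_batch, num_examples, chunk_size):
    """Average loss and accuracy over a dataset, one chunk at a time.

    run_batch(start, end) must return (mean_loss, mean_accuracy) for the
    examples in [start, end). In the real model it would slice the
    validation arrays and call session.run on that slice (hypothetical).
    """
    if num_examples <= 0 or chunk_size <= 0:
        raise ValueError("num_examples and chunk_size must be positive")

    total_loss = 0.0
    total_acc = 0.0
    for start in range(0, num_examples, chunk_size):
        end = min(start + chunk_size, num_examples)
        loss, acc = run_batch(start, end)
        weight = end - start  # the last chunk may be smaller
        total_loss += loss * weight
        total_acc += acc * weight
    return total_loss / num_examples, total_acc / num_examples


# Stand-in for the model: fixed per-example losses and correctness flags.
losses = [0.5, 1.5, 1.0, 2.0, 0.0, 1.0, 3.0]
correct = [1, 0, 1, 1, 0, 1, 0]


def fake_run_batch(start, end):
    n = end - start
    return sum(losses[start:end]) / n, sum(correct[start:end]) / n


chunked_loss, chunked_acc = evaluate_in_chunks(fake_run_batch, len(losses), 3)
print(chunked_loss, chunked_acc)
```

Weighting each chunk by its size keeps the result equal to the full-dataset mean even when the last chunk is shorter than the others.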
I'm trying to run the net, but I have an OOM issue. I have an NVIDIA GeForce GTX 960 with 2 GB of memory. Is it possible that my memory is just not enough?
I report the output of the net below: