[Jenkins-infra] JIRA and Confluence incident

R. Tyler Croy tyler at monkeypox.org
Tue Dec 17 16:59:52 UTC 2013


Looks like both JIRA and Confluence croaked overnight due to failures
triggered by the kernel's OOM killer.

If you have access to eggplant, you'll see some errors like these in
/var/log/messages:

    Dec 17 15:56:00 eggplant kernel: [12940965.080527] lowmem_reserve[]: 0 0 0 0
    Dec 17 15:56:00 eggplant kernel: [12940965.080529] Node 0 DMA: 3*4kB 53*8kB 22*16kB 24*32kB 11*64kB 7*128kB 1*256kB 1*512kB 0*1024kB 3*2048kB 0*4096kB = 10068kB
    Dec 17 15:56:00 eggplant kernel: [12940965.080535] Node 0 DMA32: 1099*4kB 10*8kB 0*16kB 10*32kB 8*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 6332kB
    Dec 17 15:56:00 eggplant kernel: [12940965.080541] 14000 total pagecache pages
    Dec 17 15:56:00 eggplant kernel: [12940965.080542] 13790 pages in swap cache
    Dec 17 15:56:00 eggplant kernel: [12940965.080547] Swap cache stats: add 9876006, delete 9862216, find 78826484/80335245
    Dec 17 15:56:00 eggplant kernel: [12940965.080548] Free swap  = 0kB
    Dec 17 15:56:00 eggplant kernel: [12940965.080549] Total swap = 530140kB
    Dec 17 15:56:00 eggplant kernel: [12940965.085462] 655341 pages RAM
    Dec 17 15:56:00 eggplant kernel: [12940965.085464] 12119 pages reserved
    Dec 17 15:56:00 eggplant kernel: [12940965.085465] 975 pages shared
    Dec 17 15:56:00 eggplant kernel: [12940965.085466] 637807 pages non-shared
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] apache2 invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] apache2 cpuset=/ mems_allowed=0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Pid: 18262, comm: apache2 Not tainted 2.6.35-24-virtual #42-Ubuntu
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Call Trace:
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff810ae6dd>] ? cpuset_print_task_mems_allowed+0x9d/0xb0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81103891>] dump_header+0x81/0xc0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81103951>] oom_kill_process+0x81/0x180
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81103e88>] __out_of_memory+0x58/0xd0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81103f86>] out_of_memory+0x86/0x1c0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff811079de>] __alloc_pages_slowpath+0x58e/0x5a0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81107b54>] __alloc_pages_nodemask+0x164/0x1d0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81139cca>] alloc_pages_current+0x9a/0x100
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81100fa7>] __page_cache_alloc+0x87/0x90
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff8110b119>] __do_page_cache_readahead+0xc9/0x210
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff8110b281>] ra_submit+0x21/0x30
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81102793>] filemap_fault+0x3f3/0x450
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff8111d684>] __do_fault+0x54/0x560
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81120629>] handle_mm_fault+0x1b9/0x440
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81036079>] ? kvm_clock_get_cycles+0x9/0x10
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff81089649>] ? ktime_get_ts+0xa9/0xe0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff815a82b5>] do_page_fault+0x125/0x350
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  [<ffffffff815a4e35>] page_fault+0x25/0x30
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Mem-Info:
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Node 0 DMA per-cpu:
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] CPU    0: hi:    0, btch:   1 usd:   0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] CPU    1: hi:    0, btch:   1 usd:   0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Node 0 DMA32 per-cpu:
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] CPU    0: hi:  186, btch:  31 usd:   0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] CPU    1: hi:  186, btch:  31 usd:  32
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] active_anon:488663 inactive_anon:122659 isolated_anon:0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  active_file:0 inactive_file:0 isolated_file:0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  unevictable:0 dirty:0 writeback:0 unstable:0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  free:4099 slab_reclaimable:3046 slab_unreclaimable:5062
    Dec 17 15:56:00 eggplant kernel: [12940965.090869]  mapped:69 shmem:149 pagetables:2419 bounce:0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Node 0 DMA free:10064kB min:36kB low:44kB high:52kB active_anon:2304kB inactive_anon:2548kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15700kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:180kB slab_unreclaimable:732kB kernel_stack:48kB pagetables:8kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] lowmem_reserve[]: 0 2509 2509 2509
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Node 0 DMA32 free:6332kB min:6388kB low:7984kB high:9580kB active_anon:1952348kB inactive_anon:488088kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2569428kB mlocked:0kB dirty:0kB writeback:0kB mapped:276kB shmem:596kB slab_reclaimable:12004kB slab_unreclaimable:19516kB kernel_stack:5224kB pagetables:9668kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:256 all_unreclaimable? yes
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] lowmem_reserve[]: 0 0 0 0
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Node 0 DMA: 3*4kB 53*8kB 22*16kB 24*32kB 11*64kB 7*128kB 1*256kB 1*512kB 0*1024kB 3*2048kB 0*4096kB = 10068kB
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Node 0 DMA32: 1120*4kB 11*8kB 0*16kB 10*32kB 8*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 6424kB
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] 14000 total pagecache pages
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] 13798 pages in swap cache
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Swap cache stats: add 9876020, delete 9862222, find 78826486/80335257
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Free swap  = 0kB
    Dec 17 15:56:00 eggplant kernel: [12940965.090869] Total swap = 530140kB
    Dec 17 15:56:00 eggplant kernel: [12940965.095423] 655341 pages RAM
    Dec 17 15:56:00 eggplant kernel: [12940965.095425] 12119 pages reserved
    Dec 17 15:56:00 eggplant kernel: [12940965.095426] 975 pages shared
    Dec 17 15:56:00 eggplant kernel: [12940965.095427] 637808 pages non-shared
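
For anyone who wants to pull the relevant bits out of a log like the one
above, here is a minimal Python sketch. It is only an illustration, not
something that was run on eggplant: the function name summarize_oom and both
regular expressions are assumptions based on the line format shown in this
excerpt, and other kernel versions may format these lines differently.

```python
import re

# Matches syslog kernel lines such as:
#   "... kernel: [12940965.090869] apache2 invoked oom-killer: gfp_mask=..."
# The pattern is an assumption derived from the excerpt above.
OOM_RE = re.compile(r"kernel: \[\s*[\d.]+\] (\S+) invoked oom-killer")

# Matches the "Free swap  = 0kB" / "Total swap = 530140kB" lines.
SWAP_RE = re.compile(r"(Free|Total) swap\s*=\s*(\d+)kB")


def summarize_oom(lines):
    """Return (processes that invoked the OOM killer, last swap figures in kB)."""
    invokers = []
    swap = {}
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            invokers.append(m.group(1))
            continue
        m = SWAP_RE.search(line)
        if m:
            # Later lines overwrite earlier ones, so the most recent values win.
            swap[m.group(1).lower()] = int(m.group(2))
    return invokers, swap


if __name__ == "__main__":
    sample = [
        "Dec 17 15:56:00 eggplant kernel: [12940965.080548] Free swap  = 0kB",
        "Dec 17 15:56:00 eggplant kernel: [12940965.080549] Total swap = 530140kB",
        "Dec 17 15:56:00 eggplant kernel: [12940965.090869] apache2 invoked "
        "oom-killer: gfp_mask=0x201da, order=0, oom_adj=0",
    ]
    print(summarize_oom(sample))
```

On a host where you can read the log, passing an open file works the same
way, e.g. summarize_oom(open("/var/log/messages")). In the excerpt above the
telling combination is "Free swap = 0kB" next to an "invoked oom-killer"
line: swap was exhausted when apache2 triggered the killer.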


I'm bringing these services back online now.

- R. Tyler Croy
------------------
   Code: https://github.com/rtyler
Chatter: https://twitter.com/agentdero
         rtyler at jabber.org
