Hello,
For the past couple of weeks we have been experiencing intermittent connectivity loss, almost daily. It does not always happen at the same time, and not strictly every day, but in the ESXi logs we see the following:
2019-03-13T20:06:20.426Z cpu2:2097220)igbn: indrv_UplinkReset:1447: indrv_UplinkReset : vmnic0 device reset started
2019-03-13T20:06:20.426Z cpu2:2097220)igbn: indrv_UplinkQuiesceIo:1411: Stopping I/O on vmnic0
2019-03-13T20:06:20.462Z cpu2:2097220)igbn: indrv_DeviceReset:2306: Device Resetting vmnic0
2019-03-13T20:06:20.462Z cpu2:2097220)igbn: indrv_ChangeState:326: vmnic0: change PF state from 2 to 8
2019-03-13T20:06:20.462Z cpu2:2097220)igbn: indrv_Stop:1890: stopping vmnic0
2019-03-13T20:06:20.462Z cpu2:2097220)igbn: indrv_ChangeState:326: vmnic0: change PF state from 8 to 4
2019-03-13T20:06:20.492Z cpu2:2097220)igbn: indrv_ChangeState:326: vmnic0: change PF state from 4 to 1
2019-03-13T20:06:20.492Z cpu2:2097220)igbn: indrv_ChangeState:326: vmnic0: change PF state from 1 to 20
2019-03-13T20:06:20.493Z cpu2:2097220)igbn: indrv_ChangeState:326: vmnic0: change PF state from 20 to 2
2019-03-13T20:06:20.493Z cpu2:2097220)igbn: indrv_UplinkStartIo:1393: Starting I/O on vmnic0
2019-03-13T20:06:20.507Z cpu2:2097220)igbn: indrv_UplinkReset:1464: indrv_UplinkReset : vmnic0 device reset completed
2019-03-13T20:06:27.426Z cpu2:2097220)NetqueueBal: 5032: vmnic0: device Up notification, reset logical space needed
2019-03-13T20:06:27.427Z cpu3:2212666)NetSched: 654: vmnic0-0-tx: worldID = 2212666 exits
2019-03-13T20:06:27.428Z cpu3:2224632)NetSched: 654: vmnic0-0-tx: worldID = 2224632 exits
2019-03-13T20:06:27.428Z cpu2:2097220)igbn: indrv_UplinkDisableCap:1114: Disable vmnic0 cap function 1
2019-03-13T20:06:27.428Z cpu2:2097220)igbn: indrv_UplinkDisableCap:1114: Disable vmnic0 cap function 4
...
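To get a feel for how long each reset lasts, the "device reset started" and "device reset completed" lines can be paired up. This is only a minimal sketch: the regular expression is built from the log lines quoted above, and the names `RESET_RE`, `parse_ts` and `reset_durations` are made up for illustration, not part of any VMware tool.

```python
import re
from datetime import datetime

# Matches the two igbn messages that bracket a reset in the excerpt above.
# The pattern is an assumption based only on those quoted lines.
RESET_RE = re.compile(
    r"^(?P<ts>\S+Z) cpu\d+:\d+\)igbn: indrv_UplinkReset:\d+: "
    r"indrv_UplinkReset : (?P<nic>vmnic\d+) device reset (?P<phase>started|completed)"
)


def parse_ts(ts):
    """Parse a vmkernel timestamp such as 2019-03-13T20:06:20.426Z (UTC)."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def reset_durations(lines):
    """Return (nic, start_time, seconds) for each started/completed pair."""
    started = {}
    results = []
    for line in lines:
        m = RESET_RE.match(line.strip())
        if not m:
            continue
        ts = parse_ts(m["ts"])
        if m["phase"] == "started":
            started[m["nic"]] = ts
        elif m["nic"] in started:
            begin = started.pop(m["nic"])
            results.append((m["nic"], begin, (ts - begin).total_seconds()))
    return results


# The two bracketing lines copied from the excerpt above.
sample = [
    "2019-03-13T20:06:20.426Z cpu2:2097220)igbn: indrv_UplinkReset:1447: "
    "indrv_UplinkReset : vmnic0 device reset started",
    "2019-03-13T20:06:20.507Z cpu2:2097220)igbn: indrv_UplinkReset:1464: "
    "indrv_UplinkReset : vmnic0 device reset completed",
]

for nic, begin, seconds in reset_durations(sample):
    print(f"{nic}: reset at {begin.isoformat()} took {seconds:.3f} s")
```

Feeding a complete vmkernel.log through `reset_durations` would list every reset with its start time, which should make it easier to see whether the resets follow a pattern.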
In syslog we see some related events, as the access switch in the datacenter reports an up/down event (newest entries first):
Date | Time | Date/Time | Facility | Level | Hostname | Message text |
13/03/2019 | 21:06:53 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:56.340772+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:56:339Z 'Activation.trace' 140109948208896 INFO [activationValidator, 1261] Trace objects loaded. |
13/03/2019 | 21:06:53 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:56.340557+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:56:337Z 'InternalScheduledTasksMgr' 140109948208896 INFO [internalScheduledTasksMgr, 853] Temp directory disk free space is:8407379968 |
13/03/2019 | 21:06:53 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:56.340338+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:56:337Z 'InternalScheduledTasksMgr' 140109948208896 INFO [internalScheduledTasksMgr, 804] Patch store disk free space is:104520380416 |
13/03/2019 | 21:06:53 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:56.340114+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:56:337Z 'InternalScheduledTasksMgr' 140109948208896 INFO [internalScheduledTasksMgr, 303] Internal Scheduled Tasks Manager Timercallback end of this timer slice.....Rescheduling after 300000000 microseconds |
13/03/2019 | 21:06:53 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:56.339871+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:56:337Z 'InternalScheduledTasksMgr' 140109948208896 INFO [internalScheduledTasksMgr, 724] InvokeCallbacks. Total number ofcallbacks: 7 |
13/03/2019 | 21:06:53 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:56.339443+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:56:337Z 'InternalScheduledTasksMgr' 140109948208896 INFO [internalScheduledTasksMgr, 194] Internal Scheduled Tasks Manager Timercallback... |
13/03/2019 | 21:06:24 | 3/13/19 21:06 | Local7 | Notice | Network Switch | 819: Mar 13 2019 20:06:27.732 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/2, changed state to up |
13/03/2019 | 21:06:22 | 3/13/19 21:06 | Local7 | Error | Network Switch | 818: Mar 13 2019 20:06:25.686 UTC: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to up |
13/03/2019 | 21:06:20 | 3/13/19 21:06 | Local7 | Error | Network Switch | 817: Mar 13 2019 20:06:22.875 UTC: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down |
13/03/2019 | 21:06:19 | 3/13/19 21:06 | Local7 | Notice | Network Switch | 816: Mar 13 2019 20:06:21.860 UTC: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/2, changed state to down |
13/03/2019 | 21:06:17 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:21.757084+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:21:756Z 'VcIntegrity' 140109287241472 INFO [vcIntegrity, 1536] Cannot get IP address for host name: tpvc-pvvm-003 |
13/03/2019 | 21:06:17 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:21.752196+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:21:742Z 'VcIntegrity' 140109287241472 INFO [vcIntegrity, 1519] Getting IP Address from host name: tpvc-pvvm-003 |
13/03/2019 | 21:06:17 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:21.742931+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:21:742Z 'Activation' 140109287241472 INFO [activationValidator, 368] Leave Validate. Succeeded forintegrity.VcIntegrity.retrieveHostIPAddresses on target: Integrity.VcIntegrity |
13/03/2019 | 21:06:07 | 3/13/19 21:06 | User | Info | vCenter | 1 2019-03-13T20:06:11.089546+00:00 tpvc-pvvm-003 vpxd 4459 - - Event [614748] [1-1] [2019-03-13T20:06:11.089256Z] [vim.event.UserLoginSessionEvent] [info] [TRUEPARTNER\sa-veeam] [] [614748] [UserTRUEPARTNER\sa-veeam@x.x.3.101 logged in as VMware VI Client] |
13/03/2019 | 21:05:58 | 3/13/19 21:05 | User | Debug | vCenter | 1 2019-03-13T20:06:02.401034+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:02:400Z 'JobDispatcher' 140109819787008 DEBUG [JobDispatcher, 415] The number of tasks: 0 |
13/03/2019 | 21:05:58 | 3/13/19 21:05 | Cron | Info | vCenter | 1 2019-03-13T20:06:01.830614+00:00 tpvc-pvvm-003 CROND 55022 - - (root) CMD (. /etc/profile.d/VMware-visl-integration.sh; /usr/lib/applmgmt/backup_restore/scripts/SchedulerCron.py>>/var/log/vmware/applmgmt/backupSchedulerCron.log 2>&1) |
13/03/2019 | 21:05:57 | 3/13/19 21:05 | Cron | Info | vCenter | 1 2019-03-13T20:06:01.830222+00:00 tpvc-pvvm-003 CROND 55021 - - (root) CMD ( test -x /usr/sbin/vpxd_periodic && /usr/sbin/vpxd_periodic >/dev/null 2>&1) |
13/03/2019 | 21:05:57 | 3/13/19 21:05 | User | Info | vCenter | 1 2019-03-13T20:06:01.756909+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:01:756Z 'VcIntegrity' 140109286442752 INFO [vcIntegrity, 1536] Cannot get IP address for host name: tpvc-pvvm-003 |
13/03/2019 | 21:05:57 | 3/13/19 21:05 | User | Info | vCenter | 1 2019-03-13T20:06:01.752179+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:01:743Z 'VcIntegrity' 140109286442752 INFO [vcIntegrity, 1519] Getting IP Address from host name: tpvc-pvvm-003 |
13/03/2019 | 21:05:57 | 3/13/19 21:05 | User | Info | vCenter | 1 2019-03-13T20:06:01.744337+00:00 tpvc-pvvm-003 updatemgr - - - 2019-03-13T20:06:01:743Z 'Activation' 140109286442752 INFO [activationValidator, 368] Leave Validate. Succeeded forintegrity.VcIntegrity.retrieveHostIPAddresses on target: Integrity.VcIntegrity |
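The switch side can be measured the same way. The sketch below pairs each "changed state to down" message with the next "changed state to up" for the same interface and message type. The pattern is again an assumption derived only from the four switch messages in the table, and `UPDOWN_RE` and `flap_durations` are hypothetical names.

```python
import re
from datetime import datetime

# Matches the Cisco link and line-protocol messages quoted in the table above.
UPDOWN_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2} +\d+ \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) UTC: "
    r"%(?P<kind>LINK-3|LINEPROTO-5)-UPDOWN: .*?Interface (?P<port>\S+), "
    r"changed state to (?P<state>up|down)"
)


def flap_durations(messages):
    """Return (kind, port, seconds_down) for each down/up pair."""
    events = []
    for msg in messages:
        m = UPDOWN_RE.search(msg)
        if m:
            ts = datetime.strptime(m["ts"], "%b %d %Y %H:%M:%S.%f")
            events.append((ts, m["kind"], m["port"], m["state"]))
    events.sort()  # the table is newest first, so order by switch timestamp

    down_since = {}
    flaps = []
    for ts, kind, port, state in events:
        key = (kind, port)
        if state == "down":
            down_since[key] = ts
        elif key in down_since:
            flaps.append((kind, port, (ts - down_since.pop(key)).total_seconds()))
    return flaps


# Message column of the four switch rows in the table above.
sample = [
    "819: Mar 13 2019 20:06:27.732 UTC: %LINEPROTO-5-UPDOWN: Line protocol on "
    "Interface GigabitEthernet1/0/2, changed state to up",
    "818: Mar 13 2019 20:06:25.686 UTC: %LINK-3-UPDOWN: "
    "Interface GigabitEthernet1/0/2, changed state to up",
    "817: Mar 13 2019 20:06:22.875 UTC: %LINK-3-UPDOWN: "
    "Interface GigabitEthernet1/0/2, changed state to down",
    "816: Mar 13 2019 20:06:21.860 UTC: %LINEPROTO-5-UPDOWN: Line protocol on "
    "Interface GigabitEthernet1/0/2, changed state to down",
]

for kind, port, seconds in flap_durations(sample):
    print(f"{kind} {port}: down for {seconds:.3f} s")
```

Comparing these durations with the reset times from the ESXi log is one way to check whether every switch-side flap lines up with a vmnic0 reset.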
Hardware profile:
Cisco 3750 switch stack
Server hardware:
- Hypervisor: VMware ESXi, 6.7.0, 11675023
- Model: PowerEdge R720
- Processor Type: Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz
- Logical Processors: 8
- NICs: 4
- Virtual Machines: 22
- State: Connected
- Uptime: 11 days
At the moment it is also unclear why vmnic1 does not take over the traffic, as it is configured to become active when the primary link (vmnic0) fails.
Any suggestions are welcome, and if more information is required, please let me know.
Regards,
Martin Meuwese