Company brought to its knees by a cable

ExtremeDullard@lemmy.sdf.org · edit-2 3 days ago

Gobo@lemmy.world · 3 days ago

Yeah. This is what spanning tree and bpduguard are for. Don’t disable them on your edge.

mlg@lemmy.world · 3 days ago

Lol imagine the poor dude in his office who was just bored and thought “what if I plug this cable back into the hub, probably won’t do anything”

ExtremeDullard@lemmy.sdf.org · edit-2 3 days ago

Actually this happened in the lab. I know exactly who did this because he told me: we were discussing what had happened and he said “Oh yeah, Daniel and I needed to connect this Windows machine to the intranet quick because we had something urgent to do, and we connected all the ends of the nest of Ethernet cables at random until the machine connected. And then we left everything as it was.” But bad luck for us, their machine was connected, but so was that fatal cable on both ends. It just happened that their machine kept working well enough for them to finish what they were doing without noticing the problems right away.

And in case you wonder, there’s no penalty in our company for owning up to honest mistakes, so that’s why he readily admitted to it. Only people who never do anything never do anything wrong.

Randelung@lemmy.world · 3 days ago

That’s a healthy attitude! The blame game is useless in most cases.

GreyEyedGhost@lemmy.ca · 2 days ago

I do hope you taught him the many better ways of doing this. I absolutely agree with making an environment where mistakes are easily owned up to (I made a mistake that ended up costing my employer over $10k in the last year), but if it isn’t coupled with turning those into learning experiences (here’s why you don’t do that, here’s why this is a better solution) then you just have a lot of mistakes happening over and over again.

Socsa@sh.itjust.works · 3 days ago

In my experience it’s either someone doing it on purpose, or someone accidentally pulling the wrong cable out of a rat’s nest.

oleorun@real.lemmy.fan · 3 days ago

This got me too once. I was in the server room replacing old 110 punch panels/blocks with 8P8C connections. I lost track of cable connections, a mistake I have learned from, and I looped a patch cable into the same switch. Within moments the entire network went down.

Forty-five minutes later and we figured out the loop.

Another lesson learned: HP Procurve switches did not have Spanning Tree enabled by default.

Anyway, mistakes happen, especially in IT. It’s all part of the learning experience. My boss was the coolest, chillest guy in the world so I learned and moved on.

dukatos@lemm.ee · 2 days ago

Managed switches are not expensive and have death loop protection.

AstridWipenaugh@lemmy.world · 3 days ago

I was diagnosing a network bottleneck at a customer site that didn’t make any sense. Literally everything had gigabit connections except one block of cubicles, but all the devices were connected to the same subnet router for that part of the building. Started tracing wires like you did and found that someone didn’t have a long enough cable when building the office and installed a 10 megabit Linksys switch in the drop ceiling to connect two short cables. Rather than fix the cable, the customer just went to Best Buy and bought a gigabit Linksys switch to replace it… A multi-million dollar operation is being held together by a $10 switch…

ramble81@lemm.ee · 3 days ago

I really hope you meant “switch” when saying “hub”. I haven’t seen a hub used in decades. Also your switch should have some level of STP protection enabled to prevent that. Even if someone had a hub with a loop on it, STP would have disabled the ports.

dan@upvote.au · 3 days ago

Basic unmanaged switches often don’t have any sort of protection, and on some fancier managed switches it’s disabled by default (no idea why)

Jajcus@sh.itjust.works · edit-2 15 hours ago

no idea why

Because it makes initial connection much slower. Dumb switch: you insert a cable and it works. STP-enabled switch: you insert a cable and it takes a while until the port is enabled (unless you do extra configuration, appropriate for your network topology). This is annoying and for inexperienced users it could seem like the switch ‘does not work’. It is easier to sell a switch without such a feature enabled by default.
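
To put rough numbers on that delay, here is a minimal Python sketch. It assumes classic IEEE 802.1D spanning tree with its default forward delay of 15 seconds, where a new port sits in the listening state and then the learning state before it forwards anything. The function name and the `edge_port` flag are made up for illustration; the flag stands in for the "extra configuration" mentioned above (PortFast-style edge ports). Rapid spanning tree variants usually converge much faster than this.

```python
# Classic IEEE 802.1D default; most managed switches let you change it.
FORWARD_DELAY_S = 15


def seconds_until_forwarding(forward_delay_s: int = FORWARD_DELAY_S,
                             edge_port: bool = False) -> int:
    """Time a freshly linked port waits before it passes user traffic.

    Hypothetical helper: models listening + learning, each lasting one
    forward delay, and an edge port that skips both states.
    """
    if edge_port:
        return 0
    return 2 * forward_delay_s


if __name__ == "__main__":
    print(seconds_until_forwarding())                # 30
    print(seconds_until_forwarding(edge_port=True))  # 0
```

Thirty seconds is long enough for a DHCP client to give up, which is why an untuned STP port can look broken to someone used to dumb switches.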

NaibofTabr@infosec.pub · 3 days ago

the tyranny of the default strikes again

Socsa@sh.itjust.works · 3 days ago

Yup, the good old “loopback FU.”

Routers do have some protections which can mitigate this, but the entire problem is broadcast flooding, which can’t really be dealt with at layer 2, or even at layer 3 within the same segment. Most places will have no broadcast forwarding between segments, but even if you detect unusual broadcast activity and ban that class of traffic, you break other things. A lot of times it is ARP floods, so it doesn’t happen when the network is static and converged until someone plugs a new laptop in, and then everyone assumes it’s that laptop.

°˖✧ ipha ✧˖°@lemm.ee · 3 days ago

But there’s theory and there’s reality.

Mood. I can’t count the times I’ve found issues that shouldn’t be possible, but are clearly happening.

oleorun@real.lemmy.fan · 3 days ago

We used to use Malwarebytes Corporate Edition at work.

One afternoon all of our web servers stopped responding to traffic on port 443. I could RDC into the servers, and I could ping them, but most traffic wasn’t being passed properly.

Despite not having made any changes, I did everything I could think of to get them to work. I tried moving them to different switches, different static IPs, Wireshark showed packets flowing, but no web traffic.

I left the office. It was around 8 PM and I had been banging my head on my desk trying to figure out what the hell was going on.

I came back around 10 PM, mind clear and stomach topped off. I worked a few more minutes, then heard the Outlook ding.

Mass email from Malwarebytes CEO. Bad update. Blocked all class B IP addresses by mistake (guess which class we used). Mea culpa. So sorry. New update fixes things.

I immediately uninstalled MWB CE and boom. Services restored.

The next week we got our licenses refunded by our VAR and we never used that product again.

Possibly linux@lemmy.zip · 3 days ago

Uninstalling antivirus should be step one

NaibofTabr@infosec.pub · 3 days ago

“In theory, theory and practice are the same. But in practice…”

mindbleach@sh.itjust.works · 2 days ago

Accidental ring architecture.

It is surprising the switch doesn’t occasionally check for zero-ping echo between plugs.

Orbituary@lemmy.world · 3 days ago

Just reading the title of the post I knew what happened. I read through the whole thing because your story was good and I was in suspense to figure out if it was a router or voip phone that was the culprit.

Had this happen at work about a decade ago.

Nooodel@lemmy.world · 2 days ago

Turns out a large excellence cluster technical university can do the same and bring down an entire campus for 2 days. Everything is in one big intranet, has main lines with high throughput routed to a large network node and one backup line from the local internet provider. It killed the main lines and thousands of staff plus some tens of thousands of students were connected through a household class fiber connection. That was fun :)

dan@upvote.au · 3 days ago

By “hub”, do you mean switch? I haven’t seen a hub in a very long time. I don’t think I’ve ever seen a 1Gbps one.

ExtremeDullard@lemmy.sdf.org · 3 days ago

Yeah I keep calling them hubs incorrectly…

Possibly linux@lemmy.zip · edit-2 3 days ago

There are such things as small 1Gbps hubs that are designed to just handle a small network. They scare me as they are cheap on Amazon and could theoretically bring a network to its knees if a random user finds a port that isn’t authenticated.

Nougat@fedia.io · 3 days ago

For the passers-by, in very simple terms:

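A switch maintains a table of the MAC addresses of the devices attached to it and the port each one was seen on (the MAC address table; the ARP [Address Resolution Protocol] table, which maps IPs to MAC addresses, lives on hosts and routers instead). When a frame comes into the switch for a specific destination MAC address, the switch looks up in that table where the destination can be found, and only sends the frame out on the port the destination device (or next hop towards that device) is connected to. Broadcasts, and frames for addresses it has not learned yet, still go out of every port.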

A hub doesn’t do any of that. Every packet that comes into the hub gets sent out of every port on the hub, to every device connected to the hub. It’s on the connected devices to discard packets that aren’t addressed to them. On anything but a very small and relatively slow network, this would create an unnecessarily large amount of traffic, not to mention the security issue around sending packets to devices they’re not addressed to.
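
The difference can be sketched in a few lines of Python. This is a toy model, not how any particular product is implemented: the class names, the port numbers and the addresses are made up, and the table is keyed on MAC addresses because that is what a layer 2 switch forwards on.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class Hub:
    """Repeats every frame out of every port except the one it arrived on."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports

    def forward(self, in_port: int, src: str, dst: str) -> List[int]:
        return [p for p in range(self.num_ports) if p != in_port]


class LearningSwitch:
    """Learns which port each sender is on and forwards by destination."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        self.mac_table: Dict[str, int] = {}

    def forward(self, in_port: int, src: str, dst: str) -> List[int]:
        # Learn (or refresh) where the sender lives.
        self.mac_table[src] = in_port
        out_port = self.mac_table.get(dst)
        if dst == BROADCAST or out_port is None:
            # Broadcasts and unknown destinations are flooded, hub-style.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            # Destination sits on the port the frame came from: drop it.
            return []
        return [out_port]


if __name__ == "__main__":
    a, b = "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"
    hub, switch = Hub(4), LearningSwitch(4)
    print(hub.forward(0, a, b))     # [1, 2, 3]
    print(switch.forward(0, a, b))  # [1, 2, 3] (b not learned yet)
    print(switch.forward(1, b, a))  # [0]
    print(switch.forward(0, a, b))  # [1]
```

Note that the switch still floods broadcasts, which is the part that matters once a cable loop is involved.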

pastermil@sh.itjust.works · 3 days ago

Does that kind of loop really mess with things? ELI5 please!

Also, what do you mean a lonely switch? Does it have that loop and a port connected to another switch in the network?

stoy@lemmy.zip · 3 days ago

IT tech here, yes, yes it can.

Network infrastructure is both incredibly smart while also being dumb in other ways.

To do an ELI5 answer:

Imagine you have a container of pearls that you need to sort, red, green and blue pearls all need to be dropped into a red, green or blue hole.

The container is being refilled, but slow enough that it only gets a new pearl once you have sorted the previous.

The holes are connected to pipes going to separate buckets.

Everything is fine, but then someone adds a new hole that is multicolored and tells you that all pearls should go there.

You tell your friends that you have a faster way to deal with the pearls and to send you their pearls.

The new hole also has a pipe, but that is connected to the container that receives pearls, so every time you drop a pearl into the new hole, it appears in the container again.

So now you have a situation where you not only get your normal amount of pearls, but everyone else’s pearls and you also get every pearl you send back again.

You are smart, so you quickly realize that something is wrong and call your teacher for help. Networking gear doesn’t have the capability to understand that something is wrong; it just looks at each pearl and not the big picture.

If we go back to the real world, we have developed tools to deal with this situation: we have protocols like spanning tree, which let switches speak with each other and figure out if there is a physical loop before sending traffic through it.

There are other tools as well, but they all need to be configured and to be honest, it is easily forgotten or made a low priority since it happens rarely.

It is something that is often implemented after a big outage.
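
For the curious, the core idea behind spanning tree can be sketched in Python. This is a heavy simplification and an assumption-laden toy: real STP elects a root bridge and compares path costs by exchanging BPDUs between switches, while this sketch just looks at the whole topology from above, picks the lowest-named switch as root, and walks outward breadth-first. The function name and the switch names are made up.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]


def spanning_tree(links: List[Link]) -> Tuple[List[Link], List[Link]]:
    """Split cables into (forwarding, blocked) so no loop carries traffic."""
    neighbors: Dict[str, List[Tuple[str, int]]] = {}
    for idx, (a, b) in enumerate(links):
        neighbors.setdefault(a, []).append((b, idx))
        neighbors.setdefault(b, []).append((a, idx))

    forwarding_idx: Set[int] = set()
    visited: Set[str] = set()
    # Sorted order makes the lowest-named switch the root of each island.
    for root in sorted(neighbors):
        if root in visited:
            continue
        visited.add(root)
        queue = deque([root])
        while queue:
            node = queue.popleft()
            for peer, idx in sorted(neighbors[node]):
                if peer not in visited:
                    # First cable that reaches a new switch stays active.
                    visited.add(peer)
                    forwarding_idx.add(idx)
                    queue.append(peer)

    forwarding = [links[i] for i in sorted(forwarding_idx)]
    # Every other cable would close a loop, so it gets blocked.
    blocked = [l for i, l in enumerate(links) if i not in forwarding_idx]
    return forwarding, blocked


if __name__ == "__main__":
    cables = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw1"), ("sw2", "sw2")]
    active, blocked = spanning_tree(cables)
    print(active)   # [('sw1', 'sw2'), ('sw3', 'sw1')]
    print(blocked)  # [('sw2', 'sw3'), ('sw2', 'sw2')]
```

The last cable in the example, `("sw2", "sw2")`, is the patch cable plugged into the same switch at both ends, and it ends up blocked along with the redundant link in the triangle.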

Socsa@sh.itjust.works · 3 days ago

Certain types of broadcast traffic always get re-broadcast out of every port on a switch. So if you directly connect two ports, and you get some broadcast coming into the switch, that broadcast will loop forever across that loopback, and then get propagated repeatedly until it hits a broadcast boundary. It’s surprisingly difficult to prevent even with managed switches unless you are willing to hand manage every port and significantly restrict the kind of network services which can flow through it.

Some devices can detect these loops and break them, but that can have other unintended impacts if your network is designed (some would argue poorly) around using dumb switches to multiply limited Ethernet drops at the edge.
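
A toy Python model of a single switch shows why one broadcast is enough. It assumes a switch with no loop protection, ignores link speed and timing entirely, and treats each "tick" as one pass through the switch; the function name and port numbers are made up for illustration.

```python
from typing import Dict, List


def simulate_broadcast(num_ports: int, loops: Dict[int, int],
                       ticks: int) -> List[int]:
    """Count broadcast copies travelling over loop cables after each tick.

    `loops` maps a port to the port at the other end of its cable, in both
    directions, e.g. {2: 3, 3: 2}. Port 0 is assumed to be an ordinary host.
    """
    arriving = [0]  # one broadcast frame from the host on port 0
    history = []
    for _ in range(ticks):
        next_arriving = []
        for in_port in arriving:
            # Flood out of every port except the one the frame came in on.
            for out_port in range(num_ports):
                if out_port == in_port:
                    continue
                if out_port in loops:
                    # The copy travels the cable and re-enters the switch.
                    next_arriving.append(loops[out_port])
        arriving = next_arriving
        history.append(len(arriving))
    return history


if __name__ == "__main__":
    print(simulate_broadcast(4, {}, 5))                        # [0, 0, 0, 0, 0]
    print(simulate_broadcast(4, {2: 3, 3: 2}, 5))              # [2, 2, 2, 2, 2]
    print(simulate_broadcast(6, {2: 3, 3: 2, 4: 5, 5: 4}, 5))  # [4, 12, 36, 108, 324]
```

With no loop the broadcast is delivered once and is gone. With one looped cable, two copies circle forever (Ethernet frames have no hop limit) and every other port gets a fresh copy on each pass. With two looped cables the copies multiply on every pass.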

Possibly linux@lemmy.zip · 3 days ago

You can MAC-lock the port

Possibly linux@lemmy.zip · 3 days ago

If you are using a hub then that’s expected as they tend to be one of the main sources of floods on a network.

If you have managed switches make sure you turn on loop protection and alerting. Ideally you should immediately know when something like that happens.

Also bonus if you set up VLANs with different subnets. From there, practice least privilege and block all forwarded traffic by default.

stringere@sh.itjust.works · 3 days ago

I managed to accomplish this at my first IT job, but I used broadcast with Symantec Ghost on a 10 port 100k/1mb hub to bring our office down without knowing any better! They bought me a 10/100 switch to push laptop images with after that incident.