Well, hello again. I have just learned that the host that recently had both NVMe drives fail upon drive replacement now has new problems: the filesystem reports permanent data errors affecting the databases of both the Matrix server and the Telegram bridge.

I have just rented a new machine and am about to restore the database snapshot from the 26th of July, just in case. All the troubleshooting of the recent days has been very exhausting; however, I will try to do this, or at least prepare it, within the upcoming hours.

Update

After a rescan the errors have gone away; however, the drives logged errors too. The question now is whether the data integrity can be trusted.

Status August 1st

Well … good question… Optimizations were made last night, the restore was successful and … we are back to debugging outgoing federation :(


The new hardware will also be a bit more powerful… and yes, I have not forgotten that I wanted to update that database. It’s just that I was busy debugging federation problems.

References

  • Milan@discuss.tchncs.de (OP, mod)
    1 year ago

    I am a bit confused now… the spare was at 98%, as can be read in my snippet above … where does it say “no spare available”? I think it is on me to request a swap, and that's what I did, as even the drive with slightly less wear reported 255% used, which afaik is an approximate lifetime estimate based on r/w cycles (not sure about all the factors).

    The one the hoster left in for me to play with said no:

    [Wed Jul 26 19:19:10 2023] nvme nvme1: I/O 9 QID 0 timeout, disable controller
    [Wed Jul 26 19:19:10 2023] nvme nvme1: Device shutdown incomplete; abort shutdown
    [Wed Jul 26 19:19:10 2023] nvme nvme1: Removing after probe failure status: -4
    

    I tried multiple kernel flags and such but couldn’t get past that error. It would have been interesting to have the hoster ship the thing to me (and maybe that would have been a long enough cooldown to get it working again), but I assume that would have been expensive from Helsinki.

    • Haui@discuss.tchncs.de
      1 year ago

      My bad. I must have misread. Sorry.

      Yes, shipping it to you would probably have been a good idea. Does it cost a lot less to use the Helsinki location? Otherwise Falkenstein would be a pretty good alternative, I guess.