this post was submitted on 13 Nov 2024

HDD randomly unmounting (feddit.it)

submitted 1 week ago* (last edited 1 week ago) by dontblink@feddit.it to c/selfhosted@lemmy.world

Hi!

First of all, sorry if this is the wrong place to ask; if it is, please point me to a better-suited channel!

Anyway, I've got this old 2TB HDD attached to an RPi 4B. It worked flawlessly until now, but in the last few days it started disconnecting randomly.

If I reboot, it mounts back again.

This is the df output:

/dev/sdb1       1.8T  535G  1.2T  31% /mnt/2tb

And this is the output of sudo dmesg | grep sdb (the device is sdb, of course):

[   14.970908] sd 1:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[   14.978857] sd 1:0:0:0: [sdb] 4096-byte physical blocks
[   14.984484] sd 1:0:0:0: [sdb] Write Protect is off
[   14.989382] sd 1:0:0:0: [sdb] Mode Sense: 43 00 00 00
[   14.989684] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   15.044802] sd 1:0:0:0: [sdb] Preferred minimum I/O size 4096 bytes
[   15.051196] sd 1:0:0:0: [sdb] Optimal transfer size 33553920 bytes not a multiple of preferred minimum block size (4096 bytes)
[   15.065585]  sdb: sdb1
[   15.068403] sd 1:0:0:0: [sdb] Attached SCSI disk
[   22.631983] EXT4-fs (sdb1): recovery complete
[   22.660922] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Quota mode: none.

The device has an external power supply of its own, so I don't think it's a power issue. This setup has worked for a couple of years.

I can't see anything wrong here. Perhaps the HDD is going bad?

top 8 comments

[–] seaQueue@lemmy.world 11 points 1 week ago* (last edited 1 week ago) (2 children)

Don't just look at sdb hits in the log. Open up that entire session in journalctl kernel mode (journalctl -k -bN where N is the session number in session history) and find the context surrounding the drive dropping and reconnecting.

You'll probably find that something caused a USB bus reset or a similar event before the drive dropped and reconnected. If you find nothing like that, try switching power supplies for the HDD and/or moving the drive to a different USB root port. Use lsusb -t and swap ports until the drive is attached beneath a different root port. You might have a neighboring USB device attached to the bus that's causing issues for other devices attached to the same root port (it happens; USB devices or drivers sometimes behave badly).

Always look at the context of the event when you're troubleshooting a failure like this, don't just drill down on the device messages. Most of the time the real cause of the issue preceded the symptom by a bit of time.
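
A minimal sketch of that approach, assuming a systemd journal that still holds the boot in question (earlier boots are only kept when journald storage is persistent). show_context is a hypothetical helper name, and the pattern list is only a starting guess based on the kind of messages discussed in this thread; adjust both to taste.

```shell
# List the boots the journal knows about. The first column is the index
# that -b accepts (0 = current boot, -1 = the one before it):
#   journalctl --list-boots
#
# Hypothetical helper: read a kernel log on stdin and print every matching
# line together with its neighbours, so that whatever happened on the USB
# bus shortly before the error stays visible.
show_context() {
    before="${1:-30}"   # lines of context before each hit
    after="${2:-5}"     # lines of context after each hit
    grep -n -i -B "$before" -A "$after" \
        -E 'I/O error|cannot reset|usb .*disconnect|Remounting filesystem read-only'
}

# Usage on the real machine (previous boot, kernel messages only):
#   journalctl -k -b -1 --no-pager | show_context 30 5
```

Matching lines are printed with a line number followed by a colon, context lines with a line number followed by a dash, which makes it easy to see what preceded each error.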

[–] dontblink@feddit.it 1 points 18 hours ago* (last edited 18 hours ago)

Thank you so much for taking the time to answer!

I'm not sure how to get the N from session history, nor how to check my session history..

But this might be some relevant output I found with journalctl -k -b:

Nov 21 16:08:18 rpi kernel: usb 2-2.1-port2: cannot reset (err = -110)
Nov 21 16:08:19 rpi kernel: usb 2-2.1-port2: cannot reset (err = -110)
Nov 21 16:08:19 rpi kernel: usb 2-2.1-port2: Cannot enable. Maybe the USB cable is bad?

Nov 21 16:41:57 rpi kernel: I/O error, dev sdb, sector 2466347032 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 2
Nov 21 16:41:57 rpi kernel: EXT4-fs warning (device sdb1): ext4_dx_find_entry:1796: inode #75497968: lblock 42: comm apache2: error -5 reading directory block
Nov 21 16:41:57 rpi kernel: EXT4-fs error (device sdb1): ext4_journal_check_start:83: comm apache2: Detected aborted journal
Nov 21 16:41:57 rpi kernel: Buffer I/O error on dev sdb1, logical block 0, lost sync page write
Nov 21 16:41:57 rpi kernel: EXT4-fs (sdb1): I/O error while writing superblock
Nov 21 16:41:57 rpi kernel: EXT4-fs (sdb1): Remounting filesystem read-only

The output is from yesterday, when the device stopped working correctly.

I'm not familiar with the Linux kernel, but I can see there is definitely something wrong...

The HDD (old) is attached to a USB hub (new). I tried switching ports on the hub, but the same issue happened again. If I try to mount it with sudo mount /mnt/2tb, it says it is already mounted:

mount: /mnt/2tb: /dev/sdb1 already mounted on /mnt/2tb.
       dmesg(1) may have more information after failed mount system call.

sudo dmesg | grep sdb gives back:

[147776.801028] I/O error, dev sdb, sector 77904 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 2
[147776.815452] EXT4-fs warning (device sdb1): htree_dirblock_to_tree:1083: inode #2: lblock 0: comm ls: error -5 reading directory block
[147796.731734] sdb1: Can't mount, would change RO state
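
The "would change RO state" line is consistent with the earlier "Remounting filesystem read-only" message: the filesystem is most likely still mounted, just read-only, so a plain mount is refused. A rough sketch of how one might confirm that, assuming the device and mount point from this post; mount_is_ro is a hypothetical helper, and the device name may differ if the drive re-enumerated after a reconnect.

```shell
# Hypothetical helper: exit 0 if the given mount point is listed with the
# "ro" option. Expects /proc/mounts-style lines on stdin:
#   device mountpoint fstype options dump pass
mount_is_ro() {
    awk -v mp="$1" '
        $2 == mp { found = 1; ro = 0
                   n = split($4, opts, ",")
                   for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1 }
        END { exit (found && ro) ? 0 : 1 }'
}

# Usage on the real machine:
#   mount_is_ro /mnt/2tb < /proc/mounts && echo "mounted read-only"
#
# If it is read-only, a cautious recovery sequence (once the cable and hub
# have been ruled out) would be:
#   sudo umount /mnt/2tb
#   sudo fsck.ext4 -f /dev/sdb1   # only on an unmounted filesystem
#   sudo mount /mnt/2tb
```

This only clears the symptom. If the USB link keeps dropping, the filesystem will go read-only again.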

[–] hendrik@palaver.p3x.de 3 points 1 week ago

Very good answer. I've also spent some time analyzing some red herrings when it was something else like a bad cable or connector. And by the way, you can use the same keys in journalctl as in the usual pager (less(?)) so hit / and search for 'unmount', 'disconnect', etc. And then scroll through the log and find out what led to the situation.

[–] AceBonobo@lemmy.world 4 points 1 week ago

Maybe the power supply is dying? Do you move it often? Or could the USB cable be degrading?

[–] AceSLS@ani.social 4 points 1 week ago

Sounds like the HDD is dying; maybe check its S.M.A.R.T. status? Most drives keep statistics for errors and such.
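
One possible way to check, assuming smartmontools is installed and the drive's USB enclosure passes SMART commands through (not all do). smart_summary is a hypothetical helper, and the attribute names are the common ATA ones; a given drive may report a different set.

```shell
# On the real machine. Some USB-SATA bridges need "-d sat" added before
# they pass SMART commands through:
#   sudo smartctl -H /dev/sdb
#   sudo smartctl -A /dev/sdb | smart_summary
#
# Hypothetical helper: from "smartctl -A" output, keep the attributes that
# most often hint at failing media or a bad link (attribute name is in
# column 2, raw value in column 10).
smart_summary() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ {
             print $2 "=" $10
         }'
}
```

Non-zero reallocated or pending sector counts tend to point at the disk itself, while a growing CRC error count more often points at the cable or the link.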

[–] possiblylinux127@lemmy.zip 2 points 1 week ago

Please post a full dmesg and a full list of specs

[–] stonkage@aussie.zone 1 points 1 week ago

What does your fstab say?
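
A small sketch for pulling out the relevant entry, assuming the mount point from the post; fstab_entry is a hypothetical helper. Things worth looking at in the result are whether the drive is referenced by UUID rather than /dev/sdb1 (device names can change when a USB drive reconnects) and which mount options are set.

```shell
# Hypothetical helper: print the non-comment fstab line(s) for a mount
# point. The second argument is the file to read, defaulting to /etc/fstab.
fstab_entry() {
    awk -v mp="$1" '$1 !~ /^#/ && $2 == mp' "${2:-/etc/fstab}"
}

# Usage on the real machine:
#   fstab_entry /mnt/2tb
```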

[–] BCsven@lemmy.ca 1 points 1 week ago

USB cable connection or power failing? Is the drive set to power down on idle and then falling off the radar?