USB failing after several hours - xhci_hcd 0000:01:00.0: Ring expansion failed #5088


Describe the bug

Hi, I was hoping someone here may be able to help me.

I've been using a disk image (Buster) that was working fine with the RPI4 R1.4.

However, since the RPI4 R1.5 boards have started to filter through, I've had to update the system to get the disk image to start up.
This seems to have caused an issue where my USB FTDI serial adapter drops off after roughly 8-14 hours and requires either a hardware reboot or an unplug and re-plug of the serial adapter.

Running dmesg after the USB has stopped shows that the log is being flooded with ring expansion and clock change errors, e.g.:

[73254.053318] xhci_hcd 0000:01:00.0: Ring expansion failed
[73254.053346] ftdi_sio ttyUSB0: usb_serial_generic_submit_read_urb - usb_submit_urb failed: -12
[73254.125213] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-v3d frequency: -12
[73254.209306] xhci_hcd 0000:01:00.0: Ring expansion failed
[73254.209328] ftdi_sio ttyUSB0: usb_serial_generic_submit_read_urb - usb_submit_urb failed: -12
[73254.348480] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-v3d frequency: -12

From what I've read, the -12 seems to relate to a memory issue. However, I'm running a 16GB disk image with 53% free, and memory usage is around 444MB of 3.33GB, so I can't see that being the issue.
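
For reference, -12 is the negated Linux errno value ENOMEM, which a driver returns when a kernel allocation fails. That can happen even when overall RAM usage looks low, for example if a dedicated pool is exhausted, although nothing in the log above confirms which pool is involved here. A minimal sketch for checking this on the Pi, assuming the usual /proc/meminfo layout; the helper names `parse_meminfo` and `report` are made up for illustration, and the CMA fields only appear on kernels built with CMA:

```python
import errno
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def report(path="/proc/meminfo"):
    """Print what errno 12 means and a few memory counters (hypothetical helper)."""
    print("errno 12 =", errno.errorcode[12], "-", os.strerror(12))
    with open(path) as f:
        info = parse_meminfo(f.read())
    for key in ("MemFree", "MemAvailable", "CmaTotal", "CmaFree"):
        print(key, info.get(key, "n/a"))
```

Calling `report()` once while the system is healthy and again after the errors start would show whether any of these counters has dropped.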

Here are a few things I've attempted to get it working:
Tried kernels 5.10 and 5.15; both produce the same issue.
Set coherent_pool=4M in the config file to mirror the older firmware I used. It still happens in the same timeframe.
Updated the bootloader.
Changed the Pi to another Rev1.5 in case it was a board issue.
Tried an older Pi Rev1.4 to see if it was a newer-board issue. With my updated disk image the same issue occurs.
Tried replacing start4.elf and the other boot files (not including the kernel) on my older disk image with those from the later disk image. I might not have done this correctly, but I'm assuming they don't play nicely together when mismatched (it broke things).
Rebuilt on a new OS (Bullseye) with the latest updates/upgrades. It lasts around 8 hours longer before it drops off again.

This is the closest similar issue I could find, and it doesn't seem to come to any real conclusion:
#3479

Oddly enough, when I try to run a test script with some NeoPixel LEDs, I get the following message:

raise RuntimeError('ws2811_init failed with code {0} ({1})'.format(resp, str_resp))
RuntimeError: ws2811_init failed with code -2 (Out of memory)

In addition to this, I tried the following as an attempt to get the USB host to reset:

sudo sh -c "echo 1 > /sys/bus/pci/devices/0000:00:00.0/remove"
sudo sh -c "echo 1 > /sys/bus/pci/rescan"

This seemed to clear whatever memory issue was previously there: it allowed me to start the pixel test script and also stopped the error spam about changing the clock frequency. However, it did not redetect the FTDI serial adapter that was connected, which dropped off completely.
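
The remove/rescan sequence above could also be scripted, for instance to trigger it from a watchdog. A minimal sketch, assuming it runs as root; the function name and the `sysfs_root` and `settle_seconds` parameters are hypothetical, and the default PCI address is simply the one used in the commands above:

```python
import time
from pathlib import Path


def pci_remove_and_rescan(address="0000:00:00.0", sysfs_root="/sys",
                          settle_seconds=1.0):
    """Write "1" to the device's sysfs remove node, then to the bus rescan node.

    sysfs_root is only a parameter so the function can be tried against a
    scratch directory; on a real system it stays at /sys and needs root.
    """
    root = Path(sysfs_root)
    remove_node = root / "bus" / "pci" / "devices" / address / "remove"
    rescan_node = root / "bus" / "pci" / "rescan"

    remove_node.write_text("1")   # same effect as: echo 1 > .../remove
    time.sleep(settle_seconds)    # assumed pause; the original commands have none
    rescan_node.write_text("1")   # same effect as: echo 1 > /sys/bus/pci/rescan
    return remove_node, rescan_node
```

As noted above, this did not bring the FTDI adapter back in my case, so it is a way to reproduce the experiment rather than a fix.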

My latest attempt at resolving this is to use the 64-bit kernel, and I will report back on how this goes.

Would appreciate any help or things that I could try to get this resolved or at the very least narrow down why it's happening in the first place.
Many Thanks in advance.

Steps to reproduce the behaviour

Unknown. I plug in a USB RS232 serial adapter, read from it via a Python script, and wait 4-48 hours.
The USB will stop reading and dmesg gets spammed with errors.
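
Since the failure takes hours to appear, it may help to timestamp the first occurrence automatically. A minimal sketch that scans dmesg-style text for the messages quoted in this report; the names `SIGNATURES` and `find_failures` are made up for illustration, and the signature strings are copied from the log excerpt above:

```python
import re

# Matches lines of the form "[73254.053318] message text".
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")

# Substrings taken from the dmesg output in this report.
SIGNATURES = (
    "Ring expansion failed",
    "usb_submit_urb failed: -12",
    "Failed to change fw-clk-v3d frequency: -12",
)


def find_failures(log_text):
    """Return (timestamp_seconds, message) for each line matching a signature."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        message = match.group("msg")
        if any(sig in message for sig in SIGNATURES):
            hits.append((float(match.group("ts")), message))
    return hits
```

The input could come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`, polled every few minutes; depending on the system, reading dmesg may require root.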

Device(s)

Raspberry Pi 4 Mod. B

System

Previous Working Kernel:

Linux pi 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
 
Solution: Reset USB
