I wonder if this is really necessary. You could try removing it and seeing how many games still work. If most do, this might be an option for speeding up psx4all.
Only loads in the delay slot of an indirect branch have stall behavior that is unknown at compile time, and it's a lot less likely that anyone is explicitly relying on those to pipeline correctly. Catching that case means pairing a modify mask (the register written by the load) with the branch lookup and a source mask (the registers read first at the destination) with the destination, and I guess using a generated prologue stub to handle it when the two overlap.
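To make the mask pairing concrete, here is a rough sketch of what the lookup side could look like. Everything in it is made up for illustration (the `block_info` layout, `gpr_bit`, `select_entry`, the `hazard_entry` stub pointer); I haven't checked how psx4all actually stores its block records, so treat it as an assumption-laden outline rather than a patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-block metadata; psx4all's real block records may differ. */
typedef struct {
    uint32_t source_mask;  /* GPRs read by the block's first instruction */
    void    *entry;        /* normal compiled entry point */
    void    *hazard_entry; /* generated prologue stub for the load-delay case */
} block_info;

/* One bit per general-purpose register; $zero can never cause a hazard. */
static inline uint32_t gpr_bit(unsigned reg)
{
    return reg == 0 ? 0u : (1u << reg);
}

/*
 * Called from the indirect-branch lookup. modify_mask is a compile-time
 * constant for the branch: the register written by the load sitting in its
 * delay slot (0 if the delay slot holds no load).
 */
static void *select_entry(const block_info *dest, uint32_t modify_mask)
{
    if (modify_mask & dest->source_mask)
        return dest->hazard_entry; /* first instruction must see the old value */
    return dest->entry;            /* common case: no overlap, no extra cost */
}
```

The idea is that the common path pays for one AND and one branch in the lookup, and the stub only runs when the destination's first instruction really does read the register being loaded.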
Analyzing the code at branch targets could be a problem if psx4all had partial flushing or patching for self-modifying code, but I'm pretty sure it just flushes everything. So looking at the branch target should be okay.
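For the direct-branch case, the compile-time check could be as simple as the sketch below. The function names are hypothetical, and the source mask is deliberately conservative: it treats whatever sits in the rs and rt fields as a source, which over-reports for some encodings (J/JAL, for example) but should never miss a real read of a general-purpose register. A false positive would then only cost a bit of speed, assuming the slow path itself is correct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register written by a MIPS I load (lb, lh, lwl, lw, lbu, lhu, lwr), as a
 * one-bit mask; 0 if insn is not a load or if it targets $zero. */
static uint32_t load_dest_mask(uint32_t insn)
{
    uint32_t op = insn >> 26;
    if (op < 0x20 || op > 0x26)
        return 0;
    uint32_t rt = (insn >> 16) & 31;
    return rt ? (1u << rt) : 0;
}

/* Conservative guess at the GPRs an instruction reads: anything named in
 * its rs or rt field, with $zero masked out. */
static uint32_t conservative_source_mask(uint32_t insn)
{
    uint32_t rs = (insn >> 21) & 31;
    uint32_t rt = (insn >> 16) & 31;
    return ((1u << rs) | (1u << rt)) & ~1u;
}

/* True if the first instruction at a known branch target may read the
 * register that the branch's delay-slot load is still fetching. */
static bool target_reads_delayed_load(uint32_t delay_slot_insn,
                                      uint32_t target_insn)
{
    return (load_dest_mask(delay_slot_insn) &
            conservative_source_mask(target_insn)) != 0;
}
```

For example, with `lw $t0, 0($a0)` (0x8C880000) in the delay slot and `addu $v0, $t0, $t1` (0x01091021) at the target, the check reports a hazard; with `addu $v0, $t1, $t2` (0x012A1021) at the target it does not.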









