After spending a good chunk of the last week figuring out which method works best for rendering a special part of my game, I finally got it nailed down.

However, a new problem has come up: I now have a central object with a ton of other objects added to it as overlays with various pixel offsets. I am using Click() to interact with these objects, but I need to differentiate between them when I click on them. The problem is that overlays automatically send the Click() back to the object they're attached to (makes sense, they're just overlays, right?).

Is there any clean way of figuring out what the user is clicking on, besides actually keeping a map of the various widths/heights of every object and their pixel offsets (effectively just creating an entire collision detection routine)?

Has anyone tried this?

Considering overlays are just images blended onto the main icon, I doubt it.
The only (really messy) workaround I can think of is to not use overlays at all, and instead use separate objects that follow the player when they move.
BYOND can either handle overlays for you or figure out which object you clicked on (but apparently can't do both). This means if you want to use the built-in overlay functionality you need to implement your own click collision detection. If you want to use the built-in click handling, you have to implement your own overlay handling. Implementing overlays yourself is probably the easier method, for example:

atom/movable/overlay
    animate_movement = SYNC_STEPS
    layer = MOB_LAYER + 1
    hat
        icon_state = "hat"
        Click()
            world << "you clicked on the hat"
    cape
        icon_state = "cape"
        Click()
            world << "you clicked on the cape"

mob
    var/list/_overlays = list()

    // assumed: the overlay objects are attached when the mob is created
    New()
        ..()
        add_overlay(new /atom/movable/overlay/hat())
        add_overlay(new /atom/movable/overlay/cape())

    proc/add_overlay(atom/movable/overlay/o)
        o.loc = loc
        o.dir = dir
        _overlays += o

    Move(l)
        . = ..(l)
        for(var/atom/movable/overlay/o in _overlays)
            o.loc = loc
            o.dir = dir
    Click()
        world << "you clicked on the mob"

Implementing your own collision detection would be a bit more ambitious. You could probably make use of the /icon proc GetPixel(), which would help a bit by telling you whether a given pixel of an icon is transparent, but there could still be lots of other problems.
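For what it's worth, here is a rough, untested sketch of what the collision detection route might look like if you keep the built-in overlays. It assumes you record each overlay's source icon, state and pixel offset yourself in a made-up /clickable_overlay datum (that type, the clickable_overlays list and their var names are mine, not anything built in). It also assumes the "icon-x" and "icon-y" mouse params are reported relative to the mob's own icon even when the click lands on an offset overlay, which you would want to verify before relying on it.

```dm
// Hypothetical record describing one overlay that was added to the mob.
clickable_overlay
    var
        name              // label used in the click message
        icon/source       // /icon object the overlay was built from
        state             // icon_state used for the overlay
        offset_x = 0      // pixel_x that was given to the overlay
        offset_y = 0      // pixel_y that was given to the overlay

mob
    // Hypothetical list, ordered bottom-most to top-most.
    var/list/clickable_overlays = list()

    Click(location, control, params)
        var/list/p = params2list(params)
        // Assumed to be 1-based pixel coordinates within the mob's icon.
        var/click_x = text2num(p["icon-x"])
        var/click_y = text2num(p["icon-y"])

        // Check the top-most overlay first and work downwards.
        for(var/i = clickable_overlays.len, i >= 1, i--)
            var/clickable_overlay/c = clickable_overlays[i]
            // Translate the click into the overlay's own pixel space.
            var/px = click_x - c.offset_x
            var/py = click_y - c.offset_y
            if(px < 1 || py < 1 || px > c.source.Width() || py > c.source.Height())
                continue
            // GetPixel() returns null for a fully transparent pixel.
            if(c.source.GetPixel(px, py, c.state))
                world << "you clicked on the [c.name]"
                return

        world << "you clicked on the mob"
```

You would fill clickable_overlays at the same point where you add each overlay to the mob. Even if the param assumption holds, this ignores things like dir-specific and animated icon states, so it is only a starting point.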