PCI-X I/O pads

I am currently developing a PCI-X chip and am having trouble meeting the
Output Valid time of 3.8ns required by the PCI-X specification. The signal
from the core logic is registered, then passes through two buffers and the
PCI-X I/O pad. The delays in this path are the clock-to-Q of the flop (0.6ns),
the buffer delay (0.6ns), the PCI-X I/O pad delay (2.3ns) and the clock
insertion delay (0.5ns), which add up to 4.0ns. With this much delay it is
not possible to meet the output valid time of 3.8ns.
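
To make the budget explicit, here is a rough sketch of the arithmetic in Python. The delay values are the ones quoted above; the dictionary keys and function names are only illustrative, and it assumes the 0.6ns buffer figure covers both buffers together.

```python
# Output-valid timing budget, all values in ns, taken from the post above.
# Assumption: the 0.6 ns buffer delay is the total for both buffers.
OUTPUT_VALID_LIMIT_NS = 3.8

path_delays_ns = {
    "clock_insertion": 0.5,
    "flop_clock_to_q": 0.6,
    "buffers": 0.6,
    "io_pad": 2.3,
}


def total_delay(delays):
    """Sum of every delay element in the clock-to-output path."""
    return sum(delays.values())


def slack(delays, limit):
    """Positive slack meets the limit, negative slack violates it."""
    return limit - total_delay(delays)


def max_pad_delay(delays, limit):
    """Largest pad delay that still meets the limit, other delays unchanged."""
    others = sum(v for name, v in delays.items() if name != "io_pad")
    return limit - others


if __name__ == "__main__":
    print(f"total path delay : {total_delay(path_delays_ns):.1f} ns")
    print(f"slack            : {slack(path_delays_ns, OUTPUT_VALID_LIMIT_NS):.1f} ns")
    print(f"max pad delay    : {max_pad_delay(path_delays_ns, OUTPUT_VALID_LIMIT_NS):.1f} ns")
```

With these numbers the path is 0.2ns over the limit, and the pad would have to come in at 2.1ns or less if nothing else in the path changes.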

I would appreciate any ideas on how the 3.8ns timing can be achieved when the
PCI-X I/O pad delay by itself is around 2.5ns. Is anyone aware of a technology
library that has PCI-X I/O pads with less than 2.0ns output delay, so that
this timing can be met?

Thanks in advance for your help,
Vikas M