Move it back to caller hook instead of in the ProcessMessages function as there were crash reports when running the listen server. Moving this back solved the intermittent crash problem entirely while still rate limiting process time.
Some parts of the engine have to be rebuild in order to implement this correctly, therefore, it has been removed for now to avoid potential performance problems.
fps_max_rt and fps_max_gfx have been limited to 295 to avoid a contest with the engine's hard limit causing huge performance hits.
CEngine::Frame() tends to return out early if we are ahead of frame time, and therefore need to sleep. In this particular engine, CEngine::Frame() returns true if it had actually executed an engine frame, else it returns false.
Therefore, we end up calling 'NvAPI_D3D_Sleep()' multiple times a frame, and thus causing a major loss in performance. Now we check if we have actually ran a frame (incrementing reflex frame if so). If the frame was not incremented, but 'GFX_RunLowLatencyFrame()' was called, it will not call 'NvAPI_D3D_Sleep()'.
Code has been tested and appears to work on my system (never called twice or more per frame). Needs more testing on other systems to make sure its good.
Propagated latency markers in an attempt to further reduce input latency. Code is pending rigorous testing on several systems, including systems powered by an AMD GPU to make sure we are not impacting performance on systems that does NOT support NVIDIA Reflex.
Limits frames at a much higher level of precision than 'fps_max' and 'fps_max_gfx', probably ideal to reduce input latency even more. Also changed the logic of the NVIDIA Reflex frame limiter, to which it would use the desktop's refresh rate if set to '-1'. The new render thread frame limiter has a similar behavior. Using desktop refresh rates on the render thread or NVIDIA Reflex frame limiter requires 'fps_max' to be set to 0 (unlimited), as it would otherwise result in a major performance drop due to a contest if fps_max_(gfx/rt) is set to a similar number as fps_max.
Use CUtlVector, and remove every copy caused by passing vectors by value. CUtlVector does not support copying. Also removed all extraneous std::string copies caused by calling itoa instead of std::to_string, or std::stoll, etc. All features have been tested and work as designed.
Removed all extraneous copies by adding the class 'InterfaceReg' which will construct a new interface, and link it to the engine's static register. The Source Engine macro 'EXPOSE_INTERFACE_FN' will help utilizing this. The game module from the plugin is not obtained through the process environment block, so the executable is no longer sensitive to names.
Renamed to 'm_bUseLocalSendTableFile'. If set, the 'CClientState::ProcessSendTable()' method drops the SVC_SendTable message, and keeps using the local file instead.
Added 2 comments that speaks for them self's, and 2 additional checks for the second loop as I had triggered several cases where it would loop indefinitely still, as we break out of the parallel loop and continue into the old school loop for the remainder of the outSeqNr - currentIndex frames. This should fix all problems related to this function. After numerous of tests with varying arbitrary outSeqNr values that were vastly larger than the current packet flow index, the bug was no longer triggered. A similar test had also be performed were multiple clients would do this at the same time, and even then; the server frame time(s) remained identical and the whole thing never hung.