Author: David Zimmer
Date: 04.24.21 - 3:29pm
So it was recently noted that a VB6 GoSub is actually slower than a call out to a separate sub or function. This seems weird, right? Shouldn't code that stays inline in the local function be faster than calling out to another one with arguments?
So this made me curious. We start with the following VB code:
This gives us the following PCode (the native code follows the same code flow)
So we see that the gosub routines themselves are actually in the body of the main function. This way they can share all of the local variables and live in the same stack frame.
So where is the slowdown occurring? Looking back at the native code, we see the following:
    .text:0040198C call ds:__vbaGosub
    .text:00401992 test eax, eax
    .text:00401994 jnz short loc_40199B
    .text:00401996 jmp loc_401A2F ; jmp to actual inline gosub handler

vbaGosub sets eax to 0 so when it returns the jnz short loc_40199B will not trigger.
vbaGosubReturn sets eax to 1 and returns to the address saved by vbaGosub rather than to its own caller, so when execution gets back to 00401992 a second time, the jnz does trigger, which skips the jmp to the handler.
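The "return twice with a different eax" trick has a close analogue in C's setjmp/longjmp, which may make the control flow easier to picture. The sketch below is only an analogy, not what the VB6 runtime does internally: `call_site`, `inline_handler` and the counters are hypothetical names, with `setjmp` standing in for vbaGosub (returns 0 the first time) and `longjmp` standing in for vbaGosubReturn (comes back to the same spot with 1).

```c
#include <setjmp.h>

static jmp_buf gosub_site;    /* stands in for the saved return address */
static int handler_runs = 0;  /* times the inline handler body ran */
static int resumed = 0;       /* times execution continued past the call site */

/* Hypothetical stand-in for the inline GoSub body ending in Return. */
static void inline_handler(void)
{
    handler_runs++;
    longjmp(gosub_site, 1);   /* like vbaGosubReturn: land at the call site with 1 */
}

/* Hypothetical stand-in for the call site.
   Returns handler_runs * 10 + resumed so a caller can check both counters. */
int call_site(void)
{
    handler_runs = 0;
    resumed = 0;
    if (setjmp(gosub_site) == 0) {
        /* first pass: result is 0, the "jnz" is not taken, jump to the handler */
        inline_handler();
    }
    /* second pass: result is 1, the "jnz" is taken and the handler is skipped */
    resumed++;
    return handler_runs * 10 + resumed;
}
```

Calling `call_site()` runs the handler once and then continues once past the call site, which is the same shape as the test/jnz/jmp sequence above.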
So the reason GoSub is slow is that there is an alloc/free behind the scenes for every gosub call, plus two calls into the runtime for the vbaGosub/vbaGosubReturn pair.
I haven't bothered to explicitly confirm it, but I suspect the second value saved to the alloc is there to create a singly linked list back to any previous gosub allocs so the calls can be nested.
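To make the bookkeeping concrete, here is a minimal C model of what the pair of runtime calls appears to do, assuming the linked-list reading is right. The names `GosubNode`, `gosub_push` and `gosub_pop` are hypothetical, `malloc`/`free` stand in for ProfMemAlloc/ProfMemFree, and the real handlers patch the return address on the stack instead of handing it back to the caller.

```c
#include <stdlib.h>

/* Hypothetical layout of the per-GoSub block. In the 32-bit runtime this
   is 8 bytes; on a 64-bit build of this sketch it will be larger. */
typedef struct GosubNode {
    struct GosubNode *prev;  /* previous head of the list (earlier GoSub) */
    void *return_addr;       /* where Return should resume */
} GosubNode;

/* Model of vbaGosub: allocate a node and push it onto the list kept in *slot.
   Returns 0 on success, -1 if the allocation fails. */
int gosub_push(GosubNode **slot, void *return_addr)
{
    GosubNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->prev = *slot;
    node->return_addr = return_addr;
    *slot = node;
    return 0;
}

/* Model of vbaGosubReturn: pop the newest node, free it, and hand back the
   saved address. Returns NULL when there is no pending GoSub. */
void *gosub_pop(GosubNode **slot)
{
    GosubNode *node = *slot;
    void *addr;
    if (node == NULL)
        return NULL;
    *slot = node->prev;
    addr = node->return_addr;
    free(node);
    return addr;
}
```

Every GoSub/Return pair therefore costs one heap allocation, one free and two runtime calls, while an ordinary call only has to push a return address onto the stack.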
The actual native handlers are:

    ENGINE:6610D2E8 ___vbaGosub@4 proc near
    ENGINE:6610D2E8
    ENGINE:6610D2E8 arg_0 = dword ptr 8
    ENGINE:6610D2E8
    ENGINE:6610D2E8 push ebp
    ENGINE:6610D2E9 mov ebp, esp
    ENGINE:6610D2EB push ebx
    ENGINE:6610D2EC push esi
    ENGINE:6610D2ED push edi
    ENGINE:6610D2EE push 8
    ENGINE:6610D2F0 call _ProfMemAlloc@4 ; 8 byte alloc to store the current return addr and a var
    ENGINE:6610D2F5 or eax, eax
    ENGINE:6610D2F7 jz short OutOfMemory
    ENGINE:6610D2F9 mov ebx, [ebp+arg_0]
    ENGINE:6610D2FC mov ecx, [ebx]
    ENGINE:6610D2FE mov [eax], ecx
    ENGINE:6610D300 mov ecx, [ebp+4]
    ENGINE:6610D303 mov [eax+4], ecx
    ENGINE:6610D306 mov [ebx], eax
    ENGINE:6610D308 xor eax, eax <-- 0 eax make sure we dont trigger the jnz and hit the jmp
    ENGINE:6610D30A pop edi
    ENGINE:6610D30B pop esi
    ENGINE:6610D30C pop ebx
    ENGINE:6610D30D leave
    ENGINE:6610D30E retn 4
    ENGINE:6610D30E ___vbaGosub@4 endp

    ENGINE:6610D311 ___vbaGosubReturn@4 proc near
    ENGINE:6610D311
    ENGINE:6610D311 arg_0 = dword ptr 8
    ENGINE:6610D311
    ENGINE:6610D311 push ebp
    ENGINE:6610D312 mov ebp, esp
    ENGINE:6610D314 push ebx
    ENGINE:6610D315 push esi
    ENGINE:6610D316 push edi
    ENGINE:6610D317 mov ebx, [ebp+arg_0]
    ENGINE:6610D31A mov esi, [ebx]
    ENGINE:6610D31C or esi, esi
    ENGINE:6610D31E jz short ReturnWOGoSub
    ENGINE:6610D320 mov ecx, [esi] ; esi = alloced mem from gosub
    ENGINE:6610D322 mov [ebx], ecx
    ENGINE:6610D324 mov ecx, [esi+4]
    ENGINE:6610D327 mov [ebp+4], ecx ; overwrite return address with one saved from alloc
    ENGINE:6610D32A push esi
    ENGINE:6610D32B call _ProfMemFree@4 ; free alloc
    ENGINE:6610D330 mov eax, 1 <--- triggers the jnz this time
    ENGINE:6610D335 pop edi
    ENGINE:6610D336 pop esi
    ENGINE:6610D337 pop ebx
    ENGINE:6610D338 leave
    ENGINE:6610D339 retn 4 ; now we return to the address right after the original vbagosub call
    ENGINE:6610D339 ___vbaGosubReturn@4 endp

The api decl is like this: (do not run in IDE, compiled only)