1995-01-17 16:29:31 +00:00
/* This code implemented by Dag.Gruneau@elsa.preseco.comm.se */
Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru,
who wrote:
Here's the new version of thread_nt.h. More particular, there is a
new version of thread lock that uses kernel object (e.g. semaphore)
only in case of contention; in other case it simply uses interlocked
functions, which are faster by the order of magnitude. It doesn't
make much difference without threads present, but as soon as thread
machinery initialised and (mostly) the interpreter global lock is on,
difference becomes tremendous. I've included a small script, which
initialises threads and launches pystone. With original thread_nt.h,
Pystone results with initialised threads are twofold worse then w/o
threads. With the new version, only 10% worse. I have used this
patch for about 6 months (with threaded and non-threaded
applications). It works remarkably well (though I'd desperately
prefer Python was free-threaded; I hope, it will soon).
2000-05-04 18:47:15 +00:00
/* Fast NonRecursiveMutex support by Yakov Markovitch, markovitch@iso.ru */

#include <windows.h>
#include <limits.h>
#include <process.h>

typedef struct NRMUTEX {
    LONG   owned ;
    DWORD  thread_id ;
    HANDLE hevent ;
} NRMUTEX, *PNRMUTEX ;

typedef PVOID WINAPI interlocked_cmp_xchg_t(PVOID *dest, PVOID exc, PVOID comperand) ;

/* Sorry mate, but we haven't got InterlockedCompareExchange in Win95! */
static PVOID WINAPI interlocked_cmp_xchg(PVOID *dest, PVOID exc, PVOID comperand)
{
    static LONG spinlock = 0 ;
    PVOID result ;
    DWORD dwSleep = 0;

    /* Acquire spinlock (yielding control to other threads if it can't be acquired at the moment) */
    while(InterlockedExchange(&spinlock, 1))
    {
        // Using Sleep(0) can cause a priority inversion.
        // Sleep(0) only yields the processor if there's
        // another thread of the same priority that's
        // ready to run.  If a high-priority thread is
        // trying to acquire the lock, which is held by
        // a low-priority thread, then the low-priority
        // thread may never get scheduled and hence never
        // free the lock.  NT attempts to avoid priority
        // inversions by temporarily boosting the priority
        // of low-priority runnable threads, but the problem
        // can still occur if there's a medium-priority
        // thread that's always runnable.  If Sleep(1) is used,
        // then the thread unconditionally yields the CPU.  We
        // only do this for the second and subsequent even
        // iterations, since a millisecond is a long time to wait
        // if the thread can be scheduled in again sooner
        // (~100,000 instructions).
        // Avoid priority inversion: 0, 1, 0, 1,...
        Sleep(dwSleep);
        dwSleep = !dwSleep;
    }
    result = *dest ;
    if (result == comperand)
        *dest = exc ;
    /* Release spinlock */
    spinlock = 0 ;
    return result ;
}
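
/*
 * Illustration only, not part of thread_nt.h: a minimal portable sketch of
 * the same idea as interlocked_cmp_xchg() above, namely a compare-and-exchange
 * emulated under a single global spin guard. It assumes a C11 compiler with
 * <stdatomic.h>. The names cas_guard, emulated_cmp_xchg and
 * emulated_cmp_xchg_demo are hypothetical, and the busy-wait stands in for
 * the Sleep(0)/Sleep(1) alternation used by the Win32 code.
 */

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* One process-wide guard, like the static spinlock in the original. */
static atomic_flag cas_guard = ATOMIC_FLAG_INIT;

/* Returns the old *dest; stores exc only if the old value equals comperand. */
static void *emulated_cmp_xchg(void **dest, void *exc, void *comperand)
{
    void *result;

    /* test_and_set returns the previous state: loop while someone holds it. */
    while (atomic_flag_test_and_set_explicit(&cas_guard, memory_order_acquire)) {
        /* busy-wait; the Win32 version yields here instead */
    }

    result = *dest;
    if (result == comperand)
        *dest = exc;

    /* Release the guard. */
    atomic_flag_clear_explicit(&cas_guard, memory_order_release);
    return result;
}

/* Usage: the swap happens only when *dest matches the comparand. */
static void emulated_cmp_xchg_demo(void)
{
    int a = 0, b = 0;
    void *slot = &a;

    /* Match: old value &a is returned and slot now holds &b. */
    assert(emulated_cmp_xchg(&slot, &b, &a) == (void *)&a);
    assert(slot == (void *)&b);

    /* Mismatch: old value &b is returned and slot is left unchanged. */
    assert(emulated_cmp_xchg(&slot, &a, &a) == (void *)&b);
    assert(slot == (void *)&b);
}
```

/*
 * As in the original, the emulation is only atomic with respect to other
 * callers of the same function, because every caller serialises on the one
 * guard.
 */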

static interlocked_cmp_xchg_t *ixchg ;

BOOL InitializeNonRecursiveMutex(PNRMUTEX mutex)
{
    if (!ixchg)
    {
        /* Sadly, Win95 has no InterlockedCompareExchange API (Win98 has), so we have to use emulation */
        HANDLE kernel = GetModuleHandle("kernel32.dll") ;
        if (!kernel || (ixchg = (interlocked_cmp_xchg_t *)GetProcAddress(kernel, "InterlockedCompareExchange")) == NULL)
            ixchg = interlocked_cmp_xchg ;
    }

    mutex->owned = -1 ;  /* No threads have entered NonRecursiveMutex */
    mutex->thread_id = 0 ;
    mutex->hevent = CreateEvent(NULL, FALSE, FALSE, NULL) ;
    return mutex->hevent != NULL ;  /* TRUE if the mutex is created */
}

#ifdef InterlockedCompareExchange
#undef InterlockedCompareExchange
#endif
#define InterlockedCompareExchange(dest,exchange,comperand) (ixchg((dest), (exchange), (comperand)))

VOID DeleteNonRecursiveMutex(PNRMUTEX mutex)
{
    /* No in-use check */
    CloseHandle(mutex->hevent) ;
    mutex->hevent = NULL ; /* Just in case */
}

DWORD EnterNonRecursiveMutex(PNRMUTEX mutex, BOOL wait)
{
    /* Assume that the thread waits successfully */
    DWORD ret ;

    /* InterlockedIncrement(&mutex->owned) == 0 means that no thread currently owns the mutex */
    if (!wait)
    {
        if (InterlockedCompareExchange((PVOID *)&mutex->owned, (PVOID)0, (PVOID)-1) != (PVOID)-1)
            return WAIT_TIMEOUT ;
        ret = WAIT_OBJECT_0 ;
    }
    else
        ret = InterlockedIncrement(&mutex->owned) ?
            /* Some thread owns the mutex, let's wait... */
            WaitForSingleObject(mutex->hevent, INFINITE) : WAIT_OBJECT_0 ;

    mutex->thread_id = GetCurrentThreadId() ; /* We own it */
    return ret ;
}

BOOL LeaveNonRecursiveMutex(PNRMUTEX mutex)
{
    /* We don't own the mutex */
    mutex->thread_id = 0 ;
    return
        InterlockedDecrement(&mutex->owned) < 0 ||
        SetEvent(mutex->hevent) ; /* Other threads are waiting, wake one of them up */
}
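
/*
 * Illustration only, not part of thread_nt.h: a minimal single-threaded
 * model of the `owned` counter protocol used by EnterNonRecursiveMutex()
 * and LeaveNonRecursiveMutex(). The counter starts at -1; the kernel event
 * is only needed when an increment does not land on 0 (contention) or a
 * decrement does not land below 0 (waiters present). The sketch assumes C11
 * <stdatomic.h>; nr_counter, nr_init, nr_enter_needs_wait,
 * nr_leave_needs_wake and nr_counter_demo are hypothetical names, and the
 * event wait and signal themselves are deliberately left out.
 */

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    atomic_long owned;              /* -1 means "no thread has entered" */
} nr_counter;

static void nr_init(nr_counter *c)
{
    atomic_init(&c->owned, -1);
}

/* Returns 1 when the caller would have to block on the event. */
static int nr_enter_needs_wait(nr_counter *c)
{
    /* fetch_add returns the old value; old + 1 is the incremented count. */
    return atomic_fetch_add(&c->owned, 1) + 1 != 0;
}

/* Returns 1 when the caller would have to signal the event. */
static int nr_leave_needs_wake(nr_counter *c)
{
    /* fetch_sub returns the old value; old - 1 is the decremented count. */
    return atomic_fetch_sub(&c->owned, 1) - 1 >= 0;
}

/* Usage: walk the counter through an uncontended and a contended entry. */
static void nr_counter_demo(void)
{
    nr_counter c;

    nr_init(&c);
    assert(nr_enter_needs_wait(&c) == 0);   /* -1 -> 0: fast path, no wait   */
    assert(nr_enter_needs_wait(&c) == 1);   /*  0 -> 1: second entrant waits */
    assert(nr_leave_needs_wake(&c) == 1);   /*  1 -> 0: a waiter is woken    */
    assert(nr_leave_needs_wake(&c) == 0);   /*  0 -> -1: nobody left to wake */
}
```

/*
 * The uncontended path touches only the counter, which is why the lock
 * avoids the kernel object unless two threads actually collide.
 */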

PNRMUTEX AllocNonRecursiveMutex(void)
{
    PNRMUTEX mutex = (PNRMUTEX)malloc(sizeof(NRMUTEX)) ;
    if (mutex && !InitializeNonRecursiveMutex(mutex))
    {
        free(mutex) ;
        mutex = NULL ;
    }
    return mutex ;
}

void FreeNonRecursiveMutex(PNRMUTEX mutex)
{
    if (mutex)
    {
        DeleteNonRecursiveMutex(mutex) ;
        free(mutex) ;
    }
}

long PyThread_get_thread_ident(void);

/*
 * Change all headers to pure ANSI as no one will use K&R style on an
 * NT
 */

/*
 * Initialization of the C package, should not be needed.
 */
static void PyThread__init_thread(void)
{
}

/*
 * Thread support.
 */
int PyThread_start_new_thread(void (*func)(void *), void *arg)
{
    unsigned long rv;
    int success = 0;

    dprintf(("%ld: PyThread_start_new_thread called\n", PyThread_get_thread_ident()));
    if (!initialized)
        PyThread_init_thread();

    rv = _beginthread(func, 0, arg); /* use default stack size */

    if (rv != (unsigned long)-1) {
        success = 1;
        /* rv is an unsigned long, so print it with %lu rather than %p */
        dprintf(("%ld: PyThread_start_new_thread succeeded: %lu\n", PyThread_get_thread_ident(), rv));
    }

    return success;
}

/*
 * Return the thread Id instead of a handle. The Id is said to uniquely identify the
 * thread in the system
 */
long PyThread_get_thread_ident(void)
{
    if (!initialized)
        PyThread_init_thread();

    return GetCurrentThreadId();
}

static void do_PyThread_exit_thread(int no_cleanup)
{
    dprintf(("%ld: PyThread_exit_thread called\n", PyThread_get_thread_ident()));
    if (!initialized)
        if (no_cleanup)
            _exit(0);
        else
            exit(0);
    _endthread();
}

void PyThread_exit_thread(void)
{
    do_PyThread_exit_thread(0);
}

void PyThread__exit_thread(void)
{
    do_PyThread_exit_thread(1);
}

#ifndef NO_EXIT_PROG
static void do_PyThread_exit_prog(int status, int no_cleanup)
{
    dprintf(("PyThread_exit_prog(%d) called\n", status));
    if (!initialized)
        if (no_cleanup)
            _exit(status);
        else
            exit(status);
}

void PyThread_exit_prog(int status)
{
    do_PyThread_exit_prog(status, 0);
}

void PyThread__exit_prog(int status)
{
    do_PyThread_exit_prog(status, 1);
}
#endif /* NO_EXIT_PROG */

/*
 * Lock support. It has to be implemented as semaphores.
 * I [Dag] tried to implement it with mutex but I could not find a way to
 * tell whether a thread already owns the lock or not.
 */
PyThread_type_lock PyThread_allocate_lock(void)
{
    PNRMUTEX aLock;

    dprintf(("PyThread_allocate_lock called\n"));
    if (!initialized)
        PyThread_init_thread();

    aLock = AllocNonRecursiveMutex() ;

    dprintf(("%ld: PyThread_allocate_lock() -> %p\n", PyThread_get_thread_ident(), aLock));

    return (PyThread_type_lock) aLock;
}

void PyThread_free_lock(PyThread_type_lock aLock)
{
    dprintf(("%ld: PyThread_free_lock(%p) called\n", PyThread_get_thread_ident(),aLock));

    FreeNonRecursiveMutex(aLock) ;
}

/*
 * Return 1 on success if the lock was acquired
 *
 * and 0 if the lock was not acquired. This means a 0 is returned
 * if the lock has already been acquired by this thread!
 */
int PyThread_acquire_lock(PyThread_type_lock aLock, int waitflag)
{
    int success ;

    dprintf(("%ld: PyThread_acquire_lock(%p, %d) called\n", PyThread_get_thread_ident(),aLock, waitflag));

    success = aLock && EnterNonRecursiveMutex((PNRMUTEX) aLock, (waitflag == 1 ? INFINITE : 0)) == WAIT_OBJECT_0 ;

    dprintf(("%ld: PyThread_acquire_lock(%p, %d) -> %d\n", PyThread_get_thread_ident(),aLock, waitflag, success));

    return success;
}

void PyThread_release_lock(PyThread_type_lock aLock)
{
    dprintf(("%ld: PyThread_release_lock(%p) called\n", PyThread_get_thread_ident(),aLock));

    if (!(aLock && LeaveNonRecursiveMutex((PNRMUTEX) aLock)))
        dprintf(("%ld: Could not PyThread_release_lock(%p) error: %lu\n", PyThread_get_thread_ident(), aLock, GetLastError()));
}

/*
 * Semaphore support.
 */
PyThread_type_sema PyThread_allocate_sema(int value)
{
    HANDLE aSemaphore;

    dprintf(("%ld: PyThread_allocate_sema called\n", PyThread_get_thread_ident()));
    if (!initialized)
        PyThread_init_thread();

    aSemaphore = CreateSemaphore( NULL,      /* Security attributes */
                                  value,     /* Initial value */
                                  INT_MAX,   /* Maximum value */
                                  NULL);     /* Name of semaphore */

    dprintf(("%ld: PyThread_allocate_sema() -> %p\n", PyThread_get_thread_ident(), aSemaphore));

    return (PyThread_type_sema) aSemaphore;
}

void PyThread_free_sema(PyThread_type_sema aSemaphore)
{
    dprintf(("%ld: PyThread_free_sema(%p) called\n", PyThread_get_thread_ident(), aSemaphore));

    CloseHandle((HANDLE) aSemaphore);
}

/*
  XXX must do something about waitflag
 */
int PyThread_down_sema(PyThread_type_sema aSemaphore, int waitflag)
{
    DWORD waitResult;

    dprintf(("%ld: PyThread_down_sema(%p) called\n", PyThread_get_thread_ident(), aSemaphore));

    waitResult = WaitForSingleObject( (HANDLE) aSemaphore, INFINITE);

    dprintf(("%ld: PyThread_down_sema(%p) return: %lu\n", PyThread_get_thread_ident(), aSemaphore, waitResult));
    return 0;
}

void PyThread_up_sema(PyThread_type_sema aSemaphore)
{
    ReleaseSemaphore(
        (HANDLE) aSemaphore,   /* Handle of semaphore */
        1,                     /* increment count by one */
        NULL);                 /* not interested in previous count */

    dprintf(("%ld: PyThread_up_sema(%p)\n", PyThread_get_thread_ident(), aSemaphore));
}