Added thread library documentation to thrift whitepaper
Reviewed By: To be reviewed by slee and aditya Test Plan: N.A. git-svn-id: https://svn.apache.org/repos/asf/incubator/thrift/trunk@665075 13f79535-47bb-0310-9956-ffa450edef68
parent 3f234dad0e
commit 10b3bdbb85
172 doc/thrift.tex
@ -16,6 +16,7 @@
\usepackage{amssymb}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{url}

\begin{document}
@ -769,9 +770,159 @@ reap the benefit of being able to easily debug corrupt or misunderstood data by
looking for string contents.

\subsection{Servers and Multithreading}
MARC TO WRITE THIS SECTION ON THE C++ concurrency PACKAGE AND
BASIC TThreadPoolServer PERFORMANCE ETC. (ie. 140K req/second, that kind of
thing)
Thrift services require basic multithreading support to handle simultaneous
requests from multiple clients. For the Python and Java implementations of
Thrift server logic, the multithreading support provided by those runtimes was
more than adequate. For the C++ implementation, no standard multithreaded
runtime library support exists. Specifically, robust, lightweight, and portable
thread manager and timer class implementations do not exist. We investigated
existing implementations, namely {\tt boost::thread},
{\tt boost::threadpool}, {\tt ACE\_Thread\_Manager} and {\tt ACE\_Timer}.

While {\tt boost::thread}~\cite{boost.threads} provides clean, lightweight and
robust implementations of multithreading primitives (mutexes, conditions,
threads), it does not provide a thread manager or timer implementation.

{\tt boost::threadpool}~\cite{boost.threadpool} also looked promising but was not
far enough along for our purposes. We wanted to limit the dependency on
third-party libraries as much as possible. Because {\tt boost::threadpool} is not
a pure template library and requires runtime libraries, and because it is not yet
part of the official Boost distribution, we felt it was not ready for use in
Thrift. As {\tt boost::threadpool} evolves, and especially if it is added to the
Boost distribution, we may reconsider our decision not to use it.

ACE has both a thread manager and a timer class in addition to multithreading
primitives. The biggest problem with ACE is that it is ACE. Unlike Boost, ACE
API quality is poor. Everything in ACE has a large number of dependencies on
everything else in ACE, thus forcing developers to throw out standard classes,
such as the STL collections, in favor of ACE's homebrewed implementations. In
addition, unlike Boost, ACE implementations demonstrate little understanding of
the power and pitfalls of C++ programming and take no advantage of modern
templating techniques to ensure compile-time safety and reasonable compiler
error messages. For all these reasons, ACE was rejected.

\subsection{Thread Primitives}
The Thrift thread library has three components:
\begin{itemize}
\item \texttt{primitives}
\item \texttt{thread pool manager}
\item \texttt{timer manager}
\end{itemize}

As mentioned above, we were hesitant to introduce any additional dependencies
into Thrift. We decided to use {\tt boost::shared\_ptr} because it is so useful
for multithreaded applications, because it requires no link-time or runtime
libraries (i.e., it is a pure template library), and because it is slated to
become part of the C++0x standard.

We implement standard {\tt Mutex} and {\tt Condition} classes, and a
{\tt Monitor} class. The latter is simply a combination of a mutex and a
condition variable and is analogous to the monitor implementation provided for
all objects in Java. (This construct is sometimes loosely referred to as a
barrier, although that term more commonly denotes a rendezvous point for a group
of threads.) We provide a {\tt Synchronized} guard class to allow Java-like
synchronized blocks. This is just a bit of syntactic sugar, but, like its Java
counterpart, it clearly delimits critical sections of code. Unlike its Java
counterpart, we still have the ability to programmatically lock, unlock, block,
and signal monitors.

\begin{verbatim}
void run() {
  // The guard locks the monitor on entry to the inner block
  // and unlocks it when the block is left.
  {
    Synchronized s(manager->monitor);
    if (manager->state == TimerManager::STARTING) {
      manager->state = TimerManager::STARTED;
      manager->monitor.notifyAll();
    }
  }
}
\end{verbatim}
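To make the relationship between {\tt Monitor} and {\tt Synchronized} concrete, the following is a minimal sketch of how such a pair could be built. It is not the Thrift implementation: it assumes standard C++11 {\tt std::mutex} and {\tt std::condition\_variable} in place of the POSIX primitives, and the member names, the {\tt State} enum, and the {\tt startAndWait} helper are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical Monitor: one mutex plus one condition variable, with the
// lock/unlock/wait/notify operations exposed for programmatic use.
class Monitor {
 public:
  void lock() { mutex_.lock(); }
  void unlock() { mutex_.unlock(); }

  // The caller must already hold the lock. The lock is released while the
  // caller is blocked and is held again when wait() returns.
  void wait() {
    std::unique_lock<std::mutex> guard(mutex_, std::adopt_lock);
    cond_.wait(guard);
    guard.release();  // leave the mutex locked for the caller
  }

  void notify() { cond_.notify_one(); }
  void notifyAll() { cond_.notify_all(); }

 private:
  std::mutex mutex_;
  std::condition_variable cond_;
};

// Scoped guard that gives a Java-like synchronized block.
class Synchronized {
 public:
  explicit Synchronized(Monitor& monitor) : monitor_(monitor) { monitor_.lock(); }
  ~Synchronized() { monitor_.unlock(); }
  Synchronized(const Synchronized&) = delete;
  Synchronized& operator=(const Synchronized&) = delete;

 private:
  Monitor& monitor_;
};

enum class State { STARTING, STARTED };

// Usage: a worker flips the state under the monitor while the caller blocks
// on the same monitor until it observes the change.
State startAndWait() {
  Monitor monitor;
  State state = State::STARTING;
  std::thread worker([&monitor, &state] {
    Synchronized s(monitor);
    state = State::STARTED;
    monitor.notifyAll();
  });
  {
    Synchronized s(monitor);
    while (state != State::STARTED) {  // loop guards against spurious wakeups
      monitor.wait();
    }
    assert(state == State::STARTED);
  }
  worker.join();
  return state;
}
```

The waiting side re-checks the state in a loop, so the example behaves the same whether the worker signals before or after the caller starts waiting.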
We again borrowed from Java the distinction between a thread and a runnable
class. A {\tt facebook::thread::Thread} is the actual schedulable object. The
{\tt facebook::thread::Runnable} is the logic to execute within the thread.
The {\tt Thread} implementation deals with all the platform-specific thread
creation and destruction issues, while the {\tt Runnable} implementation deals
with the application-specific per-thread logic. The benefit of this approach
is that developers can easily subclass the {\tt Runnable} class without pulling
in platform-specific superclasses.
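The split between the schedulable object and the logic it runs can be sketched as follows. This is an illustration only: {\tt std::thread} and {\tt std::shared\_ptr} stand in for the platform-specific code and for {\tt boost::shared\_ptr}, and the {\tt Counter} class and {\tt runCounterOnce} helper are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Application-specific logic; it knows nothing about the threading platform.
class Runnable {
 public:
  virtual ~Runnable() = default;
  virtual void run() = 0;
};

// Schedulable object. std::thread stands in here for the platform-specific
// creation and destruction code (an assumption of this sketch).
class Thread {
 public:
  explicit Thread(std::shared_ptr<Runnable> runnable)
      : runnable_(std::move(runnable)) {}
  ~Thread() { join(); }

  void start() {
    assert(!worker_.joinable());  // this sketch allows a single start()
    std::shared_ptr<Runnable> runnable = runnable_;
    worker_ = std::thread([runnable] { runnable->run(); });
  }

  void join() {
    if (worker_.joinable()) {
      worker_.join();
    }
  }

 private:
  std::shared_ptr<Runnable> runnable_;
  std::thread worker_;
};

// A Runnable subclass needs no platform-specific base class.
class Counter : public Runnable {
 public:
  void run() override {
    for (int i = 0; i < 1000; ++i) {
      ++count_;
    }
  }
  int count() const { return count_; }

 private:
  int count_ = 0;
};

// Usage: host the Runnable in a Thread, wait for it, then read the result.
int runCounterOnce() {
  auto counter = std::make_shared<Counter>();
  Thread thread(counter);
  thread.start();
  thread.join();
  return counter->count();
}
```

Because {\tt Counter} depends only on {\tt Runnable}, the same class could be hosted by a different {\tt Thread} implementation without change.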
\subsection{Thread, Runnable, and shared\_ptr}
We use {\tt boost::shared\_ptr} throughout the {\tt ThreadManager} and
{\tt TimerManager} implementations to guarantee cleanup of dead objects that can
be accessed by multiple threads. For {\tt Thread} class implementations,
{\tt boost::shared\_ptr} usage requires particular attention to make sure
{\tt Thread} objects are neither leaked nor dereferenced prematurely while
creating and shutting down threads.

Thread creation requires calling into a C library (in our case the POSIX
thread library, libpthread, but the same would be true for WIN32 threads).
Typically, the OS makes few if any guarantees about when a C thread's
entry-point function, {\tt ThreadMain}, will be called. Therefore, it is
possible that our thread create call,
{\tt facebook::thread::ThreadFactory::newThread()}, could return to the caller
well before that time. To cope with the case where the caller gives up its
reference prior to the {\tt ThreadMain} call, so that the {\tt Thread} object
has already been cleaned up, the {\tt Thread} object makes a weak reference to
itself in its {\tt start} method.

With the weak reference in hand, the {\tt ThreadMain} function can attempt to get
a strong reference before entering the {\tt Runnable::run} method of the
{\tt Runnable} object bound to the {\tt Thread}. If no strong references to the
{\tt Thread} object remain by the time the C helper function {\tt ThreadMain} is
entered, that is, if the caller released its reference after
{\tt Thread::start} returned, the weak reference returns null and the function
exits immediately.

The need for the {\tt Thread} to make a weak reference to itself has a
significant impact on the API. Since references are managed through the
{\tt boost::shared\_ptr} templates, the {\tt Thread} object must have a reference
to itself wrapped by the same {\tt boost::shared\_ptr} envelope that is returned
to the caller. This necessitated use of the factory pattern.
{\tt ThreadFactory} creates the raw {\tt Thread} object and its
{\tt boost::shared\_ptr} wrapper, and calls a private helper method of the class
implementing the {\tt Thread} interface (in this case,
{\tt PosixThread::weakRef}) to allow it to make a weak reference to itself
through the {\tt boost::shared\_ptr} envelope.
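The factory and weak self-reference mechanism can be sketched as below. This is a simplified illustration under stated assumptions: {\tt std::shared\_ptr} and {\tt std::weak\_ptr} replace their Boost counterparts, the {\tt Thread} takes a plain callable instead of a {\tt Runnable}, and the entry point is launched directly by the two hypothetical helper functions so that both orderings (caller keeps its reference, caller drops it first) are reproducible.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

class Thread {
 public:
  // Entry point executed on the new OS thread. It holds only a weak
  // reference and must promote it before touching the Thread object.
  static bool threadMain(std::weak_ptr<Thread> weak) {
    std::shared_ptr<Thread> self = weak.lock();
    if (!self) {
      return false;  // the Thread object is already gone: exit immediately
    }
    self->body_();
    return true;
  }

  std::weak_ptr<Thread> weakSelf() const { return self_; }

 private:
  friend class ThreadFactory;

  explicit Thread(std::function<void()> body) : body_(std::move(body)) {}

  // Private helper called by the factory, in the spirit of weakRef above.
  void weakRef(const std::shared_ptr<Thread>& self) {
    assert(self.get() == this);
    self_ = self;
  }

  std::function<void()> body_;
  std::weak_ptr<Thread> self_;
};

class ThreadFactory {
 public:
  // Creates the raw object and its shared_ptr envelope, then lets the
  // object record a weak reference to itself through that same envelope.
  static std::shared_ptr<Thread> newThread(std::function<void()> body) {
    std::shared_ptr<Thread> thread(new Thread(std::move(body)));
    thread->weakRef(thread);
    return thread;
  }
};

// The caller still holds its reference when the entry point runs.
bool runWithCallerHoldingReference() {
  bool ran = false;
  std::shared_ptr<Thread> thread = ThreadFactory::newThread([&ran] { ran = true; });
  std::thread os(&Thread::threadMain, thread->weakSelf());
  os.join();
  return ran;
}

// The caller drops its only reference before the entry point runs.
bool runAfterCallerDroppedReference() {
  bool ran = false;
  std::shared_ptr<Thread> thread = ThreadFactory::newThread([&ran] { ran = true; });
  std::weak_ptr<Thread> weak = thread->weakSelf();
  thread.reset();
  std::thread os(&Thread::threadMain, weak);
  os.join();
  return ran;
}
```

In standard C++11 and later, {\tt std::enable\_shared\_from\_this} offers a similar self-reference without an explicit helper, but it still requires that the object be created inside a {\tt shared\_ptr}, which is the constraint that motivates the factory here.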
{\tt Thread} and {\tt Runnable} objects reference each other. A {\tt Runnable}
object may need to know which thread it is executing in, and a {\tt Thread},
obviously, needs to know what {\tt Runnable} object it is hosting. This
interdependency is further complicated because the lifecycle of each object is
independent of the other. An application may create a set of {\tt Runnable}
objects to be used over and over in different threads, or it may create and
forget a {\tt Runnable} object once a thread has been created and started for it.

The {\tt Thread} class takes a {\tt boost::shared\_ptr} reference to the hosted
{\tt Runnable} object in its constructor, while the {\tt Runnable} class has an
explicit {\tt thread} method that binds it to the hosting thread.
{\tt ThreadFactory::newThread} binds the two objects to each other.
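A minimal sketch of this two-way binding follows. The choice to store the back-reference in the {\tt Runnable} as a weak pointer, so that the two objects do not keep each other alive, is an assumption of this sketch, as are {\tt std::shared\_ptr} in place of {\tt boost::shared\_ptr} and the hypothetical {\tt Noop} class and {\tt bindingRoundTrip} helper.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Thread;

class Runnable {
 public:
  virtual ~Runnable() = default;
  virtual void run() = 0;

  // Returns the hosting thread, or null if it no longer exists.
  std::shared_ptr<Thread> thread() const { return thread_.lock(); }

  // Explicit binding of the hosting thread.
  void thread(const std::shared_ptr<Thread>& value) { thread_ = value; }

 private:
  // Weak, so Thread and Runnable do not form an ownership cycle
  // (an assumption of this sketch).
  std::weak_ptr<Thread> thread_;
};

class Thread {
 public:
  // The hosted Runnable is supplied at construction time.
  explicit Thread(std::shared_ptr<Runnable> runnable)
      : runnable_(std::move(runnable)) {}

  std::shared_ptr<Runnable> runnable() const { return runnable_; }

 private:
  std::shared_ptr<Runnable> runnable_;
};

class ThreadFactory {
 public:
  // Creates the Thread for a Runnable and binds the two to each other.
  std::shared_ptr<Thread> newThread(std::shared_ptr<Runnable> runnable) const {
    assert(runnable != nullptr);
    std::shared_ptr<Thread> thread = std::make_shared<Thread>(runnable);
    runnable->thread(thread);
    return thread;
  }
};

class Noop : public Runnable {
 public:
  void run() override {}
};

// Usage: both directions are bound by the factory, and the Runnable's
// back-reference expires once the Thread is released.
bool bindingRoundTrip() {
  std::shared_ptr<Noop> task = std::make_shared<Noop>();
  std::shared_ptr<Thread> thread = ThreadFactory().newThread(task);
  const bool bound = thread->runnable() == task && task->thread() == thread;
  thread.reset();
  return bound && task->thread() == nullptr;
}
```

With this arrangement a {\tt Runnable} can outlive the thread that hosted it and be handed to another one, which matches the independent lifecycles described above.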
\subsection{ThreadManager}

{\tt facebook::thread::ThreadManager} creates a pool of worker threads and
allows applications to schedule tasks for execution as free worker threads
become available. The {\tt ThreadManager} does not implement dynamic
thread pool resizing, but provides primitives so that applications can add
and remove threads based on load. This approach was chosen because
load metrics and thread pool sizing policies are very application
specific. For example, some applications may want to adjust pool size based
on a running average of work arrival rates that are measured via polled
samples. Others may simply wish to react immediately to work-queue
depth high and low water marks. Rather than trying to create a complex
API that is abstract enough to capture these different approaches, we
simply leave it up to the particular application and provide the
primitives to enact the desired policy and sample current status.
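As an illustration of what such primitives might look like, here is a minimal sketch of a pool that exposes add-worker, remove-worker, and status-sampling operations and leaves the resizing policy to the caller. It is not the Thrift {\tt ThreadManager}: the method names, the use of standard C++11 threading facilities, and the {\tt runTasks} helper are all assumptions of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class ThreadManager {
 public:
  using Task = std::function<void()>;

  ~ThreadManager() { stop(); }

  // Primitive: grow the pool by n workers.
  void addWorker(std::size_t n = 1) {
    std::lock_guard<std::mutex> lock(mutex_);
    assert(!stopping_);
    for (std::size_t i = 0; i < n; ++i) {
      ++liveWorkers_;
      threads_.emplace_back([this] { workerLoop(); });
    }
  }

  // Primitive: ask up to n workers to exit after their current task.
  void removeWorker(std::size_t n = 1) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      const std::size_t removable = liveWorkers_ - retiring_;
      retiring_ += (n < removable) ? n : removable;
    }
    cond_.notify_all();
  }

  void add(Task task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!stopping_);
      tasks_.push_back(std::move(task));
    }
    cond_.notify_one();
  }

  // Status samples that an application-level policy can poll.
  std::size_t pendingTaskCount() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return tasks_.size();
  }
  std::size_t workerCount() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return liveWorkers_ - retiring_;
  }

  // Remaining workers finish the queued tasks, then all threads are joined.
  void stop() {
    std::vector<std::thread> threads;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
      threads.swap(threads_);
    }
    cond_.notify_all();
    for (std::thread& t : threads) t.join();
  }

 private:
  void workerLoop() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cond_.wait(lock, [this] { return stopping_ || retiring_ > 0 || !tasks_.empty(); });
      if (retiring_ > 0) { --retiring_; --liveWorkers_; return; }
      if (tasks_.empty()) { --liveWorkers_; return; }  // stopping and drained
      Task task = std::move(tasks_.front());
      tasks_.pop_front();
      lock.unlock();
      task();  // run outside the lock so other workers can proceed
      lock.lock();
    }
  }

  mutable std::mutex mutex_;
  std::condition_variable cond_;
  std::deque<Task> tasks_;
  std::vector<std::thread> threads_;
  std::size_t liveWorkers_ = 0;
  std::size_t retiring_ = 0;
  bool stopping_ = false;
};

// Usage: queue work, size the pool, and let stop() drain the queue.
int runTasks(int taskCount, std::size_t workers) {
  std::atomic<int> done{0};
  ThreadManager manager;
  for (int i = 0; i < taskCount; ++i) manager.add([&done] { ++done; });
  manager.addWorker(workers);
  manager.stop();
  return done.load();
}
```

An application policy, whether based on a running average or on queue-depth water marks, would poll {\tt pendingTaskCount} and {\tt workerCount} and call {\tt addWorker} or {\tt removeWorker} accordingly.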
\subsection{TimerManager}

{\tt facebook::thread::TimerManager} allows applications to schedule
{\tt Runnable} object execution at some point in the future. Its specific task
is to allow applications to sample {\tt ThreadManager} load at regular
intervals and make changes to the thread pool size based on application policy.
Of course, it can be used to generate any number of timer or alarm events.

The default implementation of {\tt TimerManager} uses a single thread to
execute expired {\tt Runnable} objects. Thus, if a timer operation needs to
do a large amount of work, and especially if it needs to do blocking I/O,
that should be done in a separate thread.
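The single-dispatcher design can be sketched as follows. This is an illustration rather than the Thrift {\tt TimerManager}: it assumes standard C++11 facilities, takes plain callables instead of {\tt Runnable} objects, and the {\tt add} signature and the {\tt firingOrder} helper are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

class TimerManager {
 public:
  using Clock = std::chrono::steady_clock;
  using Task = std::function<void()>;

  TimerManager() : dispatcher_([this] { dispatchLoop(); }) {}

  // Tasks that have not expired by the time of destruction are dropped.
  ~TimerManager() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cond_.notify_all();
    dispatcher_.join();
  }

  // Schedules task to run on the dispatcher thread after the given delay.
  void add(Task task, std::chrono::milliseconds delay) {
    assert(task != nullptr);
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.emplace(Clock::now() + delay, std::move(task));
    }
    cond_.notify_all();  // the earliest deadline may have changed
  }

 private:
  void dispatchLoop() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!stopping_) {
      if (tasks_.empty()) {
        cond_.wait(lock);
        continue;
      }
      auto next = tasks_.begin();  // earliest deadline first
      const Clock::time_point deadline = next->first;
      if (Clock::now() < deadline) {
        cond_.wait_until(lock, deadline);
        continue;  // re-check: a new task or a stop request may have arrived
      }
      Task task = std::move(next->second);
      tasks_.erase(next);
      lock.unlock();
      task();  // every expired task runs on this one thread
      lock.lock();
    }
  }

  std::mutex mutex_;
  std::condition_variable cond_;
  std::multimap<Clock::time_point, Task> tasks_;
  bool stopping_ = false;
  std::thread dispatcher_;  // declared last so the other members exist first
};

// Usage: two timers scheduled out of order fire in deadline order.
std::vector<int> firingOrder() {
  std::vector<int> order;
  std::promise<void> lastFired;
  {
    TimerManager timers;
    timers.add([&order, &lastFired] { order.push_back(2); lastFired.set_value(); },
               std::chrono::milliseconds(60));
    timers.add([&order] { order.push_back(1); }, std::chrono::milliseconds(5));
    lastFired.get_future().wait();
  }  // the destructor joins the dispatcher thread
  return order;
}
```

Because every expired task runs on the dispatcher thread, a slow task in this sketch delays all later timers, which is the reason long-running or blocking work should be handed off to another thread.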
\subsection{Nonblocking Operation}
Though the Thrift transport interfaces map more directly to a blocking I/O
@ -879,11 +1030,18 @@ Thrift is a successor to Pillar, a similar system developed
by Adam D'Angelo, first while at Caltech and continued later at Facebook.
Thrift simply would not have happened without Adam's insights.

%\begin{thebibliography}{}
\begin{thebibliography}{}

%\bibitem{smith02}
%Smith, P. Q. reference text
\bibitem{boost.threads}
Kempf, William,
``Boost.Threads'',
\url{http://www.boost.org/doc/html/threads.html}

%\end{thebibliography}
\bibitem{boost.threadpool}
Henkel, Philipp,
``threadpool'',
\url{http://threadpool.sourceforge.net}

\end{thebibliography}

\end{document}