Class Mongrel::HttpServer
In: lib/mongrel.rb
Parent: Object

This is the main driver of Mongrel, while Mongrel::HttpParser and Mongrel::URIClassifier make up the majority of how the server functions. It's a very simple class that just has a thread accepting connections and a simple HttpServer#process_client method to do the heavy lifting with the IO and Ruby.

You use it by doing the following:

  server = HttpServer.new("0.0.0.0", 3000)
  server.register("/stuff", MyNiftyHandler.new)
  server.run.join

The last line can be just server.run if you don't want to join the thread used. If you don't join it, though, Ruby will mysteriously just exit on you.
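
The reason the join matters can be seen with plain Ruby threads, no Mongrel required: when the main thread reaches the end of the script, the process exits and takes every other thread down with it. A minimal sketch:

```ruby
# A background thread standing in for Mongrel's acceptor thread.
worker = Thread.new do
  sleep 0.2
  $finished = true
end

# Without this join, the main thread would fall off the end of the
# script immediately and the process would exit, killing the worker
# before it ever sets $finished.
worker.join
```

This is all `server.run.join` is doing: keeping the main thread alive until the acceptor thread finishes.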

Ruby's thread implementation is "interesting" to say the least. Experiments with many different types of IO processing simply cannot make a dent in it. Future releases of Mongrel will find other creative ways to make threads faster, but don't hold your breath until Ruby 1.9 is actually finally useful.

Methods

configure_socket_options   graceful_shutdown   new   process_client   reap_dead_workers   register   run   stop   unregister

Attributes

acceptor  [R]
classifier  [R]
host  [R]
num_processors  [R]
port  [R]
throttle  [R]
timeout  [R]
workers  [R]

Public Class methods

Creates a working server on host:port (strange things happen if port isn't a Number). Use HttpServer#run to start the server and HttpServer#acceptor.join to join the thread that's processing incoming requests on the socket.

The num_processors optional argument is the maximum number of concurrent connections to process; anything over this is closed immediately to maintain server performance. This may seem mean, but it is the most efficient way to deal with overload. Other schemes involve still parsing the client's request, which defeats the point of an overload handling system.

The throttle parameter is a sleep timeout (in hundredths of a second) that is placed between socket.accept calls to give the server a cheap throttle. It defaults to 0, and when it is 0 the sleep is skipped entirely.

[Source]

     # File lib/mongrel.rb, line 89
    def initialize(host, port, num_processors=950, throttle=0, timeout=60)
      tries = 0
      @socket = TCPServer.new(host, port)

      @classifier = URIClassifier.new
      @host = host
      @port = port
      @workers = ThreadGroup.new
      @throttle = throttle / 100.0
      @num_processors = num_processors
      @timeout = timeout
    end
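
Note the unit conversion in the constructor: throttle arrives in hundredths of a second but is stored in seconds, since it is passed straight to sleep later. A standalone sketch of the same arithmetic (the value 10 is just an example):

```ruby
throttle = 10  # hundredths of a second, as passed to HttpServer.new

# Integer division would silently truncate the throttle to zero:
throttle / 100            # => 0
# which is why the constructor divides by the Float 100.0 instead:
seconds = throttle / 100.0  # => 0.1
```

So a throttle of 10 means a 0.1-second pause between accepts.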

Public Instance methods

Sets platform-specific options on the listening socket: on Linux, TCP_DEFER_ACCEPT and TCP_CORK; on modern FreeBSD, the httpready accept filter when it is available.

[Source]

     # File lib/mongrel.rb, line 239
    def configure_socket_options
      case RUBY_PLATFORM
      when /linux/
        # 9 is currently TCP_DEFER_ACCEPT
        $tcp_defer_accept_opts = [Socket::SOL_TCP, 9, 1]
        $tcp_cork_opts = [Socket::SOL_TCP, 3, 1]
      when /freebsd(([1-4]\..{1,2})|5\.[0-4])/
        # Do nothing, just closing a bug when freebsd <= 5.4
      when /freebsd/
        # Use the HTTP accept filter if available.
        # The struct made by pack() is defined in /usr/include/sys/socket.h as accept_filter_arg
        unless `/sbin/sysctl -nq net.inet.accf.http`.empty?
          $tcp_defer_accept_opts = [Socket::SOL_SOCKET, Socket::SO_ACCEPTFILTER, ['httpready', nil].pack('a16a240')]
        end
      end
    end

Performs a wait on all the currently running threads and kills any that take too long. It waits @timeout seconds at a time, which can be set in .initialize or via mongrel_rails. The @throttle setting extends this waiting period by that much longer.

[Source]

     # File lib/mongrel.rb, line 232
    def graceful_shutdown
      while reap_dead_workers("shutdown") > 0
        STDERR.puts "Waiting for #{@workers.list.length} requests to finish, could take #{@timeout + @throttle} seconds."
        sleep @timeout / 10
      end
    end

Does the majority of the IO processing. It has been written in Ruby using about 7 different IO processing strategies, and no matter how it's done the performance just does not improve. It is currently carefully constructed to get the best possible performance, but anyone who thinks they can make it faster is more than welcome to take a crack at it.

[Source]

     # File lib/mongrel.rb, line 108
    def process_client(client)
      begin
        parser = HttpParser.new
        params = HttpParams.new
        request = nil
        data = client.readpartial(Const::CHUNK_SIZE)
        nparsed = 0

        # Assumption: nparsed will always be less since data will get filled with more
        # after each parsing.  If it doesn't get more then there was a problem
        # with the read operation on the client socket.  Effect is to stop processing when the
        # socket can't fill the buffer for further parsing.
        while nparsed < data.length
          nparsed = parser.execute(params, data, nparsed)

          if parser.finished?
            if not params[Const::REQUEST_PATH]
              # it might be a dumbass full host request header
              uri = URI.parse(params[Const::REQUEST_URI])
              params[Const::REQUEST_PATH] = uri.path
            end

            raise "No REQUEST PATH" if not params[Const::REQUEST_PATH]

            script_name, path_info, handlers = @classifier.resolve(params[Const::REQUEST_PATH])

            if handlers
              params[Const::PATH_INFO] = path_info
              params[Const::SCRIPT_NAME] = script_name

              # From http://www.ietf.org/rfc/rfc3875 :
              # "Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
              #  meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
              #  ultimate source of the request.  They identify the client for the
              #  immediate request to the server; that client may be a proxy, gateway,
              #  or other intermediary acting on behalf of the actual source client."
              params[Const::REMOTE_ADDR] = client.peeraddr.last

              # select handlers that want more detailed request notification
              notifiers = handlers.select { |h| h.request_notify }
              request = HttpRequest.new(params, client, notifiers)

              # in the case of large file uploads the user could close the socket, so skip those requests
              break if request.body == nil  # nil signals from HttpRequest::initialize that the request was aborted

              # request is good so far, continue processing the response
              response = HttpResponse.new(client)

              # Process each handler in registered order until we run out or one finalizes the response.
              handlers.each do |handler|
                handler.process(request, response)
                break if response.done or client.closed?
              end

              # And finally, if nobody closed the response off, we finalize it.
              unless response.done or client.closed?
                response.finished
              end
            else
              # Didn't find it, return a stock 404 response.
              client.write(Const::ERROR_404_RESPONSE)
            end

            break #done
          else
            # Parser is not done, queue up more data to read and continue parsing
            chunk = client.readpartial(Const::CHUNK_SIZE)
            break if !chunk or chunk.length == 0  # read failed, stop processing

            data << chunk
            if data.length >= Const::MAX_HEADER
              raise HttpParserError.new("HEADER is longer than allowed, aborting client early.")
            end
          end
        end
      rescue EOFError, Errno::ECONNRESET, Errno::EPIPE, Errno::EINVAL, Errno::EBADF
        client.close rescue nil
      rescue HttpParserError => e
        STDERR.puts "#{Time.now}: HTTP parse error, malformed request (#{params[Const::HTTP_X_FORWARDED_FOR] || client.peeraddr.last}): #{e.inspect}"
        STDERR.puts "#{Time.now}: REQUEST DATA: #{data.inspect}\n---\nPARAMS: #{params.inspect}\n---\n"
      rescue Errno::EMFILE
        reap_dead_workers('too many files')
      rescue Object => e
        STDERR.puts "#{Time.now}: Read error: #{e.inspect}"
        STDERR.puts e.backtrace.join("\n")
      ensure
        begin
          client.close
        rescue IOError
          # Already closed
        rescue Object => e
          STDERR.puts "#{Time.now}: Client error: #{e.inspect}"
          STDERR.puts e.backtrace.join("\n")
        end
        request.body.delete if request and request.body.class == Tempfile
      end
    end
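
The shape of the read/parse loop above can be illustrated with a toy "parser" over a StringIO standing in for the client socket. This is only a sketch: the 8-byte CHUNK and the blank-line header terminator are made-up stand-ins for Const::CHUNK_SIZE and Mongrel's real C parser.

```ruby
require 'stringio'

CHUNK = 8  # stand-in for Const::CHUNK_SIZE
client = StringIO.new("GET / HTTP/1.1\r\nHost: example\r\n\r\nignored body")

data = client.readpartial(CHUNK)

# Keep appending chunks until the "parser" (here: a search for the
# blank line that ends the headers) reports the request is complete,
# mirroring the nparsed < data.length loop above.
until data.include?("\r\n\r\n")
  chunk = client.readpartial(CHUNK) rescue nil
  break if chunk.nil? || chunk.empty?  # read failed, stop processing
  data << chunk
end

headers = data.split("\r\n\r\n", 2).first
```

The real loop differs in that parser.execute resumes from nparsed instead of rescanning, but the structure, read a chunk, feed the parser, stop on completion or a failed read, is the same.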

Used internally to kill off any worker threads that have taken too long to complete processing. Only called if there are too many processors currently servicing requests. It returns the count of workers still active after the reap is done, and it only runs if there are workers to reap.

[Source]

     # File lib/mongrel.rb, line 210
    def reap_dead_workers(reason='unknown')
      if @workers.list.length > 0
        STDERR.puts "#{Time.now}: Reaping #{@workers.list.length} threads for slow workers because of '#{reason}'"
        error_msg = "Mongrel timed out this thread: #{reason}"
        mark = Time.now
        @workers.list.each do |worker|
          worker[:started_on] = Time.now if not worker[:started_on]

          if mark - worker[:started_on] > @timeout + @throttle
            STDERR.puts "Thread #{worker.inspect} is too old, killing."
            worker.raise(TimeoutError.new(error_msg))
          end
        end
      end

      return @workers.list.length
    end

Simply registers a handler with the internal URIClassifier. When the URI matches the prefix of a request, your handler's HttpHandler#process method is called. See Mongrel::URIClassifier#register for more information.

If you set in_front=true then the passed-in handler is put at the front of the list for that particular URI. Otherwise it's placed at the end of the list.

[Source]

     # File lib/mongrel.rb, line 319
    def register(uri, handler, in_front=false)
      begin
        @classifier.register(uri, [handler])
      rescue URIClassifier::RegistrationError
        handlers = @classifier.resolve(uri)[2]
        method_name = in_front ? 'unshift' : 'push'
        handlers.send(method_name, handler)
      end
      handler.listener = self
    end
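
The in_front flag boils down to choosing between unshift and push on the URI's handler list. A stand-alone sketch of that ordering (the handler names here are made up, and the lambda stands in for repeated register calls on an already-registered URI):

```ruby
handlers = []

register = lambda do |handler, in_front = false|
  # Same trick as HttpServer#register: pick the Array method by name.
  handlers.send(in_front ? 'unshift' : 'push', handler)
end

register.call(:logging)
register.call(:app)
register.call(:auth, true)  # in_front = true jumps to the head of the chain

handlers  # [:auth, :logging, :app]
```

Since process_client walks the handler list in order until one finalizes the response, in_front=true is how a filter-style handler gets to run before the application handler.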

Runs the server. It returns the thread used so you can "join" it. You can also access the HttpServer#acceptor attribute to get the thread later.

[Source]

     # File lib/mongrel.rb, line 258
    def run
      BasicSocket.do_not_reverse_lookup = true

      configure_socket_options

      if defined?($tcp_defer_accept_opts) and $tcp_defer_accept_opts
        @socket.setsockopt(*$tcp_defer_accept_opts) rescue nil
      end

      @acceptor = Thread.new do
        begin
          while true
            begin
              client = @socket.accept

              if defined?($tcp_cork_opts) and $tcp_cork_opts
                client.setsockopt(*$tcp_cork_opts) rescue nil
              end

              worker_list = @workers.list

              if worker_list.length >= @num_processors
                STDERR.puts "Server overloaded with #{worker_list.length} processors (#@num_processors max). Dropping connection."
                client.close rescue nil
                reap_dead_workers("max processors")
              else
                thread = Thread.new(client) {|c| process_client(c) }
                thread[:started_on] = Time.now
                @workers.add(thread)

                sleep @throttle if @throttle > 0
              end
            rescue StopServer
              break
            rescue Errno::EMFILE
              reap_dead_workers("too many open files")
              sleep 0.5
            rescue Errno::ECONNABORTED
              # client closed the socket even before accept
              client.close rescue nil
            rescue Object => e
              STDERR.puts "#{Time.now}: Unhandled listen loop exception #{e.inspect}."
              STDERR.puts e.backtrace.join("\n")
            end
          end
          graceful_shutdown
        ensure
          @socket.close
          # STDERR.puts "#{Time.now}: Closed socket."
        end
      end

      return @acceptor
    end

Stops the acceptor thread and then causes the worker threads to finish off the request queue before finally exiting.

[Source]

     # File lib/mongrel.rb, line 339
    def stop(synchronous=false)
      @acceptor.raise(StopServer.new)

      if synchronous
        sleep(0.5) while @acceptor.alive?
      end
    end
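
The stop mechanism, raising an exception into the acceptor thread so it breaks out of a blocking accept, can be sketched with nothing but the socket stdlib. The StopServer class below is a local stand-in for Mongrel's exception of the same name:

```ruby
require 'socket'

class StopServer < Exception; end  # stand-in for Mongrel's StopServer

socket = TCPServer.new("127.0.0.1", 0)  # port 0: pick any free port
acceptor = Thread.new do
  begin
    loop { socket.accept }  # blocks here, like Mongrel's listen loop
  rescue StopServer
    # break out of the loop and fall through to cleanup
  ensure
    socket.close
  end
end

sleep 0.1                       # give the acceptor time to block in accept
acceptor.raise(StopServer.new)  # this is what stop does
acceptor.join                   # synchronous=true just polls alive? instead
```

Note that StopServer must descend from Exception rather than StandardError, so a blanket rescue inside the loop cannot accidentally swallow the shutdown signal.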

Removes any handlers registered at the given URI. See Mongrel::URIClassifier#unregister for more information. Remember this removes them all, so the entire processing chain goes away.

[Source]

     # File lib/mongrel.rb, line 333
    def unregister(uri)
      @classifier.unregister(uri)
    end
