Logging and querying web requests

Out-of-the-box, logworm will start logging the requests that your application gets, into a log table named web_log. This happens by default, and you don't have to do anything to setup this special behavior. Every log entry in that log table records the following fields:

  • timestamp and _ts_utc (this appears in ALL log tables)
  • request_id (an automatically assigned id, which also appears in ALL log tables)
  • summary --e.g. "POST /posturl - 200 0.000613"--, with method, url, response status, and processing time
  • request_method --e.g. "POST"
  • request_path --e.g. "/posturl"
  • response_status --e.g. 200
  • profiling --e.g. 0.000613
  • input --e.g. "table=sales&key=customer"-- if the request included any parameters (in the URL or form-encoded)
  • queue_size --e.g. 2--, indicating the size of the HEROKU_QUEUE_DEPTH variable, if present
  • request_ip --e.g. "24.63.134.228"--

One obvious thing you can do with that information is tail it. For that, you can use the lw-tail command:

$ lw-tail web_log

2010-03-02 @ 15:39:38 GMT ==> 
	_request_id: "23949566400260-1267544378088"
	_ts_utc: 1267544378089
	input: "table=sales&keys%5B%5D=customer&values%5B%5D=keith&keys%5B%5D=total&values%5B%5D=100&keys%5B%5D=&values%5B%5D="
	profiling: 0.000613
	queue_size: "-1"
	request_ip: "24.63.134.228"
	request_method: "POST"
	request_path: "/post"
	response_status: 200
	summary: "POST /post - 200 0.000613"

2010-03-02 @ 15:39:40 GMT ==> 
	_request_id: "23949566400260-1267544380865"
	_ts_utc: 1267544380866
	input: ""
	profiling: 0.00056
	queue_size: "-1"
	request_ip: "24.63.134.228, 10.242.45.203"
	request_method: "GET"
	request_path: "/favicon.ico"
	response_status: 404
	summary: "GET /favicon.ico - 404 0.00056"

lw-tail allows you to filter its results by conditions that you specify. For example, let's say that you are only interested in seeing 500 errors; you can then do the following:

$ lw-tail web_log -c '"response_status":"500"'
2010-04-09 @ 16:18:43 GMT ==> 
	_request_id: "2159095760-1270829923200"
	_ts_utc: 1270829923204
	input: ""
	profiling: 0.002967
	queue_size: -1
	request_ip: "127.0.0.1"
	request_method: "GET"
	request_path: "/error"
	response_status: 500
	summary: "GET /error - 500 0.002967"

Another thing you may want to do is get counts and groupings of this information. The lw-compute command-line tool is useful for that. For example, let's say you want to aggregate all requests by their return status (200,400,500,etc.) and count them --you can do that from your terminal with the following command:

$ lw-compute web_log count -g "response_status"
[response_status:200.0] 	 ==> count = 7.0
[response_status:404.0] 	 ==> count = 7.0
[response_status:500.0] 	 ==> count = 1.0

With lw-compute you can group by multiple fields or by functions (such as hour(_ts_utc)), compute sums, counts, min and max, and also filter by conditions and time, as with lw-tail. For example, to get the total processing time spent handling requests grouped by hour of day you can do the following:

$ lw-compute web_log sum profiling -g "hour(_ts_utc)"
[...]
[hour(_ts_utc):15] 	 ==> sum(profiling) = 0.073745
[hour(_ts_utc):16] 	 ==> sum(profiling) = 0.011361
[...]

For a complete guide on how to use these two commands, consult the lw-tail reference or the lw-compute reference.


Back to Overview or On to Logging app-specific data with the HTTP request entries