Friday, March 30, 2007

php: When ob_start Is Not Your Friend

When programming for the web, sooner or later you have to look beyond the contents of <body> and start messing about with HTTP headers.

And, in php in particular, you will soon encounter the dreaded "headers already sent" warning, which arises when you call 'header()' after some characters have already been output.

Fortunately, this problem is easily fixed by a call to ob_start, which turns on output buffering and holds back your html until the buffer is flushed or the script ends. Headers are placed where they're meant to be and all is sweetness and light.
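As a rough sketch of that fix (the function name and the X-Example header are made up for illustration, and it assumes nothing has been output before the function runs):

```php
<?php
// Hypothetical helper for illustration; the header name is made up too.
function send_page_with_late_header(): void
{
    // Start buffering before anything is printed.
    ob_start();

    echo "<p>Content printed before the header call.</p>\n";

    // Without ob_start() this call would normally raise the
    // "headers already sent" warning. With buffering on, the echo
    // above is still held in the buffer, so the header can be set.
    header('X-Example: set-after-output');

    // Flush the buffer: headers go out first, then the body.
    ob_end_flush();
}
```

Calling send_page_with_late_header() at the top of a script should print the paragraph with no warning, even though the echo comes before the header call.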

One common use of headers (which I have just started to get into) is to let someone view a file by streaming it to the browser from a script rather than just setting up a url link, as you might want to do when access to a particular set of files is restricted.

I'm not going to explain in great detail how this is done; other and more experienced heads have done this very well already (e.g. Leon Atkinson's 'Tricks of the Trade'). I'm more interested in describing a subtle but very annoying gotcha when you start doing this with output buffering in play.

You see, if you're streaming a binary file (like pdf or jpeg) for viewing, the binary data has to be very precise or the result will be unintelligible garbage. You must be very sure that no other data sneaks into the stream.

Data such as might be output prior to header calls...

With ob_start on, you will get no warning of this: the stray output is quietly buffered and sent along with the file. Like me, you may spend a good deal of time scratching your head, wondering why this stuff doesn't work, and probably cursing IE (which is a bit more sensitive to this than Firefox).
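One way to guard against this, sketched here under my own assumptions, is to throw away whatever is sitting in the output buffers just before streaming the file. Both function names and the example path are hypothetical, and the header set is the bare minimum rather than a complete recipe:

```php
<?php
/**
 * Discard active output buffers down to the given nesting level and
 * return the stray bytes they held, in the order they were written.
 */
function discard_buffered_output(int $downToLevel = 0): string
{
    $stray = '';
    while (ob_get_level() > $downToLevel) {
        $chunk = ob_get_clean();
        if ($chunk === false) {
            break; // buffer could not be removed; stop rather than loop forever
        }
        // Inner buffers hold later output, so outer content goes in front.
        $stray = $chunk . $stray;
    }
    return $stray;
}

/**
 * Stream a file to the browser, making sure no buffered junk precedes it.
 */
function stream_file(string $path, string $mimeType): void
{
    if (!is_readable($path)) {
        http_response_code(404);
        exit;
    }

    $stray = discard_buffered_output();
    if ($stray !== '') {
        // Log it so the source of the stray output can be tracked down.
        error_log('Discarded ' . strlen($stray) . ' stray byte(s) before streaming');
    }

    header('Content-Type: ' . $mimeType);
    header('Content-Length: ' . filesize($path));
    readfile($path);
    exit; // stop here so nothing is appended after the file data
}

// Example call (hypothetical path):
// stream_file('/path/to/restricted/report.pdf', 'application/pdf');
```

This only hides the symptom, of course. The log line is there so you can still go and find out where the stray bytes came from.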

Without ob_start you will get an illuminating error saying that the headers have already been sent. In this situation, *any* prior output is poisonous. Not just the result of print and echo statements, but anything that sits outside the <?php and ?> tags: html tags, spaces, carriage returns, anything.

Fortunately, the error message will usually give you enough information to track down and eliminate the rubbish.
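If you would rather check for this yourself than wait for the warning, headers_sent() can report the file and line where output began. A small sketch (the function name is my own invention):

```php
<?php
/**
 * Describe where output first started, or return null if nothing
 * has been sent yet and header() is still safe to call.
 */
function describe_early_output(): ?string
{
    // headers_sent() fills $file and $line by reference.
    if (headers_sent($file, $line)) {
        return "Output already started at {$file}:{$line}";
    }
    return null;
}
```

Note that while output buffering is on, stray output stays in the buffer and this will typically return null, which is exactly why the problem is so hard to spot with ob_start in play. Run the check with buffering turned off.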

Spurious output *after* the headers is probably less dangerous, but it's just as well to get rid of that too.

Hope this helps someone out there.
