Monday, January 04, 2010

PHP Tips: converting legacy content into UTF-8

Today we ran into encoding problems with legacy apps written in PHP. After some discussion and some googling, I found this interesting post:

In it, you can read about:
  • Latin1 on the inside, utf-8 on the outside
  • Embed utf-8 within latin1
I copied and pasted just a few lines of code:

// declare that the output will be in utf-8
header("Content-Type: text/html; charset=utf-8");
// open an output buffer, capturing all output
// when the script ends, the buffer is piped through this function, which encodes it from latin1 to utf-8
function output_handler($buffer) {
    return utf8_encode($buffer);