Dealing with Legacy PHP Applications Clinton R. Nixon [email_address]
What is a legacy application?
  Code you didn't write
  Code you wouldn't write
  Untested code
  Code with competing visions
What do we do with legacy code?
  We refactor!
  Refactoring is safely changing the implementation of code without changing the behavior of code.
Bad code smells
  What are some specific problems in legacy PHP code?
  No separation between PHP and HTML
  Lots of requires, few method calls
  Global variables
No separation between PHP and HTML
    <h1>Orders</h1>
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    foreach ($account->getOrders() as $order) {
        echo '<h2>' . $order['id'] . '</h2>';
        echo '<p>Status: ' . lookup_status($order['status_id']) . '<br />';
        echo 'Total: ';
        $total = array_reduce($order['purchases'],
            create_function('$a, $b', '$a += $b; return $a;'));
        echo $total . '</p>';
    }
    ?>
Separating controllers and views
  Even without a solid MVC architecture, this helps
  You can do this in several safe and easy steps
  You absolutely will find pain points
Why do I need to do this?
  Your code complexity will increase
  echo isn't as fun as it looks
  You will find hidden bugs and mistakes
The simplest view class
    class View {
        protected static $VIEW_PATH = '/wherever/views/';
        public function assign($name, $value) {
            return $this->$name = $value;
        }
        public function render($filename) {
            $filename = self::$VIEW_PATH . $filename;
            if (is_file($filename)) {
                ob_start();
                include($filename);
                return ob_get_clean();
            }
        }
    }
Obvious improvements to make
  Error handling
  Assignment by reference
  Changing view path
  Display convenience method
  Use-specific subclasses with helper methods
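
A few of the improvements listed above can be sketched as follows. This is only one possible shape, not the presenter's implementation: the constructor argument, `setViewPath()`, the `$vars` array with `__get()`, and the exception type are all assumptions made for illustration. Only `assign()`, `render()` and `display()` are names that appear in the slides.

```php
<?php
// Hypothetical extension of the simple View class from the slides.
class View {
    protected $viewPath;          // assumption: per-instance path instead of a static
    protected $vars = array();    // assumption: assigned values live in an array

    public function __construct($viewPath = '/wherever/views/') {
        $this->setViewPath($viewPath);
    }

    // "Changing view path"
    public function setViewPath($viewPath) {
        $this->viewPath = rtrim($viewPath, '/') . '/';
    }

    public function assign($name, $value) {
        $this->vars[$name] = $value;
        return $value;
    }

    // Lets templates keep writing $this->orders.
    public function __get($name) {
        return isset($this->vars[$name]) ? $this->vars[$name] : null;
    }

    // "Error handling": fail loudly instead of silently returning nothing.
    public function render($filename) {
        $path = $this->viewPath . $filename;
        if (!is_file($path)) {
            throw new RuntimeException("View file not found: $path");
        }
        ob_start();
        try {
            include $path;
        } catch (Exception $e) {
            ob_end_clean();   // do not leak a half-rendered buffer
            throw $e;
        }
        return ob_get_clean();
    }

    // "Display convenience method"
    public function display($filename) {
        echo $this->render($filename);
    }
}
```

Keeping assigned values in an array rather than as ad hoc object properties is a design choice of this sketch; it prevents a template variable from overwriting `$viewPath` by accident.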
The separation process
  Gather all your code
  Sift and separate controller from view code
  Assign variables to the view object
  Change all variable references in the view code
  Split the files
  Find duplicated views
The rules of view code
  Allowed:
    Control structures
    echo, or <?= $var ?>
    Display-specific functions, never nested
  Not allowed:
    Assignment
    Other function calls
Gather and sift code
  The step you won't like: gather all code for this controller
  Wipe brow
  Draw a line at the top of the code
  Move controller code above this line, fixing as necessary
  At this point, everything is view code
Code gathered
    <?php // View code goes below here ?>
    <h1>Orders</h1>
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    foreach ($account->getOrders() as $order) {
        echo '<h2>' . $order['id'] . '</h2>';
        echo '<p>Status: ' . lookup_status($order['status_id']) . '<br />';
        echo 'Total: ';
        $total = array_reduce($order['purchases'],
            create_function('$a, $b', '$a += $b; return $a;'));
        echo $total . '</p>';
    }
    ?>
Some controller code moved
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    ?>
    <?php // View code goes below here ?>
    <h1>Orders</h1>
    <?php foreach ($account->getOrders() as $order) { ?>
    <h2><?= $order['id'] ?></h2>
    <p>Status: <?= lookup_status($order['status_id']) ?>
    <br />
    Total: <?= array_reduce($order['purchases'],
        create_function('$a, $b', '$a += $b; return $a;')) ?>
    </p>
    <?php } ?>
Alternative control structures
    <?php if ($foo): ?>
    ...
    <?php endif; ?>
    <?php foreach ($this as $that): ?>
    ...
    <?php endforeach; ?>
Using alternative control structures
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    ?>
    <?php // View code goes below here ?>
    <h1>Orders</h1>
    <?php foreach ($account->getOrders() as $order): ?>
    <h2><?= $order['id'] ?></h2>
    <p>Status: <?= lookup_status($order['status_id']) ?>
    <br />
    Total: <?= array_reduce($order['purchases'],
        create_function('$a, $b', '$a += $b; return $a;')) ?>
    </p>
    <?php endforeach; ?>
A frustrating problem
    <?php foreach ($account->getOrders() as $order): ?>
    <h2><?= $order['id'] ?></h2>
    <p>Status: <?= lookup_status($order['status_id']) ?>
    <br />
    Total: <?= array_reduce($order['purchases'],
        create_function('$a, $b', '$a += $b; return $a;')) ?>
    </p>
    <?php endforeach; ?>
Dealing with this problem There are two approaches. You can create a new array of variables for your view. Or, you can encapsulate this logic in an object.
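
The first approach, building a new array of variables for the view, might look like the sketch below. The helper name `build_order_rows()` and the sample data are hypothetical, and `lookup_status()` is stubbed with a made-up mapping so the example runs on its own. It also swaps the `array_reduce` call for `array_sum`, assuming `purchases` is a flat list of numbers (note that `array_sum` returns 0 for an empty list, where the original reduce would return null).

```php
<?php
// Stand-in for the legacy lookup_status() helper; the mapping is invented.
function lookup_status($status_id) {
    $statuses = array(1 => 'Pending', 2 => 'Shipped');
    return isset($statuses[$status_id]) ? $statuses[$status_id] : 'Unknown';
}

// Controller-side step: turn raw order arrays into display-ready rows,
// so the view only has to echo values.
function build_order_rows(array $orders) {
    $rows = array();
    foreach ($orders as $order) {
        $rows[] = array(
            'id'     => $order['id'],
            'status' => lookup_status($order['status_id']),
            'total'  => array_sum($order['purchases']),
        );
    }
    return $rows;
}

// Hypothetical sample data in the shape the slides imply.
$orders = array(
    array('id' => 101, 'status_id' => 2, 'purchases' => array(10, 5.5)),
);
$order_rows = build_order_rows($orders);
```

The view would then loop over `$order_rows` and echo `$row['id']`, `$row['status']` and `$row['total']` with no function calls of its own.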
Our new order object
    <?php
    class Order {
        ...
        public function getStatus() {
            return lookup_status($this->getStatusId());
        }
        public function getTotal() {
            return array_reduce($this->getPurchases(),
                create_function('$a, $b', '$a += $b; return $a;'));
        }
    }
    ?>
Logic removed from view code
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    $orders = $account->getOrders();
    ?>
    <?php // View code goes below here ?>
    <h1>Orders</h1>
    <?php foreach ($orders as $order): ?>
    <h2><?= $order->getId() ?></h2>
    <p>Status: <?= $order->getStatus() ?>
    <br />
    Total: <?= $order->getTotal() ?>
    </p>
    <?php endforeach; ?>
Change all variables to view object variables
  Assign variables to the view object.
    $view->assign('foo', $foo);
  One-by-one, change variables in view code.
  Test to convince yourself.
  You will probably iterate back to the previous step.
  Document inputs to the view.
View object created
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    $orders = $account->getOrders();
    $view = new View();
    $view->assign('orders', $orders);
    ?>
    <?php // View code goes below here ?>
    <h1>Orders</h1>
    <?php foreach ($view->orders as $order): ?>
    <h2><?= $order->getId() ?></h2>
    <p>Status: <?= $order->getStatus() ?>
    <br />
    Total: <?= $order->getTotal() ?>
    </p>
    <?php endforeach; ?>
Separate the files
  Create a new file for the view code.
  Important! Search and replace $view with $this.
  Test one more time.
Our two files
  Controller:
    <?php
    $account = new Account($account_id);
    $account->loadOrders();
    $orders = $account->getOrders();
    $view = new View();
    $view->assign('orders', $orders);
    $view->display('orders.tpl');
    ?>
  View (orders.tpl):
    <h1>Orders</h1>
    <?php foreach ($this->orders as $order): ?>
    <h2><?= $order->getId() ?></h2>
    <p>Status: <?= $order->getStatus() ?>
    <br />
    Total: <?= $order->getTotal() ?>
    </p>
    <?php endforeach; ?>
Find duplicated views
  As you do this to multiple controllers, you will see repetition.
  There will probably be subtle differences.
  Take the time to re-work these so you can re-use view files.
  Note! You can include views in other views. Because render() returns the output rather than printing it, echo the result:
    <?= $this->render('included_file.tpl') ?>
Using nested requires instead of function calls
    <?php
    require_once('db_setup_inc.php');
    require_once('account_auth_inc.php');
    require_once('i18n_inc.php');
    echo '<h1>Orders for account #' . $account_id . '</h1>';
    require('get_all_orders_inc.php');
    ...
Untangling a require web
  Require statements which call other require statements.
  Can be very complex.
  Dependent on application structure.
Important reasons to untangle this web
  Remove unneeded complexity.
  Create less procedural code.
  Prior to PHP 5.2, require_once and include_once are more expensive than you would think.
  If you are requiring class definitions, and you have a standard file naming method, use __autoload().
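
As a rough illustration of the autoloading advice: `__autoload()` was deprecated in PHP 7.2 and removed in PHP 8.0, so the sketch below uses `spl_autoload_register()` instead. The naming convention (class `Foo_Bar` lives in `Foo/Bar.php` under a base directory), the `make_autoloader()` helper and the `classes` directory are assumptions for illustration; adapt them to whatever standard your files actually follow.

```php
<?php
// Hypothetical helper: returns an autoloader for the assumed convention
// "class Foo_Bar is defined in <baseDir>/Foo/Bar.php".
function make_autoloader($baseDir) {
    return function ($class) use ($baseDir) {
        $file = rtrim($baseDir, '/') . '/' . str_replace('_', '/', $class) . '.php';
        if (is_file($file)) {
            require $file;
        }
        // If the file is missing, do nothing so other autoloaders can try.
    };
}

// Register once, early in the request; the directory name is an assumption.
spl_autoload_register(make_autoloader(__DIR__ . '/classes'));
```

With this in place, the `require_once` lines that exist only to pull in class definitions can be deleted one at a time.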
The untangling process
  Identify inputs
  Identify outputs
  Wrap the file in a method
  Refactor method
  Move method to correct location
Identify inputs and outputs
  Find all variables expected to be set before this file is included.
    One possible way: execute this file by itself.
  Find all variables expected to be set or mutated by this file.
    Set variables are easy: comment out the require and watch the errors.
    Mutated is the set of inputs changed. Learn to search for these!
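
One way to speed up this detective work, offered as a sketch rather than a method from the talk: include the file inside a function so it runs in an isolated scope, then compare the variables before and after. The helper name `probe_include()` and its report format are hypothetical. It will not notice changes to superglobals such as `$_COOKIE`, and it detects a mutated object only if the variable is rebound to a different object, so searching the source is still necessary.

```php
<?php
// Hypothetical helper: runs an include file with the given input variables
// and reports which variables it newly set and which inputs it changed.
function probe_include($__file, array $__inputs = array()) {
    extract($__inputs);               // make the inputs visible to the file
    include $__file;                  // runs in this function's scope
    $__after = get_defined_vars();    // snapshot of everything now defined
    unset($__after['__file'], $__after['__inputs']);

    $__report = array('set' => array(), 'mutated' => array());
    foreach ($__after as $__name => $__value) {
        if (!array_key_exists($__name, $__inputs)) {
            $__report['set'][] = $__name;        // new variable: an output
        } elseif ($__value !== $__inputs[$__name]) {
            $__report['mutated'][] = $__name;    // input that was changed
        }
    }
    return $__report;
}
```

Calling it with an empty input array and watching for undefined-variable warnings gives a first list of inputs; calling it again with those inputs filled in gives the outputs.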
account_auth_inc.php
    <?php
    $auth_token = $_COOKIE['token'];
    if ($auth_token) {
        $acct_id = $db->GetOne('SELECT acct_id FROM logins WHERE auth_token = ?',
            array($auth_token));
    }
    if ($acct_id) {
        $acct = new Account($acct_id);
    } else {
        $acct = null;
    }
    $_COOKIE['token'] = gen_new_token($auth_token);
Wrap the file in a function
  Wrap the entire include in a function.
  Pass all input variables.
  Return all output variables as an array.
  And then, call that function at the bottom of the required file!
  This is a mess!
Function-wrapped
    <?php
    function account_auth($db, $auth_token) {
        if ($auth_token) {
            $acct_id = $db->GetOne('SELECT acct_id FROM logins WHERE auth_token = ?',
                array($auth_token));
        }
        if ($acct_id) {
            $acct = new Account($acct_id);
        } else {
            $acct = null;
        }
        return array($acct, gen_new_token($auth_token));
    }
    list($acct, $_COOKIE['token']) = account_auth($db, $_COOKIE['token']);
Refactor until complete
  Tease out the functions, or objects, inside this function.
  If you are returning a lot of data, see if it can be an object.
  Leave your temporary big function in place, so that your outside code doesn't break.
  Keep updating it to deal with your refactoring.
Moved token handling to Account
    <?php
    function account_auth($db, $auth_token) {
        // Instead of null, we now return an unloaded Account.
        $acct = new Account();
        if ($auth_token) {
            // SQL code from before
            $acct->loadFromToken($auth_token);
            // Token generation and cookie setting
            $acct->genNewToken($auth_token);
        }
        return $acct;
    }
    $acct = account_auth($db, $_COOKIE['token']);
Move to correct location
  Finally!
  Figure out where these functions or objects should live in your application.
  Move them there.
  Find where the require is called throughout your application, and replace that with your new function call or object method.
Global variables everywhere
    <?php
    $account_id = $_POST['acct_id'];
    $account = new Account($account_id);
    function getPurchases() {
        global $account;
        global $database;
        ...
    }
    function getLanguage() {
        global $account;
        global $database;
        global $i18n;
        ...
    }
Removing globals one by one
  Common globals:
    $_POST and $_GET
    Session or cookie data
    Database handles
    User account
    Language
Do you still have register_globals on?
  You may have heard: this is a bad idea.
  You may think that it will be impossible to fix.
  It's not.
    Turn on E_ALL.
    Spider your site and grep for uninitialized variables.
  It's some work, but not as hard as you think. It's worth it.
$_POST and $_GET
  These aren't horrible.
  But not horrible isn't a very high standard.
    class InputVariable {
        public function __construct($name) {...}
        public function isSet() {...}
        public function isGet() {...}
        public function isPost() {...}
        public function getAsString() {...}
        public function getAsInt() {...}
        ...
    }
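
The slide leaves every method body elided, so the following is only a guess at how such a wrapper could behave, not the presenter's code. The assumptions: POST takes precedence over GET, non-scalar input yields an empty string or 0, and the constructor accepts optional arrays in place of `$_GET` and `$_POST` so the class can be exercised without a real request. A method named `isSet()` requires PHP 7.0 or later, where reserved words became legal method names.

```php
<?php
// Sketch of an input wrapper; all behaviour here is assumed, not sourced.
class InputVariable {
    private $name;
    private $get;
    private $post;

    // $get and $post are optional test seams (an addition of this sketch).
    public function __construct($name, $get = null, $post = null) {
        $this->name = $name;
        $this->get  = ($get === null)  ? $_GET  : $get;
        $this->post = ($post === null) ? $_POST : $post;
    }

    public function isGet() {
        return array_key_exists($this->name, $this->get);
    }

    public function isPost() {
        return array_key_exists($this->name, $this->post);
    }

    public function isSet() {
        return $this->isGet() || $this->isPost();
    }

    // Assumption: POST wins when both are present.
    private function raw() {
        if ($this->isPost()) {
            return $this->post[$this->name];
        }
        return $this->isGet() ? $this->get[$this->name] : null;
    }

    public function getAsString() {
        $value = $this->raw();
        return is_scalar($value) ? (string) $value : '';
    }

    public function getAsInt() {
        $value = $this->raw();
        return is_scalar($value) ? (int) $value : 0;
    }
}
```

A call such as `$_POST['acct_id']` would then become `(new InputVariable('acct_id'))->getAsInt()`, which gives one place to tighten validation later.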
The database global object
  Very common in PHP code
  Again, not horrible
  Prevents testing
  Prevents multiple databases
Parameterizing the DB handle
  Does it need to be everywhere?
  Can you pass it in to a function or to a constructor?
  The process is simple.
    Add database parameter.
    Pass in that global variable.
    If the call is not in global scope, find out how to pass in that variable to the current scope.
    Repeat.
Parameterizing globals
    <?php
    $account_id = $_POST['acct_id'];
    $account = new Account($database, $account_id);
    // $account and $i18n are now parameters, so their global declarations
    // are removed; leaving "global $account;" in place would rebind the
    // parameter to the global and defeat the change.
    function getPurchases($account) {
        global $database;
        ...
    }
    function getLanguage($account, $i18n) {
        global $database;
        ...
    }
Maybe it does have to be everywhere.
  Use a singleton.
  But not really.
  Make a way to change the singleton instance.
    Global define or environment variable.
    Static mutator.
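
A minimal sketch of a singleton whose instance can be changed, combining the environment-variable and static-mutator ideas above. The class shape, the `APP_DATABASE_DSN` variable name and the fallback DSN are all hypothetical; a real handle would wrap an actual connection rather than just a string.

```php
<?php
// Illustrative only: a stand-in "database" that just remembers its DSN.
class Database {
    private static $instance = null;
    private $dsn;

    public function __construct($dsn) {
        $this->dsn = $dsn;
    }

    public function getDsn() {
        return $this->dsn;
    }

    // Lazily builds the shared instance, configured from the environment.
    public static function getInstance() {
        if (self::$instance === null) {
            $dsn = getenv('APP_DATABASE_DSN');   // assumed variable name
            self::$instance = new self($dsn !== false ? $dsn : 'sqlite::memory:');
        }
        return self::$instance;
    }

    // Static mutator: tests (or a second database) can swap the instance.
    // Passing null resets it so the next getInstance() call rebuilds it.
    public static function setInstance($instance) {
        self::$instance = $instance;
    }
}
```

In a test, `Database::setInstance($fakeDb)` replaces the shared handle without touching the calling code, which addresses the "prevents testing" objection to a plain global.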
A quick recap
  What are some specific problems in legacy PHP code?
  Mixed PHP and HTML: confusion between controller and view
  Use of require statements instead of function calls
  Unnecessary global variables causing dependencies
Further reading
  Working Effectively with Legacy Code, Michael Feathers
  Refactoring, Martin Fowler
Questions? [email_address]
