Detecting (and even preventing) SQL Injection
Using the Percona Toolkit and Noinject!
Justin Swanhart
Percona Live, April 2013
INTRODUCTION
Introduction
• Who am I?
• What do I do?
• Why am I here?
The tools
• MySQL (5.0+)
• Percona Toolkit
– pt-query-digest
– pt-fingerprint
• MySQL Proxy (0.8.0+)
• Apache and PHP 5.3+
WHAT IS SQL INJECTION?
What is SQL injection?
• SQL injection is an attack vector
– An attacker modifies the SQL queries which will be
executed by the server
– But the attacker does not need to change the code
on the server or get access to the server
What is SQL injection – interpolation (strings)
$username = $_GET['username'];
$sql =
"select 1
from users.users
where admin_flag=true
and username = '" . $username . "'";
$ wget "http://host/path.php?username=bob"
$ wget "http://host/path.php?username=' or '1'='1"
The WHERE clause becomes: and username = '' or '1' = '1'
SQL injection!
Escape strings, or use prepared statements!
#escape string values
$username = mysqli_real_escape_string($conn, $_GET['username']);
$sql = "select … and username = '" . $username . "'";
#prepared statement
$username = $_GET['username'];
$stmt = mysqli_stmt_init($conn);
$sql = "select … and username = ?";
mysqli_stmt_prepare($stmt, $sql);
mysqli_stmt_bind_param($stmt, "s", $username);
mysqli_stmt_execute($stmt);
mysqli_stmt_close($stmt);
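The difference between interpolating and binding can be replayed end to end. The following is a minimal Python sketch, not the deck's PHP: it assumes an in-memory SQLite database as a stand-in for MySQL, and the table layout and function names are hypothetical.

```python
import sqlite3

# Stand-in database (assumption: SQLite instead of MySQL, made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, admin_flag INTEGER)")
conn.execute("INSERT INTO users VALUES ('bob', 1), ('alice', 0)")

def is_admin_interpolated(username):
    # Vulnerable: the input becomes part of the SQL text itself.
    sql = ("select 1 from users where admin_flag = 1 "
           "and username = '" + username + "'")
    return conn.execute(sql).fetchone() is not None

def is_admin_parameterized(username):
    # Safe: the input is bound as a value and is never parsed as SQL.
    sql = "select 1 from users where admin_flag = 1 and username = ?"
    return conn.execute(sql, (username,)).fetchone() is not None

payload = "' or '1'='1"
print(is_admin_interpolated(payload))   # the injected OR makes every row match
print(is_admin_parameterized(payload))  # no user has this literal name
```

With the payload from the slide, the interpolated version reports an admin while the parameterized version does not, because the quote characters only ever reach the database as data.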
What is SQL injection – interpolation (ints)
$user_id = $_GET['user_id'];
$sql =
"select 1
from users.users
where admin_flag=true
and user_id = " . $user_id;
…
$ wget "http://host/path.php?user_id=1"
$ wget "http://host/path.php?user_id=1 or 1=1"
SQL injection!
Use type checking, or prepared statements!
#check that integers really are integers!
$user_id = $_GET['user_id'];
if(!is_numeric($user_id)) $user_id = "NULL";
$sql = "select … and user_id = " . $user_id;
#prepared statement
$user_id = $_GET['user_id'];
$sql = "select … and user_id = ?";
…
mysqli_stmt_bind_param($stmt, "i", $user_id);
mysqli_stmt_execute($stmt);
When escaping can’t help
• Some parts of a SQL statement can’t be
manipulated using parameters
• These include
– ORDER BY columns
– Variable number of items in an IN list
– Adding SQL syntax like DISTINCT
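For the variable-length IN list, a common workaround is to generate one placeholder per item, so that only the number of placeholders changes while every value is still bound. The sketch below is in Python with a hypothetical function name and the `listings` table borrowed from the later slides; it illustrates the idea and is not code from the talk.

```python
def in_list_query(ids):
    """Build an IN query with one bound placeholder per item."""
    if not ids:
        raise ValueError("IN list must not be empty")
    # Only the count of "?" markers depends on the input;
    # the values are returned separately for binding.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"select id from listings where id in ({placeholders})"
    return sql, tuple(ids)

sql, params = in_list_query([1, 2, 3])
print(sql)
print(params)
```

The SQL text never contains user data, so the statement can be passed to any driver that accepts positional parameters.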
Don’t use user input in the query
#avoid using user input directly in ANY way
$sql = "select * from listings where deleted = 0 and sold = 0 and open = 1";
if(!empty($_GET['ob'])) {
    $sql .= " ORDER BY " . $_GET['ob'];
}
wget … ?ob=post_date
wget … ?ob="post_date union all (select * from listings)"
Now we can see all listings
Bad!
Use whitelisting instead
#avoid using user input directly in ANY way
$sql = "select * from listings where deleted = 0 and sold = 0 and open = 1";
$allowed = array('post_date','neighborhood','etc');
if(!empty($_GET['ob']) && is_string($_GET['ob'])) {
    if(in_array($_GET['ob'], $allowed)) {
        $sql .= " ORDER BY " . $_GET['ob'];
    }
}
wget … ?ob=post_date
wget … ?ob="post_date union all (select * from listings)"
in_array() is the keeper of the gate
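A slightly stricter variant of the same gate maps request values to column names, so the text appended to the query always comes from constants in the program and never from the request. This is a Python sketch under assumptions: the request keys (`date`, `area`), the direction handling and the function name are hypothetical and not part of the talk.

```python
# Hypothetical mapping from request values to real column names.
ORDER_COLUMNS = {"date": "post_date", "area": "neighborhood"}
DIRECTIONS = {"asc": "ASC", "desc": "DESC"}

def order_by_clause(sort_key, direction="asc"):
    """Return an ORDER BY clause built only from known constants."""
    column = ORDER_COLUMNS.get(sort_key) if isinstance(sort_key, str) else None
    if column is None:
        return ""  # unknown or malformed input: no ORDER BY at all
    keyword = DIRECTIONS.get(direction, "ASC") if isinstance(direction, str) else "ASC"
    return f" ORDER BY {column} {keyword}"

print(order_by_clause("date"))
print(repr(order_by_clause("post_date union all (select * from listings)")))
```

Anything that is not an exact key in the mapping is dropped, which is the same decision `in_array()` makes in the PHP version.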
All that works great for the apps you control
• BUT…
– If you don’t have the source for an app, then you
really can’t be sure it is safe from SQL injection
– Or maybe you have to support old apps
– Or apps that were not developed rigorously
– What do we do in these cases?
SQL INJECTION DETECTION USING
PT-QUERY-DIGEST
Out-of-band SQL injection detection
How to detect SQL injection?
• Most applications only do a small number of
things.
– Add orders, mark orders as shipped, update
addresses, etc.
– The SQL “patterns” that identify these behaviors
can be collected and whitelisted.
– Queries that don’t match a known fingerprint may
be investigated as SQL injection attempts
What is a query fingerprint?
• A query fingerprinting algorithm transforms a
query into a form that allows like queries to be
grouped together and identified as a unit
– In other words, these like queries share a
fingerprint
– Even though the queries differ slightly they still
fingerprint to the same value
– This is a heuristic-based approach
Tools that support query fingerprints
• Percona Toolkit tools
– pt-query-digest: reads slow query logs and populates
the whitelist table. Can also be used to display new
queries that have not been marked as allowed.
– pt-fingerprint: takes a query (or queries) and
produces fingerprints. Useful for third-party tools
that want to use fingerprints.
What is a query fingerprint? (cont.)
select * from some_table where col = 3
becomes
select * from some_table where col = ?
select * from some_table where col IN (1,2)
becomes
select * from some_table where col IN (?)
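The transformation above can be approximated with a few regular expressions. The Python sketch below is deliberately simplified and is an assumption throughout: it is not the pt-fingerprint algorithm, which applies more rules and may format its output differently, and the function name is hypothetical.

```python
import re

def fingerprint(query):
    """Simplified query fingerprint (NOT the exact pt-fingerprint rules)."""
    q = query.strip().lower()
    q = re.sub(r"/\*.*?\*/", "", q, flags=re.S)   # drop /* ... */ comments
    q = re.sub(r"'(?:[^'\\]|\\.)*'", "?", q)      # quoted strings become ?
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "?", q)      # numeric literals become ?
    # Collapse IN lists of any length to a single placeholder.
    q = re.sub(r"\bin\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", "in (?)", q)
    return re.sub(r"\s+", " ", q).strip()         # normalize whitespace

legit = fingerprint("select 1 from users where username = 'bob'")
attack = fingerprint("select 1 from users where username = '' or '1' = '1'")
print(legit)
print(attack)
```

Queries that differ only in their literal values collapse to the same string, while the injected `or` clause changes the shape of the statement and therefore produces a fingerprint that is not on the whitelist.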
Query fingerprints expressed as hashes
pt-query-digest can express fingerprints as short
checksums; the number in parentheses is base 10
select * from some_table where col = ?
982e5737f9747a5d (1631105377)
select * from some_table where col IN (?)
2da8ed487cdfc1c8 (1680229806268)
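A checksum of this kind can be derived from the fingerprint text with a standard hash. The Python sketch below rests on an assumption that is not stated in the slides: that the checksum is the right-most 16 hex digits of the MD5 of the fingerprint, which is how Percona Toolkit 2.x releases are believed to compute it. The function names are hypothetical, and the sketch does not attempt to reproduce the example values shown on the slide.

```python
import hashlib

def checksum_hex(fingerprint):
    # Assumption: right-most 16 hex digits of the MD5 of the fingerprint.
    digest = hashlib.md5(fingerprint.encode("utf-8")).hexdigest()
    return digest[-16:].upper()

def checksum_decimal(fingerprint):
    # The same 64-bit value in base 10, as an unsigned BIGINT column stores it.
    return int(checksum_hex(fingerprint), 16)

fp = "select * from some_table where col = ?"
print(checksum_hex(fp), checksum_decimal(fp))
```

Storing the base 10 form lets the review table use a numeric primary key, which is why the UPDATE statements in this deck compare `checksum` against plain integers.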
pt-query-digest
• Normally used for profiling slow queries
• Has a “SQL review” feature for DBAs
– Designed to mark query fingerprints as having
been reviewed
– This feature can be co-opted to discover new
query fingerprints automatically
– New fingerprints are either new application code
or SQL injection attempts
pt-query-digest – review feature
• Need to store the fingerprints in a table
– Known good fingerprints will be marked as
reviewed
– If pt-query-digest discovers new fingerprints you
will be alerted because there will be unreviewed
queries in the table
pt-query-digest - review table initialization
Need to initialize the table
pt-query-digest /path/to/slow.log \
  --create-review-table \
  --review "h=127.0.0.1,P=3306,u=percona,p=2un1c0rns,D=percona,t=whitelist" \
  --sample 1 \
  --no-report
• --review: where to store fingerprints
• --sample 1: don’t waste time on stats
• --no-report: don’t print the report
pt-query-digest – command-line review
pt-query-digest /path/to/slow.log \
  --review "DSN…" \
  --sample 1 \
  --report \
  --limit 0
• --review: how it knows which queries have already
been reviewed
• --sample 1: don’t collect stats, just sample one of
each new fingerprint
• --report: display the report of queries
• --limit 0: ensure that all unreviewed queries are shown
USING THE WHITELIST WITH SQL
Detecting new query fingerprints
-- Any new queries?
SELECT count(*)
FROM percona.whitelist
WHERE reviewed_by IS NULL;
-- Get a list of the queries
SELECT checksum, sample
FROM percona.whitelist
WHERE reviewed_by IS NULL;
percona.whitelist is just an example name; you can
use any name you like
Add a query fingerprint to the whitelist
UPDATE percona.whitelist
SET reviewed_by = 'allow',
reviewed_on = now()
WHERE checksum = 1680229806268;
Blacklist a query fingerprint
You might also explicitly blacklist a fingerprint
UPDATE percona.whitelist
SET reviewed_by = 'deny',
reviewed_on = now()
WHERE checksum = 1631105377;
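The review cycle of the last three slides (find unreviewed fingerprints, then mark each one as allowed or denied) can be scripted. The Python sketch below is illustrative only: it assumes an in-memory SQLite table as a stand-in for the MySQL review table, keeps just the four columns used in the slides, pairs the slide's checksums with made-up samples, and uses hypothetical function names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for percona.whitelist in MySQL
conn.execute("""CREATE TABLE whitelist (
    checksum INTEGER PRIMARY KEY, sample TEXT,
    reviewed_by TEXT, reviewed_on TEXT)""")
conn.executemany(
    "INSERT INTO whitelist (checksum, sample) VALUES (?, ?)",
    [(1680229806268, "select * from some_table where col IN (1,2)"),
     (1631105377, "select * from some_table where col = 3")])

def unreviewed(conn):
    # Any row returned here is a new fingerprint that should raise an alert.
    return conn.execute(
        "SELECT checksum, sample FROM whitelist "
        "WHERE reviewed_by IS NULL ORDER BY checksum").fetchall()

def review(conn, checksum, verdict):
    if verdict not in ("allow", "deny"):
        raise ValueError("verdict must be 'allow' or 'deny'")
    conn.execute(
        "UPDATE whitelist SET reviewed_by = ?, reviewed_on = datetime('now') "
        "WHERE checksum = ?", (verdict, checksum))

print(len(unreviewed(conn)), "fingerprints waiting for review")
```

Run from a scheduler after each pt-query-digest pass, a non-empty result from `unreviewed()` is the signal to alert someone.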
Web interface for whitelist management
• The Noinject! project (discussed later) has a
web interface that can be used to mark
queries as reviewed
• It can be used with either the noinject.lua proxy
script or pt-query-digest
LIMITATIONS AND CAVEATS
Out-of-band detection
Out-of-band detection
• Some damage or information leakage may
have already happened
• To limit the extent of the damage, send an alert
as soon as a new pattern is detected
– Ensure thorough application pattern detection in a
test environment to avoid false positives
Get logs as fast as possible
• Use tcpdump on a mirrored server port
– Pipe the output to pt-query-digest
• Use tcpdump on the database server
– Adds some additional overhead from running the
tools on the same machine
– Possibly higher packet loss
• Collect and process slow query logs frequently
– Adds slow query log overhead to server
– Longer delay before processing
FINDING THE VULNERABILITY
What to do BEFORE a fishy fingerprint appears
Prepare for finding a vulnerability
• Tracking down the vulnerable code fragment
can be difficult if you have only the SQL
statement
• This is not only a SQL injection problem: it is
usually convenient to know where any SQL
statement was generated
Add tracing comments to queries
• A good approach is to modify the data access
layer (DAL) to add SQL comments
– Comments are preserved in the slow query log
– Comments are displayed in SHOW commands
• SHOW ENGINE INNODB STATUS
• SHOW PROCESSLIST
– Make sure your client does not strip comments!
Add tracing information
• PHP can use debug_backtrace() for example
• Perl has special tokens (__FILE__ and __LINE__)
that give the file and line
• Investigate the debugging section of your
language’s manual
What to place in the comment
• Here are some important things to consider
placing into the tracing comment
– session_id (or important cookie info)
– application file name, and line number
– important GET, POST, PUT or DELETE contents
– Any other important information which could be
useful for tracking down the vector being used in
an attack
Example comments in SQL queries
select airport_name, count(*)
from dim_airport
join ontime_fact
on dest_airport_id = airport_id
where depdelay > 30
and flightdate_id = 20080101
/*
webserver:192.168.1.3,file:show_delays.php,line:326,function:get_delayed_flights,user:justin,sessionid:7B7N2PCNIOKCGF
*/
This comment contains all that you need
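A data access layer can append such a comment automatically. The Python sketch below shows one way to do it under assumptions: the helper names (`trace_comment`, `with_trace`) and the field names are hypothetical, and the caller's file, line and function are taken from the standard `inspect` module rather than from PHP's debug_backtrace().

```python
import inspect
import os

def trace_comment(**fields):
    # Strip "*" from every value: without it a value cannot close the
    # comment early ("*/") or open a nested one ("/*").
    def clean(value):
        return str(value).replace("*", "")
    body = ",".join(f"{key}:{clean(value)}" for key, value in fields.items())
    return f"/* {body} */"

def with_trace(sql, **extra):
    caller = inspect.stack()[1]  # frame of the code that called with_trace()
    return sql + " " + trace_comment(
        file=os.path.basename(caller.filename),
        line=caller.lineno,
        function=caller.function,
        **extra)

def get_delayed_flights():
    return with_trace("select airport_name from dim_airport", user="justin")

print(get_delayed_flights())
```

Because the comment may carry request data, the values are sanitized before they are embedded; otherwise the tracing comment would itself become an injection vector.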
Most apps don’t do this out of the box
• You can modify the application
– If you have the source code (and it uses a DAL)
• BUT…
– There isn’t much you can do if
• The application is closed source, or you can’t change
the source
• There is no DAL (code/query spaghetti)
• For any other reason it is problematic to inject
information into all SQL queries
If I can’t change the source?
• You can’t fix the problems when you detect
them.
• Consider using an open source solution
• Or consider in-band protection
SQL INJECTION PREVENTION
In-band SQL injection detection
In-band protection
• Using pt-query-digest to discover new query
patterns is useful
– But it doesn’t work in real time
– It can’t block bad queries from actually executing
In-band protection
• What is needed is a “man in the middle” that
inspects each query to ensure it matches an
allowed fingerprint.
– MySQL proxy can be used for this purpose
MySQL Proxy
• MySQL Proxy
– Supports Lua scripting for easy development
– Adds some latency to all queries
– Considered “alpha” quality, though for simple
scripts it seems stable enough
– Fingerprinting and checking the database also add
latency; expect 3 ms to 5 ms per query
Noinject! – The Lua script and PHP interface
• http://code.google.com/p/noinject-mysql
• The Lua script for MySQL proxy is pretty much
drop-in.
– Just modify it to point to your database server and
specify credentials and other options.
• PHP script is similarly easy to configure.
– Drop in a directory on an Apache box
– Modify the script to set the options.
The Lua proxy script – known queries
• By default the script will retrieve all known
good fingerprints and cache them locally when
the first query is received from a client
• Also by default, all queries that fail to pass the
known whitelist check are logged in an
exception table.
Both of these options can be changed easily
The Lua proxy script – known queries
• Each query is fingerprinted
– If the fingerprint is on the whitelist, the actual
query is sent to the server
– If the query is not on the whitelist the behavior
varies depending on the proxy mode
Lua script – Proxy mode
• permissive mode
– Records the SQL fingerprint into the whitelist table
but does not mark it as reviewed
– Allows the query to proceed
• restrictive mode
– Records the SQL fingerprint into the whitelist table
– Returns an empty set for the query
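The decision the proxy makes for each query can be summarized in a few lines. The Python sketch below models only the logic described on these slides; it is not the noinject.lua source. The function name, the return values and the use of a dictionary in place of the whitelist table are assumptions, as is treating a denied fingerprint the same as an unknown one.

```python
def handle_query(fingerprint, whitelist, mode):
    """Decide what to do with a query, given its fingerprint.

    whitelist maps fingerprint -> "allow", "deny" or None (unreviewed).
    Returns "forward" or "empty_set".
    """
    if mode not in ("permissive", "restrictive"):
        raise ValueError(f"unknown mode: {mode}")
    if whitelist.get(fingerprint) == "allow":
        return "forward"                      # known good: send to the server
    whitelist.setdefault(fingerprint, None)   # record it, but leave it unreviewed
    return "forward" if mode == "permissive" else "empty_set"

known = {"select * from orders where id = ?": "allow"}
print(handle_query("select * from orders where id = ?", known, "restrictive"))
print(handle_query("select * from orders where id = ? or ? = ?", known, "restrictive"))
```

In both modes the unknown fingerprint ends up in the table for later review; the mode only decides whether the query is allowed through in the meantime.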
Why use permissive mode?
• Permissive mode allows the collection of SQL
fingerprints for an application dynamically
– Just run the application with typical workload and
the SQL queries will be recorded automatically
– Eventually switch to restrictive mode
PHP Web interface
• 1999 mode HTML interface
[Screenshot of the web interface: query sample, last action time with note, and controls to whitelist or blacklist the fingerprint]
If you want something prettier
• This is open source so…
• If you want bug fixes or have feature requests
– You can engage with Percona for development
– You can contribute!
– You can fork your own version
If the proxy overhead is too high
• You could develop the functionality in MySQL
– too bad the parser is not pluggable
• Try mysqlnd plugins
– fingerprint queries in PHP
– match them to a whitelist maintained in a serialized
PHP array
– reject queries that aren’t approved
• Improve the proxy Lua script
– the fingerprinting process could probably be made faster
Percona Training Advantage
• This presentation and the Noinject! tool were
created by Justin Swanhart, one of Percona’s
expert trainers
– Check out http://training.percona.com for a list of
training events near you
– Request training directly from Justin or any of our
other expert trainers by contacting your Percona
sales rep today
Q/A
