[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

[freehaven-cvs] r1766: Added a subsection formally defining injection, and natterin (doc/trunk/dejector)

To: freehaven-cvs@xxxxxxxxxxxxx
Subject: [freehaven-cvs] r1766: Added a subsection formally defining injection, and natterin (doc/trunk/dejector)
From: mlp@xxxxxxxx
Date: Fri, 23 Mar 2007 05:33:53 -0400 (EDT)
Delivered-to: archiver@seul.org
Delivered-to: freehaven-cvs-outgoing@seul.org
Delivered-to: freehaven-cvs@seul.org
Delivery-date: Fri, 23 Mar 2007 05:34:04 -0400
Reply-to: freehaven-dev@xxxxxxxxxxxxx
Sender: owner-freehaven-cvs@xxxxxxxxxxxxx

Author: mlp
Date: 2007-03-23 05:33:52 -0400 (Fri, 23 Mar 2007)
New Revision: 1766

Modified:
   doc/trunk/dejector/dejector.tex
Log:
Added a subsection formally defining injection, and nattering up in the why-we-need-formal-definitions section.


Modified: doc/trunk/dejector/dejector.tex
===================================================================
--- doc/trunk/dejector/dejector.tex	2007-03-22 12:28:12 UTC (rev 1765)
+++ doc/trunk/dejector/dejector.tex	2007-03-23 09:33:52 UTC (rev 1766)
@@ -61,7 +61,7 @@
 
 \section{Introduction}
  
-SQL injection attacks have been a leading method of network compromise since their discovery in XXXX~\cite{}. Blah blah blah. Numerous methods have been proposed and deployed to hinder or inhibit injection attacks on SQL, to varying success. In this paper, we review the techniques both proposed in the literature, and deployed in the field. There is agreement in the literature that strong input validation is the proper approach to solving injection attacks in general.~\cite{FIXME}. However, the literature is lacking both in a set of formal axioms for input validation, as well as guidance for the implementors of systems concerned with input validation. We present such a framework in Section \ref{framework}, say something else in Section \ref{}, and show our approach as applied to SQL in Section \ref{}. In Section \ref{dejector}, we discuss \tt{Libdejector}, our approach as implemented as a PostgreSQL~\cite{} library.
+SQL injection attacks have been a leading method of network compromise since their discovery in XXXX~\cite{}. Blah blah blah. Numerous methods have been proposed and deployed to hinder or inhibit injection attacks on SQL, to varying success. In this paper, we review the techniques both proposed in the literature, and deployed in the field. There is agreement in the literature that strong input validation is the proper approach to solving injection attacks in general.~\cite{FIXME}. However, the literature is lacking both in a set of formal axioms for input validation, as well as guidance for the implementors of systems concerned with input validation. We present such a framework in Section \ref{framework}, say something else in Section \ref{}, and show our approach as applied to SQL in Section \ref{}. In Section \ref{dejector}, we discuss {\tt Libdejector}, our approach as implemented as a PostgreSQL~\cite{} library.
  
 \section{Injection attacks}
 
@@ -71,6 +71,10 @@
 \subsubsection{regex, etc...}
 
 \section{The need for formal models} 
+
+(This could be the place where we introduce the notion of language-theoretic security, i.e., limiting the user's capabilities by limiting the expressive power of the language and limiting the expressive power of the language by limiting both the complexity of its syntax and the semantic content captured in the vocabulary. I'm sure we can dig up plenty of examples of people doing things which ought not be done, just using the tools available to them -- simple ones like Towers of Hanoi in sendmail rules pave the way for more serious exploits.
+
+As I see it, this section exists to motivate section 4 below, and gives us the formal-language street cred for this Sydney conference. I would like, if at all possible, to run it by what's-his-name from your office that does the formal security work, though.)
  
 \section{Framework}
 \label{framework}
@@ -122,7 +126,7 @@
 Note that it is trivial to generate a one-off instance of a computational mechanism weaker than $M$ which validates one or more strings in $L$, but this does not invalidate this theorem.
 Consider $L = \{all strings of the form a^nb^n\}$.
 This language is context-free, but a finite state machine $F$ can be constructed which validates strings in $L$ of some finite length $l$.
-However, $L$ still contains an infinite number of strings which $F$ cannot validate; validating an arbitrary string in $L$ requires a pushdown automata.
+However, $L$ still contains an infinite number of strings which $F$ cannot validate; validating an arbitrary string in $L$ requires a context-free grammar.
 
 \begin{observation}[Maximal validation strength]
 A language L requiring a minimally--strong computational mechamism M should not be validated using a mechanism stronger than M.
@@ -142,7 +146,7 @@
 
 \begin{lemma}[Construction of sublanguages]
 \label{lem:sublanguage-construction}
-Given pushdown automata G and H, where $H = \langle V, T, R, S \in V \rangle$, L(G) is a sublanguage of L(H) if $V(G) \subseteq V(H), T(G) \subseteq T(H), R(G) \subseteq R(H)$ and $S(G) = S(H)$.
+Given context-free grammars G and H, where $H = \langle V, T, R, S \in V \rangle$, L(G) is a sublanguage of L(H) if $V(G) \subseteq V(H), T(G) \subseteq T(H), R(G) \subseteq R(H)$ and $S(G) = S(H)$.
 \end{lemma}
 \emph{Proof by Construction.} For any string $w$ in $L(G)$, there exists a production $P(w) \in G$.
 All variables, terminals, and rules used in $P(w)$ are correspondingly in $G$. $S(P(w)) = S(G)$.
@@ -153,9 +157,28 @@
 
 \begin{observation}[Mechanism equivalence]
 \label{obs:equiv}
-Any regular expression, finite-state automata or context-free grammar can be converted into an equivalent pushdown automata, and thus falls under Lemma 1.
+Any regular expression, finite-state automata or pushdown automata can be converted into an equivalent context-free grammar, and thus falls under Lemma \ref{lem:sublanguage-construction}.
 \end{observation}
 
+\subsection{Injection Formally Defined}
+
+In practice, attackers use injection to retrieve information which they cannot access legitimately or to manipulate the state of some set of information (e.g., unauthorized {\sc insert} or {\sc delete} queries on a relational database).
+Either the contents of the result set or the contents of the information set will differ from those which a non-injected query would produce.
+This necessarily alters both the semantic meaning and the syntactic structure of the query string.
+
+However, altering syntactic structure need not change semantic meaning.
+Adding any number of repetitions of the clause {\tt and '1' =  '1'} to the {\sc where} clause of an SQL {\sc select} statement will not produce a change in the contents of the result set, because $\forall P$, $P \wedge True \rightarrow P$ by identity.
+Each repetition changes the parse, though, and a sufficiently large number of repetitions could produce observable effects in the state of the machine responding to the query (e.g., increased {\sc cpu} or memory consumption during query processing, since systems which parse user input typically transform the entire parse tree into an in-memory representation and do not resolve tautologies or identities\footnote{Indeed, identifying all tautologies and identities is impossible in practice.
+While string-literal comparisons are easily identified, given minimal knowledge of a database schema an attacker can still compose comparisons of variable data which always evaluate to $True$, e.g. {\tt user.id = user.id} or {\tt count(id) > -1}.}).
+As unnecessarily increased resource consumption is also harmful, we argue that query alteration which alters syntactic structure but not semantic meaning constitutes injection as well.
+
+Thus, consider a string $S$ containing a variable $V$ for which some input string $I$ (of fixed or variable length) is substituted, and for which a unique parse $P(S)$ is expected.
+
+\begin{definition}[Injection input]
+\label{def:injection}
+Any $I$ which, when substituted for $V \in S$, results in a successful parse $P'(S) \neq P(S)$ constitutes an injection input.
+\end{definition}
+
 \section{Axiomatic Input Validation}
 The conventional wisdom for validating user inputs is to use regular expressions to filter out bad inputs or allow in only good ones. 
 A schism exists over which is the safest way to perform regular expression validation: whether it is better  to identify and allow through only those inputs known to be good, or to to identify and forbid inputs known to be bad.

***********************************************************************
To unsubscribe, send an e-mail to majordomo@xxxxxxxx with
unsubscribe freehaven-cvs       in the body. http://freehaven.net/

Next by Author: [freehaven-cvs] r1763: Added abstract PDF. (in doc/trunk/pynchon-gate: . bp-ringstwice)
Previous by thread: [freehaven-cvs] r1765: Initial restructuring. (doc/trunk/dejector)
Next by thread: [freehaven-cvs] r1767: Minor cleanup. (doc/trunk/dejector)
Index(es):
- Author
- Thread