0% found this document useful (0 votes)

328 views

Quantile Regression Through Linear Programming PDF

This document discusses quantile regression through linear programming. It provides theoretical background on formulating quantile regression as a linear programming problem. Examples are shown using a Mathematica package to find regression quantiles for sample data fitted to a logarithmic model. Regression quantiles for different quantiles are computed and plotted along with the least squares fit for comparison.

Uploaded by

Vipul Garg

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

328 views

Quantile Regression Through Linear Programming PDF

Uploaded by

Vipul Garg

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 21

Quantile regression through linear

programming
Anton Antonov
Mathematica for Prediction blog
Mathematica for Prediction project at GitHub
December 2013

Introduction
We can say that least squares linear regression corresponds to finding the mean of a
single distribution. Similarly, quantile regression corresponds to finding quantiles of a
single distribution. With quantile regression we obtain curves -- called “regression quan-
tiles” -- that together with the least squares regression curve would give a more complete
picture of the distributions (the y’s) corresponding to a set of x’s.
For a complete, interesting, and colorful introduction and justification to quantile regression
see [2]. An introduction and description of the major properties of quantile regression is
given in the Wikipedia entry [3].
In order to have a fast enough for practical purposes implementation of quantile regression
we need to re-cast the quantile regression problem as linear programming problem. (Such
a formulation is also discussed in [2].)
This document is mostly a guide for usage of the Mathematica package for quantile regres-
sion that is provided by the MathematicaForPrediction project at GitHub, see [1].
The second section provides theoretical background of the linear programming formulation
of the quantile regression problem. The third section shows examples of finding regression
quantiles using the function QuantileRegressionFit provided by [1]. The last section
describes profiling experiments and their results.
The motivational examples in the theoretical section, formulas (1) and (2), can be com-
pleted with more expansions and proofs. (Which will be done in the next version of the
document.)

Theory
We can formulate the quantile regression problem in way analogous to the formulation of
least squares (conditional mean) regression.
Consider a random variable Y having some distribution function F and a sample 8yi <ni=1 of
Y . The median of the set of samples 8yi <ni=1 can be defined as the solution of the minimiza-
tion problem
2 Quantile regression through linear programming.nb

Consider a random variable Y having some distribution function F and a sample 8yi <ni=1 of
Y . The median of the set of samples 8yi <ni=1 can be defined as the solution of the minimiza-
tion problem
n
min ‚ †yi - b§, b œ R . (1)
b
i=1

To see that the b which minimizes (1) is the median of 8yi <ni=1 , consider two points y1 < y2 .
Then †y1 - m§ + †m - y2 § = †y1 - y2 §, " m œ @y1 , y2 D, hence any m œ @y1 , y2 D minimizes (1). Using
the observation for two points we see that for three points y1 < y2 < y3 , m = y2 minimizes
(1). For four points y1 < y2 < y3 < y4 any m œ @y2 , y3 D minimizes (1). We can generalize
these observations and show that (1) gives the median for any set of points.
If we want to find q-th sample quantile of 8yi <ni=1 then we need to change (1) into

min ‚ q †yi - b§ + ‚ H1 - qL †yi - b§ , b œ R. (2)

b
yi ¥b yi <b

Consider a set of random variables Yi , i œ @1, nD, n œ N that are paired with a set of x-
coordinates X = 8xi <ni=1 . We have data of pairs 8xi , yi <ni=1 , where yi is a realization of Yi .
The linear regression problem can be formulated as
n
min ‚ Hyi - Hb0 + b1 xi LL2 . (3)
b0 ,b1
i=1

Similarly, the median regression problem can be formulated as

n
min ‚ †yi - Hb0 + b1 xi L§ . (4)
b0 ,b1
i=1

and the q-th quantile regression problem as

min ‚ q †yi - Hb0 + b1 xi L§ + ‚ H1 - qL †yi - Hb0 + b1 xi L§ , b0 , b1 œ R. (5)

b0 ,b1
iœ8i:yi ¥b0 +b1 xi < iœ8i:yi <b0 +b1 xi <

In order to convert (5) into a linear programming problem, let us introduce the non-negative
variables ui and vi for which the following equations are true:
yi - Hb0 + b1 xi L + ui = 0, i œ 8i : yi ¥ b0 + b1 xi <,
(6)
ui = 0, i – 8i : yi ¥ b0 + b1 xi <,
Hb0 + b1 xi L - yi + vi = 0, i œ 8i : yi < b0 + b1 xi <,
(7)
vi = 0, i – 8i : yi < b0 + b1 xi <.
Since ui and vi are greater than 0 on complementary sets, we can re-write (6) and (7)
simply as
yi - Hb0 + b1 xi L + ui - vi = 0, ui ¥ 0, vi ¥ 0, i œ @1, nD. (8)
Quantile regression through linear programming.nb 3

Then (5) expressed with ui and vi becomes

min ‚ q ui + ‚ H1 - qL vi . (9)
ui ,vi ,b0 ,b1
iœ8i:yi ¥b0 +b1 xi < iœ8i:yi <b0 +b1 xi <

Since ui ¥ 0 and vi ¥ 0 the minimization function (9) can simply be written as

n n
min ‚ q ui + ‚ H1 - qL vi . (10)
ui ,vi ,b0 ,b1
i=1 i=1

The equations (8) and formula (10) are the linear programming formulation of the quantile
regression problem (5).
Note that ui vi = 0, " i œ @1, nD.
The quantile regression formulations (5), and (8) and (10) can be done for any model of Yi
that is a linear combination of functions over X not just for the linear model b0 + b1 X .
4 Quantile regression through linear programming.nb

Examples of usage

Package load
Load the package [1]:
In[1]:= Get@"~êMathFilesêMathematicaForPredictionêQuantileRegression.m"D

Logarithmic curve with noise

Let us generate some data.
In[2]:= Block@8n = 1200, start = 10, end = 200<,
data = Table@8t, 5 + Log@tD +
RandomReal@SkewNormalDistribution@0, Log@tD ê 5, 12DD<,
8t, Rescale@Range@1, nD, 81, n<, 8start, end<D<D;
D
ListPlot@data, AxesLabel Ø 8"x", "y"<,
PlotRange Ø All, ImageSize Ø 400D
y

11
Out[3]=

8
x
50 100 150 200

Consider the following quantiles:

In[4]:= qs = 80.05, 0.25, 0.5, 0.75, 0.95<;
We want to find curves that separate the data according the quantiles. Those curves are
called “regression quantiles”.
Pretending that we do not know how the data is generated, just by looking at the plot we
assume that the model for the data is

y = b0 + b1 x + b2 x + b3 logHxL. (11)

Let us put the model functions for the regression fit in the variable funcs:
Quantile regression through linear programming.nb 5

In[5]:= funcs = :1, x, x , Log@xD>;

Here we find the regression quantiles:

In[6]:= qrFuncs = QuantileRegressionFit@data, funcs, x, qsD;
TableForm@List êü qrFuncsD
Out[7]//TableForm=

5.03476 + 0.00913189 x + 2.15707 µ 10-15 x + 0.98281 Log@xD

4.98702 + 4.08128 µ 10-11 x + 1.69792 µ 10-13 x + 1.0654 Log@xD
5.07619 + 2.84134 µ 10-12 x + 2.09895 µ 10-11 x + 1.11739 Log@xD
5.09638 + 1.02751 µ 10-12 x + 1.04385 µ 10-12 x + 1.21172 Log@xD
5.48266 + 2.24466 µ 10-14 x + 4.18194 µ 10-14 x + 1.30597 Log@xD

We also apply Fit to the data and the model functions in order to compare the regression
quantiles with the least-squares regression fit:
In[8]:= fFunc = Fit@data, funcs, xD
Out[8]= 5.53539 + 0.349617 x - 0.0111451 x + 0.53445 Log@xD

Here is a plot that combines the found regression quantiles and least squares fit:

12 ·ϱH0.05,xL
·ϱH0.25,xL
11
·ϱH0.5,xL
Out[10]=

10 ·ϱH0.75,xL
·ϱH0.95,xL
9
Least squares
8

50 100 150 200

Let us check how good the regression quantiles are for separating the data according to
the quantiles they were computed for:
6 Quantile regression through linear programming.nb

In[12]:= tbl = Table@

8qsPiT, Length@Select@data, ÒP2T ¥ HqrFuncsPiT ê. x Ø ÒP1TL &DD ê
Length@dataD êê N<, 8i, Length@qsD<D;
TableForm@tbl, TableHeadings Ø 8None,
8"quantile", "fraction\nabove"<<D
Out[13]//TableForm=

quantile fraction
above
0.05 0.950833
0.25 0.749167
0.5 0.499167
0.75 0.249167
0.95 0.0491667

Robustness
Let us demonstrate the robustness of the regression quantiles with the data of the previous
example. Suppose that for some reason all the data y-values greater than 11.25 are
altered by multiplying them with a some greater than 1 factor, say, a = 1.2 . Then the
altered data looks like this:
In[14]:= a = 1.2;
dataAlt = Map@If@ÒP2T > 11.25, 8ÒP1T, a ÒP2T<, ÒD &, dataD;
ListPlot@dataAlt, AxesLabel Ø 8"x", "y"<,
PlotRange Ø All, ImageSize Ø 400D
y

Out[16]=
12

8
x
50 100 150 200

Let us compute the regression quantiles for the altered data:

Quantile regression through linear programming.nb 7

In[17]:= qrFuncsAlt = QuantileRegressionFit@dataAlt, funcs, x, qsD;

TableForm@List êü qrFuncsD
Out[18]//TableForm=

5.03476 + 0.00913189 x + 2.15707 µ 10-15 x + 0.98281 Log@xD

and let us also compute the least squares fit of the model (11):
In[19]:= fFuncAlt = Fit@dataAlt, funcs, xD
Out[19]= 6.36794 + 0.529118 x - 0.00904845 x - 0.0328413 Log@xD

Here is a plot that combines the functions found over the altered data:
16

14 ·ϱH0.05,xL
·ϱH0.25,xL
12
·ϱH0.5,xL
Out[21]=
·ϱH0.75,xL
·ϱH0.95,xL
10
Least squares

50 100 150 200

We can see that the new regression quantiles computed for 0.05, 0.25, and 0.5 have not
changed significantly:
In[23]:= qrFuncsP1 ;; 3T
Out[23]= 95.03476 + 0.00913189 x + 2.15707 µ 10-15 x + 0.98281 Log@xD,
4.98702 + 4.08128 µ 10-11 x + 1.69792 µ 10-13 x + 1.0654 Log@xD,
5.07619 + 2.84134 µ 10-12 x + 2.09895 µ 10-11 x + 1.11739 Log@xD=

In[24]:= qrFuncsAltP1 ;; 3T
Out[24]= 95.03476 + 0.00913189 x + 3.87407 µ 10-15 x + 0.98281 Log@xD,
4.98702 + 8.04435 µ 10-10 x + 5.84834 µ 10-12 x + 1.0654 Log@xD,
5.07619 + 2.49087 µ 10-13 x + 1.25704 µ 10-12 x + 1.11739 Log@xD=

ant that they are still good for separating the un-altered data:
8 Quantile regression through linear programming.nb

In[25]:= tbl = Table@

8qsPiT, Length@Select@data, ÒP2T ¥ HqrFuncsAltPiT ê. x Ø ÒP1TL &DD ê
Length@dataD êê N<, 8i, Length@qsD<D; TableForm@tbl,
TableHeadings Ø 8None, 8"quantile", "fraction\nabove"<<D
Out[25]//TableForm=

quantile fraction
above
0.05 0.950833
0.25 0.750833
0.5 0.499167
0.75 0.204167
0.95 0.0116667

Also we can see that the least squares fit of (11) has significantly changed:
In[26]:= fFunc
Out[26]= 5.53539 + 0.349617 x - 0.0111451 x + 0.53445 Log@xD

In[27]:= fFuncAlt
Out[27]= 6.36794 + 0.529118 x - 0.00904845 x - 0.0328413 Log@xD

Data generated with Sin and noise

In[94]:= With@8n = 2000, start = 10, end = 140<,
dataSN = Table@8t,
Sin@10 + t p ê 15D + RandomReal@NormalDistribution@0, Log@tD ê 3DD<,
8t, Rescale@Range@1, nD, 81, n<, 8start, end<D<D;
dataSNPAll, 2T = dataSNPAll, 2T + Log@dataSNPAll, 1TD;
D
ListPlot@dataSN, PlotRange Ø All, ImageSize Ø 400D
10

Out[95]=
4

20 40 60 80 100 120 140

Quantile regression through linear programming.nb 9

We are going to use again the quantiles:

In[51]:= qs = 80.05, 0.25, 0.5, 0.75, 0.95<;
We can derive a guess about the data model by observing that we have peaks at x = 50,
x = 80, x = 110. With these observations we make equations for Solve and take the sec-
ond solution:
In[52]:= Solve@8a > 0, Sin@x aD ã Sin@a Hx + 60LD, Sin@x aD ã Sin@a Hx + 30LD<, aD
Out[52]= Solve@8a > 0, Sin@1.2 xD ã Sin@1.2 H60 + xLD,
Sin@1.2 xD ã Sin@1.2 H30 + xLD<, 1.2D

In[53]:= p ê 15 êê N
Out[53]= 0.20944

Next we need to find a guess for the phase. Again we use the second solution provided by
Solve:
In[54]:= Solve@8f > 0, Sin@f + 50 p ê 15D ã 1<, fD
Out[54]= ::f Ø ConditionalExpressionB
1
H-5 p - 12 p C@1DL, C@1D œ Integers && C@1D § -1F>,
6
1
:f Ø ConditionalExpressionB H7 p - 12 p C@1DL,
6
C@1D œ Integers && C@1D § 0F>>

1
In[55]:= 7 p êê N
6
Out[55]= 3.66519

Alternatively, we can simply use Manipulate and plot the data together with a model
function subject to different parameters change.
10 Quantile regression through linear programming.nb

In[96]:= Manipulate@
DynamicModule@8gr1, gr2<,
gr1 = ListPlot@dataSN, PlotRange Ø AllD;
gr2 = Plot@a Sin@f + b xD, 8x, 0, 140<, PlotStyle Ø Darker@RedDD;
Show@8gr1, gr2<D
D, 88a, 1<, 0.5, 10, 1<, 8b, 0, 2, 0.01<, 8f, 0, 30, 0.25<D

0.21

3.75

10
Out[96]=
8

0 20 40 60 80 100 120 140

-2

From the calculations we did so far we assume that the model for the data is
y = b0 + b1 x + b2 Sin@3.7 + x p ê 15D
Let us put the model functions for the regression fit in the variable funcs:
In[57]:= funcs = 81, x, Sin@3.7 + x p ê 15D<;
We find the regression quantiles:
Quantile regression through linear programming.nb 11

In[97]:= qrFuncs = QuantileRegressionFit@dataSN, funcs, x, qsD;

Grid@List êü qrFuncsD
px
1.23145 + 0.00851515 x + 1.02497 SinA3.7 + 15
E
2.19726 + 0.0139957 x + 1.02081 SinA3.7 + p15x E
Out[98]= 2.79455 + 0.0187037 x + 1.04776 SinA3.7 + p15x E
3.58857 + 0.020715 x + 1.05949 SinA3.7 + p15x E
4.6321 + 0.0255846 x + 0.958566 SinA3.7 + p15x E

As in the previous example we also apply Fit to the data and the model functions in order
to compare the regression quantiles with the least-squares regression fit:
In[99]:= fFunc = Fit@dataSN, funcs, xD
px
Out[99]= 2.94185 + 0.0167552 x + 1.01842 SinB3.7 + F
15
Here is a plot that combines the functions found:
10

8
·ϱH0.05,xL
6
·ϱH0.25,xL
·ϱH0.5,xL
Out[106]=
4 ·ϱH0.75,xL
·ϱH0.95,xL
2
Least squares

20 40 60 80 100 120 140

Let us check how good the regression quantiles are:

12 Quantile regression through linear programming.nb

In[108]:= tbl = Table@

8qsPiT, Length@Select@dataSN, ÒP2T ¥ HqrFuncsPiT ê. x Ø ÒP1TL &DD ê
Length@dataSND êê N<, 8i, Length@qsD<D;
TableForm@tbl, TableHeadings Ø 8None,
8"quantile", "fraction\nabove"<<D
Out[109]//TableForm=

quantile fraction
above
0.05 0.95
0.25 0.7505
0.5 0.5005
0.75 0.2495
0.95 0.0505
Quantile regression through linear programming.nb 13

Profiling
It is interesting the see timing profile of the computations with QuantileRegressionFit
across two axes: (i) data size and (ii) number of functions to be fit.
First we need to choose a family or several families of test data. Also, since Mathematica’s
function LinearProgramming has several methods it is a good idea to test with all of
them. Here I am going to show results only with one family of data and two
LinearProgramming methods. The data family is the skewed noise over a logarithmic
curve used as an example above. The first LinearProgramming method is Mathemati-
ca’s (default) “InteriorPoint”, the second method is “CLP” that uses the built-in COIN-OR
CLP optimizer. I run the profiling tests using one quantile 80.5< and five quantiles
80.05, 0.25, 0.5, 0.75, 0.95<, which are shown in blue and red respectively. I also run tests
with different number of model functions 91, x, x , Log@xD= and 81, x, Log@xD< but there
was no significant difference in the timings (less than 2%).

Test family functions definitions

In this sub-section are shown definitions of functions generating families of data sets.
Clear@LogarithmicCurveWithNoiseD
LogarithmicCurveWithNoise@
nPoints_Integer, start_?NumberQ, end_?NumberQD :=
Block@8data<,
data = Table@8t, 5 + Log@tD +
RandomReal@SkewNormalDistribution@0, Log@tD ê 5, 12DD<,
8t, Rescale@Range@1, nPointsD, 81, nPoints<, 8start, end<D<D;
data
D;
Clear@SinWithUpwardTrendD
SinWithUpwardTrend@
nPoints_Integer, start_?NumberQ, end_?NumberQD :=
Block@8data<,
data = Table@8t,
Sin@t p ê 15D + RandomReal@NormalDistribution@0, Log@tD ê 3DD<,
8t, Rescale@Range@1, nPointsD, 81, nPoints<, 8start, end<D<D;
dataPAll, 2T = dataPAll, 2T + dataPAll, 1T^1 ê 6;
data
D;
14 Quantile regression through linear programming.nb

Clear@SinWithParabolaTrendD
SinWithParabolaTrend@
nPoints_Integer, start_?NumberQ, end_?NumberQD :=
BlockB8data<,

data = TableB

:t, Sin@t p ê 10D + RandomRealBNormalDistributionB0, t í 10FF>,

8t, Rescale@Range@1, nPointsD, 81, nPoints<, 8start, end<D<F;

dataPAll, 2T = dataPAll, 2T +
HMean@8start, end<D - dataPAll, 1TL^2 ê 300;
data
F;

LogarithmicCurveWithNoise 4 model functions with MethodØ

LinearProgramming
modelFuncs = :1, x, x , Log@xD>;

qs = 80.05`, 0.25`, 0.5`, 0.75`, 0.95`<

80.05, 0.25, 0.5, 0.75, 0.95<

dataSets = LogarithmicCurveWithNoise@Ò, 10., 100.D & êü

Range@500, 10 000, 500D;
dataSets êê Length
20

timingsLogarithmicCurveWithNoiseF4Q1 = Map@
8Length@ÒD, AbsoluteTiming@QuantileRegressionFit@Ò, modelFuncs,
x, 80.5<, Method Ø LinearProgrammingD;DP1T< &, dataSetsD
88500, 0.098310<, 81000, 0.298962<,
81500, 0.588580<, 82000, 0.987977<, 82500, 1.412103<,
83000, 1.961045<, 83500, 2.658844<, 84000, 3.302903<,
84500, 3.824936<, 85000, 4.812682<, 85500, 5.670178<,
86000, 7.059653<, 86500, 8.519677<, 87000, 9.950919<,
87500, 10.886583<, 88000, 12.807758<, 88500, 14.694095<,
89000, 16.205932<, 89500, 17.815996<, 810 000, 21.027593<<
Quantile regression through linear programming.nb 15

timingsLogarithmicCurveWithNoiseF4Q5 = Map@
8Length@ÒD, AbsoluteTiming@QuantileRegressionFit@Ò, modelFuncs,
x, qs, Method Ø LinearProgrammingD;DP1T< &, dataSetsD
88500, 0.549529<, 81000, 1.494088<,
81500, 2.591068<, 82000, 3.493207<, 82500, 5.204162<,
83000, 6.551883<, 83500, 7.290538<, 84000, 7.828214<,
84500, 11.299679<, 85000, 12.629508<, 85500, 13.832353<,
86000, 18.640458<, 86500, 19.818611<, 87000, 21.795818<,
87500, 23.999628<, 88000, 27.048659<, 88500, 30.410464<,
89000, 33.484113<, 89500, 38.420541<, 810 000, 38.569162<<

ListLogPlot@8timingsLogarithmicCurveWithNoiseF4Q1,
timingsLogarithmicCurveWithNoiseF4Q5<,
PlotStyle Ø 8PointSize@0.012D<,
PlotLegends Ø SwatchLegend@8Darker@BlueD, Darker@RedD<,
8"one quantile", "five quantiles"<D,
AxesLabel Ø Map@Style@Ò, LargerD &, 8"data size", "time,s"<D,
PlotLabel Ø Style@
"QuantileRegressionFit@__,Method->LinearProgrammingD\ntimings
per data size with four model functions\nfor
skewed noise over a logarithmic curve",
LargerD, PlotRange Ø All, ImageSize Ø 600D
QuantileRegressionFit@__,Method->LinearProgrammingD
timings per data size with four model functions
for skewed noise over a logarithmic curve
time,s

10.0

5.0 one quantile

five quantiles

1.0

0.5

data size
2000 4000 6000 8000 10 000

Average ratio between the execution times.

16 Quantile regression through linear programming.nb

Mean@timingsLogarithmicCurveWithNoiseF4Q1PAll, 2T ê
timingsLogarithmicCurveWithNoiseF4Q5PAll, 2TD
0.380121

Ratios between the execution times:

ListPlot@Transpose@8timingsLogarithmicCurveWithNoiseF4Q1PAll, 1T,
timingsLogarithmicCurveWithNoiseF4Q1PAll, 2T ê
timingsLogarithmicCurveWithNoiseF4Q5PAll, 2T<D,
PlotStyle Ø 8PointSize@0.02D<, PlotRange Ø AllD
0.55

0.50

0.45

0.40

0.35

0.30

0.25

0.20

2000 4000 6000 8000 10 000

LogarithmicCurveWithNoise 4 model functions with MethodØ

{LinearProgramming,MethodØ”CLP”}
The same model functions and quantiles were used as in the previous sub-section.
timingsLogarithmicCurveWithNoiseCLPF4Q1 =
Map@8Length@ÒD, AbsoluteTiming@
QuantileRegressionFit@Ò, modelFuncs, x, 80.5<, Method Ø
8LinearProgramming, Method Ø "CLP"<D;DP1T< &, dataSetsD
88500, 0.057988<, 81000, 0.196983<,
81500, 0.419909<, 82000, 0.795889<, 82500, 1.187998<,
83000, 1.706534<, 83500, 2.287483<, 84000, 2.910575<,
84500, 3.583797<, 85000, 4.425360<, 85500, 5.352640<,
86000, 6.446963<, 86500, 8.402463<, 87000, 9.487401<,
87500, 10.838176<, 88000, 12.300428<, 88500, 14.839014<,
89000, 15.998385<, 89500, 17.698214<, 810 000, 20.404420<<
Quantile regression through linear programming.nb 17

timingsLogarithmicCurveWithNoiseCLPF4Q5 =
Map@8Length@ÒD, AbsoluteTiming@
QuantileRegressionFit@Ò, modelFuncs, x, qs, Method Ø
8LinearProgramming, Method Ø "CLP"<D;DP1T< &, dataSetsD
88500, 0.120797<, 81000, 0.411864<,
81500, 0.913235<, 82000, 1.595715<, 82500, 2.369918<,
83000, 3.433828<, 83500, 4.666453<, 84000, 6.038684<,
84500, 7.456060<, 85000, 9.125450<, 85500, 10.848903<,
86000, 13.103446<, 86500, 15.463412<, 87000, 17.807215<,
87500, 20.780654<, 88000, 23.596392<, 88500, 26.951688<,
89000, 31.034202<, 89500, 34.052749<, 810 000, 38.549676<<

ListLogPlot@8timingsLogarithmicCurveWithNoiseCLPF4Q1,
timingsLogarithmicCurveWithNoiseCLPF4Q5<,
PlotStyle Ø 8PointSize@0.012D<,
PlotLegends Ø SwatchLegend@8Darker@BlueD, Darker@RedD<,
8"one quantile", "five quantiles"<D,
AxesLabel Ø Map@Style@Ò, LargerD &, 8"data size", "time,s"<D,
PlotLabel Ø Style@
"QuantileRegressionFit@__,Method->8LinearProgramming,Method->\"
CLP\"<D\ntimings per data size with four model
functions\nfor skewed noise over a logarithmic curve",
LargerD, PlotRange Ø All, ImageSize Ø 600D
QuantileRegressionFit@__,Method->8LinearProgramming,Method->"CLP"<D
timings per data size with four model functions
for skewed noise over a logarithmic curve
time,s

10.0

5.0
one quantile
five quantiles

1.0

0.5

0.1

data size
2000 4000 6000 8000 10 000

Average ratio between the execution times.

18 Quantile regression through linear programming.nb

Mean@timingsLogarithmicCurveWithNoiseCLPF4Q1PAll, 2T ê
timingsLogarithmicCurveWithNoiseCLPF4Q5PAll, 2TD
0.50362

Ratios between the execution times:

ListPlot@
Transpose@8timingsLogarithmicCurveWithNoiseCLPF4Q1PAll, 1T,
timingsLogarithmicCurveWithNoiseCLPF4Q1PAll, 2T ê
timingsLogarithmicCurveWithNoiseCLPF4Q5PAll, 2T<D,
PlotStyle Ø 8PointSize@0.02D<, PlotRange Ø AllD

0.54

0.52

0.50

0.48

2000 4000 6000 8000 10 000

Quantile regression through linear programming.nb 19

Notes
It is interesting to note that the average ratio of the timings with 1 vs. 5 quantiles is 0.38 for
"InteriorPoint" and 0.5 for "CLP".

Careful with “CLP”

During the profiling experiments with some of data families was observed that “CLP” gives
curves that do not follow the data closely. Below is shown such a computation with both
“InteriorPoint” and “CLP”, the former looks good, the latter does not.
In[66]:= modelFuncs =
81, x, Sin@x p ê 10D, -Sin@x p ê 10D, -Sin@x p ê 200D, Sin@x p ê 200D<;
In[67]:= qs = 80.05`, 0.25`, 0.5`, 0.75`, 0.95`<
Out[67]= 80.05, 0.25, 0.5, 0.75, 0.95<

In[68]:= dataSet = SinWithParabolaTrend@1500, 10., 100.D;

In[69]:= AbsoluteTiming@
qrFuncs = QuantileRegressionFit@dataSet,
modelFuncs, x, qs, Method Ø 8LinearProgramming<D
D
px px
Out[69]= :4.828194, :7.73641 + 0.353175 x - 37.4138 SinB F + 1.09561 SinB F,
200 10
px px
8.11931 + 0.358496 x - 37.2938 SinB F + 1.07894 SinB F,
200 10
px px
8.42616 + 0.361935 x - 37.2894 SinB F + 1.01131 SinB F,
200 10
px px
8.69552 + 0.364615 x - 37.1227 SinB F + 1.03673 SinB F,
200 10
px px
9.02949 + 0.366971 x - 36.8279 SinB F + 1.01898 SinB F>>
200 10
20 Quantile regression through linear programming.nb

In[70]:= grData = ListPlot@dataSet, PlotRange Ø AllD;

grFit = Plot@Evaluate@MapThread@Tooltip@Ò1, Ò2D &, 8qrFuncs, qs<DD,
8x, Min@dataSetPAll, 1TD, Max@dataSetPAll, 1TD<,
PlotStyle Ø AbsoluteThickness@1.4DD;
Show@8grData, grFit<, ImageSize Ø 500D

Out[71]=

20 40 60 80 100

-2

In[72]:= AbsoluteTiming@
qrFuncs = QuantileRegressionFit@dataSet, modelFuncs, x, qs, Method Ø
8LinearProgramming, Method Ø "CLP", Tolerance Ø 10^-14.0<D
D
px px
Out[72]= :0.852718, :-2.6634 + 2.62856 SinB F + 0.998518 SinB F,
200 10
px px
2.71231 - 2.98137 SinB F + 0.898953 SinB F,
200 10
px px
-2.6634 + 5.42374 SinB F + 0.597093 SinB F,
200 10
px px
-2.6634 + 7.41221 SinB F - 0.0484257 SinB F,
200 10
px px
-2.6634 + 21.3234 SinB F - 1.80728 SinB F>>
200 10
Quantile regression through linear programming.nb 21

In[73]:= grData = ListPlot@dataSet, PlotRange Ø AllD;

grFit = Plot@Evaluate@MapThread@Tooltip@Ò1, Ò2D &, 8qrFuncs, qs<DD,
8x, Min@dataSetPAll, 1TD, Max@dataSetPAll, 1TD<,
PlotStyle Ø AbsoluteThickness@1.4DD;
Show@8grData, grFit<, ImageSize Ø 500D

Out[74]= 5

20 40 60 80 100

References
[1] Anton Antonov, Quantile regression Mathematica package, source code at GitHub,
https://github.com/antononcube/MathematicaForPrediction, package QuantileRegres-
sion.m, (2013).
[2] Roger Koenker, Gilbert Bassett Jr., “Regression Quantiles”, Econometrica, 46(1), 1978,
pp. 33-50.
JSTOR URL: http://links.jstor.org/sici?sici=0012-9682%28197801 %2946 %3 A1 %3 C33
%3 ARQ %3 E2 .0.CO%3 B2-J .
[3] Wikipedia, Quantile regression, http://en.wikipedia.org/wiki/Quantile_regression .
[4] Brian Cade, Barry Noon, “A gentle introduction to quantile regression for ecologists”,
Front. Ecol. Environ. 1(8), 2003, pp. 412–420.

Quantitive Finances Exam Commentaries
No ratings yet
Quantitive Finances Exam Commentaries
20 pages
RosenthalSolutions3 20 2016
100% (3)
RosenthalSolutions3 20 2016
66 pages
Commerce 1DA3 Notes-6
No ratings yet
Commerce 1DA3 Notes-6
256 pages
Bruno Lecture Notes PDF
No ratings yet
Bruno Lecture Notes PDF
251 pages
Lecture 4
No ratings yet
Lecture 4
161 pages
Applied Regression Analysis: Third Edition
0% (1)
Applied Regression Analysis: Third Edition
9 pages
An Introduction To Bayesian VAR (BVAR) Models R-Econometrics
No ratings yet
An Introduction To Bayesian VAR (BVAR) Models R-Econometrics
16 pages
00 Intro (F)
No ratings yet
00 Intro (F)
11 pages
Solutions Manual to Accompany Introduction to Quantitative Methods in Business: with Applications Using Microsoft Office Excel
From Everand
Solutions Manual to Accompany Introduction to Quantitative Methods in Business: with Applications Using Microsoft Office Excel
Bharat Kolluri
No ratings yet
Quantile Regression: EC 823: Applied Econometrics
No ratings yet
Quantile Regression: EC 823: Applied Econometrics
20 pages
Quantile Regression (Final) PDF
100% (1)
Quantile Regression (Final) PDF
22 pages
Competitive Advantage Porter
No ratings yet
Competitive Advantage Porter
580 pages
Flowchart Symbols
No ratings yet
Flowchart Symbols
5 pages
Fullpassion 1
100% (1)
Fullpassion 1
9 pages
Avramov Doron Financial Econometrics
No ratings yet
Avramov Doron Financial Econometrics
554 pages
MacKinnon Critical Values For Cointegration Tests Qed WP 1227
No ratings yet
MacKinnon Critical Values For Cointegration Tests Qed WP 1227
19 pages
A Crash Course in Statistics - Handouts
No ratings yet
A Crash Course in Statistics - Handouts
46 pages
Quantile Regression
No ratings yet
Quantile Regression
11 pages
Consumer Behavior Project
No ratings yet
Consumer Behavior Project
28 pages
CT4 Q&A Bank Part 1 Questions
No ratings yet
CT4 Q&A Bank Part 1 Questions
12 pages
Data Analysis With Stata: Creating A Working Dataset: Gumilang Aryo Sahadewo October 9, 2017 Mep Feb Ugm
No ratings yet
Data Analysis With Stata: Creating A Working Dataset: Gumilang Aryo Sahadewo October 9, 2017 Mep Feb Ugm
25 pages
Penalized Regression
No ratings yet
Penalized Regression
19 pages
Bayes' Theorem: Probability Theory Statistics
No ratings yet
Bayes' Theorem: Probability Theory Statistics
9 pages
Practice Questions Additional PDF
No ratings yet
Practice Questions Additional PDF
33 pages
The Normal Distribution Is The Distribution
100% (1)
The Normal Distribution Is The Distribution
34 pages
Regression
No ratings yet
Regression
46 pages
Czekanowski Index-Based Similarity As Alternative Correlation Measure in N-Asset Portfolio Analysis
No ratings yet
Czekanowski Index-Based Similarity As Alternative Correlation Measure in N-Asset Portfolio Analysis
1 page
The Advantages of Least Squares Monte Carlo
0% (1)
The Advantages of Least Squares Monte Carlo
9 pages
Solutions Manual to accompany Miller & Freund’s Probability and Statistics for Engineers 8th edition 0321640772 - Quick Download In Full PDF Format With All Chapters
100% (4)
Solutions Manual to accompany Miller & Freund’s Probability and Statistics for Engineers 8th edition 0321640772 - Quick Download In Full PDF Format With All Chapters
57 pages
Probability - Statistics and Random Processes by Veerarajan
46% (24)
Probability - Statistics and Random Processes by Veerarajan
14 pages
Odds Ratio, Hazard Ratio and Relative Risk: Janez Stare Delphine Maucort-Boulch
No ratings yet
Odds Ratio, Hazard Ratio and Relative Risk: Janez Stare Delphine Maucort-Boulch
9 pages
Micro Economics Theory Practice and Evaluation DR Javed Akbar Ansari
No ratings yet
Micro Economics Theory Practice and Evaluation DR Javed Akbar Ansari
706 pages
Probability and Statistics
No ratings yet
Probability and Statistics
110 pages
Doing Bayesian Data Analysis With JASP: Darrell A. Worthy
No ratings yet
Doing Bayesian Data Analysis With JASP: Darrell A. Worthy
76 pages
(Ebook) Bayesian Statistics for Beginners: A Step-By-Step Approach by Therese M Donovan; Ruth M Mickey ISBN 9780198841302, 0198841302 2024 scribd download
100% (9)
(Ebook) Bayesian Statistics for Beginners: A Step-By-Step Approach by Therese M Donovan; Ruth M Mickey ISBN 9780198841302, 0198841302 2024 scribd download
55 pages
SAS Modeling Tool To Access Credit Risk
No ratings yet
SAS Modeling Tool To Access Credit Risk
19 pages
Bahan Univariate Linear Regression
No ratings yet
Bahan Univariate Linear Regression
64 pages
Favero Applied Macroeconometrics
No ratings yet
Favero Applied Macroeconometrics
292 pages
HR Case Final
No ratings yet
HR Case Final
6 pages
Introduction To IBM SPSS Statistics
No ratings yet
Introduction To IBM SPSS Statistics
85 pages
ST102 Michaelmas Term Revision LSE London School of Economics
No ratings yet
ST102 Michaelmas Term Revision LSE London School of Economics
40 pages
Time Series Components
No ratings yet
Time Series Components
3 pages
Imstat
No ratings yet
Imstat
549 pages
Linear Algebra For Business Analytics
No ratings yet
Linear Algebra For Business Analytics
27 pages
Robust Statistics: Theory and Methods (with R)
From Everand
Robust Statistics: Theory and Methods (with R)
Ricardo A. Maronna
No ratings yet
Gurobi Optimization
No ratings yet
Gurobi Optimization
26 pages
Multiple Regression Tutorial 3
100% (2)
Multiple Regression Tutorial 3
5 pages
Solutions Manual to accompany Miller & Freund’s Probability and Statistics for Engineers 8th edition 0321640772 - Download Instantly To Experience The Full Content
100% (3)
Solutions Manual to accompany Miller & Freund’s Probability and Statistics for Engineers 8th edition 0321640772 - Download Instantly To Experience The Full Content
51 pages
13 Pag Design and Analysis of Experiments in The Health Sciences
No ratings yet
13 Pag Design and Analysis of Experiments in The Health Sciences
13 pages
Introduction To Market Design.2011
No ratings yet
Introduction To Market Design.2011
41 pages
Exploratory Data Analysis - Komorowski PDF
No ratings yet
Exploratory Data Analysis - Komorowski PDF
20 pages
Econometric Theory Goldberger
No ratings yet
Econometric Theory Goldberger
3 pages
BSC Economics and Maths
No ratings yet
BSC Economics and Maths
5 pages
The Expectation-Maximisation Algorithm: 14.1 The EM Algorithm - A Method For Maximising The Likeli-Hood
No ratings yet
The Expectation-Maximisation Algorithm: 14.1 The EM Algorithm - A Method For Maximising The Likeli-Hood
21 pages
LM03 Probability Concepts IFT Notes
No ratings yet
LM03 Probability Concepts IFT Notes
19 pages
Econometrics Mock Exam
No ratings yet
Econometrics Mock Exam
4 pages
Takeshi Amemiya-Advanced Eco No Metrics
100% (1)
Takeshi Amemiya-Advanced Eco No Metrics
258 pages
An Introduction to Metric Spaces and Fixed Point Theory
From Everand
An Introduction to Metric Spaces and Fixed Point Theory
Mohamed A. Khamsi
No ratings yet
A Career in Statistics: Beyond the Numbers
From Everand
A Career in Statistics: Beyond the Numbers
Gerald J. Hahn
3/5 (1)
Quantile Regression: Estimation and Simulation
From Everand
Quantile Regression: Estimation and Simulation
Marilena Furno
3.5/5 (1)
Fuzzy Queueing Model Using DSW Algorithm: Abstract
No ratings yet
Fuzzy Queueing Model Using DSW Algorithm: Abstract
6 pages
MBAIB WMST Sem2
No ratings yet
MBAIB WMST Sem2
2 pages
Jacobi Method
100% (1)
Jacobi Method
2 pages
Chapter 10 Test PDF
No ratings yet
Chapter 10 Test PDF
7 pages
Unknown
No ratings yet
Unknown
17 pages
Data Mining Assignment
No ratings yet
Data Mining Assignment
4 pages
Limites
No ratings yet
Limites
45 pages
IBC CHED Basic Calculus Formula
No ratings yet
IBC CHED Basic Calculus Formula
8 pages
Competences Stem Pc11T-Iih-1, Stem Pc11T-Iih-2 The Learner Will Ne Able To
No ratings yet
Competences Stem Pc11T-Iih-1, Stem Pc11T-Iih-2 The Learner Will Ne Able To
24 pages
Moment Distribution Method
No ratings yet
Moment Distribution Method
118 pages
Introduction of Asymptotic Notation: Dr. Munesh Singh
No ratings yet
Introduction of Asymptotic Notation: Dr. Munesh Singh
14 pages
Nda+2020 +Differentiation+in+1+Shot+
No ratings yet
Nda+2020 +Differentiation+in+1+Shot+
93 pages
Linear Algebra & Calculus (20A54101) : Lecture Notes
No ratings yet
Linear Algebra & Calculus (20A54101) : Lecture Notes
213 pages
Linear Control Systems: Amin Rezaeizadeh Fall 1396
No ratings yet
Linear Control Systems: Amin Rezaeizadeh Fall 1396
15 pages
Differential Equations Notes Chapters 1-3
No ratings yet
Differential Equations Notes Chapters 1-3
7 pages
Thesis Presentation Slides UNIBG
No ratings yet
Thesis Presentation Slides UNIBG
33 pages
Domain and Range PP TX
No ratings yet
Domain and Range PP TX
8 pages
7.3 Problem Set 7 - 3
No ratings yet
7.3 Problem Set 7 - 3
6 pages
Ma 1151
50% (2)
Ma 1151
11 pages
Tutorial Function, Domain and Range
No ratings yet
Tutorial Function, Domain and Range
3 pages
OR-Chapter 7 PROGECT NETWORK ANALASIS
No ratings yet
OR-Chapter 7 PROGECT NETWORK ANALASIS
48 pages
Prof. Januario Flores JR
No ratings yet
Prof. Januario Flores JR
14 pages
Figure 1.1: Diagram of Mathematical Modeling
No ratings yet
Figure 1.1: Diagram of Mathematical Modeling
27 pages
1D Spring/Truss Elements: MCEN 4173/5173
No ratings yet
1D Spring/Truss Elements: MCEN 4173/5173
32 pages
Matrix Operation Linear Programming: Algorithms
No ratings yet
Matrix Operation Linear Programming: Algorithms
28 pages
Subhangi Notes
No ratings yet
Subhangi Notes
32 pages
Recurrence Relations: Week 5
No ratings yet
Recurrence Relations: Week 5
21 pages
Agency Assessment (30points) : Students Will Produce A 5-7 Page Double-Spaced Paper
No ratings yet
Agency Assessment (30points) : Students Will Produce A 5-7 Page Double-Spaced Paper
3 pages
Chapter 19 PowerPoint
No ratings yet
Chapter 19 PowerPoint
27 pages