Main

A convex program for inferring a formula from GCMS
steventeddy
July 8, 2023
1 Notation
Table 1: Symbols
Term          Description
$[n]$         shorthand for $\{1, 2, \ldots, n\}$
$\Delta_n$    probability simplex over $n$ dimensions, $\Delta_n = \{p \in \mathbb{R}^n : 0 \le p_i \le 1,\ i \in [n],\ \sum_{i=1}^n p_i = 1\}$
$\preceq$     elementwise inequality
$1_n$         ones vector of length $n$
$0_n$         zeros vector of length $n$
$I_n$         identity matrix of dimension $n$
Table 2: Variables
Term          Description
$n$           # of aromachemicals present in GCMS
$m$           # of naturals
$y$           formula from GCMS, a vector in $\Delta_n$
$x$           our formula, with naturals, a vector in $\Delta_{n+m}$
$A$           a column stochastic matrix of size $n \times (n+m)$ representing our materials, each column $a_i \in \Delta_n$
2 Original Problem
Assumptions:
1. If the GCMS of a perfume leaves out some material, then that material is also left out of the GCMS of every natural.
2. We know the exact composition of each natural used in the perfume.
Goal:
1. Approximate the aromachemicals in the GCMS, i.e. the vector $y$, by some convex combination $x$ of the materials.
The result of the GCMS, $y \in \Delta_n$, is a combination of aromachemicals and (possibly) naturals. Denote the combination by $x \in \Delta_{n+m}$. Write
\[
x^\top = (x_{ac}^\top \;\; x_{nat}^\top)
\]
where $x_{ac} \in \mathbb{R}^n$ represents the amounts of direct adds, and $x_{nat} \in \mathbb{R}^m$ the amounts of naturals, which contribute aromachemicals indirectly.
Fix the order of aromachemicals in the GCMS and say linalool is the first aromachemical. This means that $y_1$ is the amount of linalool in the perfume. Thus, each aromachemical is represented as a canonical basis vector $e_i \in [0, 1]^n$, where $e_i$ is 1 in the $i$th position and 0 elsewhere. The contribution of aromachemicals is $I_n x_{ac}$, i.e. one could get the GCMS result by directly adding every aromachemical.
Let $B \in [0, 1]^{n \times m}$ represent the column stochastic matrix of naturals. Let the columns of $B$ be written as
\[
B = (b_1, b_2, \ldots, b_m).
\]
$B$ being column stochastic means $b_i \in \Delta_n$ for all $i \in [m]$. Say the first natural is lavender 40/42, and that linalool and linalyl acetate are the first two aromachemicals, in that order. Then the first two entries of the column $b_1$ are
\[
(b_1)_1 = 0.4 \quad \text{and} \quad (b_1)_2 = 0.42,
\]
recording the amounts of linalool and linalyl acetate. $B x_{nat}$ records the contributions to the $n$ aromachemicals from the naturals, i.e. take the column combination view of matrix-vector multiplication.
If we concatenate the matrices horizontally, we have $A \in [0, 1]^{n \times (n+m)}$, column stochastic and defined as
\[
A = \begin{pmatrix} I_n & B \end{pmatrix}.
\]
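To make the column combination view concrete, here is a minimal sketch that builds $A = (I_n \; B)$ and evaluates $Ax$ for a toy case. The third "other" constituent of lavender, the numbers in `x`, and the helper names `hstack_identity` and `matvec` are assumptions made for illustration only.

```python
def hstack_identity(B):
    """Return A = [I_n  B] as a list of rows, where B is an n x m list of rows."""
    n = len(B)
    return [[1.0 if j == i else 0.0 for j in range(n)] + list(B[i])
            for i in range(n)]


def matvec(A, x):
    """Plain matrix-vector product A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


# Rows: linalool, linalyl acetate, "other" (hypothetical remainder).
# Single column: lavender 40/42, padded to sum to 1 (assumed 0.18 remainder).
B = [[0.40],
     [0.42],
     [0.18]]
A = hstack_identity(B)

# x = (x_ac, x_nat): three direct adds followed by one natural (made-up values).
x = [0.10, 0.00, 0.40, 0.50]

# Predicted GCMS: direct adds plus what the lavender contributes.
y_hat = matvec(A, x)
```

Since every column of `A` sums to 1 and `x` lies in the simplex, `y_hat` also sums to 1, which is why a convex combination of materials is a sensible model for $y \in \Delta_n$.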
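The ordinary least squares problem is
\[
\begin{aligned}
\min_{x} \quad & \|Ax - y\|_2^2 \\
\text{s.t.} \quad & x \in \Delta_{n+m}
\end{aligned}
\]
or equivalently
\[
\begin{aligned}
\min_{x \in \mathbb{R}^{n+m}} \quad & \|I_n x_{ac} + B x_{nat} - y\|_2^2 \\
\text{s.t.} \quad & 0_n \preceq x_{ac} \preceq 1_n \\
& 0_m \preceq x_{nat} \preceq 1_m \\
& x_{ac}^\top 1_n + x_{nat}^\top 1_m = 1
\end{aligned}
\]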
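One way to try the simplex-constrained problem numerically is projected gradient descent. This is a swapped-in technique chosen because it needs no solver library, not a method named in this note; any QP solver would also work. The function names, the iteration count, the step size rule, and the toy data are all assumptions.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:      # holds exactly for the first rho indices
            theta = candidate
    return [max(v_i - theta, 0.0) for v_i in v]


def objective(A, x, y):
    """Squared residual ||A x - y||_2^2."""
    return sum((sum(a * v for a, v in zip(row, x)) - y_r) ** 2
               for row, y_r in zip(A, y))


def fit_formula(A, y, iterations=5000):
    """Minimize ||A x - y||_2^2 over the simplex by projected gradient descent."""
    n_rows, n_cols = len(A), len(A[0])
    # Step 1/L, where L = 2 * ||A||_F^2 upper-bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * sum(a * a for row in A for a in row))
    x = [1.0 / n_cols] * n_cols      # start at the centre of the simplex
    for _ in range(iterations):
        residual = [sum(A[r][c] * x[c] for c in range(n_cols)) - y[r]
                    for r in range(n_rows)]
        grad = [2.0 * sum(A[r][c] * residual[r] for r in range(n_rows))
                for c in range(n_cols)]
        x = project_simplex([x[c] - step * grad[c] for c in range(n_cols)])
    return x


# Toy data (hypothetical): A = [I_3  b_1] with an assumed lavender column.
A = [[1.0, 0.0, 0.0, 0.40],
     [0.0, 1.0, 0.0, 0.42],
     [0.0, 0.0, 1.0, 0.18]]
y = [0.30, 0.21, 0.49]
x_hat = fit_formula(A, y)
fit = objective(A, x_hat, y)
```

Note that the minimizer is generally not unique: $A$ has more columns than rows, and $x = (y, 0_m)$ always reproduces $y$ exactly with direct adds alone, so the sketch returns one of possibly many formulas with near-zero residual.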
3 Letting the Compositions of Naturals be Unknown

We want to drop assumption 2 from above. Rather than knowing the exact composition of natural $i$ (say lavender EO) as $b_i \in \Delta_n$, say we know upper and lower bounds for each of its constituents:
\[
b_i^\ell \preceq b_i \preceq b_i^u.
\]
Of course, the proportions of materials in a natural have to sum to 1, so the possible lavenders given the bounds are in the set
\[
\{ b \in \Delta_n : b_i^\ell \preceq b \preceq b_i^u \}.
\]
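One approach is to formulate the new problem, where we try to infer the composition of the naturals, as
\[
\begin{aligned}
\min_{\substack{x \in \mathbb{R}^{n+m} \\ B \in \mathbb{R}^{n \times m}}} \quad & \|x_{ac} + B x_{nat} - y\|_2^2 \\
\text{s.t.} \quad & b_i^\ell \preceq b_i \preceq b_i^u, \quad \forall i \in [m] \\
& 0_n \preceq b_i \preceq 1_n, \quad \forall i \in [m] \\
& b_i^\top 1_n = 1, \quad \forall i \in [m] \\
& 0_n \preceq x_{ac} \preceq 1_n \\
& 0_m \preceq x_{nat} \preceq 1_m
\end{aligned}
\]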
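The product $B x_{nat}$ is bilinear in the unknowns, so this problem is not jointly convex in $B$ and $x$ as written. Suppose instead that we write the $B$ from above (which can vary) as $B^{ref} + P$, where $P$ is a matrix of perturbations of the same size as $B^{ref}$. $B^{ref}$ can be the compositions of naturals you own. Of course, we require that $B^{ref} + P$ is column stochastic. Letting $p_i$ be the $i$th column of $P$, writing $x_i$ for the $i$th entry of $x_{nat}$, and putting arrows on vectors, the term in the objective is
\[
B x_{nat} = (B^{ref} + P) x_{nat} = \sum_{i=1}^m \vec{b}_i^{ref} x_i + \sum_{i=1}^m \vec{p}_i x_i .
\]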
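We want to write the perturbation term (the second sum) without reference to $x$. We will substitute $\vec{\nu}_i$ for $\vec{p}_i x_i$ and show how to change the constraints.
Writing the constraints for column $i$, substituting $\vec{b}_i = \vec{b}_i^{ref} + \vec{p}_i$,
\[
\begin{aligned}
\vec{b}_i^\ell \preceq \vec{b}_i \preceq \vec{b}_i^u \quad &\Rightarrow \quad \vec{b}_i^\ell \preceq \vec{b}_i^{ref} + \vec{p}_i \preceq \vec{b}_i^u \\
0_n \preceq \vec{b}_i \preceq 1_n \quad &\Rightarrow \quad 0_n \preceq \vec{b}_i^{ref} + \vec{p}_i \preceq 1_n \\
\vec{b}_i^\top 1_n = 1 \quad &\Rightarrow \quad (\vec{b}_i^{ref} + \vec{p}_i)^\top 1_n = 1
\end{aligned}
\]
Putting in $\vec{\nu}_i / x_i = \vec{p}_i$,
\[
\begin{aligned}
\vec{b}_i^\ell \preceq \vec{b}_i \preceq \vec{b}_i^u \quad &\Rightarrow \quad \vec{b}_i^\ell - \vec{b}_i^{ref} \preceq \frac{\vec{\nu}_i}{x_i} \preceq \vec{b}_i^u - \vec{b}_i^{ref} \quad \Rightarrow \quad x_i (\vec{b}_i^\ell - \vec{b}_i^{ref}) \preceq \vec{\nu}_i \preceq x_i (\vec{b}_i^u - \vec{b}_i^{ref}) \\
0_n \preceq \vec{b}_i \preceq 1_n \quad &\Rightarrow \quad -\vec{b}_i^{ref} \preceq \frac{\vec{\nu}_i}{x_i} \preceq 1_n - \vec{b}_i^{ref} \quad \Rightarrow \quad -x_i \vec{b}_i^{ref} \preceq \vec{\nu}_i \preceq x_i (1_n - \vec{b}_i^{ref}) \\
\vec{b}_i^\top 1_n = 1 \quad &\Rightarrow \quad \left( \vec{b}_i^{ref} + \frac{\vec{\nu}_i}{x_i} \right)^\top 1_n = 1 \quad \Rightarrow \quad \vec{\nu}_i^\top 1_n = 0
\end{aligned}
\]
where the last statement is due to $\vec{b}_i^{ref} \in \Delta_n$, i.e. each reference composition itself sums to 1. Multiplying through by $x_i \ge 0$ preserves the direction of the inequalities, and when $x_i = 0$ the resulting constraints force $\vec{\nu}_i = 0_n$.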
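The substitution can be sanity-checked numerically for a single natural. The sketch below tests the three transformed constraints on $\vec{\nu}_i$ and then recovers the composition $\vec{b}_i = \vec{b}_i^{ref} + \vec{\nu}_i / x_i$. The reference composition, bounds, perturbation, the function name `nu_constraints_hold`, and the tolerance are hypothetical values picked for illustration.

```python
def nu_constraints_hold(nu, x_i, b_ref, b_lo, b_hi, tol=1e-12):
    """Check the linear constraints on nu_i for one natural with amount x_i >= 0."""
    for r in range(len(nu)):
        # Bound constraints: x_i (b_lo - b_ref) <= nu <= x_i (b_hi - b_ref)
        # Box constraints:   -x_i b_ref         <= nu <= x_i (1 - b_ref)
        lower = max(x_i * (b_lo[r] - b_ref[r]), -x_i * b_ref[r])
        upper = min(x_i * (b_hi[r] - b_ref[r]), x_i * (1.0 - b_ref[r]))
        if not (lower - tol <= nu[r] <= upper + tol):
            return False
    # Sum constraint: nu^T 1_n = 0 keeps the recovered column on the simplex.
    return abs(sum(nu)) <= tol


# Hypothetical lavender data: reference composition and constituent bounds.
b_ref = [0.40, 0.42, 0.18]
b_lo = [0.35, 0.38, 0.10]
b_hi = [0.45, 0.46, 0.25]

p = [0.03, -0.02, -0.01]        # a perturbation that sums to zero
x_i = 0.5                       # amount of this natural in the formula
nu = [x_i * p_r for p_r in p]   # the substituted variable nu_i = x_i p_i

ok = nu_constraints_hold(nu, x_i, b_ref, b_lo, b_hi)

# Undo the substitution (valid because x_i > 0 here).
b_recovered = [b + v / x_i for b, v in zip(b_ref, nu)]
```

Here `b_recovered` is the perturbed lavender $\vec{b}_i^{ref} + \vec{p}_i$, which stays inside the assumed bounds and still sums to 1.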

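An equivalent formulation is
\[
\begin{aligned}
\min_{\substack{x \in \mathbb{R}^{n+m} \\ \vec{\nu}_i,\ i \in [m]}} \quad & \left\| x_{ac} + B^{ref} x_{nat} + \sum_{i=1}^m \vec{\nu}_i - y \right\|_2^2 \\
\text{s.t.} \quad & x_i (\vec{b}_i^\ell - \vec{b}_i^{ref}) \preceq \vec{\nu}_i \preceq x_i (\vec{b}_i^u - \vec{b}_i^{ref}), \quad \forall i \in [m] \\
& -x_i \vec{b}_i^{ref} \preceq \vec{\nu}_i \preceq x_i (1_n - \vec{b}_i^{ref}), \quad \forall i \in [m] \\
& \vec{\nu}_i^\top 1_n = 0, \quad \forall i \in [m] \\
& 0_n \preceq x_{ac} \preceq 1_n \\
& 0_m \preceq x_{nat} \preceq 1_m
\end{aligned}
\]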
The program has a quadratic objective and linear constraints.
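Once a solver returns $x$ and the $\vec{\nu}_i$, the inferred compositions can be read back column by column. The following sketch rebuilds $B$ and checks on made-up numbers that the original and reformulated objectives agree. The helper names and all data are assumptions, and keeping $\vec{b}_i^{ref}$ when $x_i = 0$ is an arbitrary choice, since that column does not affect the fit.

```python
def recover_B(B_ref, nus, x_nat):
    """Rebuild B with columns b_i = b_ref_i + nu_i / x_i (b_ref_i kept if x_i == 0)."""
    n, m = len(B_ref), len(x_nat)
    B = [[0.0] * m for _ in range(n)]
    for i in range(m):
        for r in range(n):
            shift = nus[i][r] / x_nat[i] if x_nat[i] > 0 else 0.0
            B[r][i] = B_ref[r][i] + shift
    return B


def objective_original(x_ac, x_nat, B, y):
    """||x_ac + B x_nat - y||_2^2, the objective with B as an unknown."""
    return sum((x_ac[r] + sum(B[r][i] * x_nat[i] for i in range(len(x_nat))) - y[r]) ** 2
               for r in range(len(y)))


def objective_reformulated(x_ac, x_nat, B_ref, nus, y):
    """||x_ac + B_ref x_nat + sum_i nu_i - y||_2^2, the objective after substitution."""
    m = len(x_nat)
    return sum((x_ac[r]
                + sum(B_ref[r][i] * x_nat[i] + nus[i][r] for i in range(m))
                - y[r]) ** 2
               for r in range(len(y)))


# Hypothetical data: n = 3 aromachemicals, m = 2 naturals.
B_ref = [[0.40, 0.10],
         [0.42, 0.20],
         [0.18, 0.70]]
x_ac = [0.10, 0.00, 0.20]
x_nat = [0.50, 0.20]
nus = [[0.015, -0.010, -0.005],   # nu_1, sums to zero
       [0.000, 0.004, -0.004]]    # nu_2, sums to zero
y = [0.33, 0.25, 0.42]

B = recover_B(B_ref, nus, x_nat)
f_original = objective_original(x_ac, x_nat, B, y)
f_reformulated = objective_reformulated(x_ac, x_nat, B_ref, nus, y)
```

The two objective values coincide because $B x_{nat} = B^{ref} x_{nat} + \sum_i \vec{\nu}_i$ whenever $\vec{\nu}_i = x_i \vec{p}_i$, and each recovered column of `B` sums to 1 because each $\vec{\nu}_i$ sums to 0.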
