Linear Programming Challenge
Given a complex function, a binary matrix input and a real number vector as an
output, estimate the other vector and real number variables using optimization.
Solution Deadline:
Sep-1-2016
If you would like to join an online group who will be working on this problem,
please tell us why you are interested in this challenge, including any relevant
skills or experience that you will bring to the team. If you would like to work
alone, or as part of your own self-formed team, please tell us here.
We are seeking people who are able to commit to two online meetings per week
and contribute in a meaningful way to this project. Do not apply if you are not
100% sure that you are able to make this commitment.
Please indicate what days and times you would normally be available to meet
with your team to work on this challenge. Typically, teams meet online 1-2
times per week.
Challenge Summary
MATHEMATICS
LINEAR PROGRAMMING AND OPTIMIZATION
Given a complex function, a binary matrix input and a real number vector as an
output, estimate the other vector and real number variables using optimization.
Challenge Details
Overview
The goal of this challenge is to determine the influence of specific genes and
of epistatic effects, given the genetic makeup of each plant and its height.
Simply put, the task is to solve the model equation for the unknown variables.
Problem Statement
Rationale
Suppose vector h and matrix A represent the phenotype (such as plant height)
and genotype information of plants, respectively.
Each row in A corresponds to a plant, and each column represents a gene that
could be responsible for the phenotype. Suppose each gene has two versions,
represented by 0 and 1.
The height of a plant with all genes being version 0 is estimated as a
baseline value.
For each gene, the additional effect of that gene being version 1 over version 0
is estimated separately, and may be positive, negative, or zero.
The height could also be affected by an epistatic effect, which means that a
certain combination of genes contributes an additional effect to the height.
This combination is defined by two binary vectors. In order for a plant to
receive the epistatic effect, its gene must be 1 at every position where the
first vector is 1, and must be 0 at every position where the second vector is 1.
For example, if , and, then only those plants that simultaneously have ,, andwill
receive the additional effect .
For simplicity, we assume for now that the epistatic effect is denoted by only
one combination.
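The model described above can be sketched in a few lines of Python. This is a minimal illustration only: the names baseline, effects, epistatic, ones_mask and zeros_mask are hypothetical stand-ins for the symbols in the problem statement, and the numbers are made up.

```python
# Hypothetical names and made-up values; none of them come from the challenge.

def receives_epistatic(genes, ones_mask, zeros_mask):
    """True if the plant matches the combination defined by the two masks."""
    return all(
        (g == 1 or not need_one) and (g == 0 or not need_zero)
        for g, need_one, need_zero in zip(genes, ones_mask, zeros_mask)
    )

def predicted_height(genes, baseline, effects, epistatic, ones_mask, zeros_mask):
    """Baseline, plus per-gene effects, plus the epistatic effect on a match."""
    height = baseline + sum(x * g for x, g in zip(effects, genes))
    if receives_epistatic(genes, ones_mask, zeros_mask):
        height += epistatic
    return height

# Toy genotype matrix: 3 plants (rows), 4 genes (columns).
A = [
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
baseline = 100.0
effects = [2.0, -1.0, 0.0, 3.0]
epistatic = 5.0
ones_mask = [1, 0, 1, 0]   # genes 1 and 3 must be version 1
zeros_mask = [0, 1, 0, 0]  # gene 2 must be version 0

heights = [
    predicted_height(row, baseline, effects, epistatic, ones_mask, zeros_mask)
    for row in A
]
print(heights)  # [107.0, 101.0, 103.0]
```

Only the first toy plant satisfies the whole combination, so only its height includes the extra 5.0.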
A similar model could also be used for predicting consumers' preference of a
product from their previous purchasing records, or a stock's future trend based
on its historical performance.
Epistatic effects are common even outside the plant genetics field. For example,
the combination of low air pressure and higher-than-freezing temperature is a
good predictor of rain.
Bounding: m and n can theoretically be any size. From a practical standpoint,
however, m (the number of plants or observations) is likely to be very large
(possibly millions), while n (the number of genes) will very likely be limited
by processing time (very large values of m could take years). Additionally,
large values of n would increase the number of plants (m) required to solve.
Most likely the seeker will begin with n values of ~30.
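One way to see why n drives processing time is to count candidate combinations under a brute-force search. This is an assumption for illustration: the challenge text does not prescribe a search method. If each gene is either required to be version 1, required to be version 0, or left out of the combination, there are 3^n candidates. The evaluation rate below is a made-up figure.

```python
def count_combinations(n):
    """Each gene is required to be 1, required to be 0, or unconstrained."""
    return 3 ** n

n = 30
candidates = count_combinations(n)
print(f"{candidates:,}")  # 205,891,132,094,649

# Hypothetical throughput: one million candidates evaluated per second.
rate_per_second = 1_000_000
seconds_per_year = 365 * 24 * 3600
years = candidates / rate_per_second / seconds_per_year
print(f"{years:.1f}")  # 6.5
```

Under these assumptions, exhaustive enumeration at n = 30 already takes years, which suggests that a smarter search or a reformulation would be needed.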
Model
Examples
These examples are solely meant to help you visualize the types of
data/inputs. The data provided is an example only, and is not meant to be
used, as this is not a data manipulation problem.
Example 1
Example 2
In this example, each row is an individual plant. The binary entries in A are
genes that are on (1) or off (0). The value in h is the height of that plant.
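For a fixed candidate combination, the model is linear in the remaining unknowns, so they can be estimated from A and h directly. The sketch below uses ordinary least squares via NumPy rather than a linear program, which is a substitution made here for brevity, not the challenge's stated method. The function name fit_for_candidate, the mask names and the synthetic, noise-free data are all hypothetical.

```python
import itertools
import numpy as np

def fit_for_candidate(A, h, ones_mask, zeros_mask):
    """Least-squares fit of baseline, gene effects and epistatic effect
    for one fixed candidate combination (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    h = np.asarray(h, dtype=float)
    ones_mask = np.asarray(ones_mask, dtype=bool)
    zeros_mask = np.asarray(zeros_mask, dtype=bool)
    # Plants that satisfy the whole combination.
    match = (A[:, ones_mask] == 1).all(axis=1) & (A[:, zeros_mask] == 0).all(axis=1)
    # Design matrix columns: [baseline, one per gene, epistatic indicator].
    X = np.column_stack([np.ones(len(h)), A, match.astype(float)])
    coef, _, _, _ = np.linalg.lstsq(X, h, rcond=None)
    residual = float(np.sum((X @ coef - h) ** 2))
    return coef[0], coef[1:-1], coef[-1], residual

# Synthetic data: all 8 genotypes over 3 genes, heights from known values.
A = np.array(list(itertools.product([0, 1], repeat=3)))
true_baseline, true_effects, true_epistatic = 100.0, np.array([2.0, -1.0, 3.0]), 5.0
ones_mask, zeros_mask = [1, 0, 0], [0, 1, 0]  # gene 1 on, gene 2 off
match = (A[:, 0] == 1) & (A[:, 1] == 0)
h = true_baseline + A @ true_effects + true_epistatic * match

baseline, effects, epistatic, residual = fit_for_candidate(A, h, ones_mask, zeros_mask)
print(round(baseline, 6), np.round(effects, 6), round(epistatic, 6))
```

With the correct combination and noise-free data the residual is essentially zero. Comparing residuals across candidate combinations is one possible way to rank them, at the cost of one fit per candidate.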
Criteria
Deliverables: