Large neighborhood search for the most strings with few bad columns problem

Soft Computing

Abstract

In this work, we consider the following NP-hard combinatorial optimization problem from computational biology. Given a set of input strings of equal length, the goal is to identify a maximum-cardinality subset of strings that differ in at most a pre-defined number of positions. First, we introduce an integer linear programming model for this problem. Second, we propose two variants of a rather simple greedy strategy. Finally, we present a large neighborhood search algorithm. A comprehensive experimental comparison of the proposed techniques shows, first, that large neighborhood search generally outperforms both greedy strategies. Second, while large neighborhood search is competitive with the stand-alone application of CPLEX on small- and medium-sized problem instances, it outperforms CPLEX on larger instances.
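
To make the problem statement concrete, the following minimal Python sketch counts the "bad columns" of a set of strings and finds a largest feasible subset by exhaustive search. It is an illustration only, not the ILP model, the greedy strategies, or the large neighborhood search of the article. It assumes that a column is "bad" when the selected strings do not all carry the same character at that position, and the names `bad_columns`, `max_subset_brute_force`, and the parameter `k` are hypothetical.

```python
from itertools import combinations


def bad_columns(strings):
    """Count positions at which the given equal-length strings do not all agree."""
    return sum(1 for column in zip(*strings) if len(set(column)) > 1)


def max_subset_brute_force(strings, k):
    """Return indices of a largest subset with at most k bad columns.

    Exhaustive search, exponential in len(strings): usable only on tiny
    instances, and shown purely to illustrate the objective.
    """
    n = len(strings)
    # Try subset sizes from largest to smallest; the first feasible one is optimal.
    for size in range(n, 0, -1):
        for indices in combinations(range(n), size):
            if bad_columns([strings[i] for i in indices]) <= k:
                return list(indices)
    return []


example = ["ACGT", "ACGA", "ACCT", "TGCA"]
best = max_subset_brute_force(example, k=1)
print(best, [example[i] for i in best])
```

Because the exhaustive search grows exponentially with the number of input strings, it only serves to check small examples by hand; this is the kind of scaling that motivates exact models and heuristics for larger instances.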

Notes

  1. http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer.

Acknowledgments

All experiments were executed in the High Performance Cluster managed by the Research and Development Lab (RDlab) of the Computer Science Dept. at the Universitat Politècnica de Catalunya (http://rdlab.cs.upc.edu). We thank all the RDlab staff for their support. A preliminary version of this work appeared at the IEEE 2015 International Symposium on INnovations in Intelligent SysTems and Applications (INISTA), September 2–4, 2015, Madrid, Spain. This work was supported by project TIN2012-37930-C02-02 (Spanish Ministry for Economy and Competitiveness, FEDER funds from the European Union) and project SGR 2014-1034 (AGAUR, Generalitat de Catalunya). Additionally, Christian Blum acknowledges support from IKERBASQUE. Evelia Lizárraga acknowledges support from the Mexican National Council for Science and Technology (CONACYT, Doctoral Grant Number 253787).

Author information

Corresponding author

Correspondence to Christian Blum.

Ethics declarations

Conflict of interest

Evelia Lizárraga, Maria J. Blesa, Christian Blum, and Günther R. Raidl declare that they have no conflict of interest.

Ethical standard

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by C. Analide.

About this article

Cite this article

Lizárraga, E., Blesa, M.J., Blum, C. et al. Large neighborhood search for the most strings with few bad columns problem. Soft Comput 21, 4901–4915 (2017). https://doi.org/10.1007/s00500-016-2379-4

  • DOI: https://doi.org/10.1007/s00500-016-2379-4
