Assignment of orthologous genes in unbalanced genomes using cycle packing of adjacency graphs

Journal of Heuristics

Abstract

The adjacency graph is a structure used to model genomes in several rearrangement distance problems. In particular, most studies use properties of a maximum cycle packing of this graph to develop bounds and algorithms for rearrangement distance problems, such as the reversal distance, the reversal and transposition distance, and the double cut and join distance. When each genome has no repeated genes, there exists only one cycle packing for the graph. However, when each genome may have repeated genes, the problem of finding a maximum cycle packing for the adjacency graph (adjacency graph packing) is NP-hard. In this work, we develop a randomized greedy heuristic and a genetic algorithm heuristic for the adjacency graph packing problem for genomes with repeated genes and unequal gene content. We also propose new algorithms with simple implementation and good practical performance for reversal distance and reversal and transposition distance in genomes without repeated genes, which we combine with the heuristics to find solutions for the problems with repeated genes. We present experimental results and compare the application of these heuristics with the application of the MSOAR framework in rearrangement distance problems. Lastly, we apply our genetic algorithm heuristic to real genomic data to validate its practical use.
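To illustrate the remark that genomes without repeated genes admit only one cycle packing, here is a minimal sketch that counts the alternating cycles for a signed permutation and derives the classical lower bound n + 1 - c on the reversal distance (Bafna and Pevzner 1996). It is not the authors' heuristic or their implementation: it uses the standard breakpoint-graph construction for a single signed permutation compared against the identity, it does not handle repeated genes or unequal gene content, and the function names count_cycles and reversal_lower_bound are hypothetical.

```python
def count_cycles(perm):
    """Count alternating cycles for a signed permutation without repeats.

    perm is a list of nonzero integers whose absolute values are 1..n,
    each appearing exactly once (assumption: the target is the identity).
    """
    n = len(perm)
    # Replace each gene g by its two extremities and frame with 0 and 2n+1.
    ext = [0]
    for g in perm:
        if g > 0:
            ext += [2 * g - 1, 2 * g]
        else:
            ext += [-2 * g, -2 * g - 1]
    ext.append(2 * n + 1)

    # Black edges join extremities that are adjacent in the given genome.
    black = {}
    for i in range(0, len(ext), 2):
        a, b = ext[i], ext[i + 1]
        black[a] = b
        black[b] = a

    # Gray edges join 2i and 2i+1 (adjacent in the identity), i.e. v ^ 1.
    seen = set()
    cycles = 0
    for start in ext:
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]   # follow the black edge
            seen.add(w)
            v = w ^ 1      # then follow the gray edge
    return cycles


def reversal_lower_bound(perm):
    """Lower bound n + 1 - c on the reversal distance to the identity."""
    return len(perm) + 1 - count_cycles(perm)
```

Because every extremity has exactly one black and one gray edge, the traversal has no choice to make, which is why the packing is unique in this setting. With repeated genes a vertex can have several candidate gray edges, and choosing among them to maximize the number of cycles is the NP-hard problem the heuristics address.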

Notes

  1. https://github.com/compbiogroup/Heuristics-for-Cycle-Packing-of-Adjacency-Graphs.

  2. https://github.com/fishinucr/msoar2.0.

  3. https://gitlab.ub.uni-bielefeld.de/gi/dingiiofficial.

  4. Illustration created using treeio R package (Wang et al. 2020).

References

  • Alexandrino, A.O., Oliveira, A.R., Dias, U., Dias, Z.: Genome rearrangement distance with reversals, transpositions, and indels. J. Comput. Biol. 28(3), 235–247 (2021)

  • Alexandrino, A.O., Oliveira, A.R., Dias, U., Dias, Z.: Labeled cycle graph for transposition and indel distance. J. Comput. Biol. 29(3), 243–256 (2022)

  • Bafna, V., Pevzner, P.A.: Genome rearrangements and sorting by reversals. SIAM J. Comput. 25(2), 272–289 (1996)

  • Bafna, V., Pevzner, P.A.: Sorting by transpositions. SIAM J. Discrete Math. 11(2), 224–240 (1998)

  • Bergeron, A., Mixtacki, J., Stoye, J.: A unifying view of genome rearrangements. In: International Workshop on Algorithms in Bioinformatics, pp. 163–173 (2006). Springer, Berlin

  • Bohnenkämper, L., Braga, M.D.V., Doerr, D., Stoye, J.: Computing the rearrangement distance of natural genomes. J. Comput. Biol. 28(4), 410–431 (2021)

  • Braga, M.D., Willing, E., Stoye, J.: Double cut and join with insertions and deletions. J. Comput. Biol. 18(9), 1167–1184 (2011)

  • Brito, K.L., Alexandrino, A.O., Oliveira, A.R., Dias, U., Dias, Z.: Reversals and transpositions distance with proportion restriction. J. Bioinform. Comput. Biol. 19(4), 2150013 (2021)

  • Chen, X., Zheng, J., Fu, Z., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biol. Bioinf. 2(4), 302–315 (2005)

  • Christie, D.A.: Genome Rearrangement Problems. Ph.D. thesis, Department of Computing Science, University of Glasgow (1998)

  • Fu, Z., Chen, X., Vacic, V., Nan, P., Zhong, Y., Jiang, T.: MSOAR: a high-throughput ortholog assignment system based on genome rearrangement. J. Comput. Biol. 14(9), 1160–1175 (2007)

  • Garczarek, L., Guyet, U., Doré, H., Farrant, G.K., Hoebeke, M., Brillet-Guéguen, L., Bisch, A., Ferrieux, M., Siltanen, J., Corre, E., et al.: Cyanorak v2.1: a scalable information system dedicated to the visualization and expert curation of marine and brackish picocyanobacteria genomes. Nucleic Acids Res. 49, 1 (2020)

  • Hannenhalli, S., Pevzner, P.A.: Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals. J. ACM 46(1), 1–27 (1999)

  • Kahn, C., Raphael, B.: Analysis of segmental duplications via duplication distance. Bioinformatics 24(16), 133–138 (2008)

  • Makarenkov, V., Leclerc, B.: Circular orders of tree metrics, and their uses for the reconstruction and fitting of phylogenetic trees. In: DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pp. 183–208 (1997)

  • Mitchell, M.: Introduction to Genetic Algorithms. Springer, Cambridge (2008)

  • Oliveira, A.R., Brito, K.L., Alexandrino, A.O., Siqueira, G., Dias, U., Dias, Z.: Rearrangement distance problems: an updated survey. ACM Comput. Surv. 56(8) (2024)

  • Oliveira, A.R., Brito, K.L., Dias, U., Dias, Z.: On the complexity of sorting by reversals and transpositions problems. J. Comput. Biol. 26, 1223–1229 (2019)

  • Penny, D., Hendy, M.: The use of tree comparison metrics. Syst. Zool. 34(1), 75–82 (1985)

  • Pinheiro, P.O., Alexandrino, A.O., Oliveira, A.R., de Souza, C.C., Dias, Z.: Heuristics for breakpoint graph decomposition with applications in genome rearrangement problems. In: Proceedings of the 13th Brazilian Symposium on Bioinformatics (BSB’2020), pp. 129–140 (2020)

  • Radcliffe, A.J., Scott, A.D., Wilmer, E.L.: Reversals and transpositions over finite alphabets. SIAM J. Discrete Math. 19(1), 224–244 (2005)

  • Shao, M., Lin, Y., Moret, B.M.: An exact algorithm to compute the double-cut-and-join distance for genomes with duplicate genes. J. Comput. Biol. 22(5), 425–435 (2015)

  • Siqueira, G., Oliveira, A.R., Alexandrino, A.O., Dias, Z.: Heuristics for cycle packing of adjacency graphs for genomes with repeated genes. In: Proceedings of the 14th Brazilian Symposium on Bioinformatics (BSB’2021), pp. 93–105 (2021)

  • Walter, M.E.M.T., Dias, Z., Meidanis, J.: Reversal and transposition distance of linear chromosomes. In: Proceedings of the 5th International Symposium on String Processing and Information Retrieval (SPIRE’1998), pp. 96–102. IEEE Computer Society, Los Alamitos, CA, USA (1998)

  • Wang, L.-G., Lam, T.T.-Y., Xu, S., Dai, Z., Zhou, L., Feng, T., Guo, P., Dunn, C.W., Jones, B.R., Bradley, T., et al.: Treeio: an R package for phylogenetic tree input and output with richly annotated and associated data. Mol. Biol. Evol. 37(2), 599–603 (2020)

  • Willing, E., Stoye, J., Braga, M.: Computing the inversion-indel distance. IEEE/ACM Trans. Comput. Biol. Bioinf. 18(6), 2314–2326 (2021)

  • Zhai, S., Zhang, P., Zhu, D., Tong, W., Xu, Y., Lin, G.: An approximation algorithm for genome sorting by reversals to recover all adjacencies. J. Comb. Optim. 37(4), 1170–1190 (2019)

Acknowledgements

This work was supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brasil (CAPES)—Finance Code 001 and the São Paulo Research Foundation, FAPESP (Grants 2013/08293-7, 2015/11937-9, 2021/13824-8, and 2022/13555-0).

Author information

Corresponding author

Correspondence to Gabriel Siqueira.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this work appeared in the 14th Brazilian Symposium on Bioinformatics (BSB’2021) (Siqueira et al. 2021).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Siqueira, G., Oliveira, A.R., Alexandrino, A.O. et al. Assignment of orthologous genes in unbalanced genomes using cycle packing of adjacency graphs. J Heuristics 30, 269–289 (2024). https://doi.org/10.1007/s10732-024-09528-z

  • DOI: https://doi.org/10.1007/s10732-024-09528-z
