Asteria-Pro: Enhancing Deep Learning-based Binary Code Similarity Detection by Incorporating Domain Knowledge
Widespread code reuse allows vulnerabilities to proliferate across a wide variety of firmware. There is an urgent need to detect this vulnerable code effectively and efficiently. By measuring code similarities, AI-based binary code similarity detection ...
Adonis: Practical and Efficient Control Flow Recovery through OS-level Traces
Control flow recovery is critical to ensuring software quality, especially for large-scale software in production environments. However, the efficiency of most current control flow recovery techniques is compromised due to their runtime overheads along ...
A First Look at Dark Mode in Real-world Android Apps
Android apps often have a “dark mode” option used in low-light situations, for those who find the conventional color palette problematic, or because of personal preferences. Typically developers add a dark mode option for their apps with different ...
Programming by Example Made Easy
Programming by example (PBE) is an emerging programming paradigm that automatically synthesizes programs specified by user-provided input-output examples. Despite the convenience for end-users, implementing PBE tools often requires strong expertise in ...
What Constitutes the Deployment and Runtime Configuration System? An Empirical Study on OpenStack Projects
Modern software systems are designed to be deployed in different configured environments (e.g., permissions, virtual resources, network connections) and adapted at runtime to different situations (e.g., memory limits, enabling/disabling features, database ...
API Entity and Relation Joint Extraction from Text via Dynamic Prompt-tuned Language Model
Extracting Application Programming Interfaces (APIs) and their semantic relations from unstructured text (e.g., Stack Overflow) is fundamental to software engineering tasks (e.g., API recommendation). However, existing approaches are rule-based ...
Finding Near-optimal Configurations in Colossal Spaces with Statistical Guarantees
A Software Product Line (SPL) is a family of similar programs. Each program is defined by a unique set of features, called a configuration, that satisfies all feature constraints. “What configuration achieves the best performance for a given workload?” is ...
An Interleaving Guided Metamorphic Testing Approach for Concurrent Programs
Concurrent programs are normally composed of multiple concurrent threads sharing memory space. These threads are often interleaved, which may lead to some non-determinism in execution results, even for the same program input. This poses huge challenges to ...
Framework for SQL Error Message Design: A Data-Driven Approach
Software developers spend a significant amount of time reading and interpreting error messages. However, error messages have often been based on either anecdotal evidence or expert opinion, disregarding novices, who arguably are the ones who benefit the ...
Testing Causality in Scientific Modelling Software
- Andrew G. Clark
- Michael Foster
- Benedikt Prifling
- Neil Walkinshaw
- Robert M. Hierons
- Volker Schmidt
- Robert D. Turner
From simulating galaxy formation to viral transmission in a pandemic, scientific models play a pivotal role in developing scientific theories and supporting government policy decisions that affect us all. Given these critical applications, a poor ...
Horus: Accelerating Kernel Fuzzing through Efficient Host-VM Memory Access Procedures
Kernel fuzzing is an effective technique in operating system vulnerability detection. Fuzzers such as Syzkaller and Moonshine frequently pass highly structured data between fuzzer processes in guest virtual machines and manager processes in the host ...
Automated and Efficient Test-Generation for Grid-Based Multiagent Systems: Comparing Random Input Filtering versus Constraint Solving
Automatic generation of random test inputs is an approach that can alleviate the challenges of manual test case design. However, random test cases may be ineffective in fault detection and increase testing cost, especially in systems where test execution ...
Towards Causal Analysis of Empirical Software Engineering Data: The Impact of Programming Languages on Coding Competitions
There is abundant observational data in the software engineering domain, whereas running large-scale controlled experiments is often practically impossible. Thus, most empirical studies can only report statistical correlations—instead of potentially more ...
The Human Side of Fuzzing: Challenges Faced by Developers during Fuzzing Activities
Fuzz testing, also known as fuzzing, is a software testing technique aimed at identifying software vulnerabilities. In recent decades, fuzzing has gained increasing popularity in the research community. However, existing studies led by fuzzing experts ...
Automatically Detecting Incompatible Android APIs
Fragmentation is a serious problem in the Android ecosystem, which is mainly caused by the fast evolution of the system itself and the various system customizations. Many efforts have attempted to mitigate its impact via approaches to automatically ...
StubCoder: Automated Generation and Repair of Stub Code for Mock Objects
- Hengcheng Zhu
- Lili Wei
- Valerio Terragni
- Yepang Liu
- Shing-Chi Cheung
- Jiarong Wu
- Qin Sheng
- Bing Zhang
- Lihong Song
Mocking is an essential unit testing technique for isolating the class under test from its dependencies. Developers often leverage mocking frameworks to develop stub code that specifies the behaviors of mock objects. However, developing and maintaining ...
On the Caching Schemes to Speed Up Program Reduction
- Yongqiang Tian
- Xueyan Zhang
- Yiwen Dong
- Zhenyang Xu
- Mengxiao Zhang
- Yu Jiang
- Shing-Chi Cheung
- Chengnian Sun
Program reduction is a highly practical, widely demanded technique to help debug language tools, such as compilers, interpreters and debuggers. Given a program P that exhibits a property ψ, conceptually, program reduction iteratively applies various ...
Testing Abstractions for Cyber-Physical Control Systems
Control systems are ubiquitous and often at the core of Cyber-Physical Systems, like cars and aeroplanes. They are implemented as embedded software that interacts in closed loop with the physical world through sensors and actuators. As a consequence, the ...
Differentiable Quantum Programming with Unbounded Loops
The emergence of variational quantum applications has led to the development of automatic differentiation techniques in quantum computing. Existing work has formulated differentiable quantum programming with bounded loops, providing a framework for ...
TopicAns: Topic-informed Architecture for Answer Recommendation on Technical Q&A Site
Technical Q&A sites, such as Stack Overflow and Ask Ubuntu, have been widely utilized by software engineers to seek support for development challenges. However, not all the raised questions get instant feedback, and the retrieved answers can vary in ...
GraphPrior: Mutation-based Test Input Prioritization for Graph Neural Networks
Graph Neural Networks (GNNs) have achieved promising performance in a variety of practical applications. Similar to traditional DNNs, GNNs could exhibit incorrect behavior that may lead to severe consequences, and thus testing is necessary and crucial. ...
Seed Selection for Testing Deep Neural Networks
Deep learning (DL) has been applied in many applications. Meanwhile, the quality of DL systems is becoming a big concern. To evaluate the quality of DL systems, a number of DL testing techniques have been proposed. To generate test cases, a set of initial ...
Snippet Comment Generation Based on Code Context Expansion
Code commenting plays an important role in program comprehension. Automatic comment generation helps improve software maintenance efficiency. The code comments to annotate a method mainly include header comments and snippet comments. The header comment ...
LaF: Labeling-free Model Selection for Automated Deep Neural Network Reusing
Applying deep learning (DL) to science has become a trend in recent years, making DL engineering an important problem. Although training data preparation, model architecture design, and model training are the normal processes to build DL models, ...
A First Look at On-device Models in iOS Apps
Powered by the rising popularity of deep learning techniques on smartphones, on-device deep learning models are being used in vital fields such as finance, social media, and driving assistance. Because of the transparency of the Android platform and the ...
Testing RESTful APIs: A Survey
In industry, RESTful APIs are widely used to build modern Cloud Applications. Testing them is challenging, because not only do they rely on network communications, but also they deal with external services like databases. Therefore, there has been a large ...
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks
Recent decades have seen the rise of large-scale Deep Neural Networks (DNNs) that achieve human-competitive performance in a variety of AI tasks. Often consisting of hundreds of millions, if not hundreds of billions, of parameters, these DNNs are too large to be ...
Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks - RCR Report
This is the Replicated Computational Results (RCR) Report for our TOSEM paper “Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks”, where we propose a novel client-server architecture that allows leveraging the high ...