Domain-specific language

In software development and domain engineering, a domain-specific language (DSL) is a programming language or specification language dedicated to a particular problem domain, a particular problem representation technique, and/or a particular solution technique. The concept is not new: special-purpose programming languages and all kinds of modeling and specification languages have always existed, but the term has become more popular with the rise of domain-specific modeling.

Examples of domain-specific languages include Logo for children, the Verilog and VHDL hardware description languages, the R, S and Q languages for statistics, Mata for matrix programming, Mathematica and Maxima for symbolic mathematics, spreadsheet formulas and macros, SQL for relational database queries, YACC grammars for creating parsers, regular expressions for specifying lexers, the Generic Eclipse Modeling System for creating diagramming languages, Csound, a language for digital synthesis, and the input languages of GraphViz and GrGen, software packages used for graph layout and graph rewriting.

The opposite is:

a general-purpose programming language, such as C or Java,
or a general-purpose modeling language such as the Unified Modeling Language (UML).

Creating a domain-specific language (with software to support it) can be worthwhile if the language allows a particular type of problem or solution to be expressed more clearly than existing languages would allow, and if that type of problem recurs sufficiently often. Language-oriented programming considers the creation of special-purpose languages for expressing problems a standard part of the problem-solving process.

Overview

A domain-specific language is created specifically to solve problems in a particular domain and is not intended to be able to solve problems outside it (although that may be technically possible). In contrast, general-purpose languages are created to solve problems in many domains. The domain can also be a business area. Some examples of business areas include:

domain-specific language for life insurance policies, developed internally in a large insurance enterprise
domain-specific language for combat simulation
domain-specific language for salary calculation
domain-specific language for billing

A domain-specific language is somewhere between a tiny programming language and a scripting language, and is often used in a way analogous to a programming library. The boundaries between these concepts are quite blurry, much like the boundary between scripting languages and general-purpose languages.

In design and implementation

Domain-specific languages are languages (or, most often, declared syntaxes or grammars) with very specific goals in design and implementation. A domain-specific language can be a visual diagramming language, such as those created by the Generic Eclipse Modeling System, a programmatic abstraction, such as the Eclipse Modeling Framework, or a textual language. For instance, the command-line utility grep has a regular expression syntax that matches patterns in lines of text. The sed utility defines a syntax for matching and replacing regular expressions. Often, these tiny languages can be used together inside a shell to perform more complex programming tasks.
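
As a rough illustration of how such a tiny pattern language is used from a host program, the following Python sketch filters lines in the manner of grep and rewrites matches in the manner of sed. The sample lines, the patterns, and the helper names grep_lines and sed_substitute are invented for this example, and the regular expression dialect of Python's re module is similar to, but not identical with, the dialects accepted by grep and sed.

```python
import re

# Hypothetical sample input, invented for this illustration.
LOG_LINES = [
    "2021-03-01 ERROR disk full",
    "2021-03-01 INFO backup started",
    "2021-03-02 ERROR network down",
]


def grep_lines(pattern, lines):
    """Keep the lines that contain a match, roughly what grep does."""
    compiled = re.compile(pattern)
    return [line for line in lines if compiled.search(line)]


def sed_substitute(pattern, replacement, lines):
    """Replace every match in each line, roughly what a sed substitution does."""
    compiled = re.compile(pattern)
    return [compiled.sub(replacement, line) for line in lines]


# Stage 1: keep only the error lines.
errors = grep_lines(r"ERROR", LOG_LINES)

# Stage 2: rewrite dates from YYYY-MM-DD to DD/MM/YYYY using capture groups.
reformatted = sed_substitute(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", errors)

for line in reformatted:
    print(line)
```

The pattern strings are the domain-specific part. The surrounding Python only feeds text in and collects the result, which is the same division of labour a shell provides around grep and sed.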

The line between domain-specific languages and scripting languages is somewhat blurred, but domain-specific languages often lack low-level functions for filesystem access, interprocess control, and other functions that characterize full-featured programming languages, scripting or otherwise. Many domain-specific languages do not compile to byte-code or executable code, but to various kinds of media objects: GraphViz exports to PostScript, GIF, JPEG, and other formats, Csound compiles to audio files, and a ray-tracing domain-specific language like POV compiles to graphics files. A computer language like SQL presents an interesting case. It can be deemed a domain-specific language because it is tied to a single domain (accessing and managing relational databases) and is often called from another application. However, SQL has more keywords and functions than many scripting languages and is often thought of as a language in its own right, perhaps because of the prevalence of database manipulation in programming and the amount of mastery required to be an expert in the language.

Further blurring this line, many domain-specific languages have exposed APIs, and can be accessed from other programming languages without breaking the flow of execution or calling a separate process, and can thus operate as programming libraries.
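
A minimal sketch of this library-style use, assuming Python as the host language and its standard sqlite3 module as the API: the SQL statements run inside the host process, and the results come back as ordinary host-language values. The policy table and its rows are invented for the example.

```python
import sqlite3

# In-memory database; the table and its rows are invented for this sketch.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE policy (holder TEXT, premium REAL)")
connection.executemany(
    "INSERT INTO policy (holder, premium) VALUES (?, ?)",
    [("Ada", 120.0), ("Grace", 80.0), ("Ada", 30.0)],
)

# The query is written in SQL (the DSL); the result comes back as plain
# Python tuples that the host program can process further.
rows = connection.execute(
    "SELECT holder, SUM(premium) FROM policy GROUP BY holder ORDER BY holder"
).fetchall()

totals = {holder: total for holder, total in rows}
print(totals)

connection.close()
```

No separate process is started and control never leaves the host program, yet the grouping and summing are expressed entirely in the domain-specific language.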

Programming tools

Some domain-specific languages expand over time to include full-featured programming tools, which further complicates the question of whether a language is domain-specific or not. A good example is the functional language XSLT, specifically designed for transforming one XML graph into another, which has been extended since its inception to allow (particularly in its 2.0 version) for various forms of filesystem interaction, string and date manipulation, and data typing.

In model-driven engineering, many examples of domain-specific languages may be found, such as OCL, a language for decorating models with assertions, or QVT, a domain-specific transformation language. However, languages like UML are typically general-purpose modeling languages.

To summarize, an analogy might be useful: a Very Little Language is like a knife, which can be used in thousands of different ways, from cutting food to cutting down trees. A domain-specific language is like an electric drill: it is a powerful tool with a wide variety of uses, but a specific context, namely, putting holes in things. A General Purpose Language is a complete workbench, with a variety of tools intended for performing a variety of tasks. Domain-specific languages should be used by programmers who, looking at their current workbench, realize they need a better drill, and find that a specific domain-specific language provides exactly that.

Domain-specific language topics

Usage patterns

There are several usage patterns for domain-specific languages:^[1]^[2]

processing with standalone tools, invoked via direct user operation, often on the command line or from a Makefile (e.g., the GraphViz tool set)
domain-specific languages which are implemented using programming language macro systems, and which are converted or expanded into a host general purpose language at compile-time or read-time
embedded (or internal) domain-specific languages, implemented as libraries which exploit the syntax of their host general purpose language or a subset thereof, while adding domain-specific language elements (data types, routines, methods, macros etc.). The distinction between an embedded DSL and a generic library or API is fuzzy and mostly relates to stylistic and pragmatic concerns: most APIs are designed to expose their features in a straightforward and transparent way, rather than create a usable and distinctive 'language'.^[3]^[4]
domain-specific languages which are called (at runtime) from programs written in general purpose languages like C or Perl, to perform a specific function, often returning the results of operation to the "host" programming language for further processing; generally, an interpreter or virtual machine for the domain-specific language is embedded into the host application
domain-specific languages which are embedded into user applications (e.g., macro languages within spreadsheets) and which are (1) used to execute code that is written by users of the application, (2) dynamically generated by the application, or (3) both
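
The embedded (internal) pattern in the list above can be sketched as a small fluent interface. The Query class and the select_from helper below are hypothetical and belong to no real library; they only show how chained method calls in a host language (here Python) can read like sentences of a small language.

```python
class Query:
    """Hypothetical fluent query builder, written only for this illustration."""

    def __init__(self, table):
        self._table = table
        self._conditions = []
        self._limit = None

    def where(self, condition):
        self._conditions.append(condition)
        return self  # returning self is what makes chaining possible

    def limit(self, count):
        self._limit = count
        return self

    def to_sql(self):
        """Render the accumulated clauses as a single SQL string."""
        sql = f"SELECT * FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        if self._limit is not None:
            sql += f" LIMIT {self._limit}"
        return sql


def select_from(table):
    """Entry point of the embedded language."""
    return Query(table)


statement = (
    select_from("invoices")
    .where("amount > 100")
    .where("paid = 0")
    .limit(10)
    .to_sql()
)
print(statement)
```

Nothing here goes beyond ordinary classes and methods, which is why the boundary between an embedded DSL and a plain API is a matter of style: the same functionality could be exposed as a single function taking keyword arguments.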

Many domain-specific languages can be used in more than one way.

Design goals

Adopting a domain-specific language approach to software engineering involves both risks and opportunities. A well-designed domain-specific language finds the proper balance between these.

Domain-specific languages have important design goals that contrast with those of general-purpose languages:

domain-specific languages are less comprehensive.
domain-specific languages are much more expressive in their domain.
domain-specific languages should exhibit minimum redundancy according to the following subjective definition.

Redundancy of a program is defined as the average number of textual insertions, deletions, or replacements necessary to correctly implement a single stand-alone change in requirements. For a language, this is averaged over programs in the problem domain. This measure is useful because the smaller it is, the less likely it is that bugs are introduced by incompletely implemented changes.
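
To make the measure concrete, the following sketch counts the edits needed for one requirement change (a tax rate moving from 0.20 to 0.25) in two versions of the same hypothetical payroll snippet. It is only an approximation of the definition: it looks at a single change rather than an average, it works at line granularity using Python's difflib, and adjacent edited lines would be merged into one edit. The program texts and the count_edits helper are invented for the example.

```python
import difflib


def count_edits(before, after):
    """Count insert, delete and replace operations between two lists of lines."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


# Version 1: the tax rate is written out wherever it is needed.
duplicated_before = [
    "net = hours * rate",
    "tax = net * 0.20",
    "print(net - tax)",
    "bonus_tax = bonus * 0.20",
    "print(bonus - bonus_tax)",
]

# Version 2: the tax rate is named once and referenced elsewhere.
factored_before = [
    "TAX_RATE = 0.20",
    "net = hours * rate",
    "tax = net * TAX_RATE",
    "print(net - tax)",
    "bonus_tax = bonus * TAX_RATE",
    "print(bonus - bonus_tax)",
]

# Apply the same requirement change to both versions.
duplicated_after = [line.replace("0.20", "0.25") for line in duplicated_before]
factored_after = [line.replace("0.20", "0.25") for line in factored_before]

duplicated_edits = count_edits(duplicated_before, duplicated_after)
factored_edits = count_edits(factored_before, factored_after)
print(duplicated_edits, factored_edits)
```

For this change the first version needs two edits and the second needs one, so the first version gives more opportunity for a change to be applied incompletely.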

Idioms

In programming, idioms are methods imposed by programmers to handle common development tasks, e.g.:

Ensure data is saved before the window is closed.
Before conducting expensive tests, perform cheap tests that can rule out the need for the expensive ones.
Edit code whenever command-line parameters change because they affect program behavior.

General purpose programming languages rarely support such idioms, but domain-specific languages can describe them, e.g.:

A script can automatically save data.
A smart test harness can learn what good tests are.
A domain-specific language can parameterize command line input.

Examples

Unix shell scripts

Unix shell scripts give a good example of a domain-specific language for data organization. They can manipulate data in files or user input in many different ways. Domain abstractions and notations include streams (such as stdin and stdout) and operations on streams (such as redirection and pipe). These abstractions combine to make a robust language to talk about the flow and organization of data.

The language consists of a simple interface (a script) for running and controlling processes that perform small tasks. These tasks represent the idioms of organizing data into a desired format such as tables, graphs, charts, etc.

These tasks consist of simple control-flow and string manipulation mechanisms that cover many common uses, such as searching and replacing strings in files or counting occurrences of strings (frequency counting).

Even though Unix scripting languages are Turing complete, they differ from general purpose languages.

In practice, scripting languages are used to weave together small Unix tools such as AWK (e.g., gawk), ls, sort or wc.
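
The stream-and-pipe abstraction can be approximated outside the shell. The sketch below is written in Python rather than shell, and every name in it is invented for the illustration. It chains small single-purpose stages for word-frequency counting, in the spirit of a shell pipeline that splits text into words, counts duplicates, and sorts by count.

```python
from collections import Counter
from functools import reduce


def split_words(lines):
    """Stage 1: turn a stream of lines into a stream of lower-case words."""
    for line in lines:
        yield from line.lower().split()


def count_words(words):
    """Stage 2: collapse the word stream into (word, count) pairs."""
    return Counter(words).items()


def sort_by_count(pairs):
    """Stage 3: most frequent first, ties broken alphabetically."""
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


def pipeline(source, *stages):
    """Feed the output of each stage into the next, like a chain of pipes."""
    return reduce(lambda stream, stage: stage(stream), stages, source)


# Hypothetical input text.
text = ["the cat sat", "the dog sat", "the end"]
frequencies = pipeline(text, split_words, count_words, sort_by_count)

for word, count in frequencies:
    print(count, word)
```

Each stage knows nothing about its neighbours and only consumes and produces a stream. That is the property that lets shell scripts combine unrelated tools.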

ColdFusion Markup Language

ColdFusion's associated scripting language is another example of a domain-specific language for data-driven websites. This scripting language is used to weave together languages and services such as Java, .NET, C++, SMS, email, email servers, HTTP, FTP, Exchange, directory services, and file systems for use in websites.

The ColdFusion Markup Language includes a set of tags that can be used in ColdFusion pages to interact with data sources, manipulate data, and display output. CFML tag syntax is similar to HTML element syntax.

Erlang OTP

The Erlang Open Telecom Platform was originally designed for use inside Ericsson as a domain-specific language. The language itself offers a platform of libraries for creating finite state machines, generic servers and event managers, which allow an engineer to quickly deploy applications or support libraries. These have been shown in industry benchmarks to outperform other languages intended for a mixed set of domains, such as C and C++. The language is now officially open source and can be downloaded from its website.

FilterMeister

FilterMeister is a programming environment, with a programming language that is based on C, for the specific purpose of creating Photoshop-compatible image processing filter plug-ins; FilterMeister runs as a Photoshop plug-in itself and it can load and execute scripts or compile and export them as independent plug-ins. Although the FilterMeister language reproduces a significant portion of the C language and function library, it contains only those features which can be used within the context of Photoshop plug-ins and adds a number of specific features only useful in this specific domain.

MediaWiki templates

The Template feature of MediaWiki is an embedded domain-specific language whose fundamental purpose is to support the creation of page templates and the transclusion (inclusion by reference) of MediaWiki pages into other MediaWiki pages.

A detailed description of that domain-specific language can be found at the corresponding article at the Wikimedia Foundation's Meta-Wiki.

Software engineering uses

There has been much interest in domain-specific languages as a way to improve the productivity and quality of software engineering. Domain-specific languages could provide a robust set of tools for efficient software engineering, and such tools are beginning to make their way into the development of critical software systems.

The Software Cost Reduction Toolkit is an example of this. The toolkit is a suite of utilities including a specification editor to create a requirements specification, a dependency graph browser to display variable dependencies, a consistency checker to catch missing cases in well-formed formulas in the specification, a model checker and a theorem prover to check program properties against the specification, and an invariant generator that automatically constructs invariants based on the requirements.

A newer development is language-oriented programming, an integrated software engineering methodology based mainly on creating, optimizing, and using domain-specific languages.

Metacompilers

Complementing language-oriented programming, as well as all other forms of domain-specific languages, is the class of compiler-writing tools called metacompilers. A metacompiler is not only useful for generating parsers and code generators for domain-specific languages; it is also itself a domain-specific language for the domain of compiler writing. The feature that sets a metacompiler apart from a standard compiler-compiler is that a metacompiler is written in its own language and translates itself: the grammar productions that define it, written in its own specialized language, are translated into its own executable form. Defining itself and translating itself constitute the meta-step that sets a metacompiler apart from other compiler-compilers.

Besides parsing domain-specific languages, metacompilers are useful for generating a wide range of software engineering and analysis tools.

Metacompilers that played a significant role in both computer science and the computer industry include Meta-II^[5] and its descendant TreeMeta^[6].

Unreal Engine and other games

Unreal and Unreal Tournament unveiled a language called UnrealScript. This allowed for rapid development of modifications compared to the competitor Quake (using the Id Tech engine). The Id Tech engine uses standard C code, meaning that C had to be learned and properly applied, while UnrealScript was optimized for ease of use and efficiency. Similarly, more recent games have introduced their own specific languages; a common choice for scripting is Lua.

Advantages and disadvantages

Some of the advantages:^[1]^[2]

Domain-specific languages allow solutions to be expressed in the idiom and at the level of abstraction of the problem domain. Consequently, domain experts themselves can understand, validate, modify, and often even develop domain-specific language programs.
Self-documenting code.
Domain-specific languages enhance quality, productivity, reliability, maintainability, portability and reusability.
Domain-specific languages allow validation at the domain level. As long as the language constructs are safe, any sentence written with them can be considered safe.
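
The point about validation at the domain level can be illustrated with a tiny formula language for salary calculation. In the sketch below, which is an assumption-laden example rather than a hardened sandbox, only numbers, known variable names, and the four arithmetic operators are accepted. The evaluate function, the ALLOWED_OPERATORS table, and the variable names are all invented for the illustration; parsing is delegated to Python's standard ast module.

```python
import ast
import operator

# The only operators the tiny language admits.
ALLOWED_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression, variables):
    """Evaluate an arithmetic formula, rejecting anything outside the language."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_OPERATORS:
            combine = ALLOWED_OPERATORS[type(node.op)]
            return combine(visit(node.left), visit(node.right))
        # Calls, attribute access, unknown names and so on all end up here.
        raise ValueError(f"construct not allowed: {type(node).__name__}")

    return visit(ast.parse(expression, mode="eval"))


salary = evaluate(
    "base_salary + overtime_hours * 25",
    {"base_salary": 3000, "overtime_hours": 4},
)
print(salary)

try:
    evaluate("__import__('os').getcwd()", {})
except ValueError as error:
    print("rejected:", error)
```

Because every accepted construct is harmless on its own, any formula that passes the check is harmless as a whole. Resource limits such as very deeply nested expressions are not addressed here.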

Some of the disadvantages:

Cost of learning a new language vs. its limited applicability
Cost of designing, implementing, and maintaining a domain-specific language, as well as the tools required to develop with it (such as an IDE)
Finding, setting, and maintaining proper scope.
Difficulty of balancing trade-offs between domain-specificity and general-purpose programming language constructs.
Potential loss of processor efficiency compared with hand-coded software.
Proliferation of similar non-standard domain-specific languages, for example a DSL used within insurance company A versus a DSL used within insurance company B.

References

[1] Marjan Mernik, Jan Heering, and Anthony M. Sloane. When and how to develop domain-specific languages. ACM Computing Surveys, 37(4):316–344, 2005. doi:10.1145/1118890.1118892
[2] Diomidis Spinellis. Notable design patterns for domain specific languages. Journal of Systems and Software, 56(1):91–99, February 2001. doi:10.1016/S0164-1212(00)00089-3
[3] Martin Fowler. "Using Domain Specific Languages".
[4] Martin Fowler. "Fluent Interfaces".
[5] D. V. Schorre. META II: a syntax-oriented compiler writing language. Proceedings of the 1964 19th ACM National Conference, pp. 41.301-41.3011, 1964.
[6] C. Stephen Carr, David A. Luther, and Sherian Erdmann. The TREE-META Compiler-Compiler System: A Meta Compiler System for the Univac 1108 and General Electric 645. University of Utah Technical Report RADC-TR-69-83.
