
Text2App: A Framework for Creating Android Apps from Text Descriptions

Masum Hasan1*, Kazi Sajeed Mehrab1*, Wasi Uddin Ahmad2, and Rifat Shahriyar1
1 Bangladesh University of Engineering and Technology (BUET)
2 University of California, Los Angeles (UCLA)
1 masum@ra.cse.buet.ac.bd, 1505025.ksh@ugrad.cse.buet.ac.bd, rifat@cse.buet.ac.bd
2 wasiahmad@cs.ucla.edu

* Equal contribution

arXiv:2104.08301v1 [cs.CL] 16 Apr 2021

Abstract

We present Text2App – a framework that allows users to create functional Android applications from natural language specifications. The conventional method of source code generation tries to generate source code directly, which is impractical for creating complex software. We overcome this limitation by transforming natural language into an abstract intermediate formal language representing an application with a substantially smaller number of tokens. The intermediate formal representation is then compiled into target source codes. This abstraction of programming details allows seq2seq networks to learn complex application structures with less overhead. In order to train sequence models, we introduce a data synthesis method grounded in a human survey. We demonstrate that Text2App generalizes well to unseen combinations of app components and that it is capable of handling noisy natural language instructions. We explore the possibility of creating applications from highly abstract instructions by coupling our system with GPT-3 – a large pretrained language model. The source code, a ready-to-run demo notebook, and a demo video are publicly available at http://text2app.github.io.

Figure 1: An example app created by our system that speaks the textbox text on button press. The natural language is machine translated to a simpler intermediate formal language which is compiled into an app source code. Literals are separated before machine translation.

Natural language description:
Create an app with a textbox, a button named "Speak", and a text2speech. When the button is clicked, speak the text in the text box.

Simplified App Representation:
<complist> <textbox> <button> STRING0 </button> <text2speech> </complist>
<code> <button1_clicked> <speak> <textbox1text> </speak> </button1_clicked> </code>

Literal Dictionary:
{ "STRING0": "Speak" }

1 Introduction

Mobile application developers often have to build applications from natural language requirements provided by their clients or managers. An automated tool to build functional applications from such natural language descriptions would add significant value to this application development process. For many years, researchers have been trying to generate source code from natural language descriptions (Yin and Neubig, 2017; Ling et al., 2016), with the aspiration to automatically generate full-fledged software systems further down the road. To date, however, the task of source code generation has turned out to be highly difficult – the best deep neural networks, consisting of hundreds of millions of parameters and trained with hundreds of gigabytes of data, fail to achieve an accuracy higher than 20% (Ahmad et al., 2021; Lu et al., 2021). With the task of source code generation lagging behind, the ambition to produce software automatically from natural language descriptions has remained a distant reality.

In this work, we present Text2App, a novel pipeline to generate Android mobile applications (apps) from natural language (NL) descriptions. Instead of an end-to-end learning-based model, we break down the challenging task of app development into modular components and apply learning-based methods only where necessary.
We create a formal language named Simplified App Representation (SAR) to represent an app with a minimal number of tokens, and train a sequence-to-sequence neural network to generate this formal representation from an NL description. Fig. 1 shows an example of a formal representation created from a given natural language. Using a custom-made compiler, we convert the simplified formal representation to the application source code from which a functional app can be built. We create a data synthesis method and a BERT-based natural language augmentation method to synthesize a realistic natural language and associated SAR parallel corpus.

We demonstrate that the compact app representation allows seq2seq models to generate apps from significantly noisy input, even being able to predict combinations it has not seen during training. Moreover, we open source our implementation to the community, and lay down the groundwork to extend the features and functionalities of Text2App beyond what we demonstrate in this paper.

2 Related Works

Historically, deep learning based program generation has tended to focus on generating unit functions or methods from natural language instructions using sequence-to-sequence or sequence-to-tree architectures (Ling et al., 2016; Yin and Neubig, 2017; Brockschmidt et al., 2019; Parisotto et al., 2017; Rabinovich et al., 2017; Ahmad et al., 2021; Lu et al., 2021). Although these approaches are interesting, they are limited by sequential models' ability to generate long sequences. For reference, the best Transformer based models generate a function with an accuracy of around 20% (Ahmad et al., 2021; Lu et al., 2021), making large scale software development infeasible with this method.

The second type of work in program generation that sparked researchers' interest is generating GUI source code from a screenshot, hand-drawn image, or text description of the GUI (Beltramelli, 2018; Jain et al., 2019; Robinson, 2019; Zhu et al., 2019; Moran et al., 2020; Kolthoff, 2019). These works are limited to generating the GUI design only, and do not naturally extend to functionality based programming. To the best of our knowledge, ours is the first work on developing working software with interdependent functional components from natural language descriptions.

3 Text2App

Text2App is a framework that aims to build operational mobile applications from natural language (NL) specifications. We make this possible by translating a specification to an intermediate, compact, formal representation which is compiled into the application source code in a later step. This intermediate language helps our system represent an application with a substantially smaller number of tokens, allowing seq2seq models to generate intricate apps in a few decoding steps, which otherwise would be unsolvable by current sequential models.

We design a formal language named Simplified App Representation (SAR) that captures the app design, components, and functionalities in a small number of tokens. Our SAR compiler converts a SAR code to an application source code. Using MIT App Inventor1 – a popular, accessible application development tool – we can build a functional application from the app source code in a matter of minutes. We conduct a human survey to understand user perception of text-based app development. Based on our observations from the survey, we create a data synthesis approach to automatically generate fluent natural language descriptions of apps along with corresponding SARs. Using the synthesized parallel NL-SAR data, we train a sequence-to-sequence neural network to predict SAR from a given text description of an app, which is then compiled into a functional app. Fig. 2 describes each step in our natural language to app generation process. The system is built to be modular, where each module is self-contained: independent and with a single, well-defined purpose. This allows us to modify one part of the system without affecting the others and to debug the system to pinpoint any error.

Literals such as strings and numbers are separated during preprocessing and are re-introduced during compilation. Contrary to conventional programming languages, unless a user specifies a detail of an app component, a suitable default is assumed. This allows the user to describe an app more naturally and also reduces unnecessary overhead from the sequential model.

1 https://appinventor.mit.edu/
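As an illustration of this preprocessing step, the Python sketch below separates quoted literals into a literal dictionary and later reintroduces them. The quoting convention, function names, and placeholder scheme (STRING0, STRING1, ...) follow Fig. 1, but everything else is an assumption rather than the released implementation.

import re

def separate_literals(text):
    """Pull quoted strings out of the NL description and replace them with
    placeholder tokens (STRING0, STRING1, ...), as in Fig. 1."""
    literals = {}

    def repl(match):
        key = "STRING" + str(len(literals))
        literals[key] = match.group(1)
        return " " + key + " "

    # Assumed convention: literals appear between straight or curly double quotes.
    processed = re.sub(r'[“”"]([^“”"]+)[“”"]', repl, text)
    return " ".join(processed.split()), literals

def reintroduce_literals(sar, literals):
    """Put the literal values back into the generated SAR before compilation."""
    for key, value in literals.items():
        sar = sar.replace(key, value)
    return sar

nl = 'Create an app with a textbox, a button named "Speak", and a text2speech.'
processed, literal_dict = separate_literals(nl)
# processed    -> 'Create an app with a textbox, a button named STRING0 , and a text2speech.'
# literal_dict -> {'STRING0': 'Speak'}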
Figure 2: Text2App Prediction Pipeline. A given text is formatted and passed to a seq2seq network to be translated into SAR. Using a SAR Compiler, it is converted to an App Inventor project, which can be built into an application. (Pipeline stages shown in the figure: Natural Language Description of App -> Tokenized and Pre-processed -> Seq2seq Neural Network -> Simplified App Representation (SAR) -> SAR Compiler -> MIT App Inventor Source (.scm, .bky) -> MIT App Inventor backend Compiler -> .apk, Ready to Use!)

3.1 Survey on Natural Language based App Development

In order to understand how an end user would perceive a text based app development system, early in our study we performed a semi-structured human survey among participants with some level of programming experience. We asked them to describe one or more mobile applications from a given set of app components. We received a total of 57 responses2 from 30 participants. We observe that 36 out of the 57 responses contain enough details for automatic app creation. 19 of the responses require information that is not contained in the description, and would need external knowledge to be converted to an app (e.g. "make a photo editing app" requires a system to know what a photo editor is). The observations from this survey help us to create a data synthesis method, and they work as a general guideline for our study design.

2 http://bit.ly/AppDescriptions

3.2 Simplified App Representation (SAR)

SAR is an abstract, intermediate, formal language that represents a mobile application in our system. We design SAR to be minimal and compact and, at the same time, to completely describe an application. We formally define the context free grammar of SAR using the following production rules:

⟨SAR⟩ → ⟨screens⟩
⟨screens⟩ → ⟨screen⟩ '<NEXT>' ⟨screens⟩ | ⟨screen⟩
⟨screen⟩ → ⟨complist⟩ ⟨code⟩
⟨complist⟩ → '<COMPLIST>' ⟨comps⟩ '</COMPLIST>'
⟨comps⟩ → ⟨comps⟩ ⟨comp⟩ | ⟨comp⟩
⟨comp⟩ → '<COMP>' ⟨args⟩ '</COMP>' | '<COMP>'
⟨code⟩ → '<CODE>' ⟨events⟩ '</CODE>'
⟨events⟩ → ⟨event⟩ ⟨events⟩ | ⟨event⟩
⟨event⟩ → '<EVENT>' ⟨actions⟩ '</EVENT>'
⟨actions⟩ → ⟨actions⟩ ⟨action⟩ | ⟨action⟩
⟨action⟩ → '<ACTION>' ⟨args⟩ '</ACTION>'
⟨args⟩ → ⟨arg⟩ ⟨args⟩ | ⟨arg⟩
⟨arg⟩ → '<ARG>' '<VAL>' '</ARG>' | '<VAL>'

Here, ⟨SAR⟩ is the starting symbol and the tokens inside quotes are terminals. A mobile application in our system firstly consists of screens. Each screen contains an ordered list of visible (e.g. video player, textbox) or invisible (e.g. accelerometer, text2speech) components, which are identified with the <COMPLIST> tokens. Next, the application logic is defined within the <CODE> tokens. One functionality in our system is a unit tuple containing an event, an action, and a value. An <EVENT> is an external or internal process that triggers an action. An <ACTION> is a process that performs a certain operation. Both <EVENT> and <ACTION> components often have properties that determine their identity or behavior. For example, an animated ball has properties 'color', 'speed', 'radius', etc. Each of these properties is referred to as an argument or <ARG>, and the values of such arguments are indicated by <VAL>. For example, Figure 1 demonstrates the SAR code for an app containing a button and a text2speech. Here, the <button1_clicked> event triggers the action <speak> from the text2speech component, which consumes <textbox1text>, the text written in textbox1, as a value.
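To make this structure concrete, the short Python sketch below splits the Fig. 1 SAR string into its component list and its event blocks. It is a toy reading of the grammar for illustration only, not the parser used by the released system.

import re

SAR = ("<complist> <textbox> <button> STRING0 </button> <text2speech> </complist> "
       "<code> <button1_clicked> <speak> <textbox1text> </speak> </button1_clicked> </code>")

def parse_sar(sar):
    """Split a flat SAR token sequence into components and event blocks."""
    tokens = sar.split()
    comp_span = tokens[tokens.index("<complist>") + 1 : tokens.index("</complist>")]
    code_span = tokens[tokens.index("<code>") + 1 : tokens.index("</code>")]

    # Components are the opening tags inside <complist>; literals such as STRING0
    # are arguments of the preceding component.
    components = [t.strip("<>") for t in comp_span
                  if t.startswith("<") and not t.startswith("</")]

    # Each outermost <event> ... </event> pair wraps the actions and values it triggers.
    events = []
    for match in re.finditer(r"<(\w+)>(.*?)</\1>", " ".join(code_span)):
        events.append({"event": match.group(1), "body": match.group(2).split()})
    return components, events

components, events = parse_sar(SAR)
# components -> ['textbox', 'button', 'text2speech']
# events     -> [{'event': 'button1_clicked',
#                 'body': ['<speak>', '<textbox1text>', '</speak>']}]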
3.3 Converting SAR to Mobile Apps

We convert SAR to an MIT App Inventor (MIT AI) project using a custom written compiler. The project is then compiled into a functioning app (.apk) using the MIT AI server. MIT App Inventor is a popular tool for app development with a large community of active developers, rich and growing functionalities, and a relatively simple internal structure for apps. The MIT AI file structure mainly consists of a Scheme (.scm) file containing the components and their properties, and a Blockly (.bky) file containing the code functionalities. Appendix A and B respectively show the .scm and .bky files for the example shown in Fig. 1. The following algorithm roughly shows our SAR to source conversion process.

Algorithm 1: Compiling SAR to .scm, .bky
 1  Input: SAR tokens, LiteralDict
 2  Output: .scm, .bky
 3  scm = initializeSCM(); bky = initializeBKY()
 4  for token in complist do
 5      if isComponentStart(token) then
 6          uuid = generateNumUUID()
 7          n = getCompNum(token)
 8          if token.hasArgument() then
 9              args = fetchArgs(token, LiteralDict)
10          t = getTemplate(token)
11          t.set(n, uuid, args)
12          scm.add(t)
13  write(scm)
14  for token in code do
15      t = getTemplate(token)
16      uuid = generateStringUUID()
17      if token.isLiteral() then
18          val = LiteralDict[token]
19          if val.isFileDir() & fileDoesNotExist then
20              val = closestMatchingFile()
21          t.set(val)
22      else
23          if token.hasNumber() then
24              number = regexMatch(token)
25              t.set(number)
26          t.set(uuid)
27      bky.add(t)
28  write(bky)

All possible SAR tokens have corresponding predefined template source codes. The compiler parses the SAR token by token and fetches the corresponding template. Each template is provided with a UUID3, serial number, and necessary arguments, and is finally added to its respective file. The component portion of the SAR is added to an .scm file, and the code portion to a .bky file. The literal dictionary consists of literal values that were separated from the NL during preprocessing and are reintroduced during compilation.

Once the .scm and .bky files are produced, they are compressed into an MIT AI project file (.aia). The project file has to be uploaded to the publicly available MIT AI server. Then the user can debug the app, or download it as an executable (.apk) file.

3 Universal Unique Identifier

3.4 Synthesising Natural Language and SAR Parallel Data

Based on the findings of our survey described in Section 3.1, we develop a data synthesis method for generating natural language app descriptions and SAR data in parallel. First, from a list of allowed components, we randomly select a certain number of components. Most components (e.g. button, textbox) are allowed to be repeated a certain number of times, but some components (e.g. text2speech, accelerometer) can only appear once. The selected components are sorted into three groups: event components, action components, and value components (detailed in Section 3.2). For each event, action, and value component, their functionalities are selected randomly from a predefined list. When the components, arguments, and their functionalities are selected, we stochastically create a natural language description by sampling natural language snippets from predefined lists. Furthermore, we deterministically create a SAR representation of the app and the functionalities. Figure 3 demonstrates the creation of a simple app with three components. The random selection process and repetition of components allow our synthesis method to create a wide variety of apps.
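As a concrete illustration of the template-filling loop in Algorithm 1 (lines 4-13), the Python sketch below turns a SAR component list into .scm-style component entries. The templates and helper logic are simplified assumptions; the actual files produced by the released compiler are shown in Appendix A and B.

import json
import random

# Illustrative templates only; the real compiler ships one template per SAR token.
SCM_TEMPLATES = {
    "<textbox>": {"$Name": "TextBox{n}", "$Type": "TextBox", "$Version": "6",
                  "Uuid": "{uuid}"},
    "<button>":  {"$Name": "Button{n}", "$Type": "Button", "$Version": "6",
                  "Text": "{arg}", "Uuid": "{uuid}"},
}

def compile_complist(tokens, literal_dict):
    """Fill one .scm component entry per component token (cf. Algorithm 1, lines 4-13)."""
    components, counts = [], {}
    for i, token in enumerate(tokens):
        if token not in SCM_TEMPLATES:
            continue                                  # closing tags, literals, etc.
        counts[token] = counts.get(token, 0) + 1      # serial number n
        # A literal right after the opening tag is treated as the component's argument.
        arg = literal_dict.get(tokens[i + 1], "") if i + 1 < len(tokens) else ""
        entry = {key: value.format(n=counts[token],
                                   uuid=random.randint(10**8, 10**9), arg=arg)
                 for key, value in SCM_TEMPLATES[token].items()}
        components.append(entry)
    return components

tokens = "<complist> <textbox> <button> STRING0 </button> </complist>".split()
print(json.dumps(compile_complist(tokens, {"STRING0": "Speak"}), indent=2))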
Figure 3: Automatic synthesis of NL and SAR parallel corpus. Bold-italic indicates text that is selected stochastically. (Example from the figure: allowed components are button, text2speech, and textbox; Components SAR: <complist> <textbox> <button> <text2speech> </complist>; Components NL: "Create an app with a textbox, a button and a text to speech"; Code SAR: <button1_clicked> <speak> <textbox1text> </speak> </button1_clicked>; Code NL: "When the button is clicked, speak the text in the textbox".)
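A toy version of the synthesis procedure shown in Figure 3 is sketched below in Python. The component inventories, phrasings, and the single hard-coded functionality are placeholders standing in for the much larger predefined lists used by the actual synthesizer.

import random

REPEATABLE = ["<textbox>", "<button>"]        # may appear several times
SINGLETONS = ["<text2speech>"]                # may appear at most once
NL_SNIPPETS = {"<textbox>": "a textbox", "<button>": "a button",
               "<text2speech>": "a text to speech"}

def synthesize_pair(seed=None):
    """Stochastically sample components and emit one (NL, SAR) training pair."""
    rng = random.Random(seed)
    comps = rng.choices(REPEATABLE, k=rng.randint(1, 2)) + SINGLETONS
    rng.shuffle(comps)

    nl = "Create an app with " + ", ".join(NL_SNIPPETS[c] for c in comps) + "."
    sar = "<complist> " + " ".join(comps) + " </complist>"

    # One fixed functionality for the sketch: button click -> speak the textbox text.
    if all(c in comps for c in ("<button>", "<textbox>", "<text2speech>")):
        nl += " When the button is clicked, speak the text in the textbox."
        sar += (" <code> <button1_clicked> <speak> <textbox1text>"
                " </speak> </button1_clicked> </code>")
    return nl, sar

nl, sar = synthesize_pair(seed=0)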

3.5 BERT Based NL Augmentation

Data augmentation is a common practice in computer vision, where an image is rotated, cropped, scaled, or corrupted in order to increase the data size and introduce variation into the dataset. To add diversity to our synthetic dataset, we propose a data augmentation method where we mutate a certain percentage of words in our dataset using the Masked Language Modeling (MLM) property of pretrained BERT (Devlin et al., 2019). First, a selected word is replaced with a BERT [MASK] token and a pretrained BERT model is asked to fill up the masked position. The ordered list of the top 10 predictions is collected, and one word from this list is selected with a descending weighted probability. That means the top predictions are more likely to be selected; however, all 10 predictions have a chance. This augmentation technique introduces contextually correct unseen vocabulary to the dataset, and familiarizes the sequential model with realistic natural language noise. Table 1 presents an example of our BERT based augmentation.

Original: Create an app that has an audio player with source string0, a switch. If the switch is flipped, play player.
Augmentation: Create an app that has an external player with source string0, a switch. If the switch gets flipped, play player.

Table 1: BERT mask filling based data augmentation method. Mutated words are highlighted green.

3.6 NL to SAR Translation using Seq2Seq Networks

We generate 50,000 unique NL and SAR parallel data points using our data synthesis method, and mutate 1% of the natural language tokens. We split this dataset into train-validation-test sets in an 8:1:1 ratio and train three different models.

Pointer Network: We train a Pointer Network (See et al., 2017) consisting of a randomly initialized bidirectional LSTM encoder with hidden layers of size 500 and 250. As our output vocabulary is fixed, we do not use the copy mechanism.

Transformer with pretrained encoders: We create two sequence-to-sequence Transformer (Vaswani et al., 2017) networks, each having 12 encoder layers and 6 decoder layers. Every layer has 12 self-attention heads of size 64. The hidden dimension is 768. The encoder of one of the models is initialized with RoBERTa base (Liu et al., 2020) pretrained weights, and the other one with CodeBERT base (Feng et al., 2020) pretrained weights. RoBERTa is pretrained on natural language with MLM objectives and has shown excellent results in numerous NLU tasks. CodeBERT is pretrained similarly on an amalgam of natural languages and source codes, thereby making it more familiar with programming structure and terminologies.
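The paper does not state which framework was used to implement these models; the sketch below shows one possible setup with the Hugging Face transformers library that matches the stated sizes (12 pretrained encoder layers, 6 randomly initialized decoder layers, hidden size 768, 12 attention heads). Swapping "roberta-base" for "microsoft/codebert-base" would give the CodeBERT-initialized variant.

from transformers import (EncoderDecoderModel, RobertaConfig,
                          RobertaForCausalLM, RobertaModel)

# Encoder: 12 layers, hidden size 768, 12 heads, initialized from RoBERTa base weights.
encoder = RobertaModel.from_pretrained("roberta-base")

# Decoder: 6 randomly initialized layers with cross-attention over the encoder outputs.
decoder_config = RobertaConfig(
    num_hidden_layers=6,
    hidden_size=768,
    num_attention_heads=12,
    is_decoder=True,
    add_cross_attention=True,
)
decoder = RobertaForCausalLM(decoder_config)

# Combine into a seq2seq model; padding and decoder start token ids, as well as the
# SAR output vocabulary, still have to be configured before training on NL-SAR pairs.
model = EncoderDecoderModel(encoder=encoder, decoder=decoder)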
3.7 Simplifying Abstract Natural Language Instructions using GPT-3

In our survey sessions (Section 3.1) we found that users often provide highly abstract instructions which require external knowledge to understand (e.g. "Create a photo editor app" expects knowledge of how a photo editor looks and works). Large pretrained language models (LMs), such as GPT-3 (Brown et al., 2020), have been shown to understand abstract natural language concepts and even explain them in simple terms (Mayn, 2020). We incorporate this capability of GPT-3 to enable our model to create applications from highly abstract instructions. We provide GPT-3 with nine abstract app concepts and their corresponding NL descriptions, phrased such that they can be generated using our system (Appendix C). We then instruct it to describe an unseen abstract concept, and the description is sent to our seq2seq network in order to produce SAR. Table 3 shows some successful (1, 2) and unsuccessful (3, 4) app descriptions generated by GPT-3. Although the LM generates plausible sequences, it fails to limit its prediction to our supported functionalities. With more features and functionalities added to Text2App, LM-based explanations can be a viable method of creating apps from abstract specifications.
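A sketch of this few-shot prompting step is shown below. It assumes the legacy (pre-1.0) openai Python client and the "davinci" completion engine available at the time; the prompt file name is a hypothetical stand-in for the nine examples reproduced in Appendix C.

import openai

# Hypothetical file holding the few-shot prompt from Appendix C (nine concept/description pairs).
FEW_SHOT_PROMPT = open("appendix_c_prompt.txt").read()

def simplify(abstract_concept):
    """Ask GPT-3 to rewrite an abstract app idea as a description Text2App can handle."""
    response = openai.Completion.create(
        engine="davinci",
        prompt=FEW_SHOT_PROMPT + abstract_concept + " –",
        max_tokens=100,
        temperature=0.7,
        stop="\n",
    )
    return response.choices[0].text.strip()

# simplify("twitter app") might return a description like example 2 in Table 3,
# which is then passed to the seq2seq network to produce SAR.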
Model           #Epoch | Test          | Mutation 2%   | Mutation 5%   | Mutation 10%  | Unseen Pair
                       | BLEU    EM    | BLEU    EM    | BLEU    EM    | BLEU    EM    | BLEU    EM
Unmutated Training Data
PointerNet       13.6  | 94.64   79.24 | 94.16   72.06 | 91.80   56.14 | 88.96   40.78 | 96.75   82.91
RoBERTa init      3    | 97.20   77.80 | 96.83   73.20 | 94.97   61.68 | 92.86   48.06 | 98.11   79.66
CodeBERT init     8    | 97.42   80.02 | 97.18   76.02 | 95.37   64.38 | 93.29   51.24 | 98.47   83.50
Training with 1% Mutation
PointerNet       23.2  | 95.03   81.40 | 94.85   79.46 | 93.85   72.04 | 92.53   63.68 | 96.68   83.33
RoBERTa init      3    | 97.66   81.76 | 97.60   80.66 | 96.91   76.16 | 96.04   70.10 | 98.64   84.68
CodeBERT init     7    | 97.64   81.66 | 97.51   80.20 | 96.74   74.98 | 95.71   67.58 | 98.62   84.51

Table 2: Comparison between Pointer Network and seq2seq Transformer with encoder initialized with RoBERTa and CodeBERT pretrained weights. "Mutation x%" columns use BERT-mutated test sets. BLEU indicates BLEU-4 and Exact Match (EM) is shown in percent.

1. number adding app – make an app with a textbox, a textbox, and a button named "+".
SAR: <complist> <textbox> <textbox> <button> + </button> </complist>

2. twitter app – make an app with a textbox, a button named "tweet", and a label. When the button is pressed, set the label to textbox text.
SAR: <complist> <textbox> <button> tweet </button> <label> label1 </label> </complist> <code> <button1clicked> <label1> <textboxtext1> </label1> </button1clicked> </code>

3. browser app – create an app with a textbox, a button named "go", and a button named "back". When the button "go" is pressed, go to the url in the textbox. When the button "back" is pressed, go back to the previous page.

4. Google front page – make an app with a textbox, a button named "google", and a button named "search". When the button "google" is pressed, search google. When the button "search" is pressed, search the web.

Table 3: Abstract instructions to simpler app descriptions using GPT-3. Unsupported functionalities in red and italic.

4 Evaluation

In this section, we evaluate the three seq2seq networks mentioned in Section 3.6 – the Pointer Network, the seq2seq Transformer initialized with RoBERTa, and the seq2seq Transformer initialized with CodeBERT. We evaluate the models in 3 different settings – firstly, on a held-out test set, secondly, with increasing amounts of mutation in the test set (2%, 5%, 10%) (Section 3.5), and finally, using data containing specific combinations of components (<button1clicked>, <text2speech>) that were excluded during training. These establish the models' ability to handle unstructured NL instructions and to generalize beyond the patterns they were trained on. Model checkpoints are selected based on validation BLEU score.

From Table 2 we can see that adding as little as 1% mutation to the training data notably improves all models' ability to handle noisy input (by up to 22.04%). We also see that the RoBERTa initialized model performs best in all evaluation categories. Note that all predictions reported in Table 2 are in valid SAR format.
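The paper does not name the BLEU implementation behind Table 2; as one possibility, token-level BLEU-4 and exact match over predicted SAR sequences could be computed as in the NLTK-based sketch below.

from nltk.translate.bleu_score import corpus_bleu

def evaluate(predictions, references):
    """Corpus-level BLEU-4 over SAR tokens plus exact-match rate, both in percent."""
    refs = [[r.split()] for r in references]     # one reference SAR per example
    hyps = [p.split() for p in predictions]
    bleu = corpus_bleu(refs, hyps) * 100         # default weights correspond to BLEU-4
    exact = 100 * sum(p == r for p, r in zip(predictions, references)) / len(references)
    return bleu, exact

gold = ["<complist> <textbox> <button> STRING0 </button> </complist>"]
pred = ["<complist> <textbox> <button> STRING0 </button> </complist>"]
print(evaluate(pred, gold))                      # identical sequences -> (100.0, 100.0)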
5 Future Work

The core contribution of our project lies in the development endeavour of building the SAR, the SAR compiler, and the SAR-NL parallel data synthesizer. Although we are working on adding new features to Text2App, the total implemented functionality is a small fraction of what is possible in our system. For each new component added to our system, the possible app space grows exponentially. With more development effort, we can expect notably more utility from Text2App. We invite both the software development and NLP communities to contribute to this project and turn it into a general-purpose app development platform. Our short term goal with Text2App is to add more functionalities and app components. In the long term, we would like to build SAR compilers for native application development platforms, such as Android and iOS, and cross-platform frameworks like Flutter and Ionic.

6 Conclusion
In this paper, we explore creating functional mobile applications from natural language text descriptions using seq2seq networks. We propose Text2App, a novel framework for natural language to app translation with the help of a simpler intermediate representation of the application. The intermediate formal representation allows describing an app with a significantly smaller number of tokens than native app development languages. We also design a data synthesis method, guided by a human survey, that automatically generates fluent natural language app descriptions and their formal representations. Our AI-aware design approach for a formal language can guide future programming language and framework development, from which further source code generation work can benefit.

Acknowledgement

We thank Prof. Zhijia Zhao from UCR for proposing the problem that inspired this project idea. We also thank OpenAI, Google Colaboratory, Hugging Face, the MIT App Inventor community, the survey participants, and Prof. Anindya Iqbal for feedback regarding modularity. This project was funded under the 'Innovation Fund' by the ICT Division, Government of the People's Republic of Bangladesh.

References

Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang. 2021. Unified pre-training for program understanding and generation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics.

Tony Beltramelli. 2018. pix2code: Generating code from a graphical user interface screenshot. In Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pages 1–6.

Marc Brockschmidt, Miltiadis Allamanis, Alexander L. Gaunt, and Oleksandr Polozov. 2019. Generative code modeling with graphs. In International Conference on Learning Representations.

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems, volume 33, pages 1877–1901. Curran Associates, Inc.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.

Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A pre-trained model for programming and natural languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1536–1547, Online. Association for Computational Linguistics.

Vanita Jain, Piyush Agrawal, Subham Banga, Rishabh Kapoor, and Shashwat Gulyani. 2019. Sketch2code: Transformation of sketches to UI in real-time using deep neural network.

K. Kolthoff. 2019. Automatic generation of graphical user interface prototypes from unrestricted natural language requirements. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1234–1237.

Wang Ling, Phil Blunsom, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Fumin Wang, and Andrew Senior. 2016. Latent predictor networks for code generation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 599–609, Berlin, Germany. Association for Computational Linguistics.

Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2020. RoBERTa: A robustly optimized BERT pretraining approach.

Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin B. Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, and Shujie Liu. 2021. CodeXGLUE: A machine learning benchmark dataset for code understanding and generation. CoRR, abs/2102.04664.

Andrew Mayn. 2020. OpenAI API alchemy: Summarization – @andrewmayne. https://andrewmayneblog.wordpress.com/2020/06/13/openai-api-alchemy-summarization/. (Accessed on 03/22/2021).

K. Moran, C. Bernal-Cárdenas, M. Curcio, R. Bonett, and D. Poshyvanyk. 2020. Machine learning-based prototyping of graphical user interfaces for mobile apps. IEEE Transactions on Software Engineering, 46(2):196–221.

Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Pushmeet Kohli. 2017. Neuro-symbolic program synthesis. In Proceedings of the 5th International Conference on Learning Representations (ICLR 2017), Toulon, France.

Maxim Rabinovich, Mitchell Stern, and Dan Klein. 2017. Abstract syntax networks for code generation and semantic parsing. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1139–1149, Vancouver, Canada. Association for Computational Linguistics.

Alex Robinson. 2019. Sketch2code: Generating a website from a paper mockup.

Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1073–1083, Vancouver, Canada. Association for Computational Linguistics.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc.

Pengcheng Yin and Graham Neubig. 2017. A syntactic neural model for general-purpose code generation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 440–450, Vancouver, Canada. Association for Computational Linguistics.

Zhihao Zhu, Zhan Xue, and Zejian Yuan. 2019. Automatic graphics program generation using attention-based hierarchical decoder. In Computer Vision – ACCV 2018, pages 181–196, Cham. Springer International Publishing.
A Scheme (.scm) File for Visual Components

#|
$JSON
{"authURL":["ai2.appinventor.mit.edu"],
"YaVersion":"208",
"Source":"Form",
"Properties":{"$Name":"Screen1","$Type":"Form","$Version":"27",
"AppName":"speak_it","Title":"Screen1",
"Uuid":"0",
"$Components":[{"$Name":"TextBox1",
"$Type":"TextBox","$Version":"6",
"Hint":"Hint for TextBox1","Uuid":"913409813"},
{"$Name":"Button1",
"$Type":"Button",
"$Version":"6","Text":"Speak","Uuid":"955068562"},
{"$Name":"TextToSpeech1",
"$Type":"TextToSpeech","$Version":"5",
"Uuid":"1305598760"}]}}
|#

Listing 1: A sample scm file representing the visual components of the app in Fig. 1. The blue portion lists the components of the app.

B Blockly (.bky) Logical Components

<xml xmlns="http://www.w3.org/1999/xhtml">
  <block type="component_event" id="gnc7Dj5so‘[8HB}z|Ohk" x="-184" y="91">
    <mutation component_type="Button" is_generic="false"
        instance_name="Button1" event_name="Click"></mutation>
    <field name="COMPONENT_SELECTOR">Button1</field>
    <statement name="DO">
      <block type="component_method" id="-7*:E7Xk@uO5?b32/Gq3">
        <mutation component_type="TextToSpeech" method_name="Speak"
            is_generic="false" instance_name="TextToSpeech1"></mutation>
        <field name="COMPONENT_SELECTOR">TextToSpeech1</field>
        <value name="ARG0">
          <block type="component_set_get" id="wS:Fm{EYxQ]B1%*LO2zp">
            <mutation component_type="TextBox" set_or_get="get"
                property_name="Text" is_generic="false"
                instance_name="TextBox1"></mutation>
            <field name="COMPONENT_SELECTOR">TextBox1</field>
            <field name="PROP">Text</field>
          </block>
        </value>
      </block>
    </statement>
  </block>
  <yacodeblocks ya-version="208" language-version="33"></yacodeblocks>
</xml>

Listing 2: A sample bky file representing the logical components of the app in Fig. 1. The colored lines represent different blocks.

C GPT-3 Prompt

How to make an app with these components: button, switch, textbox, accelerometer, audio player, video player, text2speech
random video player app. – make an app with a video player with a random video, a button named "play" and a button named "pause". When the first button is pressed, start the video. When the second button is pressed pause the video.
A time speaking app – an app with a button, a clock and a text2speech. When the button is clicked, speak the time.
Display time app – create an app with a button, a timepicker, and a label. When the button is pressed, set the label to the time.
A messeging app – create an app with a with a textbox, and a button named "send", and a label. When the button is pressed, set label to textbox text.
Login form – create an app with a textbox, a passwordbox, and a button named "login".
Search interface – make an application with a textbox, and a button named "search".
siren app – create an app with a music player with source "siren sound.mp3", and a button. When the button is pressed, play the audio.
An arithmatic addition app gui – make an app with a textbox, a textbox, and a button named "+".
vibration alert app – create an app with an accelerometer, and a text2speech. When the accelerometer is shaken, speak "vibration detected".
{A new prompt} –

Table 4: Prompt used to generate app descriptions with GPT-3. A new unseen prompt is added at the end and the model is tasked to continue generating text in the same pattern. This method is known as few-shot text generation (Brown et al., 2020).
