Prompt Optimization Methods for Large Language Models
Abstract—When faced with long text input, the generated results from large language models sometimes fail to meet user expectations. Due to the length and complexity of the input content, users often do not know how to modify the input to obtain the desired results. To address this dilemma, we propose a Prompt optimization method for large language models with long text input. This method determines the influence weights of different semantic segments on the results, providing guidance for users to generate desired text using large language models. Experimental results show that by evaluating the importance of different semantic segments in military question-answering system text and improving the input content, the quality and usability of the generated military question-answering text can be enhanced.

Index Terms—Long text input, Large language model, Prompt, Question-answering system

I. INTRODUCTION

Large language models, as a product of the combination of "big data + high computing power + strong algorithms," are a collection of implicit knowledge extracted from massive training data. In particular, large language models represented by ChatGPT have demonstrated outstanding performance in the field of text generation. However, when using large language models for text generation, especially when the user input is lengthy, if the generated content does not meet the user's expectations, users usually attempt to modify the input to guide the model toward content that aligns with their expectations. Nevertheless, due to the length of the input text, users find it challenging to grasp the key points when modifying the input. Even after multiple adjustments, the desired output may still not be obtained, as shown in Figure 1. To solve this problem, this paper proposes a Prompt optimization method for large language models with long text input. This method determines the influence weights of different semantic segments on the results, providing guidance for users to generate desired text using large language models. On this basis, this paper applies the method to the generation of military forum question-answering system text to verify its effectiveness.

Military forum question-answering systems serve as an important entry point for military enthusiasts to quickly learn about past battles and weapons and equipment, playing a crucial role in military education. They assist relevant personnel in understanding and analyzing past battles and equipment, enabling them to quickly and accurately acquire relevant knowledge and experience. This paper combines the Prompt optimization method for large language models with long text input with the generation of military content, aiming to explore the impact of different input semantic segments on the generation of military content text and to assist military enthusiasts in utilizing large language models to obtain military information quickly and accurately.

II. RELATED CONCEPTS OF MILITARY QUESTION-ANSWERING SYSTEMS

Question-answering systems for military forums are becoming increasingly complex and intelligent. Wang Xiaoming and Li Xiaohong [1] studied the key technologies involved, such as multi-round dialogue for mining users' deep information needs and semantic matching for precise interaction with knowledge graphs. The construction of knowledge graphs is also gaining attention, with Liu Xiaoming [2] exploring methods suitable for building military domain knowledge graphs. Meanwhile, in the context of big data, traditional matching methods face challenges, and Michael Gray [3] proposed deep semantic matching techniques for better knowledge association. It is evident that research on military forum question-answering systems has begun to take shape, and its key technologies are continuously developing. In the future, it will be necessary to build even larger knowledge graphs, achieve dynamic knowledge updates, and enable multi-round dialogue mechanisms that reflect personalized user interest models, making question-answering services more intelligent.

III. COMPOSITION OF MILITARY QUESTION-ANSWERING SYSTEMS

Military question-answering systems are complex and precise frameworks designed to accurately parse user queries and provide comprehensive answers. The system first uses a question parsing module to understand the user's query intent and key points, identifying the question type and extracting key entities. The subsequent content encoding module is responsible for constructing and continuously enriching the military domain knowledge graph and providing necessary knowledge support by computing entity embedding vectors. This process ensures the system's deep understanding of the military domain and accurate encoding of information.

The matching and retrieval module employs decision tree-based algorithms to achieve deep semantic matching with the knowledge graph, effectively retrieving and linking to the most
Copyright: © 2024 The Author(s); CC BY-NC 4.0. © 2024 IJETAA. All rights reserved.
Volume 1, Issue 2, International Journal of Emerging Technologies and Advanced Applications, Feb 2024
relevant information, ensuring that the answers provided to users are highly relevant to their queries. Once the necessary information is retrieved, the reply generation module begins its work, organizing the reply framework and utilizing a rich corpus for training to generate complete and accurate answers.

To provide more personalized services, the user interest modeling module analyzes users' historical topic interest preferences, enabling personalized information provision. Furthermore, the knowledge adjustment module relies on user feedback to promptly update and adjust the knowledge graph, ensuring the timeliness and accuracy of the system's content. The smooth operation of this entire process ensures that the military question-answering system can effectively meet users' queries for military information and provide high-quality personalized services.
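To make the division of labor among these modules concrete, the pipeline can be sketched in a few lines of Python. Every name and the toy logic below are illustrative assumptions, not an implementation from this paper: each module is reduced to a trivial stand-in so that the data flow (parse, retrieve, generate) is visible.

```python
from dataclasses import dataclass

# Hypothetical sketch of the question-answering pipeline described above.
# Real modules (entity linking, trained reply generation, user interest
# modeling) are replaced by trivial stand-ins.

@dataclass
class ParsedQuestion:
    question_type: str   # identified by the question parsing module
    entities: list       # key entities extracted from the query

QUESTION_WORDS = {"what", "who", "when", "where", "which", "how"}

def parse_question(query: str) -> ParsedQuestion:
    """Question parsing module: identify the question type and key entities."""
    qtype = "definition" if query.lower().startswith("what") else "open"
    # Toy entity extraction: capitalized tokens stand in for named entities.
    entities = [
        t.strip("?.,") for t in query.split()
        if t[:1].isupper() and t.lower().strip("?.,") not in QUESTION_WORDS
    ]
    return ParsedQuestion(qtype, entities)

def retrieve(parsed: ParsedQuestion, knowledge_graph: dict) -> list:
    """Matching and retrieval module: link entities to facts in the graph."""
    return [knowledge_graph[e] for e in parsed.entities if e in knowledge_graph]

def generate_reply(facts: list) -> str:
    """Reply generation module: organize retrieved facts into an answer."""
    return " ".join(facts) if facts else "No relevant information found."

# Minimal demonstration with a toy one-entry knowledge graph.
kg = {"T-34": "The T-34 was a Soviet medium tank of World War II."}
parsed = parse_question("What is the T-34?")
print(generate_reply(retrieve(parsed, kg)))
```

A production system would replace each stand-in with the components described above, such as semantic matching against the knowledge graph and corpus-trained reply generation.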
IV. RELATED WORK

A. Large Language Models

In 2018, OpenAI proposed the GPT (Generative Pre-Training Transformer) model [4], which uses the Decoder part of the Transformer [5] architecture with certain modifications made to the original Decoder. However, due to the difficulty of the generative direction, its performance was not as good as that of the BERT model [6] proposed by the Google team in the same year. Subsequently, the OpenAI team proposed GPT2 [7], which expanded the model parameters from 117 million to 1.5 billion and the training data from 5GB to 40GB compared to GPT. The larger model brought better results, and OpenAI shifted its focus to zero-shot learning. GPT2 demonstrated excellent performance in zero-shot learning but still had gaps compared to traditional models. In GPT3, OpenAI changed zero-shot learning to few-shot learning [8]. GPT3's parameter scale is over a hundred times that of GPT2, reaching an astonishing 175 billion, while the training data expanded a thousandfold to 45TB. From a performance perspective, GPT3 can generate news articles that are difficult for humans to distinguish from human-written ones. However, from the perspective of safety and related aspects, GPT3 still has numerous issues: it cannot guarantee the correctness of its output and may generate negative or even harmful information. In 2022, OpenAI combined RLHF (Reinforcement Learning from Human Feedback) [9] with GPT3 and proposed InstructGPT [10]. RLHF can help models better understand human instructions and ensure that the generated content is useful and harmless.

B. Prompt Tuning

When fine-tuning for downstream tasks, the gap between the downstream task objective and the pre-training objective may be too large, resulting in insignificant training effects. To address this, GPT3 introduced a fine-tuning paradigm called Prompt-Tuning [8].

So far, three Prompt techniques have been proposed and proven effective: In-Context Learning (ICL), Instruction Fine-tuning (IFT), and Chain-of-Thought (CoT). In May 2020, OpenAI first introduced the concept of In-Context Learning in GPT3: a small number of labeled samples is selected from the training set and task-relevant instruction templates are designed to guide the generation of corresponding results for test samples. However, this method suffers from high variance and instability. In October 2021, Google released FLAN [11] and proposed Instruction Fine-tuning. The data for IFT is typically a collection of human-written instructions and instruction instances guided by language models. These instruction data consist of three main components: instruction, input, and output; for a given instruction, there can be multiple input and output instances. To enhance the ability of large models to solve mathematical reasoning problems, in 2022 Google released LaMDA (137B) [12] and introduced the Chain-of-Thought mechanism. By providing the model with reasoning step prompts, the model learns to think and reason step by step like humans, enabling it to possess basic reasoning capabilities and ultimately solve simple or even relatively complex mathematical problems.

C. Generating Military Question-Answering Systems Using Knowledge Graphs

Currently, most information generation methods applied in military question-answering systems use intelligent algorithms to realize the mapping from concept models to simulation scenarios. For example, knowledge graph techniques [13] are used to complete the generation of simulation scenarios; by representing concepts of weapons and equipment and combat actions in the combat domain, a domain knowledge base is
semantic segments on the output are calculated to help users more efficiently modify the input to obtain the desired output. Specifically, this paper uses a weight calculation method based on keyword hits. The calculation method is as follows: for the i-th semantic segment of the long text input, keywords appearing in it are manually selected, and each occurrence of these keywords in the output is recorded as a hit. Define the total number of hits of the keywords of the i-th input semantic segment in the output text as Shot_i. In a long text input composed of n semantic segments, the output influence weight of the i-th input semantic segment is Shot_i / (Shot_1 + Shot_2 + ... + Shot_n), i.e., the fraction of all keyword hits contributed by the i-th segment. By sorting the obtained influence weights, the input semantic segments with the greatest influence on the output can be determined.
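The keyword-hit weighting described above can be sketched as follows. The segment names and keyword lists in the example are illustrative assumptions, not data from this paper, and substring counting is used as the simplest notion of a keyword hit; the exact matching rule may differ.

```python
# Hypothetical sketch of the keyword-hit influence weights described above.
# For each input semantic segment, manually chosen keywords are counted in
# the model output (Shot_i), then normalized over all segments.

def influence_weights(segment_keywords: dict, output_text: str) -> dict:
    """Return {segment_name: weight}, where weight = Shot_i / sum_j Shot_j."""
    text = output_text.lower()
    shots = {
        name: sum(text.count(kw.lower()) for kw in keywords)
        for name, keywords in segment_keywords.items()
    }
    total = sum(shots.values())
    if total == 0:
        return {name: 0.0 for name in shots}
    return {name: hits / total for name, hits in shots.items()}

# Toy example: two segments with assumed keyword lists.
segments = {
    "combat objectives": ["objective", "seize"],
    "force deployment": ["division", "deploy"],
}
output = "The objective was to seize the bridge before the enemy could react."
weights = influence_weights(segments, output)
# Sorting by weight reproduces the ranking step: the most influential
# input segment comes first.
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked[0])
```
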
VI. EXPERIMENT

Using the battle mentioned in the composition of the military question-answering system (Section III) as an example, the text involves parts such as the combat background, objectives, force composition, combat preparations, basic tactics of each party, combat plans, and combat actions, and the content of these components overlaps. To clarify the format of the military information text in the experiment, this paper defines the format before using the large language model to generate the military information text, summarizing the above parts into the following six parts: combat background, force deployment, combat objectives, combat plan, combat process, and combat results. The summarized mapping relationship is shown in Figure 4.
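The six-part format can be written down as a small schema. The dataclass below is an illustrative rendering, not code from this paper; only the six part names come from the format definition above, and the input/generated split follows the preprocessing rule used in the experiment.

```python
from dataclasses import dataclass

# Illustrative schema for the six standardized parts of a battle text.
# Only the six part names come from the paper's format definition.

@dataclass
class BattleText:
    combat_background: str
    force_deployment: str
    combat_objectives: str
    combat_plan: str
    combat_process: str
    combat_results: str

    def input_segments(self) -> dict:
        """The first three parts serve as model input; the plan, process,
        and results are what the large language model is asked to generate."""
        return {
            "combat background": self.combat_background,
            "force deployment": self.force_deployment,
            "combat objectives": self.combat_objectives,
        }
```
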
A. Experimental Data and Preprocessing

In terms of experimental data, this paper obtained 30 battle-related texts from forums. Since the combat plan, combat process, and combat results are the content to be generated by the large language model and cannot be used as input, these parts are removed during the preprocessing stage.

B. Experimental Process

First, the preprocessed text data is input into the Seqmodel for semantic segmentation. The text segmented by the Seqmodel is then input into the large language model to obtain the model's output results. Finally, the influence weights of the different input semantic segments on the generation of battle text are calculated according to the output influence weight algorithm defined in this paper.

C. Algorithm Effectiveness Verification

This paper verifies the effectiveness of the proposed algorithm by modifying the input semantic segments with higher influence weights. Specifically, new content is added to the input semantic segments with larger weights, and the changes in the model-generated results are observed.

If the generated results of the large language model do not contain content related to the added input information, it cannot be determined whether the added information in the input semantic segment is utilized by the large language model. This paper defines the modification of the input semantic segment in this case as an invalid modification.

If the generated results of the large language model only contain content related to the added input information but do not generate new content related to the battle but unrelated to the added information, this paper defines the modification of the input semantic segment in this case as a non-important modification.

If the generated results of the model not only contain content related to the added input information but also generate new content related to the battle but unrelated to the added information, this paper defines the modification of the input semantic segment in this case as an important modification.

If the modifications made to the input semantic segments with higher influence weights during the experiment are always important modifications, while the modifications made to the input semantic segments with lower influence weights are always non-important modifications or even invalid modifications, it can be considered that the algorithm proposed in this paper is effective.

VII. EXPERIMENTAL RESULTS

Through experiments on the 30 battle texts, the experimental results shown in Table I are obtained.

TABLE I
EXPERIMENTAL RESULTS

Semantic Segment Name    Influence Weight
Combat Objectives        47%
Force Deployment         38%
Combat Background        15%

The experimental results show that the two semantic segments of combat objectives and force deployment are relatively important when using large models to generate battle text. Scenario designers can improve the quality of the generated battle text by focusing on describing the combat objectives and force deployment parts.

This paper provides a comparison example for combat objectives. Figure 5 shows the generated results without a detailed description of the combat objectives semantic segment. Then, only the combat objectives semantic segment is modified, while the other input semantic segments remain unchanged; the generation effect is shown in Figure 6.

The generated results in Figure 6 not only contain content related to the added input information but also generate new content related to the battle but unrelated to the added information, such as electronic warfare, which aligns with the definition of an important modification given above.

VIII. CONCLUSION

This paper proposes a Prompt optimization method for large language models with long text input, aiming to help users better utilize large models to generate desired text by detecting the influence weights of different semantic segments in the input on the output of the large model. At the same time,