Learning Unit 1
1.1 Introduction
In this study unit, we will first provide an understanding of the concept of digitalisation of
the economy, the use of Big Data technology, and the role of data in an organisation. We
will then look at the types of data input, processing and output, and at typical processing
systems, and also learn about some of the methods used to process data into
information in an organisation.
Many organisations now make use of Big Data in their activities.
Wang, Ma, Yan, Chang and Zomaya (2018), in their
study on massive, large-region coverage, multi-temporal, multi-spectral remote sensing
(RS) datasets, postulated that these are employed widely due to the increasing
requirements for accurate and up-to-date information about resources and the
environment for regional and global monitoring. In general, RS data processing involves
a complex multi-stage processing sequence, which comprises several independent
processing steps according to the type of RS application.
Maxwell (2021) provides that while data on learning outcomes is essential for
monitoring improvement on those learning outcomes, other data are important for
interpreting the learning outcomes data. These other data include, for
example, student background and interests, student and parent expectations,
perceptions and satisfaction, school climate, teaching practices and interactions with
students, and student destinations and accomplishments. These data could also be
considered process data, since they reference teaching styles and teaching qualities.
However, since they involve reflection on the teaching after it has occurred, perhaps
they should be considered outcome data. He referenced Bernhardt’s (1998) claim
that what clearly separates successful from unsuccessful institutions of learning
(schools) is their use of data, saying that those who analyse and utilise information
about their institutional communities make better decisions about not only what to
change, but how to institutionalise systemic change.
Voithofer and Golan (2018) state that currently, the dominant form of learning with
technology through Learning Management Systems (LMS) can be found at all levels of
education. Accordingly, learning analytics, which is the process of collecting, organising
and analysing large amounts of e-learning data, can benefit administrative decision-making
and resource allocation, highlight an institution’s successes and challenges, and increase
organisational productivity. According to Maxwell (2021), attention should be directed
to some future challenges that may change not only the way we think about data use
to improve student learning but also the way schools (institutions of learning) may
operate.
More examples: each time a customer completes an order form, data about the inventory
items ordered is generated; and each time a potential employee completes a job
application form, data about the applicant is generated.
Example 1.1
Input of data (also called data entry) (b), processing of data (c), output (e) and storage
of data and information (d) will now be discussed in more detail.
Data captured in the CIS and not yet processed is called raw data (UNISA 2022).
Raw data has little or no value for the organisation in the decision-making process. It
only becomes valuable when processed (c) into information (e), which an organisation then
uses for decision making (f), as also explained in the example above about students’ use of
the LMS.
Linking to the above example of a customer order: having a list of all the sales in a specific
branch (raw data) is not useful in itself, but when this raw data is processed into information
(for example, total sales for a month compared to other months and branches in the
organisation), this information becomes useful and valuable. As a foundation, you must fully
understand the forms of structured and unstructured data discussed above in
sub-paragraphs 1.5.1 and 1.5.2 respectively.
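To make this concrete, the short Python sketch below (an illustration only, not part of the prescribed material; the branch names, dates and amounts are invented) shows how a list of individual sales transactions (raw data) could be summarised into monthly totals per branch (information).

    # Minimal sketch: turning raw sales transactions (raw data) into monthly totals
    # per branch (information). Branch names, dates and amounts are invented.
    from collections import defaultdict

    raw_sales = [                       # raw data: one record per individual sale
        {"branch": "Pretoria", "date": "2024-01-15", "amount": 1200.00},
        {"branch": "Pretoria", "date": "2024-02-03", "amount": 850.50},
        {"branch": "Durban",   "date": "2024-01-20", "amount": 990.00},
    ]

    totals = defaultdict(float)         # (branch, month) -> total sales
    for sale in raw_sales:
        month = sale["date"][:7]        # e.g. "2024-01"
        totals[(sale["branch"], month)] += sale["amount"]

    for (branch, month), total in sorted(totals.items()):
        print(f"{branch} {month}: total sales R{total:.2f}")   # information used for decision making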
Because information is obtained by processing raw data, it is essential that
the raw data captured in the CIS should be accurate, complete, reliable and
verifiable (UNISA 2022).
The quality of information is directly linked to the quality of raw data entered or captured
– in other words, inaccurate data captured or used will lead to inaccurate information,
and incomplete data will lead to incomplete information, which in turn will result in
ineffective decisions. The principle of “garbage-in-garbage-out” is especially true in a
CIS. Many organisations have embarked on expensive data cleaning projects (correcting
and completing data) because they understand the impact of inadequate data on
information and decision making.
NOTE
Many medium and small organisations still use paper source documents (e.g., invoices,
application forms, etc) to collect data. Most large organisations, however, have moved
to electronic source documents by capturing data directly through computer data entry
screens or by using barcode scanners. In topic 7 (Pastel) you will practise using
computer data entry screens to capture invoices and other source documents.
These documents are, first and foremost, important because they serve as physical
evidence that a financial transaction has actually occurred. Nowadays, these documents
do not necessarily need to be a physical hard copy; they may be in a traceable electronic
form.
These documents are also essential because when companies undergo an audit, the
auditor’s access to a clear and accessible paper trail of all transactions enhances the
overall legitimacy and independence of the audit. In order to reaffirm the accuracy of the
company’s balances in individual accounts, auditors need full access to all the
documents. An optical scanner reads handwritten numbers and letters and transfers
this information to an online computer, where it is automatically placed on tape, ready for
computer processing. This data entry method is extremely accurate and fast. A
comprehensive computer edit program is used to check the presence of required data
as well as its validity, compatibility and arithmetic accuracy. The time and cost savings,
together with information that can be related to the auditing process, have secured the
support and participation of management (UNISA 2022).
1.6.1.3 Source documents for running a business more smoothly and enhancing
transparency
All of a business’s source documents should be kept and stored for future reference.
1.6.2 Characteristics of a good source document
(i) It serves as a good internal control and provides evidence that a transaction
occurred.
(ii) It should capture the key information about a transaction, such as the names of
the parties involved, amounts paid (if any), the date, and the substance of the
transaction.
(iii) It should describe the basic facts of the transaction, such as the date, the amount,
the purpose, and all parties involved in the transaction.
(iv) They are frequently identified with a unique number, so that they can be
differentiated in the accounting system.
(v) Nowadays, these documents do not necessarily need to be a physical hard copy
– they may be in a traceable electronic form.
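As a simple illustration only (not part of the prescribed material; the field names and sample values are assumptions), the Python sketch below shows how the key facts listed above could be captured as a structured electronic source document record.

    # Minimal sketch of an electronic source document record capturing the
    # characteristics listed above; field names and sample values are illustrative.
    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class SourceDocument:
        document_number: str    # unique number so it can be differentiated in the accounting system
        doc_date: date          # date of the transaction
        parties: tuple          # names of the parties involved
        amount: float           # amount paid (if any)
        description: str        # purpose/substance of the transaction

    invoice = SourceDocument(
        document_number="INV-0001",
        doc_date=date(2024, 3, 1),
        parties=("ABC Traders", "XYZ Supplies"),
        amount=1500.00,
        description="Purchase of stationery",
    )
    print(invoice)    # a traceable electronic record of the transaction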
1.6.4 Migration
Migration is the process of moving data from one platform/format to another
platform/format. It involves migrating data from a legacy system to the new system
without impacting active applications, and finally redirecting all input/output activity to
the new device. In many cases, data is extracted from several different databases, then
managed and finally stored in yet another database. Thus, large amounts of data are
being managed by databases and applications in companies today (UNISA 2022).
Sarmah (2018) mentions that best practices must be followed to avoid migration being
very expensive and incurring hidden costs not identified at the early stage. According
to Sarmah (2018), data needs to be transportable from physical and virtual
environments for concepts such as virtualisation. Thus, to make clean and accurate data
available for consumption, a data migration strategy should be designed effectively, such
that it enables an organisation to ensure that tomorrow’s purchasing decisions fully
meet both present and future business needs and render maximum return on
investment. Migrating data can therefore be a complex process, during which testing must
be conducted to ensure the quality of the data, taking into account testing scenarios
and the accompanying risks. In simple terms, it is a process of bringing data from
various source systems into a single target system (UNISA 2022).
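As a simplified illustration of this idea (a sketch only; the database, table and column names are assumptions, not a prescribed method), the Python code below extracts customer records from a legacy store, cleans them, loads them into the target system and then verifies the record counts.

    # Minimal data migration sketch: extract customer records from a legacy store,
    # clean/transform them, load them into the target system and verify record counts.
    # Database, table and column names are hypothetical, for illustration only.
    import sqlite3

    legacy = sqlite3.connect(":memory:")     # stands in for the legacy system
    target = sqlite3.connect(":memory:")     # stands in for the new system

    legacy.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    legacy.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                       [(1, "  acme traders ", "Info@Acme.example"), (2, "Zulu Stores", None)])

    target.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

    rows = legacy.execute("SELECT id, name, email FROM customer").fetchall()        # extract
    cleaned = [(cid, name.strip().title(), (email or "").lower())                   # transform/clean
               for cid, name, email in rows]
    target.executemany("INSERT INTO customer VALUES (?, ?, ?)", cleaned)            # load
    target.commit()

    # Verify: record counts in the legacy and target systems should agree after migration.
    src = legacy.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    tgt = target.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    assert src == tgt, "Migration incomplete: record counts differ"
    print(f"Migrated {tgt} customer records")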
a. Schema migration:
It may be necessary to move from one database vendor to another, or to upgrade the
version of database software being used. A software upgrade is less likely to require a
physical data migration, but in major upgrades a physical data migration may be
necessary. In that case, a physical transformation process may be required due to a
possible significant change in the underlying data format. If so, behaviour in the
applications layer may not be affected, unless the data manipulation language or protocol
has changed. However, modern applications are written to be agnostic to the database
technology, so that a change from Sybase, MySQL, DB2 or SQL Server to Oracle should
only require a testing cycle to confirm that both functional and non-functional performance
have not been adversely affected.
b. Application migration:
Changing application vendors (for example, moving to a new CRM or ERP platform)
inevitably entails a substantial transformation, as almost every application or suite
operates on its own specific data model and also interacts with other applications and
systems within the enterprise application integration environment. Furthermore, to allow
the application to be sold to the widest possible market, commercial off-the-shelf
packages are generally configured for each customer using metadata. Application
programming interfaces (APIs) may be supplied by vendors to protect the integrity of the
data they have to handle.
c. Legacy migration:
Legacy migration can be classified into well-defined interfaces, applications, and
database services. Its strategies are easy to apply, fast to implement, and can be widely
applied to industry software projects. However, it is very difficult to incorporate legacy
systems with newer systems, such as open-source operating systems, because of the
non-extensibility, incompatibility and limited openness of the underlying hardware and
software of the legacy systems. Its life cycle includes the following procedures:
Before migration: (i) plan, assess and prepare; (ii) assess hardware, software and network
readiness and plan for the future; (iii) clean up by eliminating useless data, consolidating
resources and monitoring everything.
During migration: (i) prototype, pilot and deploy the migration; (ii) use powerful database
modelling to simulate the migration, resolving issues before committing; (iii) track the
migration.
After migration: maintain and manage the new environment.
d. Business process migration: Organisations can choose from several strategies,
which depend on the project requirements and available resources. Examples of such
migration drivers are mergers and acquisitions, business optimisation and reorganisation
to attack new markets or respond to competitive threats (UNISA 2022).
1.7 System integrating management information across the entire enterprise
Companies are always seeking to become nimbler in their operations and more
innovative with their data analysis and decision-making processes. They are realising
that time lost in these processes can lead to missed business opportunities. In principle,
the core Big Data challenge is for companies to gain the ability to analyse and
understand internet-scale information just as easily as they can now analyse and
understand smaller volumes of structured information. A lack of access and/or timely
access therefore reduces the organisation’s ability to make timely and effective
decisions. Many organisations have realised that timely access to appropriate, accurate
data and information is the key to success or failure. Therefore, for an organisation to
be competitive, data must be collected, processed into useful information, stored and
used in decision making (Li, Jiang & Zomaya 2017). Accordingly, organisations (and
individuals) usually make better informed decisions (f) if they have access to more data
and valuable information.
Klaus, Rosemann, & Gable (2000) pointed out that many organisations prefer to use
one computer system that can be used throughout the organisation by integrating all
the functions. The integration of transaction-oriented data and business process
throughout an organisation is enabled by enterprise systems. These enterprise systems
are commercial software packages and include Enterprise Resource Planning (ERP)
and related packages, such as software for advanced planning and scheduling, sales force
automation and product configuration, among others. Khoualdi and Basahel (2014)
explain that an ERP is a system integrating management information through the
management of the flow of data across the entire enterprise. According to Sadyrin,
Syrovatskay and Leonova (2021), the use of integral systems for interaction with
customers (CRM systems) and resource management and planning systems (ERP
systems) is common business practice for many companies, but at the same time,
modern management faces new tasks that are associated not only with the storage,
analysis, assessment, verification and protection of very large amounts of data, but also
with the need to use new data types in the development of control solutions. This
digitalisation of the economy and the gradual transition to a new technological order
open up new opportunities and prospects in the use of digital and information
technologies in analytical work. In many cases, it is necessary to use both structured
and unstructured data, the sources of which can be a variety of technical and electronic
devices. In this case, the data can have completely different formats.
Also refer to AIN1501 study unit 2 (UNISA 2022) for the AIS
implementation in an ERP environment and decision-making, including
advantages and disadvantages of an ERP system and the relationship
between SAP and ERP software.
1.7.1 Capturing
Data is first captured on paper (hard copy) or electronic (soft copy) source documents.
The electronic source documents referred to are documents created outside of the
organisation’s CISs. For example, the organisation receives an electronic file from one
of its trading partners, containing a batch of electronic source documents. The
organisation’s processes will determine the frequency of capturing the data – for
example, employee data may only be captured once a month before the payroll run,
supplier invoices may be captured twice a week and goods received notes may be
captured daily. Thus, data can be captured through batch input or online input.
Batch input is used where a huge number of similar source documents must be
captured, and up-to-date data and information are only required at the same frequency
as the data capturing. An example of batch input is the Unisa main campus receiving
completed MCQ mark sheets from Unisa regional offices and through the post; these
completed mark sheets are batched together and captured daily into Unisa’s CIS. Batch
input involves similar source documents being grouped together (a batch) and then
entered into the CIS periodically, say, daily, weekly or monthly. It requires additional batch
controls and procedures to be implemented in the organisation’s control environment
(UNISA 2022).
You will learn more about these controls and procedures in auditing (AUE2601).
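The Python sketch below is an illustration only (the document numbers, marks and control figures are invented): it shows the basic idea of batch input with two simple batch controls, a record count and a hash total, which are compared with the control figures supplied with the batch before the batch is accepted into the CIS.

    # Minimal batch input sketch with two common batch controls: a record count and a
    # hash total of the document numbers. All figures are invented for illustration.
    batch = [                                   # a batch of similar source documents
        {"doc_no": 1001, "marks": 85},
        {"doc_no": 1002, "marks": 72},
        {"doc_no": 1003, "marks": 64},
    ]
    control_record_count = 3                    # control figures captured with the batch
    control_hash_total = 1001 + 1002 + 1003

    record_count = len(batch)                   # recalculated by the CIS
    hash_total = sum(doc["doc_no"] for doc in batch)

    if record_count == control_record_count and hash_total == control_hash_total:
        print("Batch accepted for periodic processing")       # batch is entered into the CIS
    else:
        print("Batch rejected: control totals do not agree")  # investigate before capturing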
1.7.2 Online input
According to Souza, Silva, Coutinho, Valduriez and Mattoso (2016), when data reduction
is accomplished online (i.e., without requiring users to stop execution to reduce the data
and resume execution), it can save much time, and user interactions can be integrated
within workflow execution. Thus, reducing input data is a powerful way to reduce overall
execution time in such workflows. However, because data is captured directly and
immediately, any corrections to the data must also be made immediately in order for the
data capturing process to be completed.
Take the example of a supermarket pay point, where barcode scanners and terminals
for online inputting are used when customers buy inventory items: the inventory item is
scanned (captured) at the point of sale (pay point). There may be cases where the
barcode of an item is not recognised by the CIS (an error). As a result, the person at the
till has to enter the correct barcode manually (a correction) in order to complete the
transaction.
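A minimal Python sketch of this pay-point scenario follows (the barcodes and prices are invented, and the price list stands in for the CIS master data): the scan is captured directly, and if the barcode is not recognised, the cashier is prompted to enter it manually before the transaction can be completed.

    # Minimal online input sketch: data is captured, and corrected, immediately at the
    # point of sale. Barcodes and prices are invented for illustration.
    price_list = {"6001234500017": 12.99, "6009876500012": 35.50}   # stands in for CIS master data

    def capture_sale(scanned_barcode: str) -> float:
        barcode = scanned_barcode
        while barcode not in price_list:        # barcode not recognised by the CIS (error)
            barcode = input("Barcode not recognised, enter it manually: ")   # immediate correction
        price = price_list[barcode]
        print(f"Item {barcode} captured at R{price:.2f}")   # the CIS data is up to date at once
        return price

    capture_sale("6001234500017")   # a recognised barcode; an unknown one would prompt the cashier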
1.7.2.1 Advantages
The main advantage of using online input is that the data in the CIS is always up to
date. (Please note that although the raw data may be up to date, the information in the
CIS may not be up to date as the data may not be processed yet. Refer to section
1.4.2.)
1.7.2.2 Disadvantages
A disadvantage is that online inputting is more costly because hardware is required
to capture the data at each point where the activity to be captured takes place. Another
major problem, according to Souza, Silva, Coutinho, Valduriez and Mattoso (2016), is
to determine which subset of the input data in a scientific workflow system should be
removed. This brings related problems, such as guaranteeing that the workflow
system keeps execution and data consistent after the reduction, and keeping track
of how users interacted with the execution. According to Carvalho, Essawy, Garijo,
Medeiros and Gil (2017), scientific workflow systems are not designed to help
scientists create and track the many related workflows that they build as variants, trying
different software implementations and distinct ways to process data and deciding what
to do next by looking at previous workflow results. During a scientific workflow
execution, a major challenge is how to manage the large volume of data to be
processed, which is even more complex in cloud computing where all resources are
configurable in a pay-per-use model (UNISA 2022).
For raw data to become information, it must be processed (c) (i.e., data
processing) (UNISA 2022).
1.8.1 The main directions in the analysis and processing of big data
Jelonek (2017) states that the term "big data" refers to datasets whose size exceeds the
capabilities of typical databases for entering, storing, managing and analysing
information. Similarly, in a study evaluating the use of Big Data technologies in Russian
financial institutions, Bataev (2018) notes that data growth occurs throughout the world
and that the Russian Federation is no exception: in 2017 the data volume reached
580 exabytes, and this figure reached about 980 exabytes in 2020. According to
Jelonek (2017), the term
suggests something more than just an analysis of huge amounts of information. The
question is not that organisations create huge amounts of data, but that most of these
data are presented in formats that do not conform to the traditional structured database
format – they are weblogs, video recordings, text documents, machine code or, for
example, geospatial data. All this is stored in a variety of different repositories,
sometimes even outside the organisation. As a result, corporations have access to a
huge volume of their data and do not have the tools necessary to establish relationships
between this data and draw meaningful conclusions based on them. If we add that the
data is updated more and more often, it turns out that traditional methods of analysing
information cannot keep up with the huge amount of constantly updated data, which
eventually opens the way for big data technologies.
In-network processing
Running queries involves a high network overhead because data has to be exchanged
between cluster nodes and hence, the network becomes a critical part of the system.
To avoid the network bottleneck, it is essential for distributed data processing systems
(DDPS) to be aware of the network rather than treating it as a black box. Thus, query
throughput in a DDPS can significantly be improved by performing partial data reduction
within the network. Therefore, in-network processing was proposed as a way of
achieving network awareness to decrease bandwidth usage by custom routing,
redundancy elimination, and on-path data reduction, thus increasing the query
throughput of a DDPS. The challenges of an in-network processing system range from
design issues, such as performance and transparency, to the integration with query
optimisation and deployment in data centres. These challenges have been formulated as
possible research directions, together with a prototype implementation (UNISA 2022).
Near-Data processing
According to Vinçon, Koch and Petrov (2019), Near-Data Processing refers to an
architectural hardware and software paradigm, based on the co-location of storage and
compute units. Ideally, it will allow for the execution of application-defined data- or
compute-intensive operations in-situ, i.e., within (or close to) the physical data storage.
Thus, Near-Data Processing seeks to minimise expensive data movement, improving
performance, scalability, and resource-efficiency. Processing-in-Memory is a sub-class
of Near-Data Processing that targets data processing directly within memory (DRAM)
chips. The effective use of Near-Data Processing mandates new architectures,
algorithms, interfaces, and development toolchains.
Scientific simulations
The overall goal of Wang (2015), in his study, was to provide high-performance data
management and data processing support on array-based scientific data, targeting data-
intensive applications and various scientific array storages. In this regard, he believed
that such high-performance support can significantly reduce the prohibitively expensive
costs of data translation, data transfer, data ingestion, data integration, data processing
and data storage involved in many scientific applications, leading to better performance,
ease of use, and responsiveness. According to him, scientific simulations were being
performed at finer temporal and spatial scales, leading to an explosion of the output data
(mostly in array-based formats), and challenges in effectively storing, managing,
querying, disseminating, analysing, and visualising these datasets. Many paradigms
and tools used for large-scale scientific data management and data processing were
often too heavy-weight and had inherent limitations, making it extremely hard to cope
with the "Big Data" challenges in a variety of scientific domains. In contrast to offline
processing, implementation of high-performance data management and data
processing support on array-based scientific data, could avoid, either completely or to a
very large extent, both data transfer and data storage costs.
Big Data in financial analysis
According to Ziora (2015), Big Data is a term connected with an analysis of all the
aspects of huge volumes of data, which can be also conducted in real time.
Technologies of data gathering, data processing, presentation and making available of
information are used in a novel way by e-entrepreneurship in order to create new
business ventures, distribute information and to cooperate with customers and
partners. The basic reasons why organisations implement big data solutions are to gain
competitive advantage and to optimise business processes. Sadyrin,
Syrovatskay and Leonova (2021) in their article discuss the promising directions of
using big data in financial analysis procedures, which, being integrated into the system
of forming various management decisions, can significantly increase their efficiency.
They posit that the ongoing digital transformation of the national economy inevitably
sets the task of introducing digital technologies and tools into the practice of economic
activities of organisations and enterprises.
One of the areas of digitalisation is Big Data technology, which in many ways is already
being used in the field of finance. However, for the effective use of big data in the
practice of financial analysis of economic activity, it is necessary to solve a variety of
significant problems. This requires consideration of the necessary elements of the
financial analysis system, which, first of all, should be focused on the use of big data,
as well as of the aspects that, when using digital technologies, can provide the maximum
effect. An analysis of the current methodological and procedural framework for financial
analysis of economic activity shows that it is far from fully consistent with modern
economic, informational, and technological realities, since it was developed long before
the start of digital transitions. In previous years, the main direction of improving financial
analysis was the automation of accounting and analytical procedures based on the use
of computers and various software products. To improve financial analysis, there is
currently a fairly large number of software and hardware tools, the scale of which can
range from a separate accounting automation program to complex management
systems for large international companies. In Russia, one of the leaders in the use of
big data is the financial institutions, which have accumulated a huge amount of
information, in particular on the client base that needs further processing. According to
Sadyrin, Syrovatskay and Leonova (2021), information and analytical systems of
financial analysis should rely on opening up opportunities. One of the most promising
areas of improving financial analysis and increasing its efficiency is the use of various
Big Data technologies to develop new methodological foundations of financial analysis
in order to form effective technological solutions for specific tasks.
The field of applying Big Data technologies is relatively new and involves a large number
of issues and problems.
Business intelligence
As cited by Ziora (2015), Ohlhorst mentions that Big Data solutions can include such
analytics concepts and technologies as traditional Business Intelligence (BI), which he
describes as "a broad category of applications and technologies for gathering, storing,
analysing, and providing access to data. BI delivers actionable information which helps
enterprise users make better business decisions, using fact-based support systems. It
allows for conducting in-depth analysis of detailed business data, provided by databases,
application data, and other tangible data sources". Business intelligence solutions can
improve decision making by accelerating the decision-making process at all levels of
management and increasing its efficiency and efficacy.
Data mining
Data mining is a process in which data is analysed from different perspectives and then
turned into summary data that are useful. Data mining is normally used with data at rest
or with archival data. Data mining techniques focus on modelling and knowledge
discovery for predictive, rather than purely descriptive, purposes such as uncovering
new patterns from large data sets. Big Data can be applied in hybrid systems as well
and accelerated decision-making and efficient enterprise management support can be
achieved by deploying the right methods and techniques of data mining (UNISA 2022).
Statistical applications
Statistical applications look at data using algorithms based on statistical principles and
normally concentrate on data sets related to polls, censuses and other static data sets.
They deliver sample observations that can be used to study populated data sets for the
purpose of estimating, testing, and predictive analysis. Empirical data, such as surveys
and experimental reporting, are the primary sources for analysable information.
Predictive analysis is a subset of statistical applications in which data sets are examined
to come up with predictions, based on trends and information gleaned from databases
(UNISA 2022).
Data modelling
Another form of big data technology also exists, namely data marts and data
warehouses, which are frequently key components of business intelligence systems.
Other pros of big data application include the following: the recommendation engine,
which allows online retailers to match and recommend users to one another or to products
and services based on analysis of user profile and behavioural data; and sentiment
analysis, where advanced text analytics tools, used in conjunction with Hadoop, analyse
the unstructured text of social media and social networking posts (UNISA 2022).
Risk modelling
Risk modelling allows for analysis of large volumes of transactional data to determine
risk and exposure of financial assets, to prepare for potential “what-if” scenarios based
on simulated market behaviour, and to score potential clients for risk. Related applications
include: fraud detection, where big data techniques are used to combine customer
behaviour, historical and transactional data to detect fraudulent activity; customer churn
analysis, where enterprises use Hadoop and big data technologies to analyse customer
behaviour data to identify patterns that indicate which customers are most likely to leave
for a competing vendor or service; social graph analysis, which helps enterprises
determine their “most important” customers; and customer experience analytics, which
allows for integration of data from previous customer interaction channels, such as call
centres, online chat, etc, to gain a complete view of the customer experience (UNISA 2022).
Network monitoring
Logical calculations include comparing the data to other data or calculations – for
example, is the data the same (=), not the same (<>), greater than (>), smaller than (<),
greater than or equal to (>=) or smaller than or equal to (<=)? The logical calculation will
have a true or false answer, and based on this answer, further processing will take place.
The IF function is one of various Microsoft Excel functions that can be used for logical
calculations. See study unit 5 in which the IF function is explained in detail.
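As a small illustration (the pass mark of 50 is an assumed value), the Python sketch below performs the same kind of logical calculation as an Excel IF function such as =IF(A1>=50,"Pass","Fail"): the comparison gives a true or false answer, and further processing depends on that answer.

    # Minimal sketch of a logical calculation: the comparison returns true or false,
    # and further processing is based on that answer. The pass mark of 50 is assumed.
    mark = 64

    is_pass = mark >= 50          # logical calculation: greater than or equal to (>=)
    if is_pass:                   # further processing based on the true/false answer
        result = "Pass"
    else:
        result = "Fail"

    print(mark, result)           # comparable to the Excel formula =IF(A1>=50,"Pass","Fail")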
The main drawback of this method of processing is that the master files are only up to
date once the transaction files have been processed (updated to the master files). When
using this method of processing, users of the data and information must be aware of
which transaction files have been updated and which have not, and therefore of how up
to date the information they are looking at is. An example of batch processing is the marking
(processing) of all the captured AIN2601 Assignment 1 QUIZ answers on a specific
date. Students would therefore need to be aware that year marks will only be updated
after the assignment marks have been processed. Transaction files and master files
are explained in study unit 2.
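To illustrate this (with invented student numbers and marks), the Python sketch below updates a year-mark master file from a transaction file of captured assignment marks in a single periodic run; until that run happens, the master file is not up to date.

    # Minimal batch processing sketch: the transaction file is only updated to the
    # master file periodically. Student numbers and marks are invented.
    master_file = {"20241001": 0.0, "20241002": 0.0}          # year marks (master data)

    transaction_file = [                                      # captured assignment marks
        {"student": "20241001", "year_mark_contribution": 12.5},
        {"student": "20241002", "year_mark_contribution": 10.0},
    ]

    def run_batch_update():
        """Periodic run (e.g. on the marking date) that updates the master file."""
        for txn in transaction_file:
            master_file[txn["student"]] += txn["year_mark_contribution"]
        transaction_file.clear()                              # the batch has now been processed

    # Before run_batch_update() is executed, the master file is not yet up to date.
    run_batch_update()
    print(master_file)                                        # updated year marks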
The immediate update of the transaction files to the master files as the
transaction occurs is called real-time processing (UNISA 2022).
An example of real-time processing is buying a movie ticket at a movie theatre. The movie
you want to see and the number of tickets (data) entered (either by yourself at the ticket
machine or by the salesperson) are immediately updated to the master files (the seating
plan) so that the same seat cannot be sold to two different persons.
Real-time processing ensures that the master files are always up to date, and this is
also the greatest advantage of this method of processing. Transaction files and master
files are explained in study unit 2.
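A minimal Python sketch of the movie-ticket example follows (the seat numbers and customers are invented): the master file (the seating plan) is updated the moment the transaction occurs, so the same seat cannot be sold to a second customer.

    # Minimal real-time processing sketch: the master file (seating plan) is updated
    # immediately as each transaction occurs. Seats and customers are invented.
    seating_plan = {"A1": None, "A2": None}       # master file: seat -> ticket holder

    def sell_ticket(seat: str, customer: str) -> bool:
        if seating_plan.get(seat) is not None:    # seat already sold: reject immediately
            print(f"Seat {seat} is no longer available")
            return False
        seating_plan[seat] = customer             # immediate update of the master file
        print(f"Seat {seat} sold to {customer}")
        return True

    sell_ticket("A1", "Customer 1")               # succeeds
    sell_ticket("A1", "Customer 2")               # rejected, because the master file is already up to date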
1.9 Output/information
Processed data becomes information, and this information can be retrieved by users
through batch output or interactive output.
Batch output occurs when all requests for information (i.e., reports, queries,
etc) are batched together and periodically extracted from the CIS (UNISA
2022).
Since requests are batched before being extracted, users have to wait to receive their
required outputs. Batch output is often used for routine reports that must be extracted
at the same time each day, week or month (e.g., sales reports for the day, week or
month). These batched reports are pre-specified and include the same parameters
each time.
A bank generating monthly bank statements for clients is an example of batch output, as
the bank will extract the required information once a month on a specific date from their
CIS; i.e., all clients' bank statements are extracted in one batch.
The benefit of using batch output is that reports are consistent between periods.
For example, the sales reports will include the same branches in the different
geographical areas each time the specific report is extracted. Another benefit of batch
output is that the extraction of reports can be scheduled over down times (over weekends,
evenings, etc), thereby optimising computer resources.
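A small Python sketch of the bank-statement example follows (account numbers, dates and amounts are invented): all clients’ statements for the month are extracted from the stored transactions in one periodic batch run.

    # Minimal batch output sketch: every client's monthly statement is extracted in one
    # periodic run. Account numbers, dates and amounts are invented for illustration.
    transactions = [
        {"account": "1001", "date": "2024-03-05", "amount": -250.00},
        {"account": "1001", "date": "2024-03-20", "amount": 1000.00},
        {"account": "1002", "date": "2024-03-11", "amount": -75.50},
    ]

    def run_monthly_statement_batch(month: str):
        """Periodic batch run that produces a statement for every client at once."""
        for account in sorted({t["account"] for t in transactions}):
            lines = [t for t in transactions
                     if t["account"] == account and t["date"].startswith(month)]
            net = sum(t["amount"] for t in lines)
            print(f"Statement for account {account} ({month}): "
                  f"{len(lines)} transactions, net movement R{net:.2f}")

    run_monthly_statement_batch("2024-03")        # e.g. run once a month on a specific date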
Interactive output occurs when users are directly connected to the CIS and
can request certain information and receive it immediately (UNISA 2022).
Using internet banking and viewing your transactions for a month or a period specified
by you is an example of interactive output.
The main benefit of using this method of output is that users can immediately receive
information for decision making. One of the drawbacks of interactive output is that
the computer resources are not optimally used and as a result, the performance of the
CIS may be negatively affected. For example, at month-end, numerous users extract
reports from the CIS while day-to-day transaction processing still continues. This
increase in use of the CIS can make the users experience a “very slow” CIS response
time (i.e., the computer is “slow”).
FIGURE 1.2: Batch input, batch processing and batch output (UNISA 2022)
For example, twice a week, the gym partner of a medical aid provides the medical aid
with a batch of electronic source documents in one file. This file contains the number of
times each member of the medical aid visited the gym. The batch source documents
contained in the electronic file are imported when received. The transaction files
containing the gym data are updated every Saturday. The members can view their
information on the medical aid’s secure website as soon as the transaction file has
been updated.
For example, each branch of an organisation enters its request for inventory online as
needed. The transaction file containing the different branches’ requests is updated to
the master file every two days. The branch manager can extract order information
directly from the operational system.
Online input, real-time processing and interactive output
Please note that an organisation’s CIS is not limited to one type of processing system.
Many organisations will use all the types of input, processing and output. The type used
will be determined by the activity performed, i.e., an organisation can use both batch
input and online input. Take for example the capturing of Unisa QUIZ assignment
answers. The answers submitted by students through the myUnisa interface are online
input, while the physical answer sheets are captured using batch input at the Unisa main
campus.
Can you think of an example where a CIS uses both real-time processing and batch
processing? Pastel Partner (per the standard set up) will use batch processing for
invoices but will use real-time processing for goods received notes (GRNs). You will learn
more about Pastel Partner in topic 7. Refer back to this study unit after you have
completed topic 7.
Banks use both batch output (printing of monthly bank statements) and interactive
output (request of a mini statement at an ATM).
Activity 1.1
Go to Discussion Forum 1.1 and discuss this with your fellow students.
Historically, computer systems’ data and information were stored in a flat file
environment, where files are not related to one another and the users of data and
information each keep their own data and information (UNISA 2022).
This is similar to an environment in which users each have their own Microsoft Excel
spreadsheet and do not share the data and information on their individual
spreadsheets. As computer systems evolved and the need arose for users to share data
and information, databases became the preferred method of storing data and
information. The database environment will be discussed in detail in study unit 2.
Simplistically, the flat file and database environment can be visualised as in figure 1.4
(flat file) and figure 1.5 (database).
FIGURE 1.4: Flat file environment (UNISA 2022)
FIGURE 1.5: Sharing data and information in a database environment (UNISA
2022)
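As a rough illustration only (the file, database and table names are assumptions), the Python sketch below contrasts the two environments: in a flat file environment each user keeps their own spreadsheet-like file, whereas in a database environment all users read and write one shared store.

    # Rough sketch contrasting a flat file environment (each user's own file) with a
    # shared database environment. File, database and table names are assumptions.
    import csv
    import sqlite3

    # Flat file environment: each user keeps their own, unrelated file.
    with open("sales_user_a.csv", "w", newline="") as f:
        csv.writer(f).writerows([["item", "amount"], ["Widget", 100]])

    # Database environment: one shared store that all users query and update.
    db = sqlite3.connect("shared.db")
    db.execute("CREATE TABLE IF NOT EXISTS sales (item TEXT, amount REAL)")
    db.execute("INSERT INTO sales VALUES (?, ?)", ("Widget", 100))
    db.commit()

    shared_total = db.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
    print("Total visible to all database users:", shared_total)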
1.12 Summary
In this study unit, we looked at typical processing systems based on the types of data input,
processing and output. We also gained an understanding of how data is processed into
information by sorting, classifying, calculating, summarising and transforming it. In the next
study unit, we will examine in detail the database environment used to store data and
information.
REFERENCES:
Amin, R., Vadlamudi, S. & Rahaman, M.M. (2021). Opportunities and challenges of data
migration in cloud. Engineering International, 9(1), pp. 41-50.
Elfner, E.S. (1979). A Comparative Study of Alternative Techniques for Analyzing Student
Outcomes.
Artamonov, A., Ionkina, K., Tretyakov, E. & Timofeev, A. (2018). Electronic document
processing operating map development for the implementation of the data
management system in a scientific organization. Procedia computer science, 145, pp.
248-253.
Azeroual, O. & Jha, M. (2021). Without data quality, there is no data migration. Big Data
and Cognitive Computing, 5(2), p.24
Bataev, A.V. (2018, September). Evaluation of Using Big Data Technologies in Russian
Financial Institutions. In 2018 IEEE International Conference "Quality Management,
Transport and Information Security, Information Technologies" (IT&QM&IS) (pp. 573-577).
IEEE.
Bernhardt, V. L. (1998). Data analysis for comprehensive schoolwide improvement. Eye on
Education, p. 1.
Carvalho, L.A.M.C., Essawy, B.T., Garijo, D., Medeiros, C.B. & Gil, Y. (2017).
Requirements for supporting the iterative exploration of scientific workflow variants.
In Proceedings of the Workshop on Capturing Scientific Knowledge (SciKnow),
Austin, Texas (Vol. 2017).
Fisher, D. & Frey, N. (2015). Show & Tell: A Video Column/Don't Just Gather Data--Use
It. Educational Leadership, 73(3), pp.80-81.
Jelonek, D. (2017). Big Data Analytics in the Management of Business. In MATEC Web of
Conferences (Vol. 125, p. 04021). EDP Sciences.
Johnson, M.D. & Bull, S. (2015). Designing for visualisation of formative information on
learning. In Measuring and Visualizing Learning in the Information-Rich Classroom (pp. 237-
250).
Latt, W.Z, (2019). Data migration process strategies (Doctoral dissertation, MERAL Portal).
Li, K.C., Jiang, H. & Zomaya, A.Y. (eds). (2017). Big data management and processing.
CRC Press.
www.accountingtools.com/articles/what-is-a-source-docu… (accessed 19/10/2022).
Maxwell, G.S. (2021). Different Approaches to Data Use. In Using Data to Improve Student
Learning (pp. 11-71). Springer, Cham.
Maxwell, G.S. (2021). Collecting Data and Creating Databases. In Using Data to Improve
Student Learning (pp. 113-141). Springer, Cham.
Sadyrin, I., Syrovatskay, O. & Leonova, O. (2021). Prospects for using big data in financial
analysis. In SHS Web of Conferences (Vol. 110). EDP Sciences.
Sakr, S. & Gaber, M. (eds). (2014). Large scale and big data: Processing and
management. Crc Press.
Sarmah, S.S. (2018). Data migration. Science and Technology, 8(1), pp.1-10.
Souza, R., Silva, V., Coutinho, A.L., Valduriez, P. & Mattoso, M. (2016, November). Online
input data reduction in scientific workflows. In WORKS: Workflows in Support of Large-
scale Science.
Souza, R., Silva, V., Coutinho, A.L., Valduriez, P. & Mattoso, M., (2020). Data reduction in
scientific workflows using provenance monitoring and user steering. Future
Generation Computer Systems, 110, pp. 481-501.
UNISA. (2022). Study material for Accounting Information System in a Computer
Environment AIN1501. Unisa, Pretoria.
UNISA. (2022). Study material for Practical Accounting Data Processing AIN2601. Unisa,
Pretoria.
Vinçon, T., Koch, A. & Petrov, I. (2019). Moving processing to data: on the influence of
processing in memory on data management. arXiv preprint arXiv:1905.04767
Voithofer, R. & Golan, A.M. (2018). Data Sources for Educators. Responsible Analytics
and Data Mining in Education: Global Perspectives on Quality, Support, and Decision
Making.
Wang, Y., (2015). Data management and data processing support on array-based
scientific data (Doctoral dissertation, The Ohio State University).
Warnecke, B. (2018). From Customizing back to SAP standard options and limits when
migrating to SAP S/4HANA. HMD Praxis der Wirtschaftsinformatik, 55(1), pp. 151-
162.
Ziora, A.C.L. (2015). The role of big data solutions in the management of organizations.
Review of selected practical examples. Procedia Computer Science, 65, pp. 1006-1012.
https://doi.org/10.1016/j.procs.2015.09.059