
Unit III: Data Collection

Q. 1. Explain static and dynamic characteristics of Instruments. Or Explain characteristics of instruments used in Experiment Set-up.
Answer:

The performance characteristics of an instrument are mainly divided into two categories:
i) Static characteristics
ii) Dynamic characteristics

Static characteristics:
Static characteristics refer to the characteristics of the system when the input is either held
constant or varying very slowly. The items that can be classified under the heading static
characteristics are mainly:

Range (or span): It defines the maximum and minimum values of the inputs or the outputs for which the instrument is recommended to be used. For example, for a temperature measuring instrument the input range may be 100-500 °C and the output range may be 4-20 mA.

Sensitivity: It can be defined as the ratio of the incremental output to the incremental input. While defining the sensitivity, we assume that the input-output characteristic of the instrument is approximately linear in that range. Thus, if the sensitivity of a thermocouple is denoted as 10 μV/°C, it indicates the sensitivity in the linear range of the thermocouple voltage vs. temperature characteristics. Similarly, the sensitivity of a spring balance can be expressed as 25 mm/kg (say), indicating that an additional load of 1 kg will cause an additional displacement of the spring by 25 mm. Again, the sensitivity of an instrument may also vary with temperature or other external factors. This is known as sensitivity drift. Suppose the sensitivity of the spring balance mentioned above is 25 mm/kg at 20 °C and 27 mm/kg at 30 °C. Then the sensitivity drift per °C is 0.2 (mm/kg)/°C. In order to avoid such sensitivity drift, sophisticated instruments are either kept at a controlled temperature, or suitable in-built temperature compensation schemes are provided inside the instrument.
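The sensitivity and sensitivity-drift arithmetic above can be checked with a short sketch; the function and variable names below are illustrative only and use the spring-balance numbers from the text.

```python
# Minimal sketch: sensitivity = incremental output / incremental input,
# sensitivity drift = change in sensitivity per unit change in temperature.

def sensitivity(delta_output_mm, delta_input_kg):
    """Ratio of incremental output to incremental input (mm/kg)."""
    return delta_output_mm / delta_input_kg

def sensitivity_drift(s1, t1, s2, t2):
    """Change in sensitivity per degree Celsius ((mm/kg)/degC)."""
    return (s2 - s1) / (t2 - t1)

s_20 = sensitivity(25.0, 1.0)                       # 25 mm/kg at 20 degC
s_30 = sensitivity(27.0, 1.0)                       # 27 mm/kg at 30 degC
drift = sensitivity_drift(s_20, 20.0, s_30, 30.0)
print(drift)                                        # 0.2 (mm/kg)/degC, as stated above
```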

Linearity: Linearity is actually a measure of the nonlinearity of the instrument. When we talk about sensitivity, we assume the input/output characteristic of the instrument to be approximately linear. But in practice, it is normally nonlinear, as shown in Fig.1. Linearity is defined as the maximum deviation from the linear characteristic as a percentage of the full scale output. Thus, referring to Fig.1,

% linearity = (maximum deviation from the linear characteristic / full scale output) × 100.

Hysteresis: Hysteresis exists not only in magnetic circuits, but in instruments also. For example, the deflection of a diaphragm type pressure gage may be different for the same pressure, one value while the pressure is increasing and another while it is decreasing, as shown in Fig.2. The hysteresis is expressed as the maximum hysteresis as a percentage of the full scale reading, i.e., referring to Fig.2,

% hysteresis = (maximum hysteresis / full scale output) × 100.

Resolution: In some instruments, the output increases in discrete steps for a continuous increase in the input, as shown in Fig.3. It may be because of the finite graduations on the meter scale, or because the instrument has a digital display, as a result of which the output indication changes discretely. A 3½-digit voltmeter, operating in the 0-2 V range, can have a maximum reading of 1.999 V, and it cannot measure any change in voltage below 0.001 V. Resolution indicates the minimum change in the input variable that is detectable. For example, an eight-bit A/D converter with a +5 V input can measure a minimum voltage change of 5/(2^8 − 1) V, or 19.6 mV. Referring to Fig.3, resolution is also defined in terms of percentage as:

% resolution = (smallest detectable change in input / full scale input range) × 100.

The quotient between the measuring range and the resolution is often expressed as the dynamic range, defined as:

dynamic range = 20 log10 (total measuring range / resolution),

and is expressed in terms of dB. The dynamic range of an n-bit ADC comes out to be approximately 6n dB.
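A minimal sketch of the ADC figures quoted above; the variable names are illustrative.

```python
import math

n_bits = 8
full_scale_v = 5.0

resolution_v = full_scale_v / (2**n_bits - 1)            # smallest detectable change
dynamic_range_db = 20 * math.log10(full_scale_v / resolution_v)

print(round(resolution_v * 1000, 1))   # ~19.6 mV
print(round(dynamic_range_db, 1))      # ~48.1 dB, i.e. roughly 6n dB for n = 8
```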
Accuracy: Accuracy indicates the closeness of the measured value to the actual or true value, and is expressed in the form of the maximum error (= measured value − true value) as a percentage of the full scale reading. Thus, if the accuracy of a temperature indicator, with a full scale range of 0-500 °C, is specified as ±0.5%, it indicates that the measured value will always be within ±2.5 °C of the true value, if measured through a standard instrument during the process of calibration. But if it indicates a reading of 250 °C, the error will still be ±2.5 °C, i.e. ±1% of the reading. Thus it is always better to choose a scale of measurement where the input is near the full-scale value. But the true value is always difficult to get. We use standard calibrated instruments in the laboratory for measuring the true value of the variable.
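The temperature-indicator example above can be worked through in a short sketch (the names are illustrative):

```python
# Error specified as a percentage of full scale vs. expressed as a percentage of the reading.
full_scale_c = 500.0
accuracy_pct_fs = 0.5                        # +/- 0.5 % of full scale

max_error_c = full_scale_c * accuracy_pct_fs / 100.0
print(max_error_c)                           # +/- 2.5 degC

reading_c = 250.0
error_pct_of_reading = 100.0 * max_error_c / reading_c
print(error_pct_of_reading)                  # +/- 1.0 % of the reading
```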

Precision: Precision indicates the repeatability or reproducibility of an instrument (but does not indicate accuracy). If an instrument is used to measure the same input, but at different instants spread over the whole day, successive measurements may vary randomly. The random fluctuations of readings (mostly with a Gaussian distribution) are often due to random variations of several other factors which have not been taken into account while measuring the variable. A precision instrument indicates that the successive readings would be very close, or in other words, the standard deviation σe of the set of measurements would be very small. Quantitatively, precision can be expressed in terms of this standard deviation: the smaller the value of σe, the higher the precision of the instrument.

The difference between precision and accuracy needs to be understood carefully. Precision
means repetition of successive readings, but it does not guarantee accuracy; successive readings
may be close to each other, but far from the true value. On the other hand, an accurate instrument
has to be precise also, since successive readings must be close to the true value (that is unique).

Dynamic Characteristics:
Dynamic characteristics refer to the performance of the instrument when the input variable is changing rapidly with time. For example, the human eye cannot detect any event whose duration is less than one-tenth of a second; thus the dynamic performance of the human eye cannot be said to be very satisfactory. The dynamic performance of an instrument is normally expressed by a differential equation relating the input and output quantities. It is always convenient to express the input-output dynamic characteristics in the form of a linear differential equation. So, often a nonlinear mathematical model is linearized and expressed in the form:

a_n (d^n x_o / dt^n) + a_(n-1) (d^(n-1) x_o / dt^(n-1)) + … + a_1 (dx_o / dt) + a_0 x_o = x_i

where x_i and x_o are the input and the output variables respectively. The above expression can also be expressed in terms of a transfer function, as:

G(s) = X_o(s) / X_i(s) = 1 / (a_n s^n + a_(n-1) s^(n-1) + … + a_1 s + a_0)

Potentiometer
Displacement sensors using the potentiometric principle (Fig.4) have no energy storing elements. The output voltage e_o can be related to the input displacement x_i by an algebraic equation:

e_o = (E / x_t) x_i

where x_t is the total length of the potentiometer and E is the excitation voltage. Since the relation involves no derivative terms, the potentiometer can be termed a zeroth order system.
Thermocouple
A bare thermocouple (Fig.5) has a mass (m) at the junction. If it is immersed in a fluid at a temperature T_f, then its dynamic performance relating the output voltage e_o and the input temperature T_f can be expressed by a first order transfer function of the form:

e_o(s) / T_f(s) = K / (1 + τs)

where K is the static sensitivity and τ is the time constant set by the thermal mass of the junction and the heat transfer to the fluid. Hence, the bare thermocouple is a first order sensor. But if the bare thermocouple is put inside a metallic protective well (as is normally done for industrial thermocouples) the order of the system increases due to the additional energy storing element (the thermal mass of the well) and it becomes a second order system.
Seismic Sensor
Seismic sensors are commonly used for vibration or acceleration measurement of foundations. The transfer function between the input displacement x_i and the output displacement x_o can be expressed as:

X_o(s) / X_i(s) = M s^2 / (M s^2 + B s + K)

where M is the seismic mass, B the damping coefficient and K the spring stiffness. From the above transfer function, it can easily be concluded that the seismic sensor is a second order system.
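As an illustration of how the order of a sensor shows up in its dynamic response, the sketch below simulates a first-order element (such as the bare thermocouple above) responding to a step change in fluid temperature. The gain K and time constant tau are assumed example values, not taken from the text.

```python
# Hedged sketch: step response of a first-order sensor, e.g. a bare thermocouple.
# Model: tau * dy/dt + y = K * T_f, integrated with simple Euler steps.
# K, tau, step_input below are assumed example values.

def first_order_step(K=1.0, tau=2.0, t_end=10.0, dt=0.01, step_input=100.0):
    y, t, trace = 0.0, 0.0, []
    while t <= t_end:
        dy = (K * step_input - y) / tau      # first-order dynamics
        y += dy * dt
        t += dt
        trace.append((round(t, 2), round(y, 2)))
    return trace

response = first_order_step()
print(response[-1])   # output approaches K * step_input once t >> tau
```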

Q.2. What are the good measurement practices?


Answer:

1. Follow the manufacturer's user manual for using and maintaining the instruments.
The sections of a user manual often include a cover page, a title page and a copyright page. The user manual contains all essential information for the user to make full use of the information system. It includes a description of the system functions and capabilities, contingencies and alternate modes of operation, and step-by-step procedures for system access and use.
2. Use experienced staff, and provide training for measurement.
Training may cover planning, installing, monitoring and maintaining control systems and machinery within manufacturing environments. Trained staff typically work with control processes that use sensors to provide feedback.
3. Calibrate the instruments periodically to keep the calibration error under control.
Whenever possible, the calibration of an instrument should be checked before taking
data. If a calibration standard is not available, the accuracy of the instrument should be
checked by comparing with another instrument that is at least as precise, or by consulting
the technical data provided by the manufacturer. Calibration errors are usually linear
(measured as a fraction of the full scale reading), so that larger values result in greater
absolute errors.
4. The scale of the instrument should be easily readable, and viewed squarely, to avoid human errors due to parallax.
Parallax (systematic or random) — This error can occur whenever there is some distance
between the measuring scale and the indicator used to obtain a measurement. If the
observer's eye is not squarely aligned with the pointer and scale, the reading may be too
high or low (some analog meters have mirrors to help with this alignment).
5. Keep a record of the experiment, including the input parameters, the date and time of the experiment, the output readings, the parameters varied, the objective of the experiment, etc.
An experiment is a procedure carried out to support, refute, or validate a hypothesis. Experiments vary greatly in goal and scale, but always rely on a repeatable procedure and logical analysis of the results. Natural experimental studies also exist.
6. Check or validate software, to make sure it works correctly.
Validation is the process of evaluating software during or at the end of the development process to determine whether it satisfies the specified business requirements. Validation testing ensures that the product actually meets the client's needs.
7. Write down readings at the time they are made.
8. Take each reading at least three times and use the average of the three readings to reduce errors made by the user during measurement.
9. Use rounding correctly in your calculations.
10. Keep good records of your measurements and calculations.
11. When repeated measurements give different readings, use the standard deviation to judge the quality of the measurements.
To calculate the standard deviation for a sample of N measurements (a short sketch follows after this list):
 Sum all the measurements and divide by N to get the average, or mean.
 Now, subtract this average from each of the N measurements to obtain N "deviations".
 Square each of these N deviations and add them all up.
 Divide this result by (N − 1) and take the square root.

12. If past measurements are ever called into doubt, such records can be very useful.
13. Keep a note of any extra information that may be relevant.
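A minimal sketch of the standard-deviation recipe described in point 11; the readings below are purely illustrative.

```python
import math

readings = [9.8, 10.1, 9.9, 10.2, 10.0]       # example repeated readings (illustrative)

n = len(readings)
mean = sum(readings) / n                       # step 1: the average
deviations = [x - mean for x in readings]      # step 2: the N deviations
sum_sq = sum(d * d for d in deviations)        # step 3: square and add
std_dev = math.sqrt(sum_sq / (n - 1))          # step 4: divide by (N - 1), square root

print(round(mean, 3), round(std_dev, 3))
```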
Q.3. Differentiate between static and dynamic characteristics of an instrument.
Answer:
The performance characteristics of an instrument are mainly divided into two categories:
i) Static characteristics
ii) Dynamic characteristics

Static characteristics of an instrument:


The set of criteria defined for instruments that are used to measure quantities which vary slowly with time, or are mostly constant (i.e., do not vary with time), is called 'static characteristics'.

The various static characteristics are:


i) Accuracy
ii) Precision
iii) Sensitivity
iv) Linearity
v) Reproducibility
vi) Repeatability
vii) Resolution
viii) Threshold
ix) Drift
x) Stability
xi) Tolerance
xii) Range or span

Accuracy:
It is the degree of closeness with which the reading approaches the true value of the quantity to
be measured. The accuracy can be expressed in following ways:

a) Point accuracy: Such accuracy is specified at only one particular point of scale. It does not
give any information about the accuracy at any other point on the scale.

b) Accuracy as percentage of scale span: When an instrument has a uniform scale, its accuracy may be expressed in terms of the scale range.

c) Accuracy as percentage of true value: The best way to conceive the idea of accuracy is to
specify it in terms of the true value of the quantity being measured.

Precision: It is the measure of reproducibility, i.e., given a fixed value of a quantity, precision is a measure of the degree of agreement within a group of measurements. Precision is composed of two characteristics:

* Conformity:
Consider a resistor having a true value of 2385692 Ω, which is being measured by an ohmmeter. The reader can only read consistently a value of 2.4 MΩ due to the non-availability of a proper scale. The error created due to the limitation of the scale reading is a precision error.

* Number of significant figures:


The precision of a measurement is obtained from the number of significant figures in which the reading is expressed. The significant figures convey the actual information about the magnitude and the measurement precision of the quantity. The precision can be mathematically expressed as:

P = 1 − |Xn − X̄n| / X̄n

Where, P = precision
Xn = value of the nth measurement
X̄n = average value of the set of measurements
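A short sketch of the precision expression given above, using illustrative readings:

```python
# P = 1 - |Xn - Xbar| / Xbar for one reading of a set (illustrative values only).
readings = [101.0, 102.0, 103.0, 104.0, 105.0]
x_bar = sum(readings) / len(readings)        # average of the set

x_n = readings[3]                            # the nth measurement, here the 4th
precision = 1 - abs(x_n - x_bar) / x_bar
print(round(precision, 4))                   # closer to 1 means higher precision
```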

Sensitivity:

Sensitivity denotes the smallest change in the measured variable to which the instrument responds. It is defined as the ratio of the change in the output of an instrument to the change in the value of the quantity to be measured. Mathematically it is expressed as:

Sensitivity = change of output signal / change of input signal = Δq_o / Δq_i

Thus, if the calibration curve is linear, as shown, the sensitivity of the instrument is the slope of the calibration curve. If the calibration curve is not linear, as shown, then the sensitivity varies with the input. Inverse sensitivity or deflection factor is defined as the reciprocal of sensitivity:
Inverse sensitivity or deflection factor = 1 / sensitivity
Reproducibility:
It is the degree of closeness with which a given value may be repeatedly measured. It is specified
in terms of scale readings over a given period of time.

Repeatability:
It is defined as the variation of scale reading and is random in nature.

Drift:
Drift may be classified into three categories:
a) Zero drift:
If the whole calibration gradually shifts due to slippage, permanent set, or due to undue warming
up of electronic tube circuits, zero drift sets in.
b) Span drift or sensitivity drift:
If there is a proportional change in the indication all along the upward scale, the drift is called span drift or sensitivity drift.

c) Zonal drift:
In case the drift occurs only over a portion of the span of an instrument, it is called zonal drift.

Resolution:
If the input is slowly increased from some arbitrary input value, it will again be found that output
does not change at all until a certain increment is exceeded. This increment is called resolution.

Threshold:
If the instrument input is increased very gradually from zero there will be some minimum value
below which no output change can be detected. This minimum value defines the threshold of the
instrument.

Stability:
It is the ability of an instrument to retain its performance throughout its specified operating life.

Tolerance:
The maximum allowable error in the measurement is specified in terms of some value which is
called tolerance.

Range or span:
The minimum & maximum values of a quantity for which an instrument is designed to measure
is called its range or span.
Dynamic characteristics:
The set of criteria defined for instruments that measure quantities which change rapidly with time is called 'dynamic characteristics'.
The various dynamic characteristics are:
i) Speed of response
ii) Measuring lag
iii) Fidelity
iv) Dynamic error

Speed of response:
It is defined as the rapidity with which a measurement system responds to changes in the
measured quantity.

Measuring lag:
It is the retardation or delay in the response of a measurement system to changes in the measured
quantity. The measuring lags are of two types:

a) Retardation type: In this case the response of the measurement system begins
immediately after the change in measured quantity has occurred.

b) Time delay lag: In this case the response of the measurement system begins after a dead
time after the application of the input.

Fidelity: It is defined as the degree to which a measurement system indicates changes in the
measurand quantity without dynamic error.

Dynamic error:
It is the difference between the true value of the quantity changing with time & the value
indicated by the measurement system if no static error is assumed. It is also called measurement
error.
Q. 4. Explain the following:

i) Calibration of Instrument:

Calibration is the act of ensuring that a method or instrument used in measurement will
produce accurate results. There are two common calibration procedures: using a working curve,
and the standard-addition method. Both of these methods require one or more standards of
known composition to calibrate the measurement. Instrumental methods are usually calibrated
with standards that are prepared (or purchased) using a non-instrumental analysis. There are two
direct analytical methods: gravimetry and coulometry. Titration is similar but requires
preparation of a primary standard. The chief advantage of the working curve method is that it is
rapid: a single set of standards can be used for the measurement of multiple samples. The
standard-addition method requires multiple measurements for each sample, but can reduce
inaccuracies due to interferences and matrix effects.

Calibration is a comparison between a known measurement (the standard) and the


measurement using your instrument. Typically, the accuracy of the standard should be ten times the accuracy of the measuring device being tested. However, an accuracy ratio of 3:1 is acceptable to most standards organizations. Calibration of your measuring instruments has two objectives.
It checks the accuracy of the instrument and it determines the traceability of the measurement. In
practice, calibration also includes repair of the device if it is out of calibration. A report is
provided by the calibration expert, which shows the error in measurements with the measuring
device before and after the calibration.

Example: micrometer
Here, accuracy of the scale is the main parameter for calibration. In addition, these instruments
are also calibrated for zero error in the fully closed position and flatness and parallelism of the
measuring surfaces. For the calibration of the scale, a calibrated slip gauge is used. A calibrated
optical flat is used to check the flatness and parallelism.

The accuracy of all measuring devices degrades over time. This is typically caused by normal wear and tear. However, changes in accuracy can also be caused by electric or mechanical shock or a hazardous manufacturing environment (e.g., oils, metal chips, etc.). Depending on the type of instrument and the environment in which it is being used, it may degrade very quickly or over a long period of time. The bottom line is that calibration improves the accuracy of the measuring device. Accurate measuring devices improve product quality.
A measuring device should be calibrated:

 According to recommendation of the manufacturer.


 After any mechanical or electrical shock.
 Periodically (annually, quarterly, monthly)

The hidden costs and risks associated with an un-calibrated measuring device could be much higher than the cost of calibration. Therefore, it is recommended that measuring instruments be calibrated regularly by a reputable company, to ensure that the errors associated with the measurements are within the acceptable range.
ii) Instrument Error:
Errors of observation and measurement caused by the imperfection of instruments (such as the inevitable differences between the real instrument and the "ideal," which is represented by its geometric scheme) and by the inaccurate installation of the instrument in operating position. Instrument errors must be taken into account in measurements requiring high accuracy; if they are disregarded, systematic errors result, which may make the measurement data largely worthless. Calculations of instrument errors are of particular importance in astronomy, geodesy, and other sciences in which extremely precise measurements are required. In view of this, the development of research methods on instrument errors and the elimination of their effects on observational and measurement data is one of the principal tasks of the theory of measuring instruments. Instrument errors may be subdivided into three categories.

Errors related to imperfections in the manufacture of individual instrument parts may not be eliminated or changed by the observer, but they are carefully studied, and the errors caused by them are eliminated by the introduction of the required corrections or by rationally devised measurement methods that eliminate their effect on the final results. Instrument errors belonging to this category include errors of the scale divisions on divided circles, according to which the direction to an object under observation is read; errors of the scale divisions of measuring instruments; eccentricity errors, which arise from a discrepancy between the center of rotation of a divided circle or alidade and the center of the circle's divisions; periodic and running errors of micrometer screws, caused by imperfections in their threading or assembly; errors caused by flexure of instrument parts; and errors associated with optical instruments, such as distortion, astigmatism, and coma.

Errors related to faulty assembly and adjustment of instruments and to insufficient accuracy in installation in the position required by the theory of a particular method of observation include collimation error, which is a deviation from the 90° angle between the line of sight and the transit axis; errors associated with the inclination of an instrument's horizontal axis to the horizon and its inaccurate installation (setting) at the required azimuth; inaccurate centering of objective lenses; and some errors of recording equipment. Instrument errors of this category, which may be detected by instrument control tests, may be reduced to a minimum by adjustment of certain parts of the instrument, for which provision is made in their design. The small fractions of these errors that remain uncorrected are determined by means of auxiliary devices, such as levels, nadir-horizons, and collimators, or are inferred from the observations (for example, azimuthal error), and their effect is taken into consideration during the processing of the observational data.

The category of errors associated with a change in the properties of an instrument with time, particularly those caused by temperature changes, also includes the overall effect of all other errors that are not accounted for by the theory of the instrument. These instrument errors are the most complex. They are especially detrimental, since they develop systematically and are not clearly detectable during observations and measurements. They become apparent only upon measurement of the same quantity with different instruments. Thus, systematic differences that usually exceed the random errors inherent in the methods and instruments by a factor of 1.5–2.0 (sometimes 5–6) are always detected in the comparison of star coordinates obtained from observations at different observatories, or of precisely timed radio-signal corrections determined by different time services. One of the important tasks is the detection, thorough investigation, and, if possible, elimination of the sources of instrument errors of this category.
Instrumental Errors
Instrumental errors can occur for some of the following reasons.

Inherent limitations of devices

These errors are inherent in instruments because of their design and features, for example their mechanical arrangement. They may arise from the construction or from the operation of the instrument, and they cause the readings to be either too low or too high. For instance, if an apparatus uses a weak spring, it will give readings that are too high. Such errors can also arise in the apparatus because of hysteresis or friction.
Abuse of apparatus
Errors of this kind arise from the operator's fault. Even a good instrument, used in an unintelligent way, may give erroneous results. For instance, abuse of the apparatus may include failure to adjust the zero of the instrument, poor initial adjustment, or the use of leads of very high resistance. Improper use of this kind may not cause permanent harm to the device, but all the same it causes errors.
Effect of loading
The most frequent error of this type occurs because connecting the measuring instrument alters the quantity being measured. For instance, when a voltmeter is connected across a high-resistance circuit it gives a false reading, whereas when it is connected across a low-resistance circuit the reading is reliable; the voltmeter thus loads the circuit. The error caused by this effect can be overcome by using the meters intelligently. For example, when measuring a low resistance by the ammeter-voltmeter method, a voltmeter having a very high resistance value should be used.
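A small sketch of the loading effect described above; the resistor and meter values are assumptions chosen only to illustrate the size of the error.

```python
# Hedged sketch: voltmeter loading error. Measuring the voltage across R2 in a
# divider changes the circuit, because the meter resistance appears in parallel with R2.
# All component values below are assumed for illustration.
V_SUPPLY = 10.0
R1 = 100e3          # ohms
R2 = 100e3          # ohms
R_METER = 1e6       # voltmeter input resistance, ohms

true_v = V_SUPPLY * R2 / (R1 + R2)                      # 5.0 V without the meter
r2_loaded = (R2 * R_METER) / (R2 + R_METER)             # R2 in parallel with the meter
measured_v = V_SUPPLY * r2_loaded / (R1 + r2_loaded)    # what the meter indicates

print(round(true_v, 3), round(measured_v, 3))           # loading pulls the reading down
```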

iii) Calibration Error:


Instrument error can occur due to a variety of factors: drift, environment, electrical supply,
addition of components to the output loop, process changes, etc. Since a calibration is performed
by comparing or applying a known signal to the instrument under test, errors are detected by
performing a calibration. An error is the algebraic difference between the indication and the
actual value of the measured variable. Typical errors that occur include:
A) Span Error:
Fig: Span Error
B) Zero Error:

Fig: Zero Error

C) Combined Zero and Span Error:

Fig: Combined Zero and Span Error


D) Linearization Error:
Fig: Linearization Error
Zero and span errors are corrected by performing a calibration. Most instruments are provided
with a means of adjusting the zero and span of the instrument, along with instructions for
performing this adjustment. The zero adjustment is used to produce a parallel shift of the input-
output curve. The span adjustment is used to change the slope of the input-output curve.
Linearization error may be corrected if the instrument has a linearization adjustment. If the
magnitude of the nonlinear error is unacceptable and it cannot be adjusted, the instrument must
be replaced. To detect and correct instrument error, periodic calibrations are performed. Even if a
periodic calibration reveals the instrument is perfect and no adjustment is required, we would not
have known that unless we performed the calibration. And even if adjustments are not required
for several consecutive calibrations, we will still perform the calibration check at the next
scheduled due date. Periodic calibrations to specified tolerances using approved procedures are
an important element of any quality system.
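A minimal sketch of the zero and span correction described above, assuming a simple linear instrument; the raw readings and reference values are illustrative only.

```python
# Hedged sketch: correcting zero and span error with a two-point calibration.

def two_point_calibration(raw_low, raw_high, true_low, true_high):
    """Return gain (span correction) and offset (zero correction)."""
    gain = (true_high - true_low) / (raw_high - raw_low)
    offset = true_low - gain * raw_low
    return gain, offset

gain, offset = two_point_calibration(raw_low=2.0, raw_high=98.0,
                                     true_low=0.0, true_high=100.0)

raw_reading = 50.0
corrected = gain * raw_reading + offset       # slope (span) adjustment + zero shift
print(round(corrected, 2))
```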
Q.5. Explain five key steps for Instrument calibration and validation.
Answer:

Calibration and validation should be considered as a process that encompasses the entire system,
from sensor performance to the derivation of the data products. Long-term studies for
documenting and understanding global climate change require not only that the remote sensing
instrument be accurately characterized and calibrated, but also that the stability of the instrument
characteristics and calibration be well monitored and evaluated over the life of the mission
through independent measurements. Calibration has a critical role in measurements that involve
several sensors in orbit either simultaneously or sequentially and perhaps not even contiguously.
Finally, there is the need to validate the data products that are the raison d’etre for the sensor. As
illustrated in Figure, five steps are involved in the process of calibration and validation:
instrument characterization, instrument calibration, calibration verification, data quality
assessment, and data product validation. All five steps are necessary to add value to the data,
permitting an accurate, quantitative picture rather than a merely qualitative one.
1. Instrument characterization is the measurement of various specific properties of a sensor.
Most characterizations are performed before launch. They can be performed at the component or
subassembly level, or at the system level. For critical characteristics the measurements should be
performed before and after final assembly to reveal unpredicted sources of error. Characteristics
that are expected to change owing to the rigors of launch must be remeasured on orbit unless it
can be shown that the expected changes will be within the error budget allocated according to the
data product requirements.
2. Sensor calibration is carried out to determine the equivalent physical quantity—for example,
radiant temperature—that stimulates the sensor to produce a specific output signal, for example,
digital counts. This process determines a set of calibration coefficients, kelvin per digital count in
this example. The calibration coefficients transform sensor output into physically meaningful
units. The accuracy of the transformation to physical units depends on the accumulated
uncertainty in the chain of measurements leading back to one of the internationally accepted
units of physical measurement, the SI units. For optimum accuracy and the long-term stability of
an absolute calibration, it is necessary to establish traceability to an SI unit.
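A minimal sketch of applying calibration coefficients of the kind described above (kelvin per digital count); the gain and offset values are assumed for illustration and do not describe any real sensor.

```python
# Hedged sketch: converting sensor digital counts to radiant temperature
# using calibration coefficients (assumed example values, including the offset).
gain_k_per_count = 0.01     # kelvin per digital count
offset_k = 150.0            # kelvin at zero counts

def counts_to_kelvin(counts):
    """Transform raw sensor output into physically meaningful units."""
    return gain_k_per_count * counts + offset_k

print(counts_to_kelvin(15000))   # 300.0 K for this illustrative coefficient set
```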

   FIGURE: Calibration and validation are considered in five key steps from prelaunch sensor
characterization and calibration to on-orbit data product validation.
However, it is more important to determine to which SI unit the measurement must be traceable
and whether traceability to an SI unit is even necessary. This can be done by examining the
requirements of the data product algorithm. The question of which SI unit to use is answered by
determining which chain of measurements has the lowest accumulated uncertainty. Often it is the
shortest measurement chain and sometimes the most convenient one. In the case of a relative
measurement, that is, in the measurement of a ratio (for example, reflectance), traceability to an
SI unit is meaningless. The accuracy in this case will be a function of all the uncertainties
accumulated in determining the ratio of the outgoing to the incoming radiation.

The specific characterization and calibration requirements and the needed accuracy are
determined by the needs of the algorithms for each data product—that is, the parameters that
must be measured and the accuracy to which they must be known. This list of requirements is
usually presented as a list of specifications in the contract to build the sensor. It is obvious that if
the accuracy requirements are set too high, needless expense will be incurred. If they are set too
low, the algorithm will not produce an acceptable data product. The optimum range should be set
by sensitivity analyses of the algorithms of several key data products.

3. Calibration verification is the process that verifies the estimated accuracy of the calibration
before launch and the stability of the calibration after launch. Prelaunch calibration verification
could take the form of documentation of accumulated uncertainty or it could be determined by
comparisons with other, similar, well-calibrated and well-documented systems. The latter
method of calibration verification is preferred since one or more sources of uncertainty may
have been overlooked in the calibration documentation assembled by a single laboratory. Post
launch calibration verification refers to measurements using the on-board calibration monitoring
system and vicarious calibrations. Vicarious calibrations are those obtained from on-ground
measurements (ground truth), from on-orbit, nearly simultaneous comparisons to other, similar,
well-calibrated sensors or, in the case of bands in the solar reflectance region, from
measurements of sunlight reflected from the Moon.

4. Data quality assessment is the process of determining whether the sensor performance is
impaired or whether the measurement conditions are less than optimal. Data quality assessment
ensures that the algorithm will perform as expected; it is a way of identifying the impacts of
changes in instrument performance on algorithm performance. Instrument degradation or a
partial malfunction will affect data quality. Less-than-ideal atmospheric and/or surface
conditions, and the uncertainties in decoupling atmosphere-surface interactions, will also degrade
the quality of a surface or atmospheric data product. It is necessary to publish the estimated
quality of the data products so that the data user will be able to reach well-informed conclusions.
5. Data product validation is the process of determining the accuracy of the data product,
including quantifying the various sources of errors and bias. It may consist of comparison with
other data products of known accuracy or with independent measurement of the geophysical
variable represented by the product. Data product validation provides a quantitative estimate of
product accuracy, facilitating the use of the products in numerical modeling and comparison of
similar products from different sensors.

Another key issue for calibration and validation is consideration of mission operations factors
such as orbit parameters, orbit maintenance, and launch schedules. These items, which are often
as important to calibration and validation as the properties of the instruments and software, are
not further elaborated on in this report, but the committee considers them important to include in
developing plans for calibration and validation in integrating climate research into NPOESS
operations.

Q.6. What do you mean by Sampling? Enlist the need for sampling. Explain the steps for
sample design.
Answer:
All items in any field of inquiry constitute a ‘Universe’ or ‘Population.’ A complete enumeration
of all items in the ‘population’ is known as a census inquiry. It can be presumed that in such an
inquiry, when all items are covered, no element of chance is left and highest accuracy is
obtained. But in practice this may not be true. Even the slightest element of bias in such an
inquiry will get larger and larger as the number of observations increases. Moreover, there is no
way of checking the element of bias or its extent except through a resurvey or use of sample
checks. Besides, this type of inquiry involves a great deal of time, money and energy. Therefore,
when the field of inquiry is large, this method becomes difficult to adopt because of the
resources involved. At times, this method is practically beyond the reach of ordinary researchers.
Perhaps, government is the only institution which can get the complete enumeration carried out.
Even the government adopts this in very rare cases such as population census conducted once in
a decade. Further, many a time it is not possible to examine every item in the population, and
sometimes it is possible to obtain sufficiently accurate results by studying only a part of total
population. In such cases there is no utility of census surveys. When field studies are undertaken
in practical life, considerations of time and cost almost invariably lead to a selection of
respondents i.e., selection of only a few items. The respondents selected should be as
representative of the total population as possible in order to produce a miniature cross-section.
The selected respondents constitute what is technically called a ‘sample’ and the selection
process is called ‘sampling technique.’ The survey so conducted is known as ‘sample survey’.
Algebraically, let the population size be N and if a part of size n (which is < N) of this population
is selected according to some rule for studying some characteristic of the population, the group
consisting of these n units is known as ‘sample’. Researcher must prepare a sample design for his
study i.e., he must plan how a sample should be selected and of what size such a sample would
be.

Sampling helps a lot in research. It is one of the most important factors determining the accuracy of your research/survey results. If anything goes wrong with your sample, it will be directly reflected in the final result. There are a lot of techniques which help us to gather a sample depending upon the need and situation; the steps below explain how a sample design is developed.

STEPS IN SAMPLE DESIGN

Type of universe: The first step in developing any sample design is to clearly define the set of
objects, technically called the Universe, to be studied. The universe can be finite or infinite. In
finite universe the number of items is certain, but in case of an infinite universe the number of
items is infinite, i.e., we cannot have any idea about the total number of items. The population of
a city, the number of workers in a factory and the like are examples of finite universes, whereas
the number of stars in the sky, listeners of a specific radio programme, throwing of a dice etc. are
examples of infinite universes.
Sampling unit: A decision has to be taken concerning a sampling unit before selecting sample.
Sampling unit may be a geographical one such as state, district, village, etc., or a construction
unit such as house, flat, etc., or it may be a social unit such as family, club, school, etc., or it may
be an individual. The researcher will have to decide one or more of such units that he has to
select for his study.

Source list: It is also known as ‘sampling frame’ from which sample is to be drawn. It contains
the names of all items of a universe (in case of finite universe only). If source list is not
available, researcher has to prepare it. Such a list should be comprehensive, correct, reliable and
appropriate. It is extremely important for the source list to be as representative of the population
as possible.

Size of sample: This refers to the number of items to be selected from the universe to constitute a sample. This is a major problem before a researcher. The size of the sample should neither be excessively large, nor too small; it should be optimum. An optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. While deciding the size of the sample, the researcher must determine the desired precision as well as an acceptable confidence level for the estimate. The size of the population variance needs to be considered, as in the case of a larger variance usually a bigger sample is needed. The size of the population must be kept in view, for this also limits the sample size. The parameters of interest in a research study must be kept in view while deciding the size of the sample. Costs, too, dictate the size of the sample that we can draw. As such, budgetary constraints must invariably be taken into consideration when we decide the sample size.
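As an illustration of balancing desired precision and confidence level when fixing the sample size, the sketch below uses the common formula for estimating a population proportion, n = z²·p(1−p)/e²; the values of p, e and z, and the finite-population correction, are assumptions for the example, not prescriptions from the text.

```python
import math

# Hedged sketch: sample size for estimating a proportion.
# p (expected proportion), e (desired precision) and z (confidence level) are assumed.
def sample_size(p=0.5, e=0.05, z=1.96, population=None):
    n = (z ** 2) * p * (1 - p) / (e ** 2)
    if population:                          # finite population correction, if N is known
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)

print(sample_size())                        # about 385 for +/-5% at 95% confidence
print(sample_size(population=2000))         # smaller when the universe itself is small
```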

Parameters of interest: In determining the sample design, one must consider the question of the
specific population parameters which are of interest. For instance, we may be interested in
estimating the proportion of persons with some characteristic in the population, or we may be
interested in knowing some average or the other measure concerning the population. There may
also be important sub-groups in the population about whom we would like to make estimates. All
this has a strong impact upon the sample design we would accept.

Budgetary constraint: Cost considerations, from practical point of view, have a major impact
upon decisions relating to not only the size of the sample but also to the type of sample. This fact
can even lead to the use of a non-probability sample.

Sampling procedure: Finally, the researcher must decide the type of sample he will use i.e., he
must decide about the technique to be used in selecting the items for the sample. In fact, this
technique or procedure stands for the sample design itself. There are several sample designs
(explained in the pages that follow) out of which the researcher must choose one for his study.
Obviously, he must select that design which, for a given sample size and for a given cost, has a
smaller sampling error.
Q. 7. Distinguish between the following:-
i) Accuracy & Precision ii) Statistics & Parameter
iii) Random Sampling & Non- Random Sampling

Accuracy & Precision


Accuracy refers to the closeness of a measured value to a standard or known value. By the term
‘accuracy’, we mean the degree of compliance with the standard measurement, i.e. to which
extent the actual measurement is close to the standard one, i.e. bulls-eye. It measures the
correctness and closeness of the result at the same time by comparing it to the absolute value.
Therefore, the closer the measurement, the higher is the level of accuracy. It mainly depends on the way the data is collected.

Precision refers to the closeness of two or more measurements to each other. Precision
represents the uniformity or repeatability in the measurements. It is the degree of excellence, in
the performance of an operation or the techniques used to obtain the results. It measures the
extent to which the results are close to each other, i.e. when the measurements are clustered
together. Therefore, the higher the level of precision the less is the variation between
measurements. For instance: Precision is when the same spot is hit, again and again, which is not
necessarily the correct spot.

Key Differences between Accuracy and Precision


1. Accuracy: The level of agreement between the actual measurement and the absolute measurement is called accuracy.
   Precision: The level of variation that lies in the values of several measurements of the same factor is called precision.

2. Accuracy: Accuracy represents the nearness of the measurement to the actual measurement.
   Precision: Precision shows the nearness of an individual measurement to those of the others.

3. Accuracy: Accuracy is the degree of conformity, i.e. the extent to which the measurement is correct when compared to the absolute value.
   Precision: Precision is the degree of reproducibility, which explains the consistency of the measurements.

4. Accuracy: Accuracy is based on a single factor.
   Precision: Precision is based on more than one factor.

5. Accuracy: Accuracy is a measure of statistical bias.
   Precision: Precision is the measure of statistical variability.

6. Accuracy: Accuracy focuses on systematic errors, i.e. the errors caused by a problem in the instrument.
   Precision: Precision is concerned with random error, which occurs periodically with no recognizable pattern.

Examples

You can think of accuracy and precision in terms of a basketball player. If the player
always makes a basket, even though he strikes different portions of the rim, he has a high degree
of accuracy. If he doesn't make many baskets but always strikes the same portion of the rim, he
has a high degree of precision. A player whose free throws always make the basket the exact
same way has a high degree of both accuracy and precision.

Take experimental measurements for another example of precision and accuracy. If you
take measurements of the mass of a 50.0-gram standard sample and get values of 47.5, 47.6,
47.5, and 47.7 grams, your scale is precise, but not very accurate. If your scale gives you values
of 49.8, 50.5, 51.0, and 49.6, it is more accurate than the first balance but not as precise. The
more precise scale would be better to use in the lab, providing you made an adjustment for its
error.
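The two scales in the example above can be compared numerically: bias (accuracy) is the distance of the mean from the 50.0-gram true value, and precision is the spread of the readings. A minimal sketch:

```python
import statistics

true_value = 50.0
scale_a = [47.5, 47.6, 47.5, 47.7]     # precise, but not accurate
scale_b = [49.8, 50.5, 51.0, 49.6]     # more accurate, but less precise

for name, readings in [("A", scale_a), ("B", scale_b)]:
    bias = statistics.mean(readings) - true_value       # accuracy (systematic error)
    spread = statistics.stdev(readings)                  # precision (random variation)
    print(name, round(bias, 2), round(spread, 2))
```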

Statistics & Parameter

A statistic is defined as a numerical value, which is obtained from a sample of data. It is a


descriptive statistical measure and a function of the sample observations. A sample is described as a fraction of the population which represents the entire population in all its characteristics. A common use of a statistic is to estimate a particular population parameter. From a given population it is possible to draw multiple samples, and the result (statistic) obtained will vary from sample to sample.

A fixed characteristic of the population, based on all the elements of the population, is termed a parameter. Here population refers to an aggregate of all units under consideration which share common characteristics. A parameter is a numerical value that remains unchanged, as every member of the population would have to be surveyed to know it. It indicates the true value, which is obtained after a census is conducted.

Key Differences between Statistic and Parameter

1. Statistic: A statistic is a characteristic of a small part of the population, i.e. the sample.
   Parameter: The parameter is a fixed measure which describes the target population.

2. Statistic: The statistic is a variable and known number which depends on the sample of the population.
   Parameter: The parameter is a fixed and unknown numerical value.

3. Statistic: Statistical notations for sample statistics are:
 x̄ (x-bar) represents the mean
 p̂ (p-hat) denotes the sample proportion
 standard deviation is labeled as s
 variance is represented by s²
 n denotes the sample size
 the standard error of the mean is represented by sx̄
 the standard error of the proportion is labeled as sp
 the standardized variate (z) is represented by (x − x̄)/s
 the coefficient of variation is denoted by s/x̄

   Parameter: Statistical notations for population parameters are:
 µ (Greek letter mu) represents the mean
 P denotes the population proportion
 standard deviation is labeled as σ (Greek letter sigma)
 variance is represented by σ²
 population size is indicated by N
 the standard error of the mean is represented by σx̄
 the standard error of the proportion is labeled as σp
 the standardized variate (z) is represented by (X − µ)/σ
 the coefficient of variation is denoted by σ/µ

Example: A researcher wants to know the average weight of females aged 22 years or older in India. The researcher obtains an average weight of 54 kg from a random sample of 40 females.
Solution: In the given situation, the statistic is the average weight of 54 kg, calculated from the simple random sample of 40 females, while the parameter is the mean weight of all females aged 22 years or older in India.
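A minimal sketch of the distinction: the population mean µ (parameter) is fixed, while the sample mean x̄ (statistic) varies from sample to sample. The weights below are randomly generated, purely for illustration.

```python
import random

random.seed(1)
# Synthetic 'population' of weights in kg; purely illustrative numbers.
population = [random.gauss(54, 6) for _ in range(10000)]
mu = sum(population) / len(population)          # parameter: fixed, usually unknown

for _ in range(3):
    sample = random.sample(population, 40)      # simple random sample of 40
    x_bar = sum(sample) / len(sample)           # statistic: varies with the sample
    print(round(x_bar, 2))

print(round(mu, 2))
```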

Random Sampling & Non- Random Sampling

Random Sampling
Random sampling is also known as ‘probability sampling’ or ‘chance sampling’. Under this
sampling design, every item of the universe has an equal chance of inclusion in the sample. It is,
so to say, a lottery method in which individual units are picked up from the whole group not
deliberately but by some mechanical process. The results obtained from probability or random
sampling can be assured in terms of probability i.e., we can measure the errors of estimation or
the significance of results obtained from a random sample, and this fact brings out the superiority
of random sampling design over the deliberate sampling design. Random sampling ensures the
law of Statistical Regularity which states that if on an average the sample chosen is a random
one, the sample will have the same composition and characteristics as the universe. This is the
reason why random sampling is considered as the best technique of selecting a representative
sample. Random sampling from a finite population refers to that method of sample selection
which gives each possible sample combination an equal probability of being picked up and each
item in the entire population to have an equal chance of being included in the sample. This
applies to sampling without replacement i.e., once an item is selected for the sample, it cannot
appear in the sample again.
In brief, the implications of random sampling (or simple random sampling) are:
(a) It gives each element in the population an equal probability of getting into the sample; and all
choices are independent of one another.
(b) It gives each possible sample combination an equal probability of being chosen.
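A minimal sketch of the lottery-style selection described above, i.e. simple random sampling without replacement; the universe below is illustrative only.

```python
import random

universe = list(range(1, 101))          # 100 items in the population (illustrative)
sample = random.sample(universe, 10)    # each item has an equal chance; no item repeats
print(sorted(sample))
```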
Non- Random Sampling
Non-random sampling is that sampling procedure which does not afford any basis for estimating
the probability that each item in the population has of being included in the sample. Non-random
sampling is also known by different names such as non-probability sampling, deliberate
sampling, purposive sampling and judgment sampling. In this type of sampling, items for the
sample are selected deliberately by the researcher; his choice concerning the items remains
supreme. For instance, if economic conditions of people living in a state are to be studied, a few
towns and villages may be purposively selected for intensive study on the principle that they can
be representative of the entire state. Thus, the judgment of the organizers of the study plays an
important part in this sampling design. In such a design, personal element has a great chance of
entering into the selection of the sample. The investigator may select a sample which shall yield
results favorable to his point of view and if that happens, the entire inquiry may get vitiated.
Thus, there is always the danger of bias entering into this type of sampling technique. Sampling
error in this type of sampling cannot be estimated and the element of bias, great or small, is
always there. As such, this sampling design is rarely adopted in large inquiries of importance.
However, in small inquiries and researches by individuals, this design may be adopted because
of the relative advantage of time and money inherent in this method of sampling. Quota
sampling is also an example of non-probability sampling. Under quota sampling the interviewers
are simply given quotas to be filled from the different strata, with some restrictions on how they
are to be filled. This type of sampling is very convenient and is relatively inexpensive. But the
samples so selected certainly do not possess the characteristic of random samples. Quota samples
are essentially judgment samples and inferences drawn on their basis are not amenable to
statistical treatment in a formal way.

1. Random sampling: The sampling technique in which the subjects of the population get an equal opportunity to be selected as a representative sample is known as random sampling.
   Non-random sampling: A sampling method in which it is not known which individual from the population will be chosen as a sample is called non-random sampling.

2. Random sampling: Random sampling is also known as 'probability sampling' or 'chance sampling'.
   Non-random sampling: Non-random sampling is also known as non-probability sampling, deliberate sampling, purposive sampling and judgment sampling.

3. Random sampling: The basis of probability sampling is randomization or chance, so it is also known as random sampling.
   Non-random sampling: In non-probability sampling, the randomization technique is not applied for selecting a sample; hence it is considered non-random sampling.

4. Random sampling: In random sampling, the sampler chooses the representatives to be part of the sample randomly.
   Non-random sampling: In non-random sampling, the subject is chosen arbitrarily by the researcher to belong to the sample.

5. Random sampling: The chances of selection are fixed and known.
   Non-random sampling: The selection probability is not fixed, i.e. it is neither specified nor known.

6. Random sampling: It is used when the research is conclusive in nature.
   Non-random sampling: It is used when the research is exploratory in nature.

7. Random sampling: The results generated are free from bias.
   Non-random sampling: The results are more or less biased.

8. Random sampling: Random sampling tests the hypothesis.
   Non-random sampling: Non-random sampling generates the hypothesis.

Q. 10. Explain in detail various methods of Data Collection.

Answer:

The task of data collection begins after a research problem has been defined and research
design/plan chalked out. While deciding about the method of data collection to be used for the
study, the researcher should keep in mind two types of data viz., primary and secondary. The
primary data are those which are collected afresh and for the first time, and thus happen to be
original in character. The secondary data, on the other hand, are those which have already been
collected by someone else and which have already been passed through the statistical process.

Data Collection Methods

Primary Data Collection

There are several methods of collecting primary data, particularly in surveys and descriptive
researches. Important ones are: (i) observation method, (ii) interview method, (iii) through
questionnaires, (iv) through schedules, and (v) other methods.
Observation Method
The observation method is the most commonly used method, especially in studies relating to
behavioral sciences. Observation becomes a scientific tool and the method of data collection for
the researcher, when it serves a formulated research purpose, is systematically planned and
recorded and is subjected to checks and controls on validity and reliability.

Interview Method
The interview method of collecting data involves presentation of oral-verbal stimuli and reply in
terms of oral-verbal responses. This method can be used through personal interviews and, if
possible, through telephone interviews.

Collection of Data through Questionnaires


This method of data collection is quite popular, particularly in case of big enquiries. It is being
adopted by private individuals, research workers, private and public organizations and even by
governments. In this method a questionnaire is sent (usually by post) to the persons concerned
with a request to answer the questions and return the questionnaire. A questionnaire consists of a
number of questions printed or typed in a definite order on a form or set of forms. The
questionnaire is mailed to respondents who are expected to read and understand the questions
and write down the reply in the space meant for the purpose in the questionnaire itself. The
respondents have to answer the questions on their own.

Collection of Data through Schedules


This method of data collection is very much like the collection of data through questionnaire,
with little difference which lies in the fact that schedules (proforma containing a set of questions)
are being filled in by the enumerators who are specially appointed for the purpose. These
enumerators along with schedules, go to respondents, put to them the questions from the
proforma in the order the questions are listed and record the replies in the space meant for the
same in the proforma. In certain situations, schedules may be handed over to respondents and
enumerators may help them in recording their answers to various questions in the said schedules.
Enumerators explain the aims and objects of the investigation and also remove the difficulties
which any respondent may feel in understanding the implications of a particular question or the
definition or concept of difficult terms.

Some Other Methods of Data Collection


Some other methods of data collection include: (a) warranty cards; (b) distributor or store audits; (c) pantry audits; (d) consumer panels; (e) using mechanical devices; (f) projective techniques; (g) depth interviews; and (h) content analysis.
Secondary Data Collection
Secondary data means data that are already available i.e., they refer to the data which have
already been collected and analyzed by someone else. When the researcher utilizes secondary
data, then he has to look into various sources from where he can obtain them. In this case he is
certainly not confronted with the problems that are usually associated with the collection of
original data. Secondary data may either be published data or unpublished data. Usually
published data are available in: (a) various publications of the central, state and local
governments; (b) various publications of foreign governments or of international bodies and their
subsidiary organizations; (c) technical and trade journals; (d) books, magazines and newspapers;
(e) reports and publications of various associations connected with business and industry, banks,
stock exchanges, etc.; (f) reports prepared by research scholars, universities, economists, etc. in
different fields; and (g) public records and statistics, historical documents, and other sources of
published information. The sources of unpublished data are many; they may be found in diaries,
letters, unpublished biographies and autobiographies and also may be available with scholars and
research workers, trade associations, labor bureaus and other public/ private individuals and
organizations. Researcher must be very careful in using secondary data. He must make a minute
scrutiny because it is just possible that the secondary data may be unsuitable or may be
inadequate in the context of the problem which the researcher wants to study. In this connection
Dr. A.L. Bowley very aptly observes that it is never safe to take published statistics at their face
value without knowing their meaning and limitations and it is always necessary to criticize
arguments that can be based on them.

Q.11. How does the case study method differ from the survey method? Enumerate the merits and limitations of the case study method.

Answer:
Case Study Research:
 It is an empirical inquiry that investigates a contemporary phenomenon within its real-life
context; when the boundaries between phenomenon and context are not clearly evident;
and in which multiple sources of evidence are used.
 Case study research design has evolved over the past few years as a useful tool for
investigating trends and specific situations in many scientific disciplines e.g. social
science, psychology, anthropology and ecology.
 In doing case study research, the "case" being studied may be an individual, organization,
event, or action, existing in a specific time and place.
 If the case study is about a group, it describes the behavior of the group as a whole, not
the behavior of each individual in the group.
 However, when "case" is used in an abstract sense, as in a claim, a proposition, or an
argument, such a case can be the subject of many research methods, not just case study
research.
 Case studies may involve both qualitative and quantitative research methods. Case studies are analyses of persons, events, decisions, periods, projects, policies, institutions, or other systems that are studied holistically by one or more methods.
 The case that is the subject of the inquiry will be an instance of a class of phenomena that provides an analytical frame (an object) within which the study is conducted and which the case illuminates and explicates.
Example: the effect of a medicine or drug on an individual, a group or a segment; the impact of a policy on a certain segment of the population, etc.

Survey Research:
 Survey research is a method of collecting information by asking questions. Sometimes
interviews are done face-to-face with people at home, in school, or at work. Other times
questions are sent in the mail for people to answer and mail back. Increasingly, surveys
are conducted by telephone.
 Survey Research may be defined as a technique whereby the researcher studies the whole
population with respect to certain sociological and psychological variables.
 The survey researcher is primarily interested in assessing the characteristics of the whole population. Since this is not practically possible, a random sample (representative of the population) is taken.
 Survey research depends upon three important factors: 1) direct contact with the sample; 2) the willingness and cooperativeness of the sample selected for the study; and 3) a trained researcher, possessing social intelligence, manipulative skill and research insight.
Example: market research, process improvement, system overhaul, customer satisfaction, etc.

In any real-life scenario we face studies of three kinds: 1) descriptive, where we describe a thing, a process or a system; 2) predictive, where we predict, on the basis of existing knowledge, what might happen with a certain change of variables; and 3) normative, where we standardize certain things, set criteria, and use facts and figures about a product or service to define its quality. The case study is descriptive in nature, while survey research may be descriptive or predictive in nature. These methods can also be broken down into two types, quantitative and qualitative. Experimental research is quantitative: the reasoning is deductive, the researcher is verifying a hypothesis, and the methods typically involve facts, figures and calculations. Case studies and action research are qualitative techniques: the reasoning is inductive; there is a lot of information, and the researcher interprets the themes that emerge from the data. With case studies, the researcher uses a lot of (rich) description about an individual, group or locale. What can we learn from this case study? If multiple case studies are created, what are the themes that emerge?
The case study brings together multiple sources of information and integrates them relative to a particular model of behavior. The survey approach selects a random sample from a designated population and elicits their responses to a set of questions about a given issue, which can be summed into a score that forms part of a normal distribution.

Merits of Case Study Method:


Following are the advantages of the case study method:

1. Intensive Study. The case study method enables the intensive study of a unit; it is the thorough and deep investigation and exploration of an event.
2. No Sampling. It studies a social unit in its entirety, which means there is no sampling in the case study method.
3. Continuous Analysis. It is valuable in analyzing continuously the life of a social unit
to dig out the facts.
4. Hypothesis Formulation. This method is useful for the formulation of hypotheses for further study.
5. Comparisons. It compares different types of facts about the study of a unit.
6. Increase in Knowledge. It enhances the analytical power of the researcher and increases knowledge about a social phenomenon.
7. Generalization of Data. Case study method provides grounds for generalization of
data for illustrating statistical findings.
8. Comprehensive. It is a comprehensive method of data collection in social research.
9. Locate Deviant Cases. Deviant cases are those units which behave against the proposed hypothesis. The case study method helps locate such cases; the tendency is to ignore them, but they are important for scientific study.
10. Framing Questionnaire or Schedule. Through the case study method we can formulate and develop a questionnaire and schedule.

Disadvantage of Case Study Method:


Case study method has the following disadvantages:
1. Limited Representativeness: Due to its narrow focus, a case study has limited representativeness and generalization is not possible.
2. No Classification: Classification is not possible because only a small unit is studied.
3. Possibility of Errors: The case study method may suffer from errors of memory and judgment.
4. Subjective Method: It is a subjective rather than an objective method.
5. Not Easy and Simple: This method is very difficult, and no layman can conduct it.
6. Bias Can Occur: Due to the narrow scope of the study, discrimination and bias can occur in the investigation of a social unit.
7. No Fixed Limits: This method depends on the situation and sets no fixed limits on the researcher's investigation.
8. Costly and Time Consuming: This method is more costly and time consuming as compared to other methods of data collection.

Q.13. Differentiate between collection of data through questionnaires and schedules.


Answer:
Both the questionnaire and the schedule are popularly used methods of data collection. The differences between the two are given below:
(i) The questionnaire is sent by post/e-mail to respondents with a covering letter, without further assistance from the researcher. The schedule is filled in by the enumerator/researcher, who can interpret the questions if necessary.
(ii) Collecting data through questionnaires is relatively cheap, since money is spent only on preparing the questionnaire and posting/e-mailing it to respondents. Data collection through schedules is more expensive, since money is also spent on preparing the schedules and taking them to respondents.
(iii) Non-response is usually high in the questionnaire method but low in the schedule method.
(iv) In the questionnaire method it is not certain who actually answers the questions, whereas in the schedule method the identity of the respondent is known.
(v) The questionnaire method is slow, since many respondents take time to return the questionnaire; the schedule method is comparatively faster.
(vi) Personal contact is generally not possible in the questionnaire method, whereas direct personal contact with respondents is established in the schedule method.
(vii) The questionnaire method can be used only when respondents are literate and cooperative, whereas the schedule method can be used even with illiterate respondents.
(viii) In the questionnaire method the risk of collecting incomplete and wrong data is higher, whereas in the schedule method relatively more accurate data are obtained.

Q. 14. Differentiate between Survey and Experiment.


Answer:
Q. 15. Explain the procedure for Hypothesis Testing.
Answer:
The first step of a research study is formulating a research problem; a hypothesis makes the research problem more focused. Two hypotheses are formulated for testing:
1) Null hypothesis H0: µ = µ0
2) Alternative hypothesis Ha: µ ≠ µ0, where µ is the population mean and µ0 is its hypothesized value.
The hypothesis is then tested by choosing a level of significance (alpha), computing a suitable test statistic from the sample data, and comparing the resulting p-value with alpha to decide whether or not to reject H0.

Type I and Type II errors of Hypothesis:

In drawing conclusions from a hypothesis test, two types of error can occur.
Type I error= Rejection of a null hypothesis when it is true.
Type II error= Acceptance of a null hypothesis when it is false.
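As a rough sketch (not part of the original answer), the decision step of this procedure can be expressed in Python; `p_value` and `alpha` are placeholders for whatever test statistic and significance level the researcher actually uses.

```python
# Minimal sketch of the hypothesis-testing decision rule; the p-values and
# the significance level alpha are placeholders for illustration.
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Reject H0 when the p-value falls below the significance level alpha."""
    if p_value < alpha:
        return "Reject H0 in favour of Ha"
    return "Fail to reject H0"

print(decide(p_value=0.03))   # Reject H0 in favour of Ha
print(decide(p_value=0.40))   # Fail to reject H0
```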

Q.16. Differentiate between the following:


i) Null Hypothesis & Alternative Hypothesis:

(1) Null hypothesis: The null hypothesis reflects that there will be no observed effect in our experiment; in its mathematical formulation there is typically an equality sign. It is denoted by H0.
Alternative hypothesis: The alternative (or experimental) hypothesis reflects that there will be an observed effect in our experiment; in its mathematical formulation there is typically an inequality (not-equal-to) sign. It is denoted by Ha or H1.

(2) Null hypothesis: The null hypothesis is what we attempt to find evidence against in our hypothesis test. We hope to obtain a p-value small enough that it is lower than our level of significance alpha, so that we are justified in rejecting the null hypothesis; if the p-value is greater than alpha, we fail to reject the null hypothesis.
Alternative hypothesis: The alternative hypothesis is what we attempt to demonstrate in an indirect way through the hypothesis test.

(3) Null hypothesis: If the null hypothesis is not rejected, we must be careful about what this means. The thinking is similar to a legal verdict: just because a person has been declared "not guilty", it does not mean that he is innocent. In the same way, failing to reject a null hypothesis does not mean that the statement is true.
Alternative hypothesis: If the null hypothesis is rejected, we accept the alternative hypothesis; if the null hypothesis is not rejected, we do not accept the alternative hypothesis.

(4) Null hypothesis: For example, we may want to investigate the claim that, despite what convention has told us, the mean adult body temperature is not the accepted value of 98.6 degrees Fahrenheit. The null hypothesis for such an experiment is "The mean adult body temperature for healthy individuals is 98.6 degrees Fahrenheit." If we fail to reject the null hypothesis, our working hypothesis remains that the average healthy adult has a temperature of 98.6 degrees; we do not prove that this is true.
Alternative hypothesis: For the same example, the alternative hypothesis is "The mean adult human body temperature is not 98.6 degrees Fahrenheit."

(5) Null hypothesis: If we are studying a new treatment, the null hypothesis is that the treatment will not change our subjects in any meaningful way; in other words, the treatment will not produce any effect in our subjects.
Alternative hypothesis: The alternative hypothesis is that the treatment does, in fact, change our subjects in a meaningful and measurable way.

(6) Typical formulations:
Null hypothesis: "x is equal to y."  Alternative hypothesis: "x is not equal to y."
Null hypothesis: "x is at least y."  Alternative hypothesis: "x is less than y."
Null hypothesis: "x is at most y."  Alternative hypothesis: "x is greater than y."
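As a hedged illustration of formulating and testing H0: µ = 98.6 against Ha: µ ≠ 98.6, the sketch below assumes SciPy is available and uses made-up temperature readings, not real data.

```python
# Minimal sketch: one-sample t-test of H0: mu = 98.6 vs Ha: mu != 98.6.
# The sample values below are invented illustrative readings.
from scipy import stats

body_temps = [98.2, 97.9, 98.6, 98.1, 98.4, 98.0, 98.8, 97.7, 98.3, 98.5]
alpha = 0.05

t_stat, p_value = stats.ttest_1samp(body_temps, popmean=98.6)

if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject H0; evidence that mu != 98.6")
else:
    print(f"p = {p_value:.3f} >= {alpha}: fail to reject H0")
```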

ii) One-tailed Test and Two-tailed Test:

(1) One-tailed test: A one-tailed test allows you to determine whether one mean is greater than or less than another mean, but not both; a direction must be chosen prior to testing.
Two-tailed test: A two-tailed test allows you to determine whether two means are different from one another; a direction does not have to be specified prior to testing.

(2) One-tailed test: In other words, a one-tailed test tells you the effect of a change in one direction and not the other. For example, if you are trying to decide whether to buy a brand-name product or a generic product at your local drugstore, a one-tailed test of the effectiveness of the product would only tell you whether the generic product worked better than the brand name; you would have no insight into whether it was equivalent or worse.
Two-tailed test: In other words, a two-tailed test takes into account the possibility of both a positive and a negative effect.

(3) One-tailed test: Since the generic product is cheaper, you could see what looks like a minimal impact but is in fact a negative impact (meaning it does not work well at all), and still go ahead and purchase the generic product because it is cheaper.
Two-tailed test: If you were doing a two-tailed test of the generic against the brand-name product, you would have insight into whether the effectiveness of the product was equivalent to or worse than the brand name. You can then make a more educated decision: if the generic product is equivalent you would purchase it because it is cheaper, but if it is far less effective you would probably pay the extra money for the brand-name product rather than waste money on an ineffective one.

(4) One-tailed test: One-tailed tests should therefore be used only when you are not worried about missing an effect in the untested direction.
Two-tailed test: Two-tailed tests should be used when you cannot afford to miss an effect in either direction, i.e., when one mean could turn out to be greater than, less than, or similar to the other.

(5) One-tailed test: In terms of optimization, if you run a test using only a one-tailed alternative, you will see significance only if the new variant outperforms the default; there are two possible outcomes: the new variant wins, or it cannot be distinguished from the default.
Two-tailed test: With a two-tailed test, you will see significance if the new variant's mean differs from that of the default in either direction; there are three possible outcomes: the new variant wins, loses, or is similar to the default.
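A minimal sketch of the practical difference, assuming SciPy version 1.6 or later (where `ttest_1samp` accepts an `alternative` argument); the conversion figures below are invented for illustration.

```python
# Illustrative sketch: one-tailed vs two-tailed one-sample t-test.
# Assumes SciPy >= 1.6; the conversion-rate figures are made up.
from scipy import stats

variant_conversion = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 0.12, 0.15]
default_mean = 0.12

# Two-tailed: Ha is "mean != 0.12" (detects an effect in either direction)
t2, p_two = stats.ttest_1samp(variant_conversion, popmean=default_mean,
                              alternative="two-sided")

# One-tailed: Ha is "mean > 0.12" (only detects an improvement)
t1, p_one = stats.ttest_1samp(variant_conversion, popmean=default_mean,
                              alternative="greater")

print(f"two-tailed p = {p_two:.3f}, one-tailed (greater) p = {p_one:.3f}")
```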

iii) Type-I Error and Type-II Error:

(1) Type-I error: The first kind of error involves the rejection of a null hypothesis that is actually true. It is called a type I error, and is sometimes called an error of the first kind.
Type-II error: The other kind of error occurs when we do not reject a null hypothesis that is false. It is called a type II error, and is also referred to as an error of the second kind.

(2) Type-I error: Type I errors are equivalent to false positives. Consider the example of a drug being used to treat a disease: if we reject the null hypothesis, our claim is that the drug does, in fact, have some effect on the disease. But if the null hypothesis is true, then in reality the drug does not combat the disease at all, and the drug is falsely claimed to have a positive effect.
Type-II error: In the same drug-testing scenario, a type II error would occur if we accepted that the drug had no effect on the disease when, in reality, it did.

(3) Type-I error: The value of alpha, which is related to the level of significance we selected, has a direct bearing on type I errors: alpha is the maximum probability of committing a type I error.
Type-II error: The value of alpha also has a bearing on type II errors; the probability of a type II error is given by the Greek letter beta.

(4) Type-I error: For a 95% confidence level, the value of alpha is 0.05. This means there is a 5% probability that we will reject a true null hypothesis; in the long run, one out of every twenty hypothesis tests performed at this level will result in a type I error.
Type-II error: We could decrease alpha from 0.05 to 0.01, corresponding to a 99% level of confidence, but if everything else remains the same, the probability of a type II error will nearly always increase. Beta is related to the power, or sensitivity, of the hypothesis test, which equals 1 – beta.
Q. 17. Explain Generalization and Interpretation in research.
Answer:

After collecting and analyzing the data, the researcher has to accomplish the task of drawing
inferences followed by report writing. This has to be done very carefully, otherwise misleading
conclusions may be drawn and the whole purpose of doing research may get vitiated. It is only
through interpretation that the researcher can expose relations and processes that underlie his
findings. In case of hypotheses testing studies, if hypotheses are tested and upheld several times,
the researcher may arrive at generalizations. But in case the researcher had no hypothesis to start
with, he would try to explain his findings on the basis of some theory. This may at times result in
new questions, leading to further researches. All this analytical information and consequential
inference(s) may well be communicated, preferably through research report, to the consumers of
research results who may be either an individual or a group of individuals or some public/private
organization.
Interpretation refers to the task of drawing inferences from the collected facts after an
analytical and/or experimental study. In fact, it is a search for broader meaning of research
findings. The task of interpretation has two major aspects viz., (i) the effort to establish
continuity in research through linking the results of a given study with those of another, and (ii)
the establishment of some explanatory concepts. “In one sense, interpretation is concerned with
relationships within the collected data, partially overlapping analysis. Interpretation also extends
beyond the data of the study to include the results of other research, theory and hypotheses.”1
Thus, interpretation is the device through which the factors that seem to explain what has been
observed by researcher in the course of the study can be better understood and it also provides a
theoretical conception which can serve as a guide for further researches.
Interpretation is essential for the simple reason that the usefulness and utility of research
findings lie in proper interpretation. It is being considered a basic component of research process
because of the following reasons:
(i) It is through interpretation that the researcher can well understand the abstract principle that
works beneath his findings. Through this he can link up his findings with those of other studies,
having the same abstract principle, and thereby can predict about the concrete world of events.
Fresh inquiries can test these predictions later on. This way the continuity in research can be
maintained.
(ii) Interpretation leads to the establishment of explanatory concepts that can serve as a guide for
future research studies; it opens new avenues of intellectual adventure and stimulates the quest
for more knowledge.
(iii) Only through interpretation can the researcher appreciate why his findings are what they are and make others understand the real significance of his research findings.
(iv) The interpretation of the findings of exploratory research study often results into hypotheses
for experimental research and as such interpretation is involved in the transition from exploratory
to experimental research. Since an exploratory study does not have a hypothesis to start with, the
findings of such a study have to be interpreted on a post-factum basis in which case the
interpretation is technically described as ‘post factum’ interpretation.
Q.20. What are the different methods of data analysis? Explain.
Answer:
There are differences between qualitative data analysis and quantitative data analysis. In qualitative research using interviews, focus groups, experiments, etc., data analysis involves identifying common patterns within the responses and critically analyzing them in order to achieve the research aims and objectives. Data analysis for quantitative studies, on the other hand, involves critical analysis and interpretation of figures and numbers, and attempts to find the rationale behind the emergence of the main findings. Comparing the primary research findings to the findings of the literature review is critically important for both types of studies, qualitative and quantitative. Data analysis in the absence of primary data collection can involve discussing common patterns, as well as controversies, within secondary data directly related to the research area.
What is the first thing that comes to mind when we see data? The first instinct is to find patterns, connections, and relationships; we look at the data to find meaning in it. Similarly, in research, once data is collected, the next step is to get insights from it. For example, if a clothing brand is trying to identify the latest trends among young women, the brand will first reach out to young women and ask them questions relevant to the research objective. After collecting this information, the brand will analyze the data to identify patterns; for example, it may discover that most young women would like to see a greater variety of jeans.
Data analysis is how researchers go from a mass of data to meaningful insights. There are many
different data analysis methods, depending on the type of research. Here are a few methods you
can use to analyze quantitative and qualitative data.

 Analyzing Quantitative Data:

Data Preparation

The first stage of analyzing data is data preparation, where the aim is to convert raw data into something meaningful and readable. It includes the following steps:

Step 1: Data Validation


The purpose of data validation is to find out, as far as possible, whether the data collection was
done as per the pre-set standards and without any bias. It is a four-step process, which includes…

 Fraud, to infer whether each respondent was actually interviewed or not.


 Screening, to make sure that respondents were chosen as per the research criteria.
 Procedure, to check whether the data collection procedure was duly followed.
 Completeness, to ensure that the interviewer asked the respondent all the questions,
rather than just a few required ones.
To do this, researchers would need to pick a random sample of completed surveys and validate
the collected data. (Note that this can be time-consuming for surveys with lots of responses.) For
example, imagine a survey with 200 respondents split into 2 cities. The researcher can pick a
sample of 20 random respondents from each city. After this, the researcher can reach out to them
through email or phone and check their responses to a certain set of questions.
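A possible sketch of this validation step, with hypothetical field names (`city`, `id`) and mock responses, is given below; it simply draws a random sample per city for manual call-backs.

```python
# Hedged sketch of the validation step: pick a random sample of completed
# survey responses per city for re-checking. Field names are hypothetical.
import random

def pick_validation_sample(responses, per_city=20, seed=42):
    """Return a dict {city: [sampled responses]} for manual re-checking."""
    random.seed(seed)
    by_city = {}
    for r in responses:
        by_city.setdefault(r["city"], []).append(r)
    return {city: random.sample(rows, min(per_city, len(rows)))
            for city, rows in by_city.items()}

# Example: 200 mock respondents split across 2 cities
responses = [{"id": i, "city": "City A" if i % 2 else "City B"} for i in range(200)]
sample = pick_validation_sample(responses)
print({city: len(rows) for city, rows in sample.items()})  # 20 sampled per city
```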

Step 2: Data Editing


Typically, large data sets include errors. For example, respondents may fill fields incorrectly or
skip them accidentally. To make sure that there are no such errors, the researcher should
conduct basic data checks, check for outliers, and edit the raw research data to identify and clear
out any data points that may hamper the accuracy of the results.
For example, an error could be fields that were left empty by respondents. While editing the data, it is important to make sure that all the empty fields are either filled in or removed; there are several standard methods for dealing with such missing data.
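A brief illustrative sketch of such editing with pandas follows; the columns, values and fill strategies are assumptions for the example, not prescriptions.

```python
# Illustrative sketch of basic data editing with pandas: spotting empty fields
# and either filling or dropping them. Column names and values are invented.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "age":    [25, np.nan, 31, 42, np.nan],
    "city":   ["Pune", "Mumbai", None, "Delhi", "Pune"],
    "rating": [4, 5, 3, None, 4],
})

print(df.isna().sum())                                       # empty fields per column

df["city"] = df["city"].fillna("Unknown")                    # fill missing categories
df["rating"] = df["rating"].fillna(df["rating"].median())    # impute a numeric field
df = df.dropna(subset=["age"])                               # drop rows missing a key field
```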

Step 3: Data Coding


This is one of the most important steps in data preparation. It refers to grouping and assigning
values to responses from the survey.

For example, if a researcher has interviewed 1,000 people and now wants to find the average age of the respondents, the researcher will create age buckets and categorize the age of each respondent as per these codes (for example, respondents between 13-15 years old would have their age coded as 0, 16-18 as 1, 19-21 as 2, etc.). Then, during analysis, the researcher can deal with simplified age brackets rather than a massive range of individual ages.
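One possible way to sketch this coding step with pandas is shown below; the ages and bucket edges are illustrative only.

```python
# Hedged sketch of data coding: bucketing respondent ages into coded groups
# with pandas.cut. The ages and bucket edges are invented for illustration.
import pandas as pd

ages = pd.Series([14, 17, 20, 15, 19, 13, 18, 16])

codes = pd.cut(ages,
               bins=[12, 15, 18, 21],   # buckets (12,15], (15,18], (18,21]
               labels=[0, 1, 2])        # code assigned to each bucket

print(pd.DataFrame({"age": ages, "code": codes}))
print(codes.value_counts().sort_index())  # respondents per coded bucket
```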
Quantitative Data Analysis Methods: After these steps, the data is ready for analysis. The
two most commonly used quantitative data analysis methods are descriptive statistics and
inferential statistics.

Descriptive Statistics
Typically descriptive statistics (also known as descriptive analysis) is the first level of analysis. It
helps researchers summarize the data and find patterns. A few commonly used descriptive
statistics are:

Mean: numerical average of a set of values.


Median: midpoint of a set of numerical values.
Mode: most common value among a set of values.
Percentage: used to express how a value or group of respondents within the data relates to a
larger group of respondents.
Frequency: the number of times a value is found.
Range: the difference between the highest and lowest values in a set of values.
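A minimal sketch computing the statistics listed above with Python's standard library; the sample ages are invented.

```python
# Minimal sketch of the descriptive statistics listed above, using only the
# standard library. The sample ages are made up for illustration.
import statistics
from collections import Counter

ages = [21, 23, 23, 25, 27, 27, 27, 30, 34]

print("Mean:     ", statistics.mean(ages))
print("Median:   ", statistics.median(ages))
print("Mode:     ", statistics.mode(ages))
print("Range:    ", max(ages) - min(ages))
print("Frequency:", Counter(ages))                     # times each value occurs
print("Percentage over 25: {:.1f}%".format(
    100 * sum(a > 25 for a in ages) / len(ages)))
```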
Descriptive statistics provide absolute numbers. However, they do not explain the rationale or
reasoning behind those numbers. Before applying descriptive statistics, it’s important to think
about which one is best suited for your research question and what you want to show. For
example, a percentage is a good way to show the gender distribution of respondents.

Descriptive statistics are most helpful when the research is limited to the sample and does not
need to be generalized to a larger population. For example, if you are comparing the percentage
of children vaccinated in two different villages, then descriptive statistics is enough. Since descriptive analysis is mostly used for analyzing a single variable, it is often called univariate analysis.
 Analyzing Qualitative Data:

Qualitative data analysis works a little differently from quantitative data, primarily because
qualitative data is made up of words, observations, images, and even symbols. Deriving absolute
meaning from such data is nearly impossible; hence, it is mostly used for exploratory research.
While in quantitative research there is a clear distinction between the data preparation and data
analysis stage, analysis for qualitative research often begins as soon as the data is available.

Data Preparation and Basic Data Analysis:

Analysis and preparation happen in parallel and include the following steps:

1. Getting familiar with the data: Since most qualitative data is just words, the researcher
should start by reading the data several times to get familiar with it and start looking for
basic observations or patterns. This also includes transcribing the data.

2. Revisiting research objectives: Here, the researcher revisits the research objective and
identifies the questions that can be answered through the collected data.

3. Developing a framework: Also known as coding or indexing, here the researcher


identifies broad ideas, concepts, behaviors, or phrases and assigns codes to them. For
example, coding age, gender, socio-economic status, and even concepts such as the
positive or negative response to a question. Coding is helpful in structuring and labeling
the data.
4. Identifying patterns and connections: Once the data is coded, the research can start
identifying themes, looking for the most common responses to questions, identifying data
or patterns that can answer research questions, and finding areas that can be explored
further.

Qualitative Data Analysis Methods:

Several methods are available to analyze qualitative data. The most commonly used data analysis
methods are:

 Content analysis: This is one of the most common methods used to analyze qualitative data. It is used to analyze documented information in the form of texts, media, or even physical items. Whether to use this method depends on the research questions. Content analysis is usually used to analyze responses from interviewees (a simple frequency-count sketch is given after this list).
 Narrative analysis: This method is used to analyze content from various sources, such as
interviews of respondents, observations from the field, or surveys. It focuses on using the
stories and experiences shared by people to answer the research questions.

 Discourse analysis: Like narrative analysis, discourse analysis is used to analyze


interactions with people. However, it focuses on analyzing the social context in which the
communication between the researcher and the respondent occurred. Discourse analysis
also looks at the respondent’s day-to-day environment and uses that information during
analysis.

 Grounded theory: This refers to using qualitative data to explain why a certain
phenomenon happened. It does this by studying a variety of similar cases in different
settings and using the data to derive causal explanations. Researchers may alter the
explanations or create new ones as they study more cases until they arrive at an
explanation that fits all cases.
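As noted in the content-analysis item above, here is a very simple, hedged sketch of a frequency-count style content analysis; the transcripts and theme keywords are invented for illustration.

```python
# Hedged sketch of a very simple content-analysis step: counting how often
# candidate theme words occur across interview transcripts. The transcripts
# and theme keywords below are invented.
from collections import Counter
import re

transcripts = [
    "The price was fair but delivery was slow and support was helpful.",
    "Delivery took too long; otherwise the price and quality were fine.",
    "Support resolved my issue quickly, and the quality is good.",
]
themes = {"price", "delivery", "support", "quality"}

counts = Counter()
for text in transcripts:
    words = re.findall(r"[a-z]+", text.lower())
    counts.update(w for w in words if w in themes)

print(counts.most_common())   # which themes dominate the responses
```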
