
Informatics: Information Sources

This document discusses various sources of information that can be collected and analyzed. It categorizes sources as either literal, meaning they are in a directly understandable format, or non-literal, requiring interpretation. Literal sources include open source information from the internet, human intelligence from interviews, communications intelligence from intercepted signals, and cyber intelligence from networks. Non-literal sources require analysis and include imagery from satellites and aircraft, signals from sensors, radar scans, and biometric data on individuals. The document cautions that not all data is perfectly accurate and some fidelity may be lost, providing challenges for analysis.

Uploaded by

Colin
Copyright
© © All Rights Reserved

Informatics

Lecture 2
Information Sources

Introduction
This lecture is concerned with the
sources available to us for the huge
variety of information that could be
useful.
Later lectures will consider how to
process the information and prioritise
it.
To begin with, consider why we are
collecting information in the first place.

Why
We are presumably collecting data in
order to inform a decision that has to
be made by an organisation,
individual or government:
What products to manufacture?
What items to market, and to whom?
What financial projections to make?
Should we monitor an individual?
Should we arrest an individual?

And so...
The issues associated with this are
varied:
There is often no shortage of data
There is often too much
It's rare that the decision is obvious
Some data will contradict other data
Some data is simply incorrect
Some data is being hidden from us

The range of sources is huge

Publications: books and reports
Retail data
Personal data: NHS, education
Social media
Government: passports, registers
Gadgets: mobile phones etc.
Media: news and viewer data
CCTV
Emails, texts, photos, Twitter etc.

Off the grid


This range shows how difficult it can
be to be regarded as truly "off the grid".
Think of a typical day in your life and
the data that you are creating each
second that can be linked to you.
For this reason, countries where
government has broken down are
often attractive to criminal elements.

Type of content
The range of data sources is very
varied and so, as a result, is the range
of types of data. A limited number of
examples could include:
Numeric: quantities to calculate
totals and averages
Text: names of people and
objects, significant words

Type of content
Locations and times: derived from
GPS data
Video: records of events, placing
people in time and place
Photos: also place people, show
relationships and identify locations
Narrative: opinions and views
Dialogue: deception, grooming

Fidelity of data
It is often thought that digital (binary)
data can be stored and transmitted
without any loss of fidelity.
We must remember, however, that a
lot of data either originates from
sensors (e.g. cameras) or is
compressed to reduce file size.
This can lead to a loss of fidelity and
so to uncertainty.
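
As a rough illustration of how compression can cost fidelity, the sketch below rounds readings onto a small number of levels (simple quantisation, used here only as a stand-in for real lossy compression). The sample values and the function names `quantise` and `max_error` are assumptions made for the example, not part of the lecture.

```python
# Hypothetical sensor readings in the range 0..1 (illustrative values only).
samples = [0.12, 0.37, 0.38, 0.91]


def quantise(value, levels):
    """Round a value in [0, 1] onto one of `levels` evenly spaced steps."""
    step = 1.0 / (levels - 1)
    return round(value / step) * step


def max_error(values, levels):
    """Largest absolute difference between original and quantised values."""
    return max(abs(v - quantise(v, levels)) for v in values)


coarse = [quantise(v, 4) for v in samples]    # heavy "compression": 4 levels
fine = [quantise(v, 256) for v in samples]    # light "compression": 256 levels

print("coarse:", coarse, "max error:", max_error(samples, 4))
print("fine:  ", fine, "max error:", max_error(samples, 256))
```

With only four levels the readings 0.37 and 0.38 collapse onto the same stored value, so the difference between them can no longer be recovered. That is the kind of uncertainty the slide refers to.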

Example - CCTV
The UK is awash with CCTV cameras,
but many of them are of such poor
quality that identifying an individual is
very difficult.
This is improving: high-quality HD
cameras are becoming more
affordable, and less footage is being
stored on recycled VHS tapes.

[Image slide: CCTV]

[Image slide: CCTV and photo]

Audio
People can be recognised from
their voice, and words can be
identified from the dialogue.
Most speech (phone etc.) is heavily
compressed to save space, and this
can compromise the processing.
Of course, it is possible to
deliberately disguise a voice with
gadgets made for the purpose.

Audio - dialogue
It is possible to capture a
conversation and analyse it for a
number of items of interest:
Age of the speaker
Nationality: accents etc.
Emotional state: angry, stressed, frightened...
Expressing a view or opinion
Being ironic etc.

Ethics
Of course, it is possible to gather data
that is in the public domain, and this is
increasingly useful.
Most governments can covertly
monitor their citizens, sometimes
after a legal application.
There are, of course, ethical issues in
gathering data without the knowledge
of the individual (see later lectures).

Ethics

Is it ethical to gather data from an
individual provided that you don't look
at it without legal process?

Official hacking
One important source of data is, of
course, that obtained by state-sponsored hacking.
In this way many nations are turning
to an offensive mode of dealing with
cybercrime.
The Flame malware gathers data and has been
described as roughly 20x more complex than Stuxnet.

Intelligence analysis
In both the security and
commercial worlds, the analysis of
masses of data to extract meaning
and inform decisions is becoming more
sophisticated.
The remainder of this lecture
considers data sources from the
intelligence analysis viewpoint.

Literal and Non-Literal Sources


All data sources can be categorised
as either literal or non-literal.
Within these two categories there are
further classifications.
A taxonomy of sources has been
developed, and this can assist in
giving appropriate weight to each
piece of data.
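
A minimal sketch of how such a taxonomy might be held in code, so that each piece of data can be tagged with its category. The groupings follow the source types named in this lecture. The dictionary layout and the `classify` function are assumptions for illustration, and no weights are assigned because the lecture does not give any.

```python
# Two top-level categories with the source types listed in this lecture.
TAXONOMY = {
    "literal": ["OSINT", "HUMINT", "COMINT", "CYBER"],
    "non-literal": [
        "IMINT", "ELINT", "FISINT", "RADAR",
        "GEOPHYSICAL", "MATERIALS", "MATERIEL", "BIOMETRICS",
    ],
}


def classify(source):
    """Return 'literal' or 'non-literal' for a named source type."""
    name = source.strip().upper()
    for category, members in TAXONOMY.items():
        if name in members:
            return category
    raise ValueError(f"Unknown source type: {source!r}")


for item in ["humint", "Radar", "Biometrics"]:
    print(item, "->", classify(item))
```

Raising an error for an unrecognised source, rather than guessing a category, keeps unclassified data from silently being given the wrong weight.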

Literal Sources
In a form suitable for human
communication.

Open Source
Human
Communication
Cyber

Open Source - OSINT
Publicly available information

Internet
The largest repository (not surprisingly)
But what about the quality?
Intentionally misleading?
Perhaps a good starting point for searching
Is this material easily overlooked BECAUSE it is
not classified?

Online databases
Overload: how to extract useful information

Commercial
Imagery satellites
Commercial databases
All for a price!

Archives

Human - HUMINT
HUMINT focuses on humans and their access
to information (takes time to acquire)
Often the best method of dealing with illicit
networks or for finding:
Opponents' plans, trade secrets, certain indicators
And tip-offs

Often gathered by working with others:
Liaison relationships with other intelligence networks
Elicitation (drawing out information from
conversations)
Émigrés (legal) / defectors (illegal)
Clandestine sources (spies, moles etc.)
Sampling (e.g. a poll to get an indication of opinion)

Communication - COMINT
Generally a governmental activity rather than a
private one (which would be illegal)
The interception, processing and reporting of an
opponent's communications
E.g. voice, fax, data comms, internet, any other
deliberate transmission
Collected by aircraft, satellites, ground bases, sea etc.
Insights into plans/intentions (people, organisations,
financial, facilities, budgets, procedures etc.)
Relationships? Classified projects?

Labour-intensive, e.g. to translate the communications
Radio comms, code cracking, encryption/decryption

Microphone (wire)/audio (radio), telephone
tapping, bugging, satellite storage (illegal to
use?), liaison relationships

Cyber
Collection from an information system or
network (a mix of HUMINT/COMINT/OSINT)
Becoming a rich source of intelligence
E.g. target personnel databases for personal
information and possible recruitment as
HUMINT
Lower risk than obtaining it by spying
The hacker is on the offensive (and usually
wins); defence is much harder
Does the defender think like the attacker?
Large systems have more vulnerabilities

Cyber
Gain access, exploit with tools, remove
evidence
Survey possible networks, probe a network
(for vulnerabilities), hack it (install software
backdoors), use backdoors to sustain
collection
Sustained collection uses:
Trojan horses
Worms (entirely concealed)
Rootkits (software to avoid detection)
Keystroke loggers

Design social engineering attacks

Non-literal Sources
Require human interpretation

Remote and In-situ
Imaging
Radiofrequency
Radar
Geophysical / Nuclear
Material and Materiel
Biometrics

Remote and In-situ


Remote: sensing of the Earth from satellites,
or vice versa
Can cover large areas quickly
In-situ: sensors detect changes in water,
air, earth immediately in the vicinity of the
sensor
E.g. an aircraft carrying such a sensor to
measure effects of a nuclear test on the
atmosphere
They don't possess the broad-area search of
remote sensors, i.e. they have smaller ranges
E.g. tracking the trace element signature in
the wake of a submarine

Imaging - IMINT
Visible Photography
Camera, aircraft, spacecraft
Open source
Ever zoomed into your own house (or someone
else's) using Google Earth?

Photography/video (handheld)
Imaging radar (mounted on craft)
Electro-optical imaging (not good when cloudy)
Radiometry/spectrometry (heat-related
emission)
Spectral imaging (combines the above two)

Passive Radio Frequency


Radio-frequency signals are emitted during
normal human activity
ELINT (electronic intelligence, e.g. a motorist
detecting police radar), FISINT (foreign
instrumentation signals intelligence)

ELINT is useful for tactical intelligence
E.g. tracking a vehicle by pinpointing the
location of the radars it carries

FISINT is telemetry, a means of
deliberately sending signal data back to
ground sites during/after failure of, say,
an aircraft

Radar
Tracking of targets: satellites,
missiles, ships, aircraft, other
vehicles in combat
E.g. a missile trajectory can be
detected by radar
We are all familiar with (and thankful for)
radar navigation during inclement
weather
Although the quality of radar imagery
does not match optical, it is very good
in poor visual conditions

Geophysical / Nuclear
Collection, processing, exploitation of
environmental disturbances transmitted
through earth, water or atmosphere
E.g. magnetic sensing of vehicles,
submarines

Acoustic (ACINT or ACOUSTINT)
emissions
Seismic intelligence from underground/
underwater explosions (akin to earthquake
data)
Nuclear radiation detectors

Materiel / Materials
Usually clandestine and HUMINT
Materials
Particulate, trace elements, effluent,
debris
Nuclear, chemical, biological issues
DNA

Materiel
Equipment, apparatus, supplies
Stealing competitors' sample products

Biometrics
Capturing a person's physical or behavioural
characteristics that identify them
Morse code operators were identified this way in
World War II, by their "fist"

Face, voice, iris, fingerprint, ear, vein, DNA,
odour!, thermal, gait, hand, palm, typing, soft
biometrics like height, weight, clothes, hair
Research is moving towards multi-modal systems
to integrate the uni-modal systems
Intelligence for border controls in particular
How good are the detection methods?
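
One common way to integrate uni-modal systems is to combine their match scores into a single value (score-level fusion). The sketch below uses a weighted average. The lecture does not say which fusion method is meant, so the method, the function name `fuse_scores`, and every score and weight shown are hypothetical.

```python
def fuse_scores(scores, weights):
    """Weighted average of per-modality match scores, each in [0, 1].

    Only the modalities present in `scores` are used, so a missing
    modality (e.g. no voice sample) does not drag the result down.
    """
    total_weight = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_weight


# Hypothetical match scores from three uni-modal systems.
scores = {"face": 0.62, "voice": 0.80, "gait": 0.55}

# Hypothetical weights reflecting how much each system is trusted.
weights = {"face": 0.5, "voice": 0.3, "gait": 0.2}

fused = fuse_scores(scores, weights)
print("fused score:", round(fused, 2))
```

The fused score would then be compared with a decision threshold. Choosing the weights and threshold is exactly where the question "How good are the detection methods?" matters.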

Summary
Categorisation of data sources
Primary, secondary, tertiary

Storage
Databases
Data warehouses
Social media

Security and business intelligence

Security Sources
Literal: OSINT, HUMINT, COMINT, Cyber
Non-literal: measurement
IMINT, ELINT, FISINT
Radar, photographic
Geophysical
Biometrics

Readings
R. Clark, Intelligence Analysis
Chapter 6 for security part of lecture

Wikipedia is good for definitions of
all acronyms (HUMINT etc.) and
further reading
