RICT User Guide – Predict and Classify
User Documentation
This document provides a guide to the interactive Predict and Classify process within RICT.
Note that it is focused on how the tool operates rather than the science/rationale behind it.
In order to carry out a Predict and Classify run, the following data needs to be
provided/specified:
Environmental Variable (EV) data needs to be provided to feed into the Prediction
process, which calculates ‘expected’ index values.
Currently, the following data is required for each site (although this may change in
future if new sets of Predictive Environmental Variables (PEVs) are created):
Sites are classified for each relevant index by dividing the Observed Value by the
Expected Value to obtain an Environmental Quality Index (EQI). The EQI is then
compared against limits to obtain a classification status (e.g. High).
Therefore, Observed Values need to be provided for each Site for each Index that is
to be classified.
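The EQI calculation described above can be sketched as follows. This is a minimal illustration, not RICT's actual implementation: the status boundaries shown are taken from the sample NTAXA limits in Appendix 5, and the expected value passed in is a made-up figure for demonstration.

```python
# Minimal sketch of the EQI calculation and status lookup described above.
# The NTAXA boundaries below come from the sample Limits file (Appendix 5);
# real runs read them from the configured Limits file.

NTAXA_LIMITS = [  # (status, lower bound on EQI), checked from best to worst
    ("High", 0.8879),
    ("Good", 0.7417),
    ("Moderate", 0.5954),
    ("Poor", 0.4910),
    ("Bad", 0.0),
]

def classify(observed: float, expected: float) -> tuple[float, str]:
    """Divide Observed by Expected to get the EQI, then map it to a status."""
    eqi = observed / expected
    for status, lower in NTAXA_LIMITS:
        if eqi >= lower:
            return eqi, status
    return eqi, "Bad"

# Hypothetical site: observed NTAXA of 29 against an expected value of 31.5
eqi, status = classify(observed=29, expected=31.5)  # EQI ~0.92 -> "High"
```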
In order for the Prediction and Classification run to be carried out in the way the user
wishes, a number of settings need to be provided. The key ones are:
Note that Number of Iterations can also be provided as a setting, but if not provided,
it will be set to the default value held within the administration section.
The classification process takes account of Bias when varying the Observed Values
and so Bias Values need to be provided for each Index for the Season Code relevant
to the Run.
If no Bias values are provided for an Index, then a value of zero is used.
The classification process compares EQIs against limits and so the limits to be used
for the run need to be provided for each index.
3. Options for Providing the Data
The input/output formats for RICT are XML. However, there are a number of options for
providing/specifying the required data.
The XML file(s) must conform to the specified XML schema and an
example of a valid file for one site is provided in Appendix 1.
The user can either create new XML file(s) or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.
Once created the file(s) can be uploaded to RICT during the run – see
Section 4.
It is expected that many users will maintain their EV data in Excel, and so a
special RICT Data Entry spreadsheet has been created that enables XML
files to be generated from Excel that can be processed by RICT.
More details are being provided on this separately but, briefly, the EV data
has to be entered/copied into the appropriate columns in the ‘Environmental
Variables’ worksheet and then the ‘Start Here’ worksheet is used to generate
the XML format file.
The generated file(s) can then be uploaded to RICT during the run – see
Section 4.
Note that the RICT Data Entry Spreadsheet has been set up so that data from
an existing RIVPACS EV file can be cut and pasted into the spreadsheet if
required.
Note also that the RICT Data Entry Spreadsheet is only applicable for
existing EVs. If new EVs are introduced in future, then a new Data Entry
Spreadsheet will be required and a change made to RICT to recognise the
new data.
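For readers generating the EV file outside the Data Entry spreadsheet, the target format is straightforward to produce programmatically. The sketch below builds an Appendix 1 style fragment from a plain dictionary of values; the helper name is illustrative and not part of RICT, and the header elements (Creator, Creation_Date and the dataset-level Name) are reduced to the minimum for brevity.

```python
# Hedged sketch of building an EV file in the Appendix 1 format from a
# dict of values. Site ID, year and EV names mirror the Appendix 1 sample;
# the build_ev_file helper is illustrative, not part of RICT itself.
import xml.etree.ElementTree as ET

def build_ev_file(site_id: str, year: str, evs: dict[str, str]) -> ET.Element:
    root = ET.Element("Datasets")
    dataset = ET.SubElement(root, "Dataset", ID=site_id, Year=year)
    ET.SubElement(dataset, "Name").text = site_id
    for ev_id, value in evs.items():
        # Each EV carries its ID, a Description (here just the ID) and a Value
        ev = ET.SubElement(dataset, "EV", ID=ev_id)
        ET.SubElement(ev, "Description").text = ev_id
        ET.SubElement(ev, "Value").text = value
    return root

root = build_ev_file("9875", "2007", {"ALTITUDE": "190", "SLOPE": "1.1"})
xml_text = ET.tostring(root, encoding="unicode")
```

The resulting text can be saved with a `.xml` extension and uploaded like any hand-edited file, provided it validates against the EV schema (`ev.xsd`).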
As for 3.1 c) except for entering/copying the data into the ‘Observed and
Expected’ worksheet.
As for 3.1 d)
The system has a default settings file defined which contains the settings
that will be used for a run if no other settings file is provided – see Appendix
3 for example.
It is possible to change the file that is defined as the Default Settings file via
the Administration function (see separate guide). This might be useful if a
number of future runs are to have the same settings.
The user can either create a new Settings file or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.
Note that it will still be possible to amend individual settings prior to
scheduling a run.
Note also that the Settings for a particular run are subsequently saved as an
XML format file. Therefore, a new Settings file could be created by using
the Default Settings file and then amending the required settings prior to
scheduling the run.
The system has a default bias file defined which contains the bias values
that will be used for a run if no other bias file is provided – see Appendix 4
for example.
It is possible to change the file that is defined as the Default Bias file via the
Administration function (see separate guide). This might be useful if a
number of future runs are to have the same bias values.
Note that it will still be possible to amend bias values prior to scheduling a
run.
Rather than change the Default Bias file, it is possible to upload a Bias File
for use during the particular run. This would be useful if the required bias
values are significantly different from the defaults.
The user can either create a new Bias file or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.
Note that it will still be possible to amend bias values prior to scheduling a
run.
Note also that the bias values for a particular run are subsequently saved as
an XML format file. Therefore, a new Bias file could be created by using the
Default Bias file and then amending the required values prior to scheduling
the run.
The system has a default limits file defined which contains the limits that
will be used for a run if no other limits file is provided – see Appendix 5 for
example.
It is possible to change the file that is defined as the Default Limits file via
the Administration function (see separate guide). This might be useful if a
number of future runs are to have the same limits.
Note that it will still be possible to amend limits prior to scheduling a run.
Rather than change the Default Limits file, it is possible to upload a Limits
File for use during the particular run. This would be useful if the required
limits are significantly different from the defaults.
The user can either create a new Limits file or amend an existing file.
Information about how this can be done is being provided separately but, for
example, ‘Notepad++’ can easily be used to open, amend and save an
existing XML file.
Note that it will still be possible to amend limits prior to scheduling a run.
Note also that the limits for a particular run are subsequently saved as an
XML format file. Therefore, a new Limits file could be created by using the
Default Limits file and then amending the required values prior to
scheduling the run.
4. Overview of Carrying out a Predict and Classify Run
The stages for carrying out a Predict and Classify Run are as follows:
Full details of how to access and log in to RICT are provided in a separate document.
However, it basically involves typing the relevant URL into your browser and then
entering your Username and Password.
Click on Run Menu which will result in a page similar to the following being
displayed:
Click on ‘Create a New Run’ which will result in the following page being displayed:
d) Select Run Type
Click on ‘Predict and Classify’ which will result in the following page being
displayed:
If any files are to be uploaded then click on Browse, which will result in a page
similar to the following being displayed:
Then navigate to the required file using normal Windows functionality and either
double-click on the filename or select the filename and click on Open.
The file will then be uploaded to the RICT input area and a page similar to the
following will be displayed. As part of the upload RICT will check to see if the
format is recognised and, if so, the type of file will be displayed.
The above process should be repeated for all files that are to be uploaded.
Once all files have been uploaded then click on Continue. This will result in the files
being processed and validated. A page similar to the following is then displayed:
f) Amend any Data
This option will normally be used to manually enter data but can also be
used to amend any data that has been loaded or add more data files to the
run.
The settings for the run will either have been taken from an uploaded
settings file or the default settings file if no file has been uploaded.
This option can be used to amend the settings if required. Note that these
will only be applicable for the current run. Also note that the settings used
will be saved in a settings file that can then be used for future runs if
required.
Once any required data has been amended, then click on Schedule Run.
This will result in the run being scheduled and the Run Menu being displayed with
the new run at the top – see below.
Note that there is an option to delay the run for a specified period if necessary (e.g. if
it is a large run that is best scheduled outwith normal hours).
The run will initially be displayed with an ‘in progress’ icon and the page will refresh
automatically until the run is complete, when a ‘complete’ icon will be displayed.
Run In progress:
Run complete:
h) View/Extract Results
Once the job is complete, the results can be viewed/extracted as follows:
- Reports
Detail to be added…
- Visualise
Note that a more detailed guide of the Run Menu is provided in a separate document.
Example 1 - Upload XML EVs and OE files (use defaults for rest)
Example 2 - Manual Input of EVs and OEs (use defaults for rest)
Appendix 1 – Sample XML Environmental Variable File
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Datasets NS1:noNamespaceSchemaLocation="ev.xsd" xmlns:NS1="http://www.w3.org/2001/XMLSchema-instance">
<Creator>WQMASTER</Creator>
<Creation_Date>2008-02-14</Creation_Date>
<Name>Single Site</Name>
<Dataset ID="9875" Year="2007">
<Name>9875</Name>
<EV ID="NGR_LETTERS">
<Description>NGR_LETTERS</Description>
<Value>NT</Value>
</EV>
<EV ID="NGR_EAST">
<Description>NGR_EAST</Description>
<Value>08192</Value>
</EV>
<EV ID="NGR_NORTH">
<Description>NGR_NORTH</Description>
<Value>36934</Value>
</EV>
<EV ID="ALTITUDE">
<Description>ALTITUDE</Description>
<Value>190</Value>
</EV>
<EV ID="SLOPE">
<Description>SLOPE</Description>
<Value>1.1</Value>
</EV>
<EV ID="DISCHARGE">
<Description>DISCHARGE</Description>
<Value>3</Value>
</EV>
<EV ID="DIST_FROM_SOURCE">
<Description>DIST_FROM_SOURCE</Description>
<Value>11.4</Value>
</EV>
<EV ID="MEAN_WIDTH">
<Description>MEAN_WIDTH</Description>
<Value>2.625</Value>
</EV>
<EV ID="MEAN_DEPTH">
<Description>MEAN_DEPTH</Description>
<Value>40</Value>
</EV>
<EV ID="ALKALINITY">
<Description>ALKALINITY</Description>
<Value>80.9581</Value>
</EV>
<EV ID="BOULDER_COBBLES">
<Description>BOULDER_COBBLES</Description>
<Value>12.6667</Value>
</EV>
<EV ID="PEBBLES_GRAVEL">
<Description>PEBBLES_GRAVEL</Description>
<Value>46.6667</Value>
</EV>
<EV ID="SAND">
<Description>SAND</Description>
<Value>30.8333</Value>
</EV>
<EV ID="SILT_CLAY">
<Description>SILT_CLAY</Description>
<Value>9.8333</Value>
</EV>
</Dataset>
</Datasets>
Appendix 2 – Sample XML Observed/Expected Index File
<Datasets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="oei.xsd">
<Creator>WQMASTER</Creator>
<Creation_Date>2008-03-06</Creation_Date>
<Name>SEPA_3</Name>
<Dataset Site_ID="9875" Year="2007">
<Index ID="ASPT" Name="ASPT">
<Observed_Value>5.655172413793103448</Observed_Value>
</Index>
<Index ID="BMWP" Name="BMWP">
<Observed_Value>164</Observed_Value>
</Index>
<Index ID="NTAXA" Name="NTAXA">
<Observed_Value>29</Observed_Value>
</Index>
</Dataset>
<Dataset Site_ID="10480" Year="2007">
<Index ID="ASPT" Name="ASPT">
<Observed_Value>4.25</Observed_Value>
</Index>
<Index ID="BMWP" Name="BMWP">
<Observed_Value>51</Observed_Value>
</Index>
<Index ID="NTAXA" Name="NTAXA">
<Observed_Value>12</Observed_Value>
</Index>
</Dataset>
<Dataset Site_ID="11030" Year="2007">
<Index ID="ASPT" Name="ASPT">
<Observed_Value>6.739130434782608696</Observed_Value>
</Index>
<Index ID="BMWP" Name="BMWP">
<Observed_Value>155</Observed_Value>
</Index>
<Index ID="NTAXA" Name="NTAXA">
<Observed_Value>23</Observed_Value>
</Index>
</Dataset>
</Datasets>
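A file in the format above can be read back into a simple lookup of Observed Values keyed by site and index. This is a sketch for illustration only, using a cut-down inline copy of the sample data; RICT itself validates the file against the `oei.xsd` schema rather than parsing it this loosely.

```python
# Sketch of reading Observed Values out of an Appendix 2 style file.
# The inline XML is a trimmed copy of the sample above.
import xml.etree.ElementTree as ET

OE_XML = """<Datasets>
  <Dataset Site_ID="9875" Year="2007">
    <Index ID="ASPT" Name="ASPT"><Observed_Value>5.66</Observed_Value></Index>
    <Index ID="NTAXA" Name="NTAXA"><Observed_Value>29</Observed_Value></Index>
  </Dataset>
</Datasets>"""

def observed_values(xml_text: str) -> dict[tuple[str, str], float]:
    """Map (Site_ID, Index ID) -> Observed Value."""
    out = {}
    for ds in ET.fromstring(xml_text).iter("Dataset"):
        site = ds.get("Site_ID")
        for idx in ds.iter("Index"):
            out[(site, idx.get("ID"))] = float(idx.findtext("Observed_Value"))
    return out

values = observed_values(OE_XML)
# values[("9875", "ASPT")] -> 5.66, values[("9875", "NTAXA")] -> 29.0
```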
Appendix 3 – Sample Settings File
<Datasets xmlns:xdb="http://xmlns.oracle.com/xdb" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="settings.xsd">
<Dataset ID="0" Name="Rict">
<Setting Name="End_Group_Set">
<Value>3</Value>
</Setting>
<Setting Name="Season">
<Value>5</Value>
</Setting>
<Setting Name="Indices_Set">
<Value>2</Value>
</Setting>
<Setting Name="PEV_Set">
<Value>1</Value>
</Setting>
<Setting Name="Output_File_Prefix">
<Value>(Date)_rict_(Run_ID)_</Value>
</Setting>
<Setting Name="Run_Name">
<Value>Sepa_(Run_ID)</Value>
</Setting>
<Setting Name="Multi-Year">
<Value>N</Value>
</Setting>
<Setting Name="Ref Adjust">
<Value>Y</Value>
</Setting>
<Setting Name="Predict_Taxa">
<Value>N</Value>
</Setting>
<Setting Name="Predict_Taxonomic_Level">
<Value>TL1</Value>
</Setting>
<Setting Name="Simulation Iterations">
<Value>500</Value>
</Setting>
</Dataset>
</Datasets>
Appendix 4 – Sample Bias File
<?xml version="1.0" encoding="WINDOWS-1252" standalone='no'?>
<Datasets NS0:noNamespaceSchemaLocation="bias.xsd" xmlns:NS0="http://www.w3.org/2001/XMLSchema-instance">
<Dataset Index_Name="NTAXA" Season_ID="1">
<Value>1.62</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="2">
<Value>1.62</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="3">
<Value>1.62</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="4">
<Value>1.6524</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="5">
<Value>1.6524</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="6">
<Value>1.6524</Value>
</Dataset>
<Dataset Index_Name="NTAXA" Season_ID="7">
<Value>1.7982</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="1">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="2">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="3">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="4">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="5">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="6">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="ASPT" Season_ID="7">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="1">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="2">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="3">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="4">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="5">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="6">
<Value>0</Value>
</Dataset>
<Dataset Index_Name="BMWP" Season_ID="7">
<Value>0</Value>
</Dataset>
</Datasets>
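The bias file above is naturally read as a lookup keyed by Index_Name and Season_ID. The sketch below (again illustrative, using a trimmed inline copy of the sample) also shows the default described earlier in this guide: an index with no bias entry falls back to zero.

```python
# Sketch of loading an Appendix 4 style Bias file into a
# (Index_Name, Season_ID) lookup. A missing entry defaults to zero,
# matching the behaviour described in Section 2.
import xml.etree.ElementTree as ET

BIAS_XML = """<Datasets>
  <Dataset Index_Name="NTAXA" Season_ID="1"><Value>1.62</Value></Dataset>
  <Dataset Index_Name="ASPT" Season_ID="1"><Value>0</Value></Dataset>
</Datasets>"""

def load_bias(xml_text: str) -> dict[tuple[str, str], float]:
    return {
        (ds.get("Index_Name"), ds.get("Season_ID")): float(ds.findtext("Value"))
        for ds in ET.fromstring(xml_text).iter("Dataset")
    }

bias = load_bias(BIAS_XML)
ntaxa_bias = bias.get(("NTAXA", "1"), 0.0)  # 1.62
bmwp_bias = bias.get(("BMWP", "1"), 0.0)    # no entry, falls back to 0.0
```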
Appendix 5 – Sample Limits File
<Datasets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="limits.xsd">
<Dataset Type="Default" ID="Default" Description="Default Limit Set">
<Index NAME="NTAXA">
<Bucket Classification="H" ID="H" RANK="1">
<Lower_Bound Operator="gte">.8879</Lower_Bound>
<Upper_Bound Operator="gte">10</Upper_Bound>
</Bucket>
<Bucket Classification="G" ID="G" RANK="2">
<Upper_Bound Operator="lt">.8879</Upper_Bound>
<Lower_Bound Operator="gte">.7417</Lower_Bound>
</Bucket>
<Bucket Classification="M" ID="M" RANK="3">
<Upper_Bound Operator="lt">.7417</Upper_Bound>
<Lower_Bound Operator="gte">.5954</Lower_Bound>
</Bucket>
<Bucket Classification="P" ID="P" RANK="4">
<Upper_Bound Operator="lt">.5954</Upper_Bound>
<Lower_Bound Operator="gte">.491</Lower_Bound>
</Bucket>
<Bucket Classification="B" ID="B" RANK="5">
<Upper_Bound Operator="lt">.491</Upper_Bound>
<Lower_Bound Operator="gte">0</Lower_Bound>
</Bucket>
</Index>
<Index NAME="ASPT">
<Bucket Classification="H" ID="H" RANK="1">
<Lower_Bound Operator="gte">1.0059</Lower_Bound>
<Upper_Bound Operator="gte">5</Upper_Bound>
</Bucket>
<Bucket Classification="G" ID="G" RANK="2">
<Upper_Bound Operator="lt">1.0059</Upper_Bound>
<Lower_Bound Operator="gte">.8918</Lower_Bound>
</Bucket>
<Bucket Classification="M" ID="M" RANK="3">
<Upper_Bound Operator="lt">.8918</Upper_Bound>
<Lower_Bound Operator="gte">.7778</Lower_Bound>
</Bucket>
<Bucket Classification="P" ID="P" RANK="4">
<Upper_Bound Operator="lt">.7778</Upper_Bound>
<Lower_Bound Operator="gte">.6533</Lower_Bound>
</Bucket>
<Bucket Classification="B" ID="B" RANK="5">
<Upper_Bound Operator="lt">.6533</Upper_Bound>
<Lower_Bound Operator="gte">0</Lower_Bound>
</Bucket>
</Index>
</Dataset>
</Datasets>
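The bucket structure above can be applied to an EQI directly. The sketch below is a simplified reading of the format: buckets are checked in RANK order and only the Lower_Bound with its `gte` operator is used (the Upper_Bound of each bucket is implied by the Lower_Bound of the better bucket above it). The inline XML is a trimmed copy of the NTAXA sample; RICT's own evaluation of the operators may differ in detail.

```python
# Sketch of classifying an EQI against Appendix 5 style limit buckets.
# Simplification: buckets are tried best-first by RANK, and only
# Lower_Bound (Operator="gte", read as >=) is checked.
import xml.etree.ElementTree as ET

LIMITS_XML = """<Index NAME="NTAXA">
  <Bucket Classification="H" ID="H" RANK="1">
    <Lower_Bound Operator="gte">.8879</Lower_Bound>
  </Bucket>
  <Bucket Classification="G" ID="G" RANK="2">
    <Lower_Bound Operator="gte">.7417</Lower_Bound>
  </Bucket>
  <Bucket Classification="B" ID="B" RANK="5">
    <Lower_Bound Operator="gte">0</Lower_Bound>
  </Bucket>
</Index>"""

def classify_eqi(index_xml: str, eqi: float) -> str:
    buckets = sorted(ET.fromstring(index_xml).iter("Bucket"),
                     key=lambda b: int(b.get("RANK")))
    for b in buckets:
        lower = float(b.findtext("Lower_Bound"))
        if eqi >= lower:  # Operator="gte" on Lower_Bound
            return b.get("Classification")
    return "B"

status = classify_eqi(LIMITS_XML, 0.75)  # -> "G"
```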